Speech to Text in Subtitle Edit 3.6.5 and forward

3rd Aug 2022 12:48 #1
loninappleton

Member

Join Date
Jun 2005

Location
USA
I've recently seen a tutorial on YouTube about Subtitle Edit's ability to convert speech to text.
It requires a number of steps, and YouTube videos go too fast for me to follow.
Is there a printed how-to for this feature?

Like the guy says in the tutorial, it looks like a major advance, and it would save me going
to YouTube Create to get my subtitles out of a video (mkv etc.) for fine tuning.

Quote
3rd Aug 2022 22:46 #2
netmask56

Member

Join Date
Sep 2005

Location
Sydney, Australia
I have to agree with you: there should always be a text transcript of all YouTube "how to" videos, podcasts and news reports. There are many people with hearing impairments, and sometimes comprehension issues, for whom the lack of a printed version is a problem. IMO it is pretty basic, like what you learn in management school: if you are giving a technical lecture full of three-letter acronyms, all the acronyms used should be spelled out in full and even written on a whiteboard behind or near the speaker.

SONY 75" Full array 200Hz LED TV, Yamaha A1070 amp, Zidoo UHD3000, BeyonWiz PVR V2 (Enigma2 clone), Chromecast, Windows 11 Professional, QNAP NAS TS851

Quote
3rd Aug 2022 23:51 #3
loninappleton

Member

Join Date
Jun 2005

Location
USA
@Netmask56

Thanks for the supportive comments on the issue. Seeing mouse "twiddling" while some point of explanation is going on makes me want to tear something. That's a polite way of putting it. The youtuber who does these, David M., seems to be the only one dedicated to Subtitle Edit You Tube tutorials. Simply put he should be told to have a care in how his presentation appears to others. I'll not do it or try to because he provides a valuable service.

I made some progress today by starting up David M.'s Speech to Text tutorial with a piece of cereal box over the upper portion of the screen while I read the CC captions of the instructions. I made it through a few steps, and this is something valuable for me to learn, so I'll keep at it. But still, screen shots in sequence with a text description are about the only thing I can _tolerate_. And I mean this twiddling crap is that bad, insufferably annoying to the point of turning it off.

My request is that someone step in to fill this void.

I'll make a separate post on my questions that come up during the tutorial. I have used SE a bit so it's not totally new, just this speech to text element.

[Note I say David M. because I'm prone to misspell his name.]

link is here: https://www.youtube.com/watch?v=jd19iOWpj_4&t=307s

Last edited by loninappleton; 3rd Aug 2022 at 23:56.

Quote
4th Aug 2022 15:37 #4
pcspeak

Member

Join Date
Apr 2007

Location
Australia
This command will download Closed Captions Only (SEdit-CCs.vtt).

Code:

yt-dlp.exe "https://www.youtube.com/watch?v=jd19iOWpj_4&t=307s" --skip-download --sub-langs en --write-auto-subs -o SEdit-CCs

Which I already did.
SEdit-CCs.7z
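
For anyone who would rather script this than type it at a prompt, here is a minimal Python sketch of the same captions-only download. It assumes yt-dlp is installed and reachable on the PATH. The function names (build_caption_command, download_captions) are made up for illustration and are not part of yt-dlp; only the flags are yt-dlp's own.

```python
import subprocess


def build_caption_command(url, output_name, language="en"):
    """Return the yt-dlp arguments for fetching auto-generated captions only."""
    return [
        "yt-dlp",
        "--skip-download",        # do not fetch the video itself
        "--write-auto-subs",      # ask for the automatically generated captions
        "--sub-langs", language,  # caption language to fetch
        "-o", output_name,        # base name for the output file
        url,
    ]


def download_captions(url, output_name, language="en"):
    """Run yt-dlp; raises CalledProcessError if yt-dlp reports a failure."""
    subprocess.run(build_caption_command(url, output_name, language), check=True)
```

Calling download_captions("https://www.youtube.com/watch?v=jd19iOWpj_4", "SEdit-CCs") should behave like the command line version. Passing the arguments as a list sidesteps the quoting differences between CMD and PowerShell.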

Cheers
Last edited by pcspeak; 4th Aug 2022 at 16:21.
Quote
4th Aug 2022 16:30 #5
loninappleton

Member

Join Date
Jun 2005

Location
USA
This is a very good tip I wasn't aware of. Thanks for making it.

Quote
4th Aug 2022 19:45 #6
loninappleton

Member

Join Date
Jun 2005

Location
USA
Originally Posted by pcspeak

This command will download Closed Captions Only (SEdit-CCs.vtt).

Code:

yt-dlp.exe "https://www.youtube.com/watch?v=jd19iOWpj_4&t=307s" --skip-download --sub-langs en --write-auto-subs -o SEdit-CCs

Which I already did.

[Attachment 66210 - Click to enlarge]

Cheers

Thanks I'll look it over with gratitude. It was an easy download. From our previous chat (I think) I did complete the screen shot sequence for getting into You Tube Create to make an SRT file. But I saw no further interest on that so it's just for my use. This Subtitle Edit feature could replace such a procedure.
Quote
5th Aug 2022 01:14 #7
loninappleton

Member

Join Date
Jun 2005

Location
USA
I'll step away from David M. stuff for a while just to ask if anyone but he has ever worked with the speech to text setup in Subtitle Edit?

If so please respond.

There are questions that the David M. video can't answer. This is one: In going to try to download the English model needed for
VOSK (I think) for two days I have just gotten the screen message that it won't complete and to try again later. This happens
each time. So I'm wondering if anyone else has used this "game changer" feature with any success in even completing setup.
But I've just encountered barriers in the setup in SE I cannot get past.

Quote
5th Aug 2022 07:21 #8
ChasVideo

Member

Join Date
Mar 2015
Just some ideas. Adobe Premiere Pro has speech to text that works in multiple languages. There is a free one-week trial, and one month costs $32. I paid for one month and I think I can get all the videos I want transcribed within that time. I can always pay for another month in the future. Kdenlive also has speech to text and is free, but it requires Python for the speech to text. I have not been able to make the Python installation and Kdenlive work together. I have another thread asking for help with that, but no help yet. I tried the trial of Subtitle Edit and was not impressed. It does not let you try foreign languages with the trial.

Quote
5th Aug 2022 11:54 #9
loninappleton

Member

Join Date
Jun 2005

Location
USA
Thanks for the reply.
I have no knowledge of advanced professional tools like Adobe nor the ability to pay for long term use. It would take me the free month to figure it out is my take on it.

Also I want to get the free tools working to show this can be done outside of Adobe, Sonix or the other pay for tools but within
Windows 7.

What is needed, if this SE is really the "game changer" it is touted to be for this process of audio to text, is a more simple and text oriented explanation of the steps.

Quote
5th Aug 2022 20:13 #10
pcspeak

Member

Join Date
Apr 2007

Location
Australia
Firstly, I created a WAV file from the video using this command.

2wav.cmd

Code:

ffmpeg -i "The Fifth Element.mp4" -vn -sn -threads 0 -y "The Fifth Element.wav"
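
The WAV extraction can also be sketched in Python, for anyone who prefers a script to a .cmd file. This assumes ffmpeg is on the PATH. The 16 kHz mono setting is my assumption about what suits the Vosk models, not something Subtitle Edit requires, and build_wav_command is a made-up name.

```python
from pathlib import Path


def build_wav_command(video_path, sample_rate=16000):
    """Return the ffmpeg arguments that extract the audio track as a WAV file."""
    wav_path = Path(video_path).with_suffix(".wav")  # same name, .wav extension
    return [
        "ffmpeg",
        "-y",                     # overwrite the WAV if it already exists
        "-i", str(video_path),    # input video
        "-vn",                    # leave out the video stream
        "-sn",                    # leave out any subtitle stream
        "-ac", "1",               # mix down to one (mono) channel
        "-ar", str(sample_rate),  # resample the audio (assumed 16 kHz)
        str(wav_path),
    ]
```

Running it is one more line: subprocess.run(build_wav_command("The Fifth Element.mp4"), check=True), after an import subprocess.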

I downloaded the portable version of SE366 and extracted the files to a folder.
https://github.com/SubtitleEdit/subtitleedit/releases/download/3.6.6/SE366.zip

Then I ran SubtitleEdit.exe and did this: (See attached video)

You have to wait.... and wait.... and wait... and.... for the process to complete.
The output .SRT needed some work. I copied 'The Fifth Element.srt' -> 'The Fifth Element.txt',
then opened 'The Fifth Element.txt' with a word processor and spellchecked it. That fixed a lot of the errors. I saved it as a .TXT file (not .DOC)
and renamed it to .SRT. Lots more work on the file still needs to be done.
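
One risk with spellchecking the whole file in a word processor is that the cue numbers and timing lines can get altered by accident. A minimal Python sketch of an alternative: pull out only the spoken text, correct that, and write it back under the original timings. The function names are made up for illustration, and the parser assumes a plain, well-formed .SRT.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def parse_srt(text):
    """Split SRT text into cues: dicts with 'index', 'timing' and 'lines'."""
    cues = []
    normalized = text.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        rows = block.split("\n")
        if len(rows) >= 2 and TIMING.match(rows[1]):
            cues.append({"index": rows[0].strip(),
                         "timing": rows[1].strip(),
                         "lines": rows[2:]})
    return cues


def dialogue_only(cues):
    """Return just the spoken text, one string per cue, ready for spellchecking."""
    return [" ".join(cue["lines"]) for cue in cues]


def rebuild_srt(cues, corrected):
    """Put corrected text back under the original cue numbers and timings."""
    blocks = [f"{cue['index']}\n{cue['timing']}\n{text}\n"
              for cue, text in zip(cues, corrected)]
    return "\n".join(blocks)
```

Note that rebuild_srt writes each cue's text on a single line, so any two-line cues are merged; Subtitle Edit can re-break long lines afterwards.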

The diction on my test clip was quite clear, with no odd accents. The results were not great for me. ymmv.
Certainly better than nothing, but I'm wondering if going through Google would be the better way.

My screen cap is a bit ordinary.
Cheers.

Attached Files

SubEdit366.mp4 (1.80 MB, 71 views)
Last edited by pcspeak; 5th Aug 2022 at 20:20. Reason: Doubled up on attachment. Oops!
Quote
5th Aug 2022 23:22 #11
loninappleton

Member

Join Date
Jun 2005

Location
USA
Hello pcspeak and all.

I know you are trying to help but I have no knowledge of command line code and so cannot follow much of it. Second I want to
stay in the SE program proper in order to make a guide for anyone I know that wants to make subtitles for ( in my case )
classic plays in the public domain previously Proshot from major theatres. Some of these are on disc and some come
from elsewhere.

My question was why I cannot download the proper language template to SE. Could be 3.6.5 or 3.6.6 etc.

I don't see where this language template would be picked up by your code, but I can't use code. I'm just
trying to get a method for using SE in the way the program provides.

I have loaded a clip with wave form
I have gone to SE > Video > Video/audio to text
in that I've followed the prompts to download ffmpeg
and the prompt to download VOSK 'models' in English. (English becomes visible under the three dots)

That's where the error box pops up and says try again later.
If I can get a good looking screen shot where everything is visible.... ok got it.

From this point I have made no progress. Things like VOSK and the Kaldi speech recognition engine are installed
as the setup goes on. They are just terms I do not know the meaning of yet.

Attached Thumbnails

Last edited by loninappleton; 5th Aug 2022 at 23:31.

Quote
5th Aug 2022 23:41 #12
loninappleton

Member

Join Date
Jun 2005

Location
USA
https://www.youtube.com/watch?v=39mP3JMjNao

I was wrong. There is more than one how-to for video/audio to text in SE. This one seems to be a bit more straightforward,
but I still follow the text rather than the screen. I mention this for a reason: this guy shows a prompt to
download libvosk. I don't know how to proceed.

But this tutorial is short and I'll give it one time through.

Quote
5th Aug 2022 23:54 #13
loninappleton

Member

Join Date
Jun 2005

Location
USA
The author of this tutorial is Yosef K and he lost me on this one after some initial steps. These Youtube tutes with the tiny screens and tinier mouse pointers need a different format for presentation. Some people can follow them. I have difficulty to the point of giving up. Anyway the download problem is still there. Yosef K mentioned the possibility of bugs in the version he had.

Quote
9th Aug 2022 21:50 #14
TrueCrime

Member

Join Date
Aug 2022
If I understand your question correctly, you're trying to figure out how to use the speech to text feature? If so, it's easy.

On a blank "new project", click the Video tab.
From there, click "audio/video to text".
In the models folder, click the 3 dots to download your language.
On the input files, click Add to choose your media file. (Note you have 2 options: the default is video, but you can click the media type in the bottom right and switch it to audio. This allows for quicker upload/processing.)

It will start processing, and how long it takes depends on the video length.

Quote
9th Aug 2022 21:55 #15
loninappleton

Member

Join Date
Jun 2005

Location
USA
Ok, when I get on this again I'll look for the "Add" which I don't recall seeing.

Thanks for answering. I don't know how I'll get past the 'select the model for English' download part,
but I'll try it out again.

I copied the steps to paper so I can view them at the same time.

Last edited by loninappleton; 9th Aug 2022 at 22:21.

Quote
9th Aug 2022 23:07 #16
loninappleton

Member

Join Date
Jun 2005

Location
USA
One followup on this.
I'm unclear as to where the completed video/audio to text comes down to the PC. Is it found in downloads in the normal way or captured by Subtitle Edit to the related video/audio clip in some manner? Is it SRT? Things like that.

Quote
11th Aug 2022 02:00 #17
loninappleton

Member

Join Date
Jun 2005

Location
USA
Originally Posted by pcspeak

Firstly, I created a WAV file from the video using this command.

2wav.cmd

Code:

ffmpeg -i "The Fifth Element.mp4" -vn -sn -threads 0 -y "The Fifth Element.wav"

I downloaded the portable version of SE366 and extracted the files to a folder.
https://github.com/SubtitleEdit/subtitleedit/releases/download/3.6.6/SE366.zip

Then I ran SubtitleEdit.exe and did this: (See attached video)

You have to wait.... and wait.... and wait... and.... for the process to complete.
The output .SRT needed some work. I copied 'The Fifth Element.srt' -> 'The Fifth Element.txt',
then opened 'The Fifth Element.txt' with a word processor and spellchecked it. That fixed a lot of the errors. I saved it as a .TXT file (not .DOC)
and renamed it to .SRT. Lots more work on the file still needs to be done.

The diction on my test clip was quite clear, with no odd accents. The results were not great for me. ymmv.
Certainly better than nothing, but I'm wondering if going through Google would be the better way.

My screen cap is a bit ordinary.
Cheers.

Sorry for the long quote. The clip you made I'm just viewing now-- the one that shows the process (like the Youtube ones). And I've done the steps
up through seeing the three dots. Still the same. I am not getting that next set of menus. In using the steps from the other user, I tried to
get through this without adding the mkv first, and an error box pops up saying "No Video" and I can't continue. Or I get the message about
"try Again Later" in trying to get the model.

But just to try something else, I used the option at the top left of the screen to download VOSK from Alpha Cephei. It's a huge zip file of an accurate
English model. I unzipped that but don't know its usage. It offers the option to open with a program. I selected Subtitle Edit. That opened
with some code in the subtitle window, but I don't know its usage.

To sum up, I've made no progress. But I'll run the vid you made all the way through. Something is preventing me from getting past
the three dots to the screen that opens the procedure. It just isn't there, with or without an MKV loaded.
Quote
11th Aug 2022 02:17 #18
loninappleton

Member

Join Date
Jun 2005

Location
USA
From the video clip I see the full form open at three dots-- whatevs. That full form with the box below with Add etc I am not seeing. Perhaps some error checking is needed to see if I have things installed. I know I did the ffmpeg step and libvosk. Where should I find those things located to be in the right place? How can I error check this or start over?

What I saw in the vid was that the EN model appears on the form. It looks like the same name as what I downloaded from Alpha Cephei.

Quote
11th Aug 2022 13:36 #19
loninappleton

Member

Join Date
Jun 2005

Location
USA
I did a reinstall and have seen some progress. Perhaps I missed a prompt at the libvosk download. I now have the Add box on screen, and since there was no
prompt for ffmpeg I'll assume that is installed correctly. I clearly missed something previously, so apologies for all the delays and circular questions.

I'll review the videoclip again and follow my notes from the instruction above.

On making those screen tutorials, can you change your screen resolution to Mr. Magoo size type to make following easier?

Quote
13th Aug 2022 00:14 #20
loninappleton

Member

Join Date
Jun 2005

Location
USA
I've made a new screen shot. It shows that I was able to add my clip into the processing area. I hit "Generate" to avoid the errors at the three dots which fails to complete the download.

As you can see a similar message that a download doesn't complete but just says to try later repeats in the Add area.

I had downloaded a model at Alpha Cephei which may be the wrong one. Where does Subtitle Edit look for the model? Where is the model supposed to land if it downloads? And which is the right model? The one I got, just by guessing, is called vosk-model-en-us-0.22.zip and is something over 2 gigabytes unzipped.

That business with WAV files and such-- just too confusing to watch a mouse pointer.

I have a system that works through Google Create. I can deal with that without command line or WAV files etc. I'll come back to this later.

Quote
20th Aug 2022 17:33 #21
loninappleton

Member

Join Date
Jun 2005

Location
USA
I'm still thinking about why the audio/video to text routine in Subtitle Edit fails to complete.
The error is generated by SE, not by Windows 7 Pro, which is all I have used for many years. Can the problem be traced to how SE handles
the Windows 7 Pro OS?

In my defense, I have been able to download Tesseract within the SE program with no problem.

For the future, since I have gone to Alpha Cephei several times to download the model, can this be provided
as an option so that SE can find it?

Quote
20th Aug 2022 17:39 #22
loninappleton

Member

Join Date
Jun 2005

Location
USA
I'm still thinking about why the audio/video to text routine in Subtitle Edit fails to complete. The error is generated by SE, not by Windows 7 Pro, which is all I have used for many years. Can the problem be traced to how SE handles the old Windows 7 Pro OS compared to Win10, for example? Also, in looking at Alpha Cephei there seems to be Linux-style code and refs to GitHub. But I cannot diagnose things like that.

In my defense, I am able to download Tesseract within the SE program with no problem.

For the future, since I have gone to Alpha Cephei several times to download the model, can that .zip file, once unpacked, be installed manually in the proper folder so that SE can find it? The download portion of the routine is the part that fails and gives the error.

Quote
20th Aug 2022 18:03 #23
loninappleton

Member

Join Date
Jun 2005

Location
USA
Progress.

I went to Subtitle Edit and registered, then downloaded the newest SE, 3.6.7.

On this one I got a help message displayed in the screenshot attached. I don't know anything about secure server settings and the like but at least the error is explained a bit.

And I can put the 128 MB version of English I have where it needs to go, but I haven't done that yet.

Attached Thumbnails

Quote
21st Aug 2022 01:02 #24
loninappleton

Member

Join Date
Jun 2005

Location
USA
Success with initial manual install.

With the help of that message above I was able to manually install the lgraph version of the Vosk model. Once all of this is done the program works fine. I've also experimented with selecting the larger Vosk model, which is 2 GB after unzipping. I'm hoping that it is more accurate. Errors are common, but some of the things I do are in dialect, which may cause more errors.
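
For anyone repeating the manual install, here is a minimal Python sketch of the unzip-and-check step. The folder that Subtitle Edit reads models from is NOT given in this thread, so models_dir is a placeholder you have to fill in yourself (the help message in SE 3.6.7 names it). The check for "am", "conf" and "graph" subfolders is my assumption about how Vosk models are laid out, and the function names are made up.

```python
import zipfile
from pathlib import Path

# Assumption: a usable Vosk model folder contains these subfolders.
EXPECTED_SUBFOLDERS = ("am", "conf", "graph")


def unpack_model(zip_path, models_dir):
    """Extract a model zip into models_dir and return the top-level folders created."""
    models_dir = Path(models_dir)
    models_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Names inside a zip always use "/" as the separator.
        top_level = sorted({name.split("/")[0]
                            for name in archive.namelist() if name.strip("/")})
        archive.extractall(models_dir)
    return [models_dir / name for name in top_level]


def looks_like_vosk_model(folder):
    """Rough sanity check that the unzipped folder has the expected layout."""
    folder = Path(folder)
    return all((folder / sub).is_dir() for sub in EXPECTED_SUBFOLDERS)
```

If looks_like_vosk_model returns False for the unzipped folder, the zip was probably extracted one level too deep or too shallow, which is an easy mistake with these archives.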

Quote
7th Dec 2022 15:16 #25
loninappleton

Member

Join Date
Jun 2005

Location
USA
Originally Posted by pcspeak

This command will download Closed Captions Only (SEdit-CCs.vtt).

Code:

yt-dlp.exe "https://www.youtube.com/watch?v=jd19iOWpj_4&t=307s" --skip-download --sub-langs en --write-auto-subs -o SEdit-CCs

Which I already did.

[Attachment 66210 - Click to enlarge]

Cheers

@pcspeak

Can this yt-dlp command be run in PowerShell, or is it entered at a CMD prompt? I'm only now reviewing this thread for subtitling projects found on YouTube. I know next to nothing about either PowerShell or the CMD prompt.
Quote
