Subtitle Edit fine tuning General Settings

15th Apr 2023 18:19 #1
loninappleton

Member

Join Date
Jun 2005

Location
USA
I was wondering who can say what General or master setting may interfere with timing subs-- like introducing a minimum time between subs after a hand edit is made in Subtitle Edit. I've seen the General Settings screen, I just don't know much about handling what it says or does.

This is not the only curiosity I've found in fine tuning my subtitle made from a VOSK voice to text which usually requires a lot of corrections in SE.

The other thing was that, unbeknownst to me, a 13 second error would crop up (always 13 seconds), which I had to readjust in a previous duration field. I try to be careful with the timings, so I don't know if the program's various calculations are making adjustments for minimums and such. The General Settings Minimum space between subs default is pretty small.

For some of this I've learned to use the F2 subtitle- as- text option to visually inspect what's going on.
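For checking the same thing outside the program, here is a minimal Python sketch that reads the timing lines of an SRT file and lists every cue that starts too soon after the previous one ends. It assumes standard SRT timestamps (`HH:MM:SS,mmm --> HH:MM:SS,mmm`). The function names and the threshold passed in are made up for illustration; they are not Subtitle Edit settings or defaults.

```python
import re

# Matches an SRT timing line such as "00:01:33,001 --> 00:01:37,000".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert the four timestamp fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def read_cues(srt_text):
    """Return a list of (start_ms, end_ms) pairs in file order."""
    cues = []
    for match in TIMING.finditer(srt_text):
        fields = match.groups()
        cues.append((to_ms(*fields[:4]), to_ms(*fields[4:])))
    return cues


def report_gaps(cues, min_gap_ms):
    """Return (cue_number, gap_ms) for each cue that starts less than
    min_gap_ms after the previous cue ends. A negative gap is an overlap."""
    problems = []
    for i in range(1, len(cues)):
        gap = cues[i][0] - cues[i - 1][1]
        if gap < min_gap_ms:
            problems.append((i + 1, gap))
    return problems
```

Running `report_gaps(read_cues(text), 24)` on the text of an SRT file gives a short list of places to look at, which can be quicker than scanning the whole file by eye. The `24` here is only an example threshold.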

SE is just about all I know. I tried Aegis Sub long ago but doubt it would make these things easier since SE is updated all the time.

If it's a data entry error, I don't know how to correct what I'm doing.

[edit] One thing that may need tuning is frame rate, where the default is set to 23.976 vs the needed frame rate for this job at 25 fps.
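As a rough illustration of why a frame-rate mismatch can matter, the Python sketch below works out how far a timestamp moves if it is turned into a frame number at one rate and read back at another. SRT itself stores times, not frames, so this only applies where timings pass through frame numbers (a frame-based format or a frame-rate conversion). Whether that is what happens in this case is an assumption, and the function names are made up for the example.

```python
from fractions import Fraction

NTSC_FILM = Fraction(24000, 1001)  # the rate usually written as 23.976
PAL = Fraction(25)


def reinterpret_ms(time_ms, authored_fps, assumed_fps):
    """Turn a time into a frame count at authored_fps, then read that
    frame count back as a time at assumed_fps. Result is in milliseconds."""
    frames = Fraction(time_ms, 1000) * authored_fps
    return round(frames / assumed_fps * 1000)


def drift_ms(time_ms, authored_fps, assumed_fps):
    """How far the timestamp moves because of the rate mismatch."""
    return reinterpret_ms(time_ms, authored_fps, assumed_fps) - time_ms
```

The drift grows with position in the file: about 4.3% of the elapsed time for 25 fps material read at 24000/1001 fps, so a line 16 minutes in (960 seconds) would land 41 seconds late. A constant error that is the same everywhere in the file would point to some other cause.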

Last edited by loninappleton; 15th Apr 2023 at 18:33. Reason: additional

16th Apr 2023 00:15 #2
pcspeak
Member

Join Date
Apr 2007

Location
Australia
I used SE 3.6.12 Portable.
https://github.com/SubtitleEdit/subtitleedit/releases/download/3.6.12/SE3612.zip
I unpacked it to a folder on my second drive.
D:\SE3612\

My suggestion to you for a better initial result:
Try creating your subtitles using 'Video/Audio to text (Whisper)'
It gave me a far better result than Vosk on a test video, with British accents.
I downloaded the Whisper 'medium.en' model which is 1.42 GB
The 'base.en' model was not nearly as effective, go for the medium model.
I've not tried the 'large' model, which is 2.88 GB.
One downloads Whisper models using the same method as used with Vosk.

Example of the following line in the video.
Speaker's head turned aside.
It's a barman asking a question of a patron. "Same again, Jack?"
The question mark is missed by both methods.

The Whisper model did a very good job.
The Vosk model was off with timing and 'Count of events'. Line breaks could also have been better.

WHISPER (medium.en)

31
00:01:33,001 --> 00:01:37,000
Same again Jack.

32
00:01:37,001 --> 00:01:41,000
Yeah we need to escalate our plans.

VOSK (vosk-model-en-us-0.22)

37
00:01:35,520 --> 00:01:36,520
Swimming anger.

38
00:01:39,030 --> 00:01:40,800
Yeah we need to
escalate our plans.

I've no idea how the "Swimming anger" comes about.
I got the same result with two Vosk models.

ymmv

Cheers.

16th Apr 2023 09:03 #3
loninappleton
It's interesting to hear this is a VOSK problem.

I've attempted to use Whisper several times. It depends, I guess, on the install method or some such. I've even asked SE about it directly. What happens is I have the install made to get Whisper going in the SE submenu, then in execution it just loops. It's a known issue not sorted by Whisper, Github, or however development works. This happens on old gear I have and a current (relatively speaking) Ryzen 2400g as well.

I'm accepting the rough nature of VOSK. What I'm asking about is why SE makes what I see as strange changes to a completed duration field 'after the fact', that is, after I have finished the line and moved on. It's something I have to inspect and alter, not just for one line but again and again as the corrections continue, backtracking visually to see what happened. It could be some error I'm making repeatedly, but if so I cannot catch it.

20th Apr 2023 21:21 #4
VoodooFX
Video Damager

Join Date
Oct 2021

Location
At Doom9
Originally Posted by loninappleton

I've even asked SE about it directly. What happens is I have the install made to get Whisper going in the SE submenu, then in execution it just loops. It's a known issue not sorted by Whisper, Github, or however development works.

Whisper bugs are not SE bugs. Maybe you are using some very small model, what Whisper version do you use?
Try faster-whisper from here: https://github.com/Purfview/whisper-standalone-win

VOSK is no match for Whisper.

InpaintDelogo - advanced logo removal & hardcoded subtitles extraction
Standalone Faster-Whisper - Portable AI auto-transcription-translation

20th Apr 2023 22:47 #5
loninappleton
Originally Posted by VoodooFX

Originally Posted by loninappleton

I've even asked SE about it directly. What happens is I have the install made to get Whisper going in the SE submenu, then in execution it just loops. It's a known issue not sorted by Whisper, Github, or however development works.

Whisper bugs are not SE bugs. Maybe you are using some very small model, what Whisper version do you use?
Try faster-whisper from here: https://github.com/Purfview/whisper-standalone-win

VOSK is no match for Whisper.

Yes I fully agree it is a Whisper problem. Pretty sure I put in the Medium 1.8Gb version of the model.

If your standalone version at Github can be handled simply I'd certainly try it. What I found previously with instructions for making/using a scratch version was it was beyond my grasp. No You Tube tutorials please. I've mentioned here often I cannot follow flea-sized pointers on those. Perhaps a guide can be made without Github desktop showing up and the rest of it in screen shot format. It would certainly be of value.

20th Apr 2023 23:22 #6
VoodooFX
Originally Posted by loninappleton

If your standalone version at Github can be handled simply I'd certainly try it.

Usage is very simple; probably all the info you need is on the repo's front page.

Originally Posted by loninappleton

What I found previously with instructions for making/using a scratch version was it was beyond my grasp. No You Tube tutorials please. I've mentioned here often I cannot follow flea-sized pointers on those. Perhaps a guide can be made without Github desktop showing up and the rest of it in screen shot format. It would certainly be of value.

I don't understand what issues you had, post on GitHub or here with your current issue[s].

21st Apr 2023 14:39 #7
loninappleton
Repos and such I recall from practicing and failing with Linux.
I will ask while I'm on this: why does Github seem inscrutable to the casual user? No doubt it makes sense to the regulars, so when you say ask at Github I do not recall seeing a forum layout that can be easily accessed.

I'm willing to do things that are carefully explained. What I can't stand is seeing one error message after another saying 'everything you know is wrong-- permission denied.' What is needed (and perhaps the Github tool has it) is everything auto-installing that is needed for a finished and error-checked executable, preferably with drag and drop. That's not too much to ask any more.

21st Apr 2023 14:53 #8
loninappleton
I am on the trail of Fast Whisper. I will install it to Win10 which I set up primarily to run Whisper in Subtitle Edit. As we've said, that just isn't working on a Ryzen 3 2400g using an Asus Prime A320M-K and the onboard VGA. Are there any troubleshooting checks to do with that?

Anything about command line executions I'll have to take step by step.

Perhaps you can use that "code" format with something to cut and paste.

Thanks for the replies on this. Once established, it can be used by others like me who want the function but don't know a path to get there.

21st Apr 2023 18:03 #9
VoodooFX
Originally Posted by loninappleton

at Github I do not recall seeing a forum layout that can be easily accessed.

"Issues" or "Discussions" are places to ask/post something on GitHub, look at this screenshot:

Originally Posted by loninappleton

Anything about command line executions I'll have to take step by step.

Check this guide on Youtube: https://www.youtube.com/watch?v=A3nwRCV-bTU

21st Apr 2023 23:01 #10
loninappleton
Oh no, no dreaded Youtube guides for me.

So far, I have located Whisper-Fast, downloaded the over a gigabyte of that to Win10 on a Ryzen PC, unzipped it and ran install.

I don't see a clear direction from there -- no .exe 's that are easily identifiable but loads of dlls. So if there's a command line routine, best I should be sure to have Whisper model and whatevs installed for this. I don't know how to proceed. In truth I'd much rather use the auto installs in Subtitle Edit.

Is there some core logic problem, such as Whisper running for others on Intel but not on AMD? I just don't get it.

21st Apr 2023 23:32 #11
loninappleton
I'm giving the Whisper installed to SE another try. I set the model as small. I extracted a .WAV version of the audio in Audacity since I understand that to have the most detail and is used in stand alone uses. Volume is set all the way up but I have no speaker attached-- whatever takes place is internal.

What I see at bottom of the form is a solid green 'progress bar' all the way to the limit. In other words, this Whisper install is not showing any progress or how long it might take. The WAV I have is 1 hr 34 mins b/c it's a piece I want to use. I see a small amount of disk activity on the little SSD I have on this build.

What do you see in the progress bar at screen bottom when running audio to text in Subtitle Edit ? A screen shot would help. I can make one if needed but it doesn't show anything.

[edit] I shut it down. It did not close properly either-- screen froze and I had to go to restart the PC.

Last edited by loninappleton; 22nd Apr 2023 at 01:38.

22nd Apr 2023 01:31 #12
pcspeak
Perhaps start with a shorter clip.
You mostly want to see if things work.

The following will do that.

Code:

ffmpeg.exe -i "MY Video.mp4" -ss 00:14:00 -to 00:17:00 -c copy -y "My Video-3min.mp4"

Change the video name to suit.
Change the start & stop times to suit. Just some video with a few lines of dialogue.
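If typing the command by hand is awkward, the same cut can be run from a small Python script. This is only a sketch: it assumes Python is installed and that `ffmpeg` can be found on the PATH (otherwise pass the full path to ffmpeg.exe). The function names are made up for the example; the ffmpeg options are exactly the ones in the line above.

```python
import subprocess


def build_cut_command(src, start, end, dst, ffmpeg="ffmpeg"):
    """Build the same stream-copy cut as the ffmpeg line above.

    src/dst are file names, start/end are HH:MM:SS positions in the source.
    """
    return [
        ffmpeg,
        "-i", src,       # input video
        "-ss", start,    # where the clip starts
        "-to", end,      # where the clip stops
        "-c", "copy",    # copy the streams, no re-encoding
        "-y",            # overwrite the output file without asking
        dst,
    ]


def cut_clip(src, start, end, dst, ffmpeg="ffmpeg"):
    """Run ffmpeg and raise an error if it reports a failure."""
    subprocess.run(build_cut_command(src, start, end, dst, ffmpeg), check=True)


# Example call, with the file names from the post:
# cut_clip("My Video.mp4", "00:14:00", "00:17:00", "My Video-3min.mp4")
```

Passing the arguments as a list means the spaces in the file names need no extra quoting.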

Start SE - 'Video/Audio to text (Whisper)...
Locate and load the video. (DON'T point to an audio file, SE will extract the audio from the video in the format the program wants.)
Generate.

A 90 minute video on my PC takes about 4 hours using the 1.4 Gig Whisper model.

Cheers

22nd Apr 2023 04:52 #13
VoodooFX
Originally Posted by loninappleton

So far, I have located Whisper-Fast, downloaded the over a gigabyte of that to Win10 on a Ryzen PC, unzipped it and ran install.

If you don't have a dedicated GPU then the smaller CPU-only release is enough for you.
There is no install to run, it's ready to use after unzipping.

Originally Posted by loninappleton

I don't see a clear direction from there -- no .exe 's that are easily identifiable but loads of dlls.

Then what have you "installed" if you don't see an .exe file?

Originally Posted by loninappleton

In truth I'd much rather use the auto installs in Subtitle Edit.

For OpenAI-Whisper and Faster-Whisper there are no "auto installs" in Subtitle Edit.

Originally Posted by loninappleton

So if there's a command line routine, best I should be sure to have Whisper model and whatevs installed for this. I don't know how to proceed.

Why do you refuse to watch the Youtube guide? Google for a non-Youtube guide then.

Last edited by VoodooFX; 22nd Apr 2023 at 05:08.

22nd Apr 2023 05:07 #14
VoodooFX
Originally Posted by loninappleton

I'm giving the Whisper installed to SE another try. I set the model as small. I extracted a .WAV version of the audio in Audacity...

Don't extract audio, just feed video as is [or cut a shorter sample like pcspeak wrote].
Btw, some Whisper stuff doesn't work properly in v3.6.12; you need the latest SE beta, which is portable.

Originally Posted by loninappleton

What do you see in the progress bar at screen bottom when running audio to text in Subtitle Edit ?

There is no progress bar in SE for Whisper, if you want to see progress then you need to run Whisper in command-line.

Last edited by VoodooFX; 22nd Apr 2023 at 05:46.

22nd Apr 2023 05:37 #15
VoodooFX
Originally Posted by pcspeak

SE will extract the audio from the video in the format the program wants.

This is a bit of a problem if you use Faster-Whisper, as results when Faster-Whisper decodes the audio itself through its internal PyAV are significantly better [roughly a "medium vs large model" difference].
So, in CLI you can expect better results than in SE.

Last edited by VoodooFX; 22nd Apr 2023 at 05:57.

22nd Apr 2023 11:54 #16
loninappleton
I will let this go for a while. Too many things are unclear, and that just leads to frustration for me: things about GPU card, no GPU card, doesn't run on SE 3.6.12, etc. I will wait for the next regular release of SE and see what happens. But in conclusion, if the routine of audio to text in SE locks up everything, I see that as a major fix needed.

Thanks to all who have answered.

22nd Apr 2023 14:05 #17
VoodooFX
Originally Posted by loninappleton

I will wait for the next regular release of SE and see what happens.

Nothing will happen, you'll still be frustrated.

Originally Posted by loninappleton

I see that as a major fix needed.

This one is easy, get better CPU/GPU.

23rd Apr 2023 14:10 #18
loninappleton
I will review the guidelines given.

But on the original question on SE, in the General Settings, what would happen if I set everything to zero?
In other words I think that the Gen settings are adding time (space between subs as example) after I am manually adjusting the speech to audio. The corrections I make from a VOSK translate are line by line and detailed. I can't pinpoint it, but I think I am doing the same dialog lines repeatedly b/c of some changes made in a recalculation by the General Settings.

Who can say one way or the other if this is the case?
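To make the question concrete, here is a minimal Python sketch of how a "minimum gap" rule of this kind could behave. This is an assumption about the general technique, not a description of Subtitle Edit's actual code: the function name is made up, and cues are plain (start_ms, end_ms) pairs. In this sketch only the end of the earlier line is moved, and only when the gap to the next line is smaller than the minimum.

```python
def apply_min_gap(cues, min_gap_ms):
    """Return a new cue list in which each cue ends at least min_gap_ms
    before the next cue starts. Start times are never changed."""
    fixed = [list(cue) for cue in cues]
    for i in range(len(fixed) - 1):
        latest_end = fixed[i + 1][0] - min_gap_ms
        if fixed[i][1] > latest_end:
            # Pull the end time back, but never before the cue's own start.
            fixed[i][1] = max(latest_end, fixed[i][0])
    return [tuple(cue) for cue in fixed]
```

Under these assumptions a minimum of zero changes nothing unless two lines actually overlap, and a small minimum shortens the earlier line by a few milliseconds at most. A rule like this would not by itself explain a shift of several seconds.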

27th Apr 2023 22:09 #19
loninappleton
Originally Posted by pcspeak

Perhaps start with a shorter clip.
You mostly want to see if things work.

The following will do that.

Code:

ffmpeg.exe -i "MY Video.mp4" -ss 00:14:00 -to 00:17:00 -c copy -y "My Video-3min.mp4"

Change the video name to suit.
Change the start & stop times to suit. Just some video with a few lines of dialogue.

For this code I guess it's for Whisper Fast, is the code simply entered at CMD?

-----

Start SE - 'Video/Audio to text (Whisper)...
Locate and load the video. (DON'T point to an audio file, SE will extract the audio from the video in the format the program wants.)
Generate.

A 90 minute video on my PC takes about 4 hours using the 1.4 Gig Whisper model.

Cheers

I made this long quote as I am doing a test with SE Whisper.

OK, start the audio to text first....

at the new window Press ADD and go to the folder/file

Select the file

--> The clip is about 7 mins.

A brief glimpse of a load activity down by the (inactive) green progress bar.

I'll give it a few hours and see if anything completes.

This is done on a Ryzen 2400g in Windows 10 as required for Whisper on the small model. The audio should be clear English in a simple documentary format.

Is volume an issue on the PC? There's no speaker on that computer setup.

27th Apr 2023 23:07 #20
loninappleton
Success.

I have my short clip in SRT for the small model.

I'm doing a similar one from the same group in the medium model.

A bit of text info says it is unpacking the audio, nearly too fast to even see.

The 'transcribing text' message stays on screen after the routine runs, when a box comes up in the middle of the screen to say the piece has been saved.

28th Apr 2023 11:44 #21
loninappleton
an update.

After using the small model on a same size and type video as above I tried the medium size model. That failed to complete even running overnight. I am using the same video clip again on the smaller model to see if that completes.

[edit]

Time passes.

Once again the small model completes

Last edited by loninappleton; 28th Apr 2023 at 15:11.

2nd May 2023 14:35 #22
loninappleton
Originally Posted by VoodooFX

Originally Posted by loninappleton

at Github I do not recall seeing a forum layout that can be easily accessed.

"Issues" or "Discussions" are places to ask/post something on GitHub, look at this screenshot:

Originally Posted by loninappleton

Anything about command line executions I'll have to take step by step.

Check this guide on Youtube: https://www.youtube.com/watch?v=A3nwRCV-bTU

The link is 15 seconds long. I avoided it for the reasons stated earlier, but that is not the issue.
An alternative might be to use Powershell (which I know how to find and load), since the interface is different and may offer a better path to use code strings.

4th May 2023 00:26 #23
loninappleton
Originally Posted by loninappleton

Anything about command line executions I'll have to take step by step.

Check this guide on Youtube: https://www.youtube.com/watch?v=A3nwRCV-bTU

I see now that whatever came up on Yt as 15 secs was not the item you had linked to-- was next in line or something. I may view the longer one if I can do it on audio.

I have remembered enough CMD commands to put the Whisper-Faster folder in C:\ and navigated to it in CMD to check it.

I put a small sample m4v in root c:\ but not inside the Whisper folder.
