Can anyone say which General or master setting might interfere with subtitle timing, for example by introducing a minimum time between subs after a hand edit is made in Subtitle Edit? I've seen the General Settings screen; I just don't know much about what it says or does.
This is not the only curiosity I've found while fine-tuning subtitles made with VOSK voice-to-text, which usually need a lot of corrections in SE.
The other thing was that, unbeknownst to me, a 13-second error would crop up (always 13 seconds), which I had to readjust in a previous duration field. I try to be careful with the timings, so I don't know whether the program's various calculations are making adjustments for minimums and such. The General Settings default for minimum space between subs is pretty small.
For some of this I've learned to use the F2 subtitle-as-text option to visually inspect what's going on.
SE is just about all I know. I tried Aegisub long ago but doubt it would make these things easier, since SE is updated all the time.
If it's a data entry error, I don't know how to correct what I'm doing.
[edit] One thing that may need tuning is the frame rate: the default is set to 23.976, whereas this job needs 25 fps.
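As a rough illustration of why the frame rate setting can matter, here is a minimal Python sketch. It assumes the timings were derived from frame numbers at one rate and are then interpreted at another. The function name `rescale_ms` is made up for this example and is not part of Subtitle Edit.

```python
def rescale_ms(ms: int, from_fps: float, to_fps: float) -> int:
    """Return where a cue lands if its frame number is kept but the frame rate changes.

    A cue at `ms` milliseconds sits on frame ms/1000 * from_fps.
    Played at `to_fps`, that same frame appears at frame / to_fps seconds.
    """
    return round(ms * from_fps / to_fps)


# A cue at the five-minute mark, timed at 23.976 fps but played at 25 fps.
original_ms = 5 * 60 * 1000
shifted_ms = rescale_ms(original_ms, 23.976, 25.0)
drift_ms = original_ms - shifted_ms
print(shifted_ms, drift_ms)
```

In this sketch the shift grows with the position in the video, so a mismatch of this kind would show up as a drift rather than a fixed offset. Whether it has anything to do with the 13-second error described above is not something the sketch can show.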
-
Last edited by loninappleton; 15th Apr 2023 at 18:33. Reason: additional
-
I used SE 3.6.12 Portable.
https://github.com/SubtitleEdit/subtitleedit/releases/download/3.6.12/SE3612.zip
I unpacked it to a folder on my second drive.
D:\SE3612\
My suggestion to you for a better initial result:
Try creating your subtitles using 'Video/Audio to text (Whisper)'
It gave me a far better result than Vosk on a test video, with British accents.
I downloaded the Whisper 'medium.en' model which is 1.42 GB
The 'base.en' model was not nearly as effective, go for the medium model.
I've not tried the 'large' model, which is 2.88 GB.
One downloads Whisper models using the same method as used with Vosk.
Here is an example from one line in the video, spoken with the speaker's head turned aside.
It's a barman asking a question of a patron: "Same again, Jack?"
The question mark is missed by both methods.
The Whisper model did a very good job.
The Vosk model was off with timing and 'Count of events'. Line breaks could also have been better.
WHISPER (medium.en)
31
00:01:33,001 --> 00:01:37,000
Same again Jack.

32
00:01:37,001 --> 00:01:41,000
Yeah we need to escalate our plans.

VOSK
37
00:01:35,520 --> 00:01:36,520
Swimming anger.

38
00:01:39,030 --> 00:01:40,800
Yeah we need to
escalate our plans.
I got the same result with two Vosk models.
ymmv
Cheers. -
It's interesting to hear this is a VOSK problem.
I've attempted to use Whisper several times. It depends, I guess, on the install method or some such. I've even asked SE about it directly. What happens is that I get the install done so Whisper shows up in the SE submenu, then in execution it just loops. It's a known issue that hasn't been sorted out by Whisper, GitHub, or however development works. This happens on the old gear I have and on a current (relatively speaking) Ryzen 2400G as well.
I'm accepting the rough nature of VOSK. What I'm asking is why SE makes what I see as strange changes to a completed duration field 'after the fact', so to speak, once I've done the line and moved on. It's something I have to inspect and alter, not just for one line but again and again as the corrections continue, backtracking visually to see what happened. It could be some error I'm making repeatedly, but if so I cannot catch it. -
Whisper bugs are not SE bugs. Maybe you are using a very small model. Which Whisper version do you use?
Try faster-whisper from here: https://github.com/Purfview/whisper-standalone-win
VOSK is no match for Whisper. -
Yes I fully agree it is a Whisper problem. Pretty sure I put in the Medium 1.8Gb version of the model.
If your standalone version at GitHub can be handled simply, I'd certainly try it. What I found previously with instructions for making and using a from-scratch version was that it was beyond my grasp. No YouTube tutorials please. I've mentioned here often that I cannot follow the flea-sized pointers in those. Perhaps a guide could be made in screenshot format, without GitHub Desktop and the rest of it showing up. It would certainly be of value. -
-
Repos and such I recall from practicing and failing with Linux.
I will ask while I'm on this: why does GitHub seem inscrutable to the casual user? No doubt it makes sense to the regulars, but when you say to ask at GitHub, I do not recall seeing a forum layout that can be easily accessed.
I'm willing to do things that are carefully explained. What I can't stand is seeing one error message after another saying 'everything you know is wrong, permission denied.' What is needed (and perhaps the GitHub tool has it) is something that auto-installs everything needed for a finished and error-checked executable, preferably with drag and drop. That's not too much to ask any more. -
I am on the trail of Faster-Whisper. I will install it on Win10, which I set up primarily to run Whisper in Subtitle Edit. As we've said, that just isn't working on a Ryzen 5 2400G using an Asus Prime A320M-K and the onboard VGA. Are there any troubleshooting checks to do with that?
Anything about command-line executions I'll have to take step by step.
Perhaps you can use that "code" format with something to cut and paste.
Thanks for the replies on this. Once established, it could be used by others like me who want the function but don't know a path to get there. -
"Issues" or "Discussions" are places to ask/post something on GitHub, look at this screenshot:
Check this guide on Youtube: https://www.youtube.com/watch?v=A3nwRCV-bTU -
Oh no, no dreaded Youtube guides for me.
So far, I have located Whisper-Fast, downloaded over a gigabyte of it to Win10 on a Ryzen PC, unzipped it and ran install.
I don't see a clear direction from there: no .exe files that are easily identifiable, but loads of DLLs. So if there's a command-line routine, I had best be sure to have the Whisper model and whatever else installed for this. I don't know how to proceed. In truth I'd much rather use the auto installs in Subtitle Edit.
Is there some core logic problem, such that Whisper runs for others on Intel but not on AMD? I just don't get it. -
I'm giving the Whisper installed in SE another try. I set the model to small. I extracted a .WAV version of the audio in Audacity, since I understand that format has the most detail and is used in standalone setups. Volume is set all the way up but I have no speaker attached; whatever takes place is internal.
What I see at bottom of the form is a solid green 'progress bar' all the way to the limit. In other words, this Whisper install is not showing any progress or how long it might take. The WAV I have is 1 hr 34 mins b/c it's a piece I want to use. I see a small amount of disk activity on the little SSD I have on this build.
What do you see in the progress bar at screen bottom when running audio to text in Subtitle Edit ? A screen shot would help. I can make one if needed but it doesn't show anything.
[edit] I shut it down. It did not close properly either: the screen froze and I had to restart the PC.
Last edited by loninappleton; 22nd Apr 2023 at 01:38.
-
Perhaps start with a shorter clip.
You mostly want to see if things work.
The following will do that.
Code:
ffmpeg.exe -i "My Video.mp4" -ss 00:14:00 -to 00:17:00 -c copy -y "My Video-3min.mp4"
Change the start & stop times to suit. Just some video with a few lines of dialogue.
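For anyone who would rather not retype the command for each test clip, the same cut can be scripted. This is a minimal Python sketch that only rebuilds the ffmpeg command given above; the helper names `build_cut_command` and `run_cut` are made up for the example, and it assumes ffmpeg is on the PATH.

```python
import subprocess


def build_cut_command(src, dst, start, end):
    """Build the ffmpeg argument list for a stream-copy cut from `start` to `end`."""
    return [
        "ffmpeg",
        "-i", src,        # input video
        "-ss", start,     # start time, HH:MM:SS
        "-to", end,       # stop time, HH:MM:SS
        "-c", "copy",     # copy the streams, no re-encoding
        "-y",             # overwrite the output file without asking
        dst,
    ]


def run_cut(src, dst, start, end):
    """Run the cut; raises CalledProcessError if ffmpeg reports a failure."""
    subprocess.run(build_cut_command(src, dst, start, end), check=True)


cmd = build_cut_command("My Video.mp4", "My Video-3min.mp4", "00:14:00", "00:17:00")
print(cmd)
```

Passing the arguments as a list means file names with spaces need no extra quoting. Because the streams are copied rather than re-encoded, the cut may land on the nearest keyframe instead of the exact second, which is fine for a quick test clip.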
Start SE - 'Video/Audio to text (Whisper)...'
Locate and load the video. (DON'T point to an audio file, SE will extract the audio from the video in the format the program wants.)
Generate.
A 90 minute video on my PC takes about 4 hours using the 1.4 Gig Whisper model.
Cheers -
If you don't have a dedicated GPU, then the smaller CPU-only release is enough for you.
There is no install to run; it's ready to use after unzipping.
Then what have you "installed" if you don't see an .exe file?
For OpenAI-Whisper and Faster-Whisper there are no "auto installs" in Subtitle Edit.
Why do you refuse to watch a YouTube guide? Google for a non-YouTube guide then.
Last edited by VoodooFX; 22nd Apr 2023 at 05:08.
-
Don't extract audio, just feed in the video as is [or cut a shorter sample like pcspeak wrote].
Btw, some Whisper stuff doesn't work properly in v3.6.12; you need the latest SE beta, which is portable.
There is no progress bar in SE for Whisper; if you want to see progress, you need to run Whisper from the command line.
Last edited by VoodooFX; 22nd Apr 2023 at 05:46.
-
Last edited by VoodooFX; 22nd Apr 2023 at 05:57.
-
I will let this go for a while. Too many things are unclear, and that just leads to frustration for me: things about GPU card or no GPU card, not running on SE 3.6.12, etc. I will wait for the next regular release of SE and see what happens. But in conclusion, if the audio-to-text routine in SE locks up everything, I see that as needing a major fix.
Thanks to all who have answered. -
-
I will review the guidelines given.
But on the original question on SE, in the General Settings, what would happen if I set everything to zero?
In other words, I think that the General Settings are adding time (space between subs, for example) after I manually adjust the subtitles to the audio. The corrections I make to a VOSK transcription are line by line and detailed. I can't pinpoint it, but I think I am redoing the same dialog lines repeatedly because of changes made in a recalculation by the General Settings.
Who can say one way or the other if this is the case? -
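One way to narrow this down outside the program is to save the SRT before and after an editing session and list every pair of neighbouring lines whose gap falls below some minimum. The following is a minimal Python sketch of such a check. It does not reproduce Subtitle Edit's own calculations; the function names are made up for the example, and the 24 ms threshold is only an assumed value to be replaced with whatever the General Settings screen shows.

```python
import re

# One SRT timing line: "HH:MM:SS,mmm --> HH:MM:SS,mmm"
STAMP = r"(\d+):(\d{2}):(\d{2})[,.](\d{3})"
CUE_LINE = re.compile(STAMP + r"\s*-->\s*" + STAMP)


def cue_times(srt_text):
    """Return a list of (start_ms, end_ms) for every timing line found."""
    cues = []
    for match in CUE_LINE.finditer(srt_text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
        start = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
        end = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
        cues.append((start, end))
    return cues


def small_gaps(srt_text, min_gap_ms):
    """Return (position, gap_ms) for each cue that starts too soon after the previous one.

    `position` is the 1-based position of the cue in the file, not its SRT number.
    A negative gap means the two cues overlap.
    """
    cues = cue_times(srt_text)
    found = []
    for i in range(1, len(cues)):
        gap = cues[i][0] - cues[i - 1][1]
        if gap < min_gap_ms:
            found.append((i + 1, gap))
    return found


# The two Whisper lines quoted earlier in this thread.
SAMPLE = """31
00:01:33,001 --> 00:01:37,000
Same again Jack.

32
00:01:37,001 --> 00:01:41,000
Yeah we need to escalate our plans.
"""

print(small_gaps(SAMPLE, 24))  # 24 ms is an assumed threshold
```

Running the check on the file before and after hand edits, and comparing the two lists, should show whether any timings changed on lines that were not touched.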
I made this long quote as I am doing a test with SE Whisper.
OK, start the audio to text first....
at the new window Press ADD and go to the folder/file
Select the file
--> The clip is about 7 mins.
A brief glimpse of a load activity down by the (inactive) green progress bar.
I'll give it a few hours and see if anything completes.
This is done on a Ryzen 2400g in Windows 10 as required for Whisper on the small model. The audio should be clear English in a simple documentary format.
Is volume an issue on the PC? There's no speaker on that computer setup. -
Success.
I have my short clip in SRT for the small model.
I'm doing a similar one from the same group in the medium model.
A bit of text info says it is unpacking the audio, nearly too fast to even see.
The 'transcribing text' message stays on screen after the routine runs, and a box comes up in the middle of the screen to say the piece has been saved. -
An update.
After using the small model on a video of the same size and type as above, I tried the medium model. That failed to complete, even running overnight. I am using the same video clip again with the small model to see if that completes.
[edit]
Time passes.
Once again the small model completes.
Last edited by loninappleton; 28th Apr 2023 at 15:11.
-
-
Quote: Check this guide on Youtube: https://www.youtube.com/watch?v=A3nwRCV-bTU
I see now that whatever came up on YouTube as 15 secs was not the item you had linked to; it was next in line or something. I may view the longer one if I can do it on audio.
I have remembered enough CMD commands to put the Whisper-Faster folder in C:\
and navigated to it in CMD to check it.
I put a small sample m4v in the root of C:\ but not inside the Whisper folder.