Hi everyone,
First of all, I am thrilled by how powerful Subtitle Edit is.
I am generating subtitles for TV shows with Whisper.
Engine: Purfview's Faster-Whisper-XXL
Model: large-v3 (3.1 GB)
It works very well (even though it takes 3x the duration of the video), but unfortunately I have the following problem:
All videos are from the same TV show, so there is no technical or contextual difference between them.
In most of the videos (not every video), subtitle generation only starts after several minutes.
Very often the first minutes of an episode have no subtitles.
Why does this happen?
Does anyone have a solution for this problem?
It is really annoying to create the subtitles for several minutes by hand.
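To find out which episodes are affected without opening each one, a small check on the generated subtitle files might help. This is only a rough sketch: it assumes the output is saved as SRT text, and the function names and the one-minute threshold are made up for illustration.

```python
import re
from datetime import timedelta

# Matches the start time of an SRT timing line, e.g. "00:03:12,400 -->".
CUE_TIMING = re.compile(r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})\s*-->")


def first_cue_start(srt_text):
    """Return the start time of the first cue, or None if no cue is found."""
    match = CUE_TIMING.search(srt_text)
    if match is None:
        return None
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds, milliseconds=millis)


def starts_late(srt_text, threshold=timedelta(minutes=1)):
    """True if the file has no cues at all or the first cue starts after the threshold."""
    start = first_cue_start(srt_text)
    return start is None or start > threshold
```

Reading each .srt file as text and passing it to starts_late() would flag the episodes whose first subtitle appears suspiciously late.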
-
-
I have been using the command-line version of Whisper AI (NOT through Subtitle Edit) for over two years and I have never had this problem, so it is possible that this is a bug in the Purfview version.
You can try lowering the accuracy by using the medium model and see if this helps. The large model possibly puts a heavier demand on resources.
I do sometimes get subtitles that are created too early and shown for too long, and I just adjust them with Subtitle Edit's Synchronization option so that they start when the audio actually starts. -
Hi, thank you for the reply.
Of course my notebook is old, and maybe this is the problem.
What do you mean by the subtitle starting too early? And what did you do about it?
As you don't use Subtitle Edit, how do you use Whisper? Maybe it is an alternative for my use case. -
If your notebook is old, then how did you manage to use the large model?
If you did the transcription using just the notebook CPU, then no wonder it took 3 times the duration of the video.
Usually the large model works better with a GPU that has at least 12GB of VRAM.
A 3-hour movie such as an opera is done in about 1 hour.
Something doesn't seem right here.
My new PC has a GPU with 8GB of VRAM, and if I use the large model the transcription crashes.
Anyway, if you want to install Whisper AI, here is the link:
https://github.com/openai/whisper
By starting too early I mean that the timestamp is created at 00:00:10.000 while the actual audio starts at 00:02:20.000. So I use Subtitle Edit's Synchronization feature to change the start time to 00:02:20.000, which means I add 00:02:10.000 to the first subtitle. Be careful to add this only to the first line and not to the rest of the subtitles. If you notice that more lines are out of sync, you can repeat the process. -
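The arithmetic for that correction can be sketched in a few lines. This is just an illustration of the offset calculation, not how Subtitle Edit does it internally; the helper names are hypothetical and timestamps are assumed to have exactly three millisecond digits.

```python
from datetime import timedelta


def parse_ts(ts):
    """Parse 'HH:MM:SS.mmm' (or SRT-style 'HH:MM:SS,mmm') into a timedelta."""
    hms, millis = ts.replace(",", ".").split(".")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds, milliseconds=int(millis))


def format_ts(delta):
    """Format a non-negative timedelta as 'HH:MM:SS.mmm'."""
    total_ms = delta // timedelta(milliseconds=1)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def shift_first_start(starts, actual_start):
    """Move only the first start time to actual_start; leave the others untouched.

    Returns the offset that was added and the new list of start times.
    """
    offset = parse_ts(actual_start) - parse_ts(starts[0])
    new_first = format_ts(parse_ts(starts[0]) + offset)
    return format_ts(offset), [new_first] + list(starts[1:])


# The second start time here is an invented example value.
offset, new_starts = shift_first_start(["00:00:10.000", "00:02:31.500"], "00:02:20.000")
print(offset)      # 00:02:10.000
print(new_starts)  # ['00:02:20.000', '00:02:31.500']
```

Only the first entry changes; every later start time is passed through as it was, which matches the warning about not shifting the rest of the file.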
Why am I using the large model? I thought it would be better at creating the subtitles. I thought the model size was about quality rather than performance.
So is the difference between large and small models only relevant for performance?
So maybe the beginning of the subtitles is missing because of performance? That is interesting.
Is there any way I can see which engine, e.g. Purfview, is the best for my videos?
The language is not English, so it's not that easy for it. -
The large model is considered the most accurate for transcribing audio, and if you don't have a GPU with 12GB of VRAM then I don't understand how you managed to transcribe with it. Try the medium model and see if that helps; you might also get a transcription time shorter than 3x.
What language is used in your videos? English gives the best accuracy, but you also get very good results with other languages.
If you are only interested in watching subtitles, then you might find Google Chrome's live captions helpful, as they can show subtitles "on the fly", but the text is shown as images so you can't save it. It also doesn't produce timestamps, just a flow of text, which is not the most accurate but can be helpful.
Check this link:
https://forum.videohelp.com/threads/418144-Tip-How-to-Transcribe-and-Translate-Your-Vi...ry#post2773267 -
Whenever I ran Large_v3 the program suggested I use v2. So that's what I run now.
-
So, these are the options for me.
The language is Turkish, so which one should I take?
"On the fly" is not an option.
Last edited by Holsteiner22; 1st May 2025 at 14:08.
-
4GB of VRAM should be enough for the large model with Faster-Whisper-XXL, but it depends on the GPU model.
Use "--vad_method pyannote_v3", or better "--ff_vocal_extract mb-roformer --vad_method pyannote_v3", and not in SE if you drop audio files there.
Last edited by VoodooFX; 5th May 2025 at 11:56.
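For anyone who wants to try those options outside Subtitle Edit, here is a rough sketch that only assembles the command line. The two VAD / vocal-extraction options are the ones quoted above; the executable name, the "--language" and "--model" options, the "large-v2" default and the file name are assumptions for illustration, so check them against the tool's own help output before relying on them.

```python
import shlex


def build_command(media_path, language="Turkish", model="large-v2",
                  vocal_extract=True, exe="faster-whisper-xxl"):
    """Assemble an argument list for the standalone transcriber (hypothetical helper)."""
    # exe, --language and --model are assumed here, not taken from the thread.
    cmd = [exe, media_path, "--language", language, "--model", model]
    if vocal_extract:
        # Optional vocal extraction step, as quoted in the post above.
        cmd += ["--ff_vocal_extract", "mb-roformer"]
    # Voice activity detection method, as quoted in the post above.
    cmd += ["--vad_method", "pyannote_v3"]
    return cmd


# Print the command for inspection instead of running it.
print(shlex.join(build_command("episode01.mkv")))
```

Passing the returned list to subprocess.run() would start the transcription; printing it first makes it easy to compare against what Subtitle Edit sends.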