Hello,
I use SubtitleEdit 4.0.13 (latest version to date) and I see that there are different profiles available for download regarding Whisper (Audio to Text).
For example:
-medium.en (1.42GB)
-medium.en_q5_0 (539MB)
-large-v3-turbo_q5_0 (547MB)
I would like to know which one gives the best recognition results (processing time does not matter to me).
I also see that there is a choice between different "engines":
-OpenAI
-Purfview's Faster-Whisper-XXL
-CPP (by default)
-CPP cuBLAS
-const-me
-stable-ts
-WhisperX
and I don't know which one I should choose (as I don't know what they mean).
This is important because with CPP (selected by default), I get the same sentences twice (on some films) in the generated subtitle file, so it costs me a lot of work to correct the subtitle file.
Thank you
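For the doubled sentences described above, a small clean-up script can save some manual correction. This is only a minimal sketch in Python, assuming the duplicates appear as consecutive cues with identical text in a standard .srt file; the function names (parse_srt, drop_repeated_cues, format_srt) are hypothetical and not part of Subtitle Edit or any Whisper engine.

```python
import re

def parse_srt(text):
    """Split SRT text into (timing, body) tuples, ignoring the cue numbers."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        # Skip the numeric index line if present.
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]
        if len(lines) >= 2 and "-->" in lines[0]:
            cues.append((lines[0].strip(), "\n".join(lines[1:]).strip()))
    return cues

def drop_repeated_cues(cues):
    """Merge a cue into the previous one when both carry the same text."""
    kept = []
    for timing, body in cues:
        if kept and kept[-1][1].casefold() == body.casefold():
            # Same sentence again: extend the previous cue to cover this one.
            start = kept[-1][0].split("-->")[0].strip()
            end = timing.split("-->")[1].strip()
            kept[-1] = (f"{start} --> {end}", kept[-1][1])
        else:
            kept.append((timing, body))
    return kept

def format_srt(cues):
    """Renumber the cues from 1 and join them back into SRT text."""
    return "\n\n".join(f"{i}\n{t}\n{b}" for i, (t, b) in enumerate(cues, 1)) + "\n"
```

Usage would be along the lines of format_srt(drop_repeated_cues(parse_srt(text))), where text is the content of the generated subtitle file. Note that it also merges lines that are legitimately spoken twice in a row, so the result still deserves a quick check.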
		
		- 
	Last edited by Nounours18200; 9th Sep 2025 at 08:47. 
- 
	I run faster-whisper-xxl in standalone mode. When I first started using it I ran it with Large V3. Strangely, that version gave me a message to the effect that I would get better results if I used Large V2. So that's all I use now. 
- 
	You get the best and most accurate results with the large model. 
 BUT you will need a GPU with at least 12GB of VRAM, otherwise it will crash.
 I have a PC with 8GB of VRAM and the best model I can use is medium.
 The link below gives you an idea of model size vs. GPU VRAM.
 See the "Available models and languages" section.
 
 I don't use SubtitleEdit for transcription. I use OpenAI (command line) because it has more flexible options to try to get better results.
 Tip: Always do transcription of the language and NOT translation to English, unless the audio is English.
 https://github.com/openai/whisper
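The model vs. VRAM rule of thumb can be written down as a small lookup. This is a minimal sketch in Python, assuming the approximate VRAM figures listed in the openai/whisper README (about 1GB for tiny and base, 2GB for small, 5GB for medium, 10GB for large); other engines such as faster-whisper may need less, and the function name largest_model_for is hypothetical.

```python
# Approximate VRAM needs in GB, taken from the openai/whisper README.
# Treat them as rough guidance, not hard limits.
APPROX_VRAM_GB = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large", 10),
]

def largest_model_for(vram_gb, english_only=False):
    """Return the largest model whose approximate need fits in vram_gb."""
    fitting = [name for name, need in APPROX_VRAM_GB if need <= vram_gb]
    if not fitting:
        return None
    name = fitting[-1]
    # English-only ".en" variants exist for tiny, base, small and medium,
    # but not for large.
    if english_only and name != "large":
        name += ".en"
    return name
```

With 8GB this picks medium (or medium.en for English audio), which matches the experience described above.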
- 
	For me, faster-whisper-xxl with the large-v3 model works well on an Nvidia RTX 2070 Super with 8GB VRAM. 
- 
	OK, I see that most of you use the Faster Whisper XXL engine: it is the one that I finally chose, just before opening this post. It works well, whereas CPP gave me a bullshit transcription with doubled lines... 
 
 On my main PC there is no problem (it is very powerful), but I faced a never-ending task when I launched this process on a virtual machine on my NAS.
 
 The RAM of the VM is OK (16GB), but the GPU on a NAS is more or less nothing, so your remarks explain why the process never terminates. I suppose that I will have to use a smaller model.
 
 I still have questions regarding:
 -"medium" and "* medium.en": they have the same size (1.5GB), so what is the difference?
 -some models have the "distil" label at the beginning of their name: what does it mean?
 
 Thank you very much,
- 
	The difference between the medium and medium.en models is that medium.en is preferable when the audio is in English. 
 You should try both and see which one is more accurate.
 As for distil I have no idea what it means.
- 
	My thoughts. This is the batch file I call when using: 
 https://github.com/Purfview/whisper-standalone-win/releases/tag/Faster-Whisper-XXL
 which-model.cmd
 I've put in comments to remind myself why I mostly use large-v2.
 Code:
 @echo off
 set which-model=1
 echo 1. large-v2 (Default) - the best overall. (slowest)
 rem Percent signs are doubled so that cmd prints them literally.
 echo 2. distil-large-v3.5 - 3-4 times the speed of large-v2. (95%%-98%% the accuracy of v2)
 echo 3. large-v3 - No
 echo 4. large-v3-turbo - No
 echo 5. medium.en - OK, --sentence parameter does not work too well.
 echo 6. small.en - almost as accurate as medium. --sentence parameter works well.
 echo 7. tiny.en - fast, fair accuracy. Good for short videos that are to be discarded.
 set /p which-model=Which model? (1,2,3,4,5,6,7)
 if %which-model% equ 1 set model=large-v2
 if %which-model% equ 2 set model=distil-large-v3.5
 if %which-model% equ 3 set model=large-v3
 if %which-model% equ 4 set model=large-v3-turbo
 if %which-model% equ 5 set model=medium.en
 if %which-model% equ 6 set model=small.en
 if %which-model% equ 7 set model=tiny.en
 echo.
 Just my choices. ymmv.
- 
	Thank you very much to all of you: I have most of my answers! 
 
 To date, it appears that the "Purfview's Whisper XXL" engine is the one that provides the best results by far, compared to the others.
 In particular, CPP (activated by default) generates doubled lines and is almost impossible to use.
 
 My usage is mainly on old black & white films, but CPP is also bad with more ordinary movies.
 
 Thank you all again.
- 
	@Nounours18200 
 You may like to try this batch file to create subtitles using 'faster-whisper-XXL' from the link I posted above.
 I've found the initial prompt helps - a lot.
 Subtitle Edit Beta is used, in batch file mode, to tidy up.
 https://github.com/SubtitleEdit/subtitleedit/releases
 (No install needed, just extract the content of the beta version to anywhere.)
 
 I tested on B/W movies downloaded from YouTube.
 
 The paths in BOLD will need to be changed.
 Code:
 @echo off
 setlocal enabledelayedexpansion
 chcp 65001 >nul
 set iprompt1=This audio is from a black and white movie, featuring classic dialogue style with clear enunciation and sometimes formal or vintage language.
 set iprompt2=Background noise may include film grain static or mono soundtrack typical of early cinema.
 "D:\Whisper-XXL\faster-whisper-xxl.exe" "c:\a\my movie.mp4" ^
  --model_dir "D:\Whisper-XXL\_models" --model large-v2 ^
  --initial_prompt "!iprompt1! !iprompt2!" --reprompt true --prompt_reset_on_temperature 0.5 ^
  --vad_filter true --vad_method pyannote_v3 --task transcribe --sentence --language en --best_of 8 --beam_size 8 --verbose true -o source
 "D:\SubtitleEditBeta\SubtitleEdit.exe" /convert "c:\a\my movie.srt" srt /SplitLongLines /SplitLongLines
 echo All done. Press any key to Exit. &pause>nul
 I've set up the batch file to make it simple to change the '--initial_prompt'. Tweak as you desire for each movie.
 
 I know a number of the whisper parameters are the defaults.
 Cheers.
 Last edited by pcspeak; 12th Sep 2025 at 03:23. Reason: Clarity 
- 
	Thank you pcspeak, but I don't understand why you suggest these modifications. I am sorry... 
- 
	@Nounours18200 
 OK. All good. 
 Other forum members who read this thread and use faster-whisper-XXL may find it of use.
 
 
 Cheers.
- 
	Thanks: faster-whisper-XXL gives good results on the old B&W films I work on. 
 
 Thank you.
- 
	I have been using Whisper, but the subs never looked good enough for my taste. 
 
 I just tried to use this:
 
 https://www.assemblyai.com/
 
 When you subscribe, the first 456h are free, which will last me a long time, since I find subs 99.9% of the time on https://www.opensubtitles.org/en/search/subs or https://www.addic7ed.com/shows.php.
 
 I needed to build a C project for that, and the results are very good, much better than with Whisper.