This was all tested on Windows 10 x64.
I have added subtitles to approximately 20 videos using this method.
The brand of video card and the amount of memory on it are important.
My nVidia card with 4 Gig of memory works with the medium model.
For the large model I have to fall back to the computer's cpu (--device=cpu).
If your computer only has 4 Gig of memory, then whisper will probably fail.
Test each option. The medium model using the computer's cpu should work on most newer computers.
Download this and extract it to an empty folder. (e.g. d:\whisp\whisper-faster.exe)
https://github.com/Purfview/whisper-standalone-win/releases/download/faster-whisper/Wh...ter_r145.3.zip
Download this and extract it to the same folder.
https://github.com/Purfview/whisper-standalone-win/releases/download/libs/cuBLAS.and.cuDNN.7z
Add a copy of ffmpeg.exe to the same folder.
https://www.gyan.dev/ffmpeg/builds/ffmpeg-git-essentials.7z (x64 only)
If the required model DOES NOT EXIST yet, whisper-faster will download it and place it in the correct sub-folder.
The default in the following batch file is for the tiny model. You just want to get it to work. Worry about larger models later.
Wait for the download of the model to finish. The transcription will start automatically after the download.
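A quick way to sanity-check the folder before the first run is sketched below in Python. It assumes the layout described above (whisper-faster.exe and ffmpeg.exe side by side) and that models end up in a `_models\faster-whisper-<name>` sub-folder, which is the path that appears in the verbose log further down this thread. The function names are made up for illustration.

```python
from pathlib import Path

# File names taken from the setup steps above.
REQUIRED_FILES = ("whisper-faster.exe", "ffmpeg.exe")


def missing_files(folder):
    """Return the required files that are not present in `folder`."""
    root = Path(folder)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def model_is_downloaded(folder, model):
    """True if the model sub-folder already exists.

    Assumed layout: <folder>/_models/faster-whisper-<model>
    """
    return (Path(folder) / "_models" / f"faster-whisper-{model}").is_dir()
```

If `model_is_downloaded(r"d:\whisp", "tiny")` returns False, expect the first run to start with a download.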
I put the accuracy at 90-95 percent.
Code:
:1.
:: WITH an 8Gig nVidia video card using onboard gpu.
:: whisper-faster.exe "my movie.mkv" --language=English --model=large-v2 --output_format srt
:2.
:: WITHOUT an 8Gig nVidia video card. (runs on system memory. slower than GPU)
:: whisper-faster.exe "my movie.mkv" --device=cpu --language=English --model=large-v2 --output_format srt
:3.
:: WITH a 4Gig nVidia video card.
:: whisper-faster.exe "my movie.mkv" --language=English --model=medium --output_format srt
:4.
:: WITHOUT a 4Gig nVidia video card. (runs on system memory. slower than GPU)
:: whisper-faster.exe "my movie.mkv" --device=cpu --language=English --model=medium --output_format srt
:5.
:: The tiny model to get things working.
whisper-faster.exe "my movie.mkv" --language=English --model=tiny --output_format srt
:6.
:: The tiny model using the computer's cpu to get things working. Try this if the one above (:5.) fails.
::whisper-faster.exe "my movie.mkv" --device=cpu --language=English --model=tiny --output_format srt
pause
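The choice between the numbered options can also be written down as a small Python helper. This is only a sketch: the 8 Gig and 4 Gig cut-offs are the rules of thumb from the batch file, not official requirements, and `build_command` is a hypothetical name. Only parameters that already appear in the batch file are used.

```python
def build_command(video, vram_gb=0, nvidia=False):
    """Build a whisper-faster command line from the rules of thumb above."""
    if nvidia and vram_gb >= 8:
        model, device = "large-v2", None   # option 1
    elif nvidia and vram_gb >= 4:
        model, device = "medium", None     # option 3
    else:
        model, device = "medium", "cpu"    # option 4: no suitable card
    cmd = ["whisper-faster.exe", video]
    if device:
        cmd.append(f"--device={device}")
    cmd += ["--language=English", f"--model={model}", "--output_format", "srt"]
    return cmd


print(build_command("my movie.mkv", vram_gb=4, nvidia=True))
```

The returned list can be handed to `subprocess.run` from inside the whisper-faster folder.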
I feed each output srt file to Subtitle Edit.
Tools/Fix common errors
then
Tools/Break/Split long lines
then
Spell check.
The saved result is close enough for me. The Start timing marks can be off. I don't care.
The whole process, once I got it working, is excellent.
Cheers.
-
Last edited by pcspeak; 18th Aug 2023 at 19:44. Reason: Punctuation and clarity
-
You can fit the large model in 4GB VRAM with "-ct=int8".
It doesn't use ffmpeg in any way.
You don't need to specify it because it's the default setting.
You can check if "-bs=5" improves accuracy.
Last edited by VoodooFX; 20th Aug 2023 at 14:36.
-
You can fit large model in 4GB VRAM with "-ct=int8".
It doesn't use ffmpeg in any way.
https://superuser.com/questions/1778870/how-do-i-use-ffmpeg-and-openai-whisper-to-tran...-a-rtmp-stream
Earlier renditions of OpenAI's Whisper needed ffmpeg.
I'm a belt and braces type. Having ffmpeg in the folder or in my %Path% doesn't break anything.
--output_format srt
OK, I'll test that.
Cheers!
-
@VoodooFX
On my machine (see my profile) with the nVidia 4 Gig card.
44 minute video.
Large model GPU did not work - out of memory
Medium model GPU - 10 minutes
Medium model CPU - 27 minutes
With your recommended parameters:
Large model GPU - 8.4 minutes (Winner!)
Code:
"D:\whisperf\whisper-faster.exe" "D:\a\my movie.mkv" --language=English --model=large-v2 -ct=int8 -bs=5 --output_dir "%%~dpa\" --output_format srt
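To put the timings above on one scale, they can be turned into a "times real time" figure: minutes of video transcribed per minute of waiting. A small Python sketch using only the numbers from this post:

```python
VIDEO_MINUTES = 44  # length of the test video

# Wall-clock minutes reported above.
RUNS = {
    "medium, GPU": 10,
    "medium, CPU": 27,
    "large-v2, GPU, -ct=int8 -bs=5": 8.4,
}


def realtime_factor(video_minutes, run_minutes):
    """Minutes of video transcribed per minute of processing."""
    return video_minutes / run_minutes


for label, minutes in RUNS.items():
    print(f"{label}: {realtime_factor(VIDEO_MINUTES, minutes):.1f}x real time")
```

That works out to about 4.4x for medium on the GPU, 1.6x for medium on the CPU, and 5.2x for the large model with the recommended parameters.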
Cheers.
-
@VoodooFX - Short or long names for parameters? I stopped using the shorter abbreviation, when there was a choice, some time ago.
When I checked whisper-faster.exe --help at a command prompt, neither of the shortened parameters showed.
You and I know what the abbreviated parameters mean. Other VideoHelp members may not.
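For anyone who wants to spell the abbreviations out, here is a minimal Python sketch that rewrites the two short names used in this thread into their long forms. The mapping only covers the pairs given in this post (`-ct` and `-bs`); `expand` is a hypothetical helper, not part of whisper-faster.

```python
# Short-to-long parameter names as spelled out in this thread.
LONG_NAMES = {
    "-ct": "--compute_type",
    "-bs": "--beam_size",
}


def expand(arg):
    """Rewrite e.g. '-bs=5' as '--beam_size=5'; leave anything else alone."""
    name, sep, value = arg.partition("=")
    return LONG_NAMES.get(name, name) + sep + value


print([expand(a) for a in ["-ct=int8", "-bs=5", "--model=large-v2"]])
```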
I got around to checking the transcription accuracy on my 44 minute video.
--compute_type int8 (-ct=int8) gives me an empty srt file. I tried many options but could not get the large model to give me an srt file that was NOT empty. I'm staying with the medium.en model for now.
--beam_size 5 (-bs=5) gives a definite improvement in accuracy, but took 50% longer to process. The video I'm using has Welsh, French and English words.
In the first 5 minutes of the video I found 4 occasions where the output srt, using --beam_size 5, was more accurate. beam_size 5 is staying in my batch file(s).
My next run will be on 5 videos in D:\a\ to get a more accurate reading of the times taken, and of just how good the transcription is.
This is the batch file I'm using:
Code:
@echo off
for %%a in ("d:\a\*.mkv") do if exist "%%~dpna.srt" (
  echo "%%a" - srt file already exists.
) else (
  echo "%%a" && "D:\whisperf\whisper-faster.exe" "%%a" --language=English --model=medium.en --beam_size 5 --output_dir "%%~dpa\" --output_format srt
)
echo All done. Press any key to Exit. &pause>nul
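The same "skip it if the srt already exists" logic, sketched in Python for anyone who prefers that to batch syntax. It is meant as a rough equivalent, not a replacement: the exe path and parameters are copied from the batch file, and `pending_jobs` is a hypothetical name.

```python
from pathlib import Path

WHISPER = r"D:\whisperf\whisper-faster.exe"  # path used in the batch file


def pending_jobs(folder):
    """Yield one command per .mkv in `folder` that has no .srt next to it."""
    for video in sorted(Path(folder).glob("*.mkv")):
        if video.with_suffix(".srt").exists():
            continue  # same check as: if exist "%%~dpna.srt"
        yield [
            WHISPER, str(video),
            "--language=English", "--model=medium.en",
            "--beam_size", "5",
            "--output_dir", str(video.parent),
            "--output_format", "srt",
        ]
```

Each yielded list can be passed to `subprocess.run(cmd, check=True)` to run the files one after another.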
Cheers.
Last edited by pcspeak; 22nd Aug 2023 at 16:32.
-
It shows all of them, short and long ones.
Strange, what do you see in the console at the end?
I tested a whole movie [English] and counted all the better/worse occasions. It was ~fifty-fifty with beam 5 vs 1, so it didn't make the subs more accurate, and it made the run slower.
Your batch doesn't do anything that you can't do with whisper-faster.exe alone.
It shows all of them, short and long ones.
Your batch doesn't do anything that you can't do with whisper-faster.exe alone.
Strange, what do you see in the console at the end?
Code:
D:\WhisperF>whisper-faster-Large-GPU v04.cmd
Standalone Faster-Whisper r145.3 running on: CUDA
Starting transcription on: d:\a\my movie.mkv
Transcription speed: 5.78 audio seconds/s
Subtitles are written to 'd:\a' directory.
Operation finished in: 499 seconds
Press any key to Exit.
The output folder.
Code:
D:\a>dir
 Volume in drive D is D
 Volume Serial Number is 00CE-8685
 Directory of D:\a
23/08/2023  09:34 AM    <DIR>          .
23/08/2023  09:34 AM    <DIR>          ..
11/04/2023  06:31 AM       223,042,765 my movie.mkv
23/08/2023  09:34 AM                 0 my movie.srt
               2 File(s)    223,042,765 bytes
               2 Dir(s)  72,641,765,376 bytes free
D:\a>
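A run can report "Operation finished" and still leave a 0 byte srt behind, as the listing shows. A short Python sketch for spotting those after a batch run (`empty_subtitles` is a made-up name):

```python
from pathlib import Path


def empty_subtitles(folder):
    """Return the .srt files in `folder` that are 0 bytes long."""
    return sorted(p for p in Path(folder).glob("*.srt") if p.stat().st_size == 0)


if __name__ == "__main__":
    for srt in empty_subtitles(r"D:\a"):
        print(f"{srt.name} is empty")
```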
-
Run it normally, without the cmd file, and check what it writes with "--verbose=true" and "-f=all".
-
I killed the transcription. Ctrl+C.
Code:
D:\WhisperF>whisper-faster.exe "d:\a\my movie.mkv" --language=English --model=large-v2 --compute_type=int8 --beam_size 5 --verbose=true --output_dir "D:\a\" --output_format all
Standalone Faster-Whisper r145.3 running on: CUDA
Number of visible GPU devices: 1
Supported compute types by GPU: {'int8', 'float16', 'int8_float32', 'float32', 'int8_float16'}
[2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info] CPU: GenuineIntel (SSE4.1=true, AVX=true, AVX2=true, AVX512=false)
[2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info] - Selected ISA: AVX2
[2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info] - Use Intel MKL: true
[2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info] - SGEMM backend: MKL (packed: false)
[2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info] - GEMM_S16 backend: MKL (packed: false)
[2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info] - GEMM_S8 backend: MKL (packed: false, u8s8 preferred: true)
[2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info] GPU #0: NVIDIA GeForce GTX 1650 SUPER (CC=7.5)
[2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info] - Allow INT8: true
[2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info] - Allow FP16: true (with Tensor Cores: true)
[2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info] - Allow BF16: false
[2023-08-23 10:40:51.485] [ctranslate2] [thread 2040] [info] Using CUDA allocator: cuda_malloc_async
[2023-08-23 10:40:51.978] [ctranslate2] [thread 2040] [info] Loaded model D:\WhisperF\_models\faster-whisper-large-v2 on device cuda:0
[2023-08-23 10:40:51.978] [ctranslate2] [thread 2040] [info] - Binary version: 6
[2023-08-23 10:40:51.979] [ctranslate2] [thread 2040] [info] - Model specification revision: 3
[2023-08-23 10:40:51.979] [ctranslate2] [thread 2040] [info] - Selected compute type: int8_float16
Model loaded in: 18.83 seconds
Starting transcription on: d:\a\my movie.mkv
Processing audio with duration 44:00.085
VAD filter removed 00:49.825 of audio
VAD filter kept the following audio segments: [00:00.000 -> 00:56.292], [00:58.716 -> 01:41.412], [01:43.836 -> 04:52.548], [04:54.204 -> 07:46.692], [07:48.156 -> 10:38.052], [10:40.380 -> 11:55.140], [11:56.796 -> 16:09.828], [16:13.020 -> 16:16.164], [16:19.644 -> 18:30.852], [18:32.604 -> 20:04.164], [20:05.724 -> 20:45.828], [20:49.404 -> 22:37.572], [22:40.188 -> 25:46.596], [25:48.252 -> 26:17.220], [26:22.428 -> 38:51.684], [38:54.876 -> 39:05.508], [39:08.316 -> 42:06.468], [42:08.220 -> 43:19.524], [43:22.428 -> 43:55.908]
Audio processing finished in: 22.68 seconds
Processing segment at 00:00.000
[2023-08-23 10:41:16.509] [ctranslate2] [thread 5376] [info] Loaded cuBLAS library version 11.8.1
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 00:29.000
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 00:58.000
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 01:27.000
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 01:56.000
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 02:25.000
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 02:54.000
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 03:23.000
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 03:52.000
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 04:21.000
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 04:50.000
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 05:19.000
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 05:48.000
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 06:17.000
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 06:46.000
* Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
Processing segment at 07:15.000
Traceback (most recent call last):
  File "D:\whisper-fast\__main__.py", line 657, in <module>
  File "D:\whisper-fast\__main__.py", line 605, in cli
  File "faster_whisper\transcribe.py", line 931, in restore_speech_timestamps
  File "faster_whisper\transcribe.py", line 415, in generate_segments
  File "faster_whisper\transcribe.py", line 651, in generate_with_fallback
KeyboardInterrupt
[5388] Failed to execute script '__main__' due to unhandled exception!
D:\WhisperF>
Code:
Processing segment at 00:00.000
[2023-08-23 11:12:07.354] [ctranslate2] [thread 10732] [info] Loaded cuBLAS library version 11.8.1
* Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
* No speech threshold is met (0.633789 > 0.600000)
Processing segment at 00:30.000
* Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
* No speech threshold is met (0.633789 > 0.600000)
Processing segment at 01:00.000
* Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
* No speech threshold is met (0.633789 > 0.600000)
Processing segment at 01:30.000
* Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
* No speech threshold is met (0.633789 > 0.600000)
Processing segment at 02:00.000
* Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
* No speech threshold is met (0.633789 > 0.600000)
Processing segment at 02:30.000
* Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
* No speech threshold is met (0.633789 > 0.600000)
Processing segment at 03:00.000
* Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
* No speech threshold is met (0.633789 > 0.600000)
Processing segment at 03:30.000
* Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
* No speech threshold is met (0.633789 > 0.600000)
Processing segment at 04:00.000
* Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
* No speech threshold is met (0.633789 > 0.600000)
Processing segment at 04:30.000
Traceback (most recent call last):
  File "D:\whisper-fast\__main__.py", line 657, in <module>
  File "D:\whisper-fast\__main__.py", line 605, in cli
  File "faster_whisper\transcribe.py", line 931, in restore_speech_timestamps
  File "faster_whisper\transcribe.py", line 408, in generate_segments
  File "faster_whisper\transcribe.py", line 620, in encode
KeyboardInterrupt
[10968] Failed to execute script '__main__' due to unhandled exception!
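The three thresholds printed in these logs (2.4, -1.0 and 0.6) can be read as a simple per-segment decision. The Python sketch below is one interpretation of how Whisper-style decoders generally use them; it is not taken from the whisper-faster source, so treat the branch order and the `classify` name as assumptions. It would at least be consistent with an empty srt: a segment flagged as both "no speech" and low confidence produces no text.

```python
# Thresholds as printed in the log lines above.
COMPRESSION_RATIO_MAX = 2.4
LOG_PROB_MIN = -1.0
NO_SPEECH_MAX = 0.6


def classify(compression_ratio, avg_logprob, no_speech_prob):
    """Rough Whisper-style decision for one decoded segment (assumed logic)."""
    low_confidence = avg_logprob < LOG_PROB_MIN
    if no_speech_prob > NO_SPEECH_MAX and low_confidence:
        return "skip"    # treated as silence, nothing is written
    if compression_ratio > COMPRESSION_RATIO_MAX or low_confidence:
        return "retry"   # decode again at a higher temperature (fallback)
    return "keep"


# Values from the second log: low log probability plus "no speech" met.
print(classify(1.0, -1.928111, 0.633789))
```

The compression ratio of 1.0 in the example call is a placeholder, since the second log does not print one.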
I'm now officially out of my comfort zone. But most happy to learn.
Cheers.
Try with "int8_float32".
-
Congratulations! That's working. I'll run it through to the end and get back to you.
Code:
Starting transcription on: d:\a\my movie.mkv
Processing audio with duration 44:00.085
VAD filter removed 00:49.825 of audio
VAD filter kept the following audio segments: [00:00.000 -> 00:56.292], [00:58.716 -> 01:41.412], [01:43.836 -> 04:52.548], [04:54.204 -> 07:46.692], [07:48.156 -> 10:38.052], [10:40.380 -> 11:55.140], [11:56.796 -> 16:09.828], [16:13.020 -> 16:16.164], [16:19.644 -> 18:30.852], [18:32.604 -> 20:04.164], [20:05.724 -> 20:45.828], [20:49.404 -> 22:37.572], [22:40.188 -> 25:46.596], [25:48.252 -> 26:17.220], [26:22.428 -> 38:51.684], [38:54.876 -> 39:05.508], [39:08.316 -> 42:06.468], [42:08.220 -> 43:19.524], [43:22.428 -> 43:55.908]
Audio processing finished in: 22.6 seconds
Processing segment at 00:00.000
[2023-08-23 11:48:06.036] [ctranslate2] [thread 5980] [info] Loaded cuBLAS library version 11.8.1
[00:00.000 --> 00:08.880] On salvage hunters best buys. Cheers. Drew looks back at his all-time favorite buys from Northern Europe. Oh
[00:08.880 --> 00:09.960] My god, look
[00:11.520 --> 00:18.220] Searching for rare continental pieces in France. He plays on the language barrier to get a steal of a deal
Processing segment at 00:18.220
[00:19.160 --> 00:20.700] Wait, thank you
[00:27.400 --> 00:33.400] In Belgium as an exclusive antiques Emporium he's spoiled for choice at a jaw-dropping collection
[00:34.620 --> 00:38.060] Wow the death mask Napoleon is it Wow
[00:39.420 --> 00:43.380] In Amsterdam, he's enchanted by an ancient Greek poetess
[00:43.980 --> 00:47.460] This is stunning. Look at that. What a thing
Processing segment at 00:48.220
[00:48.220 --> 00:54.680] These are Drew's favorite Northern European hunting grounds. Oh, yeah. Now you're talking about that. Yeah a grand bazaar
[00:59.080 --> 01:02.540] Drew Pritchard is one of Britain's leading decorative salvage dealers
[01:03.080 --> 01:07.360] Stop here for quality and fun in his hunt for weird and wonderful objects
[01:08.200 --> 01:11.120] What's a fabulous thing? That's something I've never seen before
[01:11.680 --> 01:14.000] He scoured the country and the continent
Processing segment at 01:11.580
[01:14.520 --> 01:21.480] Merci, merci, that's got on salvage hunters best buys. He takes us inside his most remarkable deals
[01:22.080 --> 01:22.960] 1550 what I do
[01:23.700 --> 01:25.260] He's not that's cracking
[01:26.020 --> 01:28.820] revealing his favorite purchases Wow
[01:28.820 --> 01:36.340] Seriously impressive places. Oh my word. I just don't know what to look first and people you can buy one piece out there
[01:36.340 --> 01:38.360] But I'm gonna charge a lot of money for it
Processing segment at 01:35.940
[01:44.150 --> 01:44.750] Oh
[01:44.750 --> 01:47.670] Where's 30-year career in the antique and salvage trade
[01:48.350 --> 01:50.010] 450 euros. Yes, so
[01:51.210 --> 01:56.290] That's nice drew has traveled far and wide across the continent. Hello
[01:57.110 --> 01:57.490] incredible
[01:58.650 --> 02:04.910] To bring the best European decorative antiques back to the UK. I try not to hit the Octotrion
[02:06.590 --> 02:09.510] Yeah, let's see 800
Processing segment at 02:04.660
[02:10.310 --> 02:10.930] Let's see
[02:11.530 --> 02:17.110] Traveling around Europe buying antiques. Yes. It is as good as it sounds it really is
[02:17.110 --> 02:19.930] For me going there. It's like a melting pot
[02:19.930 --> 02:24.290] I'm really never know what I'm gonna find you can find some really exceptional things if you look around
[02:24.290 --> 02:27.750] All of the things in Belgium all in one place Wow
[02:28.830 --> 02:32.670] But there's a special place in Drew's heart for the northern part of the continent
[02:32.670 --> 02:37.250] It was one of the major trade routes of the 18th and 19th century
Processing segment at 02:32.400
Traceback (most recent call last):
  File "D:\whisper-fast\__main__.py", line 657, in <module>
  File "D:\whisper-fast\__main__.py", line 605, in cli
  File "faster_whisper\transcribe.py", line 931, in restore_speech_timestamps
  File "faster_whisper\transcribe.py", line 408, in generate_segments
  File "faster_whisper\transcribe.py", line 620, in encode
KeyboardInterrupt
[2332] Failed to execute script '__main__' due to unhandled exception!
-
It crashed.
Code:
[03:17.330 --> 03:20.710] Came back did really well and I've been coming ever since
[03:21.870 --> 03:23.330] in La Belle France
[03:23.330 --> 03:30.430] Drew and T scoured the countryside for some typically galaxy in one of the country's many secondhand shops or brocons
[03:32.850 --> 03:35.310] The French have an incredible sort of
Processing segment at 03:30.460
[03:35.310 --> 03:41.030] Love of brocons, it's commonplace, but it's in their culture
[03:41.030 --> 03:46.610] So you get to go to great shops and brocons all over the country
[03:46.610 --> 03:50.530] They're everywhere and they're full generally rammed and I like them that way
[03:52.950 --> 04:00.890] On the outskirts of Rouen in Normandy Drew visited a huge three-story barn stuffed with eclectic beautiful and curious items
[04:00.890 --> 04:03.510] many sourced from local chateau and farms
Processing segment at 04:00.460
[04:05.810 --> 04:13.210] The authentic brocon shop experience they encountered is the result of owner Max Teterland's passion for French provincial antiques
[04:14.730 --> 04:18.870] Max has been a dealer I think 40 years today. I'm looking for
[04:19.590 --> 04:23.310] Garden stuff predominantly and then one sort of rustic
[04:24.050 --> 04:27.810] Steel work, you know all that sort of stuff you can pick up around here. Yeah
[04:28.510 --> 04:32.090] She did my rail a la brocon the board. Yeah
Processing segment at 04:27.240
* Log probability threshold is not met with temperature 0.0 (-1.334731 < -1.000000)
Traceback (most recent call last):
  File "D:\whisper-fast\__main__.py", line 657, in <module>
  File "D:\whisper-fast\__main__.py", line 605, in cli
  File "faster_whisper\transcribe.py", line 931, in restore_speech_timestamps
  File "faster_whisper\transcribe.py", line 415, in generate_segments
  File "faster_whisper\transcribe.py", line 651, in generate_with_fallback
RuntimeError: CUDA failed with error out of memory
[6356] Failed to execute script '__main__' due to unhandled exception!
-
Close all programs that use the GPU, including any internet browser, and maybe restart the PC. If that doesn't help, set beam to 1.
Btw, can you share this audio [remuxed with mkvtoolnix without video]?
Last edited by VoodooFX; 22nd Aug 2023 at 21:15.
-
Actually, it runs out of memory not because of the higher beam, but because of the fallback, when "--best_of" is at work. It's 5 by default; you can try lowering it.
If it still runs out of memory, you can disable the fallback -> "--temperature_increment_on_fallback=None".
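To picture what disabling the fallback changes, here is a small Python sketch of the list of temperatures tried for a segment. The 0.2 step and 1.0 ceiling are the usual Whisper defaults and are assumed here, not confirmed for whisper-faster; `temperature_schedule` is a hypothetical name.

```python
def temperature_schedule(start=0.0, increment=0.2, stop=1.0):
    """Temperatures tried in order for one segment.

    increment=None mirrors --temperature_increment_on_fallback=None:
    only the starting temperature is tried, so there is no fallback pass.
    """
    if increment is None:
        return [start]
    steps = int((stop - start) / increment + 1e-9)
    return [round(start + i * increment, 10) for i in range(steps + 1)]


print(temperature_schedule())                # with fallback
print(temperature_schedule(increment=None))  # fallback disabled
```

With the fallback disabled a difficult segment is decoded once and kept as it is, which avoids the extra sampling passes but can leave more errors in the text.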
I sent you a pm.
--temperature_increment_on_fallback=None
That worked. The error rate is high compared to the medium.en model using the following:
Code:
whisper-faster.exe "d:\a\*.mkv" --language=English --model=medium.en --compute_type=int8_float32 --beam_size 5 --output_dir D:\a\ --output_format srt
-
Is "--model=medium.en --compute_type=int8_float16" working normally on that file?
-
Yes, it does work.
With nothing else changed, what's interesting is the punctuation.
int8_float32 gives me a period at the end of each sentence.
--model=medium.en --compute_type=int8_float16
Code:
Processing segment at 18:27.040
[18:47.420 --> 18:48.280] No
[18:50.040 --> 18:51.240] Everything has its price
[18:51.240 --> 18:51.920] I know
[18:51.920 --> 18:52.980] Yes you're very right
[18:52.980 --> 18:54.320] You're very right
[18:55.620 --> 18:56.100] God
[18:56.580 --> 18:57.540] Oh what the heck
[18:58.340 --> 18:59.020] Thank you
[18:59.020 --> 19:00.420] This is a really good piece
[19:00.420 --> 19:01.080] It is
[19:01.080 --> 19:01.960] Thank you so much
[19:01.960 --> 19:02.900] We appreciate it
[19:03.440 --> 19:04.080] Appreciate it
[19:04.080 --> 19:05.200] Wonderful thing
[19:05.200 --> 19:07.120] One of the nicest things I've ever bought
[19:07.120 --> 19:08.840] One of the nicest things I've ever bought
[19:08.840 --> 19:09.260] Wonderful
[19:09.260 --> 19:11.300] Honestly it's one of the best things I've ever bought
[19:11.300 --> 19:15.400] It had that certain magic to it
Processing segment at 18:55.020
Code:
Processing segment at 18:38.540
[18:58.920 --> 19:00.600] This is a really good piece, sir.
[19:00.700 --> 19:01.040] It is.
[19:01.240 --> 19:01.940] Thank you so much.
[19:02.200 --> 19:02.860] We appreciate it.
[19:03.520 --> 19:04.080] Appreciate it.
[19:04.240 --> 19:04.880] Wonderful thing.
[19:05.740 --> 19:07.100] One of the nicest things I've ever bought.
[19:07.600 --> 19:08.800] One of the nicest things I've ever bought.
[19:08.960 --> 19:09.260] Wonderful.
[19:09.600 --> 19:11.280] Honestly, it's one of the best things I've ever bought.
[19:12.080 --> 19:15.240] It had that certain magic to it.
[19:15.540 --> 19:18.340] That single piece makes all of those hours
[19:18.340 --> 19:20.480] and all of that travel worth it.
[19:21.040 --> 19:23.500] And I was particularly pleased with the deal I did on it.
[19:23.720 --> 19:25.760] I did give the dealer a little bit of a kicking.
[19:26.560 --> 19:27.480] He still made a profit.
[19:27.760 --> 19:28.680] I got what I wanted.
Processing segment at 19:08.300
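The punctuation difference between the two outputs can be put into a number. The Python sketch below parses cue lines in the `[start --> end] text` shape shown above and reports the share of cues that end in a full stop, question mark or exclamation mark. The regular expression is written for this console format only, and `punctuated_share` is a made-up name.

```python
import re

# Matches verbose lines such as "[19:00.700 --> 19:01.040] It is."
CUE = re.compile(r"\[(\d+:\d+\.\d+) --> (\d+:\d+\.\d+)\]\s*(.*)")


def punctuated_share(lines):
    """Fraction of cues whose text ends in '.', '?' or '!'."""
    texts = []
    for line in lines:
        match = CUE.match(line.strip())
        if match:  # "Processing segment at ..." lines are ignored
            texts.append(match.group(3).strip())
    if not texts:
        return 0.0
    return sum(t.endswith((".", "?", "!")) for t in texts) / len(texts)
```

Feed it the saved console output of each run, one line per list item, and compare the two figures.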
-
OK.
Could you test "r143" and "r145" with "--model=large-v2 --compute_type=int8"?
I uploaded these versions here: https://github.com/Purfview/whisper-standalone-win/releases/tag/faster-whisper
I think pcspeak was going to close the thread but...
Just a word of thanks to the two main coders (so far as I know) in this sub forum for continuing to improve this.
Yes, I would have to upgrade a GPU, but there was mention of NVidia vs Radeon. I'm a way off from adding cards before I see a proven need for the upgrade.
Onward.
The old "int8" was the same as "int8_float32"; now "int8" is an automatic selection from the three "int8_..." variations.
"float32" requires roughly twice as much memory as "int8_...", and model loading is much faster with "float32".
Btw, on my CPU, transcription with "float32" is about twice as fast as with "int8_float32".
There is documentation about the different quantizations -> https://opennmt.net/CTranslate2/quantization.html
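Since plain "int8" now lets the program choose, one option is to pick an explicit type from what the GPU reports. The Python sketch below parses the "Supported compute types by GPU" line from the verbose output earlier in the thread and takes the first match from a preference list. The order of that list is an assumption based on what worked here (int8_float32 first), and both function names are hypothetical.

```python
import ast

# Assumed preference order, based on what worked in this thread.
PREFERRED = ("int8_float32", "int8_float16", "float32")


def supported_types(log_line):
    """Parse "Supported compute types by GPU: {...}" from --verbose output."""
    _, _, braces = log_line.partition(":")
    return set(ast.literal_eval(braces.strip()))


def pick_compute_type(supported, preferred=PREFERRED):
    """First preferred type the GPU reports, or None if nothing matches."""
    return next((t for t in preferred if t in supported), None)


line = "Supported compute types by GPU: {'int8', 'float16', 'int8_float32', 'float32', 'int8_float16'}"
print(pick_compute_type(supported_types(line)))
```

The result can then be passed as `--compute_type=<value>`, bearing in mind that "float32" needs noticeably more memory.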
Hello, I'm using a computer with an i5 9400 CPU and an RX6600 GPU.
So, when using Whisper-Faster, how should I set it up for the best performance?
Thank you, everyone.