VideoHelp Forum




  1. pcspeak (Member)
    Join Date: Apr 2007
    Location: Australia
    This was all tested on Windows 10 x64.
    I have added subtitles to approximately 20 videos using this method.

    The brand of video card and the amount of memory on it is important.
    For me, my nVidia card with 4 Gig of memory works with the medium model.
    For the large model I have to revert to using the computer's cpu. (--device=cpu)

    If your computer only has 4 Gig of memory, then whisper will probably fail.
    Test each option. The medium model using the computer's cpu should work on most newer computers.

    Download this and extract it to an empty folder. (e.g. d:\whisp\whisper-faster.exe)
    https://github.com/Purfview/whisper-standalone-win/releases/download/faster-whisper/Wh...ter_r145.3.zip
    Download this and extract it to the same folder.
    https://github.com/Purfview/whisper-standalone-win/releases/download/libs/cuBLAS.and.cuDNN.7z
    Add a copy of ffmpeg.exe to the same folder.
    https://www.gyan.dev/ffmpeg/builds/ffmpeg-git-essentials.7z (x64 only)

    If the required model DOES NOT EXIST yet, whisper-faster will download it and place it in the correct sub-folder.

    The default in the following batch file is for the tiny model. You just want to get it to work. Worry about larger models later.
    Wait for the download of the model to finish. The transcription will start automatically after the download.

    Code:
    :1.
    :: WITH an 8Gig nVidia video card. (runs on the GPU)
    :: whisper-faster.exe "my movie.mkv" --language=English --model=large-v2 --output_format srt
    :2.
    :: WITHOUT an 8Gig nVidia video card. (runs on the CPU and system memory. slower than GPU)
    :: whisper-faster.exe "my movie.mkv" --device=cpu --language=English --model=large-v2 --output_format srt
    :3.
    :: WITH a 4Gig nVidia video card.
    :: whisper-faster.exe "my movie.mkv" --language=English --model=medium --output_format srt
    :4.
    :: WITHOUT a 4Gig nVidia video card. (runs on the CPU and system memory. slower than GPU)
    :: whisper-faster.exe "my movie.mkv" --device=cpu --language=English --model=medium --output_format srt
    :5.
    :: The tiny model to get things working.
    whisper-faster.exe "my movie.mkv" --language=English --model=tiny --output_format srt
    :6.
    :: The tiny model using the computer's cpu to get things working. Try this if the one above (:5.) fails.
    :: whisper-faster.exe "my movie.mkv" --device=cpu --language=English --model=tiny --output_format srt
    pause
    I put the accuracy at 90-95 percent.
    I feed each output srt file to Subtitle Edit.
    Tools/Fix common errors
    then
    Tools/Break/Split long lines
    then
    Spell check.
    The saved result is close enough for me. The Start timing marks can be off. I don't care.

    The whole process, once I got it working, is excellent.
    Cheers.
    Last edited by pcspeak; 18th Aug 2023 at 20:44. Reason: Punctuation and clarity
  2. VoodooFX (Video Damager)
    Join Date: Oct 2021
    Location: At Doom9
    Originally Posted by pcspeak
    For me, my nVidia card with 4 Gig of memory works with the medium model.
    You can fit the large model in 4GB VRAM with "-ct=int8".

    Originally Posted by pcspeak
    Add a copy of ffmpeg
    It doesn't use ffmpeg in any way.

    Originally Posted by pcspeak
    --output_format srt
    You don't need to specify it because it's the default setting.

    Originally Posted by pcspeak
    I put the accuracy at 90-95 percent.
    You can check if "-bs=5" improves accuracy.
  3. pcspeak (Member)
    Originally Posted by VoodooFX
    You can fit the large model in 4GB VRAM with "-ct=int8".
    I didn't know that. Thanks. Testing for speed is where I am at the moment.

    Originally Posted by VoodooFX
    It doesn't use ffmpeg in any way.
    Yeah, I know. I've been testing. This does.
    https://superuser.com/questions/1778870/how-do-i-use-ffmpeg-and-openai-whisper-to-tran...-a-rtmp-stream
    Earlier renditions of OpenAI's Whisper needed ffmpeg.
    I'm a belt and braces type. Having ffmpeg in the folder or in my %Path% doesn't break anything.

    Originally Posted by VoodooFX
    --output_format srt
    You don't need to specify it because it's the default setting.
    It's there because sometimes I change the format of the subtitles to vtt or txt. Post processing, it's mostly about how well Subtitle Edit handles the newly created subtitles. (Thanks Nik!) There can be minor differences and I'm just trying to get my head around which will be the most accurate for the creation of subs for all the episodes of Salvage Hunters I've recorded over the years. The Welsh names are interesting to deal with.

    Originally Posted by VoodooFX
    You can check if "-bs=5" improves accuracy.
    OK, I'll test that.

    Cheers!
  4. VoodooFX (Video Damager)
    Originally Posted by pcspeak
    Earlier renditions of OpenAI's Whisper needed ffmpeg.
    The latest needs it too, but your post is about Faster-Whisper.

    Btw, instead of "--output_format" you can use the shorter alternative -> "-f".
  5. pcspeak (Member)
    @VoodooFX
    On my machine (see my profile) with the nVidia 4 Gig card.
    44 minute video.
    Large model GPU did not work - out of memory
    Medium model GPU - 10 minutes
    Medium model CPU - 27 minutes

    With your recommended parameters:
    Large model GPU - 8.4 minutes (Winner!)
    Code:
     "D:\whisperf\whisper-faster.exe" "D:\a\my movie.mkv" --language=English --model=large-v2 -ct=int8 -bs=5 --output_dir "%%~dpa\" --output_format srt
    Now to check the accuracy.
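To put the runs above on a common scale, the timings can be turned into a real-time factor (minutes of audio transcribed per minute of processing). This is only a minimal Python sketch using the figures quoted in this post; the labels in the dictionary are names chosen for illustration, and speed says nothing about accuracy.

```python
# Real-time factor = audio duration / processing time.
# All figures are the ones reported above for the 44-minute video.
AUDIO_MINUTES = 44.0

runs = {
    "medium, GPU": 10.0,
    "medium, CPU": 27.0,
    "large-v2, GPU, int8, beam 5": 8.4,
}


def realtime_factor(audio_minutes: float, processing_minutes: float) -> float:
    """Return how many minutes of audio are handled per minute of processing."""
    if processing_minutes <= 0:
        raise ValueError("processing time must be positive")
    return audio_minutes / processing_minutes


factors = {
    name: realtime_factor(AUDIO_MINUTES, minutes) for name, minutes in runs.items()
}

for name, factor in factors.items():
    print(f"{name}: {factor:.2f}x real time")
```

On these numbers the CPU run is a little faster than real time, while both GPU runs are several times faster.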
    Cheers.
  6. VoodooFX (Video Damager)
    Originally Posted by pcspeak
    Code:
     "D:\whisperf\whisper-faster.exe" "D:\a\my movie.mkv" --language=English --model=large-v2 -ct=int8 -bs=5 --output_dir "%%~dpa\" --output_format srt
    Same command in short:

    Code:
     "D:\whisperf\whisper-faster.exe" "D:\a\my movie.mkv" -l=en -m=large-v2 -ct=int8 -bs=5 -o=source -f=srt
  7. pcspeak (Member)
    @VoodooFX - Short or long names for parameters? I stopped using the shorter abbreviation, when there was a choice, some time ago.
    When I checked whisper-faster.exe --help at a command prompt, neither of the shortened parameters showed.
    You and I know what the abbreviated parameters mean. Other VideoHelp members may not.

    I got around to checking the transcription accuracy on my 44 minute video.
    --compute_type int8 (-ct=int8) gives me an empty srt file. I tried many options but could not get the large model to give me an srt file that was NOT empty. I'm staying with the medium.en model for now.

    --beam_size 5 (-bs=5) gives a definite improvement in accuracy, but took 50% longer to process. The video I'm using has Welsh, French and English words.
    In the first 5 minutes of the video I found 4 occasions where the output srt, using --beam_size 5, was more accurate. beam_size 5 is staying in my batch file(s).

    My next run will be on 5 videos in D:\a\ to get a more accurate reading of times taken, and just how good the transcription is.
    This is the batch file I'm using:
    Code:
    @echo off
    for %%a in ("d:\a\*.mkv") do if exist "%%~dpna.srt" (
        echo "%%a" - srt file already exists.
        ) else (
        echo "%%a" && "D:\whisperf\whisper-faster.exe" "%%a" --language=English --model=medium.en --beam_size 5 --output_dir "%%~dpa\" --output_format srt
        )
     echo All done. Press any key to Exit. &pause>nul
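One thing to watch with the "if exist" test in this batch file: a zero-byte srt, like the empty one from the int8 run mentioned above, counts as already done and the video is skipped on the next run. Here is a rough Python sketch of the same loop that also re-queues videos whose srt is empty. The function names (pending_videos, build_command, run_all) are made up for illustration; the exe path and the options are simply the ones from the batch file.

```python
import subprocess
from pathlib import Path

# Path taken from the batch file above; adjust to your own install.
WHISPER_EXE = r"D:\whisperf\whisper-faster.exe"


def pending_videos(folder: Path) -> list[Path]:
    """Return .mkv files that have no .srt yet, or only a zero-byte one."""
    todo = []
    for video in sorted(folder.glob("*.mkv")):
        srt = video.with_suffix(".srt")
        if not srt.exists() or srt.stat().st_size == 0:
            todo.append(video)
    return todo


def build_command(video: Path) -> list[str]:
    """Build the argument list with the same options as the batch file."""
    return [
        WHISPER_EXE, str(video),
        "--language=English", "--model=medium.en",
        "--beam_size", "5",
        "--output_dir", str(video.parent),
        "--output_format", "srt",
    ]


def run_all(folder: Path) -> None:
    """Transcribe every pending video in the folder, one after another."""
    for video in pending_videos(folder):
        print(video)
        subprocess.run(build_command(video), check=False)
```

Calling run_all(Path(r"d:\a")) would then process the same folder as the batch file. Passing the arguments as a list avoids the quoting problems that file names with spaces cause.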
    Unless I get asked a question this is my last post for this thread. I'm starting to bore myself.
    Cheers.
    Last edited by pcspeak; 22nd Aug 2023 at 17:32.
  8. VoodooFX (Video Damager)
    Originally Posted by pcspeak
    When I checked whisper-faster.exe --help at a command prompt, neither of the shortened parameters showed.
    It shows all of them, short and long ones.

    Originally Posted by pcspeak
    --compute_type int8 (-ct=int8) gives me an empty srt file
    Strange, what do you see in the console at the end?

    Originally Posted by pcspeak
    --beam_size 5 (-bs=5) Gives a definite improvement on accuracy... In the first 5 minutes of the video I found 4 occasions
    I tested a whole movie [English] and counted all the better/worse occasions. It was about fifty-fifty with beam 5 vs 1, so it didn't make the subs more accurate, while it did make transcription slower.

    Originally Posted by pcspeak
    This is the batch file I'm using
    Your batch doesn't do anything that you can't do with whisper-faster.exe alone.
  9. pcspeak (Member)
    Originally Posted by VoodooFX
    It shows all of them, short and long ones.
    My bad. You are correct. I ran --help on an earlier version of whisper-faster.exe by mistake.

    Originally Posted by VoodooFX
    Your batch doesn't do anything that you can't do with whisper-faster.exe alone.
    You're right again. I'm just using copy/paste from other batch files. Old habits die hard.

    Originally Posted by VoodooFX
    Strange, what do you see in the console at the end?
    Code:
    D:\WhisperF>whisper-faster-Large-GPU v04.cmd
    
    Standalone Faster-Whisper r145.3 running on: CUDA
    
    
    Starting transcription on: d:\a\my movie.mkv
    
    
    Transcription speed: 5.78 audio seconds/s
    
    Subtitles are written to 'd:\a' directory.
    
    
    Operation finished in: 499 seconds
    
      Press any key to Exit.
    None of the usual time codes or text one expects.


    The output folder.
    Code:
    D:\a>dir
      Volume in drive D is D
     Volume Serial Number is 00CE-8685
    
     Directory of D:\a
    
    23/08/2023  09:34 AM    <DIR>          .
    23/08/2023  09:34 AM    <DIR>          ..
    11/04/2023  06:31 AM       223,042,765 my movie.mkv
    23/08/2023  09:34 AM                 0 my movie.srt
                   2 File(s)    223,042,765 bytes
                   2 Dir(s)  72,641,765,376 bytes free
    
    D:\a>
    Cheers.
  10. pcspeak (Member)
    I killed the transcription. Ctrl+C.
    Code:
    D:\WhisperF>whisper-faster.exe "d:\a\my movie.mkv" --language=English --model=large-v2 --compute_type=int8 --beam_size 5 --verbose=true --output_dir "D:\a\" --output_format all
    
    Standalone Faster-Whisper r145.3 running on: CUDA
    
    Number of visible GPU devices: 1
    
    Supported compute types by GPU: {'int8', 'float16', 'int8_float32', 'float32', 'int8_float16'}
    
    [2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info] CPU: GenuineIntel (SSE4.1=true, AVX=true, AVX2=true, AVX512=false)
    [2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info]  - Selected ISA: AVX2
    [2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info]  - Use Intel MKL: true
    [2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info]  - SGEMM backend: MKL (packed: false)
    [2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info]  - GEMM_S16 backend: MKL (packed: false)
    [2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info]  - GEMM_S8 backend: MKL (packed: false, u8s8 preferred: true)
    [2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info] GPU #0: NVIDIA GeForce GTX 1650 SUPER (CC=7.5)
    [2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info]  - Allow INT8: true
    [2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info]  - Allow FP16: true (with Tensor Cores: true)
    [2023-08-23 10:40:33.256] [ctranslate2] [thread 2040] [info]  - Allow BF16: false
    [2023-08-23 10:40:51.485] [ctranslate2] [thread 2040] [info] Using CUDA allocator: cuda_malloc_async
    [2023-08-23 10:40:51.978] [ctranslate2] [thread 2040] [info] Loaded model D:\WhisperF\_models\faster-whisper-large-v2 on device cuda:0
    [2023-08-23 10:40:51.978] [ctranslate2] [thread 2040] [info]  - Binary version: 6
    [2023-08-23 10:40:51.979] [ctranslate2] [thread 2040] [info]  - Model specification revision: 3
    [2023-08-23 10:40:51.979] [ctranslate2] [thread 2040] [info]  - Selected compute type: int8_float16
    
    Model loaded in: 18.83 seconds
    
    Starting transcription on: d:\a\my movie.mkv
    
    Processing audio with duration 44:00.085
    
    VAD filter removed 00:49.825 of audio
    VAD filter kept the following audio segments: [00:00.000 -> 00:56.292], [00:58.716 -> 01:41.412], [01:43.836 -> 04:52.548], [04:54.204 -> 07:46.692], [07:48.156 -> 10:38.052], [10:40.380 -> 11:55.140], [11:56.796 -> 16:09.828], [16:13.020 -> 16:16.164], [16:19.644 -> 18:30.852], [18:32.604 -> 20:04.164], [20:05.724 -> 20:45.828], [20:49.404 -> 22:37.572], [22:40.188 -> 25:46.596], [25:48.252 -> 26:17.220], [26:22.428 -> 38:51.684], [38:54.876 -> 39:05.508], [39:08.316 -> 42:06.468], [42:08.220 -> 43:19.524], [43:22.428 -> 43:55.908]
    
    Audio processing finished in: 22.68 seconds
    
    Processing segment at 00:00.000
    [2023-08-23 10:41:16.509] [ctranslate2] [thread 5376] [info] Loaded cuBLAS library version 11.8.1
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 00:29.000
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 00:58.000
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 01:27.000
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 01:56.000
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 02:25.000
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 02:54.000
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 03:23.000
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 03:52.000
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 04:21.000
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 04:50.000
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 05:19.000
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 05:48.000
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 06:17.000
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 06:46.000
    * Compression ratio threshold is not met with temperature 0.0 (5.535714 > 2.400000)
    Processing segment at 07:15.000
    Traceback (most recent call last):
      File "D:\whisper-fast\__main__.py", line 657, in <module>
      File "D:\whisper-fast\__main__.py", line 605, in cli
      File "faster_whisper\transcribe.py", line 931, in restore_speech_timestamps
      File "faster_whisper\transcribe.py", line 415, in generate_segments
      File "faster_whisper\transcribe.py", line 651, in generate_with_fallback
    KeyboardInterrupt
    [5388] Failed to execute script '__main__' due to unhandled exception!
    
     D:\WhisperF>
    With --beam_size 5 removed.

    Code:
    Processing segment at 00:00.000
    [2023-08-23 11:12:07.354] [ctranslate2] [thread 10732] [info] Loaded cuBLAS library version 11.8.1
    * Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
    * No speech threshold is met (0.633789 > 0.600000)
    Processing segment at 00:30.000
    * Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
    * No speech threshold is met (0.633789 > 0.600000)
    Processing segment at 01:00.000
    * Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
    * No speech threshold is met (0.633789 > 0.600000)
    Processing segment at 01:30.000
    * Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
    * No speech threshold is met (0.633789 > 0.600000)
    Processing segment at 02:00.000
    * Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
    * No speech threshold is met (0.633789 > 0.600000)
    Processing segment at 02:30.000
    * Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
    * No speech threshold is met (0.633789 > 0.600000)
    Processing segment at 03:00.000
    * Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
    * No speech threshold is met (0.633789 > 0.600000)
    Processing segment at 03:30.000
    * Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
    * No speech threshold is met (0.633789 > 0.600000)
    Processing segment at 04:00.000
    * Log probability threshold is not met with temperature 0.0 (-1.928111 < -1.000000)
    * No speech threshold is met (0.633789 > 0.600000)
    Processing segment at 04:30.000
    Traceback (most recent call last):
      File "D:\whisper-fast\__main__.py", line 657, in <module>
      File "D:\whisper-fast\__main__.py", line 605, in cli
      File "faster_whisper\transcribe.py", line 931, in restore_speech_timestamps
      File "faster_whisper\transcribe.py", line 408, in generate_segments
      File "faster_whisper\transcribe.py", line 620, in encode
    KeyboardInterrupt
     [10968] Failed to execute script '__main__' due to unhandled exception!

    I'm now officially out of my comfort zone. But most happy to learn.
    Cheers.
  11. pcspeak (Member)
    Congratulations! That's working. I'll run it through to the end and get back to you.
    Code:
    Starting transcription on: d:\a\my movie.mkv
    
    Processing audio with duration 44:00.085
    
    VAD filter removed 00:49.825 of audio
    VAD filter kept the following audio segments: [00:00.000 -> 00:56.292], [00:58.716 -> 01:41.412], [01:43.836 -> 04:52.548], [04:54.204 -> 07:46.692], [07:48.156 -> 10:38.052], [10:40.380 -> 11:55.140], [11:56.796 -> 16:09.828], [16:13.020 -> 16:16.164], [16:19.644 -> 18:30.852], [18:32.604 -> 20:04.164], [20:05.724 -> 20:45.828], [20:49.404 -> 22:37.572], [22:40.188 -> 25:46.596], [25:48.252 -> 26:17.220], [26:22.428 -> 38:51.684], [38:54.876 -> 39:05.508], [39:08.316 -> 42:06.468], [42:08.220 -> 43:19.524], [43:22.428 -> 43:55.908]
    
    Audio processing finished in: 22.6 seconds
    
    Processing segment at 00:00.000
    [2023-08-23 11:48:06.036] [ctranslate2] [thread 5980] [info] Loaded cuBLAS library version 11.8.1
    [00:00.000 --> 00:08.880]  On salvage hunters best buys. Cheers. Drew looks back at his all-time favorite buys from Northern Europe. Oh
    [00:08.880 --> 00:09.960]  My god, look
    [00:11.520 --> 00:18.220]  Searching for rare continental pieces in France. He plays on the language barrier to get a steal of a deal
    Processing segment at 00:18.220
    [00:19.160 --> 00:20.700]  Wait, thank you
    [00:27.400 --> 00:33.400]  In Belgium as an exclusive antiques Emporium he's spoiled for choice at a jaw-dropping collection
    [00:34.620 --> 00:38.060]  Wow the death mask Napoleon is it Wow
    [00:39.420 --> 00:43.380]  In Amsterdam, he's enchanted by an ancient Greek poetess
    [00:43.980 --> 00:47.460]  This is stunning. Look at that. What a thing
    Processing segment at 00:48.220
    [00:48.220 --> 00:54.680]  These are Drew's favorite Northern European hunting grounds. Oh, yeah. Now you're talking about that. Yeah a grand bazaar
    [00:59.080 --> 01:02.540]  Drew Pritchard is one of Britain's leading decorative salvage dealers
    [01:03.080 --> 01:07.360]  Stop here for quality and fun in his hunt for weird and wonderful objects
    [01:08.200 --> 01:11.120]  What's a fabulous thing? That's something I've never seen before
    [01:11.680 --> 01:14.000]  He scoured the country and the continent
    Processing segment at 01:11.580
    [01:14.520 --> 01:21.480]  Merci, merci, that's got on salvage hunters best buys. He takes us inside his most remarkable deals
    [01:22.080 --> 01:22.960]  1550 what I do
    [01:23.700 --> 01:25.260]  He's not that's cracking
    [01:26.020 --> 01:28.820]  revealing his favorite purchases Wow
    [01:28.820 --> 01:36.340]  Seriously impressive places. Oh my word. I just don't know what to look first and people you can buy one piece out there
    [01:36.340 --> 01:38.360]  But I'm gonna charge a lot of money for it
    Processing segment at 01:35.940
    [01:44.150 --> 01:44.750]  Oh
    [01:44.750 --> 01:47.670]  Where's 30-year career in the antique and salvage trade
    [01:48.350 --> 01:50.010]  450 euros. Yes, so
    [01:51.210 --> 01:56.290]  That's nice drew has traveled far and wide across the continent. Hello
    [01:57.110 --> 01:57.490]  incredible
    [01:58.650 --> 02:04.910]  To bring the best European decorative antiques back to the UK. I try not to hit the Octotrion
    [02:06.590 --> 02:09.510]  Yeah, let's see 800
    Processing segment at 02:04.660
    [02:10.310 --> 02:10.930]  Let's see
    [02:11.530 --> 02:17.110]  Traveling around Europe buying antiques. Yes. It is as good as it sounds it really is
    [02:17.110 --> 02:19.930]  For me going there. It's like a melting pot
    [02:19.930 --> 02:24.290]  I'm really never know what I'm gonna find you can find some really exceptional things if you look around
    [02:24.290 --> 02:27.750]  All of the things in Belgium all in one place Wow
    [02:28.830 --> 02:32.670]  But there's a special place in Drew's heart for the northern part of the continent
    [02:32.670 --> 02:37.250]  It was one of the major trade routes of the 18th and 19th century
    Processing segment at 02:32.400
    Traceback (most recent call last):
      File "D:\whisper-fast\__main__.py", line 657, in <module>
      File "D:\whisper-fast\__main__.py", line 605, in cli
      File "faster_whisper\transcribe.py", line 931, in restore_speech_timestamps
      File "faster_whisper\transcribe.py", line 408, in generate_segments
      File "faster_whisper\transcribe.py", line 620, in encode
    KeyboardInterrupt
    [2332] Failed to execute script '__main__' due to unhandled exception!
    Cheers.
  12. pcspeak (Member)
    It crashed.
    Code:
    [03:17.330 --> 03:20.710]  Came back did really well and I've been coming ever since
    [03:21.870 --> 03:23.330]  in La Belle France
    [03:23.330 --> 03:30.430]  Drew and T scoured the countryside for some typically galaxy in one of the country's many secondhand shops or brocons
    [03:32.850 --> 03:35.310]  The French have an incredible sort of
    Processing segment at 03:30.460
    [03:35.310 --> 03:41.030]  Love of brocons, it's commonplace, but it's in their culture
    [03:41.030 --> 03:46.610]  So you get to go to great shops and brocons all over the country
    [03:46.610 --> 03:50.530]  They're everywhere and they're full generally rammed and I like them that way
    [03:52.950 --> 04:00.890]  On the outskirts of Rouen in Normandy Drew visited a huge three-story barn stuffed with eclectic beautiful and curious items
    [04:00.890 --> 04:03.510]  many sourced from local chateau and farms
    Processing segment at 04:00.460
    [04:05.810 --> 04:13.210]  The authentic brocon shop experience they encountered is the result of owner Max Teterland's passion for French provincial antiques
    [04:14.730 --> 04:18.870]  Max has been a dealer I think 40 years today. I'm looking for
    [04:19.590 --> 04:23.310]  Garden stuff predominantly and then one sort of rustic
    [04:24.050 --> 04:27.810]  Steel work, you know all that sort of stuff you can pick up around here. Yeah
    [04:28.510 --> 04:32.090]  She did my rail a la brocon the board. Yeah
    Processing segment at 04:27.240
    * Log probability threshold is not met with temperature 0.0 (-1.334731 < -1.000000)
    Traceback (most recent call last):
      File "D:\whisper-fast\__main__.py", line 657, in <module>
      File "D:\whisper-fast\__main__.py", line 605, in cli
      File "faster_whisper\transcribe.py", line 931, in restore_speech_timestamps
      File "faster_whisper\transcribe.py", line 415, in generate_segments
      File "faster_whisper\transcribe.py", line 651, in generate_with_fallback
    RuntimeError: CUDA failed with error out of memory
    [6356] Failed to execute script '__main__' due to unhandled exception!
  13. VoodooFX (Video Damager)
    Close all programs that use the GPU, including any internet browser, and maybe restart the PC. If that doesn't help, set beam to 1.

    Btw, can you share this audio [remuxed with mkvtoolnix without video]?
  14. VoodooFX (Video Damager)
    Actually, it runs out of memory not because of the higher beam but because of fallback, which is when "--best_of" comes into play. It's 5 by default, so you can try lowering it.
    If it still runs out of memory then you can disable fallback -> " --temperature_increment_on_fallback=None".
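For anyone wondering what triggers fallback: the "Compression ratio threshold is not met" lines in the logs earlier in the thread are one of the checks. In the reference OpenAI Whisper code, that ratio is the UTF-8 length of the decoded text divided by its zlib-compressed length, so text stuck in a repeating loop scores very high. The sketch below assumes this build of Faster-Whisper computes it the same way. The 2.4 threshold is the one printed in the log, one sample line is taken from the transcript above, and the looping line is artificial.

```python
import zlib

# Threshold shown in the log above ("> 2.400000").
COMPRESSION_RATIO_THRESHOLD = 2.4


def compression_ratio(text: str) -> float:
    """Ratio of raw UTF-8 size to zlib-compressed size of the text."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw))


# One ordinary subtitle line and one artificial stuck-in-a-loop output.
normal = "Drew Pritchard is one of Britain's leading decorative salvage dealers"
looping = "Thank you. " * 40

for label, text in (("normal", normal), ("looping", looping)):
    ratio = compression_ratio(text)
    if ratio > COMPRESSION_RATIO_THRESHOLD:
        verdict = "too repetitive, retry at a higher temperature"
    else:
        verdict = "accept"
    print(f"{label}: {ratio:.2f} -> {verdict}")
```

With fallback disabled there is no retry at a higher temperature, which would explain why it avoids the extra work that ran the card out of memory.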
  15. pcspeak (Member)
    I sent you a pm.

    --temperature_increment_on_fallback=None
    That worked. The error rate is high compared to the medium.en model using the following:
    Code:
    whisper-faster.exe "d:\a\*.mkv" --language=English --model=medium.en --compute_type=int8_float32 --beam_size 5 --output_dir D:\a\ --output_format srt
    And much slower.
  16. pcspeak (Member)
    Yes, it does work.
    With nothing else changed, what's interesting is the punctuation.
    int8_float32 gives me a period at the end of each sentence; int8_float16 does not.

    --model=medium.en --compute_type=int8_float16
    Code:
    Processing segment at 18:27.040
    [18:47.420 --> 18:48.280]  No
    [18:50.040 --> 18:51.240]  Everything has its price
    [18:51.240 --> 18:51.920]  I know
    [18:51.920 --> 18:52.980]  Yes you're very right
    [18:52.980 --> 18:54.320]  You're very right
    [18:55.620 --> 18:56.100]  God
    [18:56.580 --> 18:57.540]  Oh what the heck
    [18:58.340 --> 18:59.020]  Thank you
    [18:59.020 --> 19:00.420]  This is a really good piece
    [19:00.420 --> 19:01.080]  It is
    [19:01.080 --> 19:01.960]  Thank you so much
    [19:01.960 --> 19:02.900]  We appreciate it
    [19:03.440 --> 19:04.080]  Appreciate it
    [19:04.080 --> 19:05.200]  Wonderful thing
    [19:05.200 --> 19:07.120]  One of the nicest things I've ever bought
    [19:07.120 --> 19:08.840]  One of the nicest things I've ever bought
    [19:08.840 --> 19:09.260]  Wonderful
    [19:09.260 --> 19:11.300]  Honestly it's one of the best things I've ever bought
    [19:11.300 --> 19:15.400]  It had that certain magic to it
    Processing segment at 18:55.020
    --model=medium.en --compute_type=int8_float32
    Code:
    Processing segment at 18:38.540
    [18:58.920 --> 19:00.600]  This is a really good piece, sir.
    [19:00.700 --> 19:01.040]  It is.
    [19:01.240 --> 19:01.940]  Thank you so much.
    [19:02.200 --> 19:02.860]  We appreciate it.
    [19:03.520 --> 19:04.080]  Appreciate it.
    [19:04.240 --> 19:04.880]  Wonderful thing.
    [19:05.740 --> 19:07.100]  One of the nicest things I've ever bought.
    [19:07.600 --> 19:08.800]  One of the nicest things I've ever bought.
    [19:08.960 --> 19:09.260]  Wonderful.
    [19:09.600 --> 19:11.280]  Honestly, it's one of the best things I've ever bought.
    [19:12.080 --> 19:15.240]  It had that certain magic to it.
    [19:15.540 --> 19:18.340]  That single piece makes all of those hours
    [19:18.340 --> 19:20.480]  and all of that travel worth it.
    [19:21.040 --> 19:23.500]  And I was particularly pleased with the deal I did on it.
    [19:23.720 --> 19:25.760]  I did give the dealer a little bit of a kicking.
    [19:26.560 --> 19:27.480]  He still made a profit.
    [19:27.760 --> 19:28.680]  I got what I wanted.
    Processing segment at 19:08.300
  17. pcspeak (Member)
    The output
    Image Attached Files
  18. I think pcspeak was going to close the thread but...

    Just a word of thanks to the two main coders (so far as I know) in this sub-forum for continuing to improve this.

    Yes I would have to upgrade a GPU but there was mention of NVidia vs Radeon. I'm a way off from adding cards before I see a proven need for the upgrade.

    onward.
  19. How does int8_float32 on CPU differ from int8 or float32?
  20. VoodooFX (Video Damager)
    Originally Posted by Ejdehan
    How does int8_float32 on CPU differ from int8 or float32?
    The old "int8" was the same as "int8_float32"; now "int8" is an automatic selection from the three "int8_..." variations.
    "float32" requires about twice as much memory as "int8_...", but model loading is much faster with "float32".

    Btw, on my CPU, transcription with "float32" is about twice as fast as with "int8_float32".

    There is documentation about different quantizations -> https://opennmt.net/CTranslate2/quantization.html
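As a rough feel for why the compute type matters on a 4 GB card, here is a weights-only estimate in Python. The parameter counts (about 769 million for medium, about 1550 million for large) come from OpenAI's Whisper README, not from this thread, and the sketch assumes the converted Faster-Whisper models keep the same number of weights. It ignores activations, buffers and the parts that stay in float in the "int8_..." modes, so measured usage will be higher and the ratios will not match this simple arithmetic.

```python
# Approximate parameter counts from OpenAI's Whisper README (assumption:
# the Faster-Whisper conversions keep the same number of weights).
PARAMS = {"medium": 769e6, "large-v2": 1550e6}

# Bytes needed to store a single weight in each storage type.
BYTES_PER_WEIGHT = {"float32": 4, "float16": 2, "int8": 1}


def weights_gb(model: str, dtype: str) -> float:
    """Size of the weights alone, in GB (10**9 bytes)."""
    return PARAMS[model] * BYTES_PER_WEIGHT[dtype] / 1e9


for model in PARAMS:
    for dtype in BYTES_PER_WEIGHT:
        print(f"{model:9s} {dtype:8s} ~{weights_gb(model, dtype):.2f} GB")
```

By this estimate large-v2 needs about 3.1 GB for its weights in float16, which leaves little room on a 4 GB card, against about 1.55 GB in int8. That is in line with the out-of-memory result and the "-ct=int8" advice earlier in the thread.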
  21. Hello, I'm using a computer with an i5 9400 CPU and an RX6600 GPU.

    So, when using Whisper-Faster, how should I set it up for the best performance?

    Thank you, everyone.