VideoHelp Forum
  1. I've had OK luck using Whisper with the small model on some shorter pieces. But a full-length play in MKV repeatedly hung after a number of hours. Yes, the job ran for a long time on a PC with no other activity: the HD light stays on and everything else freezes. Since it was on overnight, I'll say it was running for 24 hours. I expected a long run for this, but I've now seen the problem twice, so I'll leave off trying it again.

    Previously I had seen SE freeze-ups in normal sub editing and could get it going with a reboot by running SFC (System File Checker) in
    Windows 10. I found that solution by accident, but perhaps it points to some other problem. Unfortunately, as far as I'm aware, no visible error reporting takes place. Is there a log someplace for the Whisper or VOSK options that run?
  2. Member
    Join Date
    Mar 2008
    Location
    United States
    Yes there is a file called error_log.txt in the SE folder.
    You could also check the system event log (eventvwr.msc from the Run box): look at the Application and System events at the corresponding times.
  3. Thanks Dave,

    I'll take a look at that. I wasn't aware how the special dialog that opens for VOSK or Whisper reports to SE logs.
  4. Member
    Join Date
    Mar 2021
    Location
    Israel
    Originally Posted by loninappleton View Post
    I've had OK luck using Whisper with the small model on some shorter pieces. But a full-length play in MKV repeatedly hung after a number of hours. Yes, the job ran for a long time on a PC with no other activity: the HD light stays on and everything else freezes. Since it was on overnight, I'll say it was running for 24 hours. I expected a long run for this, but I've now seen the problem twice, so I'll leave off trying it again.

    Previously I had seen SE freeze-ups in normal sub editing and could get it going with a reboot by running SFC (System File Checker) in
    Windows 10. I found that solution by accident, but perhaps it points to some other problem. Unfortunately, as far as I'm aware, no visible error reporting takes place. Is there a log someplace for the Whisper or VOSK options that run?
    I installed Whisper AI CMD version and I had excellent results with it. It has more flexibility than using SE.
    You should use the GPU instead of the CPU.
    Depending on your GPU, you can get better and faster transcription; with an 8GB GPU, for example, you can use at least the medium model.
    You can help it if you isolate the speech by using Spleeter and feed that to Whisper AI. This way there are no confusing surrounding noises.
    Please check the attached Whisper AI help file. There are many options that can help you prevent it from freezing.
    I found that this option helps the most:
    --condition_on_previous_text False
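
    For readers who want to try the command line route, here is a minimal Python sketch of how such a command could be assembled. It assumes the openai-whisper package, which installs a `whisper` command; the function name `build_whisper_command` and the file name `play.wav` are made up for illustration, and the chosen defaults (medium model, CUDA device, SRT output) are only one plausible setup.

```python
import shlex


def build_whisper_command(audio_path, model="medium", language="en",
                          device="cuda", condition_on_previous_text=False):
    """Return an argument list for the openai-whisper `whisper` command.

    The helper itself is hypothetical; the flags are options of that CLI.
    """
    return [
        "whisper", audio_path,
        "--model", model,
        "--language", language,
        "--device", device,            # use "cpu" if no supported GPU is present
        "--output_format", "srt",
        # The CLI expects the literal strings "True" or "False" here.
        "--condition_on_previous_text", str(condition_on_previous_text),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(build_whisper_command("play.wav")))
```

    The list can be passed as-is to `subprocess.run` once the printed command looks right; nothing here has been tested against a full-length play.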
  5. Hello Subtitle and thanks.

    I took a quick look at the attachment. In the past I've encountered errors since I don't understand command line or the errors it generates. That's why I use SE. I still have to look at that log as well.

    No one has said why running SFC /scannow would bring back an SE freezeup so far. That is also peculiar.
  6. Member
    Join Date
    Mar 2021
    Location
    Israel
    Whisper AI needs a lot of resources, especially RAM, GPU and CPU. It also needs all other applications closed while transcribing.
    I have an Intel i5, 16GB of RAM and a GPU with 8GB, and it can still freeze on me if I so much as breathe.
    The command line option is very easy to set up. If you intend to give it a try, you must have Python version 3.10.7; please note that the latest Python version will not work well with Whisper AI.
  7. Member
    Join Date
    Mar 2021
    Location
    Israel
    I have tried to transcribe the lyrics of the song "Ventura Highway" by America with different Whisper AI models.
    The highest I could get is medium because my GPU is only 8GB and increasing it would mean replacing the power supply with at least 1000W.
    As you can see only the medium transcription has few errors.
    https://www.youtube.com/watch?v=Ps2CMcyditQ&ab_channel=PeutEtreDejaVu
  8. Originally Posted by loninappleton View Post
    Thanks Dave,

    I'll take a look at that. I wasn't aware how the special dialog that opens for VOSK or Whisper reports to SE logs.
    Without the Everything search tool I'd be lost. The log file is in the roaming location of the SE install.
    Not much showing in it however-- I've copied in the two most recent starts of the program.

    But I think there is something else going on with long load times etc. I've repeatedly turned off Windows updates
    in services.msc only to have it return. I turned it off again. Perhaps with no cold restarts it will stay disabled and I will retry getting Whisper to complete my play file. It is long and contains 18th century English. All that might slow things down. I'm just trying to see if I can avoid the freeze-ups. I'll start that today...

    The PC running the job has only this task going. For other processes I'd have to check msconfig. Another possibility is to pull the ethernet plug when I'm doing one of these. Any suggestions on paring down CPU activity are welcome.

    I will pull the ethernet just to see how it goes.





    -----------------------------------------------------------------------------
    Date: 06/06/2023 23:00:56
    SE: 3.6.13.0 - Microsoft Windows NT 10.0.19045.0 - 64-bit
    Message: C:\Users\lon\AppData\Roaming\Subtitle Edit\Whisper\main.exe --language en --model "C:\Users\lon\AppData\Roaming\Subtitle Edit\Whisper\Models\small.en.bin" --output-srt --print-progress "C:\Users\lon\AppData\Local\Temp\a3442e20-61f4-4b9a-8a6d-d6a880b76750.wav"

    -----------------------------------------------------------------------------
    Date: 06/06/2023 23:02:20
    SE: 3.6.13.0 - Microsoft Windows NT 10.0.19045.0 - 64-bit
    Message: C:\Users\lon\AppData\Roaming\Subtitle Edit\Whisper\main.exe --language en --model "C:\Users\lon\AppData\Roaming\Subtitle Edit\Whisper\Models\small.en.bin" --output-srt --print-progress "C:\Users\lon\AppData\Local\Temp\09895080-c665-4df2-81d0-5a3ab2fb378a.wav"
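
    Since the log is just repeated Date / SE / Message blocks separated by dashed lines, it can be skimmed with a short script. The following is a rough Python sketch based only on the two entries shown above; the function name `parse_se_log` is made up, and it assumes each field sits on a single line, which may not hold for every entry SE writes.

```python
import re

# A separator is a line made of ten or more dashes, as in the excerpt above.
SEPARATOR = re.compile(r"^\s*-{10,}\s*$", re.MULTILINE)


def parse_se_log(text):
    """Split log text into dicts holding the Date, SE and Message fields."""
    entries = []
    for chunk in SEPARATOR.split(text):
        fields = {}
        for line in chunk.splitlines():
            # Split on the first ": " only, so "C:\..." paths stay intact.
            key, sep, value = line.strip().partition(": ")
            if sep and key in ("Date", "SE", "Message"):
                fields[key] = value
        if fields:
            entries.append(fields)
    return entries
```

    Calling `parse_se_log(open(path, encoding="utf-8", errors="replace").read())` on the log file would give one dict per start of the program, which makes it easier to see which model and temp file each run used.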
  9. Member
    Join Date
    Mar 2021
    Location
    Israel
    Try to run this job on Google Notebook (Google Colab). I haven't used it myself because I have trouble with Google Drive, but a few friends used it and were pleased with it.
    Your computer freezes because it doesn't have enough resources. The log doesn't tell you that unfortunately.
  10. I will have a look at Google Notebook, though I've never heard of it.
    Today I'm making another attempt to run the job in SE Whisper with ethernet unplugged and all services shut down through msconfig.

    While running the job I have Task Manager open and the CPU activity is flat at about 50% usage. With that going I'll be able to
    see if data stops and freezes. Keep in mind I saw data flowing the old way for the good part of a day, then something froze. I have the idea that
    Win10 uses its services to interfere. Not an outrageous thought when you read about all of the snoops and such that Windows routinely has running. Open services and start reading a few. There are over 100.

    However if I want to connect ethernet again, services will have to resume.
  11. Originally Posted by Subtitles View Post
    Try to run this job on Google Notebook (Google Colab). I haven't used it myself because I have trouble with Google Drive, but a few friends used it and were pleased with it.
    Your computer freezes because it doesn't have enough resources. The log doesn't tell you that unfortunately.
    I will sound like Mr. Goldilocks but in looking over Google colab/notebook it refers to Python installs and Github etc-- things that are
    beyond me. The user interface at Subtitle Edit is about my limit.
  12. Member
    Join Date
    Mar 2021
    Location
    Israel
    Originally Posted by loninappleton View Post
    Originally Posted by Subtitles View Post
    Try to run this job on Google Notebook (Google Colab). I haven't used it myself because I have trouble with Google Drive, but a few friends used it and were pleased with it.
    Your computer freezes because it doesn't have enough resources. The log doesn't tell you that unfortunately.
    I will sound like Mr. Goldilocks but in looking over Google colab/notebook it refers to Python installs and Github etc-- things that are
    beyond me. The user interface at Subtitle Edit is about my limit.
    I understand. You prefer to use GUI applications.
    You can try VideoStudio Pro 2023 or VideoStudio Ultimate 2023. They have a fully functional 30-day free trial, and the software includes a speech-to-text converter.
    I tried it and it is relatively simple to use. Apparently it is based on VOSK. It should run smoothly without freezing unless your computer has other issues.
  13. Since yesterday the job is still going so we'll see what the result is.
    As for VOSK, that always worked well in the SE application.
    Using Task Manager to monitor provides what the SE routine does not: run time,
    any spikes in activity (none) and so on.

    Thanks for the help on this.
  14. Member
    Join Date
    Mar 2021
    Location
    Israel
    Apparently SE can't process the small model because it needs 2GB of VRAM (Video RAM).
    Try the tiny or base models and see if you get better response.
    As I mentioned before, Whisper AI needs a lot of resources.
    As a general rule, if the computer hangs or freezes, I just switch it off; there is no point in letting it work all night because it will not do anything.
    Is your video easily accessible? I can make you SRT files with different models using my system with the command line. It has never failed me unless I ask it to do the impossible, which is the large model.
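
    As a rough guide to picking a model by memory, here is a small Python sketch. The VRAM figures are the approximate ones listed in the openai/whisper README for the Python version running on a GPU; they may not carry over to the build that SE launches. The function name `largest_model_that_fits` and the 1GB of headroom are assumptions made for illustration.

```python
# Approximate VRAM needs in GB, per the openai/whisper README (GPU, Python version).
# Listed from smallest to largest model.
APPROX_VRAM_GB = {"tiny": 1, "base": 1, "small": 2, "medium": 5, "large": 10}


def largest_model_that_fits(vram_gb, headroom_gb=1.0):
    """Return the largest model whose approximate need fits, or None.

    headroom_gb is an assumed safety margin for the desktop and other
    programs; it is not a documented requirement.
    """
    usable = vram_gb - headroom_gb
    best = None
    for name, need in APPROX_VRAM_GB.items():
        if need <= usable:
            best = name
    return best
```

    With these figures an 8GB card lands on the medium model, which matches the experience described earlier in the thread, while the large model stays out of reach.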
  15. I have had success.

    I don't know where my original message went, but here's the result:
    I looked at Task Manager the following day, and it had shown some blips and then no activity in the display. The job was done
    after 20 hours, which I don't see as unusual using the small model.

    So I replugged the ethernet and reset msconfig to its normal mode with the services back on.

    There may be some programming trick available to SE to run Win10 normally.

    thanks to all who answered.
  16. One further observation.

    I looked at a very early Whisper job I did but had not scanned the whole length of the piece.
    I have no explanation for a page or so long series of "hiccups" where the same line is repeated
    over and over-- not an error message but a short piece of text such as "continues reading."
    There is then nothing after it but that same two word phrase for a page or so and time stamped as I recall.

    I can only assume or conjecture that before the Ethernet was unplugged, those were
    attempts to access the system.
  17. Member
    Join Date
    Mar 2021
    Location
    Israel
    Originally Posted by loninappleton View Post
    One further observation.

    I looked at a very early Whisper job I did but had not scanned the whole length of the piece.
    I have no explanation for a page or so long series of "hiccups" where the same line is repeated
    over and over-- not an error message but a short piece of text such as "continues reading."
    There is then nothing after it but that same two word phrase for a page or so and time stamped as I recall.

    I can only assume or conjecture that before the Ethernet was unplugged, those were
    attempts to access the system.
    This kind of Whisper behaviour can be prevented with the command line version by adding to the script:
    --condition_on_previous_text False

    The full description is in the Whisper help file I attached previously

    --condition_on_previous_text CONDITION_ON_PREVIOUS_TEXT
    if True, provide the previous output of the model as a prompt for the next window; disabling may make the text inconsistent across windows, but the model becomes less prone to getting stuck in a failure loop (default: True)

    What this means is that it prevents the repetition of the same text over and over again.
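
    For a transcript that has already been made, the stuck stretches can at least be located automatically. The following Python sketch scans SRT text for runs of identical consecutive cues, such as "continues reading." repeated for a page. The function names and the threshold of five repeats are assumptions for illustration, and the parsing only handles plainly formatted SRT.

```python
import re
from itertools import groupby


def parse_srt_text(srt):
    """Return the text of each cue in an SRT string, in order."""
    cues = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.splitlines()
        # lines[0] is the cue number, lines[1] the timing, the rest is text.
        if len(lines) >= 3 and "-->" in lines[1]:
            cues.append(" ".join(part.strip() for part in lines[2:]))
    return cues


def find_repeated_runs(cues, min_run=5):
    """Return (first_cue_number, run_length, lowercased_text) for each run
    of at least min_run identical consecutive cues."""
    runs = []
    position = 1  # cue numbers start at 1
    for text, group in groupby(cues, key=lambda t: t.strip().lower()):
        length = len(list(group))
        if length >= min_run:
            runs.append((position, length, text))
        position += length
    return runs
```

    The reported cue numbers show where to start re-listening or where to paste in lines from another transcription of the same audio.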
  18. Yes, I realized it was a different problem after I recoded the same piece last night. Thanks for the tip, but I am not using the command line. Is there some sort of trigger from special characters like " " , [ ] * in the program that does this? Is this a bug that SE can fix in the load of the program?
  19. Member
    Join Date
    Mar 2021
    Location
    Israel
    Originally Posted by loninappleton View Post
    Yes, I realized it was a different problem after I recoded the same piece last night. Thanks for the tip, but I am not using the command line. Is there some sort of trigger from special characters like " " , [ ] * in the program that does this? Is this a bug that SE can fix in the load of the program?
    Try isolating the speech from the background noise using apps like Spleeter.
    https://www.youtube.com/watch?v=Yk2_vTLE_pM&ab_channel=OWC
    But since you don't use the command line, search for GUI alternatives for isolating speech or vocals.
  20. A good idea.

    Apologies for confusing editing with translating above.

    I was informed earlier that for the best result in Whisper I should let the program load the audio from the MKV.

    If loading the (I guess) WAV directly, I could use what you suggest or the multiple voice tools available in programs such as GoldWave or
    Audacity, or an old favorite called The Levelator, which was developed for podcasts and brings voice forward. That produces a WAV directly.
  21. I looked at the remarks on Spleeter. There are some conflicting reports about its usage. But I think that Levelator, which I know works and doesn't produce any weird sound from trying to split tracks, is a good place to start. It's just a simple drag and drop. I've used it for years.

    here's a bit about it:

    https://en.wikipedia.org/wiki/Levelator

    This does sound like a Whisper problem though.
  22. Member
    Join Date
    Mar 2021
    Location
    Israel
    None of these apps will give you a good vocal track.
    Try Acoustica; they have a 30-day free trial, and I think they use Spleeter for separating the vocal track.
    https://acondigital.com/products/acoustica
  23. Thanks for all the referrals on this.

    I'd prefer fixes to the tools themselves rather than trying to do patches. I'm not an expert user. Filtering for voice with something like Levelator is all I can manage. The source is old-- 1978 and from a public source on YouTube. It may be just the audio degradation. However, VOSK is a good fall-back for me. My original for this piece was completed in VOSK and then hand-edited. It's nearly done. I'm just proofing it by ear and looking up some phrases.
  24. Member
    Join Date
    Mar 2021
    Location
    Israel
    I didn't realize that this was just a one-time task.
    Anyway, I am glad you finally got your transcription.
    When you feel brave enough to consider a command line Whisper AI setup, I will be happy to help.
    It usually takes an experienced user less than 20-30 minutes to set up everything from scratch.
  25. Thanks for the considerate offer. A while back there was a member who showed how to set up Python etc. and I tried that.
    Errors ensued, so I'd rather leave it to SE and Whisper. After all, SE has a whole team of contributors. If the problem is a known error, they should get it resolved. I have not contacted SE directly on it. Perhaps you or others could explain the problem better than I.

    As to tasks, I've only seen this peculiar error from one source. But like I said, the source may be degraded in some way I don't understand.
  26. Update:

    I have now seen the repeated line error... in Whisper on a new project using SE.
    But the text eventually comes back on. I did not pull the power plug or touch the rig until the job showed
    complete.

    To compensate, and maybe complete this task, I'm doing the same job with VOSK, and will then see if I can fill in the lines that Whisper listed for a time as [ music ] [ music ] [ music ] ... after a music sequence had concluded. The behaviour is very odd since the spoken language is very clear.

    To Whisper's credit, the lines that do transfer to text are largely very good using the small model in Subtitle Edit.

    I wonder if the medium model would now work, as I continue to unplug the ethernet to avoid any Microsoft jamming? (See the previous posts above on pulling the ethernet plug to complete a job.)


