Hello again,
I have a short clip of solo singing in French from a pro-shot play. The singing does not show up in the speech-to-text SRT I have. What combination of techniques from VOSK or Whisper can help with this, ideally within Subtitle Edit?
In other words, can VOSK or Whisper be set up for French, Spanish, etc. in SE?
The goal is just to transcribe the French text and then possibly produce an English translation. But the fact that it is sung may short-circuit any attempt to do this. If so, just say it's not doable.
-
-
Is this a well known song that you can find lyrics online somewhere?
-
Is this on YouTube or any public site where we can hear it? Maybe you could upload it? There are plenty of sites that will transcribe audio for free online; I remember trying a few. Just google it. Once you have the transcript, run it through Google Translate.
As with any speech-to-text, accuracy isn't 100% unless you are willing to pay for it, and then you are converting the result into another language, which isn't 100% accurate either. The result won't be very usable, but it can be done at a basic level.
I don't know French; perhaps someone on here does and can help. -
SE Whisper Audio to Text can transcribe French and Spanish and many other languages.
Songs will work even better than normal dialogue. I have transcribed songs by Edith Piaf so this should work for your song as well.
Just select the language and the model, load the video or audio file and it should work.
But you are limited in which model you can select. Perhaps tiny might work for you (looking at your system, unless you have upgraded it by now), though it is obviously not that accurate. A lot also depends on the sound quality of the song.
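For anyone who does end up outside the SE menus, here is a minimal sketch of the same two choices (language and model) expressed as a command for the original OpenAI Whisper. It assumes that bundle is installed so that a `whisper` command exists; the file name `clip.mkv` and the helper name `build_whisper_command` are made up for illustration. The sketch only builds and prints the commands, it does not run them.

```python
import shlex


def build_whisper_command(media_path, language="French", model="small",
                          translate=False):
    """Return the argument list for one run of the OpenAI `whisper` command."""
    # "transcribe" keeps the spoken language; "translate" outputs English.
    task = "translate" if translate else "transcribe"
    return [
        "whisper", media_path,
        "--model", model,
        "--language", language,
        "--task", task,
        "--output_format", "srt",
        # One folder per task, so the second run does not overwrite the first.
        "--output_dir", task,
    ]


if __name__ == "__main__":
    # "clip.mkv" is a placeholder file name, not the clip from this thread.
    for translate in (False, True):
        print(shlex.join(build_whisper_command("clip.mkv", translate=translate)))
```

The first printed command would give a French SRT and the second an English one, which matches the "transcribe, then possibly translate" goal in the first post.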
Check Spotify; they have lyrics for some of their content, but if this is not a well-known song, chances are you will not find it there.
As mentioned in one of the replies above, upload the song and perhaps someone will be able to help you.
Last edited by Subtitles; 18th Jul 2023 at 10:06.
-
Thanks for answering,
Unfortunately it's just some extempore lyrics, not in the text of the play. As you know, I've run into something similar earlier (since you helped with it). I asked a friend with these interests if they knew any French. That is the easiest and best way, I suppose. -
@Subtitles.
I'll give it a whirl with Whisper, but it fails for me repeatedly. The last time I used Whisper I had Maria Callas, not Edith Piaf. I wanted to know if I was missing a step in changing the language from English to French, and what the procedure is to get the model.
I have a fresh request in with Subtitle Edit to solve the looping problem in Whisper described just a few days ago, pointing to the piece of code mentioned here on videohelp.
Last edited by loninappleton; 18th Jul 2023 at 11:34.
-
If you have tried it and it failed, then there is no point in trying again.
The problem with SE is that it installs the cpp engine by default, which is just not good enough.
You will always have this limitation unless you install one of the other Whisper bundles listed under Engine (top right):
cpp (C++ already installed)
Open AI (Needs Python)
Const-me
CTranslate2
WhisperX
Here is some information about these bundles:
A simplified/optimized version called whisper.cpp (written in C++) (already installed)
The original OpenAI Whisper (requires Python)
A GPU-optimized version called Whisper Const-me (written in C++)
An optimized version called Whisper CTranslate2 (also known as FasterWhisper) (requires Python)
Another Python version called whisperX (requires Python) -
Anything where there is no command line work. Sorry, I just can't cope. Anyway, if Python is installed, will these options run OK from the Subtitle Edit menu?
-
Yes.
Open the SE Help and type Whisper. It will take you to the list I mentioned above and the download links -
Ok, so select help file first. I'll try it as I go here....
Selecting help in SE just goes to the Nikse FAQ.
I see that from previous attempts at things, I do have Python 3.9 in a folder that shows 15 MB. I don't know if that is right, but it apparently installed for me, or I can redo it.
Also, even though the Whisper window shows a long list of languages, I had not tried to select any of those. Are there additional model files to install for all those languages? Or will it just run in the normal way?
Also, the only GPU I have for this is the onboard video on an ASUS A320 with a small Ryzen CPU. -
Last thing today.
There's the MKV clip, altogether maybe four minutes.
In the application, I got balled up trying to select WhisperX. A message at the top of a desktop window said WhisperX was looking for the Python exe. Then it automagically reverted to CPP.
I may not want to chase this rabbit down the hole but will see any replies.
The song in the clip will, IIRC, be appreciated by Edith Piaf fans.
With CPP set to French and the small model, the SRT output gave the English spoken text and the usual "speaking a foreign language" placeholder at the sung parts.
Maybe someone knows a smattering of French.
Anyway, we know Python is on the PC. It may be necessary to put it in the SE application or roaming path someplace.
Enough for now. I would like to know if anyone gets or is moved to get the sung part on the clip with their tools. -
First of all, it is not enough to just install Python in your system. You need to follow the instructions on how to install each Whisper bundle when using Python.
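As a quick way to see the difference between "Python is installed" and "the Whisper bundle is installed", here is a small sketch. It only reports what the Python interpreter running it can see; it does not tell you what SE itself looks for. The module names `whisper`, `faster_whisper` and `whisperx` are the import names of the three Python bundles as I understand them, so treat them as assumptions.

```python
import importlib.util
import shutil
import sys


def find_python(candidates=("python", "python3", "py")):
    """Return the full path of the first Python launcher found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def module_installed(name):
    """True if `name` can be imported by the interpreter running this script."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python on PATH:", find_python() or "not found")
    print("Running under:", sys.executable)
    # Assumed import names for the Python-based bundles.
    for module in ("whisper", "faster_whisper", "whisperx"):
        state = "installed" if module_installed(module) else "missing"
        print(f"{module}: {state}")
```

If Python shows up but all three modules are "missing", that would be consistent with the situation described here: Python alone is on the PC, but none of the bundles has been installed into it.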
The song is very short and the sound level is so low I had to put the volume up to maximum.
Whisper doesn't handle mixed languages well. In the case of your clip, you can transcribe twice, once per language (French, then English), and then combine the results into one SRT file, keeping only the lines in the correct language.
But here are the transcription and translation SRT files. Minor editing is needed. They look good, but you will be the judge of that.
Last edited by Subtitles; 19th Jul 2023 at 04:02.
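The "transcribe twice and combine" idea can be sketched roughly as below. This is not how the attached files were made; it is only an illustration, and it rests on one assumption: that the English pass marks the sung part with a placeholder containing the words "foreign language". Cues with that placeholder are swapped for whatever French cues overlap them in time. The cue texts in the demo are dummy strings, not the actual dialogue or lyrics.

```python
import re

TIME = re.compile(r"(\d+):(\d+):(\d+)[,.](\d+)")
# Assumed wording of the placeholder the English pass writes over the song.
PLACEHOLDER = re.compile(r"foreign language", re.IGNORECASE)


def to_ms(stamp):
    h, m, s, ms = map(int, TIME.match(stamp.strip()).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def to_stamp(ms):
    s, ms = divmod(ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def parse_srt(text):
    """Return a list of (start_ms, end_ms, text) cues."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        timing = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if timing is None:
            continue
        start, end = lines[timing].split("-->")
        cues.append((to_ms(start), to_ms(end), "\n".join(lines[timing + 1:])))
    return cues


def merge(english, french):
    """Keep English cues, but swap placeholder cues for overlapping French cues."""
    merged = []
    for start, end, text in english:
        if PLACEHOLDER.search(text):
            merged.extend(c for c in french if c[0] < end and c[1] > start)
        else:
            merged.append((start, end, text))
    # A French cue can overlap two placeholders, so drop duplicates.
    return sorted(set(merged))


def write_srt(cues):
    return "\n".join(
        f"{i}\n{to_stamp(a)} --> {to_stamp(b)}\n{text}\n"
        for i, (a, b, text) in enumerate(cues, 1)
    )


# Dummy two-cue files standing in for the English and French passes.
english = parse_srt("""1
00:00:01,000 --> 00:00:03,000
[English dialogue]

2
00:00:03,500 --> 00:00:08,000
(speaking in foreign language)
""")
french = parse_srt("""1
00:00:01,000 --> 00:00:03,000
[English misheard as French]

2
00:00:03,500 --> 00:00:08,000
[French lyric]
""")
merged = merge(english, french)
print(write_srt(merged))
```

The result still needs checking by eye, as with any combined file, since the placeholder test is only a heuristic.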
-
Much appreciated. Thanks for doing this. I didn't know audio volume was an issue.
On the Python installs, one of the other members had made a help file covering those steps and, I think, brought it to videohelp at one time. I had tried some of that but had difficulties. I did retain bearmancer's text help and just looked at that. It is an involved process. The screenshot I made of the routine, with lots of command line, is also too small to read. It needs an auto install.
I'm sure that someone will take that on. -
Glad I was able to help. Nice video clip.
I think you will get better results if you add a GPU to your computer, with at least 2GB or 4GB of VRAM. It doesn't have to be new; perhaps you can find a used one in a computer repair shop or on eBay. A new one costs about $50 on Amazon, but a used one maybe $20. A lot of gamers upgrade their GPUs to better ones, so they sell the old ones.
Check the PSU requirement, but a 2GB card hopefully shouldn't be a problem. -
I did find my notepad copy of the bearmancer routine for any interested. But I hesitate on just posting it to videohelp. I can load it to my cloud site at Mediafire if there is interest. Or maybe Bear is still on here and can clarify. His routine includes another source/program called Decipher. In sum, it takes a different eye on this than I can manage. -
Yes, there are things I should get to upgrade. I'd like to see a working solution for these various problems. Until then I get by with minimum graphics since I don't game etc.
I'm sure Partpicker or one of those can make a good suggestion when needed. It would be a Radeon of some sort to use with the Ryzen 3400 and MSI A320M-K. -
Just a note on the video clip. It is from The Beaux Stratagem by George Farquhar and is available on the net. A subtitle was available, but in the things I do I strive for word-accurate subs.
The errors are humorous (my favorite is poultry for paltry) but they are too numerous to mention. That's why better audio to text is needed. That and rewriting most every line to reflect accurately who is speaking.
Last edited by loninappleton; 25th Jul 2023 at 12:25.
-
-
Because you need to install it YOURSELF on your system first in order for SE to recognize it.
-
-
Some progress.
I used the Const-me option and it self-installed. I did one short test with that same Beaux Stratagem clip, with some improvement, and on the short clip at least was able to step up to the medium size model.
There are some longer tests, like a repeated fail on a Maria Callas documentary, that may get past the looping problem.
That fail was going from a sung passage back into interviews, which only responded thenceforth with errors.
Thanks for all your patience. -
Using the medium model can cause a lot of issues unless you have a GPU with 8GB VRAM.
I would try using tiny or base (each needs 1GB VRAM) or small (needs 2GB VRAM).
Fun fact in case you didn't know: press F2 during transcription and you should see the task progress and the subtitles as they are generated. -
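The sizing advice above amounts to a simple lookup, sketched here. The VRAM numbers are the ones quoted in this thread, not official requirements, and the function name `largest_model_for` is made up for the example.

```python
# Rough VRAM needs in GB, taken from the figures quoted in this thread.
# Ordered from smallest to largest model.
MODEL_VRAM_GB = [("tiny", 1), ("base", 1), ("small", 2), ("medium", 8)]


def largest_model_for(vram_gb):
    """Return the biggest model whose quoted VRAM need fits, or None."""
    fitting = [name for name, need in MODEL_VRAM_GB if need <= vram_gb]
    return fitting[-1] if fitting else None


if __name__ == "__main__":
    for vram in (1, 2, 4, 8):
        print(f"{vram} GB VRAM -> {largest_model_for(vram)}")
```

By these numbers a 2GB or 4GB card would top out at small, and medium only becomes a safe choice at 8GB.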
I'll have to try the F2, though I run Task Manager during the job, which shows all the CPU usage etc. and indicates when the job is done. All this seems more and more like a kluge, something that's not ready for prime time at all.
When I tried the combination of Const-me (whatever that is, C++ based) with the medium model, it failed on a 90-minute documentary: everything after minute 50 was blank, though it does run to the end of the timed video. I also unplugged ethernet to rule out any interference. The 90-minute documentary seemed to get done way too fast compared with a normal job of this kind. GPU ran at 99 or 100% the whole time and CPU usage was pretty low. -
I noticed today, in trying to revert to smaller models, that "tiny" is in the models folder but does not show as an option with CPP in Subtitle Edit. tiny has an unusual file name:
tiny.en.bin.$$$
unlike the other two, which just end in .bin. I have not attempted any corrections on it. -
What is the size of the files?
My SE went crazy and wouldn't download any of the model *.bin files. The models subfolder has the file names, but they all have zero size.
Be careful, this can happen to you as well, and I have no idea how to fix it, at least not on this computer.
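Both symptoms described here (a file ending in something other than .bin, and files with zero size) can be checked in one pass. A minimal sketch follows; the folder path is whatever your own models folder is, and reading an odd extension such as `.$$$` as an unfinished download is my assumption, not something confirmed in this thread.

```python
from pathlib import Path


def classify(name, size):
    """Return a short verdict for one model file, judged by name and size only."""
    if size == 0:
        return "empty, download probably failed"
    if not name.endswith(".bin"):
        # Assumption: an extra extension means the download never finished.
        return "unexpected extension, possibly an unfinished download"
    return "ok"


def check_models(folder):
    """Return (file name, size in bytes, verdict) for each file in `folder`."""
    report = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            size = path.stat().st_size
            report.append((path.name, size, classify(path.name, size)))
    return report


if __name__ == "__main__":
    # "." is a stand-in; point this at your own models folder.
    for name, size, verdict in check_models("."):
        print(f"{name:30} {size:>12} bytes  {verdict}")
```

Any file that does not come back as "ok" is a candidate for deleting and downloading again, though that is a guess at a fix rather than a tested one.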