Is there any way to convert subtitles that are hard-burned into a video file (part of the picture itself) to black-and-white images (saved as BMP, in a format like *.idx or something) instead of having to recognize them with OCR, which is far too slow?
SubRip supports that option only for ripping DVD subtitles; for hard-burned subtitles there isn't one, or I can't find it. I'm sure SubRip can produce black-and-white images of the subtitles that are clear to read by eye but difficult to recognize with OCR. So I think it should not be difficult for a program to create a subtitle file from a sequence of these images.
Does anyone know how to do this? thanks
-
-
I think there is an option in SubRip to OCR from a hard-subbed AVI to srt. From there, use srt2sup to create a "real" DVD subtitle stream. I can't really see any way to do this without going through some kind of text-based format. Any app (whether OCR or other) must determine "OK, this is a character". Once that is done, it may as well identify which character it is as create an image of it.
I find it hard to believe the app would go "OK, this is a character, but dang if I know what it is!" Spotting a character and spotting which character it is amount to the same thing.
/Mats -
http://avielle.chez-alice.fr/video/sublog.html to extract them,
and maybe you want http://www.compression.ru/video/subtitles_removal/index_en.html to remove them from the video after you've made a separate stream. -
Originally Posted by mats.hogberg
Download some of the subs from sites like http://www.opensubtitles.org.
While many are quite good, hardly any are perfect. Many seem to have been created by SubRip-style OCR, as the mistakes aren't normal typos but OCR errors, like confusing I, l and 1 (India, land, 123) or r and n, and running words together. One thing that irritates me is the punctuation: dashes (—) are always converted to hyphens (-), and typographic quotes (“”) become straight quotes (").
[edit: Hmm, this stupid interface has converted my characters to Unicode numbers.]
Whenever I want to use one of these files, I spend about an hour cleaning up those mistakes, not including syncing it with my version. -
Ah, yes, that's true, you can't blindly trust OCR, but that was not my point.
Errors of that kind are mostly due to the user incorrectly telling the OCR that "1" is in fact "l", or to a typeface that doesn't make 1 and l dissimilar enough.
So, in this case, the engine spots a character and also finds out what it is (but at times makes mistakes).
/Mats -
Thanks, AlanHK, for suggesting these programs. I will try them to see if they solve my problem.
-
I tested the SubLog plugin for VirtualDub but I had some problems.
First I had to change this part of the idx file
# Original frame size
size: 640x96
to
# Original frame size
size: 640x532
because the subs were covering the whole screen. Then there were problems in some subs, as in the image "untitled.png". I also had to move the subtitles further down because they appeared very high.
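For anyone who wants to script the idx edit described above instead of doing it by hand, here is a minimal Python sketch. It only assumes the idx file is plain text with a `size: WIDTHxHEIGHT` line, as in the excerpt quoted in this post; the function name `set_idx_frame_size` is made up for illustration.

```python
import re

# Matches the "size: 640x96" line shown in the post; the rest of the line is left alone.
SIZE_LINE = re.compile(r"^(size:[ \t]*)\d+x\d+", re.MULTILINE)

def set_idx_frame_size(idx_text: str, width: int, height: int) -> str:
    """Return idx_text with the first 'size: WxH' entry set to width x height."""
    return SIZE_LINE.sub(rf"\g<1>{width}x{height}", idx_text, count=1)

if __name__ == "__main__":
    sample = "# Original frame size\nsize: 640x96\n"
    print(set_idx_frame_size(sample, 640, 532))
```

To use it on a real file, read the idx as text, pass it through the function, and write the result back, keeping a copy of the original in case the player rejects the change.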
When I chose not to save in VobSub format, the image I got was like "test-4.png", which was better than the one created with the VobSub option.
The best picture I got was from SubRip ("image_14.PNG"), but there are no timecodes for the images SubRip creates.
Have you used SubLog successfully?
Do you know of any application that can adjust the timecodes of an idx/sub subtitle file (like Subtitle Workshop does for text formats)?
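If no tool turns up, a constant delay can be applied to the idx by script, since the idx is a text file. The sketch below is only an illustration and assumes the timing entries look like `timestamp: HH:MM:SS:mmm, filepos: ...`, which is how VobSub idx files are usually laid out; the function name `shift_idx_timestamps` is hypothetical, and the .sub file itself is not touched.

```python
import re

# Assumed entry layout: "timestamp: 00:01:02:345, filepos: 0000012ab"
TIMESTAMP = re.compile(r"^(timestamp:[ \t]*)(\d+):(\d{2}):(\d{2}):(\d{3})", re.MULTILINE)

def shift_idx_timestamps(idx_text: str, offset_ms: int) -> str:
    """Shift every timestamp by offset_ms (negative = earlier), clamping at zero."""
    def repl(match: re.Match) -> str:
        hours, minutes, seconds, millis = (int(match.group(i)) for i in range(2, 6))
        total = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis + offset_ms
        total = max(total, 0)  # never produce a negative time
        hours, rest = divmod(total, 3_600_000)
        minutes, rest = divmod(rest, 60_000)
        seconds, millis = divmod(rest, 1000)
        return f"{match.group(1)}{hours:02d}:{minutes:02d}:{seconds:02d}:{millis:03d}"
    return TIMESTAMP.sub(repl, idx_text)

if __name__ == "__main__":
    line = "timestamp: 00:00:01:500, filepos: 000000000\n"
    print(shift_idx_timestamps(line, 2750))
```

This only handles a fixed offset. Stretching the timing (for a different frame rate) would need each time multiplied by a factor instead of having a constant added.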
(Note: I will upload the images I mentioned above.) -
Sorry... I couldn't find a way to upload the images. If somebody could tell me how...
-
-
Originally Posted by jhammer00
Then DVDSubedit can open the Sup to change the times, colours, positions. -
OK, here is the result from SubLog. The only problem is that there are sometimes horizontal lines that I couldn't get rid of by adjusting the filter settings.
When I unchecked the option "Save in VobSub format", SubLog produced images in separate files and the result was better.
But unfortunately there is no player that plays subtitles from image files, as far as I know. Maybe there is a way to convert a sequence of images to idx/sub format? (I think you must have the timecodes in another file.)
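On the timecode side of that idea, here is a rough Python sketch of how frame numbers could be turned into timecodes for a separate timing file. It is not a converter to idx/sub. It assumes you know the start frame of each image and the video frame rate; the file names and the list layout are invented for the example, and the `HH:MM:SS:mmm` form is chosen to match what idx files normally use.

```python
def frame_to_timecode(frame: int, fps: float) -> str:
    """Convert a frame number to an HH:MM:SS:mmm timecode at the given frame rate."""
    total_ms = round(frame * 1000 / fps)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{millis:03d}"

def build_timing_list(images, fps):
    """images: iterable of (file_name, start_frame) pairs (hypothetical layout)."""
    return [f"{frame_to_timecode(start, fps)} {name}" for name, start in images]

if __name__ == "__main__":
    # Hypothetical image names and start frames, for illustration only.
    subs = [("image_01.bmp", 250), ("image_02.bmp", 400)]
    for entry in build_timing_list(subs, fps=25.0):
        print(entry)
```

A list like this would still need a tool that packs the bitmaps into the .sub stream; the sketch only covers the timing half of the problem.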
SubRip also extracted very good quality images of the subs, but again only as image files and not in idx/sub format.
SubRip example:
SubRip made the best subpictures (better than SubLog), but unfortunately it doesn't have an option to save them as an idx/sub file. If it could do that, it would be just perfect for ripping hard-burned subtitles from video files without OCR. I tried to contact the people who work on SubRip to suggest including such an option in a future version, but I didn't get any reply.
Do you know who to contact about SubRip?
Or maybe how I can improve the results of the SubLog filter? -
Do you know who to contact about SubRip?
http://forum.doom9.org/member.php?u=41898
Cheers
Manusse