VideoHelp Forum




  1. Member
    Join Date
    Jan 2007
    Location
    Greece
    Is there any way to convert subtitles that are part of the video itself (hard-burned) into black-and-white images (saved as BMP, in a format like *.idx or similar), instead of having to recognize them with OCR, which is far too slow?

    SubRip supports that option only for ripping DVD subtitles; for hard-burned subtitles there isn't one, or I can't find it. I'm sure SubRip can produce black-and-white images of the subtitles that are easy to read by eye but difficult to recognize with OCR. So I think it should not be difficult for a program to create a subtitle file from a sequence of these images.
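    As a rough illustration of what such a black-and-white image is, here is a minimal sketch that thresholds a grayscale strip of a frame. It assumes the subtitles are bright text on a darker picture; the threshold value and the sample pixels are made up, and this is not how SubRip itself works.

```python
def binarize(gray, threshold=200):
    """Return a 1-bit image: 1 where a pixel is bright (likely subtitle text), else 0.

    `gray` is a list of rows of 0-255 values. The default threshold is an
    assumed starting point and would need tuning per video.
    """
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


# A tiny 3x5 grayscale strip (0 = black, 255 = white); hypothetical values.
strip = [
    [12, 240, 250, 30, 18],
    [20, 235, 40, 245, 25],
    [15, 250, 248, 22, 10],
]

mask = binarize(strip)
for row in mask:
    print("".join("#" if bit else "." for bit in row))
```

    A plain threshold also picks up any bright area of the picture, which is one reason real tools add colour and outline checks on top.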

    Does anyone know how to do this? Thanks.
  2. Always Watching guns1inger's Avatar
    Join Date
    Apr 2004
    Location
    Miskatonic U
    I have yet to find one that does a good OCR job even on subtitle sub-pictures, let alone ripping them from a moving image. I'll be interested to see what people recommend to you.
  3. Member mats.hogberg's Avatar
    Join Date
    Jul 2002
    Location
    Sweden (PAL)
    I think there is an option in SubRip to OCR from a hard-subbed AVI to srt. From there, use srt2sup to create a "real" DVD subtitle stream. I can't really see any way to do this without going through some kind of text-based format. Any app (whether OCR or other) must determine "OK, this is a character". Once that is done, it may as well work out which character it is as create an image of it.
    I find it hard to believe the app would go "OK, this is a character, but dang if I know what it is!" Spotting a character and spotting what character it is, amounts to the same thing.

    /Mats
  4. Member AlanHK's Avatar
    Join Date
    Apr 2006
    Location
    Hong Kong
    http://avielle.chez-alice.fr/video/sublog.html
    to extract

    and maybe you want http://www.compression.ru/video/subtitles_removal/index_en.html to remove them from the video after you've made a separate stream.
  5. Member AlanHK's Avatar
    Join Date
    Apr 2006
    Location
    Hong Kong
    Originally Posted by mats.hogberg
    I find it hard to believe the app would go "OK, this is a character, but dang if I know what it is!" Spotting a character and spotting what character it is, amounts to the same thing.
    Unfortunately not so. You can never blindly trust OCR.
    Download some of the subs from sites like http://www.opensubtitles.org.

    While many are quite good, hardly any are perfect. Many seem to have been created by Subrip-style OCR, as the mistakes aren't normal typos but OCR errors, like confusing I, l, 1 (India, land, 123), r/n, etc, running words together. And one thing that irritates me is punctuation, dashes (—) are always converted to hyphens (-), and typographic quotes (“”) become straight quotes(").
    [edit: Hmm, this stupid interface has converted my characters to Unicode numbers.]

    Whenever I want to use one of these files, I spend about an hour cleaning up those mistakes, not including syncing it with my version.
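    Part of that cleanup can be scripted. The sketch below covers only a few of the I/l/1 confusions mentioned above; the rules are my own heuristics, not anything SubRip does, and they will produce false positives, so the output still needs proofreading.

```python
import re


def fix_ocr_confusions(line):
    """Apply a few heuristic fixes for I/l/1 mix-ups in OCR'd subtitle text."""
    # A lone "l" standing as a whole word is almost always the pronoun "I".
    line = re.sub(r"(?<![\w'])l(?![\w'])", "I", line)
    # "l'm", "l'll", "l've", "l'd" at the start of a word: the "l" should be "I".
    line = re.sub(r"\bl(?='(?:m|ll|ve|d)\b)", "I", line)
    # "l" or "I" sandwiched between digits is most likely the digit 1.
    line = re.sub(r"(?<=\d)[lI](?=\d)", "1", line)
    return line


print(fix_ocr_confusions("l think l'm late, room 2l4"))
```

    Anything the rules cannot decide from context (r/n, words run together, punctuation) is left alone.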
  6. Member mats.hogberg's Avatar
    Join Date
    Jul 2002
    Location
    Sweden (PAL)
    Ah, yes, that's true, you can't blindly trust OCR, but that was not my point.
    Those kinds of errors are mostly due to the user incorrectly telling the OCR that "1" is in fact "l", or to a typeface that doesn't make 1 and l dissimilar enough.
    So, in this case, the engine spots a character and also finds what it is (but at times makes mistakes).

    /Mats
  7. Member
    Join Date
    Jan 2007
    Location
    Greece
    Thanks, AlanHK, for suggesting these programs. I will try them to see whether they solve my problem.
  8. Member
    Join Date
    Jan 2007
    Location
    Greece
    I tested the SubLog plugin for VirtualDub but I had some problems.

    First I had to change this part of the idx file:

    # Original frame size
    size: 640x96

    to

    # Original frame size
    size: 640x532

    because the subs were covering the whole screen. Then there were some problems in some subs, as in the image "untitled.png". I also had to move the subtitles further down because they appeared very high.
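    The same edit can be scripted instead of done by hand. This is a minimal sketch that rewrites the `size:` line shown above; the function name is hypothetical, and it only touches that one line of the idx text.

```python
import re


def set_idx_frame_size(idx_text, width, height):
    """Rewrite the 'size: WxH' line in the text of a VobSub .idx file."""
    new_text, count = re.subn(
        r"^size:[ \t]*\d+x\d+[ \t]*$",
        f"size: {width}x{height}",
        idx_text,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError("no 'size:' line found in idx text")
    return new_text


sample = "# Original frame size\nsize: 640x96\n"
print(set_idx_frame_size(sample, 640, 532))
```

    Changing the frame size does not by itself move the subtitles down the screen; that still has to be adjusted separately.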

    When I chose not to save in VobSub format, the image I got was like "test-4.png", which was better than the one created with the VobSub option.

    The best picture I got was from SubRip ("image_14.PNG"), but there are no timecodes for the images SubRip creates.

    Have you used the SubLog program successfully?

    Do you know of any application to adjust the timecodes of an idx/sub subtitle file (like Subtitle Workshop does for the text formats)?


    (Note: I will upload the images I mentioned above.)
  9. Member
    Join Date
    Jan 2007
    Location
    Greece
    test-4.PNG
  10. Member
    Join Date
    Jan 2007
    Location
    Greece
    image_14.PNG
  11. Member
    Join Date
    Jan 2007
    Location
    Greece
    Sorry... I couldn't find a way to upload the images. If somebody could tell me how to...
  12. Member AlanHK's Avatar
    Join Date
    Apr 2006
    Location
    Hong Kong
    Originally Posted by jhammer00
    Do you know of any application to adjust the timecodes of an idx/sub subtitle file (like Subtitle Workshop does for the text formats)?
    Try SubToSup to convert to Sup format.

    Then DVDSubedit can open the Sup to change the times, colours, positions.
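    If all that is needed is a constant time shift, the idx file is plain text and can also be edited with a short script. The sketch below assumes the usual VobSub layout of `timestamp: HH:MM:SS:mmm, filepos: ...` lines, which is not shown anywhere in this thread, so check it against your own file first. The function name is hypothetical.

```python
import re

# Assumed idx line layout: "timestamp: 00:01:02:345, filepos: 000001000"
TIMESTAMP = re.compile(r"^timestamp: (\d+):(\d{2}):(\d{2}):(\d{3})", re.MULTILINE)


def shift_idx_timestamps(idx_text, offset_ms):
    """Shift every timestamp in idx text by offset_ms, clamping at zero."""

    def shift(match):
        h, m, s, ms = (int(g) for g in match.groups())
        total = max(0, ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms)
        total_s, ms = divmod(total, 1000)
        total_m, s = divmod(total_s, 60)
        h, m = divmod(total_m, 60)
        return f"timestamp: {h:02d}:{m:02d}:{s:02d}:{ms:03d}"

    return TIMESTAMP.sub(shift, idx_text)


sample = "timestamp: 00:00:59:800, filepos: 000000000\n"
print(shift_idx_timestamps(sample, 500))
```

    Only the timestamps change; the `filepos` offsets into the .sub file are left untouched, so the subpictures themselves stay as they are.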
  13. Member
    Join Date
    Jan 2007
    Location
    Greece
    OK, here is the result from SubLog. The only problem is that there are sometimes horizontal lines that I couldn't get rid of by adjusting the filter's settings.
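    One possible workaround is to clean the extracted images afterwards rather than in the filter. The sketch below is only an idea, not a SubLog feature: it assumes the artifacts are lines spanning nearly the full image width, and the 90% cut-off is a guess.

```python
def drop_full_rows(mask, max_fill=0.9):
    """Blank out rows of a 0/1 image in which more than max_fill of the pixels are set.

    Text rows contain gaps between letters, so they rarely fill a whole row;
    a stray horizontal line does.
    """
    cleaned = []
    for row in mask:
        fill = sum(row) / len(row) if row else 0.0
        cleaned.append([0] * len(row) if fill > max_fill else list(row))
    return cleaned


noisy = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],  # stray horizontal line
    [0, 1, 0, 0],
]
print(drop_full_rows(noisy))
```

    This would also erase a genuine full-width stroke, so it is only reasonable when the subtitle text occupies a small part of the image width.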



    When I unchecked the option "Save in VobSub format", SubLog produced the images as separate files and the result was better.



    But unfortunately, as far as I know, there is no player that plays subtitles from image files. Maybe there is a way to convert a sequence of images to idx/sub format? (I think you would need to have the timecodes in another file.)
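    On the timecode question: the times can in principle be derived from the frames themselves, by noting where the subtitle image changes. The sketch below assumes you already have one black-and-white mask per video frame and a known frame rate; all names are hypothetical, and it does not write idx/sub, it only produces the (start, end, image) list such a converter would need.

```python
def frames_to_events(masks, fps=25.0):
    """Group runs of identical, non-empty masks into (start_s, end_s, mask) events."""
    events = []
    current, start = None, 0
    # The trailing None is a sentinel that flushes the last run.
    for i, mask in enumerate(list(masks) + [None]):
        if mask != current:
            # Close the previous run if it actually contained subtitle pixels.
            if current is not None and any(any(row) for row in current):
                events.append((start / fps, i / fps, current))
            current, start = mask, i
    return events


EMPTY = [[0, 0], [0, 0]]
SUB_A = [[1, 0], [0, 1]]
SUB_B = [[1, 1], [0, 0]]
frames = [EMPTY, SUB_A, SUB_A, SUB_A, EMPTY, SUB_B, SUB_B]

for start, end, _ in frames_to_events(frames, fps=1.0):
    print(start, end)
```

    Real frames never match exactly because of compression noise, so a usable version would compare masks with some tolerance instead of strict equality.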

    SubRip also extracted very good quality images of the subs, but again only as image files and not in idx/sub format.

    SubRip example:



    SubRip made the best subpictures (better than SubLog), but unfortunately it doesn't have an option to save these subpictures as an idx/sub file. If it could do that, it would be just perfect for ripping hard-burned subtitles from video files without OCR. I tried to contact the people who work on SubRip to suggest including such an option in a future version, but I didn't get any reply.

    Do you know who to contact about SubRip?

    Or maybe how I can improve the results of the SubLog filter?
  14. Member manusse's Avatar
    Join Date
    Jun 2006
    Location
    France
    Do you know who to contact about SubRip?
    You can contact ai4spam on the doom9 forum but I think he doesn't have much time for development now:
    http://forum.doom9.org/member.php?u=41898

    Cheers
    Manusse


