VideoHelp Forum




  1. I'm pulling a subtitle track out of an MKV file so I can edit it. In the SRT file that is created, all the lowercase l's have been turned into uppercase I's, as if the tool were doing OCR on the file rather than extracting text. According to the MKVMerge GUI, the track type is S_TEXT/UTF8. I have played the video in VLC and the soft subs show up correctly.
    Is the tool doing OCR, or could this be a bad character encoding? Any other suggestions on getting a good SRT text file out?

    Thanks,
  2. AlanHK (Member, joined Apr 2006, Hong Kong)
    Presumably you're using MKVextract.
    That gives you the exact same text as displayed in VLC.
    It doesn't and can't OCR.

    But the guy who created the subtitle obviously used OCR, maybe on DVD subs, and didn't bother to check it.

    If you change the subtitle font in VLC, you will see that the subs show the same error. In Arial and similar sans-serif fonts, capital I and lowercase l look identical.
    If you use a serif font, like Georgia or Times or Courier, then you can clearly see the difference.
    Georgia also has a more distinct number 1.

    I l 1 Arial
    I l 1 Georgia
    I l 1 Times
    I l 1 Courier
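
If changing fonts isn't convenient, another way to see which character is really in the extracted file is to print its code points. This is only a rough sketch in Python; `describe_chars` is a made-up helper name, not part of any subtitle tool.

```python
import unicodedata

def describe_chars(text):
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]

# Capital I, lowercase l and the digit 1 look alike in many fonts,
# but their code points and Unicode names are different.
for ch, code, name in describe_chars("Il1"):
    print(ch, code, name)
```

Paste a suspicious word from the SRT in place of `"Il1"`; if a letter that should be a lowercase l reports as LATIN CAPITAL LETTER I, the error is in the subtitle text itself, not in the extraction.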


    There isn't a "good" subtitle hiding in the video; it's fucked, and you have to fix it yourself if you want a good one.
    I often extract the subs so I can fix and replace them for exactly that reason.

    I use Georgia as the text font in Subtitle Workshop to make the errors more obvious.
    Subtitle Workshop also has some functions to find and fix such common OCR errors.
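
For anyone who would rather script a first pass, here is a minimal sketch in Python of one such fix. It is not how Subtitle Workshop does it; the rule and the name `fix_misread_l` are my own assumptions. The rule: a capital I (or a run of them) directly after a lowercase letter is treated as a misread lowercase l.

```python
import re

# One or more capital I's immediately preceded by a lowercase letter.
# In ordinary English text this is almost always a misread "l".
_MISREAD_L = re.compile(r"(?<=[a-z])I+")

def fix_misread_l(text):
    """Replace each capital I that follows a lowercase letter with 'l'."""
    return _MISREAD_L.sub(lambda m: "l" * len(m.group()), text)

print(fix_misread_l("I wiII caII you"))  # prints "I will call you"
```

This deliberately leaves a word-initial I alone (so "Iittle" is not fixed, and neither are "It" or "Is" broken), and it will wrongly change names like "McIntyre", so check the result by eye. SRT timing lines contain no letters, so they pass through unchanged.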

    Spellcheck will find most of the rest.
    Last edited by AlanHK; 16th Nov 2013 at 21:38.


