VideoHelp Forum
  1. In the past I have extracted the Closed Captions subtitles from a DVD. I used VSRip and after a few minutes I had all of the subtitles, error-free. Now I want to extract the regular subtitles from another DVD. I tried SubRip, but it kept asking me to type in letters, and the final result was totally unacceptable (full of misspellings). My question is this: I thought the subtitles were digitally encoded on the DVD. Why can't they simply be digitally extracted (like when I extracted the Closed Captions)? Thanks for any insight on this topic.
  2. In what format do you want the subs? What do you plan on doing with them? The two most common image-based subtitle formats are SUP and IDX/SUB.

    To get the SUP files, open the right IFO in PGCDemux and save them out. To get the IDX/SUB format subtitles, open the right IFO in VobSub Configure (comes in the VobSub package), and save them. You can also use VSRip for the job. Neither method requires OCR. Other image-based formats include SON and SST subs, which you can get by opening the IFO in SubRip and saving them to the graphical sub format of your choice.

    Any text-based subtitle format (SRT, SSA, etc.) requires OCR, which you're trying to avoid.
  3. I normally get subtitle files from the web, and they are in sub or srt format, which I am familiar with. I can use Time Adjuster to sync the subtitles, and I can see the time (using Notepad) in minutes and seconds when each subtitle should appear in the video. When everything is right, I use DVD Flick with the srt or sub file and the avi to make the DVD. If I can't get the subtitles from the web, then it seems that I should be able to get them from the DVD directly and rip them into an srt or sub file. I am not familiar with what you mentioned (IDX/SUB) and I wouldn't know how to adjust the timing with that. Furthermore, I don't think DVD Flick will handle that subtitle format.
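    For anyone curious what "adjusting the timing" of a text subtitle amounts to, here is a minimal Python sketch that shifts every SRT timestamp by a fixed offset. It is only an illustration of the idea, not how Time Adjuster works internally; the function name `shift_srt` and the sample text are made up for this example.

```python
import re

# Matches an SRT timestamp such as 00:01:02,345 (hours:minutes:seconds,milliseconds).
_TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_srt(text: str, offset_ms: int) -> str:
    """Return SRT text with every timestamp moved by offset_ms (negative = earlier)."""

    def _shift(match: re.Match) -> str:
        h, m, s, ms = (int(g) for g in match.groups())
        total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
        total = max(total, 0)  # never produce a negative time
        h, rem = divmod(total, 3_600_000)
        m, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    return _TIMESTAMP.sub(_shift, text)


# Hypothetical one-entry SRT file, delayed by 2.5 seconds.
sample = "1\n00:00:01,500 --> 00:00:03,000\nHello there.\n"
shifted = shift_srt(sample, 2500)
print(shifted)
```

    Note that this simple version would also rewrite anything in the dialogue text that happens to look like a timestamp, so a careful tool would only touch the timing lines.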
  4. Member mats.hogberg's Avatar
    Join Date: Jul 2002
    Location: Sweden (PAL)
    SubRip doesn't know an A from a Z when it starts looking at the subtitles (just like a child learning to read for the first time), so it will ask you once (unlike the child) each time it encounters a character it hasn't seen before. Once you've told it that a Z is really a Z, it won't ask you again about Z.
    For the first few subtitles it needs a lot of interaction, but after a while it knows most characters and decodes (OCRs) to text pretty much without user intervention.
    If you pay close attention, switching to italics when the text is in italics and so on, the output is more than satisfactory, at least in my experience. If you need DVD subtitles in text format, I've found no better way than to use SubRip.
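    The "ask once, then remember" behaviour described above can be sketched in a few lines of Python. This is a toy illustration under heavy assumptions, not SubRip's actual code: glyphs are assumed to be already cut out of the subtitle image and matched exactly, and the names `ocr_glyphs`, `matrix`, and `fake_user` are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

# A glyph is a tiny bitmap: one string per pixel row ('#' = ink, ' ' = background).
Glyph = Tuple[str, ...]


def ocr_glyphs(
    glyphs: List[Glyph],
    matrix: Dict[Glyph, str],
    ask_user: Callable[[Glyph], str],
) -> str:
    """Turn glyph bitmaps into text, asking the user only about unseen glyphs."""
    out = []
    for glyph in glyphs:
        if glyph not in matrix:
            # First time this shape appears: ask, then remember the answer.
            matrix[glyph] = ask_user(glyph)
        out.append(matrix[glyph])
    return "".join(out)


# Two made-up glyph bitmaps standing in for the letters Z and A.
Z = ("###", "  #", " # ", "#  ", "###")
A = (" # ", "# #", "###", "# #", "# #")

asked = []  # records every time the "user" is prompted


def fake_user(glyph: Glyph) -> str:
    """Stand-in for the interactive prompt."""
    asked.append(glyph)
    return "Z" if glyph == Z else "A"


matrix: Dict[Glyph, str] = {}
text = ocr_glyphs([Z, A, Z, Z, A], matrix, fake_user)
print(text, "- prompts needed:", len(asked))
```

    Five glyphs are decoded here with only two prompts. Starting with a `matrix` that is already filled in means fewer prompts, or none at all.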

    /Mats
  5. The root of all evil träskmannen's Avatar
    Join Date: May 2005
    Location: Belgium
    Originally Posted by jimdagys
    My question is this: I thought the subtitles were digitally encoded in the DVD. Why can't they simply be digitally extracted?
    The subtitles on a DVD are stored as a series of pictures of the text, not as the text itself (like a screenshot of a Word document instead of the Word document itself). This means that you have to OCR them if you want them as text. Why do you want them as text? What are you planning to do with the subs?
    In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move.
  6. Yep.
    I normally get subtitles files from the web, and they are in sub or srt format which I am familiar with.
    And to create those subs in SRT or SUB format, someone performed OCR on every single one. If you want them in SRT format, you'll have to do the same thing. The reason I went through all that in my first post was that 1) you were trying to avoid OCR (all the typing), and 2) you didn't provide nearly enough detail about what you were up to, and there are plenty of uses for subs in the graphical formats I described how to get.

    Learn to type better. Once you get the hang of it, you should be able to get the subs for an entire movie using SubRip in 10-15 minutes, tops. There are programs out there that can run a spellcheck on the results, and allow you to go over them for editing purposes.

    Something that might help: ai4spam has created a giant character matrix file that contains the fonts and letters used in the subs of many, many movies. With any luck, if you load the character matrix before beginning, you won't have to do any typing at all, or very little anyway. Without luck, you'll still do a lot of typing. You can get it here:

    http://foxyshadis.slightlydark.com/random/CharMatrix.rar
  7. Now I understand that regular subtitles on a DVD cannot be extracted into a simple text format unless you use OCR. As for your question about why I want srt/sub text subtitles, there are many websites, such as
    http://www.divxmovies.com/subtitles/
    that give you the subtitles in the srt/sub text format. So I am familiar with this simple format that can be read in Notepad and the subtitle time can easily be adjusted with Time Adjuster and the srt file is compatible with simple DVD authoring programs like DVD Flick. Initially, I couldn't find the particular subtitle file that I wanted. (Later I found the subtitle file.) So I thought I could ask a friend who has the DVD to extract the subtitles for me and email them to me. But now I see that OCR is required and that is a bit more complicated if I am asking somebody else to do it. I suppose if the DVD has closed captions, it would be easier (if I am asking somebody a favor and I want to minimize their work) to extract the closed captions rather than extract the regular subtitles.
  8. Member mats.hogberg's Avatar
    Join Date: Jul 2002
    Location: Sweden (PAL)
    You could have your friend extract the subs in "image" format (sub/idx), which is a rather straightforward procedure, and then OCR them yourself with SubRip.

    /Mats
  9. Thanks for the tip about extracting subtitles into the "image" (sub/idx) format. I will have to read up on that.
    I don't mind doing the OCR myself.
  10. Member mats.hogberg's Avatar
    Join Date: Jul 2002
    Location: Sweden (PAL)
    manono gave a brief overview a few posts back:
    Originally Posted by manono
    To get the IDX/SUB format subtitles, open the right IFO in VobSub Configure (comes in the VobSub package), and save them. You can also use VSRip for the job. Neither way requires an OCR.
    /Mats