VideoHelp Forum
  1. Feels Good Man 2nHxWW6GkN1l916N3ayz8HQoi's Avatar
    Join Date
    Jan 2024
    Location
    Pepe Island
    Oops, some formatting errors. My bad. I fixed it by importing it into Subtitle Edit and exporting it again as .srt. Try this one: test.srt

    Edit: There will be some random lines left empty because the OCR went kaput, but I'm gonna look into it. There are like 8 lines out of 310 empty. Gonna have to try different image processing techniques.
  2. Originally Posted by 2nHxWW6GkN1l916N3ayz8HQoi View Post
    Oops, some formatting errors. My bad. I fixed it by importing it in subtitle edit and exporting it again as .srt. Try this one. [Attachment 77032]

    Edit: There will be some random lines left empty because the OCR went kaput, but I'm gonna look into it. There are like 8 lines out of 310 empty. Gonna have to try different image processing techniques.

    Thank you.
    Now it opens, but like that... (gibberish)
    It happened to me too; I didn't find a way to fix it so that it works with Subtitle Workshop.
    And by the way, what program did you use to make the srt?
    [Attachment 77033: gibberish.jpg]
  3. Feels Good Man 2nHxWW6GkN1l916N3ayz8HQoi's Avatar
    Join Date
    Jan 2024
    Location
    Pepe Island
    Originally Posted by benhouli View Post
    Thank you.
    Now it opens, but like that... (gibberish)
    It happened to me too; I didn't find a way to fix it so that it works with Subtitle Workshop.
    And by the way, what program did you use to make the srt?
    It works for me... in Subtitle Edit. Isn't Subtitle Workshop kinda dead anyway? It's been some time since the last update.

    [Attachment 77034]

    As for how I got it, I just used Python and Tesseract OCR and coded my own stuff. Subtitle Edit was used to export all 310 PNG images and their corresponding timestamps, and then in Python I applied Tesseract OCR to each image. Leaving the images unaltered gave a bunch of gibberish, so I used binarization to improve the success rate. That left 8 lines empty; I played around with some other techniques and now there are only 3 or 4 empty. I don't think I can do better than that with what I currently know anyway. I'm sure there are users here who are experts when it comes to this stuff.
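
For anyone curious what the binarization step can look like, here is a minimal sketch, not the actual script from this post. It assumes each subtitle image has already been loaded as rows of 0-255 grayscale values, and it picks the cutoff with Otsu's method, which is one common choice; the post doesn't say which method was used. The names `otsu_threshold` and `binarize` are made up for illustration.

```python
from typing import List


def otsu_threshold(pixels: List[int]) -> int:
    """Return the 0-255 cutoff that maximizes between-class variance (Otsu)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0      # number of pixels with value <= t
    sum_dark = 0    # sum of those pixel values
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue            # no dark pixels yet, nothing to split
        w_bright = total - w_dark
        if w_bright == 0:
            break               # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_bright = (sum_all - sum_dark) / w_bright
        between_var = w_dark * w_bright * (mean_dark - mean_bright) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(rows: List[List[int]], threshold: int) -> List[List[int]]:
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[255 if p > threshold else 0 for p in row] for row in rows]


# Tiny made-up "image": one dark row and one bright row.
image = [[10, 12, 11], [200, 210, 205]]
cutoff = otsu_threshold([p for row in image for p in row])
clean = binarize(image, cutoff)
```

The binarized rows would then be turned back into an image and handed to the OCR engine; that part depends on the imaging and Tesseract bindings in use, so it is left out here.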
  4. Originally Posted by 2nHxWW6GkN1l916N3ayz8HQoi View Post
    It works for me... in subtitle edit. (...) It left 8 lines empty and I played around with some other techniques and now there's only 3 or 4 empty. (...)
    It works in Subtitle Edit, but I'm used to working with Subtitle Workshop for the sync... 8 lines is a great job; I can fill in these empty lines. It will be hard for me to use Subtitle Edit, but I will try. Thank you.
  5. Feels Good Man 2nHxWW6GkN1l916N3ayz8HQoi's Avatar
    Join Date
    Jan 2024
    Location
    Pepe Island
    Originally Posted by benhouli View Post
    It works in Subtitle Edit, but I'm used to working with Subtitle Workshop for the sync... 8 lines is a great job; I can fill in these empty lines. It will be hard for me to use Subtitle Edit, but I will try. Thank you.
    Ok, so I fixed some other problems and added a helpful "warning" so you can see which lines to add manually. Keep in mind that just because the other lines "seem" to be OK doesn't mean they are, so a quick proofread would be good.

    [Attachment 77036]

    The final srt file (with 8 lines missing) is this: output.srt. I dunno if the other programs can already do this automatically, but for me it was easier to write the script than to read a bunch of documentation. I tried it in Subtitle Edit and I couldn't see any option to process the image through specific methods.
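
Roughly how the "warning" idea can be done when writing the srt, as a sketch under assumptions rather than the script used for output.srt. It assumes the OCR results are available as (start in ms, end in ms, text) tuples; the marker text and the names `format_timestamp` and `build_srt` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical marker; the real file may use different wording.
WARNING = "[OCR FAILED - FILL IN MANUALLY]"


def format_timestamp(ms: int) -> str:
    """Convert milliseconds to the SRT form HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def build_srt(cues: List[Tuple[int, int, str]]) -> Tuple[str, List[int]]:
    """Build SRT text and report which cue numbers came back empty from OCR."""
    blocks = []
    missing = []
    for index, (start, end, text) in enumerate(cues, start=1):
        text = text.strip()
        if not text:
            # Keep the timing so the line can be filled in by hand later.
            text = WARNING
            missing.append(index)
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    return "\n".join(blocks), missing


# Made-up cues: the second one simulates an image the OCR could not read.
srt_text, missing_cues = build_srt([(1000, 2500, "Hello"), (3000, 4000, "  ")])
```

Keeping the empty cues in place, instead of dropping them, means the numbering and timing stay in sync with the exported images, so the flagged lines can be typed in by hand while proofreading.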
  6. Originally Posted by 2nHxWW6GkN1l916N3ayz8HQoi View Post
    (...) The final srt file (with 8 lines missing) is this. [Attachment 77037] (...)
    thank you
  7. I'm just curious whether there is a library to recognize Hebrew characters in Subtitle Edit and Subtitle Workshop. At least Subtitle Edit recognizes the graphic dvb_subtitles nightmare pretty well, but it fails with the conversion to text.

    BTW: ffmpeg and MKVToolNix fail to extract the graphics version, so it's not worth trying.
  8. Originally Posted by noemi7 View Post
    I'm just curious if there is a library to recognize Hebrew characters in Subtitle Edit, and Subtitle Workshop. (...)

    2nHxWW6GkN1l916N3ayz8HQoi did a great job... As he said, he did it with a script.

    I tried the OCR in TS-Doctor... it seems decent too.

    About Subtitle Edit: in the past I had a database of Hebrew characters for Subtitle Edit, but that was a long time ago. I don't have it anymore, and I don't remember where I got it.

    Thanks for the info, noemi7.
  9. Feels Good Man 2nHxWW6GkN1l916N3ayz8HQoi's Avatar
    Join Date
    Jan 2024
    Location
    Pepe Island
    Originally Posted by noemi7 View Post
    (...) BTW: ffmpeg and MKVToolnix fail to extract graphics version, so not worth to try.
    If by extracting you mean that you want the subtitle data in external files, completely separate from the video container, you can do this with Subtitle Edit. Though I have no idea how those files would be useful on their own or how they could be "imported" into another video container like normal text-based subtitles.

    [Attachment 77041]

    Edit: I checked, and the sub/idx pair can actually be imported into another mkv file. Good to know.
    Last edited by 2nHxWW6GkN1l916N3ayz8HQoi; 15th Feb 2024 at 14:09.
  10. Originally Posted by 2nHxWW6GkN1l916N3ayz8HQoi View Post
    Edit: I checked and the sub/idx pair can actually be imported on another mkv file. Good to know.
    Funny thing is that you can import them into an mkv directly from a ts as dvb_subtitles, but mkvextract doesn't support extracting them.
  11. Member netmask56's Avatar
    Join Date
    Sep 2005
    Location
    Sydney, Australia
    Originally Posted by noemi7 View Post
    Funny thing is that you can import them to mkv directly from a ts as dvb_subtitles but mkvextract doesn't support their extraction.
    Yet if you have an MKV file with PGS subs in it, my Zidoo UHD3000 media player shows them, and if I use Inviska MKV to extract the PGS track as a standalone subtitle, I get a .sup file, a graphics format that can then be OCRed in Subtitle Edit to produce an .srt or .ass sub.
    SONY 75" Full array 200Hz LED TV, Yamaha A1070 amp, Zidoo UHD3000, BeyonWiz PVR V2 (Enigma2 clone), Chromecast, Windows 11 Professional, QNAP NAS TS851
  12. Originally Posted by netmask56 View Post
    (...) if I use Inviska MKV to extract the PGS as a standalone subtitle it extracts the PGS file as a .SUP which is a graphics type that can then be OCR in Subtitle Edit to produce a srt or .ass sub.
    Inviska (which uses mkvextract) cannot extract the subtitles from the video uploaded by benhouli and re-muxed as mkv. Are DVB subtitles coded differently in Australia? I'm in Europe, and the streams that I used to capture have text-based or hardcoded subtitles. I found a few channels that have a graphic DVB subtitles PID, but in fact there is no data in it.


