VideoHelp Forum
  1. Member
    Join Date
    Sep 2005
    Location
    Darkest Peru
    I'm trying to grab the subtitles from a movie on Pluto.tv and convert them to .srt. While it's easy enough to get the playlist, the format is strange.

    As far as I can tell, the numbering of the linked .vtt files and the timecodes of the subtitles reset to zero after each commercial break(?). So there are multiple similarly named files, and the timecodes overlap.

    Copy of playlist:
    https://www.mediafire.com/file/t1h2psu9jmnn25y/playlist.m3u8/file

    I'm open to suggestions on how to manage this one.
  2. Wow, this looks extremely weird. I don't think there's an easy fix for this.
    Can you really not find the subtitles anywhere else?
  3. Looking at the breaks/restarts ....

    Code:
    https://siloh.pluto.tv/c6009f_pluto/clip/5e9efff357d2e0001a0997a1_Indiscreet_1958/subHD/20200421_071515/hls/0-710628/en/en.m3u8_0000000000.vtt
    https://siloh.pluto.tv/c6009f_pluto/clip/5e9efff357d2e0001a0997a1_Indiscreet_1958/subHD/20200421_071515/hls/710629-1392351/en/en.m3u8_0000000000.vtt
    https://siloh.pluto.tv/c6009f_pluto/clip/5e9efff357d2e0001a0997a1_Indiscreet_1958/subHD/20200421_071515/hls/1392352-2126919/en/en.m3u8_0000000000.vtt
    https://siloh.pluto.tv/c6009f_pluto/clip/5e9efff357d2e0001a0997a1_Indiscreet_1958/subHD/20200421_071515/hls/2126920-2667418/en/en.m3u8_0000000000.vtt
    https://siloh.pluto.tv/c6009f_pluto/clip/5e9efff357d2e0001a0997a1_Indiscreet_1958/subHD/20200421_071515/hls/2667419-3070821/en/en.m3u8_0000000000.vtt
    https://siloh.pluto.tv/c6009f_pluto/clip/5e9efff357d2e0001a0997a1_Indiscreet_1958/subHD/20200421_071515/hls/3070822-3492034/en/en.m3u8_0000000000.vtt
    https://siloh.pluto.tv/c6009f_pluto/clip/5e9efff357d2e0001a0997a1_Indiscreet_1958/subHD/20200421_071515/hls/3492035-3973349/en/en.m3u8_0000000000.vtt
    https://siloh.pluto.tv/c6009f_pluto/clip/5e9efff357d2e0001a0997a1_Indiscreet_1958/subHD/20200421_071515/hls/3973350-4366659/en/en.m3u8_0000000000.vtt
    https://siloh.pluto.tv/c6009f_pluto/clip/5e9efff357d2e0001a0997a1_Indiscreet_1958/subHD/20200421_071515/hls/4366660-5330164/en/en.m3u8_0000000000.vtt
    https://siloh.pluto.tv/c6009f_pluto/clip/5e9efff357d2e0001a0997a1_Indiscreet_1958/subHD/20200421_071515/hls/5330165-end/en/en.m3u8_0000000000.vtt
    results in ...

    0-710628
    710629-1392351
    1392352-2126919
    2126920-2667418
    2667419-3070821
    3070822-3492034
    3492035-3973349
    3973350-4366659
    4366660-5330164
    5330165-end

    These numbers are the start and end times, in milliseconds, of each of the 10 segments.

    You will need to adjust the offset of the timing on the .vtt files for segments 2 to 10.
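    A minimal sketch of how the offsets could be derived, assuming each segment's subtitle clock restarts at zero, so that the offset for a segment is simply the start value in its directory name. The helper names (`segment_offset_ms`, `format_ms`) are made up for illustration.

```python
# Directory names copied from the segment URLs listed above.
SEGMENT_RANGES = [
    "0-710628", "710629-1392351", "1392352-2126919", "2126920-2667418",
    "2667419-3070821", "3070822-3492034", "3492035-3973349",
    "3973350-4366659", "4366660-5330164", "5330165-end",
]


def segment_offset_ms(range_name: str) -> int:
    """Return the start time in milliseconds from a 'start-end' name."""
    start, _, _end = range_name.partition("-")
    return int(start)


def format_ms(ms: int) -> str:
    """Format milliseconds as an hh:mm:ss.mmm timestamp."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


# Assumption: the offset to add to part N is the start of its range.
offsets = [segment_offset_ms(name) for name in SEGMENT_RANGES]
for part, offset in enumerate(offsets, start=1):
    print(f"part {part:2d}: +{format_ms(offset)}")
```

    Part 1 needs no shift; the printed values are the amounts that would be entered for parts 2 to 10.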
  4. Originally Posted by jack_666
    You will need to adjust the offset of the timing on the .vtt files for segments 2 to 10.
    Can you explain a bit more?
    Can this be done while downloading or afterwards?
    And how?

    I think I have a vague idea of what needs to be done, but I don't know how to download the segments separately.
  5. @[ss]vegeta

    First of all I must thank you for all the contributions you have made to help people. You have my respect. Hats off to you.

    Here is how I think it can be done.

    We have 10 segments, each with a different number of subsegments: {14,136,146,108,80,84,96,78,192,132}

    We will loop 10 times (i_loop), with an inner loop (j_loop) that runs as many times as the matching entry in the array.

    For each iteration, we will curl the appropriate URL.

    The code should be something like this:

    Code:
    curl https://siloh.pluto.tv/c6009f_pluto/clip/5e9efff357d2e0001a0997a1_Indiscreet_1958/subHD/20200421_071515/hls/%i_loop_index_name%/en/en.m3u8_%j_loop_index_number%.vtt >> movie_vtt_part%i%
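    The same nested loop could be sketched in Python roughly as follows. This is an assumption-laden illustration, not a tested downloader: it assumes the subsegment files are numbered from 0000000000 upward in steps of one, and `vtt_urls` and `download_part` are hypothetical helper names.

```python
from urllib.request import urlopen

BASE = ("https://siloh.pluto.tv/c6009f_pluto/clip/"
        "5e9efff357d2e0001a0997a1_Indiscreet_1958/subHD/20200421_071515/hls")

SEGMENT_RANGES = [
    "0-710628", "710629-1392351", "1392352-2126919", "2126920-2667418",
    "2667419-3070821", "3070822-3492034", "3492035-3973349",
    "3973350-4366659", "4366660-5330164", "5330165-end",
]
# Subsegment counts per segment, as listed in the post above.
SUBSEGMENT_COUNTS = [14, 136, 146, 108, 80, 84, 96, 78, 192, 132]


def vtt_urls(range_name: str, count: int) -> list[str]:
    """Build the .vtt URLs for one segment (assumes 0-based numbering)."""
    return [f"{BASE}/{range_name}/en/en.m3u8_{index:010d}.vtt"
            for index in range(count)]


def download_part(range_name: str, count: int, out_path: str) -> None:
    """Append every subsegment of one segment to a single output file."""
    with open(out_path, "wb") as out:
        for url in vtt_urls(range_name, count):
            with urlopen(url) as response:
                out.write(response.read())


def download_all() -> None:
    """Outer loop (i_loop): one output file per segment."""
    pairs = zip(SEGMENT_RANGES, SUBSEGMENT_COUNTS)
    for part, (range_name, count) in enumerate(pairs, start=1):
        download_part(range_name, count, f"movie_vtt_part{part}.vtt")
```

    Calling `download_all()` would write movie_vtt_part1.vtt through movie_vtt_part10.vtt, provided the URLs are still reachable.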


    Because every downloaded file starts with its own header, we may need to delete all lines containing

    WEBVTT
    X-TIMESTAMP-MAP

    which can be done via sed with the global flag.
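    As an alternative to sed, the header lines could be dropped in Python. A small sketch, assuming the headers always sit on lines of their own that start with `WEBVTT` or `X-TIMESTAMP-MAP`; a cue whose text happened to start with one of those words would be removed too. The function name `strip_vtt_headers` is hypothetical.

```python
def strip_vtt_headers(text: str) -> str:
    """Remove the per-file WEBVTT and X-TIMESTAMP-MAP header lines."""
    kept = []
    for line in text.splitlines():
        # Ignore a possible byte-order mark and surrounding whitespace.
        probe = line.lstrip("\ufeff").strip()
        if probe.startswith("WEBVTT") or probe.startswith("X-TIMESTAMP-MAP"):
            continue
        kept.append(line)
    return "\n".join(kept).strip("\n") + "\n"


# Made-up sample: two concatenated subsegments, each with its own header.
sample = (
    "WEBVTT\n"
    "X-TIMESTAMP-MAP=MPEGTS:0,LOCAL:00:00:00.000\n"
    "\n"
    "00:00:01.000 --> 00:00:02.000\n"
    "Hello\n"
    "\n"
    "WEBVTT\n"
    "X-TIMESTAMP-MAP=MPEGTS:0,LOCAL:00:00:00.000\n"
    "\n"
    "00:00:03.000 --> 00:00:04.000\n"
    "World\n"
)
cleaned = strip_vtt_headers(sample)
print(cleaned)
```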

    After this code is completed, we should have the merged 10 parts.

    All the above can be done via a script

    Next is a manual step, repeated for each of parts 2 to 10.

    Open the part in Subtitle Edit and use the option "Synchronization" ==> Adjust all times .... ==> Select first line of subtitle file ==> Enter the offset time ==> Select Radio Button "Selected line(s) and forward" ==> Click "Show Later"


    Now merge all parts.

    If you can figure out how to do the time offset via code then the whole process can be done automatically.

    One possible way is to capture the time stamps on the .vtt and add the offset time. Do a grep for --> and manipulate the captured data.
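    One way the grep-and-manipulate idea might look in Python: find every timestamp on lines containing `-->` and add the segment offset in milliseconds. A sketch only, assuming timestamps are in the `hh:mm:ss.mmm` or `mm:ss.mmm` form and that the offset is not negative; `shift_vtt` is a made-up name.

```python
import re

# Matches hh:mm:ss.mmm, or mm:ss.mmm when the hours field is omitted.
TIMESTAMP = re.compile(r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})")


def shift_timestamp(match: re.Match, offset_ms: int) -> str:
    """Return the matched timestamp moved later by offset_ms."""
    hours, minutes, seconds, millis = match.groups()
    total = ((int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds))
             * 1000 + int(millis) + offset_ms)
    h, rem = divmod(total, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


def shift_vtt(text: str, offset_ms: int) -> str:
    """Shift every cue timing line; cue text lines are left untouched."""
    shifted = []
    for line in text.splitlines():
        if "-->" in line:
            line = TIMESTAMP.sub(
                lambda match: shift_timestamp(match, offset_ms), line)
        shifted.append(line)
    return "\n".join(shifted) + "\n"


# Example: a cue from part 2, shifted by that segment's start (710629 ms).
print(shift_vtt("00:00:01.000 --> 00:00:02.500\nHello\n", 710629))
```

    Only lines containing `-->` are rewritten, so a time mentioned inside the dialogue itself is not altered.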
    Last edited by jack_666; 18th Nov 2021 at 14:33.
  6. Originally Posted by jack_666
    ...
    Thanks a lot for the explanation, and thanks for noticing my contributions.
  7. Member
    I know there is a patch floating around for ffmpeg that fixes webvtt offset, but it has never been implemented and patching/recompiling ffmpeg is above my pay grade. https://gist.github.com/SebiderSushi/bdf8d46d5501f7085d0b27d8a19eb12c
    As for the rest of this... I'll have to re-read it a few more times.
  8. Be easier and less time consuming to learn a new language........
  9. I wrote a script and created the ten files.

    Open each of parts 2 to 10 in Subtitle Edit and use the option "Synchronization" ==> Adjust all times .... ==> Select first line of subtitle file ==> Enter the offset time ==> Select Radio Button "Selected line(s) and forward" ==> Click "Show Later"


    Now merge all ten parts.

    May need to add WEBVTT at the very beginning of part1
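    If the merge is scripted rather than done by hand, it might look like the sketch below. It assumes the ten parts have already had their headers removed and their timings shifted, and it writes a single `WEBVTT` header at the top; `merge_parts` is a hypothetical helper.

```python
def merge_parts(parts: list[str]) -> str:
    """Join header-less, already-shifted parts under one WEBVTT header."""
    bodies = [part.strip("\n") for part in parts if part.strip()]
    return "WEBVTT\n\n" + "\n\n".join(bodies) + "\n"


# Made-up two-part example.
merged = merge_parts([
    "00:00:01.000 --> 00:00:02.000\nA\n",
    "\n00:11:51.629 --> 00:11:53.129\nB\n",
])
print(merged)
```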
    Image Attached Files
  10. Member
    Wow. Thanks. I have that movie on BD but the release doesn't have subtitles. There are some on the net, but they are terrible; full of errors and timing issues.

    This has some issues, most are fixable automatically. Multiple speakers aren't hyphenated, but this is so much better than what there was to work with before.

    Edit: I had an idea to try putting the links into m3u8x. I broke it up manually into 10 playlists every time it reset to 0. Simple, and partially successful. I could only get the first playlist to work. The second one ignored en.m3u8_0000000000.vtt and then crashed the program.
    I'm sure there's (maybe) a way to make it work with that method that would be simpler than what you did with the script.
    Last edited by doctorm; 19th Nov 2021 at 01:39.
  11. Member
    What did you base the timing on, the offset or just by ear?