VideoHelp Forum
  1. Hello

    I downloaded a documentary from medici.tv with youtube-dl GUI and it worked well.

    Now I'm trying to rip the subtitles from the same documentary (they are split into many .webvtt segments). I opened Firefox's developer tools (Inspect Element), went to the Network tab and searched for 'subtitles' or 'm3u8'.

    What I get is a file of about 70 KB with an .mp4 extension.
    I opened the file with Notepad and this is the (partial) content:

    WEBVTT

    00:00:00.000 --> 00:00:04.280
    Le piano est un instrument
    plutôt neutre.
    X-TIMESTAMP-MAP=MPEGTS:2415600,LOCAL:00:00:00.000

    00:00:00.320 --> 00:00:04.439
    Mais il a des possibilités illimitées
    de se transformer.
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:2815110,LOCAL:00:00:00.000
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:2941110,LOCAL:00:00:00.000

    00:00:00.000 --> 00:00:06.080
    C'est ainsi qu'il peut devenir
    un instrument chantant...
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:3488310,LOCAL:00:00:00.000
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:4262400,LOCAL:00:00:00.000

    00:00:00.000 --> 00:00:02.520
    De voir que cet instrument
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:4489200,LOCAL:00:00:00.000
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:4546800,LOCAL:00:00:00.000

    00:00:00.000 --> 00:00:03.040
    réagit véritablement,
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:4820400,LOCAL:00:00:00.000

    00:00:00.160 --> 00:00:04.760
    et se plie
    à ce que vous voulez faire,
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:5248800,LOCAL:00:00:00.000

    00:00:01.599 --> 00:00:03.560
    à ce que vous cherchez,
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:5569200,LOCAL:00:00:00.000
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:5583510,LOCAL:00:00:00.000

    00:00:00.000 --> 00:00:03.000
    est quelque chose d'extraordinaire,
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:5853510,LOCAL:00:00:00.000

    00:00:00.441 --> 00:00:03.880
    la raison pour laquelle je suis,
    encore aujourd'hui,
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:6202710,LOCAL:00:00:00.000

    00:00:00.161 --> 00:00:02.041
    heureux d'être pianiste.
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:6386400,LOCAL:00:00:00.000
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:15670800,LOCAL:00:00:00.000

    ---truncated
    Then I renamed the file to .vtt and converted it to .srt with Subtitle Edit.
    What I get is a file like this:

    1
    00:00:00,000 --> 00:00:04,280
    Le piano est un instrument
    plutôt neutre.
    X-TIMESTAMP-MAP=MPEGTS:2415600,LOCAL:00:00:00.000

    2
    00:00:00,000 --> 00:00:03,760
    Mes premières rencontres
    avec Abbado remontent
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:204764400,LOCAL:00:00:00.000

    3
    00:00:00,000 --> 00:00:02,720
    Il était clair
    que ce genre d'activité
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:194860800,LOCAL:00:00:00.000

    4
    00:00:00,000 --> 00:00:02,960
    dans la vie musicale italienne.
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:195951510,LOCAL:00:00:00.000

    5
    00:00:00,000 --> 00:00:03,440
    sinon de façon très limitée.
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:196556400,LOCAL:00:00:00.000
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:196750800,LOCAL:00:00:00.000

    6
    00:00:00,000 --> 00:00:01,320
    Oui,
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:197899200,LOCAL:00:00:00.000

    7
    00:00:00,000 --> 00:00:02,520
    Vous avez joué Bartók
    sans répétition ?
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:221727600,LOCAL:00:00:00.000

    8
    00:00:00,000 --> 00:00:01,120
    mais en fait,
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:198000000,LOCAL:00:00:00.000

    9
    00:00:00,000 --> 00:00:01,256
    des autres.

    10
    00:00:00,000 --> 00:00:01,480
    Abbado est un peu plus âgé
    que moi,
    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:205318800,LOCAL:00:00:00.000

    ---truncated
    The result does not seem to be a correct .srt file. In fact, when combined with the video, the timestamps are broken: completely useless and out of sync.

    Am I converting from .vtt to .srt incorrectly, or should I search for a different keyword in the developer tools?
  2. LZAA, this is the same as here
    Last edited by sysanin; 21st Aug 2019 at 16:19.
  3. Originally Posted by LZAA View Post
    Url?
    not resolved
    Last edited by pedrothelion; 21st Aug 2019 at 16:51.
  4. I guess this should be simple to fix using regular expressions.
    Find:
    Code:
    ^(WEBVTT|X-TIMESTAMP-MAP.*)(\r\n|\r|\n)
    and replace with nothing.
    Editors like Notepad++ and Sublime Text offer such a Search and Replace (All) feature with regex support.
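For anyone who prefers a script over an editor, here is a minimal Python sketch of the same replacement. It assumes the concatenated segments use `\n` or `\r\n` line endings and that every header line ends with a line break; the function name `strip_segment_headers` and the sample text are only illustrative. Note that this removes the per-segment `X-TIMESTAMP-MAP` values, so it cleans up the text but does nothing about the cue timings.

```python
import re

# Matches one whole line that is either a repeated "WEBVTT" header or an
# "X-TIMESTAMP-MAP=..." line, including its line ending.
HEADER_LINE = re.compile(
    r"^[ \t]*(?:WEBVTT|X-TIMESTAMP-MAP[^\r\n]*)\r?\n",
    re.MULTILINE,
)

def strip_segment_headers(text: str) -> str:
    """Remove per-segment header lines from concatenated WebVTT text."""
    return HEADER_LINE.sub("", text)

# Sample built from the lines quoted in the first post.
sample = (
    "WEBVTT\n"
    "X-TIMESTAMP-MAP=MPEGTS:2415600,LOCAL:00:00:00.000\n"
    "\n"
    "00:00:00.320 --> 00:00:04.439\n"
    "Mais il a des possibilités illimitées\n"
    "de se transformer.\n"
    "WEBVTT\n"
    "X-TIMESTAMP-MAP=MPEGTS:2815110,LOCAL:00:00:00.000\n"
)

cleaned = strip_segment_headers(sample)
print(cleaned)
```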
  5. But the timing problem remains. It does not seem as regular as an expression...
  6. Sorry, I didn't see the timing problem. Indeed, that seems to be the complicated part.

    I'm wondering if it is even possible to reconstruct the correct timings from just the info you posted.
    Does the following fit the video?
    Code:
    1
    00:00:00.000 --> 00:00:04.280
    Le piano est un instrument
    plutôt neutre.
    
    2
    00:00:26.840 --> 00:00:30.279
    Mais il a des possibilités illimitées
    de se transformer.
    
    3
    00:00:32.679 --> 00:00:38.759
    C'est ainsi qu'il peut devenir
    un instrument chantant...
    
    4
    00:00:47.360 --> 00:00:49.880
    De voir que cet instrument
    
    5
    00:00:50.520 --> 00:00:53.560
    réagit véritablement,
    
    6
    00:00:53.560 --> 00:00:58.320
    et se plie
    à ce que vous voulez faire,
    
    7
    00:00:58.032 --> 00:01:01.880
    à ce que vous cherchez,
    
    8
    00:01:02.039 --> 00:01:05.039
    est quelque chose d'extraordinaire,
    
    9
    00:01:05.039 --> 00:01:08.919
    la raison pour laquelle je suis,
    encore aujourd'hui,
    
    10
    00:01:08.919 --> 00:01:10.960
    heureux d'être pianiste.
    Last edited by sneaker; 21st Aug 2019 at 17:50.
  7. Originally Posted by sneaker View Post
    Sorry, I didn't see the timing problem. Indeed, that seems to be the complicated part.

    I'm wondering if it is even possible to reconstruct the correct timings from just the info you posted.
    Does the following fit the video?
    Nope... it should be like this:

    WEBVTT

    1
    00:00:22.020 --> 00:00:27.004
    The piano is finally
    a rather neutral instrument.

    2
    00:00:27.008 --> 00:00:31.018
    But it has a limitless ability
    to transform itself.

    3
    00:00:31.022 --> 00:00:35.005
    Thus, it can become

    4
    00:00:37.003 --> 00:00:39.003
    a singing instrument...

    5
    00:00:47.017 --> 00:00:50.001
    To see how this instrument

    6
    00:00:50.005 --> 00:00:53.018
    actually reacts,

    7
    00:00:53.022 --> 00:00:58.013
    and yields
    to whatever you want to do,

    8
    00:00:59.010 --> 00:01:02.001
    to what you are after,

    9
    00:01:02.005 --> 00:01:05.012
    is something quite extraordinary.

    10
    00:01:05.016 --> 00:01:09.005
    This is why I still am,

    11
    00:01:09.009 --> 00:01:11.003
    so happy to be a pianist.
    I can provide access if needed.
  8. I probably won't be able to help. I cannot even manually get from the quotes in post #1 to the file from post #8, let alone in an automated way. I don't think it's possible; you would need more data.
  9. Originally Posted by sneaker View Post
    I probably won't be able to help. I cannot even manually get from the quotes in post #1 to the file from post #8, let alone in an automated way. I don't think it's possible; you would need more data.
    Thanks for your reply. I finally succeeded with Allavsoft.
    I will also try with URL Snooper.
  10. I have the exact same problem as the OP.

    I used FFMPEG to download subs from the same web site with the following command:

    ffmpeg -i "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/9378/files/18/12/1068412/9378-HWQKVTXrYyxwncZq.ism/9378-HWQKVTXrYyxwncZq-textstream_eng=1000.m3u8?mime=mp4&source=html5" subs.srt

    The result is identical to youtube-dl's, as it should be, because youtube-dl uses FFMPEG to download and convert subs.

    I downloaded one of the VTT segments from the above playlist:

    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:43875090,LOCAL:00:00:00.000

    5
    00:00:00.000 --> 00:00:02.791
    What do you think
    of her over there?
    It appears that in a segmented VTT the timestamp is indicated by the MPEGTS value, which counts 90 kHz clock ticks (90000 per second) and is offset by 1 second (90000 ticks). The correct start time of a segment (in seconds) could therefore be calculated by dividing the MPEGTS value by 90000 and subtracting one. FFMPEG/youtube-dl merely copies the segment-local times, which restart from zero in every segment.
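The arithmetic above can be sketched in Python as follows. This is only a sketch under assumptions: the 1-second offset is the value observed in this post (exposed as the hypothetical parameter `stream_offset_ms`, and possibly different on other sites), the header is assumed to list `MPEGTS` before `LOCAL` as in the segments shown in this thread, and the function names are made up for illustration.

```python
import re

MPEGTS_HZ = 90_000  # MPEG-TS timestamps tick at 90 kHz

# Header layout as seen in this thread (MPEGTS first, then LOCAL).
MAP_RE = re.compile(r"X-TIMESTAMP-MAP=MPEGTS:(\d+),LOCAL:(\d+):(\d+):(\d+)\.(\d+)")
CUE_RE = re.compile(r"(\d+):(\d+):(\d+)\.(\d+) --> (\d+):(\d+):(\d+)\.(\d+)")

def to_ms(h, m, s, frac):
    """Convert hh, mm, ss and a fractional part (as strings) to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(frac.ljust(3, "0")[:3])

def fmt(ms):
    """Format milliseconds as a WebVTT timestamp hh:mm:ss.mmm."""
    s, ms = divmod(ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"

def rebase_segment(segment_text, stream_offset_ms=1000):
    """Shift every cue in one segment by MPEGTS/90000 minus the assumed offset."""
    header = MAP_RE.search(segment_text)
    if header is None:
        return segment_text  # nothing to rebase against
    mpegts_ms = int(header.group(1)) * 1000 // MPEGTS_HZ
    local_ms = to_ms(*header.groups()[1:])
    shift = mpegts_ms - local_ms - stream_offset_ms

    def shift_cue(match):
        g = match.groups()
        start = max(0, to_ms(*g[:4]) + shift)  # clamp so times never go negative
        end = max(0, to_ms(*g[4:]) + shift)
        return f"{fmt(start)} --> {fmt(end)}"

    return CUE_RE.sub(shift_cue, segment_text)

# The segment quoted in this post.
segment = (
    "WEBVTT\n"
    "X-TIMESTAMP-MAP=MPEGTS:43875090,LOCAL:00:00:00.000\n"
    "\n"
    "5\n"
    "00:00:00.000 --> 00:00:02.791\n"
    "What do you think\n"
    "of her over there?\n"
)
print(rebase_segment(segment))
```

With the segment above, 43875090 / 90000 is about 487.5 seconds, so after subtracting one second the cue lands at roughly 8 minutes 6.5 seconds into the stream.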
  11. Anonymous872
    Guest
    Originally Posted by Paralucent View Post
    I have the exact same problem as the OP.

    Hello Paralucent,
    Make sure it's right for you, because I'm not sure.
    Check both.

    https://mega.nz/folder/iNZinI5a#tsu-HOff_GHWA4CwqZsTLw
  12. Thank you so much marcelo_bor! The first one, subtitle A, was perfect!

    How did you do it?
  13. Anonymous872
    Guest
    Originally Posted by Paralucent View Post
    Thank you so much marcelo_bor! The first one, subtitle A, was perfect!

    How did you do it?
    I used the m3u8x program.
    Here is the link: https://www.videohelp.com/software/m3u8x
    Subtitle B is 10 seconds late, then.

    Done.
  14. Thanks for the info. I started to install m3u8x a few days ago but a Windows Security Warning box appeared saying the certificate is from "DO_NOT_TRUST_FiddlerRoot". Did you get the same message?
  15. Anonymous872
    Guest
    Originally Posted by Paralucent View Post
    Thanks for the info. I started to install m3u8x a few days ago but a Windows Security Warning box appeared saying the certificate is from "DO_NOT_TRUST_FiddlerRoot". Did you get the same message?

    I definitely didn't get that message.
    You don't need to install anything; the program is portable, so just extract it to a folder and run it.
    If you feel safe, go ahead and use the program, or search Google for information on 'DO_NOT_TRUST_FiddlerRoot'.
  16. Thanks, got it working.

    If I paste the m3u8 subtitles URL in the 'Quality URL' box and then click 'Download', I don't get any messages about certificates. In order to obtain correctly synced subs I found I needed to select 'time according to EXTINF'. Thanks again for your help, marcelo_bor.
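How m3u8x implements its 'time according to EXTINF' option is not documented in this thread, but the general idea of EXTINF-based timing can be sketched like this: each `#EXTINF` tag in an HLS playlist gives the duration of the segment that follows, so a segment's start time is the sum of the durations before it. The playlist text and segment names below are invented for illustration, and the function name is hypothetical.

```python
def segment_start_times(playlist_text):
    """Return (segment_uri, start_seconds) pairs by accumulating #EXTINF durations."""
    starts = []
    elapsed = 0.0
    pending = None  # duration announced by the most recent #EXTINF tag
    for raw_line in playlist_text.splitlines():
        line = raw_line.strip()
        if line.startswith("#EXTINF:"):
            # Tag format: "#EXTINF:<duration>,<optional title>"
            pending = float(line[len("#EXTINF:"):].split(",")[0])
        elif line and not line.startswith("#") and pending is not None:
            starts.append((line, elapsed))
            elapsed += pending
            pending = None
    return starts

# Hypothetical subtitle playlist with three segments.
playlist = """#EXTM3U
#EXT-X-TARGETDURATION:6
#EXTINF:4.0,
seg0.vtt
#EXTINF:6.0,
seg1.vtt
#EXTINF:2.5,
seg2.vtt
#EXT-X-ENDLIST
"""

for uri, start in segment_start_times(playlist):
    print(f"{uri} starts at {start:.3f} s")
```

The cues inside each segment would then be shifted by that segment's start time. This is a different source of timing than the `X-TIMESTAMP-MAP` header, which may be why the option matters for sync.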
  17. As far as my testing goes, Allavsoft can download both videos and subtitles from medici.tv. Simply copy and paste the medici.tv video link into Allavsoft and click the Download button; it will download both the subtitles and the video into the Download folder.
  18. Same problem with another site. When I use m3u8x I get a folder of 141 en.m3u8_0000000000.vtt files containing text like:

    WEBVTT
    X-TIMESTAMP-MAP=MPEGTS:126000,LOCAL:00:00:00.000

    I'm sure I'm doing something wrong. m3u8x is kind of hard to understand, though.

    I run m3u8x, paste the subtitle playlist into the box that says 'Quality URL', check 'This Subtitle' and 'One...One', and click 'Download'.
    A box opens; I click 'Time according to "EXTINF"' and 'Start download'.

    Am I missing something? I can get the subtitles complete and merged with ffmpeg, but the result is full of garbage and has timing errors like those mentioned above.
  19. In the m3u8x download box, when you have finished downloading all parts, click the 'Combine' button at the top.
    Then it should build a merged srt for you.
  20. There's a Disney+ add-on on Greasy Fork that acts like this, so maybe it could be modified?
  21. Originally Posted by tamagoyaki View Post
    In the m3u8x download box, when you have finished downloading all parts, click the 'Combine' button at the top.
    Then it should build a merged srt for you.
    I don't see a Combine button. Also, I think there is an issue with commercial breaks: the subtitle count restarts after each one. As a result, m3u8x assumes it has already downloaded all segments by the first break.
    I don't think it's the right tool.

    Do I run into site rule issues if I mention what streaming site I'm looking at?

    Edit: It looks like there is an ffmpeg patch for this that has never been merged. Is there a way to apply it myself? https://gist.github.com/SebiderSushi/bdf8d46d5501f7085d0b27d8a19eb12c
    Last edited by doctorm; 5th Jun 2021 at 11:17.


