VideoHelp Forum
  1. Member
    Join Date
    Feb 2021
    Location
    Sweden
    Hi again!
    I have asked a few times here in order to get help in automating extraction of the m3u8 URL from sites with live streaming video players. In several cases people here were able to solve this so I can automate the downloads I make.

    Now I have another case where the current "method" for me is to manually use the F12 key in Firefox while the video is playing and then grab the m3u8 from the network tab and save it to a file which my downloader can read to get the stream using ffmpeg.

    But then at irregular intervals the m3u8 URL changes and the download fails and I have to detect this and jump in and extract a new m3u8 URL into the file.
    It would be better to have a command that can be used say 1 minute before the video download starts to refresh the m3u8 URL in the file.

    So, is there someone here who can suggest a command (or several) that will extract the full m3u8 URL from this webpage?

    I am using an Ubuntu server for the downloads so the extraction command must run on Linux.

    In this case I have looked at the page source for "m3u8" and found this:

    Code:
    const player = new Plyr(video, {controls: enabledControls, seekTime: 6});
    var hlsUrl = 'https://' + data.best + '/hls/' + 'msnbc_live' + '/index.m3u8';
    So there is some functionality that browsers understand to fill in the blanks and get a full URL, I believe.
    Since I have saved expired m3u8 URL values I can see that the part that varies is data.best, as can be seen in this sequence of previously working URLs (the string 'msnbc_live' also varies, but it is available on the source line shown above):

    Code:
        #M3U8="https://cdn-de1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8"
        #M3U8="https://cdn-fr1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8" #2023-03-24 13:00
        #M3U8="https://cdn-ks2-na.lncnetworks.host/hls/ctvnews_live/index.m3u8" #2023-04-30 18:00
        #M3U8="https://cdn-ca1-na.lncnetworks.host/hls/msnbc_live/index.m3u8" #2023-05-01
        #M3U8="https://cdn-de1-eu.lncnetworks.host/hls/msnbcus_live/index.m3u8"
        #M3U8="https://cdn-de1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8"
        #M3U8="https://cdn-fr1-eu.lncnetworks.host/hls/msnbcus_live/index.m3u8" #2023-08-25
        #M3U8="https://cdn-de1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8" #2023-08-29 10:34
        #M3U8="https://cdn-fr3-eu.lncnetworks.host/hls/msnbc_live/index.m3u8" #2023-08-27, 2023-08-29 18:12
        #M3U8="https://cdn-de1-eu.lncnetworks.host/hls/msnbcus_live/index.m3u8" #2023-09-26 13:52
        #M3U8="https://cdn-fr3-eu.lncnetworks.host/hls/msnbc_live/index.m3u8" #2023-10-03 20:48
        #M3U8="https://cdn-fr3-eu.lncnetworks.host/hls/lncextra_live/index.m3u8" #2023-10-07 23:43
        #M3U8="https://cdn-fr3-eu.lncnetworks.host/hls/msnbc_live/index.m3u8" #2023-10-07 23:43
        #M3U8="https://cdn-fi1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8" #2023-10-24 20:36
        #M3U8="https://cdn-fr1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8" #2023-10-25 14:03
        #M3U8="https://cdn-de1-eu.lncnetworks.host/hls/lncextra_live/index.m3u8" #2023-11-06 12:17
        #M3U8="https://cdn-fr1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8" #2023-11-07 08:24
        #M3U8="https://cdn-nl1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8" #2023-11-07 19:21
        #M3U8="https://cdn-nl1-eu.lncnetworks.host/hls/msnbcca_live/index.m3u8" #2023-12-12 17:34
        #M3U8="https://cdn-nl1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8" #2023-12-13 07:49
        #M3U8="https://cdn-tx1-na.lncnetworks.host/hls/msnbc_live/index.m3u8" #2023-12-26 15:59
        #M3U8="https://cdn-ca2-na.lncnetworks.host/hls/msnbc_live/index.m3u8" #2023-12-27 12:10
        #M3U8="https://cdn-ks3-na.lncnetworks.host/hls/msnbc_live/index.m3u8" #2023-12-29 12:10
        #M3U8="https://cdn-ca2-na.lncnetworks.host/hls/msnbcca_live/index.m3u8" #2024-01-04 17:05
        #M3U8="https://cdn-ks2-na.lncnetworks.host/hls/msnbc_live/index.m3u8" #2024-01-05 08:43
        #M3U8="https://cdn-fr1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8" #2024-01-14 09:20
        M3U8="https://cdn-nl1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8" #2024-01-15 09:58
    Unfortunately I am not clever enough to figure out how to extract the actual value of data.best and what is actually following /hls/ .

    Any suggestions gratefully received!
  2. Why change the URL?

    "https://cdn-de1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8"

    This still works.
  3. Member
    Originally Posted by LZAA View Post
    Why change the URL?

    "https://cdn-de1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8"

    This still works.
    The reason is that it randomly stops working....
    Could be hours, days or even weeks after I manually entered the URL I found by using the F12 key while playing the video in Firefox.

    Since I am running the stream download in an automated fashion on an Ubuntu Server I don't immediately see when it fails so I can fix it (manually).
    I'd rather have some automated function to interrogate the webpage and get the m3u8 URL that works and do this say 1 minute before a recording is set to start.

    The two items I need to extract are:

    data.best: This seems to be a variation of the same theme looking like "cdn-de1-eu."

    'msnbc_live': This literal seems to point the URL to different sub-providers at the network.

    So getting these at the moment the stream will be downloaded should make the failures much less common.
  4. You still need to watch.
    If the 'msnbc' part changes, take the current m3u8 URL, put 'msnbc' back into it, and check whether that link works. Report back the result of that experiment.
  5. Member
    So looking at the historical data I have I see:

    The data.best part starts with one of these 10:

    cdn-ca1-na
    cdn-ca2-na
    cdn-de1-eu
    cdn-fi1-eu
    cdn-fr1-eu
    cdn-fr3-eu
    cdn-ks2-na
    cdn-ks3-na
    cdn-nl1-eu
    cdn-tx1-na

    Followed by .lncnetworks.host/hls/

    Then the last part consists of one of these 5 and then /index.m3u8

    msnbc_live
    msnbcca_live
    msnbcus_live
    lncextra_live
    ctvnews_live

    So in total 50 possible combinations of these.

    But since I can extract the last varying item by checking the response when I request the page URL, it reduces to 10 possible URLs in total, provided that my historical data contains all combinations...

    So I still have to decide between 10 possible URLs (in the script that runs a minute in advance of recording).
    And this is the data.best item in the extracted line:
    Code:
    var hlsUrl = 'https://' + data.best + '/hls/' + 'msnbc_live' + '/index.m3u8'
    So how can one figure out that value?

    Running each of the 10 possibilities in succession and evaluating the result in terms of received video is hard to automate and takes a longish time to perform.
    My other extractors operate in a matter of 0-3 seconds...
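The 10 hosts and 5 channel names listed above can be expanded into the 50 candidate URLs with two nested loops. This is only a sketch: the function name `candidate_urls` is made up for illustration, and the host and channel lists are just the ones seen in the historical data, so they may be incomplete.

```shell
# Hosts and channels observed in the historical data (may be incomplete).
HOSTS="cdn-ca1-na cdn-ca2-na cdn-de1-eu cdn-fi1-eu cdn-fr1-eu cdn-fr3-eu cdn-ks2-na cdn-ks3-na cdn-nl1-eu cdn-tx1-na"
CHANNELS="msnbc_live msnbcca_live msnbcus_live lncextra_live ctvnews_live"

# candidate_urls (hypothetical helper): print one candidate URL per line.
candidate_urls() {
    local host channel
    # $CHANNELS and $HOSTS are left unquoted on purpose so they split on spaces.
    for channel in $CHANNELS; do
        for host in $HOSTS; do
            printf 'https://%s.lncnetworks.host/hls/%s/index.m3u8\n' "$host" "$channel"
        done
    done
}
```

Each candidate could then be probed with something like `curl -sf --max-time 3 -o /dev/null "$url"`, stopping at the first one that answers. With a 3 second timeout the worst case is still a couple of minutes, which is why a direct lookup of the host is preferable.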
    Last edited by BosseB; 16th Jan 2024 at 17:41. Reason: False smileys in the text
  6. It's basic, but this will get you the channel name part after /hls/
    Code:
    curl -s "https://livenewschat.eu/politics/" | grep m3u8 | awk -F "'" '{ print $6 }' 
    msnbc_live
    and this will get you the domain name
    Code:
    curl -s "https://data.lncnetworks.host/server.json" | awk -F '"' '{ print $6 }'
    cdn-nl1-eu.lncnetworks.host
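The two results can be glued together into the full URL. A minimal sketch, assuming the JSON returned by server.json carries the host under a key named `best` (guessed from `data.best` in the page script, not verified) and that the page still contains the `hlsUrl` line shown earlier. The function names are made up for illustration.

```shell
# channel_from_page (hypothetical): read page HTML on stdin, print the channel
# name that sits between '/hls/' and '/index.m3u8' in the hlsUrl line.
channel_from_page() {
    sed -n "s|.*'/hls/' *+ *'\([^']*\)' *+ *'/index\.m3u8'.*|\1|p" | head -n 1
}

# host_from_json (hypothetical): read JSON on stdin, print the value of "best".
# Assumes a flat JSON object on one line; jq would be more robust if installed.
host_from_json() {
    sed -n 's/.*"best" *: *"\([^"]*\)".*/\1/p' | head -n 1
}

# build_m3u8_url HOST CHANNEL: print the full URL, or fail if either part is empty.
build_m3u8_url() {
    [ -n "$1" ] && [ -n "$2" ] || return 1
    printf 'https://%s/hls/%s/index.m3u8\n' "$1" "$2"
}

# Usage inside a script (network calls left commented out here):
#   host=$(curl -s "https://data.lncnetworks.host/server.json" | host_from_json)
#   channel=$(curl -s "https://livenewschat.eu/politics/" | channel_from_page)
#   M3U8=$(build_m3u8_url "$host" "$channel") || echo "extraction failed" >&2
```

Refusing to build a URL when one part is empty avoids writing something like `/hls//index.m3u8` into the file the downloader reads.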
  7. Member
    GREAT! Many thanks!
    I could figure out the first command but not the second.
    How did you do that?
    I am not understanding how to get the json part, but I hope I'm learning.
  8. Originally Posted by BosseB View Post
    GREAT! Many thanks!
    I could figure out the first command but not the second.
    How did you do that?
    I am not understanding how to get the json part, but I hope I'm learning.
    I just looked at all the requests in dev tools and found the response that returned the domain name
  9. Member
    BACK AGAIN!

    Now I have encountered a new streaming site which changes its m3u8 url now and then.
    So I tried to figure out how to extract the currently active m3u8 URL, but this time I had no luck, primarily because the page source is 76 kB all on one line, so a plain grep does not help in getting anything out of it.

    And I tried to look in the Firefox F12 debugger for a clue as to where it retrieves the m3u8. I can see the single spot where the m3u8 name string appears, but not what its value is...
    So I can get the m3u8 URL interactively using the Firefox F12 screen, but that is no good for automation.

    The site I am looking at now is:
    Code:
    https://usnewson.com/watch/cnn-live
    At first I thought it did not change but now I see that it does and I have to figure out a way to extract the current value on demand..

    Any scripting suggestions for linux welcome!
  10. Feels Good Man 2nHxWW6GkN1l916N3ayz8HQoi
    Join Date
    Jan 2024
    Location
    Pepe Island
    This displays the m3u8 content:
    Code:
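    import requests

    # API endpoint whose response body is used as the playlist URL
    BASE_URL = "https://pro.usnlive.com/api/stream?key=cnn"
    # Referer header sent with the playlist request
    M3U8_HEADERS = {
        'referer': 'https://usnewson.com/',
    }

    # Step 1: ask the API for the playlist URL (strip any trailing newline)
    m3u8_url = requests.get(BASE_URL, timeout=10).text.strip()
    # Step 2: fetch the playlist itself, sending the referer header
    cnn_m3u8 = requests.get(m3u8_url, headers=M3U8_HEADERS, timeout=10)
    print(cnn_m3u8.text)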
    Edit: or more simplified
    Code:
    curl -H "referer: https://usnewson.com/" "$(curl -sL "https://pro.usnlive.com/api/stream?key=cnn")"
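For a script it may help to split the nested curl into two steps, so that an empty or unexpected API answer is caught before ffmpeg is started. A sketch under the assumption (taken from the commands above) that the API answers with the playlist URL as plain text; `looks_like_url` is a hypothetical helper.

```shell
# looks_like_url (hypothetical helper): succeed only if the argument is a single
# http(s) URL without whitespace. This is a sanity check, not a full URL parser.
looks_like_url() {
    case "$1" in
        *[[:space:]]*) return 1 ;;
        http://?*|https://?*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage inside a script (network calls left commented out here):
#   stream_url=$(curl -sL "https://pro.usnlive.com/api/stream?key=cnn" | tr -d '\r\n')
#   if looks_like_url "$stream_url"; then
#       curl -s -H "referer: https://usnewson.com/" "$stream_url"
#   else
#       echo "unexpected API answer: $stream_url" >&2
#   fi
```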
    Last edited by 2nHxWW6GkN1l916N3ayz8HQoi; 5th Feb 2024 at 12:37.
  11. Member
    What language is this?
    Does bash have an import command, or is this something other than a simple bash script?
    If I put this into a file and run shellcheck against it, there are errors all over...
  12. Feels Good Man 2nHxWW6GkN1l916N3ayz8HQoi
    Originally Posted by BosseB View Post
    What language is this?
    Does bash have an import command, or is this something other than a simple bash script?
    If I put this into a file and run shellcheck against it, there are errors all over...
    That's python. If you want bash just use that nested curl command.
  13. Member
    Originally Posted by LZAA View Post
    Great!
    I don't know how you managed to acquire this command but it works just fine, Thank you!!!
  14. Member
    What about this site?
    Code:
    https://livenewsof.com/msnbc-live-stream/
    Here I have no clue as to where the m3u8 is hidden...
  15. Feels Good Man 2nHxWW6GkN1l916N3ayz8HQoi
    Originally Posted by BosseB View Post
    Here I have no clue as to where the m3u8 is hidden...
    Code:
    curl -s https://livenewsof.com/msnbc-live-stream/ | grep -oP 'var player = new Clappr\.Player\(\{source: "\K[^"]+'
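The `-o` option is what makes grep usable on pages whose source is one very long line, since it prints only the matched text rather than the whole line. A more generic sketch that does not depend on the Clappr player is shown here. The function name is hypothetical, and it only finds URLs that appear literally in the HTML, not ones assembled by JavaScript at run time.

```shell
# m3u8_urls_from_html (hypothetical): read HTML on stdin and print every
# literal http(s) URL that contains ".m3u8", one per line.
m3u8_urls_from_html() {
    grep -oE "https?://[^\"'<> ]+\.m3u8[^\"'<> ]*"
}

# Usage (network call left commented out here):
#   curl -s "https://livenewsof.com/msnbc-live-stream/" | m3u8_urls_from_html | head -n 1
```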
  16. Code:
    url=$(curl -H "Referer: https://livenewsof.com/msnbc-live-stream/" -si "https://rtmp2.livenewsof.com/livenewsof.com/qJbteqvER7/3.m3u8" | tr -d '\r' | sed -ne "/^Location:/{s@Location: @@;s@:443/@/@;p}")
    echo "$url"

    The 'URL' is only valid for a few seconds.
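Two details matter when parsing the headers: `curl -i` prints header lines with a trailing carriage return, and servers speaking HTTP/2 send the header name in lower case (`location:`). A sketch of a parser that tolerates both; the function name is hypothetical, and the `:443` removal simply mirrors the command above.

```shell
# location_from_headers (hypothetical): read raw response headers on stdin
# and print the target of the first Location header.
location_from_headers() {
    tr -d '\r' |
        sed -n -e 's/^[Ll]ocation: *//p' |
        sed -e 's|:443/|/|' |
        head -n 1
}

# Usage (network call left commented out here):
#   url=$(curl -si -H "Referer: https://livenewsof.com/msnbc-live-stream/" "$PLAYLIST_URL" | location_from_headers)
```

curl can also report the redirect target itself with `-w '%{redirect_url}'` (together with `-s -o /dev/null`), which avoids parsing headers altogether. Since the URL is only valid for a few seconds, it has to be fetched right before ffmpeg starts.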
    Last edited by LZAA; 7th Feb 2024 at 06:20.
  17. Member
    Originally Posted by 2nHxWW6GkN1l916N3ayz8HQoi View Post
    Originally Posted by BosseB View Post
    Here I have no clue as to where the m3u8 is hidden...
    Code:
    curl -s https://livenewsof.com/msnbc-live-stream/ | grep -oP 'var player = new Clappr\.Player\(\{source: "\K[^"]+'
    Thanks! This works fine.
  18. Member
    What about extracting an m3u8 from this URL?
    Code:
    https://www.livehdtv.net/msnbc-news/
    I have tried variations of the tricks above but still cannot get the m3u8 URL.

    Another of my extractors has stopped working so I need a new source...
  19. To get your m3u8 link, use the Stream Detector add-on for Firefox or Chrome.

    Code:
    https://helpfulpost.net/msnbcv12/index.m3u8?token=07f5338a2276d0d4d5b1a351ece66f135e9629f2-3f11e4aa0a13fc00193c38bbb262f849-1708718773-1708707973
  20. Member
    Well, that is similar to using F12 in Firefox, but is not possible to automate.
    This thread is about ways to extract m3u8 info from the player on a webpage in a (Linux) script so it can be automated.
    I could see the really long m3u8 thing using F12 when playing the video but could not figure out how to actually script its extraction.
    So that is what I need: a command inside the script that would result in the stream URL (m3u8) for use with ffmpeg.
  21. Member
    Originally Posted by cedric8528 View Post
    To get your m3u8 link, use the Stream Detector add-on for Firefox or Chrome.
    As a test I installed the "Stream Detector" Add-On in my Firefox but I cannot find its toolbar button anywhere even though it is listed as installed in settings.
    So finding a URL is not possible, since you are apparently required to click the icon and then select the action...

    So what gives regarding this Add-On???
    Exactly where is it supposed to be (the image seems to be a musical note).

    LATER
    I found where one can get to the "Stream Detector" page and retrieve the URL:
    1) Open a webpage with the player
    2) Start playing the video stream <= IMPORTANT the URL can only be extracted while playing the video
    3) Click the "Extensions" icon in the toolbar on top (looks like a jigsaw puzzle piece)
    4) Select "The Stream Detector"
    5) Now a dialog page opens where there is a button at bottom, which copies the stream URL to the clipboard.
    (See attached screenshot)

    The URL thus extracted looks like this:
    Code:
    https://helpfulpost.net/msnbcv12/tracks-v1a1/mono.m3u8?token=43975509923d886c9641a20c5e55f1e2ee2e63a3-f2ff1c269a7215cb736040dbd2601350-1708731737-1708720937
    Where the long numerical items at the end change after some unknown time.

    This is why I must have a system that is scripted to retrieve the URL automatically just before ffmpeg is set up to use it for the download.
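The last two numeric fields of the token look like Unix timestamps. This is an assumption based only on the sample URLs in this thread, where the two numbers are always exactly 10800 seconds (3 hours) apart; if it holds, the first (larger) one may be the expiry time. A sketch that pulls the two numbers out so a script can check them, with hypothetical function names:

```shell
# token_times (hypothetical): given an m3u8 URL with a "token=" parameter,
# print the last two dash-separated numeric fields, separated by a space.
token_times() {
    printf '%s\n' "$1" | sed -n 's/.*token=.*-\([0-9][0-9]*\)-\([0-9][0-9]*\)$/\1 \2/p'
}

# token_window (hypothetical): print the difference between the two fields in seconds.
token_window() {
    # Unquoted on purpose: split the "A B" output into two positional parameters.
    set -- $(token_times "$1")
    [ $# -eq 2 ] || return 1
    echo $(( $1 - $2 ))
}
```

On Ubuntu, `date -u -d @1708731737` shows the time that such a number corresponds to, which can be compared with the moment the URL stops working.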
    Attached image: StreamDetector.png

    Last edited by BosseB; 23rd Feb 2024 at 14:53. Reason: Clarification of problem
  22. Feels Good Man 2nHxWW6GkN1l916N3ayz8HQoi
    Originally Posted by BosseB View Post
    What about extracting an m3u8 from this URL?
    Code:
    https://www.livehdtv.net/msnbc-news/
    I have tried variations of the tricks above but still cannot get the m3u8 URL.

    Another of my extractors has stopped working so I need a new source...
    The solution may follow this path:
    Go to https://www.livehdtv.net/msnbc-news/

    From there go to https://www.livehdtv.net/yayin/?kanal=94&yayin=&guvenlik=$2y$10$lpBMmiEyrvuiX6Lui9u8We...MsxVk/SvW0cGjW
    Can be found in the html
    Code:
    <iframe id="tvekran" marginwidth="0" marginheight="0"  src="https://www.livehdtv.net/yayin/?kanal=94&yayin=&guvenlik=$2y$10$lpBMmiEyrvuiX6Lui9u8We4jX1Ajcqr8sXXHp3PMsxVk/SvW0cGjW" frameborder="0"  scrolling="no"></iframe>
    Then from there go to https://www.livehdtv.net/token.php?stream=msnbcv12
    Can be found in the html
    Code:
    <iframe width="100%" height="100%" src="https://www.livehdtv.net/token.php?stream=msnbcv12" frameborder="0"  scrolling="no"></iframe>
    Then you extract the m3u8 from:
    Code:
    file: "https://helpfulpost.net/msnbcv12/index.m3u8?token=f31e1cc4cfebbb06b60631be046f57b5f940171f-febdc2758ddbec802ab6e4337a303cb1-1708731281-1708720481",
    But for some reason a curl on the iframe src (the yayin one) doesn't work. Maybe cookies related???
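One thing worth ruling out before blaming cookies: the iframe URL contains `$2y$10$` followed by more text, and inside double quotes (or unquoted) the shell treats those `$` parts as variable references and replaces them, so curl never sees the real URL. The URL has to be single-quoted, or kept in a variable that is expanded inside double quotes. Whether that is the actual cause here is untested; the site may also check cookies or the referer. A sketch for pulling the iframe addresses out of a page, with a hypothetical function name:

```shell
# iframe_srcs (hypothetical): read HTML on stdin, print the src of each iframe.
iframe_srcs() {
    grep -oE '<iframe[^>]*>' | sed -n 's/.* src="\([^"]*\)".*/\1/p'
}

# Usage (network calls left commented out here):
#   page=$(curl -s "https://www.livehdtv.net/msnbc-news/")
#   inner=$(printf '%s\n' "$page" | iframe_srcs | head -n 1)
#   # "$inner" is expanded once, so the $ signs inside the URL stay intact;
#   # -e sends the outer page as referer.
#   curl -s -e "https://www.livehdtv.net/msnbc-news/" "$inner" | iframe_srcs
```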
  23. Member
    Found out how it can be done now:

    Code:
    M3U8=$(curl -s "https://www.livehdtv.net/token.php?stream=msnbcv12" | grep m3u8 | cut -d'"' -f2)
    This extracts the m3u8 URL which works in my download script with ffmpeg.
  24. Feels Good Man 2nHxWW6GkN1l916N3ayz8HQoi
    Originally Posted by BosseB View Post
    Found out how it can be done now:

    Code:
    M3U8=$(curl -s https://www.livehdtv.net/token.php?stream=msnbcv12 | grep m3u8 | cut -d'"' -f2)
    This extracts the m3u8 URL which works in my download script with ffmpeg.
    Yes but the question is how you got from https://www.livehdtv.net/msnbc-news/ to https://www.livehdtv.net/token.php?stream=msnbcv12
    If you want your solution to be general that's the problem.
  25. Member
    Originally Posted by 2nHxWW6GkN1l916N3ayz8HQoi View Post
    Yes but the question is how you got from https://www.livehdtv.net/msnbc-news/ to https://www.livehdtv.net/token.php?stream=msnbcv12
    If you want your solution to be general that's the problem.
    But that was you who posted that link......
    How did you get there?
  26. Feels Good Man 2nHxWW6GkN1l916N3ayz8HQoi
    Originally Posted by BosseB View Post
    But that was you who posted that link......
    How did you get there?
    I just got the html source code using the view-source: thing in browser. When I tried to automate via curl and grep regex the iframe src (first one) couldn't be obtained. Even manually viewing the source gave some problems once in a while. Now, if that token param is fixed for msnbc you are lucky. But if it changed to something like msnbcv13 then it will fail.

    I didn't even claim that was a solution, I just said the real one may involve a list of URLs that go through those steps I mentioned (read what I wrote). I don't know how to do the curl on the iframe. If I was sure of the solution I would have just dropped the curl for you to use. Instead I explained the possible steps.
  27. Member
    Originally Posted by ElCap View Post
    its basic, but this will get you the channel name part after /hls/
    Code:
    curl -s "https://livenewschat.eu/politics/" | grep m3u8 | awk -F "'" '{ print $6 }'
    msnbc_live
    and this will get you the domain name
    Code:
    curl -s "https://data.lncnetworks.host/server.json" | awk -F '"' '{ print $6 }'
    cdn-nl1-eu.lncnetworks.host
    Strange observation:
    So I have used this for a while now, and then on 2024-02-22 12:52 the site stopped working in the browser, and my extraction of the m3u8 URL therefore also failed.
    But the interesting thing is that the stream still seems to be there, so the latest m3u8 URLs still work fine for my downloads...
    I have logged the hourly extractions for some time now, so I can see what it actually retrieved while it was working. The log shows a limited number of different hosts, about 10, which are cycled through randomly while the rest of the URL stays the same.

    By using a URL from my log I can still do the automatic downloads even when scanning the site fails and the user no longer sees a player on screen. Strange...


    Edit:
    Here are the working URLs going back to 2024-02-11:

    Code:
    https://cdn-fr1-eu.lncnetworks.host/hls//index.m3u8
    https://cdn-fr1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8
    https://cdn-fr1-eu.lncnetworks.host/hls/msnbcca_live/index.m3u8
    https://cdn-fr3-eu.lncnetworks.host/hls/msnbc_live/index.m3u8
    https://cdn-fr3-eu.lncnetworks.host/hls/msnbcca_live/index.m3u8
    https://cdn-nl1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8
    https://cdn-nl1-eu.lncnetworks.host/hls/msnbcca_live/index.m3u8
    https://cdn-pl1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8
    https://cdn-pl1-eu.lncnetworks.host/hls/msnbcca_live/index.m3u8
    https://cdn-uk1-eu.lncnetworks.host/hls/msnbc_live/index.m3u8
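Since logged URLs keep working even when the page scan fails, the log itself can serve as a fallback. A minimal sketch, assuming a log file with one URL per line and the newest entry last (the actual log format is not shown in the thread). It also skips malformed entries such as the first line above, where the channel name came back empty. The function name is hypothetical.

```shell
# last_good_url (hypothetical): read logged URLs on stdin (oldest first) and print
# the newest one with a non-empty channel name between /hls/ and /index.m3u8.
last_good_url() {
    grep -E '^https://[^/]+/hls/[^/]+/index\.m3u8$' | tail -n 1
}

# Usage, with a hypothetical log file name:
#   M3U8=$(last_good_url < m3u8_history.log)
```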
    Last edited by BosseB; 24th Feb 2024 at 05:33. Reason: Adding data