VideoHelp Forum
  1. Member
    Join Date
    Feb 2021
    Location
    Sweden
    I am using an Ubuntu 20.04.3 Server machine to download streaming videos from news sites into 1-hour segments in mp4 format for daytime viewing. I have a 6-9 hour time difference to the sources, which is why I am doing this.

    I am able to script the extraction of the m3u8 stream URL from several sites, but unfortunately it does not work on all.

    For those that work I use this script code:
    Code:
       # Fetch the player page and keep the first https URL that ends in m3u8
       M3U8=$(curl -s "${STREAMURL}" | grep -o -e 'https://.\+m3u8' | head -n 1)
    Here the variable STREAMURL is the URL to the webpage on which the player resides and is playing the news shows.
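
    A slightly tighter variant of that extraction, as a sketch. The pattern `https://.\+m3u8` is greedy and can swallow everything between two URLs that sit on the same line of HTML. The function name `first_m3u8_url` and the set of characters excluded from the match are assumptions, not something any particular site guarantees.

```shell
# Print the first http(s) URL ending in .m3u8 found on stdin.
# Assumption: the URL itself contains no quotes, spaces or angle brackets.
first_m3u8_url() {
    grep -o -E "https?://[^\"'<> ]+\.m3u8" | head -n 1
}
```

    It would be used as M3U8=$(curl -s "${STREAMURL}" | first_m3u8_url).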

    In other cases I have to use Firefox: while the video is playing I hit F12 and watch the Network/All tab for an m3u8 line to appear, click it, then right-click and select Copy/URL, which results in something like this:

    Code:
    http://1128480543.rsc.cdn77.org/wF0Xk_UoBZzEHzrCGmG7AA==,1643380502/1128480543/tracks-v1a1/mono.m3u8
    With this m3u8 URL I can then download the video like this:
    Code:
    CMD="ffmpeg -hide_banner -user_agent \"Mozilla\" -i ${M3U8} -vf scale=w=-4:h=360 -c:v libx264 -preset fast -crf 26 -c:a copy -t $CAPTURETIME $TARGETFILE"
    or
    CMD="ffmpeg -hide_banner -referer \"${VIDEOURL}\" -i \"${M3U8}\" -vf scale=w=-4:h=480 -c:v libx264 -preset fast -crf 26 -c:a copy -t ${CAPTURETIME} ${TARGETFILE}"
    Here the variables are:
    VIDEOURL - The URL to the page holding the player
    M3U8 - The m3u8 stream URL retrieved as described above
    CAPTURETIME - The output video duration in seconds
    TARGETFILE - The output mp4 file obviously...

    Notice that on some sites I have to use -user_agent and on other sites -referer, it depends on the site...

    I run the download script as an at job starting a short time before the show starts and ending a bit after it ends.

    This works OK for the few sites I have gotten it to work on, but I have the following problem:

    The manually extracted M3U8 URL may change; in some cases it changes daily or more often (part of it, like the big number 1643380502, changes...)
    Here I really need to get hold of an automatic extraction procedure which works and can be used as part of the download script.
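
    One way to decide when a saved URL needs refreshing, as a sketch. It assumes the 10-digit number after the comma is a Unix expiry timestamp, which the example value is consistent with but which nothing here confirms. The function names are made up for illustration.

```shell
# Print the 10-digit number that follows the comma in a tokenised URL, if any.
# Assumption: such URLs look like http://host/<token>,<number>/path.
url_expiry() {
    printf '%s\n' "$1" | sed -n 's#^[^,]*,\([0-9]\{10\}\)/.*$#\1#p'
}

# Succeed (exit 0) when the URL carries a timestamp that is not later than "now".
# $2 is the current time in seconds since the epoch; it defaults to `date +%s`.
url_is_expired() {
    expiry=$(url_expiry "$1")
    now=${2:-$(date +%s)}
    [ -n "$expiry" ] && [ "$expiry" -le "$now" ]
}
```

    A URL without such a number is never reported as expired, so constant URLs are left alone.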

    Any ideas on how to do this?
    I.e. how to extract the m3u8 URL from the websites that do not respond to the command I showed above?

    Like these:
    Code:
    http://www.freeintertv.com/view/id-2565
    https://livenewschat.eu/politics
    https://livenewsof.com/msnbc-live-stream
  2. Hi BosseB

    I started with the second of the three urls i.e. https://livenewschat.eu/politics

    This feed uses the same m3u8 url .... https://ligma.cdn.livenewschat.eu/hls/msnbc_live/index.m3u8 . Master location is constant.

    Use this curl command ( for windows ... you can easily modify for your Ubuntu) to download the m3u8. Then use ffmpeg to capture the stream as per usual.

    curl -k "https://ligma.cdn.livenewschat.eu/hls/msnbc_live/index.m3u8" ^
    -H "Connection: keep-alive" ^
    -H "Pragma: no-cache" ^
    -H "Cache-Control: no-cache" ^
    -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.9 Safari/537.36" ^
    -H "Accept: */*" ^
    -H "Origin: https://livenewschat.eu" ^
    -H "Sec-Fetch-Site: same-site" ^
    -H "Sec-Fetch-Mode: cors" ^
    -H "Sec-Fetch-Dest: empty" ^
    -H "Referer: https://livenewschat.eu/" ^
    -H "Accept-Language: en-US,en;q=0.9" ^
    -H "dnt: 1" ^
    -H "sec-gpc: 1" ^
    --compressed -o livenewschat.m3u8

    I'll look at the other two as time permits.
  3. Hi BosseB

    looking at #3 https://livenewsof.com/msnbc-live-stream

    https://rtmp.livenewsof.com/hls/fx2.m3u8 <== Are you saying that this URL changes often?
  4. Member
    Originally Posted by jack_666
    Hi BosseB

    looking at #3 https://livenewsof.com/msnbc-live-stream

    https://rtmp.livenewsof.com/hls/fx2.m3u8 <== Are you saying that this URL changes often?
    No, that is one of the sites I cannot extract the m3u8 URL from in a script, so I did it using the Firefox browser. But I believe it will not change often, if at all.
    The others are bigger and contain number strings with 6-10 digits, and here I have seen that they change more often than the others. Sometimes the character strings like C90Zrw8DEqphyq8lGfWOYg also change, so this is why I would need a way to update just before the download starts.
  5. Member
    Originally Posted by jack_666
    Hi BosseB

    I started with the second of the three urls i.e. https://livenewschat.eu/politics

    This feed uses the same m3u8 url .... https://ligma.cdn.livenewschat.eu/hls/msnbc_live/index.m3u8 . Master location is constant.

    Use this curl command ( for windows ... you can easily modify for your Ubuntu) to download the m3u8. Then use ffmpeg to capture the stream as per usual.

    Code:
    curl -k "https://ligma.cdn.livenewschat.eu/hls/msnbc_live/index.m3u8" ^
      -H "Connection: keep-alive" ^
      -H "Pragma: no-cache" ^
      -H "Cache-Control: no-cache" ^
      -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.9 Safari/537.36" ^
      -H "Accept: */*" ^
      -H "Origin: https://livenewschat.eu" ^
      -H "Sec-Fetch-Site: same-site" ^
      -H "Sec-Fetch-Mode: cors" ^
      -H "Sec-Fetch-Dest: empty" ^
      -H "Referer: https://livenewschat.eu/" ^
      -H "Accept-Language: en-US,en;q=0.9" ^
      -H "dnt: 1" ^
      -H "sec-gpc: 1" ^
      --compressed -o livenewschat.m3u8
    I'll look at the other two as time permits.
    Is the above a command that goes on the command line as a single line?
    If so it looks pretty long...
    What do the ^ characters do? Are they some kind of Windows special char?

    EDIT:
    Do you mean that the curl command above should be used to stream the video into ffmpeg like this:

    Code:
    curl <massive set of arguments as seen above> | ffmpeg <formatting arguments to get the output in the correct geometry> -t 3600 output.mp4
    If so does the timeout -t 3600 for ffmpeg work to stop curl too when the download is complete?

    I don't really understand your suggestion...

    Or does the curl command above result in an m3u8 URL printed on the command line ready to be used inside my ffmpeg command?
    Last edited by BosseB; 28th Jan 2022 at 18:16.
  6. hi BosseB

    What do the ^ characters do? <== It breaks a long line of code into smaller and more readable parts.

    The Bash shell uses a backslash (\) for this:

    Code:
      -H 'Accept: */*' \
      -H 'Origin: https://livenewschat.eu' \
      -H 'Sec-Fetch-Site: same-site' \
      -H 'Sec-Fetch-Mode: cors' \
      -H 'Sec-Fetch-Dest: empty' \
      -H 'Referer: https://livenewschat.eu/' \
      -H 'Accept-Language: en-US,en;q=0.9' \
      -H 'dnt: 1' \
      -H 'sec-gpc: 1' \
  7. you wrote

    The others are bigger and contain number strings with 6-10 digits, and here I have seen that they change more often than the others. Sometimes the character strings like C90Zrw8DEqphyq8lGfWOYg also change, so this is why I would need a way to update just before the download starts.

    C90Zrw8DEqphyq8lGfWOYg <== Time codes. They set an end of life for the URL.

    I would need a way to update just before the download starts. <== That is the Holy Grail of automation. Many are seeking this out but alas no Sir Galahad (to my knowledge).
  8. Do you mean that the curl command above should be used to stream the video into ffmpeg
    No, the command downloads the latest m3u8 file .... look at the code.

    -o livenewschat.m3u8 <== curl will automatically save the m3u8 to the file livenewschat.m3u8 in your pwd (present working directory)

    Then you use ffmpeg together with this m3u8 to retrieve the video.

    the curl command above result in an m3u8 URL printed on the command line ready to be used inside my ffmpeg command
    Correct
  9. Member
    Originally Posted by jack_666
    Do you mean that the curl command above should be used to stream the video into ffmpeg
    No, the command downloads the latest m3u8 file .... look at the code.
    Well I was into the piping of data so ffmpeg could work while the download was ongoing...
    Early on I downloaded to ts files and converted them afterwards, but the conversion took a long time, so I basically had a conversion process running for half an hour. That is when I realized that if ffmpeg could get the stream directly, I could put the processing in the same ffmpeg command and the file would be ready when the stream stopped.

    So that is my approach and it works well for most streams that ffmpeg can be set to download...

    -o livenewschat.m3u8 <== curl will automatically save the m3u8 to the file livenewschat.m3u8 in your pwd (present working directory)

    Then you use ffmpeg together with this m3u8 to retrieve the video.

    the curl command above result in an m3u8 URL printed on the command line ready to be used inside my ffmpeg command
    Correct
    Is there no way to *pipe* the stream from curl into ffmpeg while it is happening?
    That would be the solution if it could be done...

    The stream I have the most problems with regarding changing m3u8 URLs is
    Code:
    http://www.freeintertv.com/view/id-2565
    This site offers a host of different streams so the 2565 is an example of the MSNBC stream I am looking for.
    That might complicate extraction from it though...
  10. Member
    Originally Posted by jack_666
    Hi BosseB

    I started with the second of the three urls i.e. https://livenewschat.eu/politics

    This feed uses the same m3u8 url .... https://ligma.cdn.livenewschat.eu/hls/msnbc_live/index.m3u8 . Master location is constant.

    Use this curl command ( for windows ... you can easily modify for your Ubuntu) to download the m3u8. Then use ffmpeg to capture the stream as per usual.

    Code:
    curl -k "https://ligma.cdn.livenewschat.eu/hls/msnbc_live/index.m3u8" ^
      -H "Connection: keep-alive" ^
      -H "Pragma: no-cache" ^
      -H "Cache-Control: no-cache" ^
      -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.9 Safari/537.36" ^
      -H "Accept: */*" ^
      -H "Origin: https://livenewschat.eu" ^
      -H "Sec-Fetch-Site: same-site" ^
      -H "Sec-Fetch-Mode: cors" ^
      -H "Sec-Fetch-Dest: empty" ^
      -H "Referer: https://livenewschat.eu/" ^
      -H "Accept-Language: en-US,en;q=0.9" ^
      -H "dnt: 1" ^
      -H "sec-gpc: 1" ^
      --compressed -o livenewschat.m3u8
    I'll look at the other two as time permits.
    So I tried to modify your command to use on Linux in the following way:
    Code:
    curl -k "https://ligma.cdn.livenewschat.eu/hls/msnbc_live/index.m3u8" \
      -H "Connection: keep-alive" \
      -H "Pragma: no-cache" \
      -H "Cache-Control: no-cache" \
      -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.9 Safari/537.36" \
      -H "Accept: */*" \
      -H "Origin: https://livenewschat.eu" \
      -H "Sec-Fetch-Site: same-site" \
      -H "Sec-Fetch-Mode: cors" \
      -H "Sec-Fetch-Dest: empty" \
      -H "Referer: https://livenewschat.eu/" \
      -H "Accept-Language: en-US,en;q=0.9" \
      -H "dnt: 1" \
      -H "sec-gpc: 1" \
      --compressed -o livenewschat.m3u8
    But it ran for only a second or two and left an m3u8 file containing this:
    Code:
    #EXTM3U
    #EXT-X-VERSION:3
    #EXT-X-MEDIA-SEQUENCE:360075
    #EXT-X-TARGETDURATION:6
    #EXT-X-KEY:METHOD=AES-128,URI="1643538964500.key",IV=0x00000000000000000000017EAA8E6014
    #EXTINF:6.006,
    1643538976500.ts
    #EXTINF:6.006,
    1643538982500.ts
    #EXT-X-KEY:METHOD=AES-128,URI="1643538988500.key",IV=0x00000000000000000000017EAA8EBDD4
    #EXTINF:6.006,
    1643538988500.ts
    #EXTINF:6.006,
    1643538994500.ts
    #EXTINF:6.006,
    1643539001000.ts
    #EXTINF:6.006,
    1643539007000.ts
    #EXT-X-KEY:METHOD=AES-128,URI="1643539013000.key",IV=0x00000000000000000000017EAA8F1D88
    #EXTINF:6.006,
    1643539013000.ts
    #EXTINF:6.006,
    1643539019000.ts
    #EXTINF:6.006,
    1643539025000.ts
    #EXTINF:6.006,
    1643539031000.ts
    It does not seem like this worked as it should on Linux...
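
    The file above looks like a normal HLS media playlist rather than a failed download: it lists a handful of six-second .ts segments by relative name, plus the AES-128 key files needed to decrypt them, and a live server typically rewrites it as new segments appear. Because the names are relative, the saved copy is of little use on its own, while ffmpeg can be given the playlist URL directly. The sketch below only illustrates how the relative names resolve against the playlist URL. `list_segments` is a hypothetical helper, and it assumes every non-comment line is a relative segment name.

```shell
# Read a media playlist on stdin and print each segment as an absolute URL.
# $1 is the URL the playlist was downloaded from.
list_segments() {
    base=${1%/*}    # strip the playlist file name, keep the directory part
    grep -v -e '^#' -e '^[[:space:]]*$' | while IFS= read -r seg; do
        printf '%s/%s\n' "$base" "$seg"
    done
}
```

    For example: list_segments "https://ligma.cdn.livenewschat.eu/hls/msnbc_live/index.m3u8" < livenewschat.m3u8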

    But when tested on Windows10 the result was this:
    Code:
    curl: option --compressed: the installed libcurl version doesn't support this
    curl: try 'curl --help' for more information
    So do you need a special version of curl?
    Mine is as follows:
    Code:
    curl --version
    curl 7.79.1 (Windows) libcurl/7.79.1 Schannel
    Release-Date: 2021-09-22
    Protocols: dict file ftp ftps http https imap imaps pop3 pop3s smtp smtps telnet tftp
    Features: AsynchDNS HSTS IPv6 Kerberos Largefile NTLM SPNEGO SSL SSPI UnixSockets
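
    The Features line above lists no libz, brotli or zstd, which would explain the --compressed error: that option needs a curl build with at least one of them. Dropping --compressed is probably enough here, since a playlist is a small text file. A sketch of a check, with `curl_has_compression` as a hypothetical helper that reads `curl --version` output on stdin:

```shell
# Succeed if the `curl --version` text on stdin advertises a compression feature.
curl_has_compression() {
    grep '^Features:' | grep -q -i -w -E 'libz|brotli|zstd'
}
```

    It could guard the option like this: curl --version | curl_has_compression && OPTS="--compressed"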
  11. Member
    Here is what is hidden behind the second URL above to make it clearer:
    Code:
    #!/bin/sh
    set -euC
    URL=$(curl -qSs -d 'chname=bXNuYmNfbGl2ZQ%3D%3D&ch=http%3A%2F%2Fwww.freeintertv.com%2Fexternals%2Ftv-russia%2Fsmotret-tv3-online&html5=11' 'http://www.freeintertv.com/myAjax/get_item_m3u8/' | grep -Eo '(http|https)://[[:alnum:].,/=*]*index\.m3u8')
    ffmpeg -i "$URL" -c copy video.mp4
    What does line "set -euC" do in the script?

    When I run the script on my Linux box it returns exactly nothing at all....
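
    A likely reason the script prints nothing: the bracket expression [[:alnum:].,/=*] does not include "_" or "-", but the token part of these URLs can contain them (for example wF0Xk_UoBZzEHzrCGmG7AA==). When that happens grep finds no match and returns a non-zero status, and "set -e" then makes the script exit silently before ffmpeg runs. As for the options themselves: -e exits on the first failing command, -u treats unset variables as errors, and -C stops ">" from overwriting existing files. A sketch of the extraction with a wider character class follows; `extract_index_m3u8` is a made-up name, and the response format is assumed to be a single quoted URL ending in index.m3u8.

```shell
# Print the index.m3u8 URL embedded in the get_item_m3u8 response on stdin.
# The class also admits "_" and "-", which the token may contain.
extract_index_m3u8() {
    grep -Eo 'https?://[[:alnum:]._,/=*-]*index\.m3u8'
}
```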
  12. In PC it works.
  13. convert this windows script to linux in order to capture the m3u8 url

    curl -qSs -d "chname=bXNuYmNfbGl2ZQ%3D%3D&ch=http%3A%2F%2Fwww.f reeintertv.com%2Fexternals%2Ftv-russia%2Fsmotret-tv3-online&html5=11" "http://www.freeintertv.com/myAjax/get_item_m3u8/" | sed -e "s#^.*http\(.*\)m3u8.*$#http\1m3u8#"
    the above code captures the below url

  14. Member
    So I tried to run the text above direct in the terminal on Linux after I had discovered and removed the extra space in the part that read:
    Code:
    2Fwww.f reeintertv.com%2
           ^
    But the result was:
    Code:
    sed: -e expression #1, char 33: unterminated `s' command
    (23) Failed writing body
    So I decided to skip the sed part to see what was actually coming out of the curl call:
    Code:
    $ curl -qSs -d "chname=bXNuYmNfbGl2ZQ%3D%3D&ch=http%3A%2F%2Fwww.freeintertv.com%2Fexternals%2Ftv-russia%2Fsmotret-tv3-online&html5=11" "http://www.freeintertv.com/myAjax/get_item_m3u8/"
    playlist[0]['file']='http://1128480543.rsc.cdn77.org/RxrBwWHJG3JXeR_UqGbxUA==,1643582113/1128480543/index.m3u8'; get_item.showPlayer(); bosse@ubuntuserv:~/www/MSNBC/download$
    So there is something the matter with the pipe into sed, the sed command is this:
    Code:
    sed -e "s#^.*http\(.*\)m3u8.*$#http\1m3u8#"
    Since I have no idea what is going on in the sed part I cannot interpret its error message...
    What does the "unterminated `s' command" mean?

    And where does the string "bXNuYmNfbGl2ZQ" in the curl command come from?
    If it is changing from time to time then it won't work for long...
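
    On the sed error: inside double quotes bash expands $# to the number of positional parameters (0 in an interactive shell), so sed receives s#^.*http\(.*\)m3u8.*0http\1m3u8#, which has only two # delimiters and is therefore reported as unterminated at character 33. The "(23) Failed writing body" line is curl noticing that sed has already gone away. Windows cmd does no such expansion, which would explain why the same line works there. Single quotes avoid the problem, as in this sketch (`m3u8_from_response` is a hypothetical name):

```shell
# Reduce the response line on stdin to the URL it contains.
# Single quotes keep the shell from touching "$#" inside the sed expression.
m3u8_from_response() {
    sed -e 's#^.*http\(.*\)m3u8.*$#http\1m3u8#'
}
```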
  15. You could use python, something like this

    Code:
    #! /usr/bin/python
    import os
    import ffmpy
    import requests
    import re
    
    hds = {
        'Connection': 'keep-alive',
        'Accept': 'text/plain, */*; q=0.01',
        'X-Requested-With': 'XMLHttpRequest',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'Origin': 'http://www.freeintertv.com',
        'Accept-Language': 'en-US,en;q=0.9,es;q=0.8,pt;q=0.7,cs;q=0.6,fr;q=0.5,zh-TW;q=0.4,zh;q=0.3'
        }
    
    
    url_canal = 'http://www.freeintertv.com/myAjax/get_item_m3u8/'
    data_raw = 'chname=bXNuYmNfbGl2ZQ%3D%3D&ch=http%3A%2F%2Fwww.freeintertv.com%2Fexternals%2Ftv-russia%2Fsmotret-tv3-online&html5=11'
    
    
    rs = requests.post(url_canal, headers=hds, data=data_raw)
    cnt = rs.text
    pattern = r"='(.*)'"
    x = re.search(pattern, cnt)
    url_ch = x.group(1)
    
    
    
    ff = ffmpy.FFmpeg(inputs={url_ch: '-headers "User-Agent:  Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36"'}, outputs={'MSNBC.mp4': '-acodec copy -vcodec copy -t 35'}, global_options="-y -hide_banner -loglevel warning")
    ff.run()
  16. Member
    Tested the python script:
    Code:
    $ ./pytest
    Traceback (most recent call last):
      File "./pytest", line 3, in <module>
        import ffmpy
    ModuleNotFoundError: No module named 'ffmpy'
    Same if I changed
    #! /usr/bin/python
    to
    #! /usr/bin/python3
    Same if I test on a different Linux server...

    I have never used python so I don't know how to handle it.
  17. Member
    Originally Posted by LZAA
    Try it:

    echo ffmpeg
    Why are you posting something like this?
    Trolling?
  18. It just means that you don't have ffmpy installed.
    Code:
    pip3 install ffmpy
    then run again
    Code:
    python3 pytest.py
  19. Originally Posted by BosseB
    Tested the python script:
    Code:
    $ ./pytest
    Traceback (most recent call last):
      File "./pytest", line 3, in <module>
        import ffmpy
    ModuleNotFoundError: No module named 'ffmpy'
    Same if I changed
    #! /usr/bin/python
    to
    #! /usr/bin/python3
    Same if I test on a different Linux server...

    I have never used python so I don't know how to handle it.
    You must first install the package before you can use it in your code. Run the following command to install the package and its dependencies.

    Code:
    pip install ffmpy
  20. Member
    So now I get this instead:
    Code:
    $ python3 pytest.py
    [http @ 0x5632785d9200] No trailing CRLF found in HTTP header. Adding it.
    After the message above is printed nothing happens for a while and then the cursor is returned with no further output.
    Note:
    I am doing this on Linux Mint 20.3
  21. Originally Posted by BosseB
    So now I get this instead:
    Code:
    $ python3 pytest.py
    [http @ 0x5632785d9200] No trailing CRLF found in HTTP header. Adding it.
    After the message above is printed nothing happens for a while and then the cursor is returned with no further output.
    The script will download 35 seconds of the MSNBC channel, you will find the mp4 file inside the same folder as the script.
    You have to modify the ffmpeg command to suit what you want to do.
    Code:
    ff = ffmpy.FFmpeg(inputs={url_ch: '-headers "User-Agent:  Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36"'}, outputs={'MSNBC.mp4': '-acodec copy -vcodec copy -t 35'}, global_options="-y -hide_banner -loglevel warning")
  22. Member
    Originally Posted by dark125
    Originally Posted by BosseB
    So now I get this instead:
    Code:
    $ python3 pytest.py
    [http @ 0x5632785d9200] No trailing CRLF found in HTTP header. Adding it.
    After the message above is printed nothing happens for a while and then the cursor is returned with no further output.
    The script will download 35 seconds of the MSNBC channel, you will find the mp4 file inside the same folder as the script.
    You have to modify the ffmpeg command to suit what you want to do.
    Code:
    ff = ffmpy.FFmpeg(inputs={url_ch: '-headers "User-Agent:  Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36"'}, outputs={'MSNBC.mp4': '-acodec copy -vcodec copy -t 35'}, global_options="-y -hide_banner -loglevel warning")
    OK, I did not realize that it would download the stream itself...
    I found the mp4 now and it is OK (except the wrong size, but that is easily fixed).

    But I already have a suite of scripts that handle the downloads but they need the m3u8 stream URL to work.

    In some cases I have been able to script the extraction of the current m3u8 URL from a number of sites so that it can be extracted a few minutes before the actual download starts and saved to a file read by the download script.

    But for some sites I have not been able to automate this extraction, so I have used the F12 developer tools in Firefox to read the m3u8 URL and written it to the file manually. This works as long as the m3u8 URL does not change over time, but unfortunately it does change on some sites, where strings like EJZgovLg2izA2gQ or 1643299593 seem to be embedded in the URL. These items are often short-lived, and therefore a scripted extraction is needed.
    I have noted that the python code above contains this:

    Code:
    data_raw = 'chname=bXNuYmNfbGl2ZQ%3D%3D&ch=http%3A%2F%2Fwww.freeintertv.com%2Fexternals%2Ftv-russia%2Fsmotret-tv3-online&html5=11'
    This is the type of string that in my experience changes with time and so needs automated extraction, but that is not shown in your example. From where did you get these values?

    So this thread is mainly about automatically finding the m3u8 stream url to be used with the ffmpeg command, which looks like this:
    Code:
    CMD="ffmpeg -hide_banner ${MODE} -i \"${M3U8URL}\" -vf scale=w=-4:h=480 -c:v libx264 -preset fast -crf 26 -c:a copy -t ${CAPTURETIME} ${TARGETFILE}"
    Here the variables are:
    MODE: "-user_agent \"Mozilla\"" or "-referer \"${VIDEOURL}\"" depending on site
    VIDEOURL: The page URL used as referer
    M3U8URL: The m3u8 stream url we are discussing here
    CAPTURETIME: The download time in seconds
    TARGETFILE: The output mp4 file
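
    The MODE switch described above can also be handled without building a command string, which avoids the nested quoting. A minimal sketch, assuming the same ffmpeg options as in this thread; `capture_stream`, its argument order and the `FFMPEG` override (handy for a dry run with echo) are made up for illustration:

```shell
# capture_stream MODE PAGE_URL M3U8_URL SECONDS TARGET
# MODE is "user_agent" or "referer", depending on what the site expects.
capture_stream() {
    mode=$1 page_url=$2 m3u8_url=$3 seconds=$4 target=$5
    # Replace the positional parameters with the site-specific input options.
    case $mode in
        user_agent) set -- -user_agent "Mozilla" ;;
        referer)    set -- -referer "$page_url" ;;
        *) echo "unknown mode: $mode" >&2; return 2 ;;
    esac
    ${FFMPEG:-ffmpeg} -hide_banner "$@" -i "$m3u8_url" \
        -vf scale=w=-4:h=480 -c:v libx264 -preset fast -crf 26 -c:a copy \
        -t "$seconds" "$target"
}
```

    Running it as FFMPEG=echo capture_stream referer "$VIDEOURL" "$M3U8URL" 3600 out.mp4 prints the arguments instead of starting a download.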

    If I manually find the m3u8 URL via Firefox, then ffmpeg works OK.
    But at irregular times the m3u8 URL changes (on some sites only), so it has to be extracted again...
    This extraction is what I am looking for...
  23. Originally Posted by BosseB
    Code:
    data_raw = 'chname=bXNuYmNfbGl2ZQ%3D%3D&ch=http%3A%2F%2Fwww.freeintertv.com%2Fexternals%2Ftv-russia%2Fsmotret-tv3-online&html5=11'
    This is the type of string that in my experience changes with time and so needs automated extraction, but that is not shown in your example. From where did you get these values?
    That parameter doesn't seem to change. URL-decoded, it is this:

    Code:
    chname=bXNuYmNfbGl2ZQ==&ch=http://www.freeintertv.com/externals/tv-russia/smotret-tv3-online&html5=11
    If we then base64-decode bXNuYmNfbGl2ZQ== we get
    Code:
    msnbc_live
    It is simply the name of the channel, so if you want to download another channel, you must change data_raw. For cnn it would be this:
    Code:
    data_raw = 'chname=Y25uX2xpdmU%3D&ch=http%3A%2F%2Fwww.freeintertv.com%2Fexternals%2Ftv-russia%2Fsmotret-tv3-online&html5=11'
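
    The same encoding can be produced for any channel name, as a sketch. It assumes a base64 command is available (GNU coreutils has one) and that other channels follow the same scheme; the cnn value above decodes to cnn_live, but which other names the site accepts is not known. `chname_for` and `post_body_for` are hypothetical helpers.

```shell
# Build the chname value for a channel: base64 of the name, then percent-encode
# the three base64 characters that are not safe in a form body.
chname_for() {
    printf '%s' "$1" | base64 | sed -e 's/+/%2B/g' -e 's#/#%2F#g' -e 's/=/%3D/g'
}

# Assemble the POST body used in this thread for a given channel name.
post_body_for() {
    printf 'chname=%s&ch=%s&html5=11\n' "$(chname_for "$1")" \
        'http%3A%2F%2Fwww.freeintertv.com%2Fexternals%2Ftv-russia%2Fsmotret-tv3-online'
}
```

    post_body_for msnbc_live reproduces the data string used for MSNBC in this thread.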
    If you want to automate it, you can create a bash script that runs the Python file and schedule it with crontab:
    Code:
    #!/bin/bash

    # Run from the script's own directory; stop if it does not exist
    cd /path/to/script || exit 1

    ./download_msnbc.py
  24. Originally Posted by BosseB

    So there is something the matter with the pipe into sed, the sed command is this:
    Code:
    sed -e "s#^.*http\(.*\)m3u8.*$#http\1m3u8#"
    Since I have no idea what is going on in the sed part I cannot interpret its error message...
    What does the "unterminated `s' command" mean?
    You can also try this
    Code:
    #!/bin/sh
    set -euC
    URL=$(curl -qSs -d 'chname=bXNuYmNfbGl2ZQ%3D%3D&ch=http%3A%2F%2Fwww.freeintertv.com%2Fexternals%2Ftv-russia%2Fsmotret-tv3-online&html5=11' 'http://www.freeintertv.com/myAjax/get_item_m3u8/' | sed -e "s/^.*http\(.*\)m3u8.*$/http\1m3u8/g")
    ffmpeg -y -hide_banner -loglevel warning -headers "User-Agent:  Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36" -i "$URL" -acodec copy -vcodec copy -t 35 MSNBC.mp4
  25. Member
    Originally Posted by dark125
    Originally Posted by BosseB
    Code:
    data_raw = 'chname=bXNuYmNfbGl2ZQ%3D%3D&ch=http%3A%2F%2Fwww.freeintertv.com%2Fexternals%2Ftv-russia%2Fsmotret-tv3-online&html5=11'
    This is the type of string that in my experience changes with time and so needs automated extraction, but that is not shown in your example. From where did you get these values?
    That parameter doesn't seem to change, the decode is this

    Code:
    chname=bXNuYmNfbGl2ZQ==&ch=http://www.freeintertv.com/externals/tv-russia/smotret-tv3-online&html5=11
    if we keep decoding bXNuYmNfbGl2ZQ== it is
    Code:
    msnbc_live
    It is simply the name of the channel, so if you want to download another channel, we must change the data_raw, for cnn it would be this
    Code:
    data_raw = 'chname=Y25uX2xpdmU%3D&ch=http%3A%2F%2Fwww.freeintertv.com%2Fexternals%2Ftv-russia%2Fsmotret-tv3-online&html5=11'
    Interesting!
    So how is the strange looking string decoded into msnbc_live?
    I know that %2F is encoding of / but how is bXNuYmNfbGl2ZQ decoded/encoded?

    EDIT:
    I tried base64 like this:
    Code:
    $ echo 'bXNuYmNfbGl2ZQ' | base64 -d
    msnbc_livebase64: invalid input
    Then I tried this:
    Code:
    $ echo 'bXNuYmNfbGl2ZQo=' | base64 -d
    msnbc_live
    So somehow it is base64 but the input string has to be padded in some way...
    /EDIT
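
    The first attempt fails because the %3D%3D at the end of the value is the percent-encoded "==" padding, and base64 -d wants it back. The second attempt works only because the appended "o=" happens to decode to a trailing newline. A sketch that restores the padding before decoding (`b64_decode_unpadded` is a hypothetical helper; base64 -d is the GNU coreutils spelling):

```shell
# Decode base64 text whose trailing "=" padding has been dropped.
b64_decode_unpadded() {
    s=$1
    # Base64 text length must be a multiple of four; add "=" until it is.
    while [ $(( ${#s} % 4 )) -ne 0 ]; do
        s="${s}="
    done
    printf '%s' "$s" | base64 -d
}
```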

    I tried putting this into my existing URL-extracting script:
    Code:
    M3U8URL=$(curl -qSs -d 'chname=bXNuYmNfbGl2ZQ%3D%3D&ch=http%3A%2F%2Fwww.freeintertv.com%2Fexternals%2Ftv-russia%2Fsmotret-tv3-online&html5=11' 'http://www.freeintertv.com/myAjax/get_item_m3u8/' | sed -e "s/^.*http\(.*\)m3u8.*$/http\1m3u8/g")
    Then this is saved to the file read by the main download script and it does work!

    Thank you so much for providing this and explaining the source of the strange looking strings!
    Last edited by BosseB; 31st Jan 2022 at 12:18. Reason: Found solution by myself
  26. Member
    Just one question about something that mystifies me:
    Given the way the URL looks, how does the web server, the browser, or whatever it is deduce that a string like this:
    Code:
    chname=bXNuYmNfbGl2ZQ%3D%3D&ch=http%3A%2F%2Fwww.freeintertv.com%2Fexternals%2Ftv-russia%2Fsmotret-tv3-online&html5=11
    is partially base64 encoded?
    I do not understand how anyone can automatically treat this as a URL by decoding certain parts of it...
    I understand the % notation used for control characters and the like, but not the use of base64.
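
    Nothing in the URL syntax marks the value as base64. Percent-decoding (%3D to =) is the generic step that web server software applies to every form field; what a field's value means afterwards is up to the site's own code, which here presumably base64-decodes chname by convention. The same two steps by hand, as a sketch; `chname_to_channel` is a made-up name, only the %3D escape is handled, and base64 -d is the GNU coreutils spelling:

```shell
# Extract the chname field from a form body and decode it in two layers.
chname_to_channel() {
    printf '%s\n' "$1" |
        tr '&' '\n' |               # one field per line
        sed -n 's/^chname=//p' |    # keep only the chname value
        sed 's/%3[Dd]/=/g' |        # layer 1: percent-decoding of "="
        base64 -d                   # layer 2: base64, an application choice
}
```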