Normally when I rip a stream from a website, I first have to visit the website and use Firefox's Inspect Element tools to get the manifest.mpd link. Once I have this link I can then easily download the chunks to a single video file.
My question is... is it possible to get the manifest.mpd link without using a web browser? Maybe some command-line program like WGET could probe the website and return the manifest.mpd link.
Basically, I want to download the daily news show from a specific website and I'm just trying to come up with a way to streamline the process. So in the end, I'll have a Windows scheduled task that will automatically download the video every day without intervention.
-Pete
-
You didn't say what software you are using, but youtube-dl can parse many web pages to get the video URL and download the video. For example, YouTube videos can be downloaded with just the page URL:
Code:youtube-dl https://www.youtube.com/watch?v=7i_ecXwbtRo
-
Hi syrist ,
======== JUST for EXAMPLE !!! first step !!!
Code:
@echo off
CHCP 1252 > nul
COLOR 9F
mode con: cols=100 lines=50
set "ua=Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1"
:: set moment=%time:~0,2%:%time:~3,2%:%time:~6,2%
:: echo Today is %date% - ( connection time ) It is %moment%
:Origine
echo.
:ChoixMois
echo.
echo example : 10 for October ;
echo.
SET /P m=Enter the month :
:ChoixJour
echo.
echo example : 01 for day 1 ; 30 for day 30
echo.
SET /P j=Enter the day :
:Detail_LaD
wget -U "%ua%" --no-check-certificate "https://www.ladepeche.fr/" -O "ZD_page_www.ladepeche.fr.txt"
type ZD_page_www.ladepeche.fr.txt | sed "s#""#'#g;" > D_test_%m%_%j%.txt
There is still a lot of work after this to catch the .mpd link !!!
Cheers .JE SUIS CHARLIE !!! -
Thanks for the suggestion... but I keep getting "Unsupported URL":
Code:
youtube-dl --verbose "https://www.ctvnews.ca/ctv-national-news"
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://www.ctvnews.ca/ctv-national-news']
[debug] Encodings: locale cp1252, fs mbcs, out cp437, pref cp1252
[debug] youtube-dl version 2019.10.16
[debug] Python version 3.4.4 (CPython) - Windows-10-10.0.18362
[debug] exe versions: none
[debug] Proxy map: {}
[generic] ctv-national-news: Requesting header
WARNING: Falling back on generic information extractor.
[generic] ctv-national-news: Downloading webpage
[generic] ctv-national-news: Extracting information
ERROR: Unsupported URL: https://www.ctvnews.ca/ctv-national-news
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpfc5s43s4\build\youtube_dl\YoutubeDL.py", line 796, in extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpfc5s43s4\build\youtube_dl\extractor\common.py", line 530, in extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpfc5s43s4\build\youtube_dl\extractor\generic.py", line 3349, in _real_extract
youtube_dl.utils.UnsupportedError: Unsupported URL: https://www.ctvnews.ca/ctv-national-news
-
Hi, if I'm understanding your batch file, it essentially downloads the webpage to a text file. Unfortunately the actual manifest link isn't stored in the main page, but I did notice that the daily video's ContentID is stored there. So I should be able to use that ContentID to generate a new URL, which will then give me the video's PackageID. Then, using those two numbers, I can generate the correct manifest.mpd link. This should work... thanks for the help.
Last edited by syrist; 19th Oct 2019 at 08:20.
-
I figured it out. While the news site's main page doesn't have the actual manifest URL, it does contain the video's content ID number for the current daily episode. I was able to use that number to generate a URL that gives me the episode's package ID number. Then, using both numbers, I was able to create the final manifest URL.
So in the end, this was the process:
- use WGET to download the webpage source to a text file
- use FINDSTR and FOR loops to search and extract the contentID from the text file
- use WGET again to download the contentID URL to another text file
- use FINDSTR and FOR loops again to search and extract the packageID from the second text file
- finally use AdobeHDS to download the video
Everything fit nicely into a single batch file. -
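For anyone who would rather not do the text parsing with FINDSTR and FOR loops, the first two steps above could be sketched in Python. This is only a rough sketch, not the batch file itself: it assumes the page source contains text of the form `contentId: 1801156,` (inferred from the strings the batch file posted later in this thread strips out), and the function names `fetch_text` and `extract_content_id` are made up for illustration.

```python
import re
import urllib.request

PAGE_URL = "https://www.ctvnews.ca/ctv-national-news"
USER_AGENT = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) "
              "Gecko/20100101 Firefox/69.0")


def fetch_text(url):
    """Download a URL and return its body as text (the wget step)."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_content_id(page_source):
    """Return the first contentId in the page source as a string, or None.

    Assumed markup: the word contentId, then punctuation or spaces,
    then the digits of the ID.
    """
    match = re.search(r"contentId\W*?(\d+)", page_source)
    return match.group(1) if match else None
```

Calling `extract_content_id(fetch_text(PAGE_URL))` would then stand in for the first WGET and FINDSTR steps. Only the first match is used, the same as taking the first line of the cleaned-up text file.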
I thought this was interesting. Could you post the full batch file? Others would find it useful -- especially the text parsing. I went through the procedure and fleshed out the instructions a bit...
The page with the videos is:
Code:https://www.ctvnews.ca/ctv-national-news
Code:https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/1801156/contentpackages/3106903/manifest.mpd
[Attachment 50591]
That ContentID (1801156) is the first of the two numbers in the manifest URL. It is also used to get another page with the second number you need:
Code:https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/1801156?%24include=%5BId%2CName%2CDesc%2CShortDesc%2CType%2COwner%2CMedia%2CSeason%2CEpisode%2CGenres%2CImages%2CContentPackages%2CAuthentication%2CPeople%2COmniture%2C+revShare%5D&%24lang=en
[Attachment 50592]
That is the second number you need to build the manifest.mpd URL.
<edit---------------------------------------------------------------------------------------------------------------->
syrist suggested this simpler URL which gets an even simpler page to parse:
Code:https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/1801156/contentpackages/
[Attachment 50593]
</edit---------------------------------------------------------------------------------------------------------------->
You now have both numbers you need to build the full MPD URL and download the video.
Code:youtube-dl https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/1801156/contentpackages/3106903/manifest.mpd
Last edited by jagabo; 19th Oct 2019 at 22:07.
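The way the two numbers slot into the URLs can be written down as a small Python sketch. The URL layout is taken from the links in this post; the names `API_BASE`, `packages_url` and `manifest_url` are just illustrative, and the `extension` parameter is an assumption added so the same helper can produce a differently named manifest if a downloader needs one.

```python
API_BASE = ("https://capi.9c9media.com/destinations/ctvnews_web"
            "/platforms/desktop/contents")


def packages_url(content_id):
    """URL of the page that lists the content packages for a ContentID."""
    return f"{API_BASE}/{content_id}/contentpackages/"


def manifest_url(content_id, package_id, extension="mpd"):
    """Full manifest URL built from the ContentID and the PackageID."""
    return f"{packages_url(content_id)}{package_id}/manifest.{extension}"


print(manifest_url(1801156, 3106903))
```

With the example IDs from this post, the printed URL is the same one passed to youtube-dl above.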
-
Hey jagabo, great job on explaining! The second URL was actually much simpler... it is basically just the first half of the manifest URL, up to where the second missing number begins:
Code:https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/1801156/contentpackages/
Code:
@ECHO OFF
::--------------------------------------------------------------------------------------------------------------------
:: Download the CTV National News main page to 1_CTV_MAINPAGE.txt
::--------------------------------------------------------------------------------------------------------------------
SET "ua=Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0"
wget.exe -U "%ua%" --no-check-certificate "https://www.ctvnews.ca/ctv-national-news" -O "1_CTV_MAINPAGE.txt"
::--------------------------------------------------------------------------------------------------------------------
:: Find all instances of " contentId" and store into 2_CTV_CONTENTID-ALL.txt
::--------------------------------------------------------------------------------------------------------------------
FINDSTR /C:" contentId" "1_CTV_MAINPAGE.txt" > "2_CTV_CONTENTID-ALL.txt"
::--------------------------------------------------------------------------------------------------------------------
:: Clean up the results by removing unnecessary data and save results to 3_CTV_CONTENTID-CLEANED.txt
:: set "line=!line:(before)=(after)!"
::--------------------------------------------------------------------------------------------------------------------
setlocal
(for /f "delims=" %%i in (2_CTV_CONTENTID-ALL.txt) do (
    set "line=%%i"
    setlocal enabledelayedexpansion
    set "line=!line: =!"
    set "line=!line:contentId: =!"
    set "line=!line:, =!"
    echo(!line!
    endlocal
))>"3_CTV_CONTENTID-CLEANED.txt"
::--------------------------------------------------------------------------------------------------------------------
:: Extract only the first line from 3_CTV_CONTENTID-CLEANED.txt and save to variable contentId
::--------------------------------------------------------------------------------------------------------------------
FOR /f "delims=" %%a in (3_CTV_CONTENTID-CLEANED.txt) do set contentId=%%a&call :process
:process
echo contentId=%contentId%
::--------------------------------------------------------------------------------------------------------------------
:: Use contentId value to download the packageID info to 4_CTV_PACKAGEID.txt
::--------------------------------------------------------------------------------------------------------------------
SET "ua=Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0"
wget.exe -U "%ua%" --no-check-certificate "https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/%contentId%/contentpackages/" -O "4_CTV_PACKAGEID.txt"
::--------------------------------------------------------------------------------------------------------------------
:: Clean up the results by removing unnecessary data and save results to 5_CTV_PACKAGEID-CLEANED.txt
::--------------------------------------------------------------------------------------------------------------------
setlocal
(for /f "delims=" %%i in (4_CTV_PACKAGEID.txt) do (
    set "line=%%i"
    setlocal enabledelayedexpansion
    set "line=!line:{"Items":[{"Id":=!"
    set "line=!line:,"Name":"Adaptive8-CTVNews"}],"ItemsType":"Content Package"}=!"
    echo(!line!
    endlocal
))>"5_CTV_PACKAGEID-CLEANED.txt"
::--------------------------------------------------------------------------------------------------------------------
:: Extract only the first line from 5_CTV_PACKAGEID-CLEANED.txt and save to variable packageId
::--------------------------------------------------------------------------------------------------------------------
FOR /f "delims=" %%a in (5_CTV_PACKAGEID-CLEANED.txt) do set packageId=%%a&call :process
:process
echo packageId=%packageId%
::--------------------------------------------------------------------------------------------------------------------
:: Use both contentId and packageId to create the final manifest and download the video
:: I also renamed manifest.mpd to manifest.f4m as it works with AdobeHDS
:: Since AdobeHDS will save video to FLV, I had to use ffmpeg to change video container to mp4
::--------------------------------------------------------------------------------------------------------------------
SETLOCAL ENABLEEXTENSIONS ENABLEDELAYEDEXPANSION
SET ManifestURL=https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/%contentId%/contentpackages/%packageId%/manifest.f4m
php.exe "C:\Hacks\Files\AdobeHDS\AdobeHDS.php" --outfile "CTV" --delete --manifest "!ManifestURL!"
FFmpeg.exe -y -i "CTV.flv" -codec copy "CTV.mp4"
::--------------------------------------------------------------------------------------------------------------------
:: Finally delete all temp files
:: I also have to use the "exit" command to exit, otherwise AdobeHDS will keep re-downloading the chunks again... not sure why
::--------------------------------------------------------------------------------------------------------------------
DEL "1_CTV_MAINPAGE.txt"
DEL "2_CTV_CONTENTID-ALL.txt"
DEL "3_CTV_CONTENTID-CLEANED.txt"
DEL "4_CTV_PACKAGEID.txt"
DEL "5_CTV_PACKAGEID-CLEANED.txt"
DEL "CTV.flv"
PAUSE
EXIT
Last edited by syrist; 19th Oct 2019 at 22:00.
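The package ID clean-up in the batch file works by deleting the exact text around the number, so it would stop working if the package name ever changed. Since the contentpackages page appears to be JSON, a sketch like the following could read the number directly instead. The sample response is an assumption reconstructed from the strings the batch file removes, and `extract_package_id` is a hypothetical helper name.

```python
import json


def extract_package_id(packages_json):
    """Return the Id of the first content package listed in the response."""
    data = json.loads(packages_json)
    items = data.get("Items", [])
    if not items:
        raise ValueError("no content packages in response")
    return items[0]["Id"]


# Assumed shape of the contentpackages response (not copied from a live request).
sample = ('{"Items":[{"Id":3106903,"Name":"Adaptive8-CTVNews"}],'
          '"ItemsType":"Content Package"}')
print(extract_package_id(sample))
```

Taking the first entry in `Items` mirrors the batch file, which also keeps only the first result.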
-
Thanks for the batch file. I verified that your simpler URL for the second page works and gets you a much simpler page to parse.
Code:https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/1801156/contentpackages/
-
Hi,
Here is my attempt.
NOTE: AdobeHDS and FFmpeg are not used; streamlink is used instead.
In the batch file:
Code:"x:\python\Scripts\streamlink.exe" "hds://https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/%contentId%/contentpackages/%packageId%/manifest.f4m" 640k -o ctvnews_web_1801157.mp4
:: contentId=1801157
:: packageId=3107974
Launched manually with quality = 640k (not the best quality, because my connection is slow):
Code:E:\...\Recherche_MPD>"D:\python_2712\Scripts\streamlink.exe" "hds://https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/1801157/contentpackages/3107974/manifest.f4m" 640k -o ctvnews_web_1801157.mp4
Code:
[cli][info] Found matching plugin hds for URL hds://https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/1801157/contentpackages/3107974/manifest.f4m
[cli][info] Available streams: 300k (worst), 480k, 640k, 896k, 1280k, 1536k, 1856k (best)
[cli][info] Opening stream: 640k (hds)
[download][ctvnews_web_1801157.mp4] Written 117.3 MB (5m8s @ 422.7 KB/s)
error: Error when reading from stream: Read timeout, exiting
[cli][info] Stream ended
[cli][info] Closing currently open stream...
JE SUIS CHARLIE !!! -
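If the streamlink call is driven from a script rather than typed by hand, the command line above could be assembled roughly like this. It is a sketch under assumptions: it only reproduces the arguments shown in this post (the hds:// prefix, a quality name, and -o), it assumes a Streamlink build that still accepts the hds:// prefix, and `streamlink_command` and `download` are made-up names.

```python
import subprocess


def streamlink_command(content_id, package_id, quality="640k", exe="streamlink"):
    """Build the argument list for the streamlink call shown in this post."""
    url = ("hds://https://capi.9c9media.com/destinations/ctvnews_web"
           f"/platforms/desktop/contents/{content_id}"
           f"/contentpackages/{package_id}/manifest.f4m")
    output = f"ctvnews_web_{content_id}.mp4"
    return [exe, url, quality, "-o", output]


def download(content_id, package_id, quality="640k"):
    """Run streamlink; raises CalledProcessError if it exits with an error."""
    subprocess.run(streamlink_command(content_id, package_id, quality), check=True)
```

Passing the arguments as a list avoids the quoting problems that come with building one long command string. `download(1801157, 3107974)` would correspond to the manual run above.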
I did a download speed test comparing AdobeHDS, youtube-dl, and Streamlink:
AdobeHDS = 27 seconds
youtube-dl = 3 mins 47 seconds
Streamlink = 3 mins 42 seconds (before it pauses near the end)
All 3 downloaded the exact same stream, file size, bitrate, etc. The only difference is that AdobeHDS saved to AVS-FLV while youtube-dl and Streamlink saved to AVS-MP4. But it only took half a second for ffmpeg to change the flv container to mp4.
Not sure why there is such a huge speed difference.
Last edited by syrist; 20th Oct 2019 at 10:35.
-
Hi syrist ,
Just for fun, I have created another batch file.
Note 1: I use these programs:
_ wget, grep, sed, cut; located in the directory "E:\base"
_ streamlink, located in the directory "D:\python\Scripts"
Note 2: I search for only ONE 'contentID' (the first one).
Note 3: The 'packageId' is assumed to be 7 characters long (see: "e:\base\cut.exe" -c 17-23).
Note 4: In the line "...\streamlink.exe" "https.../manifest.f4m", adding -l info lets you read all available qualities.
Code:
@echo on
REM === your commands
SET "ua=Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0"
"e:\base\wget.exe" -U "%ua%" --no-check-certificate "https://www.ctvnews.ca/ctv-national-news" -O "1_CTV_MAINPAGE.txt"
REM ===
REM ===
:: catching all 'contentID'
type 1_CTV_MAINPAGE.txt | "e:\base\grep.exe" "contentId" > 1_1_contentID.txt
:: catching the first line with 'contentID'
type 1_1_contentID.txt | "e:\base\sed.exe" "s# ##g;" | "e:\base\sed.exe" -n "1p" > 1_2_contentID.txt
:: catching the number for 'contentID'
type 1_2_contentID.txt | "e:\base\sed.exe" "s# ##g;" | "e:\base\sed.exe" "s#contentId: ##g;" | "e:\base\cut.exe" -d"," -f 1 > 1_3_contentID.txt
:: the contentID as a variable
set /p contentId=<1_3_contentID.txt
REM ===
REM === your commands
"e:\base\wget.exe" -U "%ua%" --no-check-certificate "https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/%contentId%/contentpackages/" -O "4_CTV_PACKAGEID.txt"
REM ===
REM ===
:: catching the line with 'packageId'
type 4_CTV_PACKAGEID.txt | "e:\base\sed.exe" "s#""#'#g" | "e:\base\sed.exe" "s#,'N#|,'N#g;" | "e:\base\cut.exe" -d"|" -f 1 | "e:\base\cut.exe" -c 17-23 > 2_1_packageId.txt
:: the packageid as a variable
set /p packageId=<2_1_packageId.txt
REM ===
REM ===
:: DL with quality 640k
"D:\python\Scripts\streamlink.exe" "https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/%contentId%/contentpackages/%packageId%/manifest.f4m" 640k -o ctvnews_web_%contentId%-%packageId%.mp4
REM ===
pause
:fin
Maybe this could be an improvement?
Cheers .JE SUIS CHARLIE !!! -
The broadcast date is stored in:
Code:https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/%contentId%/
Code:{"Id":1806443,"Name":"CTV National News for Monday, October 21, 2019","Desc":"Led by Chief Anchor and Senior Editor, Lisa LaFlamme, CTV National News is Canada's #1 national newscast","ShortDesc":"Led by Chief Anchor and Senior Editor, Lisa LaFlamme, CTV National News is Canada's #1 national newscast","Type":"episode","Owner":{},"Episode":296,"AgvotCode":"E","AgvotDisclaimer":null,"QfrCode":"G","AiringOrder":"296","BroadcastDate":"2019-10-21","BroadcastTime":"23:00:00","BroadcastDateTime":"2019-10-21T23:00:00-04:00","LastModifiedDateTime":"2019-10-22T07:13:42Z","GameId":"","Album":"","Genres":[],"Keywords":[],"Tags":[],"Images":[{"Type":"thumbnail","Url":"https://images2.9c9media.com/image_asset/2019_3_29_13a0edfc-6b30-4aae-ad20-d926797b2679_png_1920x1080.jpg","Width":1920,"Height":1080}],"Authentication":{"Required":false,"Resources":null},"NextAuthentication":{"StartDate":"2019-10-29T00:00:00-04:00","EndDate":""},"RatingWarnings":[],"People":[],"Funding":null,"MusicLabels":[],"BroadcastNetworks":[]}
BTW, your batch file is great... I haven't worked with Linux commands, so it was cool to see them in action, and it worked (except I had to fix the WGET command on the 23rd line... I think you accidentally split it into a second line) -
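Since the broadcast date is in that response, it could be used to name each day's download. Here is a small sketch of that idea, assuming the response is parsed as JSON and that `BroadcastDate` is always present. The sample string is a trimmed subset of the response above, and `dated_filename` with its `prefix` and `extension` parameters is a hypothetical helper.

```python
import json


def dated_filename(content_json, prefix="CTV", extension="mp4"):
    """Build an output file name from the BroadcastDate field of the response."""
    info = json.loads(content_json)
    return f"{prefix}_{info['BroadcastDate']}.{extension}"


# Trimmed subset of the response shown above.
sample = ('{"Id":1806443,'
          '"Name":"CTV National News for Monday, October 21, 2019",'
          '"BroadcastDate":"2019-10-21","BroadcastTime":"23:00:00"}')
print(dated_filename(sample))
```

The ISO-style date (year first) also means the saved files sort in broadcast order.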
Something is wrong when posting
I can't find how to delete this reply (???)
Last edited by aazerty; 23rd Oct 2019 at 05:56.
JE SUIS CHARLIE !!! -
Hi syrist ,
Thanks for the reply .
======================
!!! given by your batch file !!! see post #8
Code:https://forum.videohelp.com/threads/394696-Any-way-to-get-a-streaming-site-s-manifest-URL-via-command-line#post2563315
Code:
REM === your commands
"e:\base\wget.exe" -U " %" --no-check-certificate "https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/%contentId%/contentpackages/" -O "4_CTV_PACKAGEID.txt"
REM ===
Obviously something went wrong here: -U "%ua%" came out as -U " %", in both wget lines.
I seem to have a problem with % when writing it here.
REPLACE ! with % (4 times in the line below):
Code:
REM ===
"e:\base\wget.exe" -U "!ua!" --no-check-certificate "https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/!contentId!/contentpackages/" -O "4_CTV_PACKAGEID.txt"
REM ===
JE SUIS CHARLIE !!! -
Jagabo,
Where does the following link come from? I can't find it anywhere.
Code:
https://capi.9c9media.com/destinations/ctvnews_web/platforms/desktop/contents/1801156?%24include=%5B -
It can be seen in the network traffic using a browser's Developer Tools. Press F12, go to the Network tab, reload the page:
[Attachment 50662]
Note the highlighted line and the URL in the popup.