I regularly download streams from the French TV channels France 5, France 2 or France 3, using Captvty to get the list of available programs and youtube-dl to download the videos (Captvty saves them in TS format, which requires an extra conversion step to get an MP4 file, more compact and with better playback). Sometimes I'm too late and the program is no longer indexed, but until recently youtube-dl still displayed the "videoID" code, which let me get the JSON page, and from there I can generally reconstruct the M3U8 link, which stays available for several weeks after the "official" availability period.
Indeed, the links are of the following form :
Code:
http://replayftv-vh.akamaihd.net/i/streaming-adaptatif/2020/S26/J4/231505795-5ef4c6faf2db4-,standard1,standard2,standard3,standard4,.mp4.csmil/master.m3u8
On the JSON page :
Code:
http://sivideo.webservices.francetelevisions.fr/tools/getInfosOeuvre/v2/?idDiffusion=c4104091-2a68-45ba-820b-7c57d02bb278
...the video links have been removed, but the subtitles links are still there, and it turns out that the "231505795-5ef4c6faf2db4-" code corresponds to the beginning of the code in the subtitles links (while the end of that code, below, seems to be a Unix timestamp) :
Code:
http://static.francetv.fr/sous-titres//2020/06/25//231505795-5ef4c6faf2db4-1593100151.vtt
Then "S26" means "26th week of the year" and "J4" means "4th day of the week"; both values can be worked out from the broadcast date. All the rest is usually identical.
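To make the reconstruction concrete, here is a rough Python sketch of the trick. It rests on several assumptions drawn from this single example: that the date folder in the subtitles link is the broadcast date, that the week and day follow ISO numbering (Monday = day 1), that the year in the path is the calendar year, and that the host and the "standard1..4" suffix never change. The function name and the regular expression are mine, for illustration only.

```python
import re
from datetime import date

# Assumed layout, copied from the one example above; other programs may differ.
# Whether weeks 1 to 9 are zero-padded ("S05" vs "S5") is unknown to me.
M3U8_TEMPLATE = (
    "http://replayftv-vh.akamaihd.net/i/streaming-adaptatif/"
    "{year}/S{week}/J{day}/{code}-"
    ",standard1,standard2,standard3,standard4,.mp4.csmil/master.m3u8"
)

# Matches .../sous-titres//2020/06/25//231505795-5ef4c6faf2db4-1593100151.vtt
SUBTITLE_RE = re.compile(
    r"/sous-titres/+(\d{4})/(\d{2})/(\d{2})/+(\d+-[0-9a-f]+)-\d+\.vtt$"
)


def m3u8_from_subtitles(subtitle_url):
    """Guess the M3U8 link from a subtitles link (hypothetical helper)."""
    m = SUBTITLE_RE.search(subtitle_url)
    if m is None:
        raise ValueError("unrecognized subtitles link: " + subtitle_url)
    year, month, day, code = m.groups()
    # ISO calendar: week number of the year and weekday with Monday = 1.
    _, iso_week, iso_day = date(int(year), int(month), int(day)).isocalendar()
    return M3U8_TEMPLATE.format(year=year, week=iso_week, day=iso_day, code=code)
```

With the subtitles link quoted above (25 June 2020, a Thursday in ISO week 26), this gives back the "S26/J4" M3U8 link from the beginning of this post.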
Up until around the middle of May 2020, when a program's availability period had ended, the video links were removed from the JSON page, so youtube-dl could no longer download the video automatically, but it still displayed the "videoID" code (8-4-4-4-12 characters; in the example above : "c4104091-2a68-45ba-820b-7c57d02bb278"), which was present in the source code of the program's original HTML page. But this is no longer the case, since the "videoID" code is now removed as well, as in this example :
Without it I have no way of getting the corresponding JSON page, so I can't use the above trick. Google's cached page can contain this code for a few days (that's how I managed to get the video in this example : as you can verify, the M3U8 link above is still valid even though the program was officially removed on 02/07/2020), but once the cached page gets updated the "videoID" is no longer there, and as far as I know only the most recently updated page can be accessed via Google Webcache.
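When a saved or cached copy of the program's HTML page still contains the code, pulling it out and building the JSON link can be sketched as follows. The function name is hypothetical, and the search simply looks for anything shaped like 8-4-4-4-12 hexadecimal characters, so a page could yield unrelated codes of the same shape that have to be tried one by one.

```python
import re

# 8-4-4-4-12 hexadecimal groups, as in the "videoID" quoted above.
VIDEO_ID_RE = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
    re.IGNORECASE,
)

# JSON page address, taken from the example earlier in this post.
JSON_URL = (
    "http://sivideo.webservices.francetelevisions.fr/tools/getInfosOeuvre/v2/"
    "?idDiffusion={video_id}"
)


def json_urls_from_html(html):
    """Return one candidate JSON link per distinct 8-4-4-4-12 code in html."""
    seen = []
    for match in VIDEO_ID_RE.findall(html):
        video_id = match.lower()
        if video_id not in seen:  # keep first-seen order, drop duplicates
            seen.append(video_id)
    return [JSON_URL.format(video_id=v) for v in seen]
```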
The question is : is there a way to retrieve that "videoID" code ? Besides Google, is there some other caching service which could keep older versions of indexed pages ? (I tried Archive.org but it doesn't archive those pages.) Or is there a way to deduce it from other available information ? It seems to be some sort of standard, since a few days ago on a completely unrelated website I found a link which was "https://www.metv.com/20979856-4dd1-43ab-b77a-8d4fe18b40e5" – same pattern, 8-4-4-4-12 characters. What could those codes represent ? Some kind of timestamp ? checksum ? something else ?
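One thing that can at least be checked: the 8-4-4-4-12 hexadecimal layout is the textual form of a UUID (RFC 4122), and Python's standard uuid module can parse such a code and report its version. The small helper below is hypothetical and only inspects the format. It says nothing about how France Télévisions actually generates its identifiers.

```python
import uuid


def describe_code(code):
    """Parse an 8-4-4-4-12 code as a UUID and return (version, variant)."""
    parsed = uuid.UUID(code)  # raises ValueError if the text is not a valid UUID
    return parsed.version, parsed.variant


for code in ("c4104091-2a68-45ba-820b-7c57d02bb278",
             "20979856-4dd1-43ab-b77a-8d4fe18b40e5"):
    print(code, describe_code(code))
```

Both codes parse as version 4 UUIDs, the version reserved for randomly generated identifiers. If they really are random, there is no timestamp or checksum in them and they can't be deduced from other data about the program.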
From now on I will keep regular backups of the whole lists of available programs, obtained from these links :
Code:
http://api-front.yatta.francetv.fr/standard/publish/taxonomies/france-2/contents/?size=900&page=0&filter=with-no-vod,only-visible&sort=begin_date:desc
http://api-front.yatta.francetv.fr/standard/publish/taxonomies/france-3/contents/?size=900&page=0&filter=with-no-vod,only-visible&sort=begin_date:desc
http://api-front.yatta.francetv.fr/standard/publish/taxonomies/france-5/contents/?size=900&page=0&filter=with-no-vod,only-visible&sort=begin_date:desc
But there are programs from April/May/June which I missed and which I would try to get if at all possible. So, failing a positive answer to the question above, if by any chance someone already backs up those lists and is reading this, it would be nice to share them... (Very unlikely I know.)
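For the regular backups of the program lists, something along these lines should be enough. It is a minimal sketch using only the Python standard library. The folder name, the file naming scheme and the ".json" extension are my own choices (I'm assuming the lists come back as JSON), and the function names are hypothetical.

```python
import urllib.request
from datetime import date
from pathlib import Path

CHANNELS = ("france-2", "france-3", "france-5")
# Same address as the three list links in this post, with the channel left open.
LIST_URL = (
    "http://api-front.yatta.francetv.fr/standard/publish/taxonomies/"
    "{channel}/contents/?size=900&page=0"
    "&filter=with-no-vod,only-visible&sort=begin_date:desc"
)


def fetch(url):
    """Download url and return the raw response body as bytes."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def backup_lists(target_dir, today=None, fetcher=fetch):
    """Save one dated file per channel and return the paths written."""
    today = today or date.today()
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for channel in CHANNELS:
        # e.g. france-2_2020-07-06.json, so daily runs never overwrite each other
        path = target / f"{channel}_{today.isoformat()}.json"
        path.write_bytes(fetcher(LIST_URL.format(channel=channel)))
        written.append(path)
    return written
```

Calling backup_lists("francetv_lists") once a day from a scheduled task would leave one file per channel per day. The fetcher parameter is only there so the download step can be swapped out for testing.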
Last edited by abolibibelot; 6th Jul 2020 at 15:02.
No clue ?