An Escape Plan...
OK! You've captured your own Content Decryption Module (CDM); tried a few sites and found you still cannot get all the keys that you expected. And you are a bit fed-up!
I scratched at that brick wall too, trying to find a way out of the dungeon. At the time it felt like the wall was made of headers!
It actually isn't.
The real problem is being able to see and understand what is going on! Headers are relatively easy to sort out, in the main.
Headers go unseen in our normal web interactions and finding a way to see what is happening is essential.
But before we do - a note from our sponsor:-
Originally Posted by Hairless Richard
There is no discussion here of pay sites like Netflix or HBO or Amazon, but the principles are the same. The forum rules forbid chat about pay-sites so please don't ask. I don't go after pay sites myself anyway - so have no knowledge to share.
In what follows, examples will come from sites that are free to access and which mainly broadcast media, in the clear, to the Universe; including towards aliens orbiting Alpha Centauri. If they can see it for free in a few light years time - why can't I see it now?
We are going to look for 'patterns'. Patterns are useful to developers: books get published with patterns of code that do a job, and developers incorporate those patterns in their own code to do the same job for them. It seems highly probable to me that Developer X will re-use code produced and shared by Developer Y.
Google also published, and then restricted access to, the Widevine API - all WV developers must conform to the needs of the API so there can only be a few patterns of interaction.
Let's start escaping the Dungeon of Despair.. but with some new tools!
Information is key. We need to see what is going on during the whole process of playing a video in a browser. After all, we want to mimic some of that process with a Python script.
Developer tools, in your browser under F12, allow some insight;
if you select Network to see the traffic;
click on the license url; and
select headers - you can see the contents quite readily.
[Attachment 67054] Basic Headers
So you can sort headers out yourself by looking and thinking.
Think: What you are trying to do is mimic the browser by sending requests to a license server to get it to send you data, so your CDM can use the data to produce keys.
OK, just think: where can you see an example of what the browser sends? (Clue: image above!)
Look: Go inside the dev tools of your browser and inspect the request headers for the license url.
Your request headers need to look like that, in the main. With experience you will be able to leave out the junk parts of the header's contents that make no difference to whether keys are obtained or not.
Headers are found by pasting the license url, copied as a cURL from the browser's dev-tools, into curlconverter.com. You want only the part
Code:
headers = { .. }
But also look inside params {..} and check whether anything there also appears in the request headers shown in your dev-tools. If there is, copy it to a new line in your headers section. Key-words to look out for are bearer, x-authorization-data, key. But, by inspecting the browser's license request you will be able to see exactly what is needed.
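As a rough sketch of that 'look for key-words' step, the snippet below scans a headers or params dictionary for the key-words mentioned above. The function name token_like_entries and the sample values are made up purely for illustration; it only inspects a dictionary and is no substitute for comparing against what your browser really sends.

```python
# Hypothetical helper: flag entries whose name or value mentions a key-word.
KEYWORDS = ("bearer", "x-authorization-data", "key")


def token_like_entries(mapping, keywords=KEYWORDS):
    """Return the names of entries that mention any key-word (case-insensitive)."""
    found = []
    for name, value in mapping.items():
        text = f"{name} {value}".lower()
        if any(word in text for word in keywords):
            found.append(name)
    return found


# Made-up sample data, not taken from any real site.
sample_headers = {
    "Accept": "*/*",
    "Authorization": "Bearer abc123",
    "X-Authorization-Data": "xyz",
}
print(token_like_entries(sample_headers))
```

Anything the helper flags in params but which is missing from your headers is worth a second look in the dev-tools view.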
There will be more on headers below - headers can be grouped from not-needed to complex.
But let's move away from browser interactions for a moment. What about our interactions using l3.py and our CDM?
The command window may show problems with the Python code where the routine came to a halt, but that is all. A typical beginner's response from l3.py with bad headers is 'Check protobufs'.
How can we see the goings on there?
Sorenb kindly pointed me to HTTPToolkit!! Get it and run it. It is the bee's knees! Scroll down to the second "Download Free for Windoze" button; clicking it reveals versions for Mac and Linux as well as Windoze.
Open it and select "Existing Terminal"
[Attachment 67073] Select Existing Terminal
Copy the code HttpToolKit provides - repeated here:-
Code:
eval "$(curl -sS localhost:8001/setup)"
into the terminal (command) window. Now any Terminal (command) interactions here will show up in the HttpToolKit window.
The code below shows an example terminal command; it will fail and give no screen output. Quite useless for understanding what is happening.
Code:
curl https://manifest.prod.boltdns.net/license/v1/cenc/widevine/1242911124001/99bf6c70-37ba-434c-9474-55dc2fde25a6/7ca86aec-55cf-4f9f-ba33-c2fccecb52fe?fastly_token=NjM1YWY0MWJfZDRkOWExYzI4NTg3OGFlZmFlMjE1NDA1MTZiZjQ3MjBlMDM0ZDg1Yzc5ODU4ZmQ3OTAwMWU3YzY4ODI2NDZjYw%3D%3D
But HttpToolKit captures the interaction between the terminal and the web.
[Attachment 67096] Method Not Allowed
The error says "Method Not Allowed": the license server was expecting a POST but a GET request was sent. So instead of no help from the Terminal, HttpToolKit shows our error.
Helpful or what?
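To make the GET versus POST point concrete, here is a minimal sketch using only Python's standard library. Nothing is sent over the network; it simply shows that a request built without a body defaults to GET, while one carrying a body becomes a POST. The url and body are placeholders of my own, not a real license server.

```python
import urllib.request

# Placeholder values for illustration only; no request is actually sent.
url = "https://example.invalid/license"
body = b"placeholder-bytes"

# Without a body the method defaults to GET (like a bare `curl <url>`).
get_request = urllib.request.Request(url)

# Supplying a body switches the default method to POST.
post_request = urllib.request.Request(url, data=body)

print(get_request.get_method())
print(post_request.get_method())
```

The bare curl command earlier had no body, so it went out as a GET, which is what the server rejected.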
There are alternatives to HttpToolkit - some recommend Charles Proxy or Fiddler; choose what you like best. For me, Charles Proxy took no account of my screen resolution - I dumped it.
The beauty of HttpToolKit is that you are not limited to one process (Terminal) to capture from. By choosing another capture process, for example a browser, we can see both ends of the problem. I chose Brave.. the Firefox version at the time wouldn't or couldn't download the Widevine browser add-on. Extensions can be added in the usual way and can persist after closure of HttpToolkit. Edit: sometimes if a browser has been updated, extensions via httptoolkit get lost and need re-installing.
Running our Brave browser now shows all the requests needed to load a page:-
[Attachment 67075] Filtered on 'manifest'
The image shows the interactions from Brave, with the last one ('?') being from my terminal.
So now we have a more rounded tool than the developer options in a browser: we can see the interactions from many sources at the same time. Adb connected devices, like a phone, may also be intercepted (a rooted phone helps).
Android Studio started life for Developers creating Phone Apps, but it comes with an ability to emulate almost any version of Android phone. The emulation may be rooted.
HttpToolKit is privy to almost all traffic from an emulated phone. The exceptions are a few apps: Banking Apps of course, but some media Apps too will not work easily with HttpToolKit.
The general process of getting decryption keys
There is a range of interaction types that permit the getting of keys, depending on the Content Delivery Network.
The diagram further below, after the AI's answer, shows the general Widevine interactions with a cloud server - I found this on the web; ignore pollycon - it is just a trade name.
I asked Google's Gemini to explain the key interactions between a User and a Server needed before the User could play a video. It took a while to focus the AI's attention but eventually I got this:-
Widevine Key Exchange Process for Video Playback
1. Initial Setup and Public Key Exchange:
The user's device establishes a connection with the content server.
The server sends its public key to the user's device. This public key is used to encrypt sensitive information, such as the license.
2. Video Request and Content Encryption:
The user requests to play a specific video.
The server verifies the user's authorization and encrypts the video using a strong encryption algorithm (e.g., AES).
3. License Generation and Encryption:
The server generates a license that grants the user permission to play the encrypted video.
The server uses its private key to encrypt the license. This ensures that only the user with the corresponding public key can decrypt it.
4. License Delivery:
The server sends the encrypted license to the user's device.
5. License Decryption:
The user's device uses the server's public key to decrypt the encrypted license. This reveals the information required to obtain the content key.
6. Content Key Generation and Delivery:
The server generates a content key that is specific to the encrypted video.
The server encrypts the content key using a symmetric encryption algorithm (e.g., AES) and sends it to the user's device.
7. Content Key Decryption:
The user's device uses the information from the license to decrypt the encrypted content key.
8. Content Decryption:
The user's device uses the decrypted content key to decrypt the encrypted video.
9. Playback:
The decrypted video is played on the user's device.
Key Points:
Public Key Cryptography: Widevine relies on public key cryptography for key exchange and license encryption. This ensures that only the intended recipient (the user) can decrypt the license and access the content key.
Symmetric Encryption: The content itself is typically encrypted using a symmetric encryption algorithm like AES. This provides strong encryption while allowing for efficient decryption.
License Protection: The license is encrypted to prevent unauthorized access and modification.
Content Key Protection: The content key is also protected through encryption to prevent unauthorized decryption of the video.
By following this process, Widevine ensures that only authorized users can access and play encrypted content, protecting the intellectual property of content creators and distributors.
This is THE pattern - remember.
[Attachment 67056]
General Widevine Process outlined. The pattern the developer follows:-
Starting top right and reading downwards, the vertical lines indicate the passage of time; the diagram shows the processes happening along that time-line and the interactions between Client, Content Delivery Network (CDN) and the License Server. So, for the moment ignoring the box labelled opt - which is a Widevine option - we see that the browser, or our scripted code that mimics the browser, needs to:-
- Send a request with a token to the license server
- Receive the issued license and pass it to the Content Decryption Module to provide keys
One question now is:- where does the token go?
Let's look at that process in more detail with respect to tokens.
Many basic sites add the token to a url:
for example:-
But a token may be passed in a number of ways - the list below shows some of the different requirements or code-patterns to obtain keys using some version of Pywidevine. It is a start and not definitive:-
Some CDN site requirements - as an overview - are:
- pssh + license url + token in the url; headers are basic and some, apparently, ignored e.g. uktvplay.co.uk, tg4.ie, some of stv.tv
- pssh + license url + token + headers with specific requirements e.g. channel5.com
If the referrer isn't channel5 you won't get your keys.
- pssh + license + headers with a token as x-custom-data e.g. npostart.nl (I am told the x-custom-data expires in 45 seconds! So a script is needed to do the interactions).
- pssh + license + headers + a constructed json packet (with an extra interchange before the final license is delivered, to be decoded in the CDM to provide keys) e.g. channel4.com, rte.ie
Note the interchange for this method is the 'opt' variation in the above diagram.
In more detail...
In The Temple of Doom we use curlconverter.com to explore the licence headers. Copying the licence request as a cURL allows us to look more closely at header types.
As an aside: I am told copying as cURL is different on Windoze
copy cURL > copy as cURL(Posix) for Firefox and Win 11
copy cURL > copy as curl (Bash) for Chrome.
I inhabit a Linux world and don't see such complication.
We can also see the Request Headers in HttpToolkit for any request made to a server.
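If you want to peek at the headers in a cURL copy without leaving Python, the sketch below is one possible way, assuming a Bash/POSIX-style copy (single-quoted arguments). The function name curl_to_url_and_headers and the sample command are my own illustrations; a Windows cmd-style copy uses different quoting and is not handled here.

```python
import shlex


def curl_to_url_and_headers(curl_command):
    """Split a Bash/POSIX-style cURL copy into its URL and a headers dict."""
    # Join backslash-continued lines so shlex sees a single command line.
    tokens = shlex.split(curl_command.replace("\\\n", " "))
    url, headers = None, {}
    i = 1  # tokens[0] is the word 'curl'
    while i < len(tokens):
        if tokens[i] in ("-H", "--header") and i + 1 < len(tokens):
            # A header looks like 'Name: value'; split on the first colon only.
            name, _, value = tokens[i + 1].partition(":")
            headers[name.strip()] = value.strip()
            i += 2
            continue
        if url is None and tokens[i].startswith(("http://", "https://")):
            url = tokens[i]
        i += 1
    return url, headers


# Made-up command for illustration only.
sample = (
    "curl 'https://example.invalid/license?token=abc' "
    "-H 'Accept: */*' -H 'Origin: https://example.invalid' --data-raw 'xyz'"
)
url, headers = curl_to_url_and_headers(sample)
print(url)
print(headers)
```

Using shlex rather than hand-written pattern matching lets the standard library deal with the shell quoting.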
Looking more closely at the header patterns above and referring to the patterns by number:
- pssh, license and basic headers: I use uktvplay as an example.
l3.py works with a basic set of headers. Curlconverter.com's output is below:-
Code:
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36 Edg/101.0.1210.39',
    'Accept': '*/*',
    'Accept-Language': 'en-GB,en;q=0.5',
    # 'Accept-Encoding': 'gzip, deflate, br',
    'Content-type': 'application/octet-stream',
    'Origin': 'https://uktvplay.co.uk',
    'DNT': '1',
    'Connection': 'keep-alive',
    'Referer': 'https://uktvplay.co.uk/',
    'Sec-Fetch-Dest': 'empty',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Site': 'cross-site',
    # Requests doesn't support trailers
    # 'TE': 'trailers',
}

params = {
    'fastly_token': 'NjM1ZmE2MTlfMjM1ZGUwNTNkMzc1ZWEyZjA4ODU4YWMwOWFlODlkNTRlZWQwNDAwODVlODc2ZDRhN2U2N2ZlYTJlZGI5NGNhMQ==',
}

data = 'Cut'

response = requests.post('https://manifest.prod.boltdns.net/license/v1/cenc/widevine/1242911124001/251dcec2-c1f2-4ade-abc2-a2258f2326d0/56686d88-a8b6-4fee-a08a-925902914abc', params=params, headers=headers, data=data)
Looking at the cURL converter output, there seems to be a token in the copy that curlconverter gave, but it is labelled params. It looks important with a token in there. Why don't we need params then? After all, we only use the bit 'headers = {...}' in our headers.py file.
Let's look at the licence url:-
Code:
https://manifest.prod.boltdns.net/license/v1/cenc/widevine/1242911124001/251dcec2-c1f2-4ade-abc2-a2258f2326d0/56686d88-a8b6-4fee-a08a-925902914abc?fastly_token=NjM1ZmE2MTlfMjM1ZGUwNTNkMzc1ZWEyZjA4ODU4YWMwOWFlODlkNTRlZWQwNDAwODVlODc2ZDRhN2U2N2ZlYTJlZGI5NGNhMQ%3D%3D
And as a parameter at the end of the url is:-
fastly_token=NjM1ZmE2MTlfMjM1ZGUwNTNkMzc1ZWEyZjA4ODU4YWMwOWFlODlkNTRlZWQwNDAwODVlODc2ZDRhN2U2N2ZlYTJlZGI5NGNhMQ%3D%3D
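To see for yourself that the params block is just the tail of the url, here is a small sketch with Python's standard urllib.parse. The function name split_license_url and the example url are placeholders of mine, not from any site; the point is only that the query string and curlconverter's params dictionary carry the same information.

```python
from urllib.parse import parse_qs, urlsplit


def split_license_url(license_url):
    """Return (url_without_query, params_dict) for a url carrying a token."""
    parts = urlsplit(license_url)
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # parse_qs returns a list per name; keep the first value of each.
    params = {name: values[0] for name, values in parse_qs(parts.query).items()}
    return base, params


# Placeholder url for illustration; %3D%3D is the url-encoded form of '=='.
base, params = split_license_url(
    "https://example.invalid/license/abc?fastly_token=QUJD%3D%3D&x=1"
)
print(base)
print(params)
```

So a request sent to the full url already carries the token, whether or not params is passed separately.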
So to answer our question;
.... if the url carries the tokens then the headers.py can simply be:
Code:
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36 Edg/101.0.1210.39',
    'Accept': '*/*',
    'Accept-Language': 'en-GB,en;q=0.5',
    # 'Accept-Encoding': 'gzip, deflate, br',
    'Content-type': 'application/octet-stream',
    'Origin': 'https://uktvplay.co.uk',
    'DNT': '1',
    'Connection': 'keep-alive',
    'Referer': 'https://uktvplay.co.uk/',
    'Sec-Fetch-Dest': 'empty',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Site': 'cross-site',
    # Requests doesn't support trailers
    # 'TE': 'trailers',
}
Unless, of course, you see the browser sending a token in the headers too! So use your eyes, spot what the browser sends and ensure you send similar.
- pssh license and headers: I use channel5.com as an example.
Code:
https://cassie.channel5.com/api/v2/licences/widevine/208/C5273420001?expiry=1664877498&tag=36323362323161363462613635613037363534376230346534663433653537376636303738363438
Code:
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36 Edg/101.0.1210.39',
    'Accept': '*/*',
    'Accept-Language': 'en-GB,en;q=0.5',
    # 'Accept-Encoding': 'gzip, deflate, br',
    'Origin': 'https://www.channel5.com',
    'DNT': '1',
    'Connection': 'keep-alive',
    'Referer': 'https://www.channel5.com/',
    'Sec-Fetch-Dest': 'empty',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Site': 'same-site',
    'Content-Type': 'application/x-www-form-urlencoded',
}

params = {
    'expiry': '1664877498',
    'tag': '36323362323161363462613635613037363534376230346534663433653537376636303738363438',
}

data = '<snipped as it was an encrypted message from the browser>}'

response = requests.post('https://cassie.channel5.com/api/v2/licences/widevine/208/C5273420001', params=params, headers=headers, data=data)
[Attachment 67055] Channel 5 Headers
Looking at the curlconverter output for the parameters carried on the url, it lists the 'tag' (token) and also an expiry time. Expiry times are another trap to fall into and I will deal with them in a little more detail later.
- X-custom-data passed in header: I use npostart.nl as an example
[Attachment 67057] npostart.nl showing x-custom-data
And that was simple enough to achieve. But site developers seem to want to prevent us taking a copy - the x-custom-data is limited to a 45 second lifetime. There is a working script by Pkp on Notaghost's site if your Dutch is fluent and you want access to npostart.nl's output.
- Option: License Token Integration: The last of our list is the most complex and requires a special script. It is the 'opt box' from the WV process diagram above.
For example rte.ie/player/
I have started my browser from HttpToolKit and navigated to the RTE site. I found a video to play, then cleared all the earlier interactions. Then played the video. The process is below:-
[Attachment 67092] rte.ie/player/ with the mpd link selected and pssh highlighted
But we are more concerned with the license interactions:
[Attachment 67093] The license request contains json code - highlighted.
Let us look at that a bit closer
[Attachment 67094] A json packet sent with the request to the license server.
How to do that?
In my early days I wondered why I couldn't just send all the stuff I see the browser sending, until I realised that all the browser sends is encrypted for the browser's CDM, not ours. In our scripts we mimic the browser but make our own calls to the license server using data created by our CDM.
Partial code for copying:
Code:
def WV_Function(pssh, lic_url, cert_b64=None):
    """
    Func, emulates license request and then decrypts obtained license.
    Fields that changes every new request is signature, expirationTimestamp, watchSessionId, puid, and rawLicenseRequestBase64
    Every DRM provider provides its own
    1.mechanism to create a license request (using the KeyID, device identifier, signing the request, etc.)
    2.mechanism to understand the license response received from the DRM License Server \
    (the response is encrypted too) and extract the decryption key.
    3.rules around storing the license locally on the client, license renewal, expiry, etc.
    """
    '''First create a session passing pssh to our CDM'''
    wvdecrypt = WvDecrypt(init_data_b64=pssh, cert_data_b64=cert_b64, device=deviceconfig.device_android_generic)
    '''ask for an initial encrypted WidevineChallenge string to use in our call to the
    license server produced using pssh and data about our CDM - see point 1 above'''
    raw_request = wvdecrypt.get_challenge()
    request = b64encode(raw_request)
    # rte.ie support
    # read headers.py produced from cURL of up-to-date license
    # to extract releasePid from RTE licence url
    myjson = json.loads(headers.data)
    pid = myjson['getWidevineLicense']['releasePid']
    responses.append(requests.post(url=lic_url, headers=headers.headers,
        json={
            "getWidevineLicense": {
                'releasePid': pid,
                'widevineChallenge': str(request, "utf-8")
            },
        }))
same code for reading:
[Attachment 67103] Rte.ie code
I took this snippet from https://github.com/medvm/widevine_keys/blob/main/l3.py and added comments to try to make sense of the process.
A big thank you to Medvm.
The code above mimics the license request - see below.
Other patterns of license interaction are included in his l3.py version. Select the one you want that suits the site you are after. Comment out or remove all the other methods otherwise you will get a python error.
Edit: Medvm noticed that 'releasePid' is the same given to the browser as would be to us. He chooses to take it from the full curl copy of the license making a liar of me saying that headers.data is irrelevant. But note the Widevine Challenge string is generated by our CDM. /edit
Channel 4 uses an initial interaction to get a license from the license server to get a key-decode-license from the license server! Again a json packet is exchanged. Find an example on the web - or here in the forum and adapt for your own use.
There will be riffs on the above all set-up to prevent access but the patterns will be similar. With the right tools you can follow along and detect what is happening.
It is a thoughtful process. Do not expect it to be easy.
The final wrinkle is time.
Some sites severely limit how long you may access an mpd or call the license server after being served the link. There will be a time-string in the url, and if you find your request failing and l3 complaining, you might have run out of time.
Here is a short rough and ready timestamp reader called mrwolf.py
Code:
'''
timestrings from license/mpd urls can be decoded
For example:
Channel5.com url 'expiry=1664793250' gave a 24 hour expiry time.
https://cassie.channel5.com/api/v2/licences/widevine/208/C5433700001?expiry=1664793250&tag=3565343430643230353865363735623135633561633761366232626433316433366238653239383
Channel4.com (mpd)
https://cf.jos.c4assets.com/CH4_42_6_900_72841002001001_001/CH4_42_6_900_72841002001001_001_J01.ism/
stream.mpd?c3.ri=13631889843531360042&mpd_segment_template=time&filter=(type=="video"&&
((DisplayHeight>=288)&&(systemBitrate<4800000)))||type!="video"&
ts=1664707499&e=600&st=RTQ-kSJkavQpKB2y8vxNIQP2lyvzB3Zs2QEVFyS5kWk
gives a 'now' timestamp (ts) with 600s (10 minutes) extension (e).
So yt-dlp fails if not used within time.
'''
import time

t = time.localtime()
current_time = time.strftime("%Y-%m-%d %H:%M:%S", t)
timestamp = input("Enter the timestamp \n>")
my_time = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(int(timestamp)))
print(f"The time now is: {current_time}")
if "1970" in my_time:
    my_time = time.strftime('%H:%M:%S', time.localtime(int(timestamp)))
    print(f"h:m:s remaining: {my_time}")
else:
    print(f"The timestamp is: {my_time}")
It gives a 'Now' output time as a Date string and the Timestamp time or a short remaining time - depending on input. So if you see a timestamp, Mr Wolf will tell the time. Only don't expect it to say 'dinner time' and chase you!
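If you would rather have the same idea as a function you can call from a script, here is a minimal sketch. It assumes the value is a Unix timestamp in seconds, as in the Channel 5 'expiry' example; the function name describe_expiry is mine, and working in UTC rather than local time is a choice I made for the sketch.

```python
from datetime import datetime, timezone


def describe_expiry(expiry, now=None):
    """Return (expiry as an ISO date string in UTC, whole seconds remaining)."""
    if now is None:
        now = datetime.now(timezone.utc)
    expires_at = datetime.fromtimestamp(int(expiry), tz=timezone.utc)
    remaining = int((expires_at - now).total_seconds())
    return expires_at.isoformat(), remaining


# Pretend 'now' is ten minutes before the expiry quoted in the mrwolf.py notes.
pretend_now = datetime.fromtimestamp(1664793250 - 600, tz=timezone.utc)
when, seconds_left = describe_expiry("1664793250", now=pretend_now)
print(when, seconds_left)
```

A negative number of seconds remaining means the link has already expired and a fresh one is needed from the browser.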
Gotcha
Increasingly there are traps for the unwary. Pywidevine is a Python module now at version 1.5. It insists on installing the Python module 'protobuf 4.x.x', which breaks a lot of things.
EDIT: There is now a way to deal with protobuf 3.x.x being incompatible with protobuf 4.x.x; see https://forum.videohelp.com/threads/409040-Correcting-Protobuf-Downgrade-to-3-19-0-error
My thread "Decryption And The Temple of Doom" showed the use of WKS-KEYS - which had its own included scripts for pywidevine.
Some code released in the forum uses pywidevine installed by Python pip as a module. The module is then available globally within your system.
Having local and global modules with the same name creates Python problems; the documented Python method of defining a local module seemed not to work for me. Choose one system and stick to it.
EDIT: If the protobufs used in WKS-KEYS are compiled to version 4 (see https://forum.videohelp.com/threads/409040-Correcting-Protobuf-Downgrade-to-3-19-0-error) then the pywidevine module and WKS-KEYS can exist together. Just ensure L3 is in the WKS-KEYS path - it stops confusion between the pywidevine/L3/cdm path and the global module pywidevine/cdm path.
I tend to get lazy and on curlConverter.com select to copy the whole curl rather than just headers{}. So in my headers.py file there may be headers{}, params{} and data{}. In that case in my python code I address headers=headers.headers (and data=headers.data for the rte.ie example - otherwise not used). Simple once you know.
That is all I know!
As usual please obey netiquette and ask questions here in this thread, or forum for general queries, and not by PM
Last edited by A_n_g_e_l_a; 18th Aug 2024 at 07:41. Reason: Added AI response to questions about Widevine key exchange.
-
Here is the grand-daddy of l3s, this L3.py gets everything.
It is a bold claim but any site that uses widevine and uses an mpd and cURL for its media delivery and encryption communications - this will download it.
PHP Code:
# A_n_g_e_l_a June:2024
# Uses a code fragment from Obo on Videohelp which is rather novel.
# The method of taking the browser's license request and just swapping their cdm challenge for ours,
# leaving other stuff as is, seems to be very powerful, with potentially wide application.
"""
Effectively this is the grown-up version of l3.py with inputs of mpd and cURL of license URL. If no pssh is found in the mpd then
it will either generate one from the Default_KID in the mpd. Or in rare cases if no Default _Kid is found
in the mpd, then it will download an init.m4f - the first video fragment - and extract the PSSH from it.
Retrieves the keys from a license server using the provided mpd and cURL of the license URL.
Args:
mpd (url): The PSSH (Protection System Specific Header) of the content is extracted from the MPD url.
cURL of licence (str): The cURL of the license server request.
Returns:
str: A string containing the keys in the format "--key <kid>:<key>".
str: A string of the N_m3u8DL-RE command for command line use.
list: A list of strings of the N_m3u8DL-RE command for python process.run() use
Optional: .
Runs: N_m3u8DL-RE command to download the video.
Raises:
httpx.HTTPStatusError: If the HTTP request to the license server fails.
"""
# uses N_m3u8DL-RE, ffmpeg, mkvmerge and mp4decrypt
# see https://github.com/nilaoda/N_m3u8DL-RE
# see https://www.videohelp.com/software/ffmpeg
# see https://www.videohelp.com/software/MKVToolNix
# see https://www.bento4.com/downloads/
from pywidevine.cdm import Cdm
from pywidevine.device import Device
from pywidevine.pssh import PSSH
import base64
from base64 import b64encode
import httpx
import re
import urllib.parse
import codecs
import getpass
import xml.etree.ElementTree as ET
import subprocess
import os
from pathlib import Path
from termcolor import colored
import pyfiglet as PF
####################
#BE SURE TO EDIT THIS !!!!
####################
#WVD_PATH = "./WVD/google_aosp_on_ia_emulator_14.0.0_dcd562de_4464_l3.wvd"
WVD_PATH = "./device.wvd"
global header, data
# Widevine System ID
WIDEVINE_SYSTEM_ID = 'EDEF8BA9-79D6-4ACE-A3C8-27DCD51D21ED'
def fetch_mpd_content(url):
    response = httpx.get(url)
    response.raise_for_status()  # Ensure we notice bad responses
    return response.text

def find_default_kid_with_regex(mpd_content):
    # Regular expression to find cenc:default_KID (upper or lower case hex)
    match = re.search(r'cenc:default_KID="([A-Fa-f0-9-]+)"', mpd_content)
    if match:
        return match.group(1)
    return None

def extract_or_generate_pssh(mpd_content):
    # Parse the MPD content using ElementTree
    # deal with:-
    #   the cenc namespace variations
    #   the default_KID
    # Provide a regex fallback
    try:
        tree = ET.ElementTree(ET.fromstring(mpd_content))
        root = tree.getroot()
        # Namespace map to handle the cenc namespace
        namespaces = {
            'cenc': 'urn:mpeg:cenc:2013',
            '': 'urn:mpeg:dash:schema:mpd:2011'
        }
        # Extract cenc:default_KID using XML parsing
        default_kid = None
        for elem in root.findall('.//ContentProtection', namespaces):
            scheme_id_uri = elem.attrib.get('schemeIdUri', '').upper()
            if scheme_id_uri == 'URN:MPEG:DASH:MP4PROTECTION:2011':
                # ElementTree stores namespaced attributes as '{namespace-uri}name'
                default_kid = elem.attrib.get('{urn:mpeg:cenc:2013}default_KID')
                if default_kid:
                    print(f"Found default_KID using XML parsing: {default_kid}")
                    break
        # If default_kid is not found using XML parsing, use regex
        if not default_kid:
            default_kid = find_default_kid_with_regex(mpd_content)
            if default_kid:
                print(f"Found default_KID using regex: {default_kid}")
        # Extract Widevine cenc:pssh
        pssh = None
        for elem in root.findall('.//ContentProtection', namespaces):
            scheme_id_uri = elem.attrib.get('schemeIdUri', '').upper()
            if scheme_id_uri == f'URN:UUID:{WIDEVINE_SYSTEM_ID}':
                pssh_elem = elem.find('cenc:pssh', namespaces)
                if pssh_elem is not None:
                    pssh = pssh_elem.text
                    print(f"Found pssh element: {pssh}")
                    break
        if pssh is not None:
            return pssh
        elif default_kid is not None:
            # Generate pssh from default_kid
            default_kid = default_kid.replace('-', '')
            s = f'000000387073736800000000edef8ba979d64acea3c827dcd51d21ed000000181210{default_kid}48e3dc959b06'
            return b64encode(bytes.fromhex(s)).decode()
        else:
            # No pssh or default_KID found
            try:
                pssh = get_pssh_from_mpd(mpd_url)  # init.m4f method
                return pssh
            except:
                return None
    except ET.ParseError as e:
        print(f"Error parsing MPD content: {e}")
        return None
    except httpx.HTTPError as e:
        print(f"Error fetching MPD content: {e}")
        return None

def get_key(pssh, license_url):
    """
    Retrieves a license key for a given PSSH and license URL.
    Args:
        pssh (str): The PSSH value.
        license_url (str): The URL of the license server.
    Returns:
        str: A string containing the license keys, separated by newlines.
    Raises:
        httpx.HTTPStatusError: If there is an HTTP status error while making the request.
    Note:
        This function uses the Cdm class to interact with the device and retrieve the license key.
        It first calls the `get_license_challenge` method of the Cdm instance to obtain the challenge.
        If the `data` parameter is not None, it modifies the challenge based on the pattern found in `data`.
        It then prepares the payload by using the modified challenge or the original challenge if `data` is None.
        The payload is sent to the license server using an HTTP POST request.
        The response content is then parsed to extract the license content
        The license content is then parsed using the `parse_license` method of the Cdm instance.
        The `get_keys` method of the Cdm instance is then used to retrieve the license keys.
        The license keys are returned as a string separated by newlines.
    """
    device = Device.load(WVD_PATH)
    cdm = Cdm.from_device(device)
    session_id = cdm.open()
    challenge = cdm.get_license_challenge(session_id, PSSH(pssh))
    if data:
        # deal with sites that need to return data with the challenge
        if match := re.search(r'"(CAQ=.*?)"', data):  # fix for windows
            challenge = data.replace(match.group(1), base64.b64encode(challenge).decode())
        elif match := re.search(r'"(CAES.*?)"', data):
            challenge = data.replace(match.group(1), base64.b64encode(challenge).decode())
        elif match := re.search(r'=(CAES.*?)(&.*)?$', data):
            b64challenge = base64.b64encode(challenge).decode()
            quoted = urllib.parse.quote_plus(b64challenge)
            challenge = data.replace(match.group(1), quoted)
    # Prepare the final payload: the raw challenge, or the browser's data
    # with our challenge swapped in by the block above
    payload = challenge
    license_response = httpx.post(url=license_url, data=payload, headers=headers)
    try:
        license_response.raise_for_status()
    except httpx.HTTPStatusError as e:
        raise e
    license_content = license_response.content
    try:
        # if content is returned as JSON object:
        match = re.search(r'"(CAIS.*?)"', license_response.content.decode('utf-8'))
        if match:
            license_content = base64.b64decode(match.group(1))
    except Exception:
        pass
    # Ensure license_content is in the correct format
    if isinstance(license_content, str):
        license_content = base64.b64decode(license_content)
    cdm.parse_license(session_id, license_content)
    keys = []
    for key in cdm.get_keys(session_id):
        if key.type == 'CONTENT':
            keys.append(f"--key {key.kid.hex}:{key.key.hex()}")
    cdm.close(session_id)
    return "\n".join(keys)

def parse_curl(curl_command):
    """
    Parse a cURL command and extract the URL, HTTP method, headers, and data.
    Parameters:
        curl_command (str): The cURL command string.
    Returns:
        tuple: A tuple containing the URL, method, headers (as a dictionary), and data.
    """
    # Extract URL
    url_match = re.search(r"curl\s+'(.*?)'", curl_command)
    url = url_match.group(1) if url_match else ""
    print(f"URL: {url}")
    # Extract method
    method_match = re.search(r"-X\s+(\w+)", curl_command)
    method = method_match.group(1) if method_match else "GET"
    print(f"Method: {method}")
    # Extract headers
    headers = {}
    headers_matches = re.findall(r"-H\s+'([^:]+):\s*(.*?)'", curl_command)
    for header in headers_matches:
        headers[header[0]] = header[1]
    print(f"Headers: {headers}")
    # Extract data
    data_match = re.search(r"--data(?:-raw)?\s+(?:(\$?')|(\$?{?))(.*?)'", curl_command, re.DOTALL)
    if data_match:
        raw_prefix = data_match.group(1)
        data = data_match.group(3)
        if raw_prefix and raw_prefix.startswith('$'):
            data = None
        else:
            # Replace escaped sequences if needed
            data = data.replace('\\\\', '\\').replace('\\x', '\\\\x')
            print(f"Escaped Data: {data}")
            # Decode the escaped sequences
            try:
                data = codecs.decode(data, 'unicode_escape')
            except Exception as e:
                print(f"Error decoding data: {e}")
                data = ""
    else:
        data = ""
    print(f"Data: {data}")
    return url, method, headers, data
# deal with getting pssh from init.m4f as last resort
def find_wv_pssh_offsets(raw: bytes) -> list:
    offsets = []
    offset = 0
    while True:
        offset = raw.find(b'pssh', offset)
        if offset == -1:
            break
        size = int.from_bytes(raw[offset-4:offset], byteorder='big')
        if offset < 4 or size < 8:
            # not a real box header; skip it (a zero size would otherwise loop forever)
            offset += 4
            continue
        pssh_offset = offset - 4
        offsets.append(raw[pssh_offset:pssh_offset+size])
        offset += size
    return offsets

def to_pssh(content: bytes) -> list:
    wv_offsets = find_wv_pssh_offsets(content)
    return [base64.b64encode(wv_offset).decode() for wv_offset in wv_offsets]

def extract_pssh_from_file(file_path: str) -> list:
    print('Extracting PSSHs from init file:', file_path)
    return to_pssh(Path(file_path).read_bytes())
def get_pssh_from_mpd(mpd: str):
    print("Extracting PSSH from MPD...")
    yt_dl = 'yt-dlp'
    init = 'init.m4f'
    files_to_delete = [init]
    for file_name in files_to_delete:
        if os.path.exists(file_name):
            os.remove(file_name)
            print(f"{file_name} file successfully deleted.")
    try:
        subprocess.run([yt_dl, '-q', '--no-warning', '--test', '--allow-u', '-f', 'bestvideo[ext=mp4]/bestaudio[ext=m4a]/best', '-o', init, mpd])
    except FileNotFoundError:
        print("yt-dlp not found. Trying to download it...")
        subprocess.run(['pip', 'install', 'yt-dlp'])
        import yt_dlp
        ydl_opts = {
            'format': 'bestvideo[ext=mp4]/bestaudio[ext=m4a]/best',
            'allow_unplayable_formats': True,
            'user_agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3',
            'no_warnings': True,
            'quiet': True,
            'outtmpl': init,
            'no_merge': True,
            'test': True,
        }
        with yt_dlp.YoutubeDL(ydl_opts) as ydl:
            # pass the MPD URL given to this function ('url' was undefined at this point)
            info_dict = ydl.extract_info(mpd, download=True)
            url = info_dict.get("url", None)
            if url is None:
                raise ValueError("Failed to download the video")
            video_file_name = ydl.prepare_filename(info_dict)
    pssh_list = extract_pssh_from_file('init.m4f')
    pssh = None
    for target_pssh in pssh_list:
        if 20 < len(target_pssh) < 220:
            pssh = target_pssh
    print(f'\n{pssh}\n')
    # with open("pssh.txt", "a") as f:
    #     f.write(f"{pssh}\n {mpd}\n")
    for file_name in files_to_delete:
        if os.path.exists(file_name):
            os.remove(file_name)
            print(f"{file_name} file successfully deleted.")
    return pssh
if __name__ == "__main__":
    title = PF.figlet_format(' L 3 ', font='smslant')
    print(colored(title, 'green'))
    strapline = "A Generic L3 Downloader:\n"
    print(colored(strapline, 'red'))
    strapline = "For DRM content only.\n\n"
    print(colored(strapline, 'red'))
    print('Prepare three inputs.\n 1. MPD URL\n 2. cURL of license server request\n 3. Video name\n\n')
    mpd_url = input("MPD URL? ")
    mpd_content = fetch_mpd_content(mpd_url)
    if mpd_content:
        pssh = extract_or_generate_pssh(mpd_content)
        print("Extracted or generated PSSH:", pssh)
    else:
        # without a PSSH the later steps would fail with a NameError, so stop here
        raise SystemExit("Failed to fetch or parse MPD content.")
    print("\nPaste cURL of license server request:")
    print("And press ENTER. \ncURL is saved but not displayed.")
    # Use getpass to hide the input as data pasted to screen can play havoc otherwise
    # DO NOT USE cURL = input("cURL? ") here!!
    cURL = getpass.getpass(prompt="cURL? ")
    # extract license URL, method, headers, and data
    lic_url, method, headers, data = parse_curl(cURL)
    # get key from pssh and license URL
    key_results = get_key(pssh, lic_url)
    print('\n' + key_results + '\n')
    # ask user for video name
    video = input("Save Video as? ")
    # show the N_m3u8DL-RE command that will download the video
    print(f"\nN_m3u8DL-RE '{mpd_url}' {key_results} --save-name {video} -M format=mkv")
    # Split key_results into lines and then split each line into components
    key_components = []
    for line in key_results.strip().split('\n'):
        # Split each line by spaces and add the components to the key_components list
        key_components.extend(line.split())
    # Build the command list
    m3u8dl = 'N_m3u8DL-RE'
    command = [
        m3u8dl,            # The command to run
        mpd_url,           # First argument
        *key_components,   # Unpack key_components list into individual arguments
        '--save-name',     # Additional fixed argument
        video,             # Value for the save-name argument
        #'--save-dir',     # uncomment and add save path in quotes
        #'C:/Users/User/Downloads/',  # Value for the save-dir argument
        '-M',              # Additional fixed argument to mux
        'format=mkv',      # Value for the format argument may also be mp4
    ]
    print(f"\n{command}\n")
    input("Press Enter to run the download-command or ctrl+C to exit.")
    subprocess.run(command)
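The init-file fallback in the script finds boxes by searching the raw bytes for the text 'pssh'. As an alternative, here is a minimal sketch that walks the MP4 box structure instead (a 4-byte size, then a 4-byte type) and keeps only the pssh boxes that carry the Widevine system ID. The function names iter_boxes and find_widevine_pssh are my own for illustration and are not part of the script above. It assumes the pssh boxes sit directly inside moov or moof.

```python
import base64
import struct

# Widevine's registered DRM system ID, stored in every Widevine pssh box
WIDEVINE_SYSTEM_ID = bytes.fromhex("edef8ba979d64acea3c827dcd51d21ed")


def iter_boxes(data, start=0, end=None):
    """Yield (type, box_start, payload_start, box_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:
            # a size of 1 means a 64-bit size follows the type field
            if pos + 16 > end:
                break
            size = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
            header = 16
        elif size == 0:
            # a size of 0 means the box runs to the end of its container
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed or truncated box: stop rather than guess
        yield box_type, pos, pos + header, pos + size
        pos += size


def find_widevine_pssh(data):
    """Return base64 strings of the Widevine pssh boxes found in moov/moof."""
    found = []
    for box_type, _, payload, box_end in iter_boxes(data):
        if box_type not in (b"moov", b"moof"):
            continue
        for inner_type, inner_start, inner_payload, inner_end in iter_boxes(data, payload, box_end):
            # pssh payload layout: 1 byte version, 3 bytes flags, 16 bytes system ID
            system_id = data[inner_payload + 4:inner_payload + 20]
            if inner_type == b"pssh" and system_id == WIDEVINE_SYSTEM_ID:
                found.append(base64.b64encode(data[inner_start:inner_end]).decode())
    return found
```

Checking the system ID does the job that the length test (between 20 and 220 characters) approximates in the script, since an init file can also hold pssh boxes for other DRM systems.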
requirements.txt
Code:
httpx==0.27.0
pyfiglet==1.0.2
pywidevine==1.8.0
termcolor==2.4.0
Install the pip modules:
Code:
pip install -r requirements.txt
Last edited by A_n_g_e_l_a; 2nd Jul 2024 at 10:16. Reason: changes to boltdnsnet.py and ITVX Batch. Updated and checked working 7 Jan 2024
-
Very nice, especially the HTTPToolkit hint!
Do I get this right: The whole json-thing is an additional request before getting the license itself only to get the token that is used later?
Also, what I didn't understand up to now (still a bit new to Python...):
For the license request you have to copy the request parameters all in one, convert it all to Python, put it all into headers.py (WKS-KEYS), ok.
I did this several times and up to now everything worked fine, even when there was additional data:
headers=
{
blah
}
data='extra blah'
And sometimes there is a lot of data to pass! But in l3.py "headers.data" is not used in any way!
Does that mean the data is unnecessary? As far as I remember when I used getwvkeys.cc (while it was still normal there) it did NOT work without the data. -
nice, more valuable info to digest
one question about headers.py - if l3.py is only taking headers and data from it, can the "response = requests.post(...)" line be removed? -
You misunderstand; headers.py is only supportive of the request and passed as an extra field in the request method
widevine_license = requests.post(url=lic_url, data=wvdecrypt.get_challenge(), headers=headers)
What you can do for sites that do not appear to read headers is to set 'headers=None' in that request line.
I just tried it with uktvplay.co.uk and it works without headers being sent. -
Yes.
If the data is on the URL we have no need of it. I cannot say definitively that data=headers.data is an unnecessary field. It may depend on the site - but use your eyes and see what the browser's request headers contain. We mimic, remember.
I have no idea what getwvkeys.cc was doing. -
-
Forget the headers 'data' field. Who knows what the author of l3 was thinking. It may be used somewhere, but for the small number of sites I frequent, I haven't seen it used. And the way I have seen it addressed, it wouldn't get found anyway: ''data=data', params=params' is meaningless.
Python allows the same name to be used in more than one place so long as the context makes clear which one is meant. So the 'data' in the licence challenge is different from the 'data' in headers.
My l3.py has this line:
widevine_license = requests.post(url=lic_url, data=wvdecrypt.get_challenge(), headers=headers.headers)
I am an experimenter by nature. The best way to find out is to try. So if you think there are too many data fields, drop one and see what happens.
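A small sketch of that naming point, for anyone who wants to see it run. Nothing here touches the network: the post function is a made-up stand-in for requests.post, the SimpleNamespace stands in for the headers.py module, and every value is invented for illustration.

```python
from types import SimpleNamespace

# Stand-in for the headers.py module: one namespace holding two attributes.
# The values are invented for this example.
headers = SimpleNamespace(
    headers={"user-agent": "example-agent"},
    data="challenge the browser sent, as copied by the converter",
)


def post(url, data=None, headers=None):
    """Hypothetical stand-in for requests.post: reports what it was given."""
    return {"url": url, "data": data, "headers": headers}


sent = post(
    url="https://license.example/widevine",   # placeholder URL
    data=b"challenge from our own CDM",        # 'data' here is a keyword of the call
    headers=headers.headers,                   # 'headers.headers' is an attribute of the module
)

# headers.data is never referenced in the call, so it is never sent
print(sent["data"])
print(sent["headers"])
```

The keyword data= and the attribute headers.data only share a spelling; the call sends whatever is written to the right of the equals sign.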
Everyone: I really don't want this to become a tutorial or an individual coaching session. Read, analyze and experiment. It is such a better feeling when you get there yourself. -
I fear we misunderstood each other, may be my English again. "headers.data" isn't addressed AT ALL anywhere. So if you put it in headers.py it only can be obsolete - at least in WKS-KEYS.
My l3.py has this line:
widevine_license = requests.post(url=lic_url, data=wvdecrypt.get_challenge(), headers=headers.headers)
I am an experimenter by nature. The best way to find out is to try. So if you think there are too many data fields, drop one and see what happens.
And learned that the license request in l3.py
widevine_license = requests.post(url=lic_url, data=wvdecrypt.get_challenge(), headers=headers.headers)
refers to wvdecrypt.py:
def get_challenge(self):
return self.cdm.get_license_request(self.session)
that refers to cdm.py:
def get_license_request(self, session_id):
Blah... (complicated)
So the "data" comes from the CDM each time new. The question that confuses me is still: Why is it not simply taken from the "data" in headers.py IF specified there. I am sure there is a good reason (the reason that explains everything this is all about, I fear), but I still don't get the point.
But you are very right: Time to learn and experiment! -
The Wvdecrypt object is initialized with pssh and data about the CDM. So when a request to the license server is made this information is included. Headers.data is irrelevant - forget it, it doesn't get used. And anyway it is the browser's request headers and nothing to do with our license request from python.
l3.py, as supplied, imports cdm erroneously; it never gets called directly in my l3.py and I have deleted that part of the import. I haven't had need to look in any detail beyond l3.py -
I was talking about the "response = requests.post(...)" line from headers.py:
Code:response = requests.post('https://cassie.channel5.com/api/v2/licences/widevine/208/C5273420001', params=params, headers=headers, data=data)
-
OK, sorry for my misunderstanding I think I'm with you now and I believe Quint may be making a similar point.
Curlconverter.com converts the licence request from the browser, and that request carries (encrypted) data about the browser's CDM in 'data'. That is no use to us, for we need to make a request relevant to our CDM, but we do need the headers={...} part in headers.py.
Because I'm lazy and it is easier to copy curlconverter.com's output in its entirety, that is what I choose to do. But the request in there is never called.
The requests.post line in headers.py is similar to the one we use. Only we are asked to input the license URL in l3.py, and it is then no longer "https://cassie.channel5.com/api/v2/licences/widevi.." but lic_url, as it is used here:-
Code:widevine_license = requests.post(url=lic_url, data=wvdecrypt.get_challenge(), headers=headers.headers)
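If you would rather not paste the whole converter output, here is a rough sketch that pulls only the URL and the headers out of a bash-style 'Copy as cURL' string and ignores the browser's data. It uses Python's shlex module to split the command the way a shell would. The function name curl_to_request and the short list of flags it knows about are my own choices; real cURL commands can carry flags this does not handle, and a body quoted as $'...' with escaped quotes inside may make shlex raise an error.

```python
import shlex

# Flags whose next token is a value we deliberately throw away
# (the browser's method and request body are of no use to us)
SKIP_WITH_VALUE = {"-X", "--request", "-d", "--data", "--data-raw", "--data-binary"}


def curl_to_request(curl_command):
    """Return (url, headers) from a bash-style cURL command string."""
    # shlex does not treat backslash-newline as a line continuation, so join lines first
    tokens = shlex.split(curl_command.replace("\\\n", " "))
    url = ""
    headers = {}
    i = 1 if tokens and tokens[0] == "curl" else 0
    while i < len(tokens):
        tok = tokens[i]
        has_value = i + 1 < len(tokens)
        if tok in ("-H", "--header") and has_value:
            name, _, value = tokens[i + 1].partition(":")
            headers[name.strip()] = value.strip()
            i += 2
        elif tok in ("-b", "--cookie") and has_value:
            headers["cookie"] = tokens[i + 1]
            i += 2
        elif tok in SKIP_WITH_VALUE and has_value:
            i += 2
        elif tok.startswith("-"):
            i += 1  # a flag without a value, such as --compressed
        else:
            if not url:
                url = tok  # first bare token is taken to be the URL
            i += 1
    return url, headers
```

For example, curl_to_request("curl 'https://license.example/wv' -H 'accept: */*' --data-raw 'abc'") gives back the URL and a one-entry headers dictionary, with the 'abc' body dropped.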
-
Ehm... - yes. That's what I pointed out. I only asked, why.
And anyway it is the browser's request headers and nothing to do with our license request from python.
l3.py, as supplied, imports cdm erroneously; it never gets called directly in my l3.py and I have deleted that part of the import. I haven't had need to look in any detail beyond l3.py
But it seems you don't like to discuss, so will find out myself and don't disturb any longer.
Thanks a lot by the way for your other great posting, that let me learn how to dump the above! -
I make the point in my response to ampersand
Try and separate out in your mind what the browser does, and what we mimic with l3.py. All we take from the browser is the headers field, the license URL and the mpd URL, and nothing else. I too stumbled here, separating the browser process from the l3.py process; I mentioned it in my post.
If it is unchanging data, every time we try some episode download in the browser, then we know it is copyable for our python use.
If it has a message = <some code in base64>, that will need to change from what the browser sends to what our CDM sends. Try decoding some of the base64 code from browser and CDM - use Httptoolkit to watch the terminal - watch the browser and see the differences in message contents.
To say headers.data is 'never used' is not quite true; I have made an edit to point 4 dealing with the Option and the RTE.ie example. 'If it's on the URL, there is no need for it to be in headers.data' is probably a safer way of saying it. -
Yes, there are constant and changing things f. e. a token that worked yesterday and has to be "renewed" today, and constant parts.
Seems that the "final" call of "get_license_request" must be unique each time, because the response has to be decrypted in the context that started it. Still don't get the point of this exactly. But thanks for your answer. -
At the root of it all is SSL key-sharing, public/private keys and pretty good privacy. CDM keys and Browser keys are different. So the messages can only be read by one decryptor with the correct private key.
As I see the process: our CDM sends data to the server which includes pssh and our CDM's public key and data about our CDM - serial nos etc. It is not encrypted in this direction - only encoded in base64. The server checks the pssh, finds the decryption-keys for the particular media, encrypts those using our public key and sends the encrypted data back. The CDM decrypts the data with its private key to reveal the media decryption keys.
Last edited by A_n_g_e_l_a; 8th Oct 2022 at 11:37. Reason: base64 encoding
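To make the public/private key idea concrete, here is a toy sketch using textbook RSA with tiny primes. It is not the Widevine protocol and not how any real CDM stores or uses its keys; the numbers and the names encrypt and decrypt are purely illustrative. It only shows that something encrypted with a public key is recovered with the matching private key.

```python
# Toy key pair built from two tiny primes.
# Real keys use primes that are hundreds of digits long.
p, q = 61, 53
n = p * q                     # modulus, shared by both halves of the key pair
phi = (p - 1) * (q - 1)
e = 17                        # public exponent
d = pow(e, -1, phi)           # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)           # this half can be handed to anyone
private_key = (d, n)          # this half never leaves its owner


def encrypt(number, key):
    """Encrypt a small number with the public key."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


def decrypt(number, key):
    """Decrypt with the private key that matches the public key used."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


secret = 65                                   # stands in for a media key
ciphertext = encrypt(secret, public_key)      # what the server would send back
recovered = decrypt(ciphertext, private_key)  # what only the key owner can do

print(ciphertext, recovered)
```

This is why a reply addressed to the browser's key pair is of no use to a script holding a different key pair.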
-
And thus the browser data is useless because in the end you would need the browser's private key to decrypt the answer. And the Decrypter/-guesser used chrome's private key which is now totally obfuscated. I begin to understand, thanks for the explanation.
-
Thanks for your comprehensive guide! Finally I can obtain the key from the case 4 with the constructed json packet.
-
Hello Angela
I am trying to understand your famous step 4 JSON. At the HTTP Toolkit step, which terminal do you run your code in: the HTTP Toolkit terminal or Windows CMD?
If I use the HTTP Toolkit terminal and enter the code, I cannot enter any command afterwards. -
-
Last edited by k2000; 20th Nov 2022 at 19:10.
-
An awful lot for me to take in since I'm still learning how all these processes work. But it is exceptionally informative and has given me far more insight into the inner workings of these processes than anything else I've found online. Superb information. A huge thank you.
-
As I said above, a superb explanation of how each of those web players communicates between client and server. I've learnt a lot from that. Now, how to translate that information to get a key is way beyond me at the moment, but at least we have the help of the various packages that have been discussed for over a year now, like WKS-KEYS and their derivatives.
Now, I have studied various tutorials (including YT) describing how to scrape actual web pages (eg. by using requests and beautifulsoup). But that's not what I want to do. What I would dearly love to be guided with is information on how to get information that's extracted from the various requests and responses as displayed in Developer Tools in Chrome browser. In particular things like mpd URL and license URL.
This would save me having to input those values manually. I want to write a script where I input the main page URL and that information like mpd url & licence url are extracted using python.
I just cannot find anything along those lines on the web.
Is it possible that anybody could guide me? I'm sure that others would find that useful as well. I'm not suggesting you hand me out a freebie, just some guidance or useful sites to get the information from would be superb.
Many thanks.
Last edited by deccavox; 24th Nov 2022 at 14:21.