Would anybody be able to help write a script to be executed on Windows that generates .m3u8 streams from https://thetvapp.to/tv/. This will require the script to be automated and run every hour or so, because there is a generated token with the m3u8 that has an expiry. I would like to then load these m3u8 files into an IPTV software, so ideally multiple m3u8 files or single file/playlist with multiple streams (if possible) that is compatible with that.
For example: https://thetvapp.to/tv/cnn-live-stream/
The only way I've been able to find them is via chrome, the m3u8 finder/hls player extension identified this address:
https://v-edge-4.thetvapp.to/hls/CNN.m3u8?token=J-lOBqOFbl5-vaG-QjqUsw&expires=1698055869
Any assistance would be greatly appreciated! Also, I am very much a beginner at programming, but I can follow instructions precisely and use PowerShell/cmd in Windows.
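Since the question hinges on the token expiring, here is a minimal sketch of reading the `expires` value from a URL like the one above to decide when a refresh is due. It assumes `expires` is a Unix timestamp in seconds (this is not confirmed anywhere in the thread), and the helper names and the 300-second safety margin are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def expiry_timestamp(m3u8_url):
    """Return the integer in the `expires` query parameter, or None if absent."""
    query = parse_qs(urlparse(m3u8_url).query)
    values = query.get("expires")
    return int(values[0]) if values else None


def needs_refresh(m3u8_url, now, margin_seconds=300):
    """True when the URL has no expiry or expires within `margin_seconds` of `now`.

    Assumption: `expires` is a Unix timestamp in seconds.
    """
    expires = expiry_timestamp(m3u8_url)
    if expires is None:
        return True
    return now >= expires - margin_seconds


example = ("https://v-edge-4.thetvapp.to/hls/CNN.m3u8"
           "?token=J-lOBqOFbl5-vaG-QjqUsw&expires=1698055869")
print(expiry_timestamp(example))
```

In a scheduled job you would pass `time.time()` as `now` and only re-fetch the links that report `True`.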
-
-
Hey, I know this is months later and you've probably figured this out already, but I thought I'd share what I did to get this automated.
It grabs the updated m3u8 URL with its token and writes it to a text file.
Below is a Python script I wrote for the MTV channel on that site.
To edit this to work on different streams you'll need 2 URLs:
1) The normal site stream link, which the script will search through: "https://thetvapp.to/tv/mtv-live-stream/"
2) The start of the actual stream URL you want: "thetvapp.to/live/streams/MTVEast.m3u8?token="
Since this is formatted differently for each channel, we need to tell the program what it looks like. You can find this on your respective channel by:
Opening Dev tools
Network tab
Refresh
Browse through the results on the site and there should be one with an m3u8 URL. Copy that URL up to the point where it specifies the token, as the token will change every time.
Code:
import re
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from webdriver_manager.chrome import ChromeDriverManager


def extract_desired_url(requests):
    # Search for the desired URL in the requests
    for request in requests:
        if "thetvapp.to/live/streams/MTVEast.m3u8?token=" in request:
            return request
    return None


url = "https://thetvapp.to/tv/mtv-live-stream/"

# Set Chrome options
options = webdriver.ChromeOptions()
options.add_argument("--headless")  # To run Chrome in headless mode

# Initialize the ChromeDriver service with the executable path
service = webdriver.ChromeService(ChromeDriverManager().install())

# Initialize Selenium WebDriver with the service and Chrome options
driver = webdriver.Chrome(service=service, options=options)

# Navigate to the URL
driver.get(url)

# Wait for the video player element to be present
try:
    video_player = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CLASS_NAME, "video-player")))
    print("Video player loaded successfully.")
except:
    pass  # Do nothing if the video player is not found, the message will not be printed


def get_get_requests():
    try:
        # Execute JavaScript to capture network requests
        requests = driver.execute_script("""
            var performance = window.performance || window.webkitPerformance || window.msPerformance || window.mozPerformance;
            if (!performance) {
                return [];
            }
            var entries = performance.getEntriesByType("resource");
            var urls = [];
            for (var i = 0; i < entries.length; i++) {
                urls.push(entries[i].name);
            }
            return urls;
        """)
        return requests
    except Exception as e:
        print("An error occurred:", e)
        return None
    finally:
        driver.quit()


# Call the function to get GET requests
get_requests = get_get_requests()

# Extract the desired URL
if get_requests:
    desired_url = extract_desired_url(get_requests)
    if desired_url:
        print("Desired URL found:", desired_url)
        # Write the desired URL to the file
        with open("mtvurl.txt", "w") as file:
            file.write(desired_url)
        print("Desired URL written to mtvurl.txt")
    else:
        print("No desired URL found in the requests.")
else:
    print("No GET requests found.")
-
I am not much of a programmer, but does this output to a file? I see in your code where you have the MTV channel located, so if I wanted to add, say, HBO to it, what lines would I need to copy and change for HBO or any other channel? I would love to be able to take your script and extract all of the channels from Thetvapp.to so that I can then import them into VLC. -
I am attempting to run your code but I am getting the following error:
DevTools listening on ws://127.0.0.1:54862/devtools/browser/c3ad0a70-bb0f-491e-9a65-7256718ce539
An error occurred while loading the video player: Message:
Stacktrace:
GetHandleVerifier [0x00E48D03+51395]
(No symbol) [0x00DB5F61]
(No symbol) [0x00C6E13A]
(No symbol) [0x00CA62BB]
(No symbol) [0x00CA63EB]
(No symbol) [0x00CDC162]
(No symbol) [0x00CC3ED4]
(No symbol) [0x00CDA570]
(No symbol) [0x00CC3C26]
(No symbol) [0x00C9C629]
(No symbol) [0x00C9D40D]
GetHandleVerifier [0x011C68D3+3712147]
GetHandleVerifier [0x01205CBA+3971194]
GetHandleVerifier [0x01200FA8+3951464]
GetHandleVerifier [0x00EF9D09+776393]
(No symbol) [0x00DC1734]
(No symbol) [0x00DBC618]
(No symbol) [0x00DBC7C9]
(No symbol) [0x00DADDF0]
BaseThreadInitThunk [0x765CFCC9+25]
RtlGetAppContainerNamedObjectPath [0x777A7C5E+286]
RtlGetAppContainerNamedObjectPath [0x777A7C2E+238]
An error occurred: Message: javascript error: Invalid or unexpected token
(Session info: chrome-headless-shell=122.0.6261.113)
Stacktrace:
GetHandleVerifier [0x00E48D03+51395]
(No symbol) [0x00DB5F61]
(No symbol) [0x00C6E13A]
(No symbol) [0x00C72480]
(No symbol) [0x00C7408D]
(No symbol) [0x00CDAEAC]
(No symbol) [0x00CC3E8C]
(No symbol) [0x00CDA570]
(No symbol) [0x00CC3C26]
(No symbol) [0x00C9C629]
(No symbol) [0x00C9D40D]
GetHandleVerifier [0x011C68D3+3712147]
GetHandleVerifier [0x01205CBA+3971194]
GetHandleVerifier [0x01200FA8+3951464]
GetHandleVerifier [0x00EF9D09+776393]
(No symbol) [0x00DC1734]
(No symbol) [0x00DBC618]
(No symbol) [0x00DBC7C9]
(No symbol) [0x00DADDF0]
BaseThreadInitThunk [0x765CFCC9+25]
RtlGetAppContainerNamedObjectPath [0x777A7C5E+286]
RtlGetAppContainerNamedObjectPath [0x777A7C2E+238]
No GET requests found.
Press any key to continue . . . -
-
I did as you instructed but I am still not getting the expected results. Here is the latest output:
DevTools listening on ws://127.0.0.1:38304/devtools/browser/d63eddb5-0b27-4fcb-a79f-80f6a124f774
An error occurred while loading the video player: Message:
Stacktrace:
GetHandleVerifier [0x00854CE3+225091]
(No symbol) [0x00784E31]
(No symbol) [0x00629A7A]
(No symbol) [0x0066175B]
(No symbol) [0x0066188B]
(No symbol) [0x00697882]
(No symbol) [0x0067F5A4]
(No symbol) [0x00695CB0]
(No symbol) [0x0067F2F6]
(No symbol) [0x006579B9]
(No symbol) [0x0065879D]
sqlite3_dbdata_init [0x00CC9A83+4064547]
sqlite3_dbdata_init [0x00CD108A+4094762]
sqlite3_dbdata_init [0x00CCB988+4072488]
sqlite3_dbdata_init [0x009CC9E9+930953]
(No symbol) [0x00790804]
(No symbol) [0x0078AD28]
(No symbol) [0x0078AE51]
(No symbol) [0x0077CAC0]
BaseThreadInitThunk [0x765CFCC9+25]
RtlGetAppContainerNamedObjectPath [0x777A7C5E+286]
RtlGetAppContainerNamedObjectPath [0x777A7C2E+238]
An error occurred: Message: javascript error: Invalid or unexpected token
(Session info: chrome-headless-shell=123.0.6312.58)
Stacktrace:
GetHandleVerifier [0x00854CE3+225091]
(No symbol) [0x00784E31]
(No symbol) [0x00629A7A]
(No symbol) [0x0062DEB0]
(No symbol) [0x0062FA76]
(No symbol) [0x006965E2]
(No symbol) [0x0067F55C]
(No symbol) [0x00695CB0]
(No symbol) [0x0067F2F6]
(No symbol) [0x006579B9]
(No symbol) [0x0065879D]
sqlite3_dbdata_init [0x00CC9A83+4064547]
sqlite3_dbdata_init [0x00CD108A+4094762]
sqlite3_dbdata_init [0x00CCB988+4072488]
sqlite3_dbdata_init [0x009CC9E9+930953]
(No symbol) [0x00790804]
(No symbol) [0x0078AD28]
(No symbol) [0x0078AE51]
(No symbol) [0x0077CAC0]
BaseThreadInitThunk [0x765CFCC9+25]
RtlGetAppContainerNamedObjectPath [0x777A7C5E+286]
RtlGetAppContainerNamedObjectPath [0x777A7C2E+238]
No GET requests found.
Press any key to continue . . . -
Try to make sure any instance of chrome.exe using that same User Data folder is also closed (or just close any chrome.exe process) before running the script.
-
Thanks for getting back to me so quickly.
I reviewed my task manager ensuring that there were no Chrome processes running and again checked the details which did not show any Chrome running. To be sure that I did not miss anything I even rebooted my system but I am still not getting the expected results.
Here is the latest output:
DevTools listening on ws://127.0.0.1:11191/devtools/browser/ce5d1f30-f194-4856-9feb-8bbd1c71eb0a
An error occurred while loading the video player: Message:
Stacktrace:
GetHandleVerifier [0x00614CE3+225091]
(No symbol) [0x00544E31]
(No symbol) [0x003E9A7A]
(No symbol) [0x0042175B]
(No symbol) [0x0042188B]
(No symbol) [0x00457882]
(No symbol) [0x0043F5A4]
(No symbol) [0x00455CB0]
(No symbol) [0x0043F2F6]
(No symbol) [0x004179B9]
(No symbol) [0x0041879D]
sqlite3_dbdata_init [0x00A89A83+4064547]
sqlite3_dbdata_init [0x00A9108A+4094762]
sqlite3_dbdata_init [0x00A8B988+4072488]
sqlite3_dbdata_init [0x0078C9E9+930953]
(No symbol) [0x00550804]
(No symbol) [0x0054AD28]
(No symbol) [0x0054AE51]
(No symbol) [0x0053CAC0]
BaseThreadInitThunk [0x75FEFCC9+25]
RtlGetAppContainerNamedObjectPath [0x77237C5E+286]
RtlGetAppContainerNamedObjectPath [0x77237C2E+238]
An error occurred: Message: javascript error: Invalid or unexpected token
(Session info: chrome-headless-shell=123.0.6312.58)
Stacktrace:
GetHandleVerifier [0x00614CE3+225091]
(No symbol) [0x00544E31]
(No symbol) [0x003E9A7A]
(No symbol) [0x003EDEB0]
(No symbol) [0x003EFA76]
(No symbol) [0x004565E2]
(No symbol) [0x0043F55C]
(No symbol) [0x00455CB0]
(No symbol) [0x0043F2F6]
(No symbol) [0x004179B9]
(No symbol) [0x0041879D]
sqlite3_dbdata_init [0x00A89A83+4064547]
sqlite3_dbdata_init [0x00A9108A+4094762]
sqlite3_dbdata_init [0x00A8B988+4072488]
sqlite3_dbdata_init [0x0078C9E9+930953]
(No symbol) [0x00550804]
(No symbol) [0x0054AD28]
(No symbol) [0x0054AE51]
(No symbol) [0x0053CAC0]
BaseThreadInitThunk [0x75FEFCC9+25]
RtlGetAppContainerNamedObjectPath [0x77237C5E+286]
RtlGetAppContainerNamedObjectPath [0x77237C2E+238]
No GET requests found.
Press any key to continue . . . -
It looks like the script is for older selenium versions, I updated the script a little bit, try this:
Code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "https://thetvapp.to/tv/mtv-live-stream/"


def extract_desired_url(requests):
    # Search for the desired URL in the requests
    for request in requests:
        if "MTVEast.m3u8?token=" in request:
            return request
    return None


def get_get_requests():
    try:
        # Execute JavaScript to capture network requests
        requests = driver.execute_script("""
            var performance = window.performance || window.webkitPerformance || window.msPerformance || window.mozPerformance;
            if (!performance) {
                return [];
            }
            var entries = performance.getEntriesByType("resource");
            var urls = [];
            for (var i = 0; i < entries.length; i++) {
                urls.push(entries[i].name);
            }
            return urls;
        """)
        return requests
    except Exception as e:
        print("An error occurred:", e)
        return None
    finally:
        driver.quit()


# Set Chrome options
options = webdriver.ChromeOptions()
options.add_argument("--headless")  # To run Chrome in headless mode

# Initialize Selenium WebDriver with the Chrome options
driver = webdriver.Chrome(options=options)

# Navigate to the URL
driver.get(url)

# Wait for the video player element to be present
try:
    video_player = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CLASS_NAME, "video-player")))
    print("Video player loaded successfully.")
except:
    pass  # Do nothing if the video player is not found, the message will not be printed

# Call the function to get GET requests
get_requests = get_get_requests()

# Extract the desired URL
if get_requests:
    desired_url = extract_desired_url(get_requests)
    if desired_url:
        print("Desired URL found:", desired_url)
        # Write the desired URL to the file
        with open("mtvurl.txt", "w") as file:
            file.write(desired_url)
        print("Desired URL written to mtvurl.txt")
    else:
        print("No desired URL found in the requests.")
else:
    print("No GET requests found.")
-
It works!
Thank you so much for your assistance in getting this to work. Now I just need to figure out how to get it to read all of the URLs from a file, or from the site itself, so that the needed URL with the token is acquired automatically rather than being hard coded. -
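One way to avoid hard-coding the channel, sketched under assumptions: keep the channel page URLs in a text file, one per line, and derive the output file name from each URL. The file layout, the `#` comment rule and both helper names are hypothetical; the `mtvurl.txt` naming simply mirrors the script above.

```python
from urllib.parse import urlparse


def parse_channel_list(text):
    """Return non-empty, non-comment lines from the text of a channel list file."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


def output_name(channel_url):
    """Turn .../tv/mtv-live-stream/ into mtvurl.txt (naming borrowed from the script above)."""
    slug = urlparse(channel_url).path.rstrip("/").split("/")[-1]
    suffix = "-live-stream"
    if slug.endswith(suffix):
        slug = slug[: -len(suffix)]
    return slug + "url.txt"


sample_file_text = """
# one channel page per line
https://thetvapp.to/tv/mtv-live-stream/
https://thetvapp.to/tv/cnn-live-stream/
"""
for page in parse_channel_list(sample_file_text):
    print(page, "->", output_name(page))
```

The text would normally come from something like `open("channels.txt").read()`, and each returned URL could then be fed to the existing script in a loop.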
I keep getting "No desired URL found in the requests." Is this python script still working for everyone? No success so far.
-
That's the easy part:
Code:
import requests
import re

url = "https://thetvapp.to/tv"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36'
}
response = requests.get(url, headers=headers)
# Collect every channel slug linked from the listing page
channels = re.findall(r'a href=\"/tv/(.*?)\"', response.text)
for channel in channels:
    print('https://thetvapp.to/tv/' + str(channel))
-
I updated it so it works again. I also made it ask what channel and then spit out the url.
Code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
import re

# Setup main options
options = webdriver.ChromeOptions()
options.add_argument("--headless")  # To run Chrome in headless mode
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36"
options.add_argument(f"user-agent={user_agent}")
driver = webdriver.Chrome(options=options)

# First get all the live channels into a list
homepage = "https://thetvapp.to/tv/"
driver.get(homepage)
channels = re.findall('a href=\"/tv/(.*?)/\"', driver.page_source)

# Enumerate and print the list then wait for user to input a number
for i, channel in enumerate(channels):
    print(i, channel.replace('-', ' '))

while True:
    try:
        selection = int(input("Please select a channel number: "))
        if selection < 0 or selection >= len(channels):
            print(f"Please select a number between 0 and {len(channels) - 1}.")
            continue
    except ValueError:
        print("Sorry, numbers only.")
        continue
    else:
        break

url = homepage + str(channels[selection])
print(f'Scraping page for playlist at {url}')


def extract_desired_url(requests):
    # Search for the desired URL in the requests
    for request in requests:
        if "m3u8?token=" in request:
            return request
    return None


def get_get_requests():
    global driver
    try:
        # Execute JavaScript to capture network requests
        requests = driver.execute_script("""
            var performance = window.performance || window.webkitPerformance || window.msPerformance || window.mozPerformance;
            if (!performance) {
                return [];
            }
            var entries = performance.getEntriesByType("resource");
            var urls = [];
            for (var i = 0; i < entries.length; i++) {
                urls.push(entries[i].name);
            }
            return urls;
        """)
        return requests
    except Exception as e:
        print("An error occurred:", e)
        return None


driver.get(url)
time.sleep(1)  # Adjust this if needed - this is the wait for the player to receive the decoded url
get_requests = get_get_requests()

# Extract the desired URL
if get_requests:
    desired_url = extract_desired_url(get_requests)
    if desired_url:
        print("Playlist URL found:", desired_url)
    else:
        print("No Playlist URL found in the requests.")
else:
    print("No GET requests found.")
driver.quit()
-
-
It did work, I think they might have changed the way it works. I think the m3u only gets delivered after pressing play now. I'll look at the script again over the weekend.
-
Had a quick look. They now wait for the play button to be pressed, which calls a URL like token/channelname. It needs a nicely crafted header to receive the m3u URL. I'll look again when I have more time; the data required is in the first load of the page.
You can get the token URL from the individual channel page with:
Code:
driver.get(url)
time.sleep(1)
chanpage = re.findall('data=\"/token/(.*?)\"', driver.page_source)
newdata = "https://thetvapp.to/token/" + str(chanpage[0])
-
Start your journey from this code (CNN):
Code:
import re
import requests

start_seasson = requests.session()

# get csrf-token
headers1 = {
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'cache-control': 'max-age=0',
    'dnt': '1',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Mobile Safari/537.36',
}
webpage = start_seasson.get('https://thetvapp.to/tv/cnn-live-stream/', headers=headers1).text
csrf_token = re.search(r'<meta name="csrf-token" content="(.*?)">', webpage).group(1)

headers2 = {
    'content-type': 'application/json',
    'dnt': '1',
    'origin': 'https://thetvapp.to',
    'referer': 'https://thetvapp.to/',
    'user-agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Mobile Safari/537.36',
    'x-csrf-token': csrf_token,
}
json_data = {
    'KSCahyafDAfniqhjUdDlvpUB': 'GKeVZHYqyAKAjyWUapyLKKEctt',
}
hls = start_seasson.post('https://thetvapp.to/token/CNN', headers=headers2, json=json_data).text
# The response escapes slashes with backslashes; strip them to get a usable URL
url_corrected = hls.replace("\\", "")
print(url_corrected)
start_seasson.close()
Last edited by imr_saleh; 3rd Aug 2024 at 00:41.
-
Using this code directly gives a forbidden error.
The json_data you have hardcoded in the script is dynamic. It's only generated after the play button is clicked; however, I was unable to find where it comes from. If I load the page with my debugger open and click play, I can then see the payload data for my connection. If I put this key/value into your script it works fine.
I was using selenium.
Code:
wait = WebDriverWait(driver, 10)
play_button = wait.until(EC.element_to_be_clickable((By.ID, 'loadVideoBtnOne')))
play_button.click()
I'm still learning python and webscraping. -
--[----->+<]>.++++++++++++.---.--------.
[*drm mass downloader: widefrog*]~~~~~~~~~~~[*how to make your own mass downloader: guide*] -
-
Ah, it seems that not only has the x-csrf-token header expired, but it also needs a new payload.
I had a look and found that the payload is generated via JavaScript.
The parameter name (KSCahyafDAfniqhjUdDlvpUB) can be read directly from the code.
The biggest challenge is how to generate its value (GKeVZHYqyAKAjyWUapyLKKEctt).
The JavaScript file is heavily obfuscated, making it difficult to analyze the logic directly. However, certain patterns, such as function calls and variable manipulations, can guide us in locating the correct payload.
Code:
async function L5() {
    const n = U,
        t = {
            vSpJA: n(239),
            Wgwlp: n(288),
            vSNPf: "Network response was not ok "
        };
    try {
        const x = await fetch(as, {
            method: t[n(240)],
            headers: {
                "Content-Type": t[n(259)],
                "X-CSRF-TOKEN": cs()
            },
            body: JSON[n(295)]({
                KSCahyafDAfniqhjUdDlvpUB: R5
            })
        });
        if (!x.ok)
            throw new Error(t[n(263)] + x[n(219)]);
        return await x[n(228)]()
    } catch (x) {
        console.error(n(226), x)
    }
}
I prefer to use the code directly instead of using selenium webdriver. -
Not completely happy with the result, but eh, could be better
Code:
import ast
import re
from html import unescape
from http.cookies import SimpleCookie
from time import sleep

import requests
from bs4 import BeautifulSoup

BASE_URL = "https://thetvapp.to"
PAYLOAD = None


def evaluate(a, o, b):
    if o == "-":
        return int(a) - int(b)
    if o == "+":
        return int(a) + int(b)
    if o == "*":
        return int(a) * int(b)
    return int(a) // int(b)


def get_key_values(page_soup):
    app_js = page_soup.find_all('script', src=True)
    app_js = [
        source['src'] for source in app_js
        if source['src'].endswith('.js') and 'app' in source['src'].split("/")[-1]
    ][0]
    app_js = requests.get(app_js).content.decode()

    fixed_js_key = re.findall(
        r'headers:{[^{}]*"X-CSRF-TOKEN"[^{}]*},body:[^{}]*{([^{}]*)}',
        app_js
    )[0].split(":")[0]
    all_operations = re.findall(
        r'const\s[^=\s]+=\[];([^;]+);',
        app_js
    )

    possible_key_values = []
    for list_operations in all_operations:
        if "{" in list_operations or "}" in list_operations:
            continue
        if '](' not in list_operations or '),' not in list_operations:
            continue

        function_name = list_operations.split("[")[1].split("(")[0]
        function_name = re.findall(
            fr'const\s{function_name}=([^;]+);',
            app_js
        )[0]
        offset_operation = re.findall(
            r"function {function_name}\(.*?\){(.*?)}".replace("{function_name}", function_name),
            app_js, re.DOTALL
        )[0]
        function_name = re.findall(
            r'const\s[^=]+=([^()]+)\(',
            offset_operation
        )[0]
        offset_operation = re.findall(
            r"(\w+)=\1([-+*/])(\d+),",
            offset_operation
        )[0]

        list_operations = list_operations.split(",")
        list_operations = [
            re.findall(r'\(([^()]+)\)', op)[-1].replace('"', "").replace("'", "")
            for op in list_operations
        ]

        fixed_js_words = re.findall(
            r"function {function_name}\(.*?\){.*?(\[.*?\]);return\s.*?}".replace(
                "{function_name}", function_name
            ),
            app_js, re.DOTALL
        )[0]
        try:
            fixed_js_words = ast.literal_eval(fixed_js_words)
        except:
            continue

        try:
            max_op_len = len(max(list(filter(lambda o: not o.isdigit(), list_operations)), key=len))
        except:
            max_op_len = None

        for _ in range(0, len(fixed_js_words) - 1):
            current_key_value = []
            fail_key_value = False
            for operation in list_operations:
                if operation.isdigit():
                    operation = evaluate(operation, offset_operation[1], offset_operation[2])
                    operation = fixed_js_words[operation]

                if not bool(re.match(r'^[a-zA-Z]+$', operation)):
                    fail_key_value = True
                    break
                elif max_op_len is not None and len(operation) > 2 * max_op_len:
                    fail_key_value = True
                    break
                elif len("".join(current_key_value)) > 2 * len(fixed_js_key):
                    fail_key_value = True
                    break
                current_key_value.append(operation)

            if len(current_key_value) == 0:
                fail_key_value = True
            elif len("".join(current_key_value)) < len(fixed_js_key):
                fail_key_value = True
            elif len(min(current_key_value, key=len)) * 2 < len(max(current_key_value, key=len)):
                fail_key_value = True

            if not fail_key_value:
                current_key_value = "".join(current_key_value)
                possible_key_values.append(current_key_value)
            fixed_js_words.append(fixed_js_words.pop(0))

    return {
        "key": fixed_js_key,
        "value": possible_key_values
    }


def get_m3u8(source_url):
    global PAYLOAD
    response = requests.get(source_url)
    soup = BeautifulSoup(response.text, 'html.parser')
    csrf_token = soup.find_all('meta', attrs={'name': 'csrf-token'})[0]["content"]
    get_m3u8_endpoint = soup.find_all("div", attrs={"id": "get-m3u8-link"})[0]["data"]
    if not get_m3u8_endpoint.startswith(BASE_URL):
        get_m3u8_endpoint = f'{BASE_URL}{get_m3u8_endpoint}'

    response = dict(response.headers)
    cookies = SimpleCookie()
    cookies.load(response["set-cookie"])
    app_session = {k: v.value for k, v in cookies.items()}["thetvapp_session"]

    payload = PAYLOAD
    if payload is None:
        payload = get_key_values(soup)

    for key_value in payload["value"]:
        js_key = payload["key"]
        response = requests.post(
            get_m3u8_endpoint,
            cookies={'thetvapp_session': app_session},
            headers={'X-CSRF-TOKEN': csrf_token},
            json={js_key: key_value}
        )
        if response.status_code == 200:
            PAYLOAD = {
                "key": js_key,
                "value": [key_value]
            }
            return response.json()
        sleep(0.5)

    print("Failed to obtain the m3u8 with any payload... Debug the script")
    exit(0)


if __name__ == '__main__':
    r = requests.get(BASE_URL)
    s = BeautifulSoup(r.text, 'html.parser')
    links = s.find_all('a', class_='list-group-item')
    index = 0
    for link in links:
        href = link.get('href')
        if not href or not href.startswith('/tv/'):
            continue
        href = f"{BASE_URL}{href}"
        text = unescape(link.text)
        index += 1
        try:
            print(index, text, get_m3u8(href))
        except:
            print(index, "possible vpn issues: ", href)
Code:
1 A&E https://v1.thetvapp.to/hls/AEEast/index.m3u8?token=YnRka1dnbkx2Uko1eUw5bzU0MUlBbHJEdjBRQTJNNmxCWnBZWENIeA==
2 ACC Network https://v1.thetvapp.to/hls/ACCNetwork/index.m3u8?token=RTRHNUNiQ0VxZWtMSmIyQXlPcG1MbkRpb1RsUHJ2b3c1WTNhakkxMQ==
3 AMC https://v1.thetvapp.to/hls/AMCEast/index.m3u8?token=VWFiSGNjMkFLUlM5a085ekMwU1pBMzFmMU1qSDBZRVRINllURHBkTw==
. . .
30 Disney XD https://v3.thetvapp.to/hls/DisneyXDEast/index.m3u8?token=UTRabEd6QUx5bmFCUmNCTU5VOVNyam1LYjhvbEZVRXJuQTMwY2hMWg==
31 E! https://v3.thetvapp.to/hls/EEast/index.m3u8?token=ckViSWxqYnk5cnVYNGd0S3g3TmdVQUI3Vk5DdGtheFdsdk82S3A0Rw==
32 possible vpn issues: https://thetvapp.to/tv/espn-live-stream/
33 possible vpn issues: https://thetvapp.to/tv/espn2-live-stream/
34 ESPNews https://v3.thetvapp.to/hls/ESPNews/index.m3u8?token=ZHJIMGVHeVRoYk0yenZReDBQUnVlRjZoOTFwTGZEekZPaUNnNERDMQ==
35 ESPNU https://v3.thetvapp.to/hls/ESPNU/index.m3u8?token=SzYyTlhhV0l0RWw4OTdTeUlKR0xZcGJRUkVWT0hZVHFUM0hxem9MRg==
. . .
113 WE tv https://v2.thetvapp.to/hls/WeTVEast/index.m3u8?token=Q2FrNFpQOW5JODlNdlg0ODBIZ2h5TmhQRVlrUk9LR3NwN2lZMDhmcA==
114 WNBC (New York) NBC East https://v2.thetvapp.to/hls/WNBCDT1/index.m3u8?token=d2JyMEJjTXBpZjlyRUwya3Uxa0ZQN2NFUlNvNnF3ZE5DMVI0SzF5eQ==
115 WNYW (New York) FOX East https://v3.thetvapp.to/hls/WNYWDT1/index.m3u8?token=a1pveTNzTEZ2YzZUTmNQcTdvWDJuUE5TVW1HMEpPMGthWUxiVE9Hcw==
Last edited by 2nHxWW6GkN1l916N3ayz8HQoi; 8th Aug 2024 at 09:33.
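For the original goal of a single playlist with multiple streams, the name/URL pairs printed above could be written out as an extended M3U file. This is only a sketch: `build_playlist`, the sample channel names and the `example.invalid` URLs are placeholders, and whether a given IPTV player accepts this exact layout is an assumption worth testing.

```python
def build_playlist(channels):
    """Return extended-M3U text for an iterable of (name, url) pairs."""
    lines = ["#EXTM3U"]
    for name, url in channels:
        # -1 marks a stream of unknown/unbounded duration (live TV)
        lines.append(f"#EXTINF:-1,{name}")
        lines.append(url)
    return "\n".join(lines) + "\n"


# Placeholder data; real entries would come from the scraper's output.
sample_channels = [
    ("Channel One", "https://example.invalid/one/index.m3u8?token=PLACEHOLDER"),
    ("Channel Two", "https://example.invalid/two/index.m3u8?token=PLACEHOLDER"),
]
playlist_text = build_playlist(sample_channels)
print(playlist_text)
```

Saving `playlist_text` to a file such as `channels.m3u` on every refresh keeps the player pointed at one stable path while the tokens inside it change.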
-
It was all in the JavaScript code, and I was going to share it if he hadn’t. I’m not sure why you couldn’t figure it out—maybe you’re still getting familiar with JavaScript? Either way, he did an amazing job.
discord=notaghost9997 -
Since you seem like a nice fella who genuinely likes learning and scripting, I'm gonna explain my line of thought that led me to that mediocre solution. After all, what's the point of all these fancy scripts if people can't write new ones when they're gonna inevitably fail after a few days/weeks, especially if the site dev is lurking like a rat somewhere.
I'm gonna skip over the data scraping basics since you know them probably, and they have also been explained on videohelp forum guides. Since the hardest challenge is obtaining that magic payload pair, key/value, I'm gonna focus on it. The problem is gonna be split into 2 smaller issues, the key and the value.
By using the HAR trick on that key (in my case it is "amOJQwpfeNEMtHDipfKCfmshvqSZ"), you can instantly find it in a JS file. I'm gonna use a formatted JS source code on Chrome to showcase the code snippets.
Code:
await t[n(333)](fetch, i1, {
    method: t[n(369)],
    headers: {
        "Content-Type": n(339),
        "X-CSRF-TOKEN": c1()
    },
    body: JSON[n(340)]({
        amOJQwpfeNEMtHDipfKCfmshvqSZ: S5
    })
})
.... or ....
const x = await fetch(i1, {
    method: t[n(326)],
    headers: {
        "Content-Type": n(339),
        "X-CSRF-TOKEN": c1()
    },
    body: JSON.stringify({
        amOJQwpfeNEMtHDipfKCfmshvqSZ: C5
    })
});
Now the hard part is the value. A search in the HAR file brings no results, so maybe it's hidden in some encoding??? Before doing drastic things and checking all requests manually in the network tab, let's just take a look at the snippets. It seems to be building a payload, and the value of that fixed key should be the one we're looking for. I'm gonna focus on the 2nd code snippet (const x = blabla) and place a debugger breakpoint there.
[Attachment 81234]
Jackpot. So the value is also found there (its content might be different for you). The question now becomes where it is taken from and how it is generated. That "C5" is not a function call, but a variable. By going a little backward and looking at the biggest function that encapsulates everything, we get this:
Code:
async function O5() {
    const n = W,
        t = {
            fcVHD: n(337),
            HtOev: function(x, r) {
                return x + r
            }
        };
    try {
        const x = await fetch(i1, {
            method: t[n(326)],
            headers: {
                "Content-Type": n(339),
                "X-CSRF-TOKEN": c1()
            },
            body: JSON.stringify({
                amOJQwpfeNEMtHDipfKCfmshvqSZ: C5
            })
        });
[Attachment 81235]
Seems like the value of m0 is a list and its contents can be appended into 1 big string that represents the key value we want. I'm gonna completely ignore what that function "Co" is doing since it's useless. We already know what it's supposed to do. So, from "C5" we go to "m0" which is another variable, only now it is a list, not a string, but equivalent nonetheless. By searching from the start of the js file for "m0 = " we get
Code:
const m0 = [];
m0.push(W(396)),
m0[W(343)](W(376)),
m0[W(343)](W(362)),
m0[W(343)]("tkm"),
m0[W(343)]("gvv"),
m0[W(343)](W(325)),
m0[W(343)]("tO"),
m0[W(343)]("yS"),
m0[W(343)]("my"),
m0[W(343)]("ZO");
const Rx = [];
...
Code:
const W = dt;
function dt(n, t) {
    const x = gi();
    return dt = function(r, e) {
        return r = r - 319,
        x[r]
    },
    dt(n, t)
}
Code:
function gi() {
    const n = ["my-jwplayer", ... blabla ...,"4xIvEzO", "TGL", "OFlaR"];
    return gi = function() {
        return n
    },
    gi()
}
Understanding the flow of the found code is now a matter of experience. You could try your luck with ChatGPT but for most other sites you're gonna have to know Javascript and have some coding knowledge. Now we're lucky that the site dev didn't bother obfuscating it that much. In short, the function W() returns elements from the fixed "n" list by receiving an index. That index also has an offset since you have the operation "r - 319".
343 - 319 = 24 => W(343) is actually "push" which is what we deduced previously. If you had access to the fixed string list, the offset operation, and the push operation, you could build the list yourself. The offset operation can be ignored if you realize that the js code is not gonna use ALL of the possible strings from the word list. At first glance, it just looks like the only ones used have a length <= 3. So the problem becomes about data scraping again: how to get the fixed list and the list of operations. Both can be done by doing the same thing. Find an anchor and start building your regex.
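The offset arithmetic above can be reproduced in a few lines of Python. This is a toy sketch, not the site's real data: the word list is invented (only the position of "push" at index 24 and the offset 319 come from the post), and `make_lookup` / `join_parts` are hypothetical helper names.

```python
OFFSET = 319  # from "r = r - 319" in the dt() function quoted above


def make_lookup(words, offset):
    """Return a function that mimics W(index): words[index - offset]."""
    def lookup(index):
        return words[index - offset]
    return lookup


def join_parts(parts, lookup):
    """Concatenate parts: integers go through the lookup, strings are literals."""
    return "".join(lookup(p) if isinstance(p, int) else p for p in parts)


# Invented word list: 24 filler entries so that index 24 holds "push".
toy_words = [f"filler{i}" for i in range(24)] + ["push", "aB", "cD"]
W = make_lookup(toy_words, OFFSET)

print(W(343))                             # 343 - 319 = 24
print(join_parts([344, "tkm", 345], W))   # mixes looked-up and literal pieces
```

With the real word list and the real sequence of `m0.push(...)` arguments, the same two helpers would rebuild the payload value, provided the list has not been rotated first.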
Now, what exactly don't I like about this approach? It's kind of hardcoded, and if the JS code is obfuscated even more, you're gonna have to change it. I think in this case, a selenium approach might be better.
You can always post it if you want. I doubt people are gonna mind. I still think it could be further generalized.
-
discord=notaghost9997
-
I'm not as familiar with JS as I am with Python, which is where I focus, but at least I try.
Thank you for the detailed explanation. The script has stopped working. Maybe the site developers use new tricks every day; anyway, as you said, a selenium approach might be better in this case.
Last edited by imr_saleh; 5th Aug 2024 at 22:53.
-
Doesn't surprise me. There's this piece of code now
Code:
(function(n, t) {
    const x = _x,
        r = n();
    for (; []; )
        try {
            if (parseInt(x(457)) / 1 + parseInt(x(464)) / 2 + -parseInt(x(477)) / 3 * (parseInt(x(491)) / 4) + -parseInt(x(452)) / 5 + parseInt(x(448)) / 6 + parseInt(x(456)) / 7 * (-parseInt(x(454)) / 8) + parseInt(x(461)) / 9 * (parseInt(x(480)) / 10) === t)
                break;
            r.push(r.shift())
        } catch {
            r.push(r.shift())
        }
})(bc, 191828);
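What that snippet appears to do is rotate the word list until a checksum built from `parseInt` of a few entries equals a target number. Below is a toy Python sketch of the same idea, with an invented word list, an invented two-term checksum and hypothetical function names; the real expression and the target 191828 are not reproduced here.

```python
import math
import re
from collections import deque


def js_parse_int(text):
    """Rough stand-in for JS parseInt: leading integer, or NaN when there is none."""
    match = re.match(r"\s*[+-]?\d+", text)
    return int(match.group()) if match else math.nan


def rotate_until(words, checksum, target):
    """Rotate left one step at a time until checksum(words) == target."""
    ring = deque(words)
    for _ in range(len(ring)):      # bounded, unlike the endless JS loop
        if checksum(ring) == target:
            return list(ring)
        ring.rotate(-1)             # same effect as r.push(r.shift())
    raise ValueError("no rotation matches the target")


def toy_checksum(ring):
    # Invented expression; NaN from a non-numeric entry never equals the target.
    return js_parse_int(ring[0]) + js_parse_int(ring[1]) * 2


toy_words = ["tkm", "12abc", "7xyz", "push"]
print(rotate_until(toy_words, toy_checksum, 26))
```

Only after this rotation settles do the index lookups point at the intended strings, which is why a scraper that reads the list in its raw order starts returning wrong payload values.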
If there's one thing to learn from this, it's how to debug JS and what to look for.
Edit: Just to entertain that silly idea for a second, I edited the previous script from post #22. I wonder what's gonna break now.
Last edited by 2nHxWW6GkN1l916N3ayz8HQoi; 6th Aug 2024 at 08:07.
-
There's something on Github too: https://github.com/gelvetica/televisionapplication