VideoHelp Forum
  1. Would anybody be able to help write a script, to be executed on Windows, that generates .m3u8 streams from https://thetvapp.to/tv/? The script will need to be automated and run every hour or so, because the m3u8 URL carries a generated token that expires. I would then like to load these m3u8 files into IPTV software, so ideally the output would be multiple m3u8 files, or a single file/playlist with multiple streams (if possible), in a format that software accepts.

    For example: https://thetvapp.to/tv/cnn-live-stream/

    The only way I've been able to find them is via chrome, the m3u8 finder/hls player extension identified this address:

    https://v-edge-4.thetvapp.to/hls/CNN.m3u8?token=J-lOBqOFbl5-vaG-QjqUsw&expires=1698055869

    Any assistance would be greatly appreciated! I am very much a beginner at programming, but I can follow instructions precisely and use PowerShell/cmd in Windows.
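
    The example URL above carries an "expires" query parameter. A minimal sketch, assuming that value is a Unix timestamp in seconds (the thread does not confirm this), of how a script could check how long a saved URL has left before it needs refreshing. The helper names token_expiry and seconds_until_expiry are made up for illustration:

```python
from datetime import datetime, timezone
from urllib.parse import urlparse, parse_qs


def token_expiry(m3u8_url):
    """Return the 'expires' query parameter as a UTC datetime, or None if absent.

    Assumption: 'expires' is a Unix timestamp in seconds.
    """
    query = parse_qs(urlparse(m3u8_url).query)
    values = query.get("expires")
    if not values:
        return None
    return datetime.fromtimestamp(int(values[0]), tz=timezone.utc)


def seconds_until_expiry(m3u8_url, now=None):
    """Return seconds left before the URL expires (negative if already expired)."""
    expiry = token_expiry(m3u8_url)
    if expiry is None:
        return None
    if now is None:
        now = datetime.now(timezone.utc)
    return (expiry - now).total_seconds()


# The URL quoted in the post above.
example = ("https://v-edge-4.thetvapp.to/hls/CNN.m3u8"
           "?token=J-lOBqOFbl5-vaG-QjqUsw&expires=1698055869")
print("Expires at:", token_expiry(example))
print("Seconds left:", seconds_until_expiry(example))
```

    A scheduled job could call seconds_until_expiry on the stored URL and only fetch a new one when the result is small or negative, rather than refreshing blindly every hour.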
  2. Hey, I know this is months later and you've probably figured this out already, but I thought I'd post what I did to get this automated.

    It grabs the updated m3u8 URL with its token and writes it to a text file.

    Below is a Python script I wrote for the MTV channel on that site.

    To edit this to work on different streams you'll need 2 URLs:

    1) The normal site stream link, which it'll search through: "https://thetvapp.to/tv/mtv-live-stream/"

    2) The start of the actual desired m3u8 URL: "thetvapp.to/live/streams/MTVEast.m3u8?token="

    Since this is formatted differently for each channel, we need to tell the program what it looks like. You can find it on your respective channel by:
    Opening Dev tools
    Going to the Network tab
    Refreshing the page
    Browsing through the results; there should be one with an m3u8 URL. Copy that URL up to the point where it specifies the token, as the token will change every time.



    import re
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from webdriver_manager.chrome import ChromeDriverManager

    def extract_desired_url(requests):
        # Search for the desired URL in the requests
        for request in requests:
            if "thetvapp.to/live/streams/MTVEast.m3u8?token=" in request:
                return request
        return None

    url = "https://thetvapp.to/tv/mtv-live-stream/"

    # Set Chrome options
    options = webdriver.ChromeOptions()
    options.add_argument("--headless") # To run Chrome in headless mode

    # Initialize the ChromeDriver service with the executable path
    service = webdriver.ChromeService(ChromeDriverManager().install())

    # Initialize Selenium WebDriver with the service and Chrome options
    driver = webdriver.Chrome(service=service, options=options)

    # Navigate to the URL
    driver.get(url)

    # Wait for the video player element to be present
    try:
        video_player = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CLASS_NAME, "video-player")))
        print("Video player loaded successfully.")
    except:
        pass # Do nothing if the video player is not found, the message will not be printed

    def get_get_requests():
        try:
            # Execute JavaScript to capture network requests
            requests = driver.execute_script("""
                var performance = window.performance || window.webkitPerformance || window.msPerformance || window.mozPerformance;
                if (!performance) {
                    return [];
                }
                var entries = performance.getEntriesByType("resource");
                var urls = [];
                for (var i = 0; i < entries.length; i++) {
                    urls.push(entries[i].name);
                }
                return urls;
            """)
            return requests
        except Exception as e:
            print("An error occurred:", e)
            return None
        finally:
            driver.quit()

    # Call the function to get GET requests
    get_requests = get_get_requests()

    # Extract the desired URL
    if get_requests:
        desired_url = extract_desired_url(get_requests)
        if desired_url:
            print("Desired URL found:", desired_url)
            # Write the desired URL to the file
            with open("mtvurl.txt", "w") as file:
                file.write(desired_url)
                print("Desired URL written to mtvurl.txt")
        else:
            print("No desired URL found in the requests.")
    else:
        print("No GET requests found.")
  3. Originally Posted by torahslut353 View Post
    Hey, I know this is months later and you've probably figured this out already, but I thought I'd post what I did to get this automated.

    I am not much of a programmer, but does this output to a file? I see where your code refers to the MTV channel, so if I wanted to add, say, HBO, which lines would I need to copy and change for HBO or any other channel? I would love to be able to take your script and extract all of the channels from thetvapp.to so that I can then import them into VLC.
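
    For the VLC import mentioned above, one possible approach is to collect name/URL pairs and write them out as an extended M3U playlist (a "#EXTM3U" header followed by "#EXTINF" entries). This is only a sketch: the function names, the output file name and the placeholder URLs are assumptions for illustration, and the real URLs would come from whatever script retrieves them.

```python
def build_m3u(channels):
    """Build extended M3U playlist text from a {name: url} mapping."""
    lines = ["#EXTM3U"]
    for name, url in channels.items():
        # -1 means "unknown duration", which is the usual choice for live streams.
        lines.append(f"#EXTINF:-1,{name}")
        lines.append(url)
    return "\n".join(lines) + "\n"


def write_m3u(channels, path="channels.m3u"):
    """Write the playlist to disk; the default file name is arbitrary."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(build_m3u(channels))


# Placeholder entries; these URLs are not real streams.
sample = {
    "MTV": "https://example.com/MTVEast.m3u8?token=PLACEHOLDER",
    "HBO": "https://example.com/HBO.m3u8?token=PLACEHOLDER",
}
playlist = build_m3u(sample)
print(playlist)
```

    The resulting file can be opened in VLC or loaded into most IPTV players. Because the tokens expire, the playlist would need to be regenerated whenever the URLs are refreshed.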
  4. I am attempting to run your code but I am getting the following error:

    DevTools listening on ws://127.0.0.1:54862/devtools/browser/c3ad0a70-bb0f-491e-9a65-7256718ce539
    An error occurred while loading the video player: Message:
    Stacktrace:
    GetHandleVerifier [0x00E48D03+51395]
    (No symbol) [0x00DB5F61]
    (No symbol) [0x00C6E13A]
    (No symbol) [0x00CA62BB]
    (No symbol) [0x00CA63EB]
    (No symbol) [0x00CDC162]
    (No symbol) [0x00CC3ED4]
    (No symbol) [0x00CDA570]
    (No symbol) [0x00CC3C26]
    (No symbol) [0x00C9C629]
    (No symbol) [0x00C9D40D]
    GetHandleVerifier [0x011C68D3+3712147]
    GetHandleVerifier [0x01205CBA+3971194]
    GetHandleVerifier [0x01200FA8+3951464]
    GetHandleVerifier [0x00EF9D09+776393]
    (No symbol) [0x00DC1734]
    (No symbol) [0x00DBC618]
    (No symbol) [0x00DBC7C9]
    (No symbol) [0x00DADDF0]
    BaseThreadInitThunk [0x765CFCC9+25]
    RtlGetAppContainerNamedObjectPath [0x777A7C5E+286]
    RtlGetAppContainerNamedObjectPath [0x777A7C2E+238]

    An error occurred: Message: javascript error: Invalid or unexpected token
    (Session info: chrome-headless-shell=122.0.6261.113)
    Stacktrace:
    GetHandleVerifier [0x00E48D03+51395]
    (No symbol) [0x00DB5F61]
    (No symbol) [0x00C6E13A]
    (No symbol) [0x00C72480]
    (No symbol) [0x00C7408D]
    (No symbol) [0x00CDAEAC]
    (No symbol) [0x00CC3E8C]
    (No symbol) [0x00CDA570]
    (No symbol) [0x00CC3C26]
    (No symbol) [0x00C9C629]
    (No symbol) [0x00C9D40D]
    GetHandleVerifier [0x011C68D3+3712147]
    GetHandleVerifier [0x01205CBA+3971194]
    GetHandleVerifier [0x01200FA8+3951464]
    GetHandleVerifier [0x00EF9D09+776393]
    (No symbol) [0x00DC1734]
    (No symbol) [0x00DBC618]
    (No symbol) [0x00DBC7C9]
    (No symbol) [0x00DADDF0]
    BaseThreadInitThunk [0x765CFCC9+25]
    RtlGetAppContainerNamedObjectPath [0x777A7C5E+286]
    RtlGetAppContainerNamedObjectPath [0x777A7C2E+238]

    No GET requests found.
    Press any key to continue . . .
  5. Originally Posted by Dimension02000 View Post
    I am attempting to run your code but I am getting the following error:

    If you're on Windows, go to your task manager's Details tab, look for any chromedriver.exe instance that might be running, right-click on it and choose "End process tree".
  6. Originally Posted by white_snake View Post
    Originally Posted by Dimension02000 View Post
    I am attempting to run your code but I am getting the following error:

    If you're on Windows, go to your task manager's Details tab, look for any chromedriver.exe instance that might be running, right-click on it and choose "End process tree".
    I did as you instructed but I am still not getting the expected results. Here is the latest output:

    DevTools listening on ws://127.0.0.1:38304/devtools/browser/d63eddb5-0b27-4fcb-a79f-80f6a124f774
    An error occurred while loading the video player: Message:
    Stacktrace:
    GetHandleVerifier [0x00854CE3+225091]
    (No symbol) [0x00784E31]
    (No symbol) [0x00629A7A]
    (No symbol) [0x0066175B]
    (No symbol) [0x0066188B]
    (No symbol) [0x00697882]
    (No symbol) [0x0067F5A4]
    (No symbol) [0x00695CB0]
    (No symbol) [0x0067F2F6]
    (No symbol) [0x006579B9]
    (No symbol) [0x0065879D]
    sqlite3_dbdata_init [0x00CC9A83+4064547]
    sqlite3_dbdata_init [0x00CD108A+4094762]
    sqlite3_dbdata_init [0x00CCB988+4072488]
    sqlite3_dbdata_init [0x009CC9E9+930953]
    (No symbol) [0x00790804]
    (No symbol) [0x0078AD28]
    (No symbol) [0x0078AE51]
    (No symbol) [0x0077CAC0]
    BaseThreadInitThunk [0x765CFCC9+25]
    RtlGetAppContainerNamedObjectPath [0x777A7C5E+286]
    RtlGetAppContainerNamedObjectPath [0x777A7C2E+238]

    An error occurred: Message: javascript error: Invalid or unexpected token
    (Session info: chrome-headless-shell=123.0.6312.58)
    Stacktrace:
    GetHandleVerifier [0x00854CE3+225091]
    (No symbol) [0x00784E31]
    (No symbol) [0x00629A7A]
    (No symbol) [0x0062DEB0]
    (No symbol) [0x0062FA76]
    (No symbol) [0x006965E2]
    (No symbol) [0x0067F55C]
    (No symbol) [0x00695CB0]
    (No symbol) [0x0067F2F6]
    (No symbol) [0x006579B9]
    (No symbol) [0x0065879D]
    sqlite3_dbdata_init [0x00CC9A83+4064547]
    sqlite3_dbdata_init [0x00CD108A+4094762]
    sqlite3_dbdata_init [0x00CCB988+4072488]
    sqlite3_dbdata_init [0x009CC9E9+930953]
    (No symbol) [0x00790804]
    (No symbol) [0x0078AD28]
    (No symbol) [0x0078AE51]
    (No symbol) [0x0077CAC0]
    BaseThreadInitThunk [0x765CFCC9+25]
    RtlGetAppContainerNamedObjectPath [0x777A7C5E+286]
    RtlGetAppContainerNamedObjectPath [0x777A7C2E+238]

    No GET requests found.
    Press any key to continue . . .
  7. Try to make sure any instance of chrome.exe using that same User Data folder is also closed (or just close any chrome.exe process) before running the script.
  8. Originally Posted by white_snake View Post
    Try to make sure any instance of chrome.exe using that same User Data folder is also closed (or just close any chrome.exe process) before running the script.
    Thanks for getting back to me so quickly.

    I reviewed my task manager ensuring that there were no Chrome processes running and again checked the details which did not show any Chrome running. To be sure that I did not miss anything I even rebooted my system but I am still not getting the expected results.

    Here is the latest output:

    DevTools listening on ws://127.0.0.1:11191/devtools/browser/ce5d1f30-f194-4856-9feb-8bbd1c71eb0a
    An error occurred while loading the video player: Message:
    Stacktrace:
    GetHandleVerifier [0x00614CE3+225091]
    (No symbol) [0x00544E31]
    (No symbol) [0x003E9A7A]
    (No symbol) [0x0042175B]
    (No symbol) [0x0042188B]
    (No symbol) [0x00457882]
    (No symbol) [0x0043F5A4]
    (No symbol) [0x00455CB0]
    (No symbol) [0x0043F2F6]
    (No symbol) [0x004179B9]
    (No symbol) [0x0041879D]
    sqlite3_dbdata_init [0x00A89A83+4064547]
    sqlite3_dbdata_init [0x00A9108A+4094762]
    sqlite3_dbdata_init [0x00A8B988+4072488]
    sqlite3_dbdata_init [0x0078C9E9+930953]
    (No symbol) [0x00550804]
    (No symbol) [0x0054AD28]
    (No symbol) [0x0054AE51]
    (No symbol) [0x0053CAC0]
    BaseThreadInitThunk [0x75FEFCC9+25]
    RtlGetAppContainerNamedObjectPath [0x77237C5E+286]
    RtlGetAppContainerNamedObjectPath [0x77237C2E+238]

    An error occurred: Message: javascript error: Invalid or unexpected token
    (Session info: chrome-headless-shell=123.0.6312.58)
    Stacktrace:
    GetHandleVerifier [0x00614CE3+225091]
    (No symbol) [0x00544E31]
    (No symbol) [0x003E9A7A]
    (No symbol) [0x003EDEB0]
    (No symbol) [0x003EFA76]
    (No symbol) [0x004565E2]
    (No symbol) [0x0043F55C]
    (No symbol) [0x00455CB0]
    (No symbol) [0x0043F2F6]
    (No symbol) [0x004179B9]
    (No symbol) [0x0041879D]
    sqlite3_dbdata_init [0x00A89A83+4064547]
    sqlite3_dbdata_init [0x00A9108A+4094762]
    sqlite3_dbdata_init [0x00A8B988+4072488]
    sqlite3_dbdata_init [0x0078C9E9+930953]
    (No symbol) [0x00550804]
    (No symbol) [0x0054AD28]
    (No symbol) [0x0054AE51]
    (No symbol) [0x0053CAC0]
    BaseThreadInitThunk [0x75FEFCC9+25]
    RtlGetAppContainerNamedObjectPath [0x77237C5E+286]
    RtlGetAppContainerNamedObjectPath [0x77237C2E+238]

    No GET requests found.
    Press any key to continue . . .
  9. Originally Posted by Dimension02000 View Post
    Originally Posted by white_snake View Post
    Try to make sure any instance of chrome.exe using that same User Data folder is also closed (or just close any chrome.exe process) before running the script.
    Thanks for getting back to me so quickly.

    I reviewed my task manager ensuring that there were no Chrome processes running and again checked the details which did not show any Chrome running. To be sure that I did not miss anything I even rebooted my system but I am still not getting the expected results.
    It looks like the script was written for older Selenium versions. I updated it a little bit, try this:

    Code:
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    
    url = "https://thetvapp.to/tv/mtv-live-stream/"
    
    
    def extract_desired_url(requests):
        # Search for the desired URL in the requests
        for request in requests:
            if "MTVEast.m3u8?token=" in request:
                return request
        return None
    
    
    def get_get_requests():
        try:
            # Execute JavaScript to capture network requests
            requests = driver.execute_script("""
                var performance = window.performance || window.webkitPerformance || window.msPerformance || window.mozPerformance;
                if (!performance) {
                    return [];
                }
                var entries = performance.getEntriesByType("resource");
                var urls = [];
                for (var i = 0; i < entries.length; i++) {
                    urls.push(entries[i].name);
                }
                return urls;
            """)
            return requests
        except Exception as e:
            print("An error occurred:", e)
            return None
        finally:
            driver.quit()
    
    
    # Set Chrome options
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # To run Chrome in headless mode
    
    # Initialize Selenium WebDriver with the service and Chrome options
    driver = webdriver.Chrome(options=options)
    
    # Navigate to the URL
    driver.get(url)
    
    # Wait for the video player element to be present
    try:
        video_player = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CLASS_NAME, "video-player")))
        print("Video player loaded successfully.")
    except:
        pass  # Do nothing if the video player is not found, the message will not be printed
    
    
    # Call the function to get GET requests
    get_requests = get_get_requests()
    
    # Extract the desired URL
    if get_requests:
        desired_url = extract_desired_url(get_requests)
        if desired_url:
            print("Desired URL found:", desired_url)
            # Write the desired URL to the file
            with open("mtvurl.txt", "w") as file:
                file.write(desired_url)
                print("Desired URL written to mtvurl.txt")
        else:
            print("No desired URL found in the requests.")
    else:
        print("No GET requests found.")
  10. It works!

    Thank you so much for your assistance in getting this to work. Now I just need to figure out how to get it to read all of the URLs from a file, or from the site itself, to acquire the needed URL with the token rather than having it hard-coded.
  11. I keep getting "No desired URL found in the requests." Is this Python script still working for everyone? No success so far.
  12. Originally Posted by Dimension02000 View Post
    It works!

    Thank you so much for your assistance in getting this to work. Now I just need to figure out how to get it to read all of the URLs from a file, or from the site itself, to acquire the needed URL with the token rather than having it hard-coded.
    That's the easy part:

    Code:
    import requests
    import re
    
    url = "https://thetvapp.to/tv"
    headers = {
        'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36'
    }
    response = requests.get(url, headers=headers )
    
    channels = re.findall(r'a href=\"/tv/(.*?)\"', response.text)
    
    for channel in channels:
        print('https://thetvapp.to/tv/' +str(channel))
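
    The regex above can return the same channel more than once if the page links to it repeatedly. A minimal sketch of deduplicating the matches while keeping their order, run against a made-up HTML sample so it works offline. The sample markup and the helper name channel_slugs are assumptions for illustration, not the site's actual page:

```python
import re

# Matches links such as <a href="/tv/cnn-live-stream/"> and captures the slug.
CHANNEL_LINK = re.compile(r'a href="/tv/([^"/]+)/?"')


def channel_slugs(html):
    """Return unique channel slugs in the order they first appear."""
    seen = {}
    for slug in CHANNEL_LINK.findall(html):
        seen.setdefault(slug, None)  # dicts preserve insertion order
    return list(seen)


# Hypothetical markup standing in for the downloaded page.
sample_html = (
    '<a href="/tv/cnn-live-stream/">CNN</a>'
    '<a href="/tv/mtv-live-stream/">MTV</a>'
    '<a href="/tv/cnn-live-stream/">CNN again</a>'
)

slugs = channel_slugs(sample_html)
for slug in slugs:
    print("https://thetvapp.to/tv/" + slug)
```

    In the script above, response.text would take the place of sample_html.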
  13. I updated it so it works again. I also made it ask what channel and then spit out the url.

    Code:
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    import time
    import re
    
    # Setup main options
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # To run Chrome in headless mode
    options.add_argument("--no-sandbox")
    options.add_argument("--disable-dev-shm-usage")
    user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36"
    options.add_argument(f"user-agent={user_agent}")
    driver = webdriver.Chrome(options=options)
    
    # First get all the live channels into a list
    homepage = "https://thetvapp.to/tv/"
    driver.get(homepage)
    channels = re.findall('a href=\"/tv/(.*?)/\"', driver.page_source)
    
    # Enumerate and print the list then wait for user to input a number
    for i, channel in enumerate(channels):
        print(i, channel.replace('-', ' '))
    while True:
        try:
            selection = int(input("Please select a channel number: "))
            if selection < 0 or selection >= len(channels):
                print(f"Please select a number between 0 and {len(channels) - 1}.")
                continue
        except ValueError:
            print("Sorry, numbers only.")
            continue
        else:
            break
    
    url = homepage + str(channels[selection])
    print(f'Scraping page for playlist at {url}')
    
    
    
    def extract_desired_url(requests):
        # Search for the desired URL in the requests
        for request in requests:
            if "m3u8?token=" in request:
                return request
        return None
    
    
    def get_get_requests():
        global driver
        try:
            # Execute JavaScript to capture network requests
            requests = driver.execute_script("""
                var performance = window.performance || window.webkitPerformance || window.msPerformance || window.mozPerformance;
                if (!performance) {
                    return [];
                }
                var entries = performance.getEntriesByType("resource");
                var urls = [];
                for (var i = 0; i < entries.length; i++) {
                    urls.push(entries[i].name);
                }
                return urls;
            """)
            return requests
        except Exception as e:
            print("An error occurred:", e)
            return None
    
    
    
    driver.get(url)
    time.sleep(1)  # Adjust this if needed - this is the wait for the player to receive the decoded url
    get_requests = get_get_requests()
    
    # Extract the desired URL
    if get_requests:
        desired_url = extract_desired_url(get_requests)
        if desired_url:
            print("Playlist URL found:", desired_url)
        else:
            print("No Playlist URL found in the requests.")
    else:
        print("No GET requests found.")
    
    driver.quit()
  14. Originally Posted by SpaceBallz View Post
    I updated it so it works again. I also made it ask what channel and then spit out the url.


    So were you able to successfully generate m3u files for thetvapp.to?
  15. It did work, but I think they might have changed the way the site works. I think the m3u only gets delivered after pressing play now. I'll look at the script again over the weekend.
  16. Had a quick look. They now wait for the play button to be pressed, which calls a URL like token/channelname. It needs a nicely crafted header to receive the m3u URL. I'll look again when I have more time; the data required is in the first load of the page.
    You can get the token URL from the individual channel page with:
    Code:
    driver.get(url)
    time.sleep(1) 
    chanpage = re.findall('data=\"/token/(.*?)\"', driver.page_source)
    newdata = "https://thetvapp.to/token/" + str(chanpage[0])
  17. start your journey from this code

    CNN
    Code:
    import re
    import requests
    
    start_seasson = requests.session()
    
    
    # get csrf-token
    headers1 = {
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
        'cache-control': 'max-age=0',
        'dnt': '1',
        'upgrade-insecure-requests': '1',
        'user-agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Mobile Safari/537.36',
    }
    
    
    webpage = start_seasson.get('https://thetvapp.to/tv/cnn-live-stream/', headers=headers1).text
    
    csrf_token = re.search(r'<meta name="csrf-token" content="(.*?)">', webpage).group(1)
    
    headers2 = {
        'content-type': 'application/json',
        'dnt': '1',
        'origin': 'https://thetvapp.to',
        'referer': 'https://thetvapp.to/',
        'user-agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Mobile Safari/537.36',
        'x-csrf-token': csrf_token,
    }
    
    json_data = {
        'KSCahyafDAfniqhjUdDlvpUB': 'GKeVZHYqyAKAjyWUapyLKKEctt',
    }
    
    hls = start_seasson.post('https://thetvapp.to/token/CNN', headers=headers2, json=json_data).text
    url_corrected = hls.replace("\\", "")
    print(url_corrected)
    
    start_seasson.close()
    Last edited by imr_saleh; 3rd Aug 2024 at 00:41.
  18. Originally Posted by imr_saleh View Post
    start your journey from this code

    Using this code directly gives a forbidden error.

    The json_data you have hardcoded in the script is dynamic. It's only generated after the play button is clicked, and I was unable to find where. If I load the page with my debugger open and click play, I can then see the payload data for my connection. If I put that key/value into your script, it works fine.

    I was using selenium.

    Code:
    wait = WebDriverWait(driver, 10)
    play_button = wait.until(EC.element_to_be_clickable((By.ID, 'loadVideoBtnOne')))
    play_button.click()
    I was able to load the page, get all the cookies and click the button, but like I said: where is the payload key/value created or stored?

    I'm still learning python and webscraping.
  19. Originally Posted by SpaceBallz View Post
    I'm still learning python and webscraping.
    If only there was a guide for webscraping somewhere...
  20. Originally Posted by SpaceBallz View Post
    Had a quick look. They now wait for the play button to be pressed, which calls a URL like token/channelname. It needs a nicely crafted header to receive the m3u URL. I'll look again when I have more time; the data required is in the first load of the page.
    You can get the token URL from the individual channel page with:
    Code:
    driver.get(url)
    time.sleep(1) 
    chanpage = re.findall('data=\"/token/(.*?)\"', driver.page_source)
    newdata = "https://thetvapp.to/token/" + str(chanpage[0])

    Hey, thanks. Luckily, found another source with Fortv that works really well.
  21. Originally Posted by SpaceBallz View Post
    Originally Posted by imr_saleh View Post
    start your journey from this code

    Using this code directly gives a forbidden error.

    The json_data you have hardcoded in the script is dynamic. It's only generated after the play button is clicked; however, I was unable to find it. If I load the page with my debugger open and click play, I can then see the payload data for my connection. If I put this key/value into your script, it works fine.

    I was using selenium.

    Code:
    wait = WebDriverWait(driver, 10)
    play_button = wait.until(EC.element_to_be_clickable((By.ID, 'loadVideoBtnOne')))
    play_button.click()
    I was able to load the page, get all the cookies and click the button, but like I said: where is the payload key:value created/stored?

    I'm still learning python and webscraping.


    Ah, it seems that not only has the x-csrf-token header expired, but it also needs a new payload.

    I had a look and found that the payload is generated via JavaScript.
    The parameter name (KSCahyafDAfniqhjUdDlvpUB) can be read directly from the code.
    The biggest challenge is how to generate its value (GKeVZHYqyAKAjyWUapyLKKEctt),
    because the JavaScript file is heavily obfuscated, which makes it difficult to analyze the logic directly. However, certain patterns, such as function calls and variable manipulations, can guide us to where the payload is built.

    Code:
    async function L5() {
        const n = U
          , t = {
            vSpJA: n(239),
            Wgwlp: n(288),
            vSNPf: "Network response was not ok "
        };
        try {
            const x = await fetch(as, {
                method: t[n(240)],
                headers: {
                    "Content-Type": t[n(259)],
                    "X-CSRF-TOKEN": cs()
                },
                body: JSON[n(295)]({
                    KSCahyafDAfniqhjUdDlvpUB: R5  
                })
            });
            if (!x.ok)
                throw new Error(t[n(263)] + x[n(219)]);
            return await x[n(228)]()
        } catch (x) {
            console.error(n(226), x)
        }
    }
    I'll continue to check how the key is generated.

    I prefer to use the code directly instead of using selenium webdriver
  22. 2nHxWW6GkN1l916N3ayz8HQoi
    Not completely happy with the result, but eh, could be better

    Code:
    import ast
    import re
    from html import unescape
    from http.cookies import SimpleCookie
    from time import sleep
    
    import requests
    from bs4 import BeautifulSoup
    
    BASE_URL = "https://thetvapp.to"
    PAYLOAD = None
    
    
    def evaluate(a, o, b):
        if o == "-":
            return int(a) - int(b)
        if o == "+":
            return int(a) + int(b)
        if o == "*":
            return int(a) * int(b)
        return int(a) // int(b)
    
    
    def get_key_values(page_soup):
        app_js = page_soup.find_all('script', src=True)
        app_js = [
            source['src'] for source in app_js
            if source['src'].endswith('.js') and 'app' in source['src'].split("/")[-1]
        ][0]
    
        app_js = requests.get(app_js).content.decode()
        fixed_js_key = re.findall(
            r'headers:{[^{}]*"X-CSRF-TOKEN"[^{}]*},body:[^{}]*{([^{}]*)}',
            app_js
        )[0].split(":")[0]
    
        all_operations = re.findall(
            r'const\s[^=\s]+=\[];([^;]+);',
            app_js
        )
        possible_key_values = []
    
        for list_operations in all_operations:
            if "{" in list_operations or "}" in list_operations:
                continue
            if '](' not in list_operations or '),' not in list_operations:
                continue
    
            function_name = list_operations.split("[")[1].split("(")[0]
            function_name = re.findall(
                fr'const\s{function_name}=([^;]+);',
                app_js
            )[0]
    
            offset_operation = re.findall(
                r"function {function_name}\(.*?\){(.*?)}".replace("{function_name}", function_name),
                app_js,
                re.DOTALL
            )[0]
    
            function_name = re.findall(
                r'const\s[^=]+=([^()]+)\(',
                offset_operation
            )[0]
    
            offset_operation = re.findall(
                r"(\w+)=\1([-+*/])(\d+),",
                offset_operation
            )[0]
    
            list_operations = list_operations.split(",")
            list_operations = [
                re.findall(r'\(([^()]+)\)', op)[-1].replace('"', "").replace("'", "")
                for op in list_operations
            ]
    
            fixed_js_words = re.findall(
                r"function {function_name}\(.*?\){.*?(\[.*?\]);return\s.*?}".replace(
                    "{function_name}",
                    function_name
                ),
                app_js,
                re.DOTALL
            )[0]
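            try:
                fixed_js_words = ast.literal_eval(fixed_js_words)
            except (ValueError, TypeError, SyntaxError):
                # not a plain list literal, so it cannot be the word list
                continue
    
            try:
                max_op_len = len(max(list(filter(lambda o: not o.isdigit(), list_operations)), key=len))
            except ValueError:
                # max() got an empty sequence: every operation is a numeric index
                max_op_len = None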
    
            for _ in range(0, len(fixed_js_words) - 1):
                current_key_value = []
                fail_key_value = False
    
                for operation in list_operations:
                    if operation.isdigit():
                        operation = evaluate(operation, offset_operation[1], offset_operation[2])
                        operation = fixed_js_words[operation]
    
                    if not bool(re.match(r'^[a-zA-Z]+$', operation)):
                        fail_key_value = True
                        break
                    elif max_op_len is not None and len(operation) > 2 * max_op_len:
                        fail_key_value = True
                        break
                    elif len("".join(current_key_value)) > 2 * len(fixed_js_key):
                        fail_key_value = True
                        break
                    current_key_value.append(operation)
    
                if len(current_key_value) == 0:
                    fail_key_value = True
                elif len("".join(current_key_value)) < len(fixed_js_key):
                    fail_key_value = True
                elif len(min(current_key_value, key=len)) * 2 < len(max(current_key_value, key=len)):
                    fail_key_value = True
    
                if not fail_key_value:
                    current_key_value = "".join(current_key_value)
                    possible_key_values.append(current_key_value)
    
                fixed_js_words.append(fixed_js_words.pop(0))
    
        return {
            "key": fixed_js_key,
            "value": possible_key_values
        }
    
    
    def get_m3u8(source_url):
        global PAYLOAD
        response = requests.get(source_url)
        soup = BeautifulSoup(response.text, 'html.parser')
    
        csrf_token = soup.find_all('meta', attrs={'name': 'csrf-token'})[0]["content"]
        get_m3u8_endpoint = soup.find_all("div", attrs={"id": "get-m3u8-link"})[0]["data"]
        if not get_m3u8_endpoint.startswith(BASE_URL):
            get_m3u8_endpoint = f'{BASE_URL}{get_m3u8_endpoint}'
    
        response = dict(response.headers)
        cookies = SimpleCookie()
        cookies.load(response["set-cookie"])
        app_session = {k: v.value for k, v in cookies.items()}["thetvapp_session"]
    
        payload = PAYLOAD
        if payload is None:
            payload = get_key_values(soup)
    
        for key_value in payload["value"]:
            js_key = payload["key"]
            response = requests.post(
                get_m3u8_endpoint,
                cookies={'thetvapp_session': app_session},
                headers={'X-CSRF-TOKEN': csrf_token},
                json={js_key: key_value}
            )
    
            if response.status_code == 200:
                PAYLOAD = {
                    "key": js_key,
                    "value": [key_value]
                }
                return response.json()
            sleep(0.5)
    
        print("Failed to obtain the m3u8 with any payload... Debug the script")
        # exit() raises SystemExit, which the bare except in the main loop would swallow anyway;
        # raise a regular error so the caller can report the channel and move on
        raise RuntimeError("no payload candidate was accepted")
    
    
    if __name__ == '__main__':
        r = requests.get(BASE_URL)
        s = BeautifulSoup(r.text, 'html.parser')
    
        links = s.find_all('a', class_='list-group-item')
        index = 0
        for link in links:
            href = link.get('href')
            if not href or not href.startswith('/tv/'):
                continue
    
            href = f"{BASE_URL}{href}"
            text = unescape(link.text)
            index += 1
    
            try:
                print(index, text, get_m3u8(href))
            except Exception:
                print(index, "possible vpn issues: ", href)
    Code:
    1 A&E https://v1.thetvapp.to/hls/AEEast/index.m3u8?token=YnRka1dnbkx2Uko1eUw5bzU0MUlBbHJEdjBRQTJNNmxCWnBZWENIeA==
    2 ACC Network https://v1.thetvapp.to/hls/ACCNetwork/index.m3u8?token=RTRHNUNiQ0VxZWtMSmIyQXlPcG1MbkRpb1RsUHJ2b3c1WTNhakkxMQ==
    3 AMC https://v1.thetvapp.to/hls/AMCEast/index.m3u8?token=VWFiSGNjMkFLUlM5a085ekMwU1pBMzFmMU1qSDBZRVRINllURHBkTw==
    .
    .
    .
    30 Disney XD https://v3.thetvapp.to/hls/DisneyXDEast/index.m3u8?token=UTRabEd6QUx5bmFCUmNCTU5VOVNyam1LYjhvbEZVRXJuQTMwY2hMWg==
    31 E! https://v3.thetvapp.to/hls/EEast/index.m3u8?token=ckViSWxqYnk5cnVYNGd0S3g3TmdVQUI3Vk5DdGtheFdsdk82S3A0Rw==
    32 possible vpn issues:  https://thetvapp.to/tv/espn-live-stream/
    33 possible vpn issues:  https://thetvapp.to/tv/espn2-live-stream/
    34 ESPNews https://v3.thetvapp.to/hls/ESPNews/index.m3u8?token=ZHJIMGVHeVRoYk0yenZReDBQUnVlRjZoOTFwTGZEekZPaUNnNERDMQ==
    35 ESPNU https://v3.thetvapp.to/hls/ESPNU/index.m3u8?token=SzYyTlhhV0l0RWw4OTdTeUlKR0xZcGJRUkVWT0hZVHFUM0hxem9MRg==
    .
    .
    .
    113 WE tv https://v2.thetvapp.to/hls/WeTVEast/index.m3u8?token=Q2FrNFpQOW5JODlNdlg0ODBIZ2h5TmhQRVlrUk9LR3NwN2lZMDhmcA==
    114 WNBC (New York) NBC East https://v2.thetvapp.to/hls/WNBCDT1/index.m3u8?token=d2JyMEJjTXBpZjlyRUwya3Uxa0ZQN2NFUlNvNnF3ZE5DMVI0SzF5eQ==
    115 WNYW (New York) FOX East https://v3.thetvapp.to/hls/WNYWDT1/index.m3u8?token=a1pveTNzTEZ2YzZUTmNQcTdvWDJuUE5TVW1HMEpPMGthWUxiVE9Hcw==
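    Since the original goal of the thread was a playlist for IPTV software, here is a minimal sketch of turning name/URL pairs like the ones printed above into an extended M3U playlist. The function name `build_m3u` and the example.com URLs are placeholders of mine, not part of the script, and whether a given player accepts the tokenized URLs is untested.

```python
def build_m3u(channels):
    """Build an extended M3U playlist from (name, url) pairs."""
    lines = ["#EXTM3U"]
    for name, url in channels:
        # -1 is the usual duration value for live streams
        lines.append(f"#EXTINF:-1,{name}")
        lines.append(url)
    return "\n".join(lines) + "\n"


# Placeholder data; in practice these would be the pairs collected by the script.
channels = [
    ("Channel One", "https://example.com/one/index.m3u8"),
    ("Channel Two", "https://example.com/two/index.m3u8"),
]

playlist = build_m3u(channels)
print(playlist)
```

    Writing `playlist` to a .m3u file on every run would replace the expired tokens in one go.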
    Last edited by 2nHxWW6GkN1l916N3ayz8HQoi; 8th Aug 2024 at 09:33.
  23. Originally Posted by 2nHxWW6GkN1l916N3ayz8HQoi View Post
    Not completely happy with the result, but eh, could be better

    Code:
    import ast
    import re
    from html import unescape
    from http.cookies import SimpleCookie
    
    import requests
    from bs4 import BeautifulSoup
    
    BASE_URL = "https://thetvapp.to"
    CHECK_KEYWORDS = ["Network response was not ok ", "my-jwplayer"]
    
    
    def get_m3u8(source_url):
        response = requests.get(source_url)
        soup = BeautifulSoup(response.text, 'html.parser')
    
        app_js = soup.find_all('script', src=True)
        app_js = [
            s['src'] for s in app_js
            if s['src'].endswith('.js') and 'app' in s['src'].split("/")[-1]
        ][0]
    
        app_js = requests.get(app_js).content.decode()
        key = re.findall(
            r'headers:{[^{}]*"X-CSRF-TOKEN"[^{}]*},body:[^{}]*{([^{}]*)}',
            app_js
        )[0].split(":")[0]
    
        operations = re.findall(
            r'const\s[^=\s]+=\[];([^;]+);',
            app_js
        )
    
        operations = sorted(operations, key=lambda o: o.count(","), reverse=True)[0].split(",")
        operations = [
            re.findall(r'\(([^()]+)\)', o)[-1].replace('"', "").replace("'", "")
            for o in operations
        ]
    
        num = [elem for elem in operations if elem.isdigit()]
        sort_num = sorted(num, key=int, reverse=True)
        index_map = {elem: sort_num.index(elem) for elem in num}
        operations = [(elem, index_map[elem]) if elem.isdigit() else (elem, -1) for elem in operations]
    
        matches = re.findall(
            r"\[\s*(?:[^][]*|\[(?:[^][]*|\[[^]]*])*])*\s*]",
            app_js
        )
        matches = sorted(matches, key=lambda m: len(m), reverse=True)
    
        list_words = None
        for match in matches:
            if match[0] != "[" or match[-1] != "]":
                continue
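            try:
                match = ast.literal_eval(match)
            except (ValueError, TypeError, SyntaxError):
                continue
            # accept only a list made up entirely of strings that contains every known keyword
            if not all(type(m) is str for m in match):
                continue
            if not all(word in match for word in CHECK_KEYWORDS):
                continue
            list_words = match
            break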
    
        list_words = list(reversed(list_words))
        list_words = [l for l in list_words if len(l) <= 3]
    
        key_value = []
        for v, i in operations:
            if i >= 0:
                i = list_words[i]
            else:
                i = v
            key_value.append(i)
    
        key_value = "".join(key_value)
    
        csrf_token = soup.find_all('meta', attrs={'name': 'csrf-token'})[0]["content"]
        get_m3u8_endpoint = soup.find_all("div", attrs={"id": "get-m3u8-link"})[0]["data"]
        if not get_m3u8_endpoint.startswith(BASE_URL):
            get_m3u8_endpoint = f'{BASE_URL}{get_m3u8_endpoint}'
    
        response = dict(response.headers)
        cookies = SimpleCookie()
        cookies.load(response["set-cookie"])
        app_session = {k: v.value for k, v in cookies.items()}["thetvapp_session"]
    
        response = requests.post(
            get_m3u8_endpoint,
            cookies={'thetvapp_session': app_session},
            headers={'X-CSRF-TOKEN': csrf_token},
            json={key: key_value}
        )
        m3u8_url = response.json()
        return m3u8_url
    
    
    if __name__ == '__main__':
        response = requests.get(BASE_URL)
        soup = BeautifulSoup(response.text, 'html.parser')
    
        links = soup.find_all('a', class_='list-group-item')
        index = 0
        for link in links:
            href = link.get('href')
            if not href or not href.startswith('/tv/'):
                continue
    
            href = f"{BASE_URL}{href}"
            text = unescape(link.text)
            index += 1
    
            try:
                print(index, text, get_m3u8(href))
            except Exception:
                print("possible vpn issues: ", href)
    Code:
    1 A&E https://v1.thetvapp.to/hls/AEEast/index.m3u8?token=YnRka1dnbkx2Uko1eUw5bzU0MUlBbHJEdjBRQTJNNmxCWnBZWENIeA==
    2 ACC Network https://v1.thetvapp.to/hls/ACCNetwork/index.m3u8?token=RTRHNUNiQ0VxZWtMSmIyQXlPcG1MbkRpb1RsUHJ2b3c1WTNhakkxMQ==
    3 AMC https://v1.thetvapp.to/hls/AMCEast/index.m3u8?token=VWFiSGNjMkFLUlM5a085ekMwU1pBMzFmMU1qSDBZRVRINllURHBkTw==
    .
    .
    .
    30 Disney XD https://v3.thetvapp.to/hls/DisneyXDEast/index.m3u8?token=UTRabEd6QUx5bmFCUmNCTU5VOVNyam1LYjhvbEZVRXJuQTMwY2hMWg==
    31 E! https://v3.thetvapp.to/hls/EEast/index.m3u8?token=ckViSWxqYnk5cnVYNGd0S3g3TmdVQUI3Vk5DdGtheFdsdk82S3A0Rw==
    possible vpn issues:  https://thetvapp.to/tv/espn-live-stream/
    possible vpn issues:  https://thetvapp.to/tv/espn2-live-stream/
    34 ESPNews https://v3.thetvapp.to/hls/ESPNews/index.m3u8?token=ZHJIMGVHeVRoYk0yenZReDBQUnVlRjZoOTFwTGZEekZPaUNnNERDMQ==
    35 ESPNU https://v3.thetvapp.to/hls/ESPNU/index.m3u8?token=SzYyTlhhV0l0RWw4OTdTeUlKR0xZcGJRUkVWT0hZVHFUM0hxem9MRg==
    .
    .
    .
    113 WE tv https://v2.thetvapp.to/hls/WeTVEast/index.m3u8?token=Q2FrNFpQOW5JODlNdlg0ODBIZ2h5TmhQRVlrUk9LR3NwN2lZMDhmcA==
    114 WNBC (New York) NBC East https://v2.thetvapp.to/hls/WNBCDT1/index.m3u8?token=d2JyMEJjTXBpZjlyRUwya3Uxa0ZQN2NFUlNvNnF3ZE5DMVI0SzF5eQ==
    115 WNYW (New York) FOX East https://v3.thetvapp.to/hls/WNYWDT1/index.m3u8?token=a1pveTNzTEZ2YzZUTmNQcTdvWDJuUE5TVW1HMEpPMGthWUxiVE9Hcw==


    I was struggling for 6 hrs trying to figure out how the JavaScript works.

    I didn't know about the reversed words.
    Code:
    list_words = list(reversed(list_words))
    list_words = [l for l in list_words if len(l) <= 3]

    What a genius, widefrog.
    Great work.
  24. It was all in the JavaScript code, and I was going to share it if he hadn’t. I’m not sure why you couldn’t figure it out—maybe you’re still getting familiar with JavaScript? Either way, he did an amazing job.
    discord=notaghost9997
  25. 2nHxWW6GkN1l916N3ayz8HQoi
    Originally Posted by imr_saleh View Post
    I was struggling for 6 hrs trying to figure out how the JavaScript works.

    I didn't know about the reversed words.
    Since you seem like a nice fella who genuinely likes learning and scripting, I'm gonna explain my line of thought that led me to that mediocre solution. After all, what's the point of all these fancy scripts if people can't write new ones when they're gonna inevitably fail after a few days/weeks, especially if the site dev is lurking like a rat somewhere.

    I'm gonna skip over the data scraping basics since you know them probably, and they have also been explained on videohelp forum guides. Since the hardest challenge is obtaining that magic payload pair, key/value, I'm gonna focus on it. The problem is gonna be split into 2 smaller issues, the key and the value.

    By using the HAR trick on that key (in my case it is "amOJQwpfeNEMtHDipfKCfmshvqSZ"), you can instantly find it in a JS file. I'm gonna use a formatted JS source code on Chrome to showcase the code snippets.

    Code:
    await t[n(333)](fetch, i1, {
            method: t[n(369)],
            headers: {
                "Content-Type": n(339),
                "X-CSRF-TOKEN": c1()
            },
            body: JSON[n(340)]({
                amOJQwpfeNEMtHDipfKCfmshvqSZ: S5
            })
        })
    
    .... or ....
    
    const x = await fetch(i1, {
                method: t[n(326)],
                headers: {
                    "Content-Type": n(339),
                    "X-CSRF-TOKEN": c1()
                },
                body: JSON.stringify({
                    amOJQwpfeNEMtHDipfKCfmshvqSZ: C5
                })
            });
    Since that value "amOJQwpfeNEMtHDipfKCfmshvqSZ" is magic, you're gonna have to extract it as well. You could make a regex that picks strings 28 characters long, but that's horrible since you could also pick up false matches. Instead, try to find a fixed anchor point where you can start building a regex pattern. Searching (ctrl+f) for "X-CSRF-TOKEN" in that JS file brings you back to the same code snippets, so now you have a fixed point from which you can develop your regex. It's up to you how you do it. Just keep in mind: don't build your regex based on the formatted code. Look at the raw code (which is one long continuous string) once you know what you wanna do.
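    To make the anchor idea concrete, here is a small sketch that runs the same kind of pattern the script in post #22 uses against a one-line sample shaped like the raw JS. The sample string is retyped from the formatted snippet above, and the helper name `extract_payload_key` is made up for illustration; the real file may be laid out differently.

```python
import re

# One-line sample in the shape of the raw (unformatted) JS shown above.
RAW_JS = (
    'const x=await fetch(i1,{method:t[n(326)],headers:{"Content-Type":n(339),'
    '"X-CSRF-TOKEN":c1()},body:JSON.stringify({amOJQwpfeNEMtHDipfKCfmshvqSZ:C5})});'
)

# Anchor on the fixed "X-CSRF-TOKEN" header, then capture the body object after it.
PAYLOAD_RE = re.compile(
    r'headers:\{[^{}]*"X-CSRF-TOKEN"[^{}]*\},body:[^{}]*\{([^{}]*)\}'
)


def extract_payload_key(js_source):
    """Return (key, variable_name) from the payload object next to the anchor."""
    match = PAYLOAD_RE.search(js_source)
    if match is None:
        raise ValueError("anchor not found; the JS layout probably changed")
    key, _, variable = match.group(1).partition(":")
    return key, variable


print(extract_payload_key(RAW_JS))
# prints ('amOJQwpfeNEMtHDipfKCfmshvqSZ', 'C5')
```

    The key is whatever sits left of the colon; the right side only names the variable that holds the value, which is the next thing to chase.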

    Now the hard part is the value. A search in the HAR file brings no results, so maybe it's hidden in some encoding??? Before doing drastic things and checking all requests manually in the network tab, let's just take a look at the snippets. It seems to be building a payload, and the value of that fixed key should be the one we're looking for. I'm gonna focus on the 2nd code snippet (const x = blabla) and place a debugger breakpoint there.

    [Attachment 81234]


    Jackpot. So the value is also found there (its content might be different for you). The question now becomes where it's taken from / how it's generated. That "C5" is not a function call, but a variable. By going back a little and looking at the biggest function that encapsulates it all, we get this:

    Code:
    async function O5() {
        const n = W
          , t = {
            fcVHD: n(337),
            HtOev: function(x, r) {
                return x + r
            }
        };
        try {
            const x = await fetch(i1, {
                method: t[n(326)],
                headers: {
                    "Content-Type": n(339),
                    "X-CSRF-TOKEN": c1()
                },
                body: JSON.stringify({
                    amOJQwpfeNEMtHDipfKCfmshvqSZ: C5
                })
            });
    So "C5" is neither a variable created in the function, nor a received parameter. That means it's created somewhere outside the function in the JS file. That's good. Progress. Since it is a variable, it has to be assigned something somewhere. Click on the first line of the JS file and search for the first appearance of "C5 = ", which works because the JS file is formatted (enable case-sensitive search). You can also place a debugger breakpoint on the line where you find something and refresh the page.

    [Attachment 81235]


    Seems like the value of m0 is a list and its contents can be appended into 1 big string that represents the key value we want. I'm gonna completely ignore what that function "Co" is doing since it's useless. We already know what it's supposed to do. So, from "C5" we go to "m0" which is another variable, only now it is a list, not a string, but equivalent nonetheless. By searching from the start of the js file for "m0 = " we get

    Code:
    const m0 = [];
    m0.push(W(396)),
    m0[W(343)](W(376)),
    m0[W(343)](W(362)),
    m0[W(343)]("tkm"),
    m0[W(343)]("gvv"),
    m0[W(343)](W(325)),
    m0[W(343)]("tO"),
    m0[W(343)]("yS"),
    m0[W(343)]("my"),
    m0[W(343)]("ZO");
    const Rx = [];
    ...
    We can already see some fixed parts of the known string there: "tkm", "gvv", etc. But what is W() supposed to do? Also, what is m0[W(343)] even doing? In JavaScript, m0 is an object. As with all objects, you can access its methods the way you would access a value in a dictionary, by using keys. So W(343) is a string that is supposed to name a method of the list object. Since m0 is a list and the JS code is building that list, then obviously W(343) should represent the "push" method. But we're gonna confirm that later. Let's place a debugger breakpoint at the first push and see where the call of W() leads us:

    Code:
    const W = dt;
    function dt(n, t) {
        const x = gi();
        return dt = function(r, e) {
            return r = r - 319,
            x[r]
        }
        ,
        dt(n, t)
    }
    We enter into a "dt" function. Weirdly, it's not called "W" but you can already see a little higher that the function is assigned to a variable. So it makes sense now. The "dt" function seems to be doing mostly things on its own without external variables, except gi(). Let's continue the debugging and see where gi() leads us.

    Code:
    function gi() {
        const n = ["my-jwplayer", ... blabla ...,"4xIvEzO", "TGL", "OFlaR"];
        return gi = function() {
            return n
        }
        ,
        gi()
    }
    Jackpot. The value of n is just a list of constant strings and the gi function is not using anything else external. This stops being a data scraping issue and now becomes a problem of understanding the logic flow, since you have all the necessary code snippets and they don't use anything else external. This is no longer a "you don't know what you don't know" scenario, which is good.
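    As a rough sketch of how that constant list could be pulled out of the raw JS: anchor on the function name, capture the array literal, and let `ast.literal_eval` turn it into a Python list. The sample string below is a shortened, invented stand-in for gi() (the real list is much longer), and `extract_word_list` is a made-up helper name.

```python
import ast
import re

# Shortened stand-in for the raw gi() function; the strings are illustrative.
RAW_JS = (
    'function gi(){const n=["my-jwplayer","push","TGL","OFlaR"];'
    'return gi=function(){return n},gi()}'
)


def extract_word_list(js_source, function_name):
    """Return the string array defined at the top of `function_name`."""
    pattern = (
        r"function\s+" + re.escape(function_name)
        + r"\(\)\s*\{\s*const\s+\w+\s*=\s*(\[.*?\]);\s*return"
    )
    match = re.search(pattern, js_source, re.DOTALL)
    if match is None:
        raise ValueError("word list not found; the JS layout probably changed")
    # A JS array of plain quoted strings is also a valid Python list literal.
    return ast.literal_eval(match.group(1))


print(extract_word_list(RAW_JS, "gi"))
# prints ['my-jwplayer', 'push', 'TGL', 'OFlaR']
```

    This only holds while the array contains plain quoted strings; JS-only escapes or nested expressions would need a different parser.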

    Understanding the flow of the found code is now a matter of experience. You could try your luck with ChatGPT but for most other sites you're gonna have to know Javascript and have some coding knowledge. Now we're lucky that the site dev didn't bother obfuscating it that much. In short, the function W() returns elements from the fixed "n" list by receiving an index. That index also has an offset since you have the operation "r - 319".
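    In Python terms, the lookup boils down to something like the sketch below. The offset 319 comes from the dt() snippet; the word list is a placeholder where only the "push" slot is filled in on purpose, and everything else is invented.

```python
OFFSET = 319  # from `r = r - 319` in dt()

# Placeholder word list: the real one is returned by gi().
words = [f"word{i}" for i in range(80)]
words[343 - OFFSET] = "push"  # the slot deduced from m0[W(343)]


def lookup(index, offset=OFFSET):
    """Mimic W(index): subtract the offset, then index into the word list."""
    return words[index - offset]


print(lookup(343))  # prints push
print(lookup(319))  # prints word0
```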

    343 - 319 = 24 => W(343) is actually "push" which is what we deduced previously. If you had access to the fixed string list, the offset operation, and the push operation, you could build the list yourself. The offset operation can be ignored if you realize that the js code is not gonna use ALL of the possible strings from the word list. At first glance, it just looks like the only ones used have a length <= 3. So the problem becomes about data scraping again: how to get the fixed list and the list of operations. Both can be done by doing the same thing. Find an anchor and start building your regex.
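    Here is a sketch of the "build the list yourself" part: parse the chain of push calls from the raw JS, then resolve each argument. The raw string is the m0 snippet from above collapsed onto one line; the `TOY_WORDS` entries are invented stand-ins for what W() would return, and both helper names are hypothetical.

```python
import re

# The m0 snippet from above, collapsed into its raw one-line form.
RAW_JS = (
    'const m0=[];m0.push(W(396)),m0[W(343)](W(376)),m0[W(343)](W(362)),'
    'm0[W(343)]("tkm"),m0[W(343)]("gvv"),m0[W(343)](W(325)),m0[W(343)]("tO"),'
    'm0[W(343)]("yS"),m0[W(343)]("my"),m0[W(343)]("ZO");const Rx=[];'
)

# Invented results for the W() calls; the real strings come from the word list.
TOY_WORDS = {396: "AAA", 376: "BBB", 362: "CCC", 325: "DDD"}


def parse_push_chain(js_source, list_name):
    """Return the argument of every push call on `list_name`, in order."""
    pattern = r"const\s+" + re.escape(list_name) + r"=\[\];([^;]+);"
    chain = re.search(pattern, js_source)
    if chain is None:
        raise ValueError("push chain not found; the JS layout probably changed")
    arguments = []
    for call in chain.group(1).split(","):
        # the last innermost (...) group of each call is the pushed argument
        argument = re.findall(r"\(([^()]+)\)", call)[-1]
        arguments.append(argument.strip("\"'"))
    return arguments


def build_value(arguments, lookup):
    """Join literals as-is and resolve numeric arguments through `lookup`."""
    return "".join(lookup(int(a)) if a.isdigit() else a for a in arguments)


args = parse_push_chain(RAW_JS, "m0")
print(args)
print(build_value(args, TOY_WORDS.__getitem__))
```

    With a real word list, `lookup` would be the offset function instead of a dict, which is exactly the part that breaks once the list is shuffled at load time.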

    Now, what exactly don't I like about this approach? It's kinda hardcoded, and if the JS code is obfuscated even more, you're gonna have to change it. I think in this case, a selenium approach might be better.

    Originally Posted by notaghost View Post
    I was going to share it if he hadn’t
    You can always post it if you want. I doubt people are gonna mind. I still think it could be further generalized.
  26. Originally Posted by 2nHxWW6GkN1l916N3ayz8HQoi View Post

    Originally Posted by notaghost View Post
    I was going to share it if he hadn’t
    You can always post it if you want. I doubt people are gonna mind. I still think it could be further generalized.
    I’m not sure what’s left to share.
  27. Originally Posted by notaghost View Post
    It was all in the JavaScript code, and I was going to share it if he hadn’t. I’m not sure why you couldn’t figure it out—maybe you’re still getting familiar with JavaScript? Either way, he did an amazing job.
    I'm not familiar with JS, not as much as with Python, which is what I focus on, but at least I try.


    Originally Posted by 2nHxWW6GkN1l916N3ayz8HQoi View Post

    Since you seem like a nice fella who genuinely likes learning and scripting, I'm gonna explain my line of thought that led me to that mediocre solution. After all, what's the point of all these fancy scripts if people can't write new ones when they're gonna inevitably fail after a few days/weeks, especially if the site dev is lurking like a rat somewhere.
    Thank you for the detailed explanation. The script has stopped working, though. Maybe the site developers use new tricks every day. Anyway, as you said, a selenium approach might be better in this case.
    Last edited by imr_saleh; 5th Aug 2024 at 22:53.
  28. 2nHxWW6GkN1l916N3ayz8HQoi
    Originally Posted by imr_saleh View Post
    Thank you for the detailed explanation. The script has stopped working, though. Maybe the site developers use new tricks every day. Anyway, as you said, a selenium approach might be better in this case.
    Doesn't surprise me. There's this piece of code now
    Code:
    (function(n, t) {
        const x = _x
          , r = n();
        for (; []; )
            try {
                if (parseInt(x(457)) / 1 + parseInt(x(464)) / 2 + -parseInt(x(477)) / 3 * (parseInt(x(491)) / 4) + -parseInt(x(452)) / 5 + parseInt(x(448)) / 6 + parseInt(x(456)) / 7 * (-parseInt(x(454)) / 8) + parseInt(x(461)) / 9 * (parseInt(x(480)) / 10) === t)
                    break;
                r.push(r.shift())
            } catch {
                r.push(r.shift())
            }
    }
    )(bc, 191828);
    that modifies the "fixed" list of strings. So you also need to data scrape how it's modified, and also the offset operation. It is possible to fix but it's a silly game. Extracting a fixed value from a JS file is one thing; "extracting" the logic flow from a JS file is another kind of problem. Better to just go for selenium until they introduce captchas. ¯\_(ツ)_/¯
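    For the curious, the shuffle itself is simple once you see it: keep moving the first element to the end until a checksum built from parseInt() of a few entries hits the target number. A toy sketch of that loop follows; the list, checksum and target are invented, and `parse_int` is only a rough stand-in for JS parseInt. In the real code the checksum reads the entries through the offset function and has ten terms.

```python
import math
import re


def parse_int(text):
    """Rough stand-in for JS parseInt: leading integer, or NaN if there is none."""
    match = re.match(r"\s*[-+]?\d+", text)
    return int(match.group()) if match else math.nan


def rotate_until(words, checksum, target):
    """Rotate left, like r.push(r.shift()), until checksum(words) == target."""
    words = list(words)
    for _ in range(len(words)):
        # NaN never equals the target, so a bad rotation just falls through
        if checksum(words) == target:
            return words
        words.append(words.pop(0))
    raise ValueError("no rotation matches the target")


def toy_checksum(words):
    # two-term toy version of the long parseInt expression in the JS
    return parse_int(words[0]) / 1 + parse_int(words[1]) / 2


shuffled = ["tkm", "push", "10abc", "4xyz"]
print(rotate_until(shuffled, toy_checksum, 12))
# prints ['10abc', '4xyz', 'tkm', 'push']
```

    So on top of the word list and the offset, a scraper would also have to pull out the checksum expression and the target (191828 in the snippet above), which is why this turns into a silly game.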

    If there's one thing to learn from this, it's how to debug a JS file and what to look for.

    Edit: Just to entertain that silly idea for a second, I edited the previous script from post #22. I wonder what's gonna break now.
    Last edited by 2nHxWW6GkN1l916N3ayz8HQoi; 6th Aug 2024 at 08:07.


