There are batch files to download video and audio streams, but I haven't seen any for subtitle streams.
Writing out yt-dlp command lines just to grab subtitles can take a long time.
To make things simpler and faster, I need a batch file with two functions, just like the video/audio downloading batch files:
1- list all the subtitles available in a given MPD link
2- choose subtitles to download and save to a specific folder location
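In yt-dlp terms, I imagine the two functions boiling down to something like this (a rough, untested sketch; the MPD link and the folder path are just placeholders):
Code:
:: 1- list the subtitle tracks in the manifest
yt-dlp --list-subs "https://example.com/stream.mpd"

:: 2- save the chosen track(s) to a folder, skipping the video and audio
yt-dlp --skip-download --write-subs --sub-langs en -P "C:\Subtitles" "https://example.com/stream.mpd"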
-
Using a batch will be an extra step (unless you want them all).
You will have to use yt-dlp to list the available subtitle formats, then select the ones you want, then write the bat.
https://github.com/yt-dlp/yt-dlp#subtitle-options
Why don't you just incorporate the subs download along with the video/audio you want to download?
It's easy to do, and anyone will help you with the syntax.
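For example, something along these lines (the URL and the language code are placeholders, and --embed-subs needs ffmpeg on PATH) grabs the video, audio and subs in one go:
Code:
yt-dlp -f "bv+ba" --write-subs --sub-langs en --embed-subs "https://example.com/stream.mpd"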
-
"Easy" is relative.
The famous batch called "Encrypted Video Downloader II" does not have a subtitle downloading option.
I know I will have to use yt-dlp to list and download the subs. The question was how to make that faster with a batch file; "faster" meaning writing as few command lines as possible.
Just double-click the batch, paste the MPD link in, let it list all the subtitles automatically, pick the one you need by typing "en" or "fr", and the download begins. That's what I want.
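Something like this untested sketch is what I mean (it assumes yt-dlp is on PATH; C:\Subtitles is just a placeholder save folder):
Code:
@echo off
set /p MPD=Paste the MPD link: 
yt-dlp --list-subs "%MPD%"
set /p LANG=Subtitle language (e.g. en, fr): 
yt-dlp --skip-download --write-subs --sub-langs "%LANG%" -P "C:\Subtitles" "%MPD%"
pause
-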
You don't have to make a batch file. You can have Python create a list of URLs for yt-dlp to download one at a time.
I have a program, getUK.py, which will greedily download from some of the main UK TV providers. Feel free to adapt it to your needs.
Code:
from __future__ import print_function
from bs4 import BeautifulSoup
import requests
import os
import re
import json

################
# Needs yt-dlp and aria2c as external programs in PATH.
# pip package updates have been known to break this program!!
# Some programmes on STV are now encrypted - this will not work with those.
################

my_set = set()  # sets do not allow duplicates

# thanks:
# https://hackersandslackers.com/extract-data-from-complex-json-python/
#
# must use json.loads to provide this fn with a dict object
def json_extract(obj, key):
    """Recursively fetch values from nested JSON."""
    arr = []

    def extract(obj, arr, key):
        """Recursively search for values of key in JSON tree."""
        if isinstance(obj, dict):
            for k, v in obj.items():
                if isinstance(v, (dict, list)):
                    extract(v, arr, key)
                elif k == key:
                    arr.append(v)
        elif isinstance(obj, list):
            for item in obj:
                extract(item, arr, key)
        return arr

    return extract(obj, arr, key)  # returns a list object


def getBBCMedia(url):
    # creates a set of urls for yt-dlp to use
    soup = BeautifulSoup(requests.get(url).text, 'lxml')
    domain = "https://" + url.split('/')[-5]
    for link in soup.find_all('a'):
        if 'href' in link.attrs and 'episode' in link.attrs['href']:
            my_set.add(domain + link.attrs['href'] + '\r')
    for x in my_set:
        if re.search("/ad/", x):  # skip /ad/ (audio-described) versions
            continue
        print("".join(x))
    getMedia(my_set)


def getSTVSummaryMedia(url):
    soup = BeautifulSoup(requests.get(url).text, 'lxml')
    # stv links are relative, thus capture domain
    domain = "https://" + url.split('/')[2]
    # specific json on stv pages lives in __NEXT_DATA__ -- may change!!
    my_json = json.loads(soup.find(id="__NEXT_DATA__").get_text())
    # print(f"json data collected = {my_json}")  # uncomment to test correct json returned
    my_values = json_extract(my_json, 'link')  # my_values is a list
    # drop summary urls; duplicates disappear when added to my_set
    for x in my_values:
        if not re.search('summary', x):
            my_set.add(f'{domain}{x}')
    getMedia(my_set)


def getITVMedia(url):
    soup = BeautifulSoup(requests.get(url).text, 'lxml')
    series = url.split('/')[-2]
    for link in soup.find_all('a'):
        if 'href' in link.attrs and series in link.attrs['href']:
            if re.search("facebook", str(link.attrs)):
                continue
            if re.search("twitter", str(link.attrs)):
                continue
            # disabled check: skip if prog ref too short
            # if not re.search("[a-zA-Z0-9]{8,11}$", str(link.attrs)):
            my_set.add(link.attrs['href'] + '\r')
    getMedia(my_set)


def getMedia(my_set):
    for x in my_set:
        os.system(f"yt-dlp --downloader aria2c --convert-subs srt --embed-subs {x}")


def getSingleMedia(url):
    os.system(f"yt-dlp --downloader aria2c --convert-subs srt --embed-subs {url}")


def delineate(url):
    if re.search(r"bbc\.co\.uk", url):
        print("BBC media")
        return getBBCMedia(url)
    if re.search(r"itv\.com", url):
        print("ITV media")
        return getITVMedia(url)
    if re.search(r"stv\.tv", url) and re.search("summary", url):
        return getSTVSummaryMedia(url)
    print("getting single media")
    return getSingleMedia(url)


if __name__ == '__main__':
    from sys import argv
    if len(argv) > 1:
        delineate(argv[1])
    else:
        print("Usage: getUK requires a URL to be passed as an argument\n"
              "URLs currently processed are BBC, ITV and STV")
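Per the script's own usage message, you run it with the programme page URL as the only argument (python getUK.py <page-url>); it needs yt-dlp and aria2c on PATH, plus the requests, beautifulsoup4 and lxml pip packages.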