Thanks for the merge python script, I'll try that.
You can also merge with SE: tools -> join subtitles
Just add the 2 subs and click "join".
Then: tools > Sort > By start time
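For anyone who prefers to script it, the same join-then-sort idea can be sketched in a few lines of Python. This is only an illustration and not how SE does it internally; the helper names (`to_seconds`, `parse_srt`, `join_and_sort`) are made up for the example, and it assumes well-formed SRT with three-digit milliseconds.

```python
import re
from typing import List, Tuple

Cue = Tuple[float, float, str]  # (start seconds, end seconds, text)

TIME_RE = re.compile(r"(\d+):(\d+):(\d+)[,.](\d+)")


def to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as 00:47:42,507 to seconds."""
    h, m, s, ms = TIME_RE.match(stamp.strip()).groups()
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(text: str) -> List[Cue]:
    """Very small SRT reader: one cue per blank-line separated block."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            if "-->" in line:
                start, end = line.split("-->")
                cues.append((to_seconds(start), to_seconds(end),
                             "\n".join(lines[i + 1:])))
                break
    return cues


def join_and_sort(srt_a: str, srt_b: str) -> List[Cue]:
    """Join two subtitle files, then sort by start time."""
    return sorted(parse_srt(srt_a) + parse_srt(srt_b), key=lambda c: c[0])
```

Note that a plain join keeps every cue from both files, which is exactly why duplicates show up wherever both files cover the same dialog.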
Looking at your subs I see what happens: the missing subs come from the merge, but that has a penalty: there are parts where you have both subs generated by whisper and those merged for the same dialog.
Example starting at 47:42:507 for E02 (timestamp with my version):
One sub with 2 lines:
Code:
I always believed that I
had a future in this country.
And then 2 subs with 2 lines and one line:
Code:
I always believed
we had a future
Code:
in this country.
That's a bit of a mess!
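A quick way to list such duplicates before cleaning them by hand is to look for cues whose time ranges intersect. The sketch below is only an illustration: cues are (start, end, text) tuples in seconds, `find_overlaps` is a made-up helper name, and apart from the 47:42.507 start every timing in the example is invented.

```python
from typing import List, Tuple

Cue = Tuple[float, float, str]  # (start seconds, end seconds, text)


def find_overlaps(cues: List[Cue]) -> List[Tuple[Cue, Cue]]:
    """Return every pair of cues whose display times intersect."""
    ordered = sorted(cues, key=lambda c: c[0])
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            # Cues are sorted by start, so once one starts at or after
            # the end of `first`, no later cue can overlap it either.
            if second[0] >= first[1]:
                break
            pairs.append((first, second))
    return pairs


cues = [
    (2862.507, 2865.0, "I always believed that I\nhad a future in this country."),
    (2862.6, 2864.0, "I always believed\nwe had a future"),   # hypothetical timing
    (2864.0, 2865.2, "in this country."),                     # hypothetical timing
    (2870.0, 2872.0, "Next line"),                            # hypothetical cue
]

for first, second in find_overlaps(cues):
    print(f"{first[0]:.3f}s overlaps {second[0]:.3f}s: {second[2]!r}")
```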
About syncing subs.
Most US shows are designed to have ads. These ads are generally inserted at scene changes, when the video goes dark briefly.
Two different versions of the same show (for example a WEB-DL and a FOX broadcast edited to remove ads) will show a timing discrepancy at the scene changes.
My process is:
1) Use comskip, a tool to find ads, to detect scene changes to produce a VideoRedo .vprj file.
I use:
comskip82_010_donators\comskip.exe --threads=20 --videoredo --detectmethod=95 --verbose=0 "Bones S01-E01.mkv"
Even if the ads have already been removed, comskip usually finds where they were.
2) Open the VideoRedo project. VideoRedo will clearly show where the ads were. It will show many false positives, but using F6 to navigate from one to the next shows the image, and when it's dark, it's likely where the ads were.
Example with an old FOX show where the ads were already removed:
[Attachment 89434 - Click to enlarge]
Use SE to adjust at the beginning using "Set start and offset the rest". Then navigate to the next scene change that you see with VideoRedo. Check whether the subs are in sync just before that timestamp (usually they are) and just after it. If they are not, sync at this location with "Set start and offset the rest".
That does not work with every show (not with Kabul, for example, which never had embedded ads anyway), but in my experience it works with many of them.
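To make the idea behind "Set start and offset the rest" concrete, here is a minimal sketch of what such a shift amounts to. It is not SE's actual code; `set_start_and_offset_rest` is a hypothetical helper and cues are (start, end, text) tuples in seconds.

```python
from typing import List, Tuple

Cue = Tuple[float, float, str]  # (start seconds, end seconds, text)


def set_start_and_offset_rest(cues: List[Cue], index: int,
                              new_start: float) -> List[Cue]:
    """Move cue `index` to `new_start` and shift every later cue by the
    same amount; cues before `index` are left untouched."""
    delta = new_start - cues[index][0]
    shifted = [(s + delta, e + delta, t) for s, e, t in cues[index:]]
    return cues[:index] + shifted


cues = [(10.0, 12.0, "a"), (100.0, 102.0, "b"), (200.0, 203.0, "c")]
# Subs drift by half a second after the first ad break (hypothetical numbers).
print(set_start_and_offset_rest(cues, 1, 100.5))
```

Repeating this at each scene change where the sync breaks gives a piecewise correction: everything before the break keeps its timing, everything after inherits the new offset.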
Not sure if the donator version of comskip is still available. Or if it's actually needed.
VideoRedo is no longer sold and is hard to activate now, but I read somewhere that somebody now holds the rights to it and can provide a legal way to activate it. You would need to do a search for that.
If I get more information, I'll send you a PM.
		
- 
	SE engine is the culprit. No problems with CMD>CLI
 Example starting at 47:42:507 for E02 (timestamp with my version):
 
 One sub with 2 lines:
 
 Code:
 I always believed that I
 had a future in this country.
 And then 2 subs with 2 lines and one line:
 
 Code:
 I always believed
 we had a future
 Code:
 in this country.
 
 
 [Attachment 89439 - Click to enlarge]
 
 Doing the transcription again. Will update you with new subs soon.
 
 Meanwhile, read my PM
- 
	Uploaded complete clean subs of Kabul (Mini TV Series) 2025 
 
 
 https://www.opensubtitles.org
 
 Uploader: SamGer
 
 https://www.opensubtitles.org/en/ssearch/sublanguageid-eng/idmovie-2350514
 
 
 https://sub-scene.com/
 
 https://sub-scene.com/subtitle/3363962
- 
	Great, that will save me the trouble of doing all the subs myself. 
 
 I actually tried to upload the Assembly ones, they were rejected because it's too obvious that they are AI generated.
 
 Edit:
 
 But there are still subs that overlap:
 
 
 [Attachment 89446 - Click to enlarge]
 Last edited by robena; 29th Oct 2025 at 12:53. 
- 
	Hi, 
 
 So, here is a perfect way for this Kabul series.
 
 This uses my way to number episodes: "Kabul S01-E01.mkv"
 
 This will likely fail if using another convention such as "Kabul S01E01.mkv"
 
 1) Use Faster-Whisper-XXL_r245.1_windows with this model (based on a REXX script, easy to transcribe for something else):
 
 
 Code:
/* Just in case you want something else than English */
if lang = 'en' then do
   task = ' --task translate'
   prompt = ' --initial_prompt "Translate everything to English."'
end
else do
   task = ' --task transcribe'
   prompt = ' --initial_prompt "Transcribe in 'lang'."'
end

'--model large-v3' ,
task ,
' --language 'lang ,
prompt ,
' --device cuda' ,
' --compute_type float16' ,
' --batch_size 8' ,
' --vad_method pyannote_onnx_v3' ,
' --vad_device cuda' ,
' --beep_off',
' --vad_threshold 0.1' ,                    /* ULTRA LOW */
' --vad_min_speech_duration_ms 50' ,        /* 50ms = catch whispers */
' --vad_min_silence_duration_ms 100' ,      /* tighter gaps */
' --hallucination_silence_threshold 0.6' ,
' --no_speech_threshold 0.1' ,              /* catch ANY speech */
' --logprob_threshold -2.0' ,               /* keep low-conf */
' --compression_ratio_threshold 2.4' ,
' --beam_size 5' ,
' --best_of 5' ,
' --temperature 0' ,
' --repetition_penalty 1.1' ,
' --no_repeat_ngram_size 3' ,
' --condition_on_previous_text False' ,
' --word_timestamps True' ,
' --output_format all' ,                    /* JSON + SRT */
' --output_dir "'fdd(file)'"'

 That outputs a file such as "Kabul S01-E01.srt" that my REXX script renames to "Kabul S01-E01-en.srt"
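 For those who do not use REXX, the same option string can be assembled as a Python argument list, which avoids the quoting headaches. This is only a sketch: `build_whisper_args` is a made-up name, it carries over just a subset of the flags from the REXX fragment, and the executable path is left out on purpose because it depends on your install.

```python
from typing import List


def build_whisper_args(lang: str, output_dir: str) -> List[str]:
    """Build a subset of the options used in the REXX fragment."""
    # Same branch as the REXX code: 'en' selects the translate task.
    if lang == "en":
        task = "translate"
        prompt = "Translate everything to English."
    else:
        task = "transcribe"
        prompt = f"Transcribe in {lang}."
    return [
        "--model", "large-v3",
        "--task", task,
        "--language", lang,
        "--initial_prompt", prompt,
        "--device", "cuda",
        "--compute_type", "float16",
        "--vad_threshold", "0.1",
        "--beam_size", "5",
        "--word_timestamps", "True",
        "--output_format", "all",
        "--output_dir", output_dir,
    ]


# Usage idea (exe and media paths are placeholders you must supply):
#   subprocess.run([exe, media, *build_whisper_args("en", out_dir)], check=True)
print(build_whisper_args("en", "subs"))
```

 Passing a list to `subprocess.run` means file names with spaces, such as "Kabul S01-E01.mkv", need no extra quoting.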
 
 *** Having a filename ending with "-something" is necessary.
 
 Then, store in the same directory: "Kabul S01-E01-F.srt"
 
 These are the forced subs for the foreign dialogs.
 
 *** Having '-F' is necessary.
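 The naming rules can be summed up with a small sketch. It is based on what the merge script further down does: it writes its result to "<base>.srt" and skips the merge when that file already exists, which would explain why the Whisper output needs a suffix. The helper name `expected_files` is made up for the example.

```python
from pathlib import Path
from typing import Dict


def expected_files(mkv_path: str, lang: str = "en") -> Dict[str, Path]:
    """List the subtitle files the workflow expects next to the video."""
    mkv = Path(mkv_path)
    base = mkv.stem  # "Kabul S01-E01"
    return {
        "whisper": mkv.with_name(f"{base}-{lang}.srt"),  # renamed Whisper output
        "forced": mkv.with_name(f"{base}-F.srt"),        # forced foreign-dialog subs
        "merged": mkv.with_name(f"{base}.srt"),          # result of the merge
    }


for role, path in expected_files("Kabul S01-E01.mkv").items():
    print(f"{role:8} {path.name}")
```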
 
 Whisper has translated these foreign dialogs, but we want those that are already in "Kabul S01-E01-F.srt".
 
 To get that, merge with this python script:
 
 Code:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
merge_subtitles.py
- -F.srt loaded FIRST
- Every forced sub PRESERVED exactly
- Whisper fills gaps OR is replaced on overlap
- No duplicates, no loss
"""
import sys
from pathlib import Path
from typing import List, Tuple


def time_to_seconds(t: str) -> float:
    h, m, s_ms = t.split(":")
    s, ms = s_ms.replace(",", ".").split(".")
    return int(h) * 3600 + int(m) * 60 + float(s) + float(ms) / 1000


def seconds_to_srt(sec: float) -> str:
    h = int(sec // 3600)
    m = int((sec % 3600) // 60)
    s = sec % 60
    return f"{h:02}:{m:02}:{s:06.3f}".replace(".", ",")[:12]


def parse_srt_robust(content: str, filename: str) -> List[Tuple[float, float, List[str], str]]:
    entries = []
    lines = content.splitlines()
    i = 0
    while i < len(lines):
        if lines[i].strip().isdigit():
            i += 1
            if i >= len(lines):
                break
            time_line = lines[i].strip()
            if "-->" not in time_line:
                i += 1
                continue
            try:
                start_str, end_str = time_line.split("-->", 1)
                start = time_to_seconds(start_str.strip())
                end = time_to_seconds(end_str.strip())
            except:
                i += 1
                continue
            i += 1
            text_lines = []
            while i < len(lines) and lines[i].strip() and not lines[i].strip().isdigit():
                text_lines.append(lines[i].strip())
                i += 1
            if text_lines:
                entries.append((start, end, text_lines, filename))
        else:
            i += 1
    return entries


def is_forced(filename: str) -> bool:
    return any(k in filename.lower() for k in ("-f.", "-forced", ".f.", "forced"))


def main(mkv_path: str) -> None:
    mkv = Path(mkv_path)
    if not mkv.exists():
        print(f"[ERROR] File not found: {mkv}")
        sys.exit(1)

    folder = mkv.parent
    base_name = mkv.stem
    output_srt = folder / f"{base_name}.srt"
    if output_srt.exists():
        print("Skipping merge, file exists")
        return

    srt_files = list(folder.glob(f"{base_name}*.srt"))
    if not srt_files:
        print(f"[INFO] No SRT files")
        return

    forced_file = next((f for f in srt_files if is_forced(f.name)), None)
    whisper_file = next((f for f in srt_files if not is_forced(f.name)), None)

    if not forced_file:
        print("[ERROR] No -F.srt found!")
        return

    print(f"[INFO] Forced: {forced_file.name}")
    print(f"[INFO] Whisper: {whisper_file.name if whisper_file else 'None'}")

    # Parse forced
    try:
        forced_text = forced_file.read_text(encoding="utf-8", errors="replace")
        forced_subs = parse_srt_robust(forced_text, forced_file.name)
        print(f" -> {len(forced_subs)} forced lines")
    except Exception as e:
        print(f"[ERROR] Failed to read forced: {e}")
        return

    # Parse whisper
    whisper_subs = []
    if whisper_file:
        try:
            whisper_text = whisper_file.read_text(encoding="utf-8", errors="replace")
            whisper_subs = parse_srt_robust(whisper_text, whisper_file.name)
            print(f" -> {len(whisper_subs)} whisper lines")
        except Exception as e:
            print(f"[WARNING] Whisper failed: {e}")

    forced_subs.sort(key=lambda x: x[0])
    whisper_subs.sort(key=lambda x: x[0])

    final_subs = []
    w_idx = 0
    W = len(whisper_subs)

    for f_start, f_end, f_lines, _ in forced_subs:
        # Add all Whisper subs that END before this forced sub starts
        while w_idx < W:
            w_start, w_end, _, _ = whisper_subs[w_idx]
            if w_end <= f_start:  # No overlap
                final_subs.append(whisper_subs[w_idx][:3])
                w_idx += 1
            else:
                break

        # Now: skip all Whisper subs that overlap this forced sub
        while w_idx < W:
            w_start, w_end, _, _ = whisper_subs[w_idx]
            if w_start < f_end:  # Overlaps
                w_idx += 1
            else:
                break

        # Add forced sub
        final_subs.append((f_start, f_end, f_lines))

    # Add remaining non-overlapping whisper subs
    while w_idx < W:
        final_subs.append(whisper_subs[w_idx][:3])
        w_idx += 1

    # Write output
    with open(output_srt, "w", encoding="utf-8") as f:
        for idx, (start, end, lines) in enumerate(final_subs, 1):
            f.write(f"{idx}\n")
            f.write(f"{seconds_to_srt(start)} --> {seconds_to_srt(end)}\n")
            for line in lines:
                f.write(f"{line}\n")
            f.write("\n")

    print(f"\n[OK] Merged {len(final_subs)} blocks -> {output_srt.name}")
    print(f" -> {len(forced_subs)} forced subs preserved (100%)")


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: merge_subtitles.py <mkv_path>")
        sys.exit(1)
    main(sys.argv[1])

 That will produce "Kabul S01-E01.srt"
 
 It will contain:
 
 - English dialogs transcribed in English
 - Foreign dialogs that are already in the '-F' forced subs "as is".
 - Foreign dialogs missing in the '-F' forced subs translated by whisper.
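 To see the merge rule at work on toy data, here is a stripped-down sketch. It applies the same overlap test as the script (a Whisper cue is dropped when it starts before a forced cue ends and ends after that forced cue starts) but in a simpler, slower form; `merge_forced_first` and the sample cues are invented for the example.

```python
from typing import List, Tuple

Cue = Tuple[float, float, str]  # (start seconds, end seconds, text)


def merge_forced_first(forced: List[Cue], whisper: List[Cue]) -> List[Cue]:
    """Keep every forced cue; keep a Whisper cue only if it overlaps none."""
    kept = [
        w for w in whisper
        if not any(w[0] < f[1] and f[0] < w[1] for f in forced)
    ]
    return sorted(forced + kept, key=lambda c: c[0])


forced = [(10.0, 12.0, "Forced translation")]
whisper = [
    (5.0, 7.0, "Hello."),                      # before the forced cue: kept
    (10.5, 11.5, "Whisper's own translation"), # overlaps the forced cue: dropped
    (12.0, 14.0, "Goodbye."),                  # starts exactly at its end: kept
]

for start, end, text in merge_forced_first(forced, whisper):
    print(f"{start:5.1f} -> {end:5.1f}  {text}")
```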
 
 I did all that with a mix of ChatGPT, Grok and Deepseek.
 
 Here are all the episode batch processed:
 
 https://limewire.com/d/9JGEX#qH5IhI1Y1x
 
 
 Keep in mind that my SE settings are different from yours, so formatting might not be 100% to your liking.
 
 Also, it seems that you don't have exactly the same version, so you will likely need to sync the start of the subs.
 Last edited by robena; 29th Oct 2025 at 20:47. 
- 
	No. The lines are too long to read. Update the code with 
 
 --max_line_count 2 ^
 --max_line_width 36 ^
- 
	Not too many lines like that, but I'll try and compare the results, that's easy! 
 
 The way I did it with a REXX script, it's just a right click to do everything.
- 
	Updated Assembly code [Assembly.c] 
 Perfect two-line breaks. No need for an SE batch touch-up.
- 
	I did not test Assembly yet, but for whisper, I am surprised to see the difference it makes between --max_line_width 36 and --max_line_width 40, it's not only 4 characters. 
 
 I asked why, and got:
 
 --max_line_width does not mean “no line may be longer than N characters”.
 It tells Faster-Whisper the target width that the line-breaker tries to stay under while it is splitting a segment into subtitle lines.
 Because the breaker also respects sentence boundaries, words that are already > N, and minimum-line-length rules, you will still see lines that are much longer than the value you passed – especially when you raise it from 36 to 40.
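 A small experiment shows why 4 extra characters can change the look of a sub more than expected: with a greedy line breaker, a slightly larger width can pull whole words up to the previous line. The sketch uses Python's `textwrap`, which is NOT Faster-Whisper's line breaker and enforces the width strictly, so take it only as an illustration of how break points move.

```python
import textwrap
from typing import List


def break_lines(text: str, width: int) -> List[str]:
    """Greedy word wrap: fill each line up to `width` characters."""
    return textwrap.wrap(text, width=width)


sentence = "I always believed that I had a future in this country."

for width in (36, 40):
    print(f"width {width}:")
    for line in break_lines(sentence, width):
        print(f"  {line!r} ({len(line)} chars)")
```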
 
 Thanks for pointing it out, I would never have thought by myself that it would make the subs that much better.
 
 Here they are:
 
 https://limewire.com/d/ZpjSg#tNvqHhY3Gl
 
 Whisper gives much more natural looking subs than Assembly. opensubtitles.org rejects Assembly ones.
 
 Edit: I'll reserve judgment until I test your version!
 
 Edit edit: I get "Failed to upload file." with your version. Firewall was open. Don't waste time on it for me, I'll use whisper from now on.
 
 Edit edit edit: stupid, I forgot to update the API key!!!
 Last edited by robena; 30th Oct 2025 at 03:50. 
- 
	Sam, 
 
 I remux my subs using a routine that detects the aspect ratio and creates PGS files that are located inside the active video region, just a few pixels above the black bar:
 
 
 [Attachment 89467 - Click to enlarge]
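 The routine itself is not shown here, but the arithmetic behind "inside the active video region" can be sketched as follows. Everything in it is an assumption for illustration: the function name, the 8-pixel margin, and the idea that the black bars are equal at top and bottom.

```python
def subtitle_baseline(frame_w: int, frame_h: int,
                      content_aspect: float, margin_px: int = 8) -> int:
    """Return the Y coordinate (from the top) where the bottom of the
    subtitle should sit: `margin_px` above the lower black bar."""
    # Height of the picture actually used by the content.
    active_h = min(frame_h, round(frame_w / content_aspect))
    # Letterbox bars are assumed equal at top and bottom.
    bar_h = (frame_h - active_h) // 2
    return frame_h - bar_h - margin_px


# 2.40:1 content inside a 1920x1080 frame (hypothetical example values).
print(subtitle_baseline(1920, 1080, 2.4))
# Full-frame 16:9 content has no bars, so only the margin applies.
print(subtitle_baseline(1920, 1080, 16 / 9))
```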
 
 Interested?
- 
	Of course, YES. 
 
 "The way I did it with a REXX script, it's just a right click to do everything."
 
 Kindly PM the code that you used for Faster-Whisper-XXL.
- 
	Tehran [Season 03] KAN (Color-Yellow) ENG-NON Hi 
 
 subs uploaded > opensubtitles.org [uploader-SamGer]
 Last edited by sam12345; 30th Oct 2025 at 06:48. 