I just got a subscription to NHK On Demand, which is a treasure trove of content. I'm also a fansubber, so the subtitles are very important to me as a translation aid.
I recently managed to extract the subtitles from the video stream. The file had been renamed with an .html extension but is actually TTML. The caption text is obviously in the file, but all the subtitle editors and converters I have tried have problems reading the XML.
Below is an extract of the XML, which shows timecodes, positioning, coloring, etc. If anyone can help figure out how to convert this to a conventional format, it would be greatly appreciated!
HTML Code:<?xml version="1.0" encoding="UTF-8"?> <cuepoints x="170" y="450" font="MS ゴシック,Osaka−等幅,ヒラギノ角ゴ ProW3,Osaka" color="0xffffff" size="36" underline="false" bold="false" italic="false" ruby="false"> <cuepoint name="1" time="0.334"/> <cuepoint name="2" time="0.367"> </cuepoint> <cuepoint name="3" time="0.634"/> <cuepoint name="4" time="0.667"> </cuepoint> <cuepoint name="5" time="34.934"/> <cuepoint name="6" time="34.967"> <subtitle id="201" x="198" y="449" xx="198" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="234" height="40"> <![CDATA[(トランシーバー・]]> </subtitle> <subtitle id="202" x="432" y="449" xx="432" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="84" height="40"> <![CDATA[増渕]]> </subtitle> <subtitle id="203" x="437" xx="437" y="429" yy="540" size="18" background="0x000000" opacity="0.5" lang="jpn" letterspacing="0" width="74" height="20" ruby="true"> <![CDATA[ますぶち]]> </subtitle> <subtitle id="204" x="516" y="449" xx="516" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="193" height="40"> <![CDATA[)「どうだ ]]> </subtitle> <subtitle id="205" x="709" y="449" xx="709" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="84" height="40"> <![CDATA[水上]]> </subtitle> <subtitle id="206" x="714" xx="714" y="429" yy="540" size="18" background="0x000000" opacity="0.5" lang="jpn" letterspacing="0" width="74" height="20" ruby="true"> <![CDATA[みずかみ]]> </subtitle> <subtitle id="207" x="793" y="449" xx="793" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="50" height="40"> <![CDATA[」。]]> </subtitle> </cuepoint> <cuepoint name="7" time="37.967"/> <cuepoint name="8" time="41.634"/> <cuepoint name="9" time="41.667"> <subtitle id="201" x="138" y="449" xx="138" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="433" height="40"> <![CDATA[(増渕)水上 応答しろ!]]> </subtitle> </cuepoint> <cuepoint 
name="10" time="44.367"/> <cuepoint name="11" time="46.400"/> <cuepoint name="12" time="46.433"> <subtitle id="201" x="318" y="449" xx="318" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="324" height="40"> <![CDATA[何かあったのか!]]> </subtitle> </cuepoint> <cuepoint name="13" time="49.067"/> <cuepoint name="14" time="49.100"> <subtitle id="201" x="198" y="449" xx="198" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="606" height="40"> <![CDATA[(トランシーバー・水上)「白い鳥が…」。]]> </subtitle> </cuepoint> <cuepoint name="15" time="52.400"/> <cuepoint name="16" time="56.734"/> <cuepoint name="17" time="56.767"> <subtitle id="201" x="398" y="449" xx="398" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="170" height="40"> <![CDATA[(爆発音)]]> </subtitle> </cuepoint> <cuepoint name="18" time="59.767"/> <cuepoint name="19" time="62.067"/> <cuepoint name="20" time="62.100"> <subtitle id="201" x="418" y="449" xx="418" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="124" height="40"> <![CDATA[水上!]]> </subtitle> </cuepoint> <cuepoint name="21" time="64.100"/> <cuepoint name="22" time="68.567"/> <cuepoint name="23" time="68.600"> <subtitle id="201" x="398" y="389" xx="398" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="170" height="40"> <![CDATA[(爆発音)]]> </subtitle> <subtitle id="202" x="498" y="449" xx="498" yy="620" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="290" height="40"> <![CDATA[(増渕)うわっ!]]> </subtitle> </cuepoint> <cuepoint name="24" time="70.500"/> <cuepoint name="25" time="70.533"> <subtitle id="201" x="398" y="449" xx="398" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="170" height="40"> <![CDATA[(爆発音)]]> </subtitle> </cuepoint> <cuepoint name="26" time="77.333"/> <cuepoint name="27" time="82.400"/> <cuepoint name="28" time="82.433"> <subtitle id="201" x="138" 
y="359" xx="138" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="27" height="40"> <![CDATA[(]]> </subtitle> <subtitle id="202" x="165" y="359" xx="165" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="84" height="40"> <![CDATA[小磯]]> </subtitle> <subtitle id="203" x="179" xx="179" y="339" yy="540" size="18" background="0x000000" opacity="0.5" lang="jpn" letterspacing="0" width="56" height="20" ruby="true"> <![CDATA[こいそ]]> </subtitle> <subtitle id="204" x="249" y="359" xx="249" yy="560" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="187" height="40"> <![CDATA[)これより]]> </subtitle> <subtitle id="205" x="158" y="449" xx="158" yy="620" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="107" height="40"> <![CDATA[弊社 ]]> </subtitle> <subtitle id="206" x="265" y="449" xx="265" yy="620" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="124" height="40"> <![CDATA[東國実]]> </subtitle> <subtitle id="207" x="272" xx="272" y="429" yy="600" size="18" background="0x000000" opacity="0.5" lang="jpn" letterspacing="0" width="110" height="20" ruby="true"> <![CDATA[ひがしくにみ]]> </subtitle> <subtitle id="208" x="389" y="449" xx="389" yy="620" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" width="427" height="40"> <![CDATA[化学 第一工場で起きた]]> </subtitle> <subtitle id="209" x="816" y="449" xx="816" yy="620" background="0x000000" opacity="0.5" lang="jpn" letterspacing="4" substitution_string="→" gaiji_pattern="000000000000000000000000000000000000000000000000004000000006000000007000000007800000007C00000007E00000007F00000007F807FFFFFFC07FFFFFFE07FFFFFFF07FFFFFFF87FFFFFFFC7FFFFFFFE7FFFFFFFC7FFFFFFF87FFFFFFF07FFFFFFE07FFFFFFC0000007F80000007F00000007E00000007C00000007800000007000000006000000004000000000000000000000000000000000000000" gaiji_width="36" gaiji_height="36" width="44" height="40"> <![CDATA[ ]]> </subtitle>
-
If the subs are captions, you could try extracting them with clever FFmpeg-GUI.
-
With the kind help of a developer, there is now a Python script to convert these NHK TTML files into SRT.
Go to https://github.com/nopol10/ttml for the script and instructions.
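For anyone curious what such a conversion involves, here is a rough Python sketch. It is not the linked script, just my own illustration based on the extract in the first post, and it rests on several guesses about the format: a cuepoint that contains subtitle elements is shown until the next cuepoint, fragments with ruby="true" are furigana and can be skipped, fragments sharing the same y belong to one text line ordered by x, and substitution_string stands in for gaiji bitmaps. The function names (srt_time, cue_text, cuepoints_to_srt) are made up for this example. It also needs the complete, well-formed file; the extract above is cut off and will not parse on its own.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def srt_time(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def cue_text(cuepoint):
    """Join the fragments of one cuepoint into lines of text.

    Assumptions: ruby (furigana) fragments are dropped, fragments with the
    same y form one line, and lines run top to bottom by increasing y.
    """
    rows = defaultdict(list)
    for sub in cuepoint.findall("subtitle"):
        if sub.get("ruby") == "true":
            continue
        # Prefer the gaiji replacement character when one is given.
        # Only ASCII whitespace is stripped, so full-width spaces survive.
        text = sub.get("substitution_string") or (sub.text or "").strip(" \t\r\n")
        if text:
            rows[int(sub.get("y"))].append((int(sub.get("x")), text))
    return "\n".join(
        "".join(fragment for _, fragment in sorted(rows[y])) for y in sorted(rows)
    )


def cuepoints_to_srt(xml_text):
    """Convert a <cuepoints> document to SRT text.

    Assumption: each cue ends at the time of the following cuepoint, so a
    final cue with no successor is dropped.
    """
    root = ET.fromstring(xml_text)
    cues = sorted(root.iter("cuepoint"), key=lambda c: float(c.get("time")))
    blocks = []
    for cue, following in zip(cues, cues[1:]):
        text = cue_text(cue)
        if text:
            start = srt_time(float(cue.get("time")))
            end = srt_time(float(following.get("time")))
            blocks.append(f"{len(blocks) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)
```

Usage would be something like print(cuepoints_to_srt(open("subs.xml", encoding="utf-8").read())), where subs.xml is a placeholder filename. Positioning, colors and furigana are all thrown away here, so for anything serious use the script linked above.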