Looking for a user-friendly tool that can extract subtitles from a .ts file.
I tried CCExtractor, TSDoctor and Subtitle Edit, but I am sure I must be doing something wrong.
I am not an expert at all.
After hours of trying I gave up and am asking here. No doubt folks here are experienced and can tell me what to do or give me suggestions.
First the .ts file: please find details below.
I have added some screenshots of CCExtractor. As for the PDF file, click on the first screenshot, to display the other screenshots.
Maybe someone can tell me what I should tag.
Hope someone can help me.
Or maybe I should use some other tool?
Code:
General
ID : 133 (0x85)
Complete name : bla bla_20180123_2245.ts
Format : BDAV
Format/Info : Blu-ray Video
File size : 9.34 GiB
Duration : 2 h 12 min
Overall bit rate mode : Variable
Overall bit rate : 10.1 Mb/s
FileExtension_Invalid : m2ts mts ssif

Video
ID : 3321 (0xCF9)
Menu ID : 33020 (0x80FC)
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L4
Format settings, CABAC : Yes
Format settings, ReFrames : 4 frames
Format settings, GOP : M=3, N=15
Codec ID : 27
Duration : 2 h 12 min
Bit rate : 8 680 kb/s
Maximum bit rate : 8 582 kb/s
Width : 1 920 pixels
Height : 1 080 pixels
Display aspect ratio : 16:9
Frame rate : 25.000 FPS
Standard : Component
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : MBAFF
Scan type, store method : Interleaved fields
Scan order : Top Field First
Bits/(Pixel*Frame) : 0.167
Stream size : 8.06 GiB (86%)
Color range : Limited
Color primaries : BT.709
Transfer characteristics : BT.709
Matrix coefficients : BT.709

Text #1
ID : 3327 (0xCFF)
Menu ID : 33020 (0x80FC)
Format : DVB Subtitle
Codec ID : 6
Duration : 2 h 12 min
Delay relative to video : 3 s 196 ms
Language : Dutch

Text #2
ID : 3328 (0xD00)
Menu ID : 33020 (0x80FC)
Format : DVB Subtitle
Codec ID : 6
Duration : 2 h 12 min
Delay relative to video : 3 s 196 ms
Language : 888

Text #3
ID : 3329 (0xD01)-888
Menu ID : 33020 (0x80FC)
Format : Teletext Subtitle
Language : Dutch
Language, more info : For hearing impaired people

Other
ID : 3329 (0xD01)-100
Menu ID : 33020 (0x80FC)
Format : Teletext
Language : Dutch
In the last picture it looks like you were getting text. Was the text it was outputting wrong or something?
I did not check that, but it was -not- the text belonging to the documentary. I assume it belonged to the commercials at the beginning or to the stuff at the end. CCExtractor only created one .srt file.
Now, to get rid of confusion on that, I -just- removed the commercial at the beginning and the crap at the end (roughly 5-7 minutes) and am now re-encoding it to a .mkv file (TMPGEnc Video Mastering Works 6).
It'll take some time.
Maybe it will be easier then, meaning: just the actual documentary and a .mkv file.
Any suggestions on how to easily extract the subtitles (also with CCExtractor)?
It took a while and then it crashed
An application error occurred in Subtitle Edit 220.127.116.11.
Please report at
httxx://github.com/SubtitleEdit/subtitleedit/issues with the
Error Message: Index was outside the bounds of the array.
at Nikse.SubtitleEdit.Core.BluRaySup.BluRaySupParser.BigEndianInt16(Byte[] buffer, Int32 index)
Happened twice, so I gave up on that. Maybe the file was too large, a little over 10GB.
As written earlier, I am now re-encoding it and will see what happens (or better: what comes out).
Yeah, a large file can be a problem. You could still try ProjectX. You'll have to select the ID of the subtitle stream (0xCFF or 0x80FC) and deactivate all other streams (they are not needed for subtitle extraction). In the preferences, select subtitle export. That should do it. Maybe try a different color (also in the preferences).
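To make the ID selection a bit more concrete, here is a small Python sketch that turns the decimal stream IDs from the MediaInfo listing in the first post into the hex form used above. Note that in that listing 0x80FC appears as the Menu ID, while the streams labelled as subtitles are 0xCFF, 0xD00 and 0xD01. The dictionary and the helper names are only illustrative; they are not part of ProjectX or MediaInfo.

```python
# Stream IDs copied from the MediaInfo listing in the first post.
# The descriptions are shortened by hand; nothing here talks to ProjectX.
STREAMS = {
    3321: "AVC video",
    3327: "DVB Subtitle (Dutch)",
    3328: "DVB Subtitle (888)",
    3329: "Teletext Subtitle (Dutch)",
}


def as_pid(decimal_id):
    """Format a decimal stream ID the way MediaInfo prints it in hex."""
    return f"0x{decimal_id:X}"


def subtitle_pids(streams):
    """Return the hex IDs of every stream whose description mentions a subtitle."""
    return [as_pid(i) for i, kind in sorted(streams.items()) if "Subtitle" in kind]


print(subtitle_pids(STREAMS))  # ['0xCFF', '0xD00', '0xD01']
```

The first two are the DVB (bitmap) subtitle streams; the third is the teletext stream.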
Let's see what comes out; right now the encoding is at 60% and the .mkv file is about 8.7GB.
I assume you'll lose any sub track after the encode, right?
Didn't know. Good to know.
I use both of the above applications.
CCE serves to extract subtitles that are carried as teletext in the stream. Remember to enter the number of the teletext page that carries the subtitles (Decoders / Teletext decoder / Page for subtitles), because otherwise the program sometimes extracts nothing. Sometimes I have to pass a .ts file through VideoRedo first, because CCE shows a "file too big" message.
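For anyone who prefers the command line over the GUI, the same page setting can be passed as an option. Below is a minimal Python sketch that only builds the command; it assumes the classic single-dash option names (`-teletext`, `-tpage`, `-o`), while newer CCExtractor builds may expect the double-dash spellings, so check the help output of your version. The file names and the function name are made up for illustration.

```python
import shlex


def build_ccextractor_args(ts_path, page, out_path, exe="ccextractor"):
    """Return an argv list asking CCExtractor to decode one teletext page.

    Assumption: the installed build accepts -teletext / -tpage / -o.
    """
    if not 100 <= page <= 899:
        raise ValueError("teletext pages run from 100 to 899")
    return [exe, ts_path, "-teletext", "-tpage", str(page), "-o", out_path]


args = build_ccextractor_args("recording.ts", 888, "recording.srt")
print(shlex.join(args))

# To actually run it (requires CCExtractor on PATH):
# import subprocess
# subprocess.run(args, check=True)
```

If the output is empty, try again with another page number; which page carries the subtitles differs per country and broadcaster.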
I also use Subtitle Edit (only for DVB subtitles). You have to remember to install Tesseract. Subtitles extracted using this method always contain errors. You need to configure the tools contained in SE yourself.
Thanks for all the replies folks! Really appreciated.
Now, I have the recordings in both formats:
a. the original .ts file (with commercials at the beginning and the end) - size 10GB
b. an encoded to .mkv file (commercials removed)
Subtitle Edit: opening the .ts file - crashes - twice - it takes verrry long to open the file, it hangs at 99% and then crashes - file probably too big
CCExtractor: using the .ts file - the only thing I managed is to get subtitles of a part that I don't need, but not the part I need
that is the status so far.
Remember: I am not an expert, far from that...
I downloaded and installed MKVToolNix 21.0 - at the end of the setup it says:
If you need a GUI for mkvextract then give these projects a try:
Subtitle Edit with the .mkv file - no subtitles found
CCExtractor - tried various options, but frankly, I don't know exactly what option to use.. Anyway, nothing came out
mkvtoolnix: I have now launched mkvtoolnix and loaded the original .ts file
Got the below screen, but eh ... now what?
The 2nd screenshot shows when I load the .mkv file.
Last edited by Melan; 22nd Mar 2018 at 04:27.
I selected the subtitle track only, but what do I have to click on next...??
I believe tesseract is available.
Thanks for helping me out....
I installed the latest version of Tesseract and copied it into Subtitle Edit (deleting the existing one)
Imported the video, but no subtitles
MKVToolNix - did a multiplex - file xyz.mks created
Within Subtitle Edit, imported the .mks file - nothing
MKVToolNix export to .txt file - nothing
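A quick way to check whether a file still contains a subtitle track at all is to ask mkvmerge for its JSON identification (`mkvmerge -J`) and filter on the track type. The Python sketch below assumes MKVToolNix is on the PATH for the `identify` helper; the sample data is made up and only mimics the general shape of that JSON, and the function names are illustrative.

```python
import json
import subprocess


def subtitle_tracks(identification):
    """Pick (id, codec, language) for subtitle entries in parsed `mkvmerge -J` output."""
    return [
        (t["id"], t.get("codec", "?"), t.get("properties", {}).get("language", "und"))
        for t in identification.get("tracks", [])
        if t.get("type") == "subtitles"
    ]


def identify(path, exe="mkvmerge"):
    """Run `mkvmerge -J` on a file and return the parsed JSON (needs MKVToolNix)."""
    result = subprocess.run([exe, "-J", path], capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


# Made-up sample resembling a file with only video and audio.
sample = {
    "tracks": [
        {"id": 0, "type": "video", "codec": "AVC/H.264/MPEG-4p10"},
        {"id": 1, "type": "audio", "codec": "AC-3"},
    ]
}
print(subtitle_tracks(sample))  # []
```

An empty list means there is nothing for mkvextract or Subtitle Edit to pull out of that file, however it is loaded.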
Well, I have spent many hours on this. Apparently it is too complicated for all the tools I used.
CCExtractor, TSDoctor, ProjectX (don't know how that works): I will let it rest, it seems to be very complicated.
I think there is something peculiar with these subtitles.
I also tried idealshare using the original .ts file.
In my case subtitles are listed, but there are no [Save as] buttons.
When I use the .mkv file (commercials removed), there is no subtitle track, only audio/video, so
my guess is they are burned in.
All these unsuccessful attempts apply to the same .ts file?
Try to record other files from different TV channels.
If you can, upload a small sample of the file to a file host. I will see if something can be learned from it.
You got two files. One with DVB subtitles - you opened it.
The second contains subtitles in the teletext - you must open it in CCExtractor.
Enter the page number of the teletext with subtitles - in this case it is 777, because on this page in Poland subtitles are broadcast.
I thought maybe you do not know exactly what to do?
Once you have achieved what you showed on the screens, press "Lancer OCR".
The OK button (on the right) is only pressed after "reading" the last line.
Set everything up like on the screenshot.
Last edited by Melan; 24th Mar 2018 at 05:46.
Many many thanks for the help so far. The teletext thing indeed was the solution. In my case I had to take 889.
Initially I tried 888 (which 'officially' is the teletext subtitle page here), but the CCExtractor results showed 'Notice: Teletext page with possible subtitles detected: 889'.
So I tried that one and got the results now. Super!
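Since that notice is what gave the right page away, here is a small Python sketch that pulls the page number out of CCExtractor's text output, so it does not get overlooked in a long log. The regular expression is based only on the notice line quoted above; other CCExtractor versions may word it differently, and the function name is made up.

```python
import re

# Pattern taken from the notice quoted in this thread.
NOTICE_RE = re.compile(r"Teletext page with possible subtitles detected:\s*(\d{3})")


def detected_pages(log_text):
    """Return the distinct page numbers CCExtractor reported, in the order seen."""
    pages = []
    for match in NOTICE_RE.finditer(log_text):
        page = int(match.group(1))
        if page not in pages:
            pages.append(page)
    return pages


sample = "Notice: Teletext page with possible subtitles detected: 889\n"
print(detected_pages(sample))  # [889]
```

Any page it returns is a candidate to enter under Decoders / Teletext decoder / Page for subtitles on the next run.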
Again many many thanks.
It is very 'precise' - I mean - one needs to know to select teletext and even the exact page, else, nothing.
There is no room for mistakes. No results for dummies like me..
Of course I have made screenshots of all the steps for 'future cases', should I need it again.
Truly appreciated your help!
The easiest method I've found is to just upload the .ts file to YouTube. Make sure it's on a channel where you don't care about copyright strikes, and set the upload to "private". YouTube will automatically parse out the subtitle file and you can then download it.