Is anyone familiar with the ins and outs of DVB subtitle extraction using ProjectX (or anything similar)? I've thus far managed to output some ugly but workable SUP and SUB files from a .TS recording of BBC channels, but supposedly it should also be possible to output .srt (though I'm wondering if this refers to what I'm assuming are now-defunct teletext subtitles, as opposed to digital image-based ones). Furthermore, the SUB/IDX files are supposed to be able to retain the formatting of the original DVB subs, whereas mine look chunky and off-colour.
Suffice to say, if anyone with a wealth of experience in this area can advise (or suggest a better tutorial than I've managed to find thus far), that would be appreciated!
On a purely technical note, what is the difference between closed-captioned subtitles and DVB subtitles?
If you have the ProjectX settings correct, you can get the BMP images of the subtitles and, more importantly, the video.sup.idx and its partner video.sup.sub as well as video.son. With DirectVobSub installed, players such as Media Player will show the subtitles as long as the video file, video.sup.idx and video.sup.sub share the same name and sit in the same folder.
If you want the subtitles in SRT format, drop either the video.sup.sub or the video.son into Subtitle Edit; both read well and can be saved as SRT. The palette and timings are also stored in the video.sup.idx file and can be viewed with a hex editor such as HxD.
You can also pull the timings from the video.sup.idx.
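For anyone who would rather script that than read it in a hex editor, here is a minimal Python sketch. It assumes the .idx that ProjectX writes follows the usual plain-text VobSub layout, with lines of the form "timestamp: HH:MM:SS:mmm, filepos: <hex offset>". The function names and the sample text are made up for illustration and are not taken from a real recording.

```python
import re

# Assumed VobSub .idx line shape: "timestamp: HH:MM:SS:mmm, filepos: <hex offset>"
TIMESTAMP_RE = re.compile(
    r"^timestamp:\s*(\d+):(\d{2}):(\d{2}):(\d{3}),\s*filepos:\s*([0-9A-Fa-f]+)"
)


def read_idx_timings(idx_text):
    """Return a list of (start_ms, byte_offset) pairs found in the .idx text."""
    entries = []
    for line in idx_text.splitlines():
        match = TIMESTAMP_RE.match(line.strip())
        if match:
            hours, minutes, seconds, millis = (int(g) for g in match.groups()[:4])
            start_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
            # filepos is a hexadecimal offset into the companion .sub file
            entries.append((start_ms, int(match.group(5), 16)))
    return entries


def to_srt_time(total_ms):
    """Format milliseconds the way SRT expects: HH:MM:SS,mmm."""
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


# Hypothetical sample, only to show the expected line shape.
sample = """\
id: en, index: 0
timestamp: 00:00:01:200, filepos: 000000000
timestamp: 00:00:04:760, filepos: 000001000
"""

for start_ms, offset in read_idx_timings(sample):
    print(to_srt_time(start_ms), hex(offset))
```

Note that the .idx only carries start times and file offsets; the subtitle text itself still has to come from OCR of the bitmaps (e.g. via Subtitle Edit).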
My ProjectX setup is for Australian conditions and can be found under my signature. The Oz system was the same as the BBC's some years ago; things have since changed in Europe, but not down here. I would appreciate it if you could post a MediaInfo report on one of your TS files. Teletext is defunct here, but the page number for extraction remains the same (801), and you can certainly get srt or sup and sub depending on what you choose in the dropdown menu in ProjectX.
Thanks for the advice! I think, based on the above screenshots, I've already got things looking about as good as they're going to get (I'm already using the excellent Subtitle Edit to convert to srt, but I was under the impression, probably mistakenly, that ProjectX output srt directly).
Re: a MediaInfo report, I'll try and record something this evening with DVB subtitles and post thereafter (I can also upload a small fragment of video if anyone wants a look).
I thought it did too, and maybe someone else can help here, but I can't ALWAYS get it to. The TS file I used above has teletext/DVB according to MediaInfo, but no setting I've tried will save it. May I ask what method you use in ProjectX to get your SRT? Is it similar or different altogether?
ID : 533 (0x215)
Menu ID : 8325 (0x2085)
Format : DVB Subtitle
Codec ID : 6
Duration : 19s 468ms
Delay relative to video : 649ms
Language : English
ID : 258 (0x102)
Menu ID : 8325 (0x2085)
Duration : 22s 294ms
List : 530 (0x212) (MPEG Video) / 531 (0x213) (MPEG Audio, English) /
532 (0x214) (MPEG Audio) / 533 (0x215) (DVB Subtitle, English) /
2100 (0x834) () / 2102 (0x836) () / 2302 (0x8FE) () / 2301 (0x8FD) ()
Language : / English / / English
ProjectX can extract srt subtitles as long as the broadcaster includes text-based subs; they may include a graphics format as well.
Reproduced below is a complete MediaInfo report on a typical Australian transmission from ABC TV, the local equivalent of the BBC. As you can see, there is a text-based subtitle stream. ProjectX can extract this either as an srt or similar text-based sub, and it can also produce a graphic version.
If the broadcaster only includes a graphic-based subtitle stream, then ProjectX cannot convert it to text; you need an OCR program for that.
Format : MPEG-TS
File size : 2.57 GiB
Duration : 1h 5mn
Overall bit rate mode : Variable
Overall bit rate : 5 579 Kbps

Video
ID : 512 (0x200)
Menu ID : 545 (0x221)
Format : MPEG Video
Format version : Version 2
Format profile : Main@Main
Format settings, BVOP : Yes
Format settings, Matrix : Custom
Format settings, GOP : M=3, N=12
Codec ID : 2
Duration : 1h 5mn
Bit rate mode : Variable
Bit rate : 5 045 Kbps
Maximum bit rate : 10 000 Kbps
Width : 720 pixels
Height : 576 pixels
Display aspect ratio : 16:9
Frame rate : 25.000 fps
Standard : PAL
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Interlaced
Scan order : Top Field First
Compression mode : Lossy
Bits/(Pixel*Frame) : 0.487
Stream size : 2.32 GiB (90%)

Audio
ID : 650 (0x28A)
Menu ID : 545 (0x221)
Format : MPEG Audio
Format version : Version 1
Format profile : Layer 2
Codec ID : 4
Duration : 1h 5mn
Bit rate mode : Constant
Bit rate : 256 Kbps
Maximum bit rate : 288 Kbps
Channel(s) : 2 channels
Sampling rate : 48.0 KHz
Compression mode : Lossy
Delay relative to video : -649ms
Stream size : 121 MiB (5%)
Language : English

Text
ID : 576 (0x240)-801
Menu ID : 545 (0x221)
Format : Teletext Subtitle
Language : English

Menu
ID : 256 (0x100)
Menu ID : 545 (0x221)
List : 512 (0x200) (MPEG Video) / 576 (0x240) () / 650 (0x28A) (MPEG Audio, English)
Language : / / English
Maximum bit rate : 7048000
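As a rough way to apply the text-versus-graphic distinction above to a pasted report, here is a small Python sketch. It only looks for the two "Format" values that appear in this thread ("Teletext Subtitle" and "DVB Subtitle"); the function name and the "text"/"bitmap" labels are my own, and other subtitle formats MediaInfo may report are not handled.

```python
import re

# Only the two subtitle formats seen in this thread are recognised.
SUBTITLE_FORMAT_RE = re.compile(r"Format\s*:\s*(Teletext Subtitle|DVB Subtitle)")


def subtitle_kinds(report):
    """Map each subtitle format named in a MediaInfo text report to 'text' or 'bitmap'."""
    kinds = {}
    for fmt in SUBTITLE_FORMAT_RE.findall(report):
        # Teletext can be saved straight to srt; DVB subtitles are images and need OCR.
        kinds[fmt] = "text" if fmt == "Teletext Subtitle" else "bitmap"
    return kinds


print(subtitle_kinds("Text ID : 576 (0x240)-801 Format : Teletext Subtitle Language : English"))
print(subtitle_kinds("ID : 533 (0x215) Format : DVB Subtitle Codec ID : 6"))
```

If the result contains only "bitmap" entries, direct srt output from ProjectX is off the table and the SUP/SUB output has to go through OCR instead.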
Thanks, that confirms what I suspected. Unless anyone has reason to believe otherwise, I don't think any of the European channels have text-based subtitles (at least none that I have access to: U.K., Poland, France and Germany).
Does Australia offer text-based subtitles because they still have analogue broadcasts, or is this just a different kind of service?
Here's the MediaInfo information from a DVB recording I've just made:
ID : 1 (0x1)
Complete name : I:\TV Recordings\Attenborough- 60 Years in the Wild_BBC 2 England.ts
Format : MPEG-TS
File size : 188 MiB
Duration : 6mn 0s
Overall bit rate mode : Variable
Overall bit rate : 4 384 Kbps

ID : 552 (0x228)
Menu ID : 1 (0x1)
Format : MPEG Video
Format version : Version 2
Format profile : Main@Main
Format settings, BVOP : Yes
Format settings, Matrix : Custom
Format settings, GOP : Variable
Codec ID : 2
Duration : 6mn 0s
Bit rate mode : Variable
Bit rate : 3 715 Kbps
Maximum bit rate : 15.0 Mbps
Width : 720 pixels
Height : 576 pixels
Display aspect ratio : 16:9
Frame rate : 25.000 fps
Standard : PAL
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Interlaced
Scan order : Top Field First
Compression mode : Lossy
Bits/(Pixel*Frame) : 0.358
Stream size : 160 MiB (85%)

ID : 554 (0x22A)
Menu ID : 1 (0x1)
Format : MPEG Audio
Format version : Version 1
Format profile : Layer 2
Codec ID : 3
Duration : 6mn 0s
Bit rate mode : Constant
Bit rate : 224 Kbps
Channel(s) : 2 channels
Sampling rate : 48.0 KHz
Compression mode : Lossy
Stream size : 9.63 MiB (5%)
Language : nar

ID : 555 (0x22B)
Menu ID : 1 (0x1)
Format : MPEG Audio
Format version : Version 1
Format profile : Layer 2
Codec ID : 3
Duration : 6mn 0s
Bit rate mode : Constant
Bit rate : 224 Kbps
Channel(s) : 2 channels
Sampling rate : 48.0 KHz
Compression mode : Lossy
Delay relative to video : 14ms
Stream size : 9.63 MiB (5%)
Language : English

ID : 551 (0x227)
Menu ID : 1 (0x1)
Format : DVB Subtitle
Codec ID : 6
Duration : 5mn 57s
Delay relative to video : 2s 813ms
Language : English
They are DVB bitmap subs according to your MediaInfo report, so all you can do is OCR them; ProjectX can't convert them to text-based subs. (BTW, with MediaInfo use Tree - Text format, as you can just copy and paste; it is a lot easier to read than the way you did it.) There is a thread on Doom9 from 2009 discussing this: http://forum.doom9.org/showthread.php?p=1303582#post1303582
i know there are 2 types of subtitles (teletext and dvb bitmap). teletext is pretty much soon to be dead in the uk so i'm not really interested in it. i can rip dvb subs with ProjectX as bitmaps (.sup and sup/idx) just fine. the problem lies in the OCR, i haven't found the right solution for that yet (with subrip or anything else). the fonts on the subs are all the same for all the recordings i have made so it should be very possible. here is a link to some .ts files.
Also see this http://www.acessibilidade.net/tdt/DVB_Subtitling_FAQ.pdf
Last edited by netmask56; 30th Nov 2012 at 17:42.