I'm going through a set of media files and have discovered that the embedded subs are borked in a strange way: pretty much every character has generated a totally new entry, so you get output like this:

Code:
4
00:00:07,474 --> 00:00:07,507
*P****************************

5
00:00:07,507 --> 00:00:07,540
*Pre**************************

6
00:00:07,540 --> 00:00:07,574
*Previ************************

7
00:00:07,574 --> 00:00:07,607
*Previou**********************

8
00:00:07,607 --> 00:00:07,640
*Previousl********************

9
00:00:07,640 --> 00:00:07,674
*Previously*******************

10
00:00:07,674 --> 00:00:08,074
*Previously*on****************
Dumping the subs into an SRT file generates a roughly 3 MB file, which is just stupid, and it makes the subs inside the media files pretty much unusable.

As I can dump the subs and re-embed them using something like MKVToolNix, the obvious question is: is this a known problem that has happened before, with an easy fix? I realise it would be possible to throw together a bash script to try to make sense of the files, but it would still require a huge amount of user verification to make sure they'd be 'fixed' correctly, almost to the point of having to watch every file all the way through, and there are nearly 50 hours of media files here.

Any ideas?