VideoHelp Forum
  1. Member
    Join Date
    Jul 2010
    Location
    Germany
    How to read timestamps from a chapter file to use them in mkvmerge as splitting timecodes


    Problem: You want to split a Matroska file you are about to create. You are including chapters anyway, and it makes sense to split the file at the beginning of each chapter (concert, mixtape, etc.). Copying the timecodes manually from the chapter file works well, but it is time-consuming.


    Goal: Extract the timecodes from a chapter file (.xml or .txt) with the help of a few tools, so they can be used as split points for the splitting feature of mkvmerge.




    Tools:
    mkvmerge GUI (mmg, part of MKVToolNix) or just mkvmerge, the muxing application
    Grep, a command-line tool that searches files for strings; there is a build for Windows
    Notepad++, a text editor with regular-expression search and replace


    Note that I'm on Windows, so this guide explains how to do everything on a Windows system. It should work on any system, though: you only need grep and a text editor with regular-expression search and replace, so the steps should transfer to Linux.




    Step 1:


    Get all the tools and install them. You can use portable versions if available.


    Step 2:


    Make sure you have an .xml or .txt file that contains the chapters. If you are creating your Matroska file from an existing Matroska file that already contains the chapters, you can extract them with mkvextract (a command-line tool that comes with MKVToolNix). There are also tools with a GUI that can extract the chapters for you, such as MKVCleaver.


    Open the file to see which chapter format it uses. The first example below is the simple format; XML chapter files always look like the second example.



    Code:
    CHAPTER01=00:00:30.000
    CHAPTER01NAME=Name of first chapter
    CHAPTER02=00:05:00.000
    CHAPTER02NAME=Name of second chapter
     ...


    HTML Code:
     <?xml version="1.0" encoding="UTF-8"?>
     ...
     <ChapterAtom>
     <ChapterUID>0123456789</ChapterUID>
     <ChapterTimeStart>00:00:30.000000000</ChapterTimeStart>
     <ChapterFlagHidden>0</ChapterFlagHidden>
     <ChapterFlagEnabled>1</ChapterFlagEnabled>
     <ChapterDisplay>
     <ChapterString>Name of first chapter</ChapterString>
     </ChapterDisplay>
     </ChapterAtom>
     <ChapterAtom>
     <ChapterUID>1234567890</ChapterUID>
     <ChapterTimeStart>00:05:00.000000000</ChapterTimeStart>
     <ChapterFlagHidden>0</ChapterFlagHidden>
     <ChapterFlagEnabled>1</ChapterFlagEnabled>
     <ChapterDisplay>
     <ChapterString>Name of second chapter</ChapterString>
     </ChapterDisplay>
     </ChapterAtom>
     ...



    There may also be a
    HTML Code:
     <ChapterTimeEnd>
    element, which defines where a chapter ends.


    Note: I modified the appearance of the XML example.
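If you would rather let a script decide which of the two formats a file uses, the check can be sketched roughly as follows. This is only a sketch, assuming that a simple chapter file contains lines starting with CHAPTERnn= and that an XML chapter file contains a ChapterTimeStart element; the function name detect_chapter_format is made up for this example.

```python
import re


def detect_chapter_format(text):
    """Guess the chapter format from the file content.

    Returns "xml", "simple" or "unknown". The two patterns are
    assumptions based on the two examples shown in this guide.
    """
    if re.search(r"<ChapterTimeStart>", text, re.IGNORECASE):
        return "xml"
    if re.search(r"^[ \t]*CHAPTER\d+=", text, re.IGNORECASE | re.MULTILINE):
        return "simple"
    return "unknown"
```

Read the chapter file into a string first and pass that string to the function.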


    Step 3:


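    Run grep from the command prompt:


    "C:\Program Files (x86)\GnuWin32\bin\grep.exe" -i "string" "C:\location where the chapter file is\chapter file.extension"


    The "-i" option makes grep ignore case. Note that it goes outside the quotes around the path to grep.exe.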


    For simple chapters, replace "string" with "Chapter[0-9][0-9]=".
    The output should look like this:
    Code:
    CHAPTER01=00:00:30.000
    CHAPTER02=00:05:00.000
    For XML chapters, replace "string" with "chaptertimestart".
    The output should look like this:
    HTML Code:
     <ChapterTimeStart>00:00:30.000000000</ChapterTimeStart>
     <ChapterTimeStart>00:05:00.000000000</ChapterTimeStart>
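For anyone without a grep build at hand, the filtering in this step can be approximated in Python. This is a rough sketch, not a replacement for grep: the helper name grep_lines is made up, and unlike grep it also strips leading and trailing whitespace from each matching line.

```python
import re


def grep_lines(text, pattern):
    """Return the lines of text that match pattern, ignoring case (like grep -i).

    Surrounding whitespace is stripped from each hit, which grep itself
    does not do.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    return [line.strip() for line in text.splitlines() if regex.search(line)]
```

Call it with the same search strings as above, for example grep_lines(content, "Chapter[0-9][0-9]=") for simple chapters or grep_lines(content, "chaptertimestart") for XML chapters, where content is the text of the chapter file.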


    Step 4:



    Open Notepad++


    Simple chapters:
    Paste the grep output for the simple chapters. Remove any chapters that should not be used as split points (the first or last one, for example).
    Select everything and press CTRL+H (Replace).
    Select "Regular expression" as the search mode.
    Search for "CHAPTER[0-9][0-9]=" (mind the case).
    Replace with nothing (leave the box blank) and use "Replace All".
    Select everything and press CTRL+J to join it into a single line.
    Now you have
    00:00:30.000 00:05:00.000
    but mkvmerge needs the values separated by commas.
    Replace each space with a comma (the replace function can do that for you).
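The same clean-up for simple chapters can be done in one go with a short script. This is a minimal sketch under the assumption that every timecode sits on a line of the form CHAPTERnn=timecode; the function name and the skip_first switch are made up for this example.

```python
import re


def simple_chapters_to_split_list(text, skip_first=False):
    """Turn simple-format chapter text into a comma-separated timecode list.

    CHAPTERnnNAME= lines are ignored because the pattern requires "="
    directly after the digits. skip_first drops the first chapter,
    e.g. when it starts at the very beginning of the file.
    """
    timecodes = re.findall(r"^[ \t]*CHAPTER\d+=(\S+)", text,
                           re.IGNORECASE | re.MULTILINE)
    if skip_first:
        timecodes = timecodes[1:]
    return ",".join(timecodes)
```

It works on the whole chapter file as well as on the grep output, so the grep step can be skipped when using it.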




    XML chapters:
    Paste the output from grep into Notepad++.
    There might be spaces at the beginning of the lines, but you can remove them easily.
    Remove any chapters that should not be used as split points (the first or last one, for example).
    Select everything and press CTRL+H (Replace).
    Select "Regular expression" as the search mode.
    Search for "[a-zA-Z<>/]" (this matches every letter and the characters that make up the tags).
    Replace with nothing (leave the box blank) and use "Replace All".
    Select everything and press CTRL+J to join it into a single line.
    The timecodes you get are separated by spaces, but mkvmerge requires commas.
    Simply replace each space with a comma (the replace function can do that for you).
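For XML chapters a script can read the file with an XML parser instead of stripping the tags with a regular expression. Note that this is a different technique from the Notepad++ steps above, not the same one automated. The sketch below uses Python's standard xml.etree.ElementTree module and assumes the file is well-formed XML with ChapterTimeStart elements; the function name is made up, and nested sub-chapters, if any, are included too.

```python
import xml.etree.ElementTree as ET


def xml_chapters_to_split_list(xml_text, skip_first=False):
    """Collect all ChapterTimeStart values as a comma-separated list.

    iter() walks the whole tree, so the values come out in document
    order regardless of how deeply the ChapterAtom elements are nested.
    """
    root = ET.fromstring(xml_text)
    timecodes = [el.text.strip() for el in root.iter("ChapterTimeStart") if el.text]
    if skip_first:
        timecodes = timecodes[1:]
    return ",".join(timecodes)
```

The timecodes keep their nine decimal places, exactly as they appear in the XML file.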


    Step 5:


    Open the mkvmerge GUI, or use plain mkvmerge if you prefer command-line tools.
    Prepare everything for muxing.
    Feed mkvmerge the comma-separated line of timecodes you just created.
    In the GUI you find the field under [Global] → Splitting → after timecodes.


    Mux it. Done.
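If you use plain mkvmerge instead of the GUI, the timecode list goes into the --split option. The sketch below only builds the argument list and does not run anything. It assumes a recent mkvmerge, where the keyword is spelled "timestamps:"; older versions such as those from 2010 used "timecodes:", so check the documentation of your version. The function name and the file names are placeholders made up for this example.

```python
def build_mkvmerge_command(source, output, split_list, keyword="timestamps"):
    """Build the mkvmerge argument list for splitting at the given timecodes.

    split_list is the comma-separated line created in step 4.
    Pass keyword="timecodes" for older mkvmerge versions (assumption:
    check which spelling your version expects).
    """
    return ["mkvmerge", "-o", output, "--split", f"{keyword}:{split_list}", source]


# Hypothetical file names, for illustration only.
command = build_mkvmerge_command("concert.mkv", "concert-split.mkv",
                                 "00:00:30.000,00:05:00.000")
print(" ".join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) once mkvmerge is on the PATH.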



    Author's note: Yes, you see that right, there are no screenshots. I don't think they are needed. I don't normally work with regular expressions, but I used them to simplify the guide. I'm sure it's possible to write a script that does all this even quicker. Maybe a little application could do it too.

    EDIT: Some stuff got filtered.
  2. Member
    Join Date
    Jul 2010
    Location
    Germany
    I won't edit the "How to" again, because the XML code (from the chapters) is filtered out every time I edit the post. (There's a spelling mistake, but it does not affect the content.)

    EDIT: So that nobody gets me wrong: I agree that HTML code, XML, JavaScript etc. need to be filtered. It's a security feature, and I'm glad I can post here.

    The formatting, especially the alignment of the XML, is off, but it's readable.
    Last edited by bastik; 24th Jul 2010 at 12:44.