Hi. I have a video, several hours long, of a guy speaking.

I also have an *.srt text file (from a YouTube subtitles downloader) which contains thousands of entries like:
Code:
1
00:00:00,719 --> 00:00:02,919
hello

2
00:00:02,919 --> 00:00:04,759
my name is

3
00:00:04,759 --> 00:00:06,399
Sam

The guy in the video says "Hello my name is Sam". My final goal is to shuffle the words into something like "name my hello Sam" and render the result as a new video file.
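
To make the input concrete, here is a minimal Python sketch of reading entries like the ones above into (start, end, text) tuples, with times in milliseconds. The names `parse_srt` and `to_ms` are made up for illustration, and the sketch assumes a well-formed file; a real downloaded *.srt may need extra handling (for example, opening it with `encoding="utf-8-sig"` if it starts with a byte-order mark).

```python
import re

# Matches an SRT timing line such as "00:00:00,719 --> 00:00:02,919".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)

def to_ms(h, m, s, ms):
    """Convert the captured hour/minute/second/millisecond strings to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)

def parse_srt(text):
    """Return a list of (start_ms, end_ms, caption) tuples (hypothetical helper)."""
    cues = []
    # Entries are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the caption text.
                caption = " ".join(part.strip() for part in lines[i + 1:])
                cues.append((to_ms(*g[:4]), to_ms(*g[4:]), caption))
                break
    return cues

sample = """1
00:00:00,719 --> 00:00:02,919
hello

2
00:00:02,919 --> 00:00:04,759
my name is

3
00:00:04,759 --> 00:00:06,399
Sam
"""
cues = parse_srt(sample)
```

Integer milliseconds are used instead of floating-point seconds so that the time codes stay exact when they are later turned back into cut points.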

If it were as simple as this basic example, I would do it manually, but in my case I need to process about 30 hours of video of this guy's speech, with a thousand words or more in its *.srt file. So I can't do it manually.

I'm looking for an automated solution (Windows software): some kind of automatic batch video cutter, fragment joiner, or similar, to help me do this faster. I'm looking for a tool that will quickly pick some random words from the 30-hour video based on the time codes from my *.srt file and mix them up, with different words and different time codes every time. Thanks.
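
One possible way to automate this, sketched in Python under the assumption that the ffmpeg command-line tool is installed and on PATH: pick random entries, cut each one into its own clip, then join the clips with ffmpeg's concat demuxer. The names `build_commands`, `source.mp4`, `clip_0000.mp4`, `clips.txt`, and `mixed.mp4` are hypothetical. Each clip is re-encoded (libx264 video, AAC audio) because stream-copy cuts snap to keyframes, which is too coarse for single words. Note that cuts follow subtitle entries, so a multi-word entry like "my name is" stays together.

```python
import random

def build_commands(cues, source, count, seed=None):
    """Pick `count` random cues; return (ffmpeg commands, concat list text).

    `cues` is a list of (start_ms, end_ms, caption) tuples.
    """
    rng = random.Random(seed)          # pass a seed only for repeatable picks
    picked = rng.sample(cues, count)   # random order, no repeats
    commands, list_lines = [], []
    for n, (start_ms, end_ms, _caption) in enumerate(picked):
        clip = f"clip_{n:04d}.mp4"     # hypothetical clip file name
        commands.append([
            "ffmpeg", "-y",
            "-ss", f"{start_ms / 1000:.3f}",            # seek to the cue start
            "-i", source,
            "-t", f"{(end_ms - start_ms) / 1000:.3f}",  # keep only the cue duration
            "-c:v", "libx264", "-c:a", "aac",           # re-encode for accurate cuts
            clip,
        ])
        list_lines.append(f"file '{clip}'")
    # Final step: join the clips listed in clips.txt without re-encoding again.
    commands.append([
        "ffmpeg", "-y", "-f", "concat", "-safe", "0",
        "-i", "clips.txt", "-c", "copy", "mixed.mp4",
    ])
    return commands, "\n".join(list_lines) + "\n"

# The three entries from the example *.srt, as (start_ms, end_ms, caption).
cues = [(719, 2919, "hello"), (2919, 4759, "my name is"), (4759, 6399, "Sam")]
commands, concat_list = build_commands(cues, "source.mp4", count=3, seed=1)
```

The sketch only builds the commands. To actually run them, write `concat_list` to `clips.txt` and pass each command to `subprocess.run(cmd, check=True)` in order.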

Specs of my source video file:
Code:
Video: MPEG4 Video (H264) 1280x720 30fps 1399kbps [V: h264 main L3.1, yuv420p, 1280x720, 1399 kb/s]
Audio: AAC 48000Hz stereo 126kbps [A: SoundHandler (aac lc, 48000 Hz, stereo, 126 kb/s)]