Hi All,

As part of a German language learning and revision exercise, I would like to be able to convert German text files to mobile-phone (3GP) video files, consisting of an audio track produced by a text-to-speech synthesiser, and a video track showing captions of what is currently being spoken.


I have spent a couple of weeks thinking about this "text-to-video problem", and I have created the following five test videos :-

https://rapidshare.com/files/2226778865/Test1_3GP.ZIP (5.40 MB*) (*click green Download button to download ZIP file)
https://rapidshare.com/files/3027480982/Test2_3GP.ZIP (4.26 MB*)
https://rapidshare.com/files/1404446408/Test3_3GP.ZIP (2.93 MB*)
https://rapidshare.com/files/1727854584/Test4_3GP.ZIP (4.37 MB*)
https://rapidshare.com/files/2463102051/Test5_3GP.ZIP (2.30 MB*)

I have listed below the method I used to create these videos.

My questions are :-

[1] Is there an easier way to create videos similar to these? I ask because the method I'm currently using seems, to me, unnecessarily complicated.
[2] Does anyone have experience creating videos similar to these? What software etc. are you using to create your videos?

The method I used to create my test videos is as follows :-

[a] Create a text file containing the text which is to be converted to video.
[b] Using a little Visual Basic application, "pre-format" the text. (Basically, any sentence which is too long to fit into a 176 x 144 pixel frame is cut into multiple "pieces". Likewise, any group of consecutive short sentences which together are small enough to fit into 176 x 144 is combined into one "piece".)
[c] Using the same Visual Basic application, display each "piece of text" in turn, and play the speech using CEPSTRAL.
[d] Using CAM STUDIO, record the Visual Basic application to a 176 x 144 pixel AVI file.
[e] Observe that the quality (clarity) of the audio track recorded in the AVI is not quite as good as the quality which CEPSTRAL is capable of producing. (This is where it starts getting complicated. But basically, on Windows 7, audio loses quality when it is routed via Visual Basic.)
[f] Using CEPSTRAL, re-record the text file directly to a WAV file (with perfect quality).
[g] Using VIRTUAL DUB, extract the "Cam Studio WAV" from the "Cam Studio AVI".
[h] Using AUDACITY, observe that the timing of the "Cam Studio WAV" is not the same as the timing of the "Cepstral quality WAV". (Specifically, step [c] introduces a slight delay between each "piece of text" in the "Cam Studio WAV", which is not present in the "Cepstral quality WAV".)
[i] Using a calculator(!), calculate the "delay difference" between the "Cam Studio WAV" and the "Cepstral quality WAV" for each "piece of text". (Actually not as difficult as it sounds!)
[j] Using CEPSTRAL, re-record the text file directly to a WAV file (with perfect quality and also correct timing).
[k] Using VIRTUAL DUB, re-make the AVI using the WAV file from step [j].
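
To make step [b] more concrete, here is a rough sketch of the splitting and grouping logic, written in Python rather than Visual Basic. It is not my actual application: the screen capacity (CHARS_PER_LINE, LINES_PER_SCREEN) and the function names are assumptions chosen for illustration, and the real limits depend on the font used.

```python
import re
import textwrap

# Hypothetical capacity of one 176 x 144 frame; the real values depend on
# the font and size used, so treat these numbers as placeholders.
CHARS_PER_LINE = 22
LINES_PER_SCREEN = 6


def wrapped_lines(text):
    """Wrap text at whitespace only, so joining the lines loses no words.

    A single word longer than CHARS_PER_LINE is left on its own line and
    will overflow; this sketch does not try to hyphenate it.
    """
    return textwrap.wrap(
        text,
        width=CHARS_PER_LINE,
        break_long_words=False,
        break_on_hyphens=False,
    )


def fits(text):
    """True if the text needs no more lines than one screen can show."""
    return len(wrapped_lines(text)) <= LINES_PER_SCREEN


def split_into_pieces(text):
    """Group short sentences together and cut long ones into several pieces."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    pieces = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if fits(candidate):
            # Still room on the screen: keep grouping sentences together.
            current = candidate
            continue
        if current:
            pieces.append(current)
            current = ""
        if fits(sentence):
            current = sentence
        else:
            # Sentence is too long for one screen: cut it every
            # LINES_PER_SCREEN wrapped lines.
            lines = wrapped_lines(sentence)
            for i in range(0, len(lines), LINES_PER_SCREEN):
                pieces.append(" ".join(lines[i:i + LINES_PER_SCREEN]))
    if current:
        pieces.append(current)
    return pieces
```

Each returned piece would then be displayed and spoken in turn, as in step [c]. The sentence splitting is deliberately simple and will mis-split at abbreviations such as "z.B.".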

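The calculator work in step [i] could also be scripted. The sketch below is in Python and assumes that I have read the start time of each "piece of text" off both waveforms in AUDACITY, and that the speech itself lasts equally long in both recordings, so that any difference comes only from the pauses between pieces. The function name and the sample times are made up for illustration.

```python
def extra_pauses(cam_starts, cepstral_starts):
    """Return the extra silence (in seconds) needed before pieces 2, 3, ...

    cam_starts:      start time of each piece in the "Cam Studio WAV"
    cepstral_starts: start time of each piece in the "Cepstral quality WAV"

    Assumes the spoken audio has the same duration in both files, so the
    difference between piece-to-piece intervals is pure added delay.
    """
    if len(cam_starts) != len(cepstral_starts):
        raise ValueError("both recordings must contain the same pieces")
    pauses = []
    for i in range(1, len(cam_starts)):
        cam_interval = cam_starts[i] - cam_starts[i - 1]
        cepstral_interval = cepstral_starts[i] - cepstral_starts[i - 1]
        # Round to milliseconds to hide floating-point noise.
        pauses.append(round(cam_interval - cepstral_interval, 3))
    return pauses


# Illustrative (invented) start times in seconds, as read by hand.
cam = [0.0, 4.2, 9.1, 13.0]
cepstral = [0.0, 3.9, 8.5, 12.1]
for number, pause in enumerate(extra_pauses(cam, cepstral), start=2):
    print(f"piece {number}: add {pause} s of silence before it")
```

The resulting values are the amounts of silence that would have to be added between pieces when re-recording in step [j], so that the clean audio lines up with the captions in the AVI.
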
Thank you for your time,
Best regards,
James