I am test-converting MPEG video to DV-AVI using both Adobe Media Encoder and VirtualDubMod (using Cedocida). I have converted the same VOB file using both programs, then opened both DV files in MediaInfo to see the differences. There aren't many, but I found a major one that I'm hoping someone can explain: audio interleaving (both files use PCM uncompressed).
In Adobe Media Encoder:
Interleave, duration: 957 ms (28.67 video frames)
Interleave, preload duration: 967 ms
In VirtualDubMod (Cedocida):
Interleave, duration: 33 ms (1.00 video frame)
Interleave, preload duration: 500 ms
That's a pretty big difference. I don't understand what it means (though I did read a post here that explains the general concept of interleaving so I get that). I can set the two parameters manually in VirtualDubMod, but is that necessary? Also note that on another file comparison, the numbers were similar to the above - AME is always much higher than VDM. So I am wondering why, and if it matters or if it's anything to worry about.
For the record, the only other discrepancies I noted are that the Adobe DV file registers as Commercial Name: DVCPRO in MediaInfo, while the VDM Cedocida file has no commercial name listed. Also, the Adobe file lists the time code of the first frame and the time code source, while the VDM file does not.
Appreciative of any insight anyone can give. Thanks so much for your help.
Audio and video are interleaved in an AVI file. That means you have a little video, a little audio, a little video, a little audio, etc. Different programs use different interleave chunk sizes. VirtualDub is interleaving one video frame, one frame's worth of audio, one video frame, one frame's worth of audio (you can change the chunk size in the audio settings). AME is using larger chunks: ~29 video frames, ~29 video frames' worth of audio, etc.
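To make the two layouts concrete, here is a rough Python sketch of the chunk order described above. The function name `interleave_layout` and the tuple format are made up for illustration; this is not how either program actually writes the file, and a real AVI stores each video frame as its own chunk.

```python
def interleave_layout(total_frames, frames_per_chunk):
    """Return the order of data in a hypothetical interleaved file.

    Each entry is (kind, first_frame, frame_count). An audio entry covers
    the same span of time as the run of video frames just before it.
    """
    layout = []
    for start in range(0, total_frames, frames_per_chunk):
        count = min(frames_per_chunk, total_frames - start)
        layout.append(("video", start, count))
        layout.append(("audio", start, count))
    return layout


# Small "teeth": one frame of video, then one frame's worth of audio.
print(interleave_layout(6, 1))

# Larger "teeth": three frames of video, then three frames' worth of audio.
print(interleave_layout(6, 3))
```

Both layouts hold exactly the same video and audio; only the grouping differs.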
Like putting the 2 sides' teeth together in a zipper. Just one uses small teeth, another uses larger teeth.
No difference as long as the playback buffer can keep up with realtime needs. In stress situations there would probably be fewer issues with the small chunk size, though the larger chunk size could be considered more efficient from a continuous data gathering view.
One could offer better guidance if one knew what you are attempting to achieve.
Both mpeg2 and DV are compressed (DV somewhat less than mpeg2). Yet you cannot recover what was lost in the original mpeg2 by converting it to DV.
It is oft suggested that you convert a lossy format to a lossless one for the purpose of editing/post capture enhancement. But DV is not lossless so there is no real gain in doing that.
When I was actively capturing VHS I transcoded DV from an ADVC to mpeg2 since my final destination was DVD. Even then the transcoded mpeg2 was not truly DVD compliant and was usually re-encoded.
But that aside, state your purpose other than experimentation. DV is a good archive format but IMO if your original is mpeg2 then that is also acceptable.
If your purpose is post-capture enhancement you would be better off converting that mpeg2 to lagarith or another lossless format.
As for your original question there is a vast difference between interleaving per frame and per second.
Thanks so much for the replies and info!
DB83, it really is just an experiment - I was curious to see if the DV file produced by VirtualDubMod was similar in specs to the DV file produced by Adobe Media Encoder. I used a VOB as the input. That's it. I posted because I was curious as to why the interleaves in the audio were so vastly different while almost every single other parameter was identical. If one duration is preferred over the other for some reason, I'd probably prefer to use that software for DV conversions.
You mentioned my original question but did not answer it. Which is preferred, in your opinion - interleaving every 28.98 frames (967 ms) or every frame (33 ms)?
Ok. I did not answer it since there were already replies before mine from people who IMO have a greater knowledge of digital video than I do, and they did not answer it either.
But surely it is obvious. A video that interleaves every frame is much more accurate than one that interleaves every second.
When you watch a movie you don't watch all the video first, then listen to all the audio (or vice versa); you watch the video and listen to the audio at the same time. If the file contained all the video first, then all the audio at the end, the computer would have to seek back and forth to play video and audio at the same time. That's possible on a hard drive, where seek times are a few milliseconds (one frame of video at 30 fps lasts about 33 milliseconds). But it's catastrophic on slow devices like CD and DVD, where seek times are measured in seconds. Interleaving the data makes it possible to read the file linearly (without any seeking) to get a bit of each, then start playing them. While the first pair (video+audio) is playing, the next pair is read (again, without any seeking). Once the first pair is done playing, the second pair is already in memory so it can be played, then while that pair is playing the next pair can be read, etc.
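The seeking argument can be illustrated with a toy model in Python. Everything here is an assumption made for the sake of the example: the helper `count_seeks` is hypothetical, chunks are modelled as one per frame, and the "player" has no read-ahead buffer at all, so it must fetch each frame's video and audio in playback order.

```python
def count_seeks(layout):
    """Count non-sequential jumps needed to play a file in time order.

    `layout` is the on-disk order of chunks, each a (kind, frame_index) pair.
    Playback wants video for frame i, then audio for frame i, for i = 0, 1, ...
    A read counts as a seek unless it is the chunk right after the last one read.
    """
    position = {chunk: i for i, chunk in enumerate(layout)}
    frames = len(layout) // 2
    seeks = 0
    last = -1  # so that reading the very first chunk is not counted as a seek
    for i in range(frames):
        for kind in ("video", "audio"):
            where = position[(kind, i)]
            if where != last + 1:
                seeks += 1
            last = where
    return seeks


frames = 5
# video 0, audio 0, video 1, audio 1, ...
interleaved = [(k, i) for i in range(frames) for k in ("video", "audio")]
# all video first, then all audio
separate = [("video", i) for i in range(frames)] + [("audio", i) for i in range(frames)]

print(count_seeks(interleaved))  # 0: the file is read straight through
print(count_seeks(separate))     # 9: every read after the first jumps
```

A real player buffers data, so larger interleave chunks do not cause seeks in practice; the model only shows why some interleaving is needed.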
The optimal interleave size depends on many things including the frame size, frame rate, the codecs used, the speed/size of the hardware, etc. But it's typical for the interleaving to be anywhere from 1 frame of video to 1 second of video (eg, 30 frames at 30 frames per second). The latter requires more memory (as more video and audio data needs to be stored temporarily in memory) and is slightly more space efficient (fewer audio chunks). The interleave size makes no difference in accuracy.
"Preload" is the amount of audio that appears before the first frame of video (not played before the video, but as it appears within the file). Again, this is done to make it easier for players to handle the data.
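As a rough worked example of the memory and chunk-count trade-off, the Python sketch below plugs in the two interleave durations from the first post. The numbers it relies on are assumptions, not values read from the files: NTSC DV at 120,000 bytes per frame and 30000/1001 fps, and PCM audio at 48 kHz, 16-bit, stereo. The function names are made up for illustration.

```python
DV_NTSC_FRAME_BYTES = 120_000          # assumed: one NTSC DV frame
FRAME_RATE = 30000 / 1001              # assumed: NTSC, about 29.97 fps
PCM_BYTES_PER_SECOND = 48_000 * 2 * 2  # assumed: 48 kHz, 16-bit, stereo


def chunk_sizes(interleave_ms):
    """Approximate data held per interleave unit for a given duration."""
    seconds = interleave_ms / 1000
    frames = seconds * FRAME_RATE
    return {
        "frames": frames,
        "video_bytes": frames * DV_NTSC_FRAME_BYTES,
        "audio_bytes": seconds * PCM_BYTES_PER_SECOND,
    }


def audio_chunks_per_hour(interleave_ms):
    """How many audio chunks one hour of footage would contain."""
    return 3_600_000 / interleave_ms


for ms in (957, 33):
    sizes = chunk_sizes(ms)
    print(
        f"{ms} ms: {sizes['frames']:.2f} frames, "
        f"{sizes['video_bytes'] / 1e6:.2f} MB video, "
        f"{sizes['audio_bytes'] / 1e3:.1f} KB audio, "
        f"{audio_chunks_per_hour(ms):.0f} audio chunks per hour"
    )
```

Under these assumptions the longer interleave holds a few megabytes per unit instead of roughly one frame's worth, and produces far fewer audio chunks per hour. Neither figure is large next to the size of the DV data itself.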
Last edited by jagabo; 16th Jan 2018 at 18:45.
To be clear, "accuracy" has nothing to do with it. Assuming competent, non-impaired, matched subsystems, both should be fully accurate, fully smooth experiences. Particularly with I-frame (DV in this case) material.
I stand corrected. I was just imagining the scenario of an interleave for DV of more than one frame and the possibility of errors creeping in.
So are you actually saying that for DV it is feasible to have an interleave of more than one frame (one second in the OP's case)? Or am I simply confusing DV being I-frame with interleaving, which I accept can vary according to the codec?
The interleaving of DV AVI is no different than that of any other AVI. The program creating the AVI file uses whatever interleaving it wants.
Note that the DV stream that arrives from the camcorder via firewire has audio and video already interleaved (I don't know at what size). In a Type 1 DV AVI that interleaved data is saved as a single stream in the file, marked as a video stream. There is no explicit audio stream. This was done to minimize the amount of processing the capture program had to perform. But it means many programs see no audio since there is no explicit audio stream. In a Type 2 DV AVI file the program creating the file extracts a copy of the audio from the DV stream and saves it as a separate audio stream.
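A minimal Python sketch of how a tool might tell the two layouts apart, given the stream types it found in the AVI header. It assumes the commonly documented convention that a Type 1 file labels its single combined stream with the four-character code `iavs`, while a Type 2 file has separate `vids` and `auds` streams. The function `dv_avi_type` and its input list are hypothetical; actually parsing the stream headers out of the file is not shown.

```python
def dv_avi_type(stream_types):
    """Classify a DV AVI from the stream type codes in its header.

    `stream_types` is a list such as ["iavs"] or ["vids", "auds"], as a
    hypothetical header parser might report them.
    """
    types = [t.lower() for t in stream_types]
    if "iavs" in types:
        # One combined stream carrying the camcorder's interleaved data.
        return "Type 1"
    if "vids" in types and "auds" in types:
        # Video stream plus an explicit, separately stored audio stream.
        return "Type 2"
    return "unknown"


print(dv_avi_type(["iavs"]))          # Type 1
print(dv_avi_type(["vids", "auds"]))  # Type 2
```

The MediaInfo interleave figures discussed in this thread apply to files with an explicit audio stream, which is the Type 2 case.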
Last edited by jagabo; 17th Jan 2018 at 08:58.