I am working on a project that encodes video in real time from a webcam input. Normally this wouldn't be an especially complicated task. However, the nature of the project requires each encoded frame to appear in the final video stream at precisely the time it was captured (there is about ±25 ms of wiggle room, but that's it).

For example, if our video is 10 minutes long, then the final frame absolutely has to be at 10:00, and a frame captured at 5:37 has to be at 5:37 in the video.

We are using VP8 in its vpxenc implementation as our video encoder. So far, the best solution we have found is to store the timecode data separately and remux the video after encoding using mkvmerge. However, this adds several extra steps to our already complicated workflow and brings additional server load.
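
To make the workaround concrete, here is a minimal sketch of how per-frame capture times could be turned into an external timestamp file for mkvmerge. It is an illustration only, not our actual code: the function name `format_timecodes_v2` is hypothetical, capture times are assumed to be seconds from a monotonic clock, and the header line assumes the "v2" text format (newer mkvmerge versions call it `# timestamp format v2`).

```python
def format_timecodes_v2(capture_times_s):
    """Build the text of a v2 timestamp file from capture times in seconds.

    The first frame is placed at 0 ms; every later frame is placed at its
    offset from the first frame, rounded to the nearest millisecond.
    """
    if not capture_times_s:
        raise ValueError("need at least one captured frame")

    start = capture_times_s[0]
    # Assumed header for the v2 format: one timestamp in ms per line follows.
    lines = ["# timecode format v2"]
    for t in capture_times_s:
        lines.append(str(round((t - start) * 1000)))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Three frames captured at irregular intervals (hypothetical values).
    print(format_timecodes_v2([10.0, 10.04, 10.1]), end="")
```

The resulting file would then be handed to mkvmerge during the remux, for example via its `--timecodes 0:timecodes.txt` option (spelled `--timestamps` in newer releases), which is exactly the extra step we would like to avoid.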

So, what I would like to ask is whether there is a way to write those timecodes directly at encode time, so that we can skip the remuxing step at the end. Hopefully someone has already encountered a similar problem and can help.