Hello, I am looking for some help extracting the audio from some footage that was recorded with a Logitech C615 webcam. Thus far I am able to play back the file and get video (however, it plays back too fast), but I cannot figure out how to de-mux or otherwise extract the audio stream.
Here is the background. While recording with the Logitech Webcam software, the raw video stream (1920x1080 @ 15 fps MJPG) and the raw audio stream (which I believe is 16-bit LPCM sampled at 44.1 kHz) are stored in a .tmp file and not encapsulated in a standard container. If the webcam recording is stopped properly using the Logitech software controls, the software then encodes the raw video and audio data into a compressed .wmv file. This encoding process takes nearly as long as the duration of the recording. If for some reason this process is terminated, it is not possible to resume encoding, and all that is left behind is the .tmp file. This is what happened to me. I realize this is really a shortcoming of the Logitech software, and I will not be using it in the future. Logitech is of zero help, and I really need to recover the audio from the raw .tmp file.
Interestingly, if I simply rename the .tmp file to .mjpg, the video plays successfully using VLC (but plays too fast), but I am not able to get the audio.
I have tried tons of video/audio repair tools and video/audio editors (over 20 different programs in total) to try to extract the audio, but without success. The closest I have gotten to success is by renaming the .tmp file to .raw and importing it into Adobe Audition, where it has you choose the sample rate, bit depth, and other parameters of the audio stream, but this only results in a bunch of white noise.
I had posted a thread in the video conversion section about this same file I am trying to recover, which is how I figured out the video part of it. But, for the life of me, I cannot figure out how to get the audio. I have pasted below some guidance I received in the other thread, but I don't know how to go about doing what was suggested. Sorry, I am kind of a newbie at all this.
From what I gather, the audio is somehow multiplexed into the video data, which is why I am guessing that I just get white noise when importing it into Audition.
I have uploaded a shorter sample of a similar .tmp file, but I had to rename it to .wmv in order to be allowed to upload it.
Any help on this would be greatly appreciated.
Some guidance I received in a private message is below. It is related to this thread:
Not sure which app you used to do the recording, but the app and its methods are key to why it is doing it in this way.
Btw, webcams such as Logitech cams are universally available to any recording app that supports the UVC protocol, and it is not the cam, nor the protocol, that determines the codec or the container; it is the app. It picks one of the codec options provided by the cam (that I mentioned in that prev. thread), tells the cam to output in that format, then captures the incoming stream and encapsulates it in a standard multimedia container.
The reason your app has a tmp file is that it is somewhat poorly written: it just "saves" the raw stream, then processes the tmp into a standard container after the capture has completed. If it were better written, it would stream & encapsulate just-in-time while the stream is coming in. It is also unclear, but quite possible, that the app transcodes between the raw save and the encapsulation. That would be bad, as it usually generates further loss.
Logic would dictate that the stream of the tmp is either a raw video-only stream, or a multiplexed (zippered) combination of vid & aud. Much here depends on how the app works. That cam OUGHT to be supplying both.
I would do what was suggested previously regarding the runtime calc to determine the bitrate. This will indicate which of the codecs are used (uncompressed vs mjpeg vs h264...). Then you could do a raw read (testing various packing options), an elementary stream mjpeg read, or an elementary h264 stream read using ffmpeg or avisynth.
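As a rough illustration of that runtime calculation, here is a minimal Python sketch. The function names are made up for this example, and the 2 bytes per pixel figure assumes YUY2-style uncompressed packing, which is only a guess about what the cam would send. It computes the file's average bitrate from its size and known duration so it can be compared against what uncompressed video and the stated LPCM audio would need.

```python
def average_bitrate_bps(size_bytes: int, duration_s: float) -> float:
    """Average bitrate of the whole file in bits per second."""
    return size_bytes * 8 / duration_s


def uncompressed_video_bps(width: int, height: int, fps: float,
                           bytes_per_pixel: int = 2) -> int:
    """Bitrate of uncompressed video.

    bytes_per_pixel=2 assumes YUY2 (4:2:2) packing; this is an assumption,
    not something confirmed for this webcam or app.
    """
    return int(width * height * bytes_per_pixel * fps * 8)


def pcm_audio_bps(sample_rate: int = 44100, bits: int = 16,
                  channels: int = 1) -> int:
    """Bitrate of raw LPCM audio with the parameters given in the thread."""
    return sample_rate * bits * channels
```

If the measured average comes out far below `uncompressed_video_bps(1920, 1080, 15)`, the stream is almost certainly compressed (MJPEG or H.264) and not raw. The audio figure is small by comparison, so it will not change which codec the total points to.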
I use both, but not regularly, so am not the best reference for this.
The fact that your ffmpeg script output an "invalid data" error message could mean it is the wrong codec type option, or it could mean you have junk (aka no good data). Hard to tell without trying further codec options.
If the raw stream is multiplexed, that will much further complicate reading properly.
Hope that helps,
Your file isn't a video file
The sample is some kind of raw mjpeg stream. I don't know how to get the audio out of it but you can remux the video to another container. ffprobe shows timestamps every 40 ms, 25 fps. Rename the file with an mjpeg extension then use ffmpeg:
ffmpeg -i sample.mjpeg -r 25 -codec copy sample.avi
There was another thread about these videos here a few months ago. I don't recall if there were any resolutions.
Last edited by jagabo; 5th Feb 2020 at 10:04.
Perhaps you can:
Mux the raw stream.
Demux to a new raw stream.
It might just extract purely the video frames.
If so, then use some form of hex file comparator/diff. The "difference" should be the audio.
But again, it would help much more to get deets from Logitech regarding the stream packetization order, because then you could just strip them yourself.
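One way to try the "difference" idea without a hex diff tool is sketched below in Python. It is only a sketch under assumptions: it assumes each video frame is a complete JPEG image delimited by the standard start-of-image (FF D8) and end-of-image (FF D9) markers, and that anything between frames is non-video data, possibly the audio. The real layout of the Logitech .tmp file is not known, and the function name is made up for this example.

```python
SOI = b"\xff\xd8\xff"  # JPEG start-of-image plus the 0xFF of the next marker
EOI = b"\xff\xd9"      # JPEG end-of-image


def split_frames_and_residue(data: bytes):
    """Split a byte string into JPEG frames and the bytes left over.

    Returns (frames, residue). The residue is every byte that does not
    fall inside an SOI..EOI span, concatenated in file order.
    """
    frames = []
    residue = bytearray()
    pos = 0
    while True:
        start = data.find(SOI, pos)
        if start < 0:
            residue += data[pos:]  # no more frames: the rest is residue
            break
        end = data.find(EOI, start + len(SOI))
        if end < 0:
            residue += data[pos:]  # truncated last frame counts as residue
            break
        end += len(EOI)
        residue += data[pos:start]  # bytes before this frame
        frames.append(data[start:end])
        pos = end
    return frames, bytes(residue)
```

Reading the whole .tmp file into memory and passing it to `split_frames_and_residue` would give the frame count and the leftover bytes. A plain marker search like this can cut a frame short if the bytes FF D9 happen to occur inside a JPEG header segment, so the frames should be checked in an image viewer before trusting the residue. If the residue is empty, the audio is not stored between frames in this way.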
Thanks for your response to this thread.
Could you possibly give me some direction of what tools and the exact steps I would need to take to perform the actions you suggest? Please see my specific questions below:
Steps you suggested:
1) Mux the raw stream.
I am not clear as to why you suggest this? Isn't the raw stream already multiplexed mjpeg video and lpcm audio?
What tool could I use to perform this function? Any specific parameters I will need to specify to perform this successfully? Will it be useful that I know at least some of the stream parameters which are: Video= mjpeg, 1920 x 1080, 15 fps, Audio: lpcm 16-bit, mono, sample rate 44.1 kHz?
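For what it is worth, those audio parameters are exactly what a WAV header needs, so once candidate audio bytes have been isolated they can be wrapped and auditioned in any player. The Python sketch below uses the standard `wave` module. The function name is made up for this example, and the defaults (16-bit, mono, 44.1 kHz) are the values believed above, not confirmed ones.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 44100,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw little-endian LPCM bytes in a WAV container.

    The defaults are the parameters assumed in this thread. Trailing
    bytes that do not make up a whole sample frame are dropped.
    """
    frame_size = channels * sample_width
    usable = len(pcm) - (len(pcm) % frame_size)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(sample_rate)
        w.writeframes(pcm[:usable])
    return buf.getvalue()
```

Writing the returned bytes to a .wav file gives something that can be opened directly. If it still sounds like white noise, the bytes are probably not pure LPCM, or they start at an odd byte offset; dropping the first byte and trying again is a cheap test for the latter.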
2) Demux to a new raw stream. It might just extract purely the video frames. If so, then use some form of hex file comparator/diff. The "difference" should be the audio.
This I logically totally understand, it makes sense that if I can separate out exactly just the video data, what would be left over would be the audio data. Can you give me some guidance on how I could accomplish this? What tools should I use and what steps to perform the following, a) Demux the stream and extract just the video data, and b) What tool could I use to do the hex comparison/difference you mention that would give me just the audio data left over?
3) But again, it would help much more to get deets from Logitech regarding the stream packetization order, because then you could just strip them yourself.
I have a support request in to Logitech, I will request to have them provide the "stream packetization order", which I assume is information specifying the way that the video and audio are multiplexed into the single stream that is in the .tmp file, is that correct? If I knew the packetization order, how exactly could that be used to "strip" out the audio data?
Again Scott, thank you so much for all of your help, I know I may be asking a bunch of newbie questions, I really appreciate you, your patience, and the time you have taken thus far to help me, you are awesome man!