Extracting audio losslessly with ffmpeg or mkvextract

21st Aug 2019 14:14 #1

Member

I'm a bit confused about audio extraction with tools like ffmpeg or mkvextract. I did some research but couldn't find much info about this specific topic.

Basically, what I am trying to do is extract the Opus audio from a WebM container (losslessly, of course, without re-encoding). I've tried multiple methods using the tools mentioned above and performed spectrum analysis on all of the files afterwards.
It seems that the output files have some data lost or changed during the process, which I don't think should be the case. I'm not sure if the results are 100% accurate, though multiple programs confirm it.

How can I be sure that the extraction is successful and that the data exactly matches the original?

Here is some info and images for side-by-side comparison:

Source file: audio.webm
Size: 2.94 MB
Spek: [Attachment 49874]
Audacity: [Attachment 49875]

Extracted file using ffmpeg: audio_extracted_ffmpeg.opus
Size: 2.90 MB
Spek: [Attachment 49876]
Audacity: [Attachment 49877]

Extracted file using mkvextract: audio_extracted_mkvextract.opus
Size: 2.91 MB
Spek: [Attachment 49879]
Audacity: [Attachment 49880]

Comparison between the source webm and the ffmpeg opus in Audacity:

[Attachment 49881]

[Attachment 49882]

This is the output of the ffmpeg extraction:
Code:
ffmpeg version 4.2 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 9.1.1 (GCC) 20190807
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, matroska,webm, from 'files/audio.webm':
  Metadata:
    encoder         : google/video-file
  Duration: 00:03:08.30, start: -0.007000, bitrate: 131 kb/s
    Stream #0:0(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
Output #0, opus, to 'files/ffmpeg/audio_extracted_ffmpeg.opus':
  Metadata:
    encoder         : Lavf58.29.100
    Stream #0:0(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
    Metadata:
      encoder         : Lavf58.29.100
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
size=    2975kB time=00:03:08.28 bitrate= 129.5kbits/s speed=6.57e+03x
video:0kB audio:2952kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.803061%
And the mkvextract output:
Code:
Extracting track 0 with the CodecID 'A_OPUS' to the file 'files/mkvextract/audio_extracted_mkvextract.opus'. Container format: Ogg (Opus in Ogg)
Progress: 100%

21st Aug 2019 14:34 #2
poisondeathray

Member

Join Date
Sep 2007

Location
Canada
Are you using the same decoder version for all of them?

Spek 0.8.2 was released in 2013, before the current Opus revisions.

Make sure the decoder you are using is newer, or compiled with newer ffmpeg support.

Or decode the files to WAV, then run those through Spek.
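
A minimal sketch of that decode step, assuming ffmpeg is on the PATH. The helper name and the file names are made up for illustration; the function only builds the command, and the subprocess line has to be uncommented to actually run it.

```python
import subprocess


def decode_to_wav_cmd(src, dst, codec="pcm_s16le"):
    """Build an ffmpeg command that decodes the first audio stream of src to WAV.

    pcm_s16le writes 16-bit samples; pcm_f32le would keep 32-bit float samples.
    """
    return ["ffmpeg", "-i", src, "-map", "0:a:0", "-c:a", codec, dst]


cmd = decode_to_wav_cmd("audio.webm", "audio_from_webm.wav")
print(" ".join(cmd))
# subprocess.run(cmd, check=True)  # uncomment to actually decode
```

Decoding every file with the same ffmpeg build takes the analyzer's own (possibly older) Opus decoder out of the comparison.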

21st Aug 2019 14:48 #3
Alexander24

Member

Join Date
Aug 2019
Converted the files into 16-bit WAVs; this is the output from Spek:

Source webm:

[Attachment 49883]

ffmpeg opus:

[Attachment 49884]

mkvextract opus:

[Attachment 49885]

There is a smaller difference between the source file and the one extracted via mkvextract, but a very noticeable difference between the ffmpeg one and the rest.

21st Aug 2019 15:00 #4
poisondeathray

Member

Not sure what is going on

How exactly were they converted? Different dithering algorithms can be applied when converting from fltp to 16-bit.
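
To illustrate why dithering alone can make two 16-bit conversions of the same float audio differ, here is a small sketch. It uses a generic TPDF dither written for this example (not the algorithm of Audacity or ffmpeg, which is an assumption-free zone here: neither tool's code is reproduced), and the test signal is synthetic.

```python
import random


def to_int16(samples, dither=False, seed=0):
    """Quantize floats in [-1.0, 1.0] to 16-bit integers, optionally with TPDF dither."""
    rng = random.Random(seed)
    out = []
    for x in samples:
        scaled = x * 32767.0
        if dither:
            # TPDF dither: the sum of two uniform values, spanning +/- 1 LSB.
            scaled += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        # Clamp to the valid 16-bit range after rounding.
        out.append(max(-32768, min(32767, round(scaled))))
    return out


signal = [0.1 * (i % 7 - 3) for i in range(1000)]  # synthetic test signal
plain = to_int16(signal)
dithered = to_int16(signal, dither=True)
changed = sum(1 for a, b in zip(plain, dithered) if a != b)
print(f"{changed} of {len(signal)} samples differ after dithering")
```

The differences are at most one or two LSB per sample, but they are enough to change a hash and to add a visible noise floor in a spectrogram.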

Different containers can have slightly different offsets as well. Compressed audio can have different delays, and there can be differences between, say, mkv (webm) and ogg, or something like mp4.

For example, you have a -0.007 start time in the webm container according to ffmpeg. What does mkvmerge think the start time is? Or MediaInfo?

If you extracted it without the offset (zero start time), the audio would be shifted slightly compared to the audio inside the container (webm converted to PCM WAV directly).
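
As a rough sense of scale, the reported start time can be converted to samples. This is only arithmetic on the numbers ffmpeg printed (-0.007 s, 48000 Hz); the function name is made up.

```python
def offset_in_samples(start_time_s, sample_rate_hz):
    """Convert a container start time in seconds to a whole number of samples."""
    return round(abs(start_time_s) * sample_rate_hz)


shift = offset_in_samples(-0.007, 48000)
print(f"A -0.007 s start time at 48 kHz is a shift of about {shift} samples")
```

A shift of a few hundred samples is invisible at normal zoom levels but is enough to make a sample-by-sample comparison fail.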

21st Aug 2019 15:17 #5
sneaker

Member

Join Date
Sep 2014
Yes, Opus can be tricky. I have seen a number of tickets for ffmpeg and mkvtoolnix about e.g. Opus' "discard padding". This is on top of the usual "delay at start" and "padding at the end of the stream" problems known from other lossy codecs.
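
For context, the "delay at start" for Opus is carried as a pre-skip value in the OpusHead identification header (RFC 7845), which is the first packet of an Ogg Opus stream and, in Matroska/WebM, the CodecPrivate data of the track. A minimal sketch of reading it follows; the header bytes are a hypothetical example built in the code, not taken from the file discussed here.

```python
import struct


def parse_opus_head(data):
    """Parse the fixed part of an OpusHead identification header (RFC 7845)."""
    if data[:8] != b"OpusHead":
        raise ValueError("not an OpusHead header")
    # Little-endian: version, channel count, pre-skip, input sample rate, output gain.
    version, channels, pre_skip, input_rate, gain = struct.unpack_from("<BBHIh", data, 8)
    return {
        "version": version,
        "channels": channels,
        "pre_skip": pre_skip,  # samples at 48 kHz to drop after decoding
        "input_sample_rate": input_rate,
        "output_gain": gain,
    }


# Hypothetical header: stereo, 312 samples of pre-skip, 48 kHz input, no gain, mapping family 0.
example = b"OpusHead" + struct.pack("<BBHIhB", 1, 2, 312, 48000, 0, 0)
head = parse_opus_head(example)
print(head["pre_skip"], "samples =", head["pre_skip"] / 48.0, "ms")
```

If two tools handle this value (or the end-of-stream padding) differently, their decoded output can differ in length and alignment even when the stored packets are identical.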

21st Aug 2019 15:34 #6

Alexander24

Member

Well, in that case, I guess it might be due to misalignment? I've never had trouble with ffmpeg concatenation or transcoding. Since mkvextract gives output closer to the original, ffmpeg's start offset would be the main issue. I don't know if that applies to all containers holding Opus audio, but is there a convenient way to fix the padding/offset issue if transcoding a batch of files becomes necessary?

This is the MediaInfo output for the source webm:
Code:
General
CompleteName                     : C:\Users\Alexander\Desktop\Scripts\files\audio.webm
Format/String                    : WebM
Format_Version                   : Version 4
FileSize/String                  : 2.95 MiB
Duration/String                  : 3 min 8 s
OverallBitRate/String            : 131 kb/s
Encoded_Application/String       : google/video-file
Encoded_Library/String           : google/video-file

Audio
ID/String                        : 1
Format/String                    : Opus
CodecID                          : A_OPUS
Duration/String                  : 3 min 8 s
Channel(s)/String                : 2 channels
ChannelLayout                    : L R
SamplingRate/String              : 48.0 kHz
BitDepth/String                  : 16 bits
Compression_Mode/String          : Lossy
Language/String                  : English
Default/String                   : Yes
Forced/String                    : No
And these are the commands used:
Code:
ffmpeg.exe -i "source.webm" -vn -acodec copy "output.opus"

mkvextract.exe "source.webm" tracks 0:"output.opus"
The WAV conversion was done in Audacity. I also tested 32-bit signed PCM WAV, and the results had less noise in the spectrogram, but there are still some differences between all of the files.

21st Aug 2019 15:42 #7
poisondeathray

Member

If a start offset is partially contributing to the difference, you can zoom way in to the start in Audacity, and you should be able to see the difference when comparing the webm loaded directly with the extracted versions.
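
The same check can be done numerically instead of by eye. Below is a brute-force sketch that estimates the shift between two decoded sample lists; the data is synthetic (random noise delayed by 5 samples), and loading real WAV samples into lists is left out.

```python
import random


def best_lag(reference, other, max_lag):
    """Return the shift (in samples) of `other` relative to `reference`
    that maximizes their mean sample-by-sample product."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        total, count = 0.0, 0
        for i, ref in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                total += ref * other[j]
                count += 1
        score = total / count if count else float("-inf")
        if score > best_score:
            best, best_score = lag, score
    return best


rng = random.Random(1)
reference = [rng.uniform(-1.0, 1.0) for _ in range(500)]
delayed = [0.0] * 5 + reference  # same audio, starting 5 samples later
print("estimated shift:", best_lag(reference, delayed, max_lag=20), "samples")
```

A positive result means the second signal starts later than the first. If the files still differ after compensating for the estimated shift, something other than a pure offset is involved.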

But there seem to be more differences than just a shift.

Maybe try the Opus decoder directly from libopus or opus-tools. It may be some ffmpeg implementation issue (Audacity is using ffmpeg to decode, isn't it?).

Not sure what to do or how to handle it. Maybe wait until all the tickets sneaker referred to get resolved.

21st Aug 2019 15:54 #8
Alexander24

Member

Just checked the verbose info of the source file. It is affected by the ffmpeg 1ms frame delay issue. https://pastebin.com/cCh2HmV2

The file was already a lossy re-encode made with ffmpeg in the first place, since it was downloaded from YouTube... I'll try libopus and hope that it works well for batch muxing. I'm open to other suggestions as well.

Last edited by Alexander24; 21st Aug 2019 at 19:06.

21st Aug 2019 16:03 #9
poisondeathray

Member

You can batch mux with ffmpeg or mkvmerge too... but there seems to be more going on here than just an offset. And what happens with other files where the offset is larger and required for sync? If you remove it, they will go out of sync.
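
A rough sketch of what a batch stream copy with ffmpeg could look like, reusing the -vn and copy options already shown in this thread. The function name, the file names and the output folder are hypothetical; the code only builds the commands and does not run them.

```python
from pathlib import Path


def batch_extract_cmds(sources, out_dir):
    """Build one stream-copy ffmpeg command per source file (no re-encoding)."""
    cmds = []
    for src in sources:
        # Keep the source file name, swap the extension, write into out_dir.
        dst = Path(out_dir) / (Path(src).stem + ".opus")
        cmds.append(["ffmpeg", "-i", str(src), "-vn", "-c:a", "copy", str(dst)])
    return cmds


for cmd in batch_extract_cmds(["audio.webm", "other.webm"], "extracted"):
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True) would execute it (needs `import subprocess`)
```

This does nothing about the offset question; it only repeats the same copy for every file.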

23rd Aug 2019 05:55 #10
Alexander24

Member

So, after extensive testing, I came up with somewhat more accurate results.

First, I converted the Opus audio stream from the Matroska container to multiple WAVs. Then, using ffmpeg's MD5 hashing function, I validated the audio stream of each WAV (track #0 is the only one in this case), as well as that of the original source file.
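
For reference, one way to get such hashes is ffmpeg's md5 muxer. The sketch below assumes ffmpeg is on the PATH; the helper name is hypothetical, and the exact command used in this thread is not shown, so treat it as an illustration only. According to the ffmpeg documentation, this muxer converts audio to signed 16-bit PCM by default before hashing, so hashes taken this way compare the 16-bit rendering of each file. With -c copy the stored packets are hashed instead of decoded audio, which may be a more direct check that an extraction left the data untouched.

```python
def audio_md5_cmd(path, copy_packets=False):
    """Build an ffmpeg command that prints an MD5 of the first audio stream.

    copy_packets=False hashes decoded audio (the md5 muxer converts it to
    signed 16-bit PCM by default); copy_packets=True hashes the stored
    packets without decoding them.
    """
    cmd = ["ffmpeg", "-v", "error", "-i", path, "-map", "0:a:0"]
    if copy_packets:
        cmd += ["-c", "copy"]
    return cmd + ["-f", "md5", "-"]


print(" ".join(audio_md5_cmd("audio.webm")))
print(" ".join(audio_md5_cmd("audio_extracted_ffmpeg.opus", copy_packets=True)))
```

Whether the packet hashes of a WebM source and an Ogg extraction actually match has not been verified here; it is something to test rather than rely on.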

It looks like the source, the 16-bit signed WAV, the 32-bit float WAV and the 64-bit float WAV all have the same hash (same audio stream data), which is excellent.

[Attachment 49896]

The 24-bit signed WAV and the 32-bit signed WAV share the same hash with each other, but don't match the rest.

[Attachment 49897]

So the only issue currently present could be the start offset of -0.007, which ffmpeg detects and uses for the remux process.
Is there any way to set this value to a flat 0.000000 on the original (source) file without re-encoding, or to ignore the -0.007 and simply use 0.000000 when remuxing?

[Attachment 49898]

I'm not quite sure how to deal with alignment, seeking, DTS/PTS timestamps and similar advanced stuff, if that becomes necessary.
