VideoHelp Forum
  1. I'm a bit confused about audio extraction with tools like ffmpeg or mkvextract. I did some research but couldn't find much information on this specific topic.

    What I am trying to do is extract the Opus audio from a WebM container losslessly, that is, without re-encoding. I tried several methods with the tools mentioned above and ran a spectrum analysis on all of the resulting files.
    The output files appear to have lost or changed some data in the process, which I don't think should happen. I'm not sure the results are 100% accurate, although multiple programs confirm them.

    How can I be sure that the extraction was successful and that the data exactly matches the original?

    Here is some info and images for side-by-side comparison:

    Source file: audio.webm
    Size: 2.94 MB
    Spek: [Attachment 49874]
    Audacity: [Attachment 49875]

    Extracted file using ffmpeg: audio_extracted_ffmpeg.opus
    Size: 2.90 MB
    Spek: [Attachment 49876]
    Audacity: [Attachment 49877]

    Extracted file using mkvextract: audio_extracted_mkvextract.opus
    Size: 2.91 MB
    Spek: [Attachment 49879]
    Audacity: [Attachment 49880]

    Comparison between the source webm and the ffmpeg opus in Audacity:
    [Attachment 49881]
    [Attachment 49882]

    This is the output of the ffmpeg extraction:
    Code:
    ffmpeg version 4.2 Copyright (c) 2000-2019 the FFmpeg developers
      built with gcc 9.1.1 (GCC) 20190807
      configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
      libavutil      56. 31.100 / 56. 31.100
      libavcodec     58. 54.100 / 58. 54.100
      libavformat    58. 29.100 / 58. 29.100
      libavdevice    58.  8.100 / 58.  8.100
      libavfilter     7. 57.100 /  7. 57.100
      libswscale      5.  5.100 /  5.  5.100
      libswresample   3.  5.100 /  3.  5.100
      libpostproc    55.  5.100 / 55.  5.100
    Input #0, matroska,webm, from 'files/audio.webm':
      Metadata:
        encoder         : google/video-file
      Duration: 00:03:08.30, start: -0.007000, bitrate: 131 kb/s
        Stream #0:0(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
    Output #0, opus, to 'files/ffmpeg/audio_extracted_ffmpeg.opus':
      Metadata:
        encoder         : Lavf58.29.100
        Stream #0:0(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
        Metadata:
          encoder         : Lavf58.29.100
    Stream mapping:
      Stream #0:0 -> #0:0 (copy)
    Press [q] to stop, [?] for help
    size=    2975kB time=00:03:08.28 bitrate= 129.5kbits/s speed=6.57e+03x
    video:0kB audio:2952kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.803061%
    And this is the mkvextract output:
    Code:
    Extracting track 0 with the CodecID 'A_OPUS' to the file 'files/mkvextract/audio_extracted_mkvextract.opus'. Container format: Ogg (Opus in Ogg)
    Progress: 100%
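
    One way to check a stream copy without relying on spectrograms is to hash the compressed packets themselves rather than the decoded audio. The sketch below is only an illustration: it assumes `ffmpeg` is on the PATH, relies on ffmpeg's `md5` muxer combined with `-c copy`, and the helper names `stream_md5_command`, `parse_md5` and `stream_md5` are made up for this example. If the source and the extracted file hold the same Opus packets, the two digests should be equal.

```python
import subprocess

def stream_md5_command(path, ffmpeg="ffmpeg"):
    """Build an ffmpeg command that hashes the first audio stream's packets."""
    return [
        ffmpeg, "-v", "error",
        "-i", path,
        "-map", "0:a:0",   # first audio stream only
        "-c", "copy",      # no decoding, hash the packets as stored
        "-f", "md5", "-",  # md5 muxer, result written to stdout
    ]

def parse_md5(output):
    """Extract the hex digest from a line such as 'MD5=0123abcd...'."""
    for line in output.splitlines():
        if line.startswith("MD5="):
            return line[4:].strip()
    raise ValueError("no MD5 line in ffmpeg output")

def stream_md5(path):
    """Run ffmpeg (assumed to be on PATH) and return the packet-level digest."""
    result = subprocess.run(stream_md5_command(path),
                            capture_output=True, text=True, check=True)
    return parse_md5(result.stdout)
```

    Comparing `stream_md5("audio.webm")` with `stream_md5("audio_extracted_ffmpeg.opus")` would then show whether the packets survived the copy. The digest covers packet data only, so a difference in container timestamps or start offset would not show up in it.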
  2. Are you using the same decoder version for all of them?

    Spek 0.8.2 was released in 2013, before the current Opus revisions.

    Make sure the decoder you are using is newer, or built against a newer ffmpeg.

    Or decode the files to WAV first and run those through Spek.
  3. I converted the files to 16-bit WAVs. This is the Spek output for each:

    Source webm: [Attachment 49883]
    ffmpeg opus: [Attachment 49884]
    mkvextract opus: [Attachment 49885]

    The difference between the source file and the one extracted with mkvextract is smaller, but the ffmpeg extraction differs very noticeably from both.
  4. Not sure what is going on.

    How exactly were they converted? Different dithering algorithms can be applied when converting from fltp to 16-bit.

    Different containers can have slightly different offsets as well. Compressed audio can have different delays, and these can differ between, say, mkv (webm) and ogg, or something like mp4.

    For example, ffmpeg reports a -0.007 start time for the webm container. What does mkvmerge think the start time is? Or MediaInfo?

    If you extracted it without the offset (zero start time), the audio would be shifted slightly compared to the audio inside the container (webm decoded directly to PCM WAV).
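
    To illustrate the dithering point: the sketch below quantizes the same float samples to 16-bit twice, once plainly and once with TPDF dither. It is a toy model, not what Audacity or ffmpeg actually do internally; the function name `to_int16` and the test ramp are invented for the example. It shows that two conversions of identical decoded audio can legitimately differ by one LSB per sample, which is enough to change hashes and add noise to a spectrogram.

```python
import random

def to_int16(samples, dither=False, seed=0):
    """Quantize floats in [-1.0, 1.0) to 16-bit ints, optionally with TPDF dither."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    out = []
    for x in samples:
        scaled = x * 32768.0
        if dither:
            # TPDF dither: difference of two uniform values, within +/-1 LSB
            scaled += rng.random() - rng.random()
        # round to nearest integer, then clip to the 16-bit range
        out.append(max(-32768, min(32767, round(scaled))))
    return out

samples = [0.1 * i / 1000 for i in range(1000)]  # slow ramp as stand-in audio
plain = to_int16(samples)
dithered = to_int16(samples, dither=True)
changed = sum(a != b for a, b in zip(plain, dithered))
print(f"{changed} of {len(samples)} samples differ after dithering")
```

    For a comparison between files, it therefore helps to turn dithering off in the exporter, or to export to 32-bit float so that no quantization step is involved.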
    Quote Quote  
  5. Yes, Opus can be tricky. I have seen a number of tickets for ffmpeg and mkvtoolnix about e.g. Opus' "discard padding". This is on top of the usual "delay at start" and "padding at the end of the stream" problems known from other lossy codecs.
  6. Well, in that case, I guess it might be due to misalignment? I've never had trouble with ffmpeg concatenation or transcoding. Since mkvextract gives closer output compared to the original, the start offset of ffmpeg would be the main issue. I don't know if that applies to all containers with opus codec, but is there any convenient way to fix the padding/offset issue if transcoding a batch of files becomes necessary?

    This is the MediaInfo output for the source webm:

    Code:
    General
    CompleteName                     : C:\Users\Alexander\Desktop\Scripts\files\audio.webm
    Format/String                    : WebM
    Format_Version                   : Version 4
    FileSize/String                  : 2.95 MiB
    Duration/String                  : 3 min 8 s
    OverallBitRate/String            : 131 kb/s
    Encoded_Application/String       : google/video-file
    Encoded_Library/String           : google/video-file
    
    Audio
    ID/String                        : 1
    Format/String                    : Opus
    CodecID                          : A_OPUS
    Duration/String                  : 3 min 8 s
    Channel(s)/String                : 2 channels
    ChannelLayout                    : L R
    SamplingRate/String              : 48.0 kHz
    BitDepth/String                  : 16 bits
    Compression_Mode/String          : Lossy
    Language/String                  : English
    Default/String                   : Yes
    Forced/String                    : No

    And those are the parameters used:

    Code:
    ffmpeg.exe -i "source.webm" -vn -acodec copy "output.opus"
    
    mkvextract.exe "source.webm" tracks 0:"output.opus"
    The WAV conversion was done in Audacity. I also tested 32-bit signed PCM WAV; those results had less noise in the spectrogram, but there are still some differences between all of the files.
  7. If a start offset is partly responsible for the difference, you can zoom far in on the start in Audacity; you should be able to see the shift when comparing the webm loaded directly into Audacity with the extracted versions.

    But there seem to be more differences than just a shift.

    Maybe try the Opus decoder directly from libopus or opus-tools. It could be an ffmpeg implementation issue (Audacity uses ffmpeg to decode, doesn't it?).

    Not sure what to do or how to handle it. Maybe wait until all the tickets sneaker referred to get resolved.
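
    Instead of zooming in by eye, the shift between two decoded versions can also be estimated numerically. The following is a minimal brute-force sketch, assuming both files have already been decoded to lists of samples at the same sample rate; the function name `best_shift` and its parameters are hypothetical. It tries every shift within a window and keeps the one with the smallest mean absolute difference.

```python
def best_shift(reference, other, max_shift):
    """Return (shift, error) where other[i + shift] best matches reference[i].

    A positive shift means `other` starts later than `reference`
    (it has extra samples at the front); a negative shift means the opposite.
    """
    best, best_err = 0, float("inf")
    for shift in range(-max_shift, max_shift + 1):
        total, count = 0.0, 0
        for i, x in enumerate(reference):
            j = i + shift
            if 0 <= j < len(other):
                total += abs(x - other[j])
                count += 1
        # keep the shift with the lowest mean absolute difference
        if count and total / count < best_err:
            best, best_err = shift, total / count
    return best, best_err
```

    If the remaining error at the best shift is still clearly above zero, the files differ by more than an offset. The search is quadratic, so it is only practical on a short excerpt, for example the first second of audio.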
  8. Just checked the verbose info of the source file. It is affected by the ffmpeg 1ms frame delay issue. https://pastebin.com/cCh2HmV2

    The file was already a lossy ffmpeg re-encode in the first place, because it was downloaded from YouTube... I'll try libopus and hope that it works well for batch muxing. I'm open to other suggestions as well.
    Last edited by Alexander24; 21st Aug 2019 at 19:06.
  9. You can batch mux with ffmpeg or mkvmerge too, but there seems to be more going on here than just an offset. And what happens with other files where the offset is larger and required for sync? If you remove it, they will go out of sync.
  10. So, after extensive testing I got somewhat more accurate results.

    First I converted the Opus audio stream from the Matroska container to several WAVs. Then I used ffmpeg's MD5 hashing function to check the audio stream of each WAV (track #0 is the only one in this case) as well as that of the original source file.

    The source, the 16-bit signed WAV, the 32-bit float WAV and the 64-bit float WAV all have the same hash (same audio stream data), which is excellent.

    [Attachment 49896]



    The 24-bit signed WAV and the 32-bit signed WAV share a hash with each other, but it doesn't match the rest.

    [Attachment 49897]
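
    A note on reading these hashes: according to the ffmpeg documentation, the `md5` output by default hashes audio after converting it to signed 16-bit PCM, so WAV files of different bit depths can end up with the same digest. To compare what is literally stored in two WAV files, the PCM payload can be hashed directly. The sketch below uses only Python's standard `wave` and `hashlib` modules; the function name `wav_pcm_md5` is made up, and the `wave` module handles plain integer PCM files only, so float WAVs would need a different reader.

```python
import hashlib
import wave

def wav_pcm_md5(path):
    """Return ((channels, bytes per sample, sample rate), MD5 of the PCM frames).

    Only the audio payload is hashed, so differences in the WAV header
    or in extra metadata chunks do not affect the digest.
    """
    with wave.open(path, "rb") as wav:
        params = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
        frames = wav.readframes(wav.getnframes())
    return params, hashlib.md5(frames).hexdigest()
```

    Two files only count as identical here if both the parameters and the digest match; a 16-bit and a 24-bit export of the same audio will always differ under this stricter test.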



    So the only remaining issue could be the start offset of -0.007, which ffmpeg detects and uses when remuxing.
    Is there any way to set this value to exactly 0.000000 on the original (source) file without re-encoding, or to ignore the -0.007 and simply use 0.000000 when remuxing?

    [Attachment 49898]
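
    For scale, the offset can be converted between seconds and samples. The sketch below is plain arithmetic with hypothetical helper names. The 312-sample figure is a pre-skip value commonly produced by Opus encoders at 48 kHz; that the -0.007 reported by ffmpeg is this 6.5 ms delay rounded to millisecond timestamps is an assumption here, not something confirmed in this thread.

```python
from fractions import Fraction

def seconds_to_samples(seconds, sample_rate=48000):
    """Convert a time offset in seconds to a whole number of samples."""
    # Fraction(str(...)) keeps decimal inputs such as 0.007 exact
    return round(Fraction(str(seconds)) * sample_rate)

def samples_to_ms(samples, sample_rate=48000):
    """Convert a sample count to milliseconds."""
    return float(Fraction(samples * 1000, sample_rate))

print(seconds_to_samples(-0.007))  # -336
print(samples_to_ms(312))          # 6.5
```

    So an offset of -0.007 s is 336 samples at 48 kHz, which is small but more than enough to make a sample-by-sample comparison fail if one file is decoded with the offset applied and the other without it.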



    I'm not quite sure how to deal with alignment, seeking, DTS/PTS timestamps and similar advanced stuff if necessary.