VideoHelp Forum
  1. hey guys,

    I'm having a hard time with all the ffmpeg ways of converting audio and also need some confirmation if it is really a proper downmix.

    I know that Mono 2.0 to Mono 1.0 / extract the left channel properly works like this:
    Code:
    -af "pan=mono|c0=FL" -ac 1
    and Mono 1.0 to Mono 2.0 like this:
    Code:
    -af "pan=stereo|FL=c0|FR=c0"
    Is there a command that creates a proper downmix of a stereo source, producing Mono 2.0 from both stereo channels (without applying a senseless dB reduction)?

    Also regarding 5.1 to 2.0 I once tried
    Code:
    -af "pan=stereo| FL < FL + 0.5*FC + 0.6*BL + 0.6*SL | FR < FR + 0.5*FC + 0.6*BR + 0.6*SR"
    but have a feeling that this is not a proper way to downmix it.

    thanks in advance for the input of the experts here
    cheers
  2. When you mix two channels you have to reduce the volume to prevent out-of-range values. L+R may overflow. (L+R)/2 will prevent that. If you know L+R won't overflow (your source has very low volume) you can manually specify a 2.0 (or whatever value you want) volume multiplier.

    Presumably you've seen this: https://ffmpeg.org/ffmpeg-filters.html#pan-1
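To put the overflow described above into numbers, here is a minimal Python sketch. It assumes 16-bit signed samples; the helper names (wrap_int16, clip_int16, mix_average) are made up for illustration and are not ffmpeg functions.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def wrap_int16(value):
    """Simulate two's-complement wrap-around of a 16-bit signed integer."""
    return (value + 32768) % 65536 - 32768


def clip_int16(value):
    """Hard-clip a value to the 16-bit signed range."""
    return max(INT16_MIN, min(INT16_MAX, value))


def mix_average(left, right):
    """(L+R)/2: the result can never leave the 16-bit range."""
    return (left + right) // 2


left, right = 30000, 30000        # two loud samples with the same sign
raw_sum = left + right            # 60000, outside the int16 range

print(raw_sum.bit_length() + 1)   # bits needed, including the sign bit
print(wrap_int16(raw_sum))        # what an unchecked 16-bit add would store
print(clip_int16(raw_sum))        # what a saturating add would store
print(mix_average(left, right))   # averaging stays within range
```

With quiet sources the plain sum stays in range, which is why a larger multiplier can be acceptable there.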
    Originally Posted by Gwar
    Is there a command that creates a proper downmix of a stereo source, producing Mono 2.0 from both stereo channels (without applying a senseless dB reduction)?
    There is no senseless reduction in dB. You have 2 channels, and if you do not reduce the signal power by half (i.e. by 3.0103 dB), the combined power of two unrelated channels will be twice as high, and the result will be perceived as louder, because the power from both channels adds up in the air.

    Code:
    "pan=stereo| FL < FL + FR | FR < FR + FL"
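    This mixes both channels into each output. Note that with '<', pan renormalizes the gains so that their total is 1 (here 0.5 per input), which avoids clipping. Write '=' instead of '<' if you want the channels summed at full level, in which case the combined level will be higher.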
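According to the ffmpeg pan documentation, replacing '=' with '<' renormalizes the gains of that output so that their total is 1. Below is a minimal Python sketch of that arithmetic only. The function names are hypothetical, and using absolute values for negative gains is an assumption on my part.

```python
def renormalize(gains):
    """Scale the gains of one output channel so their absolute values total 1."""
    total = sum(abs(g) for g in gains.values())
    if total == 0:
        return dict(gains)  # nothing to scale
    return {channel: g / total for channel, g in gains.items()}


def mix(gains, samples):
    """Weighted sum of the input samples for one output channel."""
    return sum(g * samples[channel] for channel, g in gains.items())


requested = {"FL": 1.0, "FR": 1.0}      # "FL < FL + FR" as written
effective = renormalize(requested)      # each input ends up at 0.5

samples = {"FL": 0.8, "FR": 0.8}        # same loud sample in both channels
print(effective)
print(mix(requested, samples))          # gains as with '=': beyond full scale (1.0)
print(mix(effective, samples))          # gains as with '<': same level as the input
```

For two near-identical channels the renormalized mix therefore comes out at the input level, while the unnormalized sum would exceed full scale.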

    Originally Posted by Gwar
    Also regarding 5.1 to 2.0 I once tried
    Code:
    -af "pan=stereo| FL < FL + 0.5*FC + 0.6*BL + 0.6*SL | FR < FR + 0.5*FC + 0.6*BR + 0.6*SR"
    but have a feeling that this is not a proper way to downmix it.
    Well, it depends on what your goal is (personally I like to emphasize the FC channel so that dialogue is favoured over the other channels). You can follow general downmixing rules, for example those specified by Dolby: https://professionalsupport.dolby.com/s/article/How-do-the-5-1-and-Stereo-downmix-sett...language=en_US

    or for example ITU https://www.itu.int/rec/R-REC-BS.775-4-202212-I/en

    or
    https://trac.ffmpeg.org/wiki/AudioChannelManipulation
    https://superuser.com/questions/852400/properly-downmix-5-1-to-stereo-using-ffmpeg
    https://superuser.com/questions/594741/how-to-use-ffmpeg-to-downmix-5-1-dts-hd-ma-or-d...eo-aac-with-do
    Last edited by pandy; 6th Apr 2023 at 14:25.
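To compare the coefficients of the pan formula quoted above with an ITU-R BS.775 style downmix (centre and surrounds at 1/sqrt(2), about 0.7071), it helps to express each gain in dB. A small Python sketch with hypothetical names; it looks only at the gains as written, before any renormalization by '<':

```python
import math


def gain_to_db(gain):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)


# Gains applied to the channels that are mixed into the left output.
thread_formula = {"FL": 1.0, "FC": 0.5, "BL/SL": 0.6}
itu_style = {"FL": 1.0, "FC": math.sqrt(0.5), "SL": math.sqrt(0.5)}

for name, gains in (("thread", thread_formula), ("ITU-style", itu_style)):
    for channel, gain in gains.items():
        print(f"{name:10s} {channel:6s} {gain:.4f} -> {gain_to_db(gain):6.2f} dB")
```

In this comparison the 0.5 centre gain sits at about -6 dB, roughly 3 dB below the 0.7071 used in the ITU-style mix, so a dialogue-friendly variant would raise FC rather than lower it.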
  4. Originally Posted by pandy
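    There is no senseless reduction in dB. You have 2 channels, and if you do not reduce the signal power by half (i.e. by 3.0103 dB), the combined power of two unrelated channels will be twice as high, and the result will be perceived as louder, because the power from both channels adds up in the air.
    Code:
    "pan=stereo| FL < FL + FR | FR < FR + FL"
    This mixes both channels into each output. Note that with '<', pan renormalizes the gains so that their total is 1 (here 0.5 per input), which avoids clipping. Write '=' instead of '<' if you want the channels summed at full level, in which case the combined level will be higher.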
    Thanks, this was exactly what I was looking for. It worked fine here and the loudness didn't change, so the output level was exactly the same as the input level.
    Most of the sources I am working with peak at -6 dB at most; many are between -15 and -9 dB.
    Maybe problems would only occur if you are working with those (thanks to the loudness war) totally overleveled music tracks.

    Originally Posted by pandy
    ...........Well, it depends on what your goal is (personally I like to emphasize the FC channel so that dialogue is favoured over the other channels). You can follow general downmixing rules, for example those specified by Dolby..............
    Thanks for the links, that was a very interesting read. I totally agree with you that dialogue that is too low in the mix, with music and effects being too loud, is annoying. I will try the Nightmode Dialogue formula and am excited to see how it turns out.
    cheers
    Originally Posted by Gwar
    Thanks, this was exactly what I was looking for. It worked fine here and the loudness didn't change, so the output level was exactly the same as the input level.
    Most of the sources I am working with peak at -6 dB at most; many are between -15 and -9 dB.
    Maybe problems would only occur if you are working with those (thanks to the loudness war) totally overleveled music tracks.
    I mean that if you do not reduce the signal level, your combined power will be twice as high. Level is something different from power. Imagine you have a signal with a particular level and a power amplifier: the amplifier amplifies the level and produces power at the speakers. If you do not reduce the level by 3 dB (half the power), two channels together will deliver twice the power.
    I know this is not so obvious, but it is simple physics, and that is why you should reduce the level by 3 dB to get the same power from two combined channels.
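The difference between level and power can be checked numerically. The sketch below (Python, names hypothetical) uses a sine and a cosine as two unrelated channels and compares the RMS level of their sum with that of a single channel. For comparison it also sums two identical channels, where the gain is about 6 dB rather than 3 dB.

```python
import math

N = 1000  # samples covering exactly one period


def rms(signal):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def db(ratio):
    """Amplitude ratio expressed in decibels."""
    return 20.0 * math.log10(ratio)


left = [math.sin(2 * math.pi * n / N) for n in range(N)]
right = [math.cos(2 * math.pi * n / N) for n in range(N)]  # unrelated to left

unrelated_sum = [l + r for l, r in zip(left, right)]
identical_sum = [l + l for l in left]

gain_unrelated = db(rms(unrelated_sum) / rms(left))  # twice the power
gain_identical = db(rms(identical_sum) / rms(left))  # twice the amplitude
print(round(gain_unrelated, 2), round(gain_identical, 2))
```

Real stereo material usually lies somewhere between these two extremes, so the level gained by summing depends on how similar the channels are.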

    Originally Posted by Gwar
    Thanks for the links, that was a very interesting read. I totally agree with you that dialogue that is too low in the mix, with music and effects being too loud, is annoying. I will try the Nightmode Dialogue formula and am excited to see how it turns out.
    cheers
    Personally I usually reduce (compress) the dynamics of the Center channel so that it is perceived as louder (this is an additional step not covered by the downmixing formulas).
    Originally Posted by pandy
    I mean that if you do not reduce the signal level, your combined power will be twice as high. Level is something different from power. Imagine you have a signal with a particular level and a power amplifier: the amplifier amplifies the level and produces power at the speakers. If you do not reduce the level by 3 dB (half the power), two channels together will deliver twice the power. I know this is not so obvious, but it is simple physics, and that is why you should reduce the level by 3 dB to get the same power from two combined channels.
    I am the worst when it comes to physics, which is why I don't understand that at all. I would love to understand it, though. So let's stick with Mono / Stereo as an example.

    -Mono to Stereo
    -acodec pcm_s16le -ar 48000 -af "pan=stereo|FL=c0|FR=c0" -f WAV %1%.wav

    -Stereo to Mono 1.0 downmix
    -acodec pcm_s16le -ar 48000 -ac 1 -f WAV %1%.wav

    -Stereo to Mono 2.0 downmix
    -acodec pcm_s16le -ar 48000 -af "pan=stereo| FL < FL + FR | FR < FR + FL" -f WAV %1%.wav

    To my ears the levels don't change: if the input file had a peak of -9 dB, then so does the output after the conversion. If a stereo file had a -9 dB peak, I want the mono to have a -9 dB peak too and not -12 dB, as that is too low in volume.
    So what is wrong with it? Do I need 2 additional commands, one to decrease the level before the conversion and one to increase it again before the encode?

    Thanks in advance for your further explanation
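Whether the peak survives a (L+R)/2 downmix depends on how similar the two channels are. A small Python sketch with made-up sample values; peak_dbfs and downmix_half are hypothetical helpers, and full scale is assumed to be 1.0:

```python
import math


def peak_dbfs(samples):
    """Peak level relative to full scale (1.0), in dBFS."""
    return 20.0 * math.log10(max(abs(s) for s in samples))


def downmix_half(left, right):
    """Mono downmix computed as (L+R)/2."""
    return [(l + r) / 2 for l, r in zip(left, right)]


peak = 10 ** (-9 / 20)                  # a sample peaking at -9 dBFS
left = [0.0, peak, -0.2, 0.1]

dual_mono = downmix_half(left, left)                # right identical to left
one_sided = downmix_half(left, [0.0] * len(left))   # right channel silent

print(round(peak_dbfs(left), 2))        # -9.0
print(round(peak_dbfs(dual_mono), 2))   # -9.0: peak unchanged
print(round(peak_dbfs(one_sided), 2))   # -15.02: peak drops by about 6 dB
```

So a source whose two channels are nearly the same keeps its peak after an averaged downmix, which would match the observation that the level did not change.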
  7. -----
    Last edited by Gwar; 29th Apr 2023 at 05:29. Reason: -
  8. Will try to address at least some aspects of this.

    First, let's assume two 16-bit samples with levels between -3 dBFS and 0 dBFS. This is a common case for signals normalized to, for example, 0 dBFS. If you add two such samples, the result can go beyond 16 bits (you need 17 bits), so quite commonly the level is reduced before adding to avoid this problem. I have to mention this because you named particular levels, so we need to explain why level conversion is sometimes necessary.

    Now, going directly to your cases:
    - First case, mono to stereo, is straightforward: the signal from channel 'c0' is copied unaltered into the new FL (Front Left) and FR (Front Right). My assumption is that the signal stays at the same level in FR and FL as in 'c0'.

    - Second case, downmixing: done this way, the signals are each reduced by half (about -6 dB) and then combined, so in theory no clipping should occur (but they may clip anyway because of so-called intersample peaks and intersample clipping).

    - Third case: the signals are combined, but because '<' is used the gains are renormalized so that their total is 1, to avoid hard clipping (they may clip anyway due to intersample peaks / intersample clipping).

    To avoid losing level you can do the channel conversion manually (i.e. not use the automatic conversion), but then it is up to you to prevent signal clipping.
  9. Thanks. Yea, I could do the whole process manually in Audition, but it is much faster to do it in ffmpeg. I didn't notice any distortion in any of my conversions, so I guess I am lucky and have no clipping. Most of the content I work with is -10 dB at maximum. I don't do this with any material leveled at 0 dB or higher.
  10. Originally Posted by Gwar
    Thanks. Yea, I could do the whole process manually in Audition, but it is much faster to do it in ffmpeg. I didn't notice any distortion in any of my conversions, so I guess I am lucky and have no clipping. Most of the content I work with is -10 dB at maximum. I don't do this with any material leveled at 0 dB or higher.
    By manual mode I mean explicitly telling ffmpeg how to deal with the audio channels. If you are sure that none of the channels goes over -10 dBFS (in the digital domain I am against using values like -10 dB, as they don't say anything; dBFS is the digital unit, and the maximum signal level in the digital domain is 0 dBFS, which holds for 1 to 32 bit audio), then you can safely add the two channels together (at some point you may even reduce quantization noise) and later duplicate them to L and R. If you are happy with no -3 dB reduction, that is fine with me.
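The headroom argument can be checked with a few lines of Python. This is only a sketch with hypothetical helper names: it takes the peak of each channel in dBFS and computes the highest peak the plain sum could possibly reach, which happens when both peaks coincide with the same sign.

```python
import math


def dbfs_to_linear(level_dbfs):
    """Convert a level in dBFS to a linear amplitude (full scale = 1.0)."""
    return 10 ** (level_dbfs / 20)


def worst_case_sum_dbfs(peak_a_dbfs, peak_b_dbfs):
    """Highest possible peak when two channels are added without attenuation."""
    total = dbfs_to_linear(peak_a_dbfs) + dbfs_to_linear(peak_b_dbfs)
    return 20 * math.log10(total)


def can_clip(peak_a_dbfs, peak_b_dbfs):
    """True if the unattenuated sum could exceed 0 dBFS."""
    return worst_case_sum_dbfs(peak_a_dbfs, peak_b_dbfs) > 0.0


print(round(worst_case_sum_dbfs(-10, -10), 2))  # -3.98: still below 0 dBFS
print(can_clip(-10, -10))                       # False
print(can_clip(-3, -3))                         # True
```

Two channels that both stay below about -6 dBFS cannot clip when summed at full level; anything hotter needs attenuation or a check of the actual material.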