Downmixing 6 channel AAC to 2 channel?

26th Oct 2017 20:50 #1
bizzybody

Member

Join Date
Apr 2007

Location
United States
I have a movie with 6 channel AAC: 2 front, 2 rear, center front, LFE. I want to downmix it to stereo but for some reason, on this particular one, Avidemux is proving incapable of doing it correctly. All I get out of it is a garbled mess, using the same settings that have successfully downmixed many other 5 and 6 channel audio tracks.

I've processed the video with NVEnc to get it down in size, now the audio needs to shrink. Going to keep the bigger version with 6 channel for this one, the shrinky-dink job is for devices of lesser capability.

I loaded it into WaveShop and Audacity and Audition 3.0. I can tell which are the center and LFE channels but none of those tell the user which channels are front, rear, left, right. They show as six mono streams instead of two stereo and two mono.

Is there software that I can load it into and just select to strip out the rear and LFE channels to convert it to 2.1? Or how about a program that labels the channels' positions in its GUI? Remember that little sub-megabyte Windows app for manipulating multi-channel DVD audio? It had a graphic showing each channel position and the user could select which ones to export and also do things like select separate files to merge into a multi-channel audio track. It had a ridiculously high price, then was discontinued.

27th Oct 2017 01:14 #2
LigH.de
Member

Join Date
Aug 2013

Location
Central Germany
MeGUI can process all audio streams it is able to load into AviSynth. It uses the same engine as BeHappy. You may prefer a Dolby ProLogic (DPL) or ProLogic II (DPL2) downmix over simple stereo, to maintain some of the surround information.

To encode to AAC again, MeGUI supports qaac (which requires the Apple encoder from QuickTime or iTunes to be either installed or extracted with makeportable.cmd), or alternatively the slightly outdated Nero AAC encoder.

Furthermore, if you can discover the right parameter set for the command line, ffmpeg should be able as well, and its current own AAC encoder is not too bad either.

I guess you refer to Sonic Foundry (now Sony) SoftEncode for AC3...

27th Oct 2017 02:45 #3
bizzybody
Member

Join Date
Apr 2007

Location
United States
Useful info there. I got Hybrid to do the downmix.

SoftEncode. Thanks for jogging my memory. Does that work on Windows 10 x64?

27th Oct 2017 02:53 #4
LigH.de
Member

Join Date
Aug 2013

Location
Central Germany
It may. But it only handles Dolby Digital AC3, not MPEG AAC.

27th Oct 2017 04:41 #5
pandy
Member

Join Date
Sep 2008
Recently ffmpeg received a 'headphone' filter (HRTF). I haven't tried it yet, but it looks promising as a binaural sound processor. If anyone intends to do downmixing, I suggest at least trying a newer ffmpeg.
https://www.ffmpeg.org/ffmpeg-filters.html#headphone

27th Oct 2017 21:37 #6
hello_hello
Member

Join Date
Mar 2012
Originally Posted by bizzybody

Is there software that I can load it into and just select to strip out the rear and LFE channels to convert it to 2.1? Or how about a program that labels the channels' positions in its GUI?

For the record, lossy formats sometimes use different channel orders, but the audio is always remapped to wave file channel order when it's decoded on a PC, and the encoder will remap it again when encoding if need be.

5.1ch audio should always load into Audacity in this order:
Front left, Front right, Front centre, LFE, Surround left, Surround right.
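To make that order easier to use when Audacity shows six unlabeled mono tracks, here is a minimal Python sketch that pairs track numbers with positions. The short labels (FL, SL and so on) and the function name `label_tracks` are illustrative choices, not anything Audacity itself displays, and it assumes the file really did decode to wave file order.

```python
# Wave file channel order for 5.1ch audio, as listed in the post above.
# The short labels are illustrative; SL/SR stand for "surround" left/right.
WAV_51_ORDER = ["FL", "FR", "FC", "LFE", "SL", "SR"]


def label_tracks(num_tracks):
    """Map 1-based track numbers (top to bottom) to channel labels."""
    if num_tracks != len(WAV_51_ORDER):
        raise ValueError("this sketch only covers 6-channel (5.1) audio")
    return {index + 1: name for index, name in enumerate(WAV_51_ORDER)}


for track, name in label_tracks(6).items():
    print(f"track {track}: {name}")
```

Under that assumption the third track from the top would be the centre channel and the fourth the LFE.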

I generally use foobar2000 for downmixing as the matrix mixer DSP lets you configure the downmix and QAAC has a peak normalisation function. Thinking about it, QAAC can downmix too.
You'll find third party DSPs for decoding AC3 and DTS on the foobar2000 site.

Nothing wrong with using ffmpeg of course. I use it to apply dynamic range compression for temporary copies I'll be viewing at night, and from there, pipe the ffmpeg output to QAAC.

28th Oct 2017 05:07 #7
pandy
Member

Join Date
Sep 2008
Originally Posted by hello_hello

5.1ch audio should always load into Audacity in this order:
Front left, Front right, Front centre, LFE, Surround left, Surround right.

There are 2 different 5.1

Code:

5.1        FL+FR+FC+LFE+BL+BR
5.1(side)  FL+FR+FC+LFE+SL+SR

Maybe not GUI but relatively easy to use:

Code:

@ffmpeg -y -i "%1" -vn -sn -c:a aac -b:a 192k -af "pan=stereo|FL < FL+1.414FC+0.5BL+0.5SL+0.25LFE+0.125BR|FR < FR+1.414FC+0.5BR+0.5SR+0.25LFE+0.125BL" -movflags faststart -f mp4 "%~n1.mp4"
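For anyone wondering what the `<` in the pan expression does: as far as I understand the ffmpeg filter documentation, it asks ffmpeg to renormalise the gains of that output channel so they total 1, which avoids clipping. The Python sketch below reproduces that arithmetic for the left output of the command above. Treating channels that are absent from the input layout as contributing nothing is an assumption on my part, and `normalise_gains` is a made-up helper name, not an ffmpeg function.

```python
# Gains for the left output, copied from the pan expression above.
FL_GAINS = {"FL": 1.0, "FC": 1.414, "BL": 0.5, "SL": 0.5, "LFE": 0.25, "BR": 0.125}

# Channels present in a "5.1" (back) input; a "5.1(side)" input would
# have SL/SR instead of BL/BR.
LAYOUT_51_BACK = {"FL", "FR", "FC", "LFE", "BL", "BR"}


def normalise_gains(gains, present_channels):
    """Scale the gains of the channels that exist so they sum to 1."""
    # Assumption: channels missing from the input layout are simply ignored.
    used = {ch: g for ch, g in gains.items() if ch in present_channels}
    total = sum(abs(g) for g in used.values())
    return {ch: g / total for ch, g in used.items()}


for channel, gain in normalise_gains(FL_GAINS, LAYOUT_51_BACK).items():
    print(f"{channel}: {gain:.3f}")
```

The point of the sketch is only that the numbers in the expression act as relative weights: after renormalising, the centre channel still carries 1.414 times the weight of the front left channel, but the sum can no longer exceed full scale.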
Last edited by pandy; 28th Oct 2017 at 11:09.
29th Oct 2017 04:36 #8
shans
Member

Join Date
Mar 2011

Location
India
Originally Posted by hello_hello

Originally Posted by bizzybody

Is there software that I can load it into and just select to strip out the rear and LFE channels to convert it to 2.1? Or how about a program that labels the channels' positions in its GUI?

5.1ch audio should always load into Audacity in this order:
Front left, Front right, Front centre, LFE, Surround left, Surround right.

Does that order hold good for all formats, viz. AC3, AAC etc.?
I have opened AC3 5.1 audio in Audacity and it opened as 6 mono tracks. I have no idea which track is which. Actually, I want to apply compression only to the "FC" channel, leaving the others untouched. I will mute all the other channels, keeping "FC" active. Is there any other method to distinguish the tracks individually?

But, MediaInfo shows differently and I am a bit confused as to which one to follow.

Attached Thumbnails

29th Oct 2017 04:48 #9
pandy
Member

Join Date
Sep 2008
Originally Posted by shans

Is that order holds good for all type of formats, viz. AC3, AAC etc.
I have opened AC3 5.1 audio in audacity and it opened as 6 mono. I have no idea about the details of individual tracks. Actually, I want to apply compression to only channel 'FC' leaving others untouched. I will mute all other channels keeping the "FC' active. Is there any other method to distinguish the tracks individually?

But, MediaInfo shows differently and I am a bit confused as to which one to follow.

AC3 and AAC may, and frequently do, use different channel mappings (matrices), so you need to verify this before transcoding. You always need to check which type of 5.1 matrix is used in your audio; that's why I recommend ffmpeg, where channel names can be used instead of channel sequence.
By multiplying a particular channel by 0 (zero) you effectively mute it. Unless there is some special reason, I would recommend NOT muting the channels completely. You can attenuate them significantly instead, for example by 40dB (divide the level by 100, or simply use 0.01 as the multiplier). If 40dB is still too loud, then 60dB attenuation (0.001) should be fine.
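The relation between a dB figure and the multiplier is the usual amplitude formula, multiplier = 10^(-dB/20). A small Python sketch of that conversion follows; the function names are just illustrative.

```python
import math


def attenuation_to_multiplier(attenuation_db):
    """Convert an attenuation in dB to a linear amplitude multiplier."""
    return 10 ** (-attenuation_db / 20)


def multiplier_to_attenuation(multiplier):
    """Convert a linear amplitude multiplier back to an attenuation in dB."""
    return -20 * math.log10(multiplier)


for db in (3, 6, 40, 60):
    print(f"{db} dB attenuation -> multiply by {attenuation_to_multiplier(db):.4f}")
```

The 40dB and 60dB cases come out as the 0.01 and 0.001 multipliers mentioned above.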

29th Oct 2017 09:36 #10
shans
Member

Join Date
Mar 2011

Location
India
Originally Posted by pandy

Originally Posted by shans

Is that order holds good for all type of formats, viz. AC3, AAC etc.
I have opened AC3 5.1 audio in audacity and it opened as 6 mono. I have no idea about the details of individual tracks. Actually, I want to apply compression to only channel 'FC' leaving others untouched. I will mute all other channels keeping the "FC' active. Is there any other method to distinguish the tracks individually?

But, MediaInfo shows differently and I am a bit confused as to which one to follow.

AC3 and AAC may and frequently they use different channel mapping (matrix) - you need to verify this before transcoding - You always need to check which type of 5.1 matrix is used in your audio - that's why i recommend ffmpeg where channel names instead channel sequence can be used.
By multiplying particular channel by '0' (zero) you efficiently mute channel. unless there is some special reason i would recommend to NOT mute all channels - you can significantly attenuate them for example by 40dB (divide level by 100 or place simply 0.01 as multiplier), if 40dB is still too loud then 60dB attenuation (0.001) should be fine.

I mute the channels one by one only to identify the "FC" channel, where I get the dialogue. I listen to each channel because I do not know any other method to distinguish it. After processing it, I will unmute all the channels, select them all and save as 5.1 audio. I am not conversant with programming the ffmpeg method. That's why I choose a GUI like Audacity, enabled with the FFmpeg plug-in, for all my audio processing.

29th Oct 2017 10:34 #11
pandy
Member

Join Date
Sep 2008
Originally Posted by shans

I mute the channels one by one only to identify the "FC" where I get dialoque. I will listen to that particular channel as I do not know any other method to distinguish it. After processing this, I will unmute all channels, select them all and save as 5.1 audio. I am not conversant with programming the ffmpeg method. That's why I choose a GUI like Audacity enabled with FFmpeg plug-in for all my audio processing.

Well, it is your time and work, not mine, and I would not call placing a few numbers before letters programming...

29th Oct 2017 10:53 #12
hello_hello
Member

Join Date
Mar 2012
Originally Posted by pandy

Originally Posted by hello_hello

5.1ch audio should always load into Audacity in this order:
Front left, Front right, Front centre, LFE, Surround left, Surround right.

There are 2 different 5.1

Code:

5.1        FL+FR+FC+LFE+BL+BR
5.1(side)  FL+FR+FC+LFE+SL+SR

Yeah but there's no back or side channels in 5.1ch audio, only surround channels. Every lossy encoder I know of will accept either BL+BR or SL+SR and encode them as "surround" left and right.

You may know the answer to this..... foobar2000 decodes most/all lossy 5.1ch formats with the surround as "back", except for dts which it decodes as "side". It's probably an ffmpeg thing, but I've often wondered if there's a reason why.
29th Oct 2017 11:20 #13
hello_hello
Member

Join Date
Mar 2012
Originally Posted by shans

But, MediaInfo shows differently and I am a bit confused as to which one to follow.

There's nothing to say there's not something wrong with your particular audio stream, it does happen, but assuming that's not the case.....
MediaInfo doesn't show the correct order in its GUI. Well..... "correct" in that it doesn't use the wave file channel order. Here's the way I remember it.... (there was a discussion about it at doom9 quite a while ago).

Because different formats use their own channel order internally and there's no such thing as "side" in 5.1ch audio anyway (just surround) it was decided to display the channels in an order that makes sense if you're standing in the middle of the room surrounded by speakers. So generally it'll show L C R Side LR LFE which makes sense because then for 7.1 channel you can use something like L C R Side LR Back LR LFE. It's really nothing to do with the actual encoding order.

You can, if you're a foobar2000 user, open AC3/AAC/DTS (AC3 and DTS require third party DSPs to decode but they're on the foobar2000 site) and it'll decode to wave file format and adjust its output meters (which I don't think are enabled by default) to show you what's going on.

For 5.1ch it generally decodes lossy audio to the back channels, which it labels "R" to aggravate me personally ("rear" I assume) but whether the surround channels are "R" or "S" (for side) they're both interchangeable as surround when encoding.

This is the matrix mixer DSP I use with foobar2000 displaying the channels in wave file order. You can see why "Back" was originally used for "surround". Then along came 7.1ch and made things worse because the extra surround speakers go behind the listener, so they need to be "back" or they're around the wrong way (and yes there's more than one type of 7.1ch, but I'm referring to the only one we actually use).

Hence the "back"/"side" ambiguity that still survives today for 5.1ch, but that's the order it's decoded to on a PC, and lossy encoders should accept either back or side as "surround".
And if I still haven't convinced you, here's how QAAC handles the channel mapping when you feed it wave files.
https://github.com/nu774/qaac/wiki/Multichannel--handling
The third column shows the AAC channel mappings. MediaInfo won't show them to you that way. Whatever the AAC mapping, the first column effectively also shows the order in which it'll be decoded on a PC. At least in theory, but 7.1ch AAC encoding/decoding is pretty messy. Personally I think surround sound sucks anyway......

The channel order you're seeing with MediaInfo happens to be the channel order in which AC3 is encoded.
http://avisynth.nl/index.php/GetChannel
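To illustrate the remapping a decoder does, here is a minimal Python sketch that reorders one frame of samples from the AC3 encoding order (the order MediaInfo displays) to wave file order. The channel labels and the helper name `reorder_frame` are illustrative, and the sample values are dummies; it is not how any particular decoder is implemented.

```python
# Order in which AC3 stores 5.1ch audio (what MediaInfo shows).
AC3_ORDER = ["FL", "FC", "FR", "SL", "SR", "LFE"]
# Order the audio is decoded to on a PC (wave file order).
WAV_ORDER = ["FL", "FR", "FC", "LFE", "SL", "SR"]


def reorder_frame(frame, source_order, target_order):
    """Return the samples of one frame rearranged into target_order."""
    by_name = dict(zip(source_order, frame))
    return [by_name[name] for name in target_order]


# Dummy frame: sample value n stands for "nth channel in AC3 order".
ac3_frame = [1, 2, 3, 4, 5, 6]
print(reorder_frame(ac3_frame, AC3_ORDER, WAV_ORDER))
```

So the second channel in the MediaInfo listing (centre) ends up as the third track once decoded, and the LFE moves from last place to fourth.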

Last edited by hello_hello; 30th Oct 2017 at 00:05.

29th Oct 2017 11:23 #14
pandy
Member

Join Date
Sep 2008
Originally Posted by hello_hello

You may know the answer to this..... foobar2000 decodes most/all lossy 5.1ch formats with the surround as "back", except for dts which it decodes as "side". It's probably an ffmpeg thing, but I've often wondered if there's a reason why.

Modern audio codecs (AAC) may use different channel matrices (more than 20 channels are supported), while older audio codecs (AC3, DTS) usually use a single type of matrix (as they only support up to 5.1). IMHO ffmpeg must be flexible about this and recognise all of them. I mentioned this because AAC may use both channel matrices.

30th Oct 2017 00:43 #15
shans
Member

Join Date
Mar 2011

Location
India
Originally Posted by hello_hello

Originally Posted by shans

But, MediaInfo shows differently and I am a bit confused as to which one to follow.

You can, if you're a foobar2000 user, open AC3/AAC/DTS (AC3 and DTS require third party DSPs to decode but they're on the foobar2000 site) and it'll decode to wave file format and adjust it's output meters (which I don't think are enabled by default) to show you what's going on.

The channel order you're seeing with MediaInfo happens to be the channel order in which AC3 is encoded.
http://avisynth.nl/index.php/GetChannel

Thank you hello-hello for your time and detailed explanation. I got it. So far, I have been using foobar2000 only to listen to music and now I learn that it does conversion too. I need to install the required plug-ins and learn the process from scratch. I will try.

30th Oct 2017 03:19 #16
hello_hello
Member

Join Date
Mar 2012
No worries. Foobar2000 comes with presets for various encoders (even some that you have to download separately) but if you need help configuring it with customised command line options, someone will be able to help. There's an encoder pack on the foobar2000 site that contains the encoders they can legally distribute.

The only downside to using foobar2000 for encoding is that, aside from changing the relative volumes or downmixing, it's generally not practical for editing individual streams in multichannel audio as you can with Audacity, such as compressing a single channel, but chances are ffmpeg can do it if someone can supply the correct command line. I use ffmpeg for encoding with foobar2000 quite a bit, but it usually takes me a while to work out how to get it to do clever things as I don't use its filtering much, and what I do use I have saved as presets.

Last edited by hello_hello; 30th Oct 2017 at 04:07.

2nd Nov 2017 09:59 #17
shans
Member

Join Date
Mar 2011

Location
India
Originally Posted by hello_hello

No worries. Foobar2000 comes with presets for various encoders (even some that you have to download separately) but if you need help configuring it with customised command line options, someone will be able to help. There's an encoder pack on the foobar2000 site that contains the encoders they can legally distribute.

The only downside to using foobar2000 for encoding, is aside from changing the relative volumes or downmixing, is it's generally not practical for editing individual streams in multichannel audio as you can with Audacity, such as compressing a single channel, but chances are ffmpeg can do it if someone can supply the correct command line. I use ffmpeg for encoding with foobar2000 quite a bit but usually it takes me a while to work out how to get it to do clever things as I don't use it's filtering much, and what I do use I have saved as presets.

I have downloaded the encoder pack from the foobar site and also followed your instructions line by line as given in another thread whose link is given below:
https://forum.videohelp.com/threads/370412-Looking-for-an-AAC-LC-encoder-from-Frauenho...er#post2376348. I processed a single "FC" channel of a multi-channel audio file in Audacity, saved the whole thing as a WAV file, opened it in foobar2000 and saved it as 5.1 AAC using encoders such as Apple, Nero, fdk_aac and FhG. I could not find any distinct difference between them, anyway. I observed that the volume drops by around 5 to 6 dB when I play it in foobar2000 compared to Audacity. I do not know whether foobar2000 is set up by default to reduce the volume. I am not conversant with the command line. Thanks again for your able guidance.

8th Nov 2017 12:01 #19
hello_hello
Member

Join Date
Mar 2012
On my XP PC, Audacity doesn't adjust its own volume, it adjusts the volume of the sound card. Therefore when I reduce the volume in Audacity, it turns foobar2000 down too (foobar2000's volume fader doesn't change though). I'm not sure if it's possible to configure Audacity to adjust its own volume and leave the soundcard alone, but if you simply convert from one format to another, the volume shouldn't change.

I'd recommend using QAAC. It has a "no delay" option to prevent it adding extra silence to the beginning (all lossy encoders do it). It doesn't matter when encoding music (players should know to skip the padding anyway) and it's probably better not to use the no delay option for that, but when muxing soundtrack audio it can affect the audio sync a little (depending on the muxing program used).

8th Nov 2017 12:26 #20
hello_hello
Member

Join Date
Mar 2012
PS. For fhgaacenc.exe, use nsutil.dll from the encoder pack and not from the Winamp installer. It's modified to accept a 32bit float input and fix a bug, or something along those lines.

And I think these days the iTunes installer contains a newer version of Apple's CoreAudioToolbox than the QuickTime installer, so if you used the latter to get the files it might pay to replace them.

9th Nov 2017 07:53 #21
shans
Member

Join Date
Mar 2011

Location
India
Originally Posted by shans

Originally Posted by hello_hello

No worries. Foobar2000 comes with presets for various encoders

I observed that the volume gets down by around 5 to 6 dB when I played it in foobar compared to audacity. I do not know if there is a set up by default to reduce the volume in foobar.

I think I have not made this quite clear. I wanted to say that when I played the same song in Audacity and foobar2000, I observed that the volume indicators / VU meters in those programs show different levels in dB. It can be seen in the attached screenshots. It is not an issue but, out of curiosity, I would like to know what the reason could be.

Can you please tell me the procedure to down mix a 5.1 aac audio to stereo in foobar, right from the installation of plug in?

Attached Thumbnails

9th Nov 2017 09:13 #22
shans
Member

Join Date
Mar 2011

Location
India
Originally Posted by hello_hello

And I think these days the iTunes installer contains a newer version of Apple's CoreAudioToolbox than the QuickTime installer, so if you used the latter to get the files it might pay to replace them.

I put the iTunes installer alongside makeportable, but it failed to extract the QTfiles. However, I could extract them with the QuickTime installer.

10th Nov 2017 10:50 #23
hello_hello
Member

Join Date
Mar 2012
foobar2000 has a ReplayGain playback option. Your file probably doesn't have ReplayGain, but check in Preferences/Playback to make sure it's not enabled. The pre-amp can be configured to automatically reduce the volume by a specific amount when there's no ReplayGain info saved to the file. That's the only explanation I can think of for a volume reduction, unless you have a DSP in the playback chain that's reducing the volume.

The DSPs often come with an installer. If not you can unzip and copy them to your user configuration folder. For Windows XP it's:
C:\Documents and Settings\Your User Name\Application Data\foobar2000\user-components.

That should help you find it for other versions of Windows. I think you can put the DSPs in the foobar2000 installation folder and they'll work, but I don't think foobar2000's updater will check for updates if you put them there.

If you have to do it manually, for the matrix mixer DSP create a folder called "foo_dsp_mm" as a sub-folder in user-components and put "foo_dsp_mm.dll" inside it.

Foobar2000 has its own DSP for downmixing multichannel audio to stereo, but I prefer to use the matrix mixer DSP as it's configurable, and automatically reduces the volume to prevent clipping if you check the "normalise" option. http://skipyrich.com/w/index.php/Foobar2000:Matrix_Mixer

Add the DSP to the processing chain when creating a conversion preset.

I have it configured like this, but it's personal preference. I don't include the LFE channel. If you want it, set it to 1 in both the FL and FR channels. A channel with the volume set to 0.707 is reduced by 3dB relative to the others. If you don't want that, set it to 1 (-3dB is standard for the centre channel as it's being split between two channels; for the surround channels it's more personal preference). You should check the "normalise" option to prevent clipping.
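As a rough illustration of what a matrix downmix with a "normalise" option amounts to, here is a small Python sketch. The coefficients are my guess at the configuration described (the screenshot isn't available here), and the way the scale factor is worked out, dividing by the largest possible sum for an output channel, is an assumption about "worst case" normalising rather than the matrix mixer DSP's actual code.

```python
# Hypothetical coefficients: FL/FR at 1, centre and surrounds at 0.707 (-3dB),
# LFE left out. These are assumptions, not values read from the DSP.
DOWNMIX = {
    "L": {"FL": 1.0, "FC": 0.707, "SL": 0.707, "LFE": 0.0},
    "R": {"FR": 1.0, "FC": 0.707, "SR": 0.707, "LFE": 0.0},
}


def worst_case_scale(matrix):
    """Scale factor that keeps the output at or below full scale even if
    every input channel peaks at the same time (assumed normalise rule)."""
    loudest = max(sum(abs(g) for g in row.values()) for row in matrix.values())
    return 1.0 / loudest if loudest > 1.0 else 1.0


def downmix_sample(sample, matrix, scale):
    """Mix one multichannel sample (dict of channel -> value) to stereo."""
    return {
        out: scale * sum(gain * sample.get(ch, 0.0) for ch, gain in row.items())
        for out, row in matrix.items()
    }


scale = worst_case_scale(DOWNMIX)
full_scale = {ch: 1.0 for ch in ("FL", "FR", "FC", "LFE", "SL", "SR")}
print(round(scale, 3), downmix_sample(full_scale, DOWNMIX, scale))
```

With these numbers the scale factor works out to roughly 0.414, which is a drop of about 7.7dB, so the result will usually sound quiet until the overall level is brought back up afterwards.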

That'll generally leave the volume a bit low as it lowers the volume to "worst case scenario" when combining channels to prevent clipping. There's several ways to adjust the overall volume. Adjusting so the peaks are at maximum is one way, and the simplest way to do that is to encode with QAAC and get it to peak normalise. The -N option tells it to do just that.
(Tip: Select QAAC in the encoder configuration, select your desired encoder settings, switch the encoder to "custom" at the top, foobar2000 will fill in the command line and you can edit it from there).

When you've added a DSP and configured the encoder, save it all as a conversion preset. Recheck everything to make sure it's okay. Then you can load a bunch of 5.1ch audio files into a playlist (or just one), highlight them all, right click, select your QAAC downmix preset, downmix and convert.

Or for fun....
Another option is to not use a DSP and downmix with ffmpeg while converting instead. I just borrowed pandy's command line from earlier (because I'm a bit crap with ffmpeg) and modified it to work with foobar2000 (running on XP). This downmixes with ffmpeg the way pandy suggested (which I'll confess I don't fully understand yet), then sends the downmixed audio to QAAC for peak normalising and converting to AAC. You'd have to modify it for the file paths on your PC, but the whole command line looks like this (there's a space at the beginning):

/d /c c:\progra~1\foobar2000\encoders\ffmpeg.exe -y -i - -c:a pcm_f32le -af "pan=stereo|FL < FL+1.414FC+0.5BL+0.5SL+0.25LFE+0.125BR|FR < FR+1.414FC+0.5BR+0.5SR+0.25LFE+0.125BL" -f wav - | c:\progra~1\foobar2000\encoders\QAAC\qaac.exe -N --ignorelength -s --no-optimize --no-delay -V 91 -o %d -

Last edited by hello_hello; 11th Nov 2017 at 18:34.

11th Nov 2017 07:39 #24
shans
Member

Join Date
Mar 2011

Location
India
Originally Posted by hello_hello

foobar2000 has a ReplayGain playback option. Your file probably doesn't have ReplayGain, but check in Preferences/Playback to make sure it's not enabled.

Thank you so much, Sir. Yes, ReplayGain was enabled and I removed it. Now, I get the actual volume indicated.

I have installed the matrix mixer DSP and I could downmix 5.1ch audio to stereo too. I have learned a lot from your explanations and I keep learning. What should I do to encode AC3 in foobar2000? I have the decoder and can open AC3 files.

In qaac encoder setting parameters, I found --no-optimize, --no-delay commands. I have no idea about these. What do they do?

After googling, I am of the opinion that VBR encoding is preferred to CBR. I would like to get some suggestions from you in this regard. Which one is better in respect of stereo, multi-channel encoding, music and movie track etc., Thanks again.

11th Nov 2017 11:54 #25
hello_hello
Member

Join Date
Mar 2012
Originally Posted by shans

Thank you so much, Sir. Yes, ReplayGain was enabled and I removed it. Now, I get the actual volume indicated.

No problem, I didn't think the pre-amp was enabled by default which is why I didn't think about it originally. At least that's solved.

Originally Posted by shans

I have installed the matrix mixer DSP and I could down mix 5.1 ch audio to stereo too. I have learned a lot from your explanations and still keep learning. What should I do to encode AC3 in foobar. I have decoder and could open ac3 files.

Use ffmpeg. You can use the Aften AC3 encoder but it was merged into ffmpeg and it's no longer maintained. If you put ffmpeg.exe in the foobar2000/encoders folder, f2k will automatically check for it there. See the attached screenshot.

Originally Posted by shans

In qaac encoder setting parameters, I found --no-optimize, --no-delay commands. I have no idea about these. What do they do?

When some encoders write an audio file, they optimise it afterwards, shuffling stuff around, putting any tags first, that sort of thing. Foobar2000 does its own optimising, so --no-optimize tells QAAC not to bother.
F2k doesn't add --no-delay. Did you confuse that with something else?

Originally Posted by shans

After googling, I am of the opinion that VBR encoding is preferred to CBR. I would like to get some suggestions from you in this regard. Which one is better in respect of stereo, multi-channel encoding, music and movie track etc., Thanks again.

Yep, I only ever use VBR for AAC. Stereo or 5.1ch, I use the same setting for everything. Generally you'd use CBR for AC3 though, because VBR AC3 is a non-standard thing and players will probably reject it. 192kbps for stereo and 384kbps or 448kbps for 5.1ch AC3.
AAC encoders have a VBR quality setting. You select the quality and the encoder will use whatever variable bitrate is required to achieve it. The more channels, the higher the bitrate. It's much like x264's CRF encoding.

I started out using Nero's default of q0.50 for AAC, so for QAAC I use V91 which results in a similar bitrate.
QAAC also has a Constrained VBR mode and average bitrate mode. They're both also variable, but True VBR is better.
https://github.com/nu774/qaac/wiki/Command-Line-Options
https://github.com/nu774/qaac/wiki/Encoder-configuration

If you like I can explain how to use ReplayGain to encode all your files at the same volume. Sometimes it's preferable to peak normalising. You don't need to enable ReplayGain on playback. Just use it when encoding.

For ffmpeg 192kbps AC3.
-i - -ignore_length true -c:a ac3 -b:a 192k %d

Attached Thumbnails

Last edited by hello_hello; 11th Nov 2017 at 18:40.

11th Nov 2017 20:20 #26
shans
Member

Join Date
Mar 2011

Location
India
Originally Posted by hello_hello

Originally Posted by shans

In qaac encoder setting parameters, I found --no-optimize, --no-delay commands. I have no idea about these. What do they do?

When some encoders write an audio file, they optimise it afterwards, shuffling stuff around, putting any tags first, that sort of thing. Foobar2000 does it's own optimising, so --no-optimize tells QAAC not to bother.
F2k doesn't add --no-delay. Did you confuse that with something else?

Thanks a lot. If I understood this properly, I need to use this --no-delay option only during encoding movie sound track to avoid A/V sync issues. Am I right, Sir?

Originally Posted by hello_hello

If you like I can explain how to use ReplayGain to encode all your files at the same volume. Sometimes it's preferable to peak normalising. You don't need to enable ReplayGain on playback. Just use it when encoding.

Sure. I love to get the detailed explanation on this subject too.

11th Nov 2017 23:30 #27
hello_hello
Member

Join Date
Mar 2012
Originally Posted by shans

Thanks a lot. If I understood this properly, I need to use this --no-delay option only during encoding movie sound track to avoid A/V sync issues. Am I right, Sir?

It depends on the muxing program. If you mux with MKVToolNix it'll account for the AAC audio delay (only AAC). For instance if you encode with Nero (because I can roughly remember the numbers) it adds padding of a little over 50ms. MKVToolNix will remove it, but because lossy audio is stored in frames it has to remove a little more than 50ms, then it applies an audio delay to compensate. None of that's a bad thing because soundtrack audio is virtually always silent at the beginning, but for Nero after muxing you usually end up with a 9ms audio delay. With QAAC's --no-delay option, none of that needs to happen.
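The numbers above can be checked with a little arithmetic. The sketch below assumes AAC-LC frames of 1024 samples, a 48kHz sample rate and an encoder padding of 2624 samples. The last two are illustrative values picked because they give "a little over 50ms"; they are not figures confirmed in this thread, and the function is not how MKVToolNix is actually written.

```python
import math

FRAME_SAMPLES = 1024  # samples per AAC-LC frame


def muxer_compensation(encoder_delay_samples, sample_rate):
    """Work out how many whole frames cover the encoder padding and the
    delay left over once those frames are dropped."""
    frames_dropped = math.ceil(encoder_delay_samples / FRAME_SAMPLES)
    removed_samples = frames_dropped * FRAME_SAMPLES
    residual_samples = removed_samples - encoder_delay_samples

    def to_ms(samples):
        return 1000.0 * samples / sample_rate

    return (frames_dropped, to_ms(encoder_delay_samples),
            to_ms(removed_samples), to_ms(residual_samples))


# Assumed values: 2624 samples of padding at 48kHz.
frames, padding_ms, removed_ms, residual_ms = muxer_compensation(2624, 48000)
print(f"padding {padding_ms:.1f}ms, drop {frames} frames ({removed_ms:.1f}ms), "
      f"delay to apply {residual_ms:.1f}ms")
```

Under those assumptions three frames have to go, which removes slightly more than the padding, and the difference is the small delay the muxer then has to add back.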

Originally Posted by shans

Sure. I love to get the detailed explanation on this subject too.

MP3 & AAC can have their volumes losslessly adjusted. It's limited to steps of 1.5dB but that's still fairly accurate. The downside is foobar2000's option for setting the volume when losslessly adjusting is buried deep in its preferences. It's an audio player after all, and the ReplayGain volume is supposed to be fixed at 89 (there's a long explanation as to what 89 means, and it's a little silly anyway, and it's not required for this story).
The original idea behind ReplayGain was to scan the files to determine the volume, save the info to tags and let the player adjust the volume on playback. Hardware support is fairly non-existent though, so the workaround is to adjust the volume of the audio to the same level so the player doesn't need to do it.

Right click and use the "ReplayGain/Scan per file track gain" option. When it's done save the ReplayGain info. Right click again and select "ReplayGain/Apply track gain to content" and that's it. It only works for MP3 and AAC. That should give each track the same average volume according to your ears rather than the highest peak volume or an RMS volume etc. For standard music (CD tracks) stick with a volume of 89. For soundtrack audio, change it to 83 as that's the European standard for soundtrack audio and provides more headroom for greater dynamics. See the first attached screenshot.

For more accuracy or other audio types, it's a 2 stage process if you downmix, but it doesn't require an intermediate file if you're not downmixing.
You'd downmix and convert to something lossless like a wave file, scan the wave file, then use the ReplayGain option in the Converter/Processing section to apply the volume when converting. For music tracks you'd normally leave the preamp on 0dB. For soundtrack audio you'd set it to -6dB to give you a volume of 83. See the second screenshot.

In both cases, you can load the adjusted/converted files and check their volumes by scanning again to confirm there are no peaks above 1.000 (I don't fuss till they exceed about 1.1).

The second method can also be used for peak normalising. See the third screenshot. It increases the volume by 20dB, which would normally cause clipping, but if there's ReplayGain info saved to the file, f2k will limit the volume increase to prevent that and you end up with a peak normalised output, as long as the "prevent clipping" option is selected. For movie audio I generally just peak normalise, but for audio from a bunch of episodes of a TV series etc, it's nice to have them all the same volume. For CD tracks, if you put a bunch of them on an MP3 player and run it in random mode like I do, it's absolutely mental not to adjust them all to the same volume of 89 with ReplayGain first (unless your player supports adjusting the volume using the info saved to tags).
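
The clip-limited gain idea can be sketched like this in Python. It's a simplified model of the behaviour described, not foobar2000's actual code, and the function name and the 0.5 peak value are assumptions for illustration: ask for a big gain, and the saved peak value caps it so the new peak lands at 1.0.

```python
import math


def clip_limited_gain_db(requested_gain_db, peak):
    """Cap the requested gain so a track with this linear peak reaches at most 1.0."""
    if peak <= 0:
        raise ValueError("peak must be a positive linear value")
    headroom_db = -20 * math.log10(peak)  # gain that brings the peak exactly to 1.0
    return min(requested_gain_db, headroom_db)


# Hypothetical track whose scanned peak is 0.5, with +20 dB requested.
gain_db = clip_limited_gain_db(20.0, 0.5)
new_peak = 0.5 * 10 ** (gain_db / 20)
print(f"applied {gain_db:.2f} dB, new peak {new_peak:.3f}")
```

If the requested gain is smaller than the headroom, it's applied unchanged, which is why a deliberately oversized value such as 20dB is what turns this into peak normalising.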

The ReplayGain option in the first screenshot only affects losslessly adjusting MP3 or AAC.
The ReplayGain option in the second/third screenshots can be saved as part of a conversion preset, so you could save one for converting to a volume of 89, another to 83, and one more for peak normalising.

Attached Thumbnails

Last edited by hello_hello; 12th Nov 2017 at 00:01.

12th Nov 2017 04:05 #28
shans

Member

Join Date
Mar 2011

Location
India
Originally Posted by hello_hello

Originally Posted by shans

Thanks a lot. If I understood this properly, I need to use this --no-delay option only when encoding a movie soundtrack, to avoid A/V sync issues. Am I right, Sir?

It depends on the muxing program. If you mux with MKVToolNix it'll account for the AAC audio delay (only AAC). For instance if you encode with Nero (because I can roughly remember the numbers) it adds padding of a little over 50ms. MKVToolNix will remove it, but because lossy audio is stored in frames it has to remove a little more than 50ms, then it applies an audio delay to compensate. None of that's a bad thing because soundtrack audio is virtually always silent at the beginning, but for Nero after muxing you usually end up with a 9ms audio delay. With QAAC's --no-delay option, none of that needs to happen.

I always use MKVToolNix for muxing and for extracting tracks as well. What about MP3 tracks? Is there any way to find out the audio delay and where it is actually stored? What encoding parameters do you recommend for MP3 tracks and music files? I have all my music files encoded as FLAC with mode-5 and as MP3 CBR 320 kbps, so as not to sacrifice quality, though that might be due to ignorance.

Originally Posted by hello_hello

MP3 & AAC can have their volumes losslessly adjusted.
The original idea behind ReplayGain was to scan the files to determine the volume, save the info to tags and let the player adjust the volume on playback. Hardware support is fairly non-existent though, so the workaround is to adjust the volume of the audio to the same level so the player doesn't need to do it.

So, this means that if I save the ReplayGain information and play the file in any player, the volume will get adjusted to the same level of 89. Is that right?

Originally Posted by hello_hello

For soundtrack audio, change it to 83 as that's the European standard for soundtrack audio and provides more headroom for greater dynamics. See the first attached screenshot.

The second method can also be used for peak normalising. See the third screenshot. It increases the volume by 20dB, which would normally cause clipping, but if there's ReplayGain info saved to the file, f2k will limit the volume increase to prevent that and you end up with a peak normalised output, as long as the "prevent clipping" option is selected. For movie audio I generally just peak normalise

For soundtrack audio, it should be 83 as per the European standard. But if we peak normalise, the volume will naturally increase by 20dB. That seems contradictory. To put it another way, I am not quite clear on this aspect. Would you please throw some more light on it?

12th Nov 2017 05:26 #29
pandy

Member

Join Date
Sep 2008
Originally Posted by shans

For soundtrack audio, it should be 83 as per the European standard. But if we peak normalise, the volume will naturally increase by 20dB. That seems contradictory. To put it another way, I am not quite clear on this aspect. Would you please throw some more light on it?

First, there is no such European standard (or please provide it: 83 or whatever number). Secondly, normalization to peak can be used, but -3.0103dBFS should be the maximum allowed level. (Loudness level is a tricky concept, especially for audiophiles.)
Normalizing to the peak value allows the finite resolution of the DAC to be fully used. If the signal level is reduced too much, DAC resolution is wasted and overall sound quality is lower. The compromise is a peak level between -3.0103dBFS and -6.0206dBFS (so losing between half a bit and one bit of overall system resolution).
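
The "half a bit to one bit" figure follows from each bit of resolution being worth 20·log10(2) ≈ 6.0206dB. A minimal Python sketch of that conversion (the function names are illustrative only):

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.0206 dB per bit of resolution


def bits_lost(peak_dbfs):
    """Bits of resolution left unused when the peak sits at peak_dbfs (<= 0)."""
    return -peak_dbfs / DB_PER_BIT


def linear_peak(peak_dbfs):
    """Linear amplitude (1.0 = full scale) for a peak level in dBFS."""
    return 10 ** (peak_dbfs / 20)


for level in (-3.0103, -6.0206):
    print(f"{level} dBFS: amplitude {linear_peak(level):.4f}, {bits_lost(level):.2f} bits unused")
```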

12th Nov 2017 08:14 #30
hello_hello

Member

Join Date
Mar 2012
Originally Posted by pandy

First, there is no such European standard (or please provide it: 83 or whatever number)

ReplayGain's target volume of 89dB is the equivalent of EBU R128's -18LUFS. Reducing the ReplayGain target volume to 83dB gives you the equivalent of -23LUFS. It mightn't be exact... it could be 82dB now I think about it, but it wouldn't be more than 1dB off.

For the record, foobar2000 doesn't use the original ReplayGain scanner any more. It uses an EBU R128 scanner because it's more accurate, but it has to keep writing tags referring to the old ReplayGain target volume and using ReplayGain-speak for backwards compatibility. Personally, I think music players should switch to EBU R128 because it's more intuitive, but it probably won't happen anytime soon.

EBU - Operating Eurovision and Euroradio.
-23 LUFS
Basically EBU R 128 recommends to normalize audio at -23 LUFS ±0.5 LU (±1 LU for live programmes), measured with a relative gate at -10 LU. The metering approach can be used with virtually all material. To make sure meters from different manufacturers provide the same reading, EBU Tech 3341 specifies the 'EBU Mode', which includes a Momentary (400 ms), Short term (3s) and Integrated (from start to stop) meter. Many vendors support 'EBU Mode' in their products.

You can't normalise to -3dB using foobar2000 itself. Well you can, but the process becomes far less automatic (Edit: Wrong! See post #32). You can if you're encoding with QAAC and use it to peak normalise though. For f2k and QAAC, the command line for -3dB peaks would be something like:
-N --gain -3dB --ignorelength -s -V 91 -o %d -

I tested the above command line on a CD track and then ran a true peak scan on the encoded version. The peak was -2.72dB, which is typical AAC variation. Without the -3dB gain reduction the peak was +0.25dB.
Realistically though, there ain't going to be a bunch of peaks at that same level, so there's probably a single peak in the audio which might be clipped a tiny little bit. For soundtrack audio especially, that'll be where there's gunshots or explosions etc, so you're never going to hear it anyway. That's why I've never fussed about peak normalising to -3dB, but QAAC can do it.

shans,
you can think of LUFS (loudness units relative to full scale, I think) as being the same thing as dB.
So effectively -18LUFS is the same as -18dB is the same as ReplayGain's 89dB
and
-23LUFS is the same as -23dB is the same as the equivalent of 83dB in ReplayGain-speak. Or maybe it's 82dB. I'd have to check, but I think it's 83dB.
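
For anyone who wants to check the numbers, here's a minimal Python sketch of the conversion, taking the equivalence above at face value (89dB in ReplayGain-speak equals -18LUFS, so the two scales differ by a fixed offset). The function names are made up for illustration. Under that assumption -23LUFS works out to 84dB and 83dB works out to -24LUFS, so it's within the 1dB margin mentioned earlier either way.

```python
RG_REFERENCE_DB = 89.0   # ReplayGain target volume
LUFS_REFERENCE = -18.0   # assumed equivalent loudness in LUFS


def lufs_to_rg(lufs):
    """Convert a loudness target in LUFS to the ReplayGain-style dB figure."""
    return RG_REFERENCE_DB + (lufs - LUFS_REFERENCE)


def rg_to_lufs(rg_db):
    """Convert a ReplayGain-style dB target to LUFS."""
    return LUFS_REFERENCE + (rg_db - RG_REFERENCE_DB)


print(lufs_to_rg(-23.0))  # EBU R128 target expressed in ReplayGain-speak
print(rg_to_lufs(83.0))   # the 83 dB target (89 dB with a -6 dB preamp) in LUFS
```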

Last edited by hello_hello; 12th Nov 2017 at 20:02.
