I have a movie with 6 channel AAC, 2 front, 2 rear, center front, LFE. I want to downmix it to stereo but for some reason on this particular one, AVIDemux is proving incapable of doing it correctly. All I get out of it is a garbled mess, using the same settings that have successfully downmixed many other 5 and 6 channel audio.
I've processed the video with NVEnc to get it down in size, now the audio needs to shrink. Going to keep the bigger version with 6 channel for this one, the shrinky-dink job is for devices of lesser capability.
I loaded it into WaveShop and Audacity and Audition 3.0. I can tell which are the center and LFE channels but none of those tell the user which channels are front, rear, left, right. They show as six mono streams instead of two stereo and two mono.
Is there software that I can load it into and just select to strip out the rear and LFE channels to convert it to 2.1? Or how about a program that labels the channels' positions in its GUI? Remember that little sub-megabyte Windows app for manipulating multi-channel DVD audio? It had a graphic showing each channel position and the user could select which ones to export and also do things like select separate files to merge into a multi-channel audio track. It had a ridiculously high price, then was discontinued.
MeGUI can process all audio streams it is able to load into AviSynth. It uses the same engine as BeHappy. You may prefer a Dolby ProLogic (DPL) or ProLogic II (DPL2) downmix over simple stereo, to maintain some of the surround information.
To encode to AAC again, MeGUI supports qaac (which requires the Apple encoder from QuickTime or iTunes to be either installed or extracted with makeportable.cmd), or alternatively the slightly outdated Nero AAC encoder.
Furthermore, if you can discover the right parameter set for the command line, ffmpeg should be able as well, and its current own AAC encoder is not too bad either.
I guess you refer to Sonic Foundry (now Sony) SoftEncode for AC3...
It may. But it only handles Dolby Digital AC3, not MPEG AAC.
Lastly, ffmpeg recently gained a 'headphone' filter (HRTF). I haven't tried it yet, but it looks promising as a binaural sound processor. If anyone intends to do downmixing, I suggest at least trying a newer ffmpeg.
5.1ch audio should always load into Audacity in this order:
Front left, Front right, Front centre, LFE, Surround left, Surround right.
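For anyone staring at six unlabeled mono tracks, the order above can be turned into a small lookup. This is only a sketch: the short labels (FL, FR, FC, LFE, SL, SR) are my own shorthand for the names listed above, and it assumes the file really did load in that order.

```python
# Track order Audacity is said to use for 5.1 audio, top track first.
# The short labels are shorthand chosen for this sketch.
WAVE_ORDER_5_1 = ["FL", "FR", "FC", "LFE", "SL", "SR"]


def track_for(channel):
    """Return the 1-based track number for a channel label."""
    return WAVE_ORDER_5_1.index(channel) + 1


def label_tracks():
    """Return (track number, label) pairs, top track first."""
    return [(number + 1, name) for number, name in enumerate(WAVE_ORDER_5_1)]


if __name__ == "__main__":
    for number, name in label_tracks():
        print(f"track {number}: {name}")
```

So if you only want to process the centre channel, `track_for("FC")` points at the third track from the top.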
I generally use foobar2000 for downmixing as the matrix mixer DSP lets you configure the downmix and QAAC has a peak normalisation function. Thinking about it, QAAC can downmix too.
You'll find third party DSPs for decoding AC3 and DTS on the foobar2000 site.
Nothing wrong with using ffmpeg of course. I use it to apply dynamic range compression for temporary copies I'll be viewing at night, and from there, pipe the ffmpeg output to QAAC.
5.1: FL+FR+FC+LFE+BL+BR
5.1(side): FL+FR+FC+LFE+SL+SR
@ffmpeg -y -i "%1" -vn -sn -c:a aac -b:a 192k -af "pan=stereo|FL < FL+1.414FC+0.5BL+0.5SL+0.25LFE+0.125BR|FR < FR+1.414FC+0.5BR+0.5SR+0.25LFE+0.125BL" -movflags faststart -f mp4 "%~n1.mp4"
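A rough Python sketch of what the "<" in the pan filter does to the left-channel gains in that command: as far as I know it renormalises the gains so they total 1, which avoids clipping. The gain values are copied from the command. Whether ffmpeg also counts channels that are absent from the input (a file has either BL/BR or SL/SR, not both) is not checked here, so treat the numbers as illustrative.

```python
# Gains feeding the left output channel, copied from the pan filter above.
LEFT_GAINS = {"FL": 1.0, "FC": 1.414, "BL": 0.5, "SL": 0.5, "LFE": 0.25, "BR": 0.125}


def renormalise(gains):
    """Scale gains so they add up to 1 (all gains here are positive)."""
    total = sum(gains.values())
    return {channel: gain / total for channel, gain in gains.items()}


if __name__ == "__main__":
    for channel, gain in renormalise(LEFT_GAINS).items():
        print(f"{channel}: {gain:.3f}")
```

The relative balance between channels is unchanged; only the overall level drops.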
Last edited by pandy; 28th Oct 2017 at 12:09.
I have opened AC3 5.1 audio in Audacity and it opened as 6 mono tracks. I have no idea which track is which. Actually, I want to apply compression to only the 'FC' channel, leaving the others untouched. I will mute all other channels, keeping 'FC' active. Is there any other method to distinguish the tracks individually?
But, MediaInfo shows differently and I am a bit confused as to which one to follow.
With ffmpeg, channel names can be used instead of the channel sequence.
By multiplying a particular channel by '0' (zero) you effectively mute that channel. Unless there is some special reason, I would recommend NOT muting the channels completely. You can attenuate them significantly instead, for example by 40dB (divide the level by 100, or simply use 0.01 as the multiplier). If 40dB is still too loud, then 60dB attenuation (0.001) should be fine.
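The dB figures and multipliers above follow the usual amplitude rule, multiplier = 10^(-dB/20). A minimal sketch of that arithmetic (the function name is made up for illustration):

```python
def attenuation_to_multiplier(attenuation_db):
    """Convert an attenuation in dB to a linear amplitude multiplier."""
    return 10 ** (-attenuation_db / 20)


if __name__ == "__main__":
    for db in (40, 60):
        print(f"{db}dB attenuation -> multiply by {attenuation_to_multiplier(db):g}")
```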
5.1: FL+FR+FC+LFE+BL+BR
5.1(side): FL+FR+FC+LFE+SL+SR
You may know the answer to this..... foobar2000 decodes most/all lossy 5.1ch formats with the surround as "back", except for dts which it decodes as "side". It's probably an ffmpeg thing, but I've often wondered if there's a reason why.
MediaInfo doesn't show the correct order in its GUI. Well..... "correct" in that it doesn't use the wave file channel order. Here's the way I remember it.... (there was a discussion about it at doom9 quite a while ago).
Because different formats use their own channel order internally and there's no such thing as "side" in 5.1ch audio anyway (just surround) it was decided to display the channels in an order that makes sense if you're standing in the middle of the room surrounded by speakers. So generally it'll show L C R Side LR LFE which makes sense because then for 7.1 channel you can use something like L C R Side LR Back LR LFE. It's really nothing to do with the actual encoding order.
You can, if you're a foobar2000 user, open AC3/AAC/DTS (AC3 and DTS require third party DSPs to decode but they're on the foobar2000 site) and it'll decode to wave file format and adjust its output meters (which I don't think are enabled by default) to show you what's going on.
For 5.1ch it generally decodes lossy audio to the back channels, which it labels "R" to aggravate me personally ("rear" I assume) but whether the surround channels are "R" or "S" (for side) they're both interchangeable as surround when encoding.
This is the matrix mixer DSP I use with foobar2000 displaying the channels in wave file order. You can see why "Back" was originally used for "surround". Then along came 7.1ch and made things worse because the extra surround speakers go behind the listener, so they need to be "back" or they're around the wrong way (and yes there's more than one type of 7.1ch, but I'm referring to the only one we actually use).
Hence the "back"/"side" ambiguity that still survives today for 5.1ch, but that's the order it's decoded to on a PC, and lossy encoders should accept either back or side as "surround".
And if I still haven't convinced you, here's how QAAC handles the channel mapping when you feed it wave files.
The third column shows the AAC channel mappings. MediaInfo won't show them to you that way. Whatever the AAC mapping, the first column effectively also shows the order in which it'll be decoded on a PC. At least in theory, but 7.1ch AAC encoding/decoding is pretty messy. Personally I think surround sound sucks anyway......
The channel order you're seeing with MediaInfo happens to be the channel order in which AC3 is encoded.
Last edited by hello_hello; 30th Oct 2017 at 01:05.
No worries. Foobar2000 comes with presets for various encoders (even some that you have to download separately) but if you need help configuring it with customised command line options, someone will be able to help. There's an encoder pack on the foobar2000 site that contains the encoders they can legally distribute.
The only downside to using foobar2000 for encoding is that, aside from changing the relative volumes or downmixing, it's generally not practical for editing individual streams in multichannel audio as you can with Audacity, such as compressing a single channel, but chances are ffmpeg can do it if someone can supply the correct command line. I use ffmpeg for encoding with foobar2000 quite a bit but usually it takes me a while to work out how to get it to do clever things as I don't use its filtering much, and what I do use I have saved as presets.
Last edited by hello_hello; 30th Oct 2017 at 05:07.
https://forum.videohelp.com/threads/370412-Looking-for-an-AAC-LC-encoder-from-Frauenho...er#post2376348. I processed a single "FC" channel of a multi-channel audio file in audacity, saved the whole as a WAV file, opened it in foobar and saved as a 5.1 AAC using encoders like Apple, nero, fdk_aac, and FhG. I could not find any distinct difference of these, anyway. I observed that the volume gets down by around 5 to 6 dB when I played it in foobar compared to audacity. I do not know if there is a set up by default to reduce the volume in foobar. I am not conversant with command line. Thanks again for your able guidance.
On my XP PC, Audacity doesn't adjust its own volume, it adjusts the volume of the sound card. Therefore when I reduce the volume in Audacity, it turns foobar2000 down too (foobar2000's volume fader doesn't change though). I'm not sure if it's possible to configure Audacity to adjust its own volume and leave the soundcard alone, but if you simply convert from one format to another, the volume shouldn't change.
I'd recommend using QAAC. It has a "no delay" option to prevent it adding extra silence to the beginning (all lossy encoders do it). It doesn't matter when encoding music (players should know to skip the padding anyway) and it's probably better not to use the no delay option for that, but when muxing soundtrack audio it can affect the audio sync a little (depending on the muxing program used).
PS. For fhgaacenc.exe, use nsutil.dll from the encoder pack and not from the Winamp installer. It's modified to accept a 32bit float input and fix a bug, or something along those lines.
And I think these days the iTunes installer contains a newer version of Apple's CoreAudioToolbox than the QuickTime installer, so if you used the latter to get the files it might pay to replace them.
Can you please tell me the procedure to down mix a 5.1 aac audio to stereo in foobar, right from the installation of plug in?
foobar2000 has a ReplayGain playback option. Your file probably doesn't have ReplayGain, but check in Preferences/Playback to make sure it's not enabled. The pre-amp can be configured to automatically reduce the volume by a specific amount when there's no ReplayGain info saved to the file. That's the only explanation I can think of for a volume reduction, unless you have a DSP in the playback chain that's reducing the volume.
The DSPs often come with an installer. If not you can unzip and copy them to your user configuration folder. For Windows XP it's:
C:\Documents and Settings\Your User Name\Application Data\foobar2000\user-components.
That should help you find it for other versions of Windows. I think you can put the DSPs in the foobar2000 installation folder and they'll work, but I don't think foobar2000's updater will check for updates if you put them there.
If you have to do it manually, for the matrix mixer DSP create a folder called "foo_dsp_mm" as a sub-folder in user-components and put "foo_dsp_mm.dll" inside it.
Foobar2000 has its own DSP for downmixing multichannel audio to stereo, but I prefer to use the matrix mixer DSP as it's configurable, and automatically reduces the volume to prevent clipping if you check the "normalise" option. http://skipyrich.com/w/index.php/Foobar2000:Matrix_Mixer
Add the DSP to the processing chain when creating a conversion preset.
I have it configured like this, but it's personal preference. I don't include the LFE channel. If you want it, set it to 1 in both the FL and FR channels. The channels with the volume set to 0.707 mean they're reduced by 3dB relative to the others. If you don't want that, set them to 1 (-3dB is standard for the centre channel as it's being split to two channels. For the surround channels, it's more personal preference). You should check the "normalise" option to prevent clipping.
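To make the numbers concrete, here is a rough Python sketch of a downmix matrix along those lines. The exact layout is an assumption (the screenshot isn't reproduced here): fronts at 1, centre and surrounds at 0.707, LFE left out. The "normalise" step is modelled as a worst-case scale, which is only one plausible reading of what the DSP does, not its actual code.

```python
# input channel -> (gain into left output, gain into right output)
# Assumed layout: fronts at 1, centre and surrounds at 0.707, no LFE.
MATRIX = {
    "FL": (1.0, 0.0),
    "FR": (0.0, 1.0),
    "FC": (0.707, 0.707),
    "LFE": (0.0, 0.0),
    "SL": (0.707, 0.0),
    "SR": (0.0, 0.707),
}


def worst_case_scale(matrix):
    """Scale factor that keeps the output within full scale even if every
    input channel peaks at full scale at the same moment."""
    left = sum(abs(gains[0]) for gains in matrix.values())
    right = sum(abs(gains[1]) for gains in matrix.values())
    return 1.0 / max(left, right, 1.0)


def downmix(frame, matrix, scale=1.0):
    """Mix one frame (dict of channel -> sample) down to (left, right)."""
    left = sum(frame[ch] * matrix[ch][0] for ch in matrix) * scale
    right = sum(frame[ch] * matrix[ch][1] for ch in matrix) * scale
    return left, right


if __name__ == "__main__":
    full_scale = {channel: 1.0 for channel in MATRIX}
    scale = worst_case_scale(MATRIX)
    print(scale, downmix(full_scale, MATRIX, scale))
```

To include the LFE as described, the "LFE" row would become (1.0, 1.0), which makes the worst-case scale smaller still.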
That'll generally leave the volume a bit low as it lowers the volume to "worst case scenario" when combining channels to prevent clipping. There are several ways to adjust the overall volume. Adjusting so the peaks are at maximum is one way, and the simplest way to do that is to encode with QAAC and get it to peak normalise. The -N option tells it to do just that.
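Peak normalising itself is simple arithmetic: find the loudest sample and scale everything so it lands on full scale. A minimal sketch of the idea follows. It is not QAAC's implementation, and the sample values are made up.

```python
def peak_normalise(samples, target_peak=1.0):
    """Scale samples so the largest absolute value equals target_peak."""
    peak = max(abs(sample) for sample in samples)
    if peak == 0:
        return list(samples)  # pure silence, nothing to scale
    gain = target_peak / peak
    return [sample * gain for sample in samples]


if __name__ == "__main__":
    quiet = [0.1, -0.4, 0.25, 0.05]
    print(peak_normalise(quiet))
```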
(Tip: Select QAAC in the encoder configuration, select your desired encoder settings, switch the encoder to "custom" at the top, foobar2000 will fill in the command line and you can edit it from there).
When you've added a DSP and configured the encoder, save it all as a conversion preset. Recheck everything to make sure it's okay. Then you can load a bunch of 5.1ch audio files into a playlist (or just one), highlight them all, right click, select your QAAC downmix preset, downmix and convert.
Or for fun....
Another option is to not use a DSP and downmix with ffmpeg while converting instead. I just borrowed pandy's command line from earlier (because I'm a bit crap with ffmpeg) and modified it to work with foobar2000 (running on XP). This downmixes with ffmpeg the way pandy suggested (which I'll confess I don't fully understand yet), then sends the downmixed audio to QAAC for peak normalising and converting to AAC. You'd have to modify it for the file paths on your PC, but the whole command line looks like this (there's a space at the beginning):
/d /c c:\progra~1\foobar2000\encoders\ffmpeg.exe -y -i - -c:a pcm_f32le -af "pan=stereo|FL < FL+1.414FC+0.5BL+0.5SL+0.25LFE+0.125BR|FR < FR+1.414FC+0.5BR+0.5SR+0.25LFE+0.125BL" -f wav - | c:\progra~1\foobar2000\encoders\QAAC\qaac.exe -N --ignorelength -s --no-optimize --no-delay -V 91 -o %d -
Last edited by hello_hello; 11th Nov 2017 at 19:34.
I have installed the matrix mixer DSP and I could downmix 5.1ch audio to stereo too. I have learned a lot from your explanations and am still learning. What should I do to encode AC3 in foobar? I have the decoder and can open AC3 files.
In qaac encoder setting parameters, I found --no-optimize, --no-delay commands. I have no idea about these. What do they do?
After googling, I am of the opinion that VBR encoding is preferred to CBR. I would like to get some suggestions from you in this regard. Which one is better for stereo, multi-channel encoding, music, movie tracks, etc.? Thanks again.
Use ffmpeg. You can use the Aften AC3 encoder but it was merged into ffmpeg and it's no longer maintained. If you put ffmpeg.exe in the foobar2000/encoders folder, f2k will automatically check for it there. See the attached screenshot.
Foobar2000 does its own optimising, so --no-optimize tells QAAC not to bother.
F2k doesn't add --no-delay. Did you confuse that with something else?
AAC encoders have a VBR quality setting. You select the quality and the encoder will use whatever variable bitrate is required to achieve it. The more channels, the higher the bitrate. It's much like x264's CRF encoding.
I started out using Nero's default of q0.50 for AAC, so for QAAC I use V91 which results in a similar bitrate.
QAAC also has a Constrained VBR mode and average bitrate mode. They're both also variable, but True VBR is better.
If you like I can explain how to use ReplayGain to encode all your files at the same volume. Sometimes it's preferable to peak normalising. You don't need to enable ReplayGain on playback. Just use it when encoding.
For ffmpeg 192kbps AC3.
-ignore_length true -i - -c:a ac3 -b:a 192k %d
Last edited by hello_hello; 11th Nov 2017 at 19:40.
If you mux with MKVToolNix, it'll account for the AAC audio delay (only AAC). For instance if you encode with Nero (because I can roughly remember the numbers) it adds padding of a little over 50ms. MKVToolNix will remove it, but because lossy audio is stored in frames it has to remove a little more than 50ms, then it applies an audio delay to compensate. None of that's a bad thing because soundtrack audio is virtually always silent at the beginning, but for Nero after muxing you usually end up with a 9ms audio delay. With QAAC's --no-delay option, none of that needs to happen.
foobar2000's option for setting the volume when losslessly adjusting is buried deep in its preferences. It's an audio player after all and the ReplayGain volume is supposed to be fixed at 89 (there's a long explanation as to what 89 means, and it's a bit silly anyway, and it's not required for this story).
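The frame arithmetic can be sketched in a few lines of Python. The specific numbers are assumptions picked to match the rough figures above: 1024 samples per AAC frame, a 48kHz sample rate, and encoder padding of 2624 samples (a little over 50ms at that rate). Only whole frames can be dropped, so the leftover becomes the delay.

```python
import math

FRAME_SAMPLES = 1024  # assumed samples per AAC frame


def residual_delay_ms(padding_samples, sample_rate):
    """Delay left over after dropping enough whole frames to cover the padding."""
    frames_removed = math.ceil(padding_samples / FRAME_SAMPLES)
    removed_samples = frames_removed * FRAME_SAMPLES
    return (removed_samples - padding_samples) * 1000 / sample_rate


if __name__ == "__main__":
    # 2624 samples and 48000 Hz are illustrative assumptions, not measured values.
    print(f"{residual_delay_ms(2624, 48000):.2f} ms")
```

With those assumed inputs, three frames (64ms) are removed to cover about 54.7ms of padding, leaving a delay of a bit over 9ms.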
The original idea behind ReplayGain was to scan the files to determine the volume, save the info to tags and let the player adjust the volume on playback. Hardware support is fairly non-existent though, so the workaround is to adjust the volume of the audio to the same level so the player doesn't need to do it.
Right click and use the "ReplayGain/Scan per file track gain" option. When it's done save the ReplayGain info. Right click again and select "ReplayGain/Apply track gain to content" and that's it. It only works for MP3 and AAC. That should give each track the same average volume according to your ears rather than the highest peak volume or an RMS volume etc. For standard music (CD tracks) stick with a volume of 89. For soundtrack audio, change it to 83 as that's the European standard for soundtrack audio and provides more headroom for greater dynamics. See the first attached screenshot.
For more accuracy or other audio types, it's a 2 stage process if you downmix, but it doesn't require an intermediate file if you're not downmixing.
You'd downmix and convert to something lossless like a wave file, scan the wave file, then use the ReplayGain option in the Converter/Processing section to apply the volume when converting. For music tracks you'd normally leave the preamp on 0dB. For soundtrack audio you'd set it to -6dB to give you a volume of 83. See the second screenshot.
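The preamp values are just the difference between the volume you want and the ReplayGain reference of 89. A tiny sketch of that arithmetic (function name made up for illustration):

```python
REPLAYGAIN_REFERENCE = 89  # the fixed ReplayGain volume mentioned above


def preamp_for_target(target_volume):
    """Preamp in dB needed to land on target_volume instead of 89."""
    return target_volume - REPLAYGAIN_REFERENCE


if __name__ == "__main__":
    print(preamp_for_target(89))  # music tracks
    print(preamp_for_target(83))  # soundtrack audio
```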
In both cases, you can load the adjusted/converted files and check their volumes by scanning again to confirm there's no peaks above 1.000 (I don't fuss till they exceed about 1.1).
The second method can also be used for peak normalising. See the third screenshot. It increases the volume by 20dB, which would normally cause clipping, but if there's ReplayGain info saved to the file, f2k will limit the volume increase to prevent that and you end up with a peak normalised output. As long as the "prevent clipping" option is selected. For movie audio I generally just peak normalise, but for audio from a bunch of episodes of a TV series etc, it's nice to have them all the same volume. For CD tracks if you put a bunch of them on an MP3 player and run it in random mode like I do, it's absolutely mental not to adjust them all to the same volume of 89 with ReplayGain first (unless your player supports adjusting the volume using the info saved to tags).
The ReplayGain option in the first screenshot only affects losslessly adjusting MP3 or AAC.
The ReplayGain option in the second/third screenshots can be saved as part of a conversion preset, so you could save one for converting to a volume of 89, another to 83, and one more for peak normalising.
Last edited by hello_hello; 12th Nov 2017 at 01:01.
Normalising to peak value makes full use of the DAC's finite resolution. If the signal level is reduced too much, DAC resolution is wasted and overall sound quality is lower. A reasonable compromise is a peak between -3.0103dBFS and -6.0206dBFS (so losing between half a bit and one bit of overall system resolution).
For the record, foobar2000 doesn't use the original ReplayGain scanner any more. It uses an EBU R128 scanner because it's more accurate, but it has to keep writing tags referring to the old ReplayGain target volume and using ReplayGain-speak for backwards compatibility. Personally, I think music players should switch to EBU R128 because it's more intuitive, but it probably won't happen anytime soon.
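The "half a bit to one bit" figures come from the fact that each bit of resolution is worth 20*log10(2), roughly 6.02dB. A quick sketch of the conversion:

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.0206 dB per bit


def bits_lost(headroom_db):
    """Bits of resolution given up by leaving headroom_db below full scale."""
    return headroom_db / DB_PER_BIT


if __name__ == "__main__":
    for headroom in (3.0103, 6.0206):
        print(f"{headroom}dB of headroom -> {bits_lost(headroom):.2f} bits")
```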
EBU - Operating Eurovision and Euroradio.
Basically EBU R 128 recommends to normalize audio at -23 LUFS ±0.5 LU (±1 LU for live programmes), measured with a relative gate at -10 LU. The metering approach can be used with virtually all material. To make sure meters from different manufacturers provide the same reading, EBU Tech 3341 specifies the 'EBU Mode', which includes a Momentary (400 ms), Short term (3s) and Integrated (from start to stop) meter. Many vendors support 'EBU Mode' in their products.
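In practice that boils down to measuring the integrated loudness and applying the difference to the target as gain. A minimal sketch using the target and tolerances quoted above. The measurement itself (gating, metering) is not implemented here, and the example loudness value is made up.

```python
TARGET_LUFS = -23.0


def gain_to_target(measured_lufs):
    """Gain in dB (equivalently LU) to move a programme to the target."""
    return TARGET_LUFS - measured_lufs


def within_tolerance(measured_lufs, live=False):
    """True if the programme is close enough to the target already."""
    tolerance = 1.0 if live else 0.5
    return abs(measured_lufs - TARGET_LUFS) <= tolerance


if __name__ == "__main__":
    measured = -18.0  # hypothetical integrated loudness of a file
    print(gain_to_target(measured), within_tolerance(measured))
```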
You can't normalise to -3dB using foobar2000 itself. Well you can, but the process becomes far less automatic (Edit: Wrong! See post #32). You can if you're encoding with QAAC and use it to peak normalise though. For f2k and QAAC, the command line for -3dB peaks would be something like:
-N --gain -3dB --ignorelength -s -V 91 -o %d -
I tested the above command line on a CD track and then ran a true peak scan on the encoded version. The peak was -2.72dB, which is typical AAC variation. Without the -3dB gain reduction the peak was +0.25dB.
Realistically though, there ain't going to be a bunch of peaks at that same level, so there's probably a single peak in the audio which might be clipped a tiny little bit, and for soundtrack audio especially, that'll be where there's gunshots or explosions etc so you're never going to hear it anyway, which is why I've never fussed about peak normalising to -3dB, but QAAC can do it.
You can think of LUFS (loudness units relative to full scale, I think) as being the same thing as dB.
So effectively -18LUFS is the same as -18dB is the same as ReplayGain's 89dB
-23LUFS is the same as -23dB, which works out to the equivalent of 84dB in ReplayGain-speak (5dB below 89dB). I'd have to check, though.
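Taking the "-18LUFS equals ReplayGain's 89dB" equivalence at face value, the two scales differ by a fixed offset of 107, so converting between them is a one-liner each way. This is only a sketch of that arithmetic under that assumption, not something taken from foobar2000.

```python
OFFSET = 89 - (-18)  # 107, from treating -18LUFS as ReplayGain's 89dB


def replaygain_to_lufs(replaygain_db):
    """Convert a ReplayGain-style volume (e.g. 89) to LUFS."""
    return replaygain_db - OFFSET


def lufs_to_replaygain(lufs):
    """Convert a LUFS value (e.g. -23) to a ReplayGain-style volume."""
    return lufs + OFFSET


if __name__ == "__main__":
    print(replaygain_to_lufs(89))
    print(lufs_to_replaygain(-23))
```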
Last edited by hello_hello; 12th Nov 2017 at 21:02.