I use MeGUI for converting audio. "LAME MP3: *scratchpad*" is my Encoder setting, and my Config for it is in the Attachment. Basically, what I do is I use mkvextract to get the raw aac file itself from the mkv and then I use MeGUI to encode the 5.1ch aac to 128 ABR mp3. But whenever I do that, the mp3 product always has WAY less volume. I tried tinkering with the bit rate and even made it CBR 320kbps but the volume is still much weaker compared to the original 5.1ch aac file. Does anyone know why? Why is this happening? How do I retain the volume when encoding?
Your option clearly says "downmix multichannel to stereo".....the software is doing that badly. Simple.
MeGUI uses what seems to be a fairly standard downmix matrix. If you use something like ffdshow to downmix multichannel audio to stereo, and check the "normalize matrix" option in ffdshow's Mixer filter, you'll see it uses the same matrix as MeGUI for downmixing.
The way I understand it, it's a "worst case scenario" matrix, in that the volume of each channel is reduced in order to prevent any chance of clipping when the channels are combined. This invariably reduces the volume.
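To make the "worst case" idea concrete, here's a rough Python sketch. The coefficients are an assumption for illustration only (front channel at full level, centre and surround at roughly -3dB, LFE ignored), not the actual numbers shown by MeGUI or ffdshow, and the function names are made up. Normalising the matrix just means scaling the coefficients so they add up to 100%.

```python
# Illustrative only: these coefficients are assumed, not read from MeGUI or ffdshow.
# Contribution of each source channel to the LEFT output before normalising.
RAW_LEFT = {"front_left": 1.0, "centre": 0.7071, "surround_left": 0.7071}

def normalise_matrix(coeffs):
    """Scale the coefficients so they sum to 1.0.

    If every source channel hits full scale at the same instant, the
    mixed output then still cannot exceed full scale (no clipping).
    """
    total = sum(coeffs.values())
    return {name: value / total for name, value in coeffs.items()}

def mix(coeffs, samples):
    """Mix one sample per channel into a single output sample."""
    return sum(coeffs[name] * samples[name] for name in coeffs)

normalised_left = normalise_matrix(RAW_LEFT)

# Worst case: every channel at full scale (1.0) at the same time.
worst_case = {name: 1.0 for name in RAW_LEFT}
# More typical case: only the centre (dialogue) channel is busy.
centre_only = {"front_left": 0.0, "centre": 0.5, "surround_left": 0.0}

for name, value in normalised_left.items():
    print(f"{name}: {value:.1%}")
print("worst case, raw matrix:", round(mix(RAW_LEFT, worst_case), 3))
print("worst case, normalised:", round(mix(normalised_left, worst_case), 3))
print("centre only, raw matrix:", round(mix(RAW_LEFT, centre_only), 3))
print("centre only, normalised:", round(mix(normalised_left, centre_only), 3))
```

The normalised matrix can never clip, but in the far more common "centre only" case the output is a good deal quieter than it would be with the raw matrix, which is the volume drop being described.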
The way to "fix" it is to check the "Normalize peaks" option when downmixing. That'll get MeGUI to increase the gain of the audio until the peaks are at maximum (100%, although you can change it if you want to). Sometimes that'll result in stereo audio which sounds a little louder than the original (depending on how the original is being downmixed when you listen to it).
How do you use that black Windows theme? It'd make my head hurt.
Here's the script MeGUI uses for downmixing 5.1ch audio. You'll find it in the log file.
Here's ffdshow's mixer filter with the Normalize matrix option enabled (it's not the same as MeGUI's normalize function).
If you have a look at the percentages (might be a bit hard to see for ffdshow, as they're grey) you'll see both programs reduce the volume by the same amount when downmixing, so I don't think MeGUI is doing anything odd, you just need to enable MeGUI's normalize function to regain as much volume as possible after downmixing.
Someone else may know more about it than I do and might be able to shed some more light on the matrices used when downmixing.
Last edited by hello_hello; 18th Apr 2014 at 04:15.
Thanks! That "Normalize Peaks" option fixed it. It was originally checked, but then I unchecked it because I didn't know what it was. What I thought it did was that it would make low volumes louder and louder volumes weaker (with a limit of 100 alterations). But I didn't want that. I wanted the volume proportionate to the original, even if it meant having the resulting volume to be much lower overall.
But well, that fixed it. If I may ask for a quick explanation on what it does? I didn't quite get it reading on wikipedia.
Okay, I seem to have run into another problem. Now that the Normalize Peaks fixes it, when I tried encoding a longer 5.1ch aac file (audio track from a movie that runs an hour and a half), how come it always ends up in Error and does not encode at all? It just leaves the white ffindex file on the desktop. The time elapsed always (or so it appears) ends at 00:53.
Log is in attachments. What is going on this time? I noticed that the first successful encoding with Normalize Peaks just a few moments ago took longer than usual. That one was just 23:27 long (23m 27s). Yes, it took quite some time, longer than 10 seconds, from what I recall. So how much longer would it take for an hour and thirty-three minutes? Anyone?
FFAudioSource: ReadPacked unexpectedly failed to read a packet
1) Try leaving it in the mkv container instead of demuxing
2) Try another decoder like bassaudiosource, or configure your directshow decoders to be able to decode aac (e.g. install lav filters), and use directshow
Yeah, try loading the MKV directly into the audio encoding section to see if that works. I don't know why you're getting that error. MeGUI seems to be trying to decode using DirectShowSource. Any reason for that? Change the preferred decoder to NicAudio and let MeGUI choose the best decoder for the job. That might be the problem. Unless you're specifying DirectShowSource for a particular reason?
I just tried downmixing a 5.1ch 90 minute AC3 file while normalising and converting to AAC and the job completed without error.
All MeGUI's normalise peaks option does is raise the overall volume until the loudest part is at maximum. It doesn't change the relative volume throughout the file, it adjusts the overall volume by the same amount (for each file). It'll probably take longer to encode because I think MeGUI would need to downmix, check the peak levels, then adjust the audio volume and encode, so it's probably a bit like 2 pass encoding.
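As a rough illustration of why it behaves like two passes, here's a minimal Python sketch of peak normalisation. It's not MeGUI's actual code (I don't know how it's implemented internally); the function name and sample values are made up.

```python
def peak_normalise(samples, target_peak=1.0):
    """Return (scaled samples, gain) with the loudest sample at target_peak."""
    # Pass 1: the whole file has to be scanned to find the loudest sample.
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples), 1.0  # pure silence, nothing to scale
    # Pass 2: one fixed gain is applied to every sample.
    gain = target_peak / peak
    return [s * gain for s in samples], gain

# Made-up downmixed audio: a quiet section followed by a louder one.
quiet_part = [0.01, -0.02, 0.015]
loud_part = [0.2, -0.25, 0.1]
downmixed = quiet_part + loud_part

normalised, gain = peak_normalise(downmixed)
print("gain applied:", gain)
print("loudest sample after normalising:", max(abs(s) for s in normalised))
```

Because every sample is multiplied by the same gain, the quiet section stays exactly as quiet relative to the loud section as it was before. Only the overall level changes.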
I don't use MeGUI for audio encoding as I have my own system for downmixing multichannel audio to stereo. I use foobar2000's audio encoder instead.
foobar2000 has a downmix to stereo DSP you can add to the conversion chain. It doesn't normalise or prevent clipping so when downmixing I also get foobar2000 to apply a 6dB gain reduction to the audio to prevent it. 6dB seems to be plenty. When converting to a lossy format such as MP3, the audio can be stored with values above 100%, so there's no clipping as such anyway. Even if my method does result in values above 100% occasionally, it'd be only 1dB or so, so it's not really a problem. Doing it the way I do it.... downmixing while applying the same gain reduction each time..... seems to allow me to downmix while maintaining the same volume relative to the original each time.
For encoding multichannel audio without downmixing I just re-encode it "as-is".
If you've not used foobar2000 before it can involve a bit of a learning curve to configure it to your liking. You can save conversion settings as presets though, so once it's done, converting is nice and easy. Unlike MeGUI, foobar2000 will also batch convert audio, so you can load a bunch of files into a playlist, right click, select a conversion preset, and it'll convert as many simultaneously as you have CPU cores until it's done. You can also load MKV, AVI and MP4s into its playlist and convert the audio without demuxing. It also has a plugin which lets it play/convert audio via an Avisynth script.
My foobar2000 conversion setup looks like this:
If you want to try it, I can hopefully give you a head start. My foobar2000 configuration files are attached to this post. I don't think it includes the QAAC or FhG encoders or configurations but I think the rest is the same.
PS. If you keep getting the same error when normalising it may be a bug. The best place to report it would be here. The MeGUI developers hang out there and they generally fix reported problems very quickly, although if you're not a member you have to wait five days after joining before you can post which is really dumb. And don't mention downloaded files. One of the moderators looks for the slightest excuse to moderate. I've reported bugs in the past which have been fixed and a new MeGUI version released within 24 hours. It depends how busy the developers are.
Last edited by hello_hello; 19th Apr 2014 at 01:48.
It's been an hour and 50 minutes already, and until now, it still hasn't started. I left the aac in the mka container, and until now, it's still "preprocessing." I don't get why it's taking this long. I still have yet to experiment using NicAudio and FFAudioSource as the decoder, also to experiment with the 3 decoders on the mkv itself (since right now, I'm doing the mka). Why's it taking this long? I have 8 GB RAM, so this is quite ridiculous already.
*UPDATE: It works with NicAudio. It only took less than a minute to do the whole pre-encoding process. Thank you!
Last edited by zetsu_shoren; 19th Apr 2014 at 03:24.
It appears to be "stuck". Just abort it and try again with a different decoder. NicAudio "should" work. DirectShow appears to be failing and ffaudiosource appears to be getting stuck, but theoretically they should work too.
Does it only happen when you try to convert aac audio and normalise it?
Last edited by hello_hello; 19th Apr 2014 at 03:22.
Thinking about it, there were a few audio changes made recently. Bug fixes, I think. I don't know if any could relate to your problem, but it can't hurt to try. You're using MeGUI version 2237. The current stable version is 2418 (I think) but the most recent version is 2493. Either update to the latest stable version, or switch to the development update server in MeGUI's options, then run an update from the Options menu. You never know, the problem might go away. The version you're using is probably a bit old.
There's also an exclamation mark next to "update" and "versions" at the top of the log file. Probably for a reason. What's MeGUI unhappy about there?
If all else fails and you can upload a sample of the audio, I could give it a spin to see what happens.
Last edited by hello_hello; 19th Apr 2014 at 03:35.
I have come across a new (or probably the same) problem. It was already working with 2237, but with 2418, even 5.1ch ac3 files (which normally kept the same volume when encoded, even before I knew about Normalise Peaks) have their resulting mp3 volume lowered, even with Normalise. Some 5.1ch aac and ac3 files (in the mka container), when encoded to mp3 at ABR 128 using any of the decoders, with and without Normalise Peaks, always result in lower volume. I don't get it. It was fine when I was using NicAudio, Normalise 100, Downmix to Stereo, ABR 128 mp3 on 2237, but the same settings on 2418 make it seem as if learning about Normalise Peaks was pointless.
I think I'll be switching back to 2237. In 2418, I also tried the raw aac file with Normalise Peaks using any decoder, but MeGUI always errors and I have to close it. I'll be experimenting with 2237 again, and if I get the same volume, then I'm sticking to 2237. If anyone knows how to fix it in 2418, it would be great so that I don't have to wonder which one I should be using.
*EDIT: Apparently, for one episode's 5.1ch aac of 24m 2s, regardless of which version I use, NicAudio with Normalise results in the same volume for the mp3, but a DIFFERENT episode with its 5.1ch aac of 24m 38s and the same settings results in lower volume. What is the reason for this? The ac3 file I mentioned above was a 5.1ch of movie length. I'm guessing the problem lies with the length of the audio file to encode, seeing as that's what I notice.
Last edited by zetsu_shoren; 24th Apr 2014 at 04:05.
Standard normalisation isn't the best way to normalise audio. It normalises the audio until the peaks are at maximum value, but one audio might have louder peaks than another.
If two audio tracks have the same average volume, but one has a peak 10dB louder than the other (a particularly loud gunshot, for example), after normalising them both so the loudest part is at maximum, the first will have an average volume that's 10dB lower, because it can't be increased as much.
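The arithmetic behind that can be sketched in a few lines of Python. The levels below are made-up figures (in dB relative to maximum) chosen purely to illustrate the point, and the function names are hypothetical.

```python
def gain_to_reach_maximum(peak_db):
    """Gain in dB that peak normalising applies to put the peak at 0dB (maximum)."""
    return 0.0 - peak_db

def average_after_normalising(track):
    """Average level in dB once the peak-normalising gain has been applied."""
    return track["average_db"] + gain_to_reach_maximum(track["peak_db"])

# Made-up figures: same average volume, peaks 10dB apart.
with_gunshot = {"average_db": -30.0, "peak_db": -2.0}
without_gunshot = {"average_db": -30.0, "peak_db": -12.0}

print("with gunshot:   ", average_after_normalising(with_gunshot), "dB average")
print("without gunshot:", average_after_normalising(without_gunshot), "dB average")
```

The track with the loud gunshot can only be raised by 2dB before its peak hits maximum, while the other can be raised by 12dB, so two tracks that started out sounding equally loud end up 10dB apart.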
Often when normalising audio.... especially if it's different episodes of a TV show.... the average and peak volumes will be pretty similar so normalising the traditional way works fine. If the peaks are quite different though (which sometimes happens) the results will be different..... that's why I do it using foobar2000 and the method I explained earlier.
If you want to confirm whether the problem is MeGUI or NicAudio etc, and you're working with MKVs, extract the audio using the HD Streams Extractor (under the Tools menu). In the options section, add the following (without the quotes) "-normalize -downStereo". Convert it to another format and check to see if the result is the same (it won't convert to MP3 but you can convert to something else to compare volume). If the result is the same, it won't be anything MeGUI is doing wrong. It might also be interesting to try the HD Streams Extractor with the problem AAC..... the one which causes MeGUI to fail.
You should find the eac3to log file saved to the same location as the output file.
For the MP3s you've already encoded, try MP3Gain. Scan the files and it should tell you how loud the peaks are. It'll also "normalise" using ReplayGain, which normalises by adjusting the volume according to how loud it sounds on average (not by simply increasing it until the peaks are at maximum). I think you'll need to enable the "maximise features" in its advanced settings first, but then you can right click on an MP3 and select "track analysis" and when it's done it'll show you the maximum volume increase before the peaks are at max. In the example below it's only 1.5dB for both files and according to ReplayGain the average volumes are the same.
For soundtrack audio I change the "target volume" to 83dB. The default is 89dB. That's fine for music audio, but for soundtrack audio which tends to be more dynamic, reducing it to 83dB should prevent any chance of clipping once ReplayGain has been applied. MP3Gain can adjust the volume of MP3s without re-encoding them.
I don't know of any video conversion GUIs which use ReplayGain for normalising audio. I'm not sure why as it's generally a better way to do it than simply adjusting the volume until the peaks are at maximum.
Anyway..... in the example below the average volumes are 85.9dB, I've set the target volume to 83dB, so after applying "track gain" it would reduce the volume by 3dB. The peaks are currently 1.5dB below maximum. If you check one of your "low volume" MP3s that way and the peaks are already at 0dB then MeGUI's normalisation has probably worked correctly.
The same two files with the target volume set to 89dB instead of 83dB. Now instead of reducing the volume by 3dB, applying ReplayGain will increase it by 3dB. The red "Y" indicates the peaks will be above maximum after applying ReplayGain.
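The numbers in those two examples work out roughly like this. This is only a sketch of the arithmetic, not MP3Gain's actual code: MP3Gain changes volume in 1.5dB steps, but rounding to the nearest step is my assumption about how it picks the adjustment, and the function names are made up.

```python
STEP_DB = 1.5  # MP3Gain adjusts MP3 volume in 1.5dB steps

def suggested_gain(measured_db, target_db):
    """Gain needed to reach the target volume, rounded to the nearest step (assumed)."""
    steps = round((target_db - measured_db) / STEP_DB)
    return steps * STEP_DB

def will_clip(gain_db, headroom_db):
    """True if the gain would push the peaks above maximum."""
    return gain_db > headroom_db

measured = 85.9   # average volume reported by the scan
headroom = 1.5    # how far the peaks are below maximum

for target in (83.0, 89.0):
    gain = suggested_gain(measured, target)
    print(f"target {target}dB: gain {gain:+.1f}dB, clipping: {will_clip(gain, headroom)}")
```

With the target at 83dB the gain is negative, so the peaks move further away from maximum. With the target at 89dB the gain is larger than the 1.5dB of headroom, which is the situation the red "Y" is warning about.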
You can also adjust the volume of MP3s losslessly with mp3DirectCut. It only has a traditional "normalise" function, but you can adjust the volume up or down by any amount you like.
Last edited by hello_hello; 24th Apr 2014 at 07:40.
Okay. I guess I'll just try foobar2000 instead.
*EDIT: I've added your files already after installation. May I know the difference among the three MP3 presets? Before I added your files, I tried making my own MP3 preset.
My settings were:
Output format:
1. MP3 (LAME)
2. ~130kbps (*), V5 (this was a knob which I couldn't set at 128, nor ABR. If I may ask for assistance on this part?)
Destination:
1. Specify --> Desktop
2. Name format --> %title% (I left it as is)
Processing:
1. Active DSPs --> Downmix channels to stereo
2. Everything else was left as is, including "Without RG info" being left at ±0.00dB (crucial part)
Other:
1. All left as is
This was all before I fully read your post mentioning instructions which I failed to see within the extracted folder.
After copying your files, I tried using your MP3 128 Downmix, and then the resulting mp3 file was 3MB bigger than the 110 VBR mp3 I made, and its volume was even lower.
Here, audio mediainfo:
This one is when I tried your "No Processing" mp3 preset:
I'm guessing this was an error because there was no "Downmix" option chosen, since mp3 can only have 2ch.
Last edited by zetsu_shoren; 24th Apr 2014 at 14:16.
You've got to remember that every time you combine audio channels you increase the volume. If you took two identical stereo tracks and mixed them into a single stereo track the resulting stereo track would be louder than either of the originals. The same applies to downmixing multichannel audio to stereo, so when you say the downmixed version is quieter than the original it depends on how you're listening to the original. When you listen to the original using a PC the multichannel audio is probably just being "combined" to stereo. You can downmix exactly the same way, but because downmixing that way might result in clipping (sections of the audio will have peaks greater than maximum) the foobar2000 presets I use reduce the volume by 6dB while downmixing to prevent that from happening. If you want to stop the volume from being reduced while downmixing you can disable it in the ReplayGain section.
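A quick Python sketch of the numbers involved, using a made-up sample value. Two identical signals is the extreme case. Real channels are never identical, so the increase from combining them is normally smaller than this.

```python
import math

def to_db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20 * math.log10(ratio)

def from_db(db):
    """Convert decibels to an amplitude ratio."""
    return 10 ** (db / 20)

sample = 0.8                  # one sample at 80% of maximum
summed = sample + sample      # two identical tracks mixed together
increase_db = to_db(summed / sample)

# Apply a fixed 6dB gain reduction after mixing.
after_reduction = summed * from_db(-6.0)

print(f"mixed sample: {summed:.2f} (above maximum of 1.0)")
print(f"volume increase from mixing: {increase_db:.2f}dB")
print(f"after -6dB: {after_reduction:.3f}")
```

Mixing two identical signals doubles the amplitude, which is an increase of about 6dB, so a fixed 6dB reduction is enough to bring even that extreme case back to roughly where it started.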
There are two settings under ReplayGain. The first adjusts the volume after ReplayGain has been applied. My conversion presets don't use ReplayGain. In order to use it, you first need to run a ReplayGain scan on each file and save the ReplayGain info as tags in the files (foobar2000 has a function for that). foobar2000 can then use the saved ReplayGain info when converting. The second fader in the ReplayGain section adjusts the volume for conversions where there is no ReplayGain info in the files. For my "downmix presets" it's set to -6dB. The reason I do it that way is so the result will always be the same when downmixing. The multichannel audio is downmixed, its volume is reduced by 6dB to prevent clipping and then it's converted. Even if the resulting stereo audio sounds a little quieter than the original it'll still have the same volume relative to the original each time because it's being downmixed the same way each time and the same volume reduction is being applied. When you downmix with MeGUI while "normalising" it's being downmixed the same way but the "normalising" doesn't always adjust the volume by the same amount.
If you want to get rid of the -6dB volume reduction you can. Just move the second slider under the ReplayGain section back to 0dB and save the conversion preset. foobar2000 will then just downmix without adjusting the volume any further and the output should sound the same volume as the input.
For the LAME encoder.....
foobar2000 only has one method for adjusting the MP3 settings "automatically". It sets LAME's variable bitrate encoding using one of the built-in LAME variable bitrate presets (V0, V1, V2, V3, V4 etc). V2 is the default. The lower the value the better the quality, although anything from V2 to V0 is considered "transparent". I use V2 for converting CDs for my MP3 player.
The standard LAME VBR presets are like the audio equivalent of x264's CRF encoding. You're picking the quality, but the bitrate will be different each time.
The other possibilities are average bitrate encoding and constant bitrate. I'm pretty sure MeGUI has presets for both but foobar2000 doesn't. Average bitrate is a variable bitrate method which unlike the normal variable bitrate lets you choose the target bitrate. It's a compromise between true variable bitrate and constant bitrate. To use ABR or CBR LAME encoding with foobar2000 you need to set the encoder commandline manually. For a CBR 128k MP3 it looks like this (select "custom" for the encoder, or select one of my CBR presets and then "edit" or "new"):
For an ABR 128kbps encode the commandline might look like this:
-S --noreplaygain --abr 128 - %d
For average and constant bitrate encoding LAME also lets you adjust the quality which affects encoding speed. The default is q3. Many encoder GUIs (including MeGUI) set the quality to q2 which is slightly better and a bit slower. For an ABR 128kbps encode with q2 the commandline might look like this:
-S --noreplaygain --abr 128 -q 2 - %d (I'm pretty sure that'd be the commandline MeGUI uses, with whatever bitrate you set)
For a CBR 128kbps encode with q2 it might look like this:
-S --noreplaygain --cbr -b 128 -q 2 - %d
(Don't worry about the --noreplaygain setting. It stops the encoder from saving ReplayGain tags to the MP3s when it's encoding).
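If it helps to see how those commandlines are put together, here's a small Python sketch that builds the same parameter strings. The function name is made up for illustration. The switches are the ones already shown above, the lone "-" tells LAME to read the audio from standard input, and %d is foobar2000's placeholder for the output file.

```python
def lame_parameters(mode, bitrate, quality=None):
    """Build a LAME parameter string for foobar2000's custom encoder setup.

    mode    -- "abr" (average bitrate) or "cbr" (constant bitrate)
    bitrate -- target bitrate in kbps
    quality -- optional LAME -q value (lower is slower and slightly better)
    """
    params = ["-S", "--noreplaygain"]  # silent mode, don't write ReplayGain tags
    if mode == "abr":
        params += ["--abr", str(bitrate)]
    elif mode == "cbr":
        params += ["--cbr", "-b", str(bitrate)]
    else:
        raise ValueError("mode must be 'abr' or 'cbr'")
    if quality is not None:
        params += ["-q", str(quality)]
    params += ["-", "%d"]  # read from stdin, write to foobar2000's output file
    return " ".join(params)

print(lame_parameters("abr", 128))
print(lame_parameters("abr", 128, quality=2))
print(lame_parameters("cbr", 128, quality=2))
```

Changing the bitrate or quality is then just a matter of changing the numbers, as long as there's a space between each switch and its value.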
Something to remember....
If you change an encoder "setup" and it's being used in a conversion "preset" you must resave the preset if you want to use the new encoder settings with that preset. If you don't, the preset will revert back to the original encoder setup the next time you use it. The same applies to ReplayGain and DSP settings etc.
It can take a bit to get your head around the way the foobar2000 converter works but once you do and everything's saved as presets it becomes very easy to use.
Last edited by hello_hello; 24th Apr 2014 at 20:44.
What's the essential difference between:
-S --noreplaygain --abr 128 - %d
-S --noreplaygain --abr 128 -q 2 - %d
And if I were to use either command line, would I need to change this part
to "ABR, 128kbps" too? Just like that above? It was previously "CBR" so I thought that if I had to change the cbr in the command line, maybe I'd have to follow suit with anything else that says "CBR." And about the --abr128 in the image, I accidentally forgot to add a space in between. I changed it to --abr 128.
And about the relation between
If the Processing is "none," Without RG info cannot be altered? And if Without RG info is set to ±0.00dB, will Processing automatically be set to "none?"
Last edited by zetsu_shoren; 25th Apr 2014 at 06:25.
Try re-encoding a file without -q 2 in the commandline, then again with -q 2. You'll find with it, encoding takes longer. Try again with -q 0 and you'll see it's painfully slow. Even though you don't see it in the GUI, I'm pretty sure MeGUI adds -q 2 to the commandline when it's doing ABR or CBR MP3 encoding. It'll be shown in the commandline in MeGUI's log file.
The top dropdown box tells foobar2000 which ReplayGain tags to use to adjust the volume when encoding. The Track Gain or Album Gain tags. It's two different types of scans. Album Gain scans a bunch of files as an album so when ReplayGain is applied they all have their volumes adjusted by the same amount. Track Gain adjusts tracks individually. If the top dropdown box is set to none, any ReplayGain tags in the files will be ignored and the top volume slider will have no effect.
For files without ReplayGain tags, or when the top box is set to "none", the bottom slider adjusts the volume by a set amount. If you set the bottom dropdown box to "none", the bottom slider will have no effect. If you change the volume using the bottom slider with the bottom dropdown box set to "none", use the "back button" and then go back into the ReplayGain configuration, you'll see the slider has been reset to zero.
You can get a fair idea of what the ReplayGain section is doing in terms of volume here:
If in doubt, after setting up the volume in ReplayGain (or disabling it), use the back button to see what's listed under the Processing section. If there's no mention of ReplayGain, then the volume isn't being adjusted.
Last edited by hello_hello; 25th Apr 2014 at 08:37.
Yes, I noticed that the first time I encoded with my own made up settings (mentioned here), it was encoding at x50.0+, but now with the -q 2, it's encoding at the same speed as MeGUI of x20.0+.
Alright, I think I've got it already. Haha, thanks mate! For foobar2000, the instructions, the files, and explanations!
I don't do much downmixing these days as I don't convert to AVI for playing video with DVD players much, but I happened to be doing a season of a TV show today so I thought I'd run the downmixed files through ReplayGain and have a look. I used my usual foobar2000 downmix method. I'd assume as they're from the same TV series all the volumes should be pretty much the same to begin with.
ReplayGain seemed to do a pretty good job of determining the volumes. It thinks they're all within 2dB of being the same which isn't much of a variation. In this case if they were all adjusted by the same amount until one of them had peaks at maximum, it turns out they could be all increased in volume by 6dB without any of them "clipping" so it turns out the 6dB gain reduction I normally apply was completely unnecessary.
Mind you the series in question doesn't have particularly dynamic audio..... it's all dialogue with no car chases or gunshots etc..... and the idea behind the 6dB gain reduction was a "worst case scenario" method of downmixing so I wouldn't have to keep checking for clipping after every downmix. Without the 6dB gain reduction many of them would have had peaks at maximum. Mind you even if the peaks end up a couple of dB above maximum it's probably no big deal.
Even for the series in question, the difference in peak levels is 6dB, so using "traditional" normalisation would mess with the relative volumes a bit. 6dB isn't huge, but it's enough to be noticeable.
Anyway..... as I was downmixing a bunch of files I thought I'd have a look to see what the result was, given we've been discussing the subject.
There's something I noticed. Whenever I convert a single audio file using foobar2000, how come the speed is normal (the usual x20 to x26) but when I convert a batch of around 7-10 (even more), the encoding speed reaches x70 to x90 (but I think it slowly drops in time, which my guess for that is when there are fewer files left to encode)?
I think the displayed encoding speed is the combined encoding speed for multiple files. So it'll display something like 80x when it's encoding four files simultaneously at 20x. It probably drops towards the end as they complete and the number of files it's encoding simultaneously drops.