audio issue not sure how to fix

9th Aug 2020 01:19 #1

Member

Hello everyone, hopefully I have found the right location for this post.

I'll start with what I am using to encode. I use XviD4PSP version 8 but also have the last free version, 7. Both produce the same issue.

If anyone knows how to fix the issue with those programs, great; otherwise, if there's a way to fix it with other free programs, that's great too.

The issue is also present in the source file, and I've yet to find a good source file that doesn't have it, so I can't fault the encoders.

The offending file is an AVI, but I'm not sure that makes a difference.

What is happening is that the audio has dramatic volume changes throughout the video. One minute it's quite loud, then it drops really low, and vice versa.

I am thinking what's needed is to somehow rip the audio out (unless the programs I use can do it, but so far no luck), convert it to something, and create a baseline volume level, with everything either lowered or increased so it matches throughout the file.

I'm just not sure how to get this done. One reason I like those programs is that the encoding is almost automatic for me.

thanks

ryan


10th Aug 2020 14:48 #2

abolibibelot

Member

I thought that someone would have attempted to reply something by now, but no... so let's have a go at it...

The location is correct, but perhaps the “issue” is a bit difficult to clearly understand – in part because it is first exposed about 2/3 into the post, while sentences before that only vaguely allude to it without containing much in terms of relevant information regarding the situation. So possibly most of those who could have helped were tired and thirsty and low on blood sugar before they got to that point and jumped to the next thread.

So what exactly is the source file ? Is it an official release of some kind or a personal recording ? In what format is the source ? DVD ? Other ?

I once had a similar situation with a professionally made but utterly botched movie DVD, which had about 40min of very low volume audio, then about 10min at about normal volume but with a lot of hiss, then about 30min at very low volume again, plus it had other cringeworthy defects, like a sudden strident noise during a calm scene, or about 10 seconds of audio seemingly coming from another movie... Well, anyway, it was a nasty disaster, and took quite a lot of work to get a halfway decent result. To balance the volume I had to separate the audio track in three parts then normalize the volume of each part independently to reach a similar volume throughout. If the volume changes are very frequent, this might not be manageable. Magix Music Editor has a feature called Volume Adaptation which automatically cuts a track in as many chunks as required (based on the settings : target RMS volume and % of adaptation) and sets the volume for each chunk. It's quite good at what it does (its noise reduction features are also very efficient) but it's more of a companion to the Magix video editor it comes packed with, it's not so convenient as a standalone audio editing software. Perhaps a similar feature can be found in renowned full blown audio editors, but as far as I know Audacity doesn't have anything equivalent, so it would have to be a commercial software.
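To make the split-and-normalize idea concrete, here is a minimal Python sketch. It is not what Magix or any other editor does internally; the function names, the cut points passed in `boundaries`, and the `target_rms` value are all hypothetical, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
import math


def rms(chunk):
    """Root-mean-square level of a list of float samples."""
    return math.sqrt(sum(s * s for s in chunk) / len(chunk))


def normalize_segments(samples, boundaries, target_rms=0.1):
    """Scale each segment so that its RMS level matches target_rms.

    boundaries holds the sample indices where a new segment starts
    (hypothetical, hand-picked cut points).
    """
    edges = [0, *boundaries, len(samples)]
    out = []
    for start, end in zip(edges, edges[1:]):
        chunk = samples[start:end]
        level = rms(chunk) if chunk else 0.0
        # Leave silent segments alone instead of dividing by zero.
        gain = target_rms / level if level > 0 else 1.0
        out.extend(s * gain for s in chunk)
    return out
```

Note that this sketch does nothing to stop peaks from exceeding full scale after the gain is applied, and the abrupt gain change at each cut point would be audible unless the cuts fall in quiet passages.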
Another option is to process the whole track with a compressor, which should increase the volume of quiet parts without affecting the loud parts, or the other way around (I did some research and am still not quite sure of what is and what is not dynamic compression). It may sound complicated but it's more complicated than it sounds. Audacity does have a basic compressor, and can work with fancier ones as third-party plugins. I tried an ffmpeg filter called dynaudnorm but wasn't impressed by the result. I'm seeing right now that DynamicAudioNormalizer also comes as a standalone command line utility; I don't know if it performs better.

Anyway, the bottom line is that there is no automatic or almost automatic fix for this, and perhaps that's why no one cared to reply yet, as there seems to be a big gap between the complexity of the task and your expectations of a quick and easy fix.


10th Aug 2020 16:43 #3

hello_hello

Member

I find the Dynamic Audio Normalizer to be quite good.

abolibibelot,
I kind of remember discussing this in the past, but if not, try the following command line for ffmpeg. The default Dynamic Audio Normalizer settings react too slowly for me.
-i - -ignore_length true -af dynaudnorm=f=150:b=1 -c:a pcm_s24le out.wav
This will react more quickly, but it might cause some noticeable "volume pumping". That's a trade-off with any compression method though.
-i - -ignore_length true -af dynaudnorm=f=75:g=11:b=1 -c:a pcm_s24le out.wav
The CLI and ffmpeg versions are exactly the same.
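The command lines above read from a pipe (`-i -`), which is how foobar2000 feeds ffmpeg. For running the same filter directly on a file, a rough Python sketch that only builds the command might look like this. The file names, the `-vn` flag (skip the video stream) and the helper name `build_dynaudnorm_cmd` are assumptions added for illustration; the dynaudnorm settings and the `pcm_s24le` codec are the ones from the command lines above.

```python
import shlex


def build_dynaudnorm_cmd(src, dst, framelen_ms=150, gausssize=None, alt_boundary=True):
    """Build (but do not run) an ffmpeg command applying dynaudnorm to src."""
    opts = [f"f={framelen_ms}"]          # frame length in milliseconds
    if gausssize is not None:
        opts.append(f"g={gausssize}")    # size of the smoothing window, in frames
    if alt_boundary:
        opts.append("b=1")               # alternative boundary mode
    return [
        "ffmpeg",
        "-i", src,
        "-vn",                           # assumption: audio only, drop the video
        "-af", "dynaudnorm=" + ":".join(opts),
        "-c:a", "pcm_s24le",
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical file names; print the commands rather than running them.
    print(shlex.join(build_dynaudnorm_cmd("movie.avi", "out.wav")))
    print(shlex.join(build_dynaudnorm_cmd("movie.avi", "out.wav", framelen_ms=75, gausssize=11)))
```

The list form can be passed to `subprocess.run` as is, which avoids quoting problems with file names that contain spaces.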

I use foobar2000 for encoding. For ridiculously dynamic audio, you can add the Amplify DSP to the conversion chain to give the volume a serious boost, follow it with foobar2000's Advanced Limiter to limit the peaks, then send the output to the Dynamic Audio Normalizer for compressing and encoding. I use the version built into ffmpeg as the CLI version isn't very GUI friendly. I've also created encoder presets that pipe the audio from ffmpeg to QAAC for encoding.

I uploaded a portable version of foobar2000 here that contains a whole bunch of conversion and encoder presets.
https://forum.videohelp.com/threads/396860-foobar2000-portable-(for-audio-encoding)
You need to download ffmpeg yourself, but there are instructions in an included zip file telling you where to put it, and a few file paths need to be configured in its options, but it's fully portable so if you don't like it you can just delete it. Once you've done the few things required to get it working properly you can load any audio into a playlist, right click, select convert, and a list of conversion presets will pop up.
foobar2000 can also open the common video containers (MKV, MP4, AVI etc) and play and re-encode the audio within. There's a couple of presets ready to go for compressing with the Dynamic Audio Normalizer, but if it doesn't compress enough it's a starting point for creating one that does. You probably won't have to adjust the Dynamic Audio Normalizer settings though, just boost the volume and limit it before it's compressed.

The existing presets list looks something like this. I can help you to create one for more compression if need be, but it might also help to upload a sample of the audio so I can experiment with it.

[Attached image: Clipboard01.jpg (the list of conversion presets)]

Last edited by hello_hello; 10th Aug 2020 at 17:04.

Avisynth functions Resize8 Mod - Audio Speed/Meter/Wave - FixBlend.zip - Position.zip
Avisynth/VapourSynth functions CropResize - FrostyBorders - CPreview (Cropping Preview)


10th Aug 2020 19:24 #4

abolibibelot

Member

I kind of remember discussing this in the past, but if not, try the following command line for ffmpeg. The default Dynamic Audio Normalizer settings react too slowly for me.
-i - -ignore_length true -af dynaudnorm=f=150:b=1 -c:a pcm_s24le out.wav
This will react more quickly, but it might cause some noticeable "volume pumping". That's a trade-off with any compression method though.
-i - -ignore_length true -af dynaudnorm=f=75:g=11:b=1 -c:a pcm_s24le out.wav

I don't remember exactly how ffmpeg + dynaudnorm performed compared with the other methods I tried, but, for the issue exposed in the thread linked above (that was about a year ago), I got the best result with Adobe Audition (a “portable” version I “happened” to find -- wasn't gonna pay big bucks for a one-time job) and its multiband compressor + denoising filters.
I'll try to think about running some more tests one of these days (I kept the intermediate files for this now considered completed project on another HDD, and the temperature's too high right now to mess with that mess).

The CLI and ffmpeg versions are exactly the same.

Alright then. But how come the standalone version is that big, 35MB, which is more than half the size of ffmpeg which includes a gazillion other components ? (Well, to partly answer my own question, the ZIP includes vcredist_x86 and vcredist_x64 which are almost 15MB each.)

foobar2000 can also open the common video containers (MKV, MP4, AVI etc) and play and re-encode the audio within.

That's good to know, and that would make the task almost as straightforward as what the O.P. is requesting, provided that the result is satisfying.
Can it directly remux the processed audio as a new video file, or does it only export audio, which has to be remuxed with another application ?

...just boost the volume and limit it before it's compressed.

Here do you mean dynamic compression, or compression of the data to create the output file ?


12th Aug 2020 04:57 #5

Ryan

Member

hello everyone

Thanks for the replies. I've been busy, and it's a small issue really: it has only happened in 1 file out of thousands.

the source supposedly is a dvd rip avi file

Back in the day (way back, probably in the late 90's) I used to have a program for normalizing batches of MP3 music files so they'd all have the same volume level. It was a great project in theory, but it never really got off the ground: the plan was to rip all my CDs, then work on cassettes. There's nothing worth ripping on records; I only have a few left, mostly collectibles like a signed copy of The Gambler by Kenny Rogers.

I used to do my own captures, then found out it was quicker just to download them. Oh, the days of capping, editing, converting, etc. It took hours and hours a day; as slow as computers were, it was one or two shows a night at most, plus space was at a premium and raw captures were something like 1GB a minute. I don't miss those days.

But I can't recall most of those programs, or no longer have access to them. TMPGEnc was the big one; then I had one that would edit the raw files and save them untouched (except for the edits), and one program I used to make playable DVDs.

I even bought a printer that could print directly onto CDs and DVDs. I had that thing for 8 years and never printed a single disc.

of course all that has little to do with the current problem.

What's odd is that I've gone back and there are no comments on the audio issue, so I deleted the file and re-downloaded it, and the issue is still there. Had the volume changed just once or twice it would not have been a big deal, but it does it many times, and I am already half deaf, so when it's low and then jumps high even I notice.

When I get a chance I'll try foobar2000. I just remember that in the past, when there were audio issues, if I separated the audio and video they never went back together quite the same; there were usually sync issues.

thanks again

ryan


12th Aug 2020 13:29 #6

abolibibelot

Member

Well, you seem to have a habit of wandering around a lot before getting to the point... O_o

of course all that has little to do with the current problem.

And that's the problem...

Was there a question in that last post ?

the source supposedly is a dvd rip avi file

Is this such a rare movie that there's only one source available ? AVI with Xvid/Divx + MP3 should be considered a thing of the past by now.

Oh, by the way I just ate two eggs and a bowl of cereals with two yoghurts and cocoa powder and half a kiwi, and now I'm about to prepare some tea...


12th Aug 2020 15:41 #7

hello_hello

Member

Originally Posted by abolibibelot

Alright then. But how come the standalone version is that big, 35MB, which is more than half the size of ffmpeg which includes a gazillion other components ? (Well, to partly answer my own question, the ZIP includes vcredist_x86 and vcredist_x64 which are almost 15MB each.)

I don't know if there's a more recent version. The one I have is dated 2017/04/14.
DynamicAudioNormalizerCLI.exe is only 2.8MB.

The annoying thing about the CLI version is it doesn't support a "fake" wave header. Almost every other encoder and audio encoding GUI does.
The "fake" wave header includes information such as channel layout and sample rate etc. It's why most encoders have some sort of -ignorelength option, because the duration in the fake header is invariably wrong.
Without support for a fake wave header, all the necessary details must be specified in the command line. I have a foobar2000 encoder configuration for the Dynamic Audio Normalizer, piping the audio to QAAC for encoding. It looks like this (I removed the file paths to make it shorter).

DynamicAudioNormalizerCLI.exe -i - --input-bits 32 --input-chan 2 --input-rate 48000 -o - -f 150 -b | qaac.exe -R --raw-channels 2 --raw-rate 48000 --raw-format F32L -s --no-optimize --no-delay -V 91 -o %d -

It'll only work if the audio is 2ch and 48k. Any other configurations require a new encoder preset with a different command line.

ffmpeg and QAAC support the fake wave header, so this works for any audio format.

ffmpeg.exe -i - -ignore_length true -af dynaudnorm=f=150:b=1 -c:a pcm_f32le -f wav - | qaac.exe --ignorelength -s --no-optimize --no-delay -V 91 -o %d -

The alternative for the Dynamic Audio Normalizer would be to output a temporary wave file with a real wave header, use it as the DAN input and output another temp wave file for encoding with QAAC.
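To show the kind of information a wave header carries, here is a small Python sketch that builds a standard 44-byte PCM header and reads the format back out of it. This is only an illustration: the placeholder length value `0xFFFFFFFF` and the function names are assumptions, and it is not necessarily byte-for-byte what foobar2000 sends down the pipe.

```python
import struct

# Placeholder length: the real size is not known while the audio is being piped.
UNKNOWN_SIZE = 0xFFFFFFFF


def fake_wav_header(channels=2, sample_rate=48000, bits=16):
    """Return a 44-byte PCM WAV header with placeholder size fields."""
    block_align = channels * bits // 8       # bytes per sample frame
    byte_rate = sample_rate * block_align    # bytes per second
    return (
        b"RIFF" + struct.pack("<I", UNKNOWN_SIZE) + b"WAVE"
        + b"fmt " + struct.pack("<IHHIIHH", 16, 1, channels, sample_rate,
                                byte_rate, block_align, bits)
        + b"data" + struct.pack("<I", UNKNOWN_SIZE)
    )


def read_format(header):
    """Pull channel count, sample rate and bit depth back out of the header."""
    channels, sample_rate = struct.unpack_from("<HI", header, 22)
    (bits,) = struct.unpack_from("<H", header, 34)
    return channels, sample_rate, bits
```

An encoder that understands such a header can pick up the channel count and sample rate by itself and only needs to be told to ignore the length, which is what the ignore-length options are for.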

foobar2000 can also open the common video containers (MKV, MP4, AVI etc) and play and re-encode the audio within.

Originally Posted by abolibibelot

That's good to know, and that would make the task almost as straightforward as what the O.P. is requesting, provided that the result is satisfying.
Can it directly remux the processed audio as a new video file, or does it only export audio, which has to be remuxed with another application ?

It doesn't remux. The reality of it is, it's supposed to be an audio player. It also just happens to be a very good converter.
The only downside to encoding audio without extracting it from the container first, is if there's any container audio delay, it's not accounted for, so it's a good idea to check the original for an audio delay and apply it for the new file when muxing.
Most of the time I extract the audio first, and it's generally extracted with any delay included in the file name, e.g.

S02E01_track2_[eng]_DELAY 42ms.ac3

The encoded version has the same name, so MKVToolNixGUI automatically applies the delay when muxing.
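For scripted muxing, the delay can be read back out of a file name like the one above. The following Python sketch is only modeled on that example name; the regular expression, the handling of negative values and the function name are assumptions, not a description of how MKVToolNix itself parses names.

```python
import re

# Matches e.g. "DELAY 42ms" or "DELAY -80ms" anywhere in the file name.
DELAY_RE = re.compile(r"DELAY\s+(-?\d+)\s*ms", re.IGNORECASE)


def delay_ms_from_name(filename):
    """Return the delay in milliseconds found in the file name, or 0 if none."""
    match = DELAY_RE.search(filename)
    return int(match.group(1)) if match else 0


if __name__ == "__main__":
    print(delay_ms_from_name("S02E01_track2_[eng]_DELAY 42ms.ac3"))
```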

There's also a plugin giving foobar2000 the ability to open the audio in Avisynth scripts, so you can play and/or encode it. For anything not supported by foobar2000 directly, there's a plugin for decoding with ffmpeg. You have to configure it for the formats you want it to open, but it means foobar2000 can open any audio supported by ffmpeg, in any container it supports.

...just boost the volume and limit it before it's compressed.

Originally Posted by abolibibelot

Here do you mean dynamic compression, or compression of the data to create the output file ?

I was trying to compress a recording from a smartphone recently. There were lots of pops and loud crackling (already at maximum volume), while the voices were very quiet (the phone was hidden in a bag). So I gave the volume something like a 15dB boost, followed by a hard limiter to prevent clipping, then EQ'd and compressed it. It worked quite well, and something similar would probably work for other types of audio with huge volume fluctuations. You can only do so much with compression.

The great thing about foobar2000's DSP processing is you can use the same DSPs and DSP configurations for both playback and conversion. So you can add DSPs to the playback chain, adjust them while you listen to the audio, save the configuration as a preset and use it for converting. The encoder and DSP presets can then be saved together as a conversion preset.

This is the playback DSP manager with the amplify DSP open for configuration, followed by a limiter and then EQ.
The converter has its own DSP manager, but they share presets. When converting, the audio is decoded, optionally sent to the ReplayGain processor to have the volume adjusted according to any ReplayGain info in the file, then to the DSP manager if any DSPs are loaded, and finally to the encoder, which in this case was ffmpeg for compression with the DAN, then piped to QAAC for encoding.

[Attached image: 1.jpg (the playback DSP manager with the Amplify DSP open)]

Last edited by hello_hello; 13th Aug 2020 at 21:11.



12th Aug 2020 19:34 #8

abolibibelot

Member

The reality of it is, it's supposed to be an audio player. It also just happens to be a very good converter.

That I know, but at this point it would be an almost trivial addition... I read somewhere that any software being developed for a sufficient amount of time always ends up being able to send e-mail messages...

So I gave the volume something like a 15dB boost, followed by a hard limiter to prevent clipping, then EQ'd and compressed it. It worked quite well, and something similar would probably work for other types of audio with huge volume fluctuations. You can only do so much with compression.

Why is that, I mean, why is compression “limited” in a case like this, as compared with the seemingly more rudimentary method of amplifying everything then applying a hard limiter ? And in this case, how did you configure the Equalizer to boost vocals in a recording with lots of unwanted noises ?


13th Aug 2020 02:00 #9

hello_hello

Member

Originally Posted by abolibibelot

Why is that, I mean, why is compression “limited” in a case like this, as compared with the seemingly more rudimentary method of amplifying everything then applying a hard limiter ?

Compressors sometimes have limiters built in. Technically, a traditional compressor can double as a limiter if you set the compression ratio quite high and the attack time quite fast, but often you want more gentle compression while still hard limiting loud peaks. Here's a compressor with both. The forum software will probably resize the pic, so you might need to save it to read the labels.

The DAN works the opposite way to a traditional compressor, boosting the quiet parts instead of reducing the loud bits, and in the digital world it also has the advantage of being able to look ahead. The end result can be much the same, but by default the maximum amount the DAN can increase the volume is 10dB. You can change that, but everything's a trade-off. It's also affected by how quickly it responds. If it's too fast you'll probably hear "volume pumping" (low background sounds suddenly increasing in volume when there's no foreground sound such as someone talking, and quickly decreasing when there is a foreground sound). For a traditional compressor, if it responds too slowly a transient peak can come and go before it has time to react, and for the DAN's type of "compression", too slow means it can still be amplifying during a transient peak and actually make it louder rather than quieter. If you allow it to amplify more, it can make those problems worse, but if you configure it to amplify less, it won't "compress" as much. It's all a compromise.
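The trade-off can be illustrated with a deliberately simplified Python sketch. This is not the Dynamic Audio Normalizer's actual algorithm: a plain moving average stands in for its smoothing filter, and the function name, `target_peak` and `smooth` parameters are hypothetical. It only shows the general shape: a gain per frame that pushes the frame's peak towards a target, a cap on the maximum boost (10dB here, as mentioned above), and smoothing across neighbouring frames.

```python
def frame_gains(samples, frame_len, target_peak=0.95, max_gain_db=10.0, smooth=3):
    """Return one gain factor per frame of frame_len samples.

    Each frame's peak is pushed towards target_peak, the boost is capped
    at max_gain_db, and the gains are then averaged over `smooth`
    neighbouring frames (a stand-in for the real smoothing filter).
    """
    max_gain = 10 ** (max_gain_db / 20)      # dB to linear factor
    raw = []
    for start in range(0, len(samples), frame_len):
        peak = max(abs(s) for s in samples[start:start + frame_len])
        raw.append(min(target_peak / peak, max_gain) if peak > 0 else max_gain)
    half = smooth // 2
    smoothed = []
    for i in range(len(raw)):
        window = raw[max(0, i - half):i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

With a wider `smooth` window, the high gain of a quiet frame bleeds into a loud neighbour, which is the "still amplifying during a transient peak" effect; with a narrow window the gain jumps from frame to frame, which is where pumping comes from.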

There's a couple of examples in this zip file (created for another thread). There are two DAN samples included. One amplifies more than 10dB and uses a larger window for determining the volume so the response is much slower. The speech at the beginning is louder than the DAN sample using my usual settings, but just before the section where the music starts, the level of speech drops quite a bit, because it's looking far enough ahead for the louder section that follows to affect how much it's increasing the volume. The first loud peak that follows (an explosion) is slightly louder than the downmixed version without compression.

"DownMix Only Matrix Mixer.flac" is a version downmixed without compression.
"f=150 b=1.flac" is the sample compressed with the settings I normally use.
"f=2000 g=23 m=15 b=1.flac" is the sample using a greater maximum amplification and a larger window.

Anyway... for the audio I referred to earlier, the pops and crackles were often at maximum volume, while the speech was probably -30dB. The DAN's default 10dB maximum amplification wouldn't have been enough, but rather than mess with it I increased the volume by 15dB and followed it with a peak limiter. That reduced the dynamic range of the pops and crackles by 15dB (they were mostly at maximum already), and brought the speech up to about -15dB. That way the DAN could compress more gently, and therefore less obviously. I guess what I've taken the long road to say, is a compressor can reduce the dynamic range (compress) or hard limit the peaks, but especially for very dynamic audio, it's hard for it to do both jobs well at the same time.
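The arithmetic behind that can be sketched in a few lines of Python. The function names are made up for illustration and samples are assumed to be floats with full scale at 1.0. A plain clamp stands in for the limiter here, which is an assumption for brevity: a real limiter such as foobar2000's Advanced Limiter shapes the gain over time rather than simply chopping the samples.

```python
import math


def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def gain_to_db(gain):
    """Convert a linear amplitude (relative to full scale) to dB."""
    return 20 * math.log10(gain)


def boost_and_limit(samples, boost_db=15.0, ceiling=1.0):
    """Amplify by boost_db, then clamp to +/- ceiling (crude limiter stand-in)."""
    gain = db_to_gain(boost_db)
    return [max(-ceiling, min(ceiling, s * gain)) for s in samples]


if __name__ == "__main__":
    speech = db_to_gain(-30)                 # speech peaking around -30dB
    pop = 1.0                                # a pop already at full scale
    out = boost_and_limit([speech, pop])
    print(round(gain_to_db(out[0]), 2), out[1])
```

The speech sample ends up 15dB higher, while the pop stays at full scale, so the gap between them shrinks by 15dB before the normalizer ever sees the audio.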

Originally Posted by abolibibelot

And in this case, how did you configure the Equalizer to boost vocals in a recording with lots of unwanted noises ?

Nothing fancy. It wasn't so much to reduce the pops and crackles, but to reduce the constant low frequency hum. It wasn't overly loud, but made it harder to listen to the speech, especially with the DAN amplifying it. It was just a quick way to get rid of it. The idea wasn't to achieve great quality as such, just to make the speech more intelligible.

[Attached image: Clipboard01.jpg (screenshot of a compressor with a built-in limiter)]

Last edited by hello_hello; 13th Aug 2020 at 02:41.

