getting audio all at same volume

15th Mar 2017 02:35 #1
thecrock

Hi all
I have been given a movie file and asked to clean up the sound. The volume is low in some places and high in others, and I need a quick way to bring the audio to a constant volume. Is there a quick way to do this in Premiere Pro? I also have Audacity and Adobe Audition.

Thanks in advance.

“He who makes a beast of himself gets rid of the pain of being a man.”

15th Mar 2017 02:59 #2
manono

Just use Levelator on it.

15th Mar 2017 06:17 #3
pandy

Should work OK.

Code:

@ffmpeg -y -i "%1" -vn -c:a pcm_f32le -f wav -af "dynaudnorm=p=1/sqrt(2):m=100:s=12:g=15" "%~n1_dn.wav"
19th Apr 2017 01:16 #4
rowjekto

Does this code work on Windows XP?

19th Apr 2017 03:51 #5
hello_hello

It does if you use an XP-friendly build of ffmpeg:

https://sourceforge.net/projects/ffmpegwindowsbi/?source=navbar

19th Apr 2017 05:10 #6
rowjekto

Thanks.
I used this command line, based on the code above and another one:

Code:

ffmpeg -i input.mkv -af "dynaudnorm=p=0.71:m=100:s=12:g=15" -vcodec copy output.mkv

and had the following issues:
s=12 causes FFmpeg to stop after just a few seconds. Other values of s gave the same result, so I had to remove the option.
The volume is not always constant, even within a single stretch of dialogue. (The problematic input doesn't have this issue within the same dialogue.)
Overall, the result is all right: it evens out the loud parts and the quiet parts pretty well.
Last edited by rowjekto; 19th Apr 2017 at 05:21.
19th Apr 2017 06:12 #7
hello_hello

Try it like this:

"dynaudnorm=f=150"

or

"dynaudnorm=f=150:g=15"

19th Apr 2017 08:20 #8
rowjekto

The result is better now. Thanks again!

Last edited by rowjekto; 19th Apr 2017 at 08:41.

20th Apr 2017 09:30 #9
rowjekto

Another issue I noticed: sometimes, when there is background sound and talking at the same time, the talking is too quiet.

22nd Apr 2017 00:37 #10
hello_hello

I haven't used the Dynamic Audio Normalizer a lot. I use RockSteady regularly (on playback), but the principle is the same. I almost never use "compression" on multi-channel audio without downmixing to stereo first (I don't know what you have), so I can only guess.

About the only time I've noticed background sounds having much of an adverse effect on foreground speech is when they contain a lot of low-frequency content. Depending on your sound system it might not seem loud, but the effect can be an apparent reduction in speech level compared with speech on its own. If you're downmixing to stereo, try leaving out the LFE channel (I never include it), or if there's a lot of low frequency in the background sounds, try filtering some of it out first.

You could also try reducing the frame size even further. Try a few different values to see whether it helps or makes things worse, e.g. "dynaudnorm=f=75". Or use the "m" option to increase the maximum amplification. The default is m=10; maybe try 15 or 20. You might need to experiment, or even compress/filter problem sections of the video differently.
The list of options is here: https://ffmpeg.org/ffmpeg-all.html#dynaudnorm
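
To make that sort of experimenting less tedious, here is a minimal Python sketch that only builds one ffmpeg command per combination of frame length (f) and maximum gain (m). The input and output file names and the particular values are placeholders I've made up, and it assumes ffmpeg is on the PATH. It prints the command lines rather than running them.

```python
from itertools import product


def dynaudnorm_filter(f, m=None, g=None):
    """Build a dynaudnorm filter string such as 'dynaudnorm=f=75:m=15'."""
    opts = [f"f={f}"]
    if m is not None:
        opts.append(f"m={m}")
    if g is not None:
        opts.append(f"g={g}")
    return "dynaudnorm=" + ":".join(opts)


def trial_commands(src, frame_lengths, max_gains):
    """Return one ffmpeg argument list per f/m combination."""
    commands = []
    for f, m in product(frame_lengths, max_gains):
        out = f"test_f{f}_m{m}.flac"  # hypothetical output name
        commands.append(
            ["ffmpeg", "-i", src, "-vn", "-af", dynaudnorm_filter(f, m), out]
        )
    return commands


if __name__ == "__main__":
    # "input.mkv" is a placeholder, not a file from this thread
    for cmd in trial_commands("input.mkv", [75, 150], [10, 15, 20]):
        print(" ".join(cmd))
```

Listening to the short test files side by side is probably the quickest way to pick values for a given source.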

A problem with compression is "volume pumping", which causes the volume of background sounds (say, in between speech) to increase and decrease noticeably as the speech comes and goes. Sometimes by the time the speech is the perfect level, the background sounds will be "pumping" away. If your TV has a night mode for the audio, put it on maximum and have a listen.

Other than that, maybe post a couple of samples of the source (before compression). One where the speech is okay and one where it seems to end up a bit quiet. Someone may be able to come up with settings that work better.

Last edited by hello_hello; 22nd Apr 2017 at 00:44.

24th Apr 2017 01:40 #11
rowjekto

Originally Posted by hello_hello

I almost never use "compression" with multi-channel audio without downmixing to stereo first

Is it necessary to do that with Dynamic Audio Normalizer?

Also, I'm interested in adding input speed up (25/23.976) to the command line. What command line should I use?

24th Apr 2017 07:53 #12
hydra3333

I suppose if you wanted to use ffmpeg's loudnorm to adjust audio loudness, you could do it a bit like this (extracted from a .bat and modified slightly to use a fixed filename).
http://k.ylo.ph/2016/04/04/loudnorm.html
https://ffmpeg.org/ffmpeg-filters.html#loudnorm

EBU R128 loudness normalization. Includes both dynamic and linear normalization modes. Support for both single pass (livestreams, files) and double pass (files) modes. This algorithm can target IL, LRA, and maximum true peak.

Code:

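@echo on
REM Delayed expansion is needed for the !var! references used below
setlocal EnableDelayedExpansion
REM %ffmpegexex64% (path to ffmpeg.exe) and %xpause% are assumed to be set elsewhere in the original .bat
set inp=.\filename.mpg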
set tempaudio=%inp%.aac.mp4
SET jsonFile=%inp%.json
SET lI=-16
SET lTP=0.0
SET lLRA=11
"%ffmpegexex64%" -threads 0 -nostats -nostdin -y -hide_banner -i "%inp%" -vn -threads 0 -af loudnorm=I=%lI%:TP=%lTP%:LRA=%lLRA%:print_format=json -f null - 2> "%jsonFile%"  
SET EL=!ERRORLEVEL!
IF /I "!EL!" NEQ "0" (
   Echo *********  Error !EL! was found 
   Echo *********  Error !EL! was found 
   Echo *********  Error !EL! was found 
   Echo *********  ABORTING ... 
   %xpause%
   EXIT !EL!
)
REM all the trickery below is simply to remove quotes and tabs and spaces from the json single-level response
set input_i=
set input_tp=
set input_lra=
set input_thresh=
set target_offset=
for /f "tokens=1,2 delims=:, " %%a in (' find ":" ^< "%jsonFile%" ') do (
   set "var="
   for %%c in (%%~a) do set "var=!var!,%%~c"
   set var=!var:~1!
   set "val="
   for %%d in (%%~b) do set "val=!val!,%%~d"
   set val=!val:~1!
REM   echo .!var!.=.!val!.
   IF "!var!" == "input_i"         set !var!=!val!
   IF "!var!" == "input_tp"        set !var!=!val!
   IF "!var!" == "input_lra"       set !var!=!val!
   IF "!var!" == "input_thresh"    set !var!=!val!
   IF "!var!" == "target_offset"   set !var!=!val!
)
echo input_i=%input_i% 
echo input_tp=%input_tp% 
echo input_lra=%input_lra% 
echo input_thresh=%input_thresh% 
echo target_offset=%target_offset% 
REM
REM later, in the second encoding pass, we MUST down-convert from 192k (loudnorm upsamples to 192k, which is far too high): use -ar 48k or -ar 48000
REM
set loudnormfilter=loudnorm=I=%lI%:TP=%lTP%:LRA=%lLRA%:measured_I=%input_i%:measured_LRA=%input_lra%:measured_TP=%input_tp%:measured_thresh=%input_thresh%:offset=%target_offset%:linear=true:print_format=summary
ECHO --------------------------------------------------------------------------------------------
ECHO --------------------------------------------------------------------------------------------
set audiofreq=48000
set audiobitrate=384k
"%ffmpegexex64%" -threads 0 -i "%inp%" -vn -threads 0 -map_metadata -1 -af %loudnormfilter% -c:a libfdk_aac -cutoff 18000 -ab %audiobitrate% -ar %audiofreq% -y "%tempaudio%" 
SET EL=!ERRORLEVEL!
IF /I "!EL!" NEQ "0" (
   Echo *********  Error !EL! was found 
   Echo *********  ABORTING ... 
   EXIT !EL!
)

pause
exit

Last edited by hydra3333; 24th Apr 2017 at 08:04.
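
For anyone who would rather not do the JSON parsing in batch, the same two-pass idea can be sketched in Python. This is only a sketch under assumptions: the sample text is made up (not from a real file), the function names are my own, and it assumes the JSON object is the last {...} block in ffmpeg's stderr from the first pass. It only builds the second-pass filter string; it doesn't run ffmpeg.

```python
import json

# Made-up example of how ffmpeg's stderr might end after a first pass with
# loudnorm=...:print_format=json (the numbers are not from a real file).
SAMPLE_STDERR = """
[Parsed_loudnorm_0 @ 0000000000000000]
{
    "input_i" : "-27.61",
    "input_tp" : "-4.47",
    "input_lra" : "18.06",
    "input_thresh" : "-39.20",
    "output_i" : "-16.58",
    "output_tp" : "-1.50",
    "output_lra" : "14.78",
    "output_thresh" : "-27.71",
    "normalization_type" : "dynamic",
    "target_offset" : "0.58"
}
"""


def parse_loudnorm_json(stderr_text):
    """Extract the flat JSON object assumed to be the last {...} in the text."""
    start = stderr_text.rindex("{")
    end = stderr_text.rindex("}") + 1
    return json.loads(stderr_text[start:end])


def second_pass_filter(stats, i=-16, tp=0.0, lra=11):
    """Build the second-pass loudnorm filter from the first-pass measurements."""
    return (
        f"loudnorm=I={i}:TP={tp}:LRA={lra}"
        f":measured_I={stats['input_i']}"
        f":measured_LRA={stats['input_lra']}"
        f":measured_TP={stats['input_tp']}"
        f":measured_thresh={stats['input_thresh']}"
        f":offset={stats['target_offset']}"
        ":linear=true:print_format=summary"
    )


if __name__ == "__main__":
    print(second_pass_filter(parse_loudnorm_json(SAMPLE_STDERR)))
```

The defaults for I, TP and LRA are the same targets as in the batch file above (-16, 0.0 and 11).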


24th Apr 2017 08:47 #13
hello_hello

Originally Posted by rowjekto

Originally Posted by hello_hello

I almost never use "compression" with multi-channel audio without downmixing to stereo first

Is it necessary to do that with Dynamic Audio Normalizer?

I don't think so.

n
Enable channels coupling. By default is enabled. By default, the Dynamic Audio Normalizer will amplify all channels by the same amount. This means the same gain factor will be applied to all channels, i.e. the maximum possible gain factor is determined by the "loudest" channel. However, in some recordings, it may happen that the volume of the different channels is uneven, e.g. one channel may be "quieter" than the other one(s). In this case, this option can be used to disable the channel coupling. This way, the gain factor will be determined independently for each channel, depending only on the individual channel’s highest magnitude sample. This allows for harmonizing the volume of the different channels.

Originally Posted by rowjekto

Also, I'm interested in adding input speed up (25/23.976) to the command line. What command line should I use?

I've never used ffmpeg to speed up or slow down audio, as I generally use Avisynth for that (I don't use ffmpeg a great deal myself), but I tried to work it out:

-af atempo='1001/960'

However, atempo doesn't alter the audio pitch. That's great if it's what you want, but it's the opposite of what I'd call "normal": the pitch is rarely corrected when 23.976 is sped up for PAL, so if you're converting PAL back to NTSC you'd generally want the same lack of pitch correction, so the pitch ends up the way it was originally. Unless, of course, you know you need to correct the pitch because it's your own video/audio, etc.

-af rubberband=tempo='1001/960':pitch='1001/960'

or, without changing the pitch:

-af rubberband=tempo='1001/960'

The first of those seems to do the job of adjusting the duration without correcting the pitch (so it's raised or lowered along with the speed), and the following full command line seems to work fine (I'm terrible with ffmpeg syntax). Maybe there's a better way?

"C:\Program Files\ffmpeg\ffmpeg.exe" -report -i "E:\input.wav" -y -threads 1 -vn -af "dynaudnorm=f=150,rubberband=tempo='1001/960':pitch='1001/960'" -acodec flac -sn "E:\output.flac"

I don't know how much difference it makes, but 1001/960 is probably technically more accurate than 25/23.976 etc.

23.976 -> 25 speedup - 1001/960 should be the same as 25 / (24000 / 1001)
25 -> 23.976 slowdown - 960/1001 should be the same as (24000 / 1001) / 25
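
Those two ratios are easy to check with exact fractions. A small Python sketch, where the only assumption is that "23.976" stands for exactly 24000/1001:

```python
from fractions import Fraction

NTSC_FILM = Fraction(24000, 1001)  # the exact rate that "23.976" abbreviates
PAL = Fraction(25)

speedup = PAL / NTSC_FILM    # 23.976 -> 25
slowdown = NTSC_FILM / PAL   # 25 -> 23.976

print(speedup)    # 1001/960
print(slowdown)   # 960/1001

# The rounded decimal 23.976 gives a very slightly different ratio
print(float(speedup), 25 / 23.976)
```

The difference between the exact ratio and 25/23.976 is tiny, but using 1001/960 avoids carrying the rounding error around.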

Oh well.... I learned something new for future reference. It'll probably come in handy at times, rather than using Avisynth. I'll save that as an ffmpeg conversion preset in foobar2000. Anyone know what the quality is like compared to using Avisynth? Thinking about it, ffmpeg generally accepts avs scripts as input these days, doesn't it? Something like the way MeGUI does it:

# 23.976 -> 25 without pitch correction
LoadPlugin("C:\Program Files\MeGUI\tools\lsmash\LSMASHSource.dll")
LWLibavAudioSource("E:\Input.wav")
AssumeSampleRate(Round((AudioRate()*1001.0)/960.0)).SSRC(AudioRate())
AudioBits(last)>24?ConvertAudioTo24bit(last):last
return last

or

# 25 -> 23.976 without pitch correction
LoadPlugin("C:\Program Files\MeGUI\tools\lsmash\LSMASHSource.dll")
LWLibavAudioSource("E:\Input.wav")
SSRC(Round((AudioRate()*1001.0)/960.0)).AssumeSampleRate(AudioRate())
AudioBits(last)>24?ConvertAudioTo24bit(last):last
return last
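
To see what the sample-rate trick in those scripts works out to in numbers, here is a small Python sketch of the arithmetic only (it doesn't touch any audio). The function names and the example rates and durations are my own choices for illustration.

```python
import math
from fractions import Fraction

RATIO = Fraction(1001, 960)  # 23.976 -> 25 speedup factor


def assumed_rate(sample_rate):
    """Rate handed to AssumeSampleRate before resampling back to the original."""
    return round(sample_rate * RATIO)


def sped_up_duration(seconds):
    """Duration after the 23.976 -> 25 speedup."""
    return seconds / float(RATIO)


def pitch_shift_semitones():
    """Pitch change when the speedup is done without pitch correction."""
    return 12 * math.log2(float(RATIO))


if __name__ == "__main__":
    print(assumed_rate(48000))        # 50050
    print(assumed_rate(44100))        # 45983
    print(sped_up_duration(6000))     # 100 minutes becomes roughly 95.9 minutes
    print(pitch_shift_semitones())    # a bit under three quarters of a semitone
```

For 48 kHz audio the ratio lands on a whole number (50050), so the Round() in the script only matters for rates like 44.1 kHz.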

Last edited by hello_hello; 24th Apr 2017 at 12:13.

24th Apr 2017 12:11 #14
hello_hello

I ran some test encodes. Am I missing something, or is there a problem with "rubber band", or does the tempo adjustment on its own just suck?

01 Original
02 ffmpeg -af "atempo='1001/960'"
03 ffmpeg -af "rubberband=tempo='1001/960'"
04 ffmpeg -af "rubberband=tempo='1001/960': pitch='1001/960'"
05 Avisynth (no pitch correction) - script from previous post
06 Avisynth (with pitch correction) - TimeStretchPlugin(tempo=(1001.0/9.6))

Edit: Added Avisynth samples

Attached Files

01 original.flac (5.92 MB, 677 views)

02 1001-960 atempo.flac (5.65 MB, 444 views)

03 1001-960 tempo only - rubber band.flac (5.55 MB, 444 views)

04 1001-960 pitch and tempo - rubber band.flac (5.80 MB, 441 views)

05 Avisynth (no pitch correction).flac (5.83 MB, 436 views)

06 Avisynth (with pitch correction).flac (5.67 MB, 442 views)
Last edited by hello_hello; 24th Apr 2017 at 12:37.
25th May 2017 08:39 #15
rowjekto

Originally Posted by hello_hello

"C:\Program Files\ffmpeg\ffmpeg.exe" -report -i "E:\input.wav" -y -threads 1 -vn -af "dynaudnorm=f=150,rubberband=tempo='1001/960':pitch='1001/960'" -acodec flac -sn "E:\output.flac"

Does the command line order between dynaudnorm and rubberband matter?

25th May 2017 21:01 #16
hello_hello

Originally Posted by rowjekto

Originally Posted by hello_hello

"C:\Program Files\ffmpeg\ffmpeg.exe" -report -i "E:\input.wav" -y -threads 1 -vn -af "dynaudnorm=f=150,rubberband=tempo='1001/960':pitch='1001/960'" -acodec flac -sn "E:\output.flac"

Does the command line order between dynaudnorm and rubberband matter?

I've no idea to be honest. Those tests are the only times I've used Rubberband and I haven't gone back to it. I don't do audio speedup/slowdown much so when I do, I just do it with MeGUI/Avisynth, or sometimes with foobar2000/Avisynth.
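
For what it's worth, as far as I'm aware ffmpeg only uses the last -af when it's given more than once for the same output stream, so both filters need to go into a single comma-separated chain, and within a chain they run left to right. A minimal Python sketch of building the argument either way round (the helper name and file names are placeholders of mine, and nothing is executed):

```python
def audio_filter_args(*filters):
    """Join filters into one -af argument; the chain is applied left to right."""
    return ["-af", ",".join(filters)]


normalise = "dynaudnorm=f=150"
speedup = "rubberband=tempo='1001/960':pitch='1001/960'"

# Normalise first, then change the speed
normalise_first = audio_filter_args(normalise, speedup)

# Change the speed first, then normalise the result
speedup_first = audio_filter_args(speedup, normalise)

if __name__ == "__main__":
    # "input.wav" and "output.flac" are hypothetical file names
    for af in (normalise_first, speedup_first):
        print(" ".join(["ffmpeg", "-i", "input.wav", "-vn", *af, "output.flac"]))
```

Which order sounds better is something I'd expect to need a listening test; the sketch only shows how the order is expressed.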

2nd Jul 2017 00:43 #17
rowjekto

Is it possible to make a movie's quiet parts (voices) louder than the loud parts (background sounds)?

2nd Jul 2017 08:24 #18
sneaker

Is your source stereo or with a center channel (5.1/6.1/etc.)?

2nd Jul 2017 20:54 #19
rowjekto

With a center channel.

3rd Jul 2017 09:33 #20
hello_hello

Assuming the dialogue is mainly in the centre channel you could give it a boost, but if you compress the audio as a whole there's probably not much point because compression tries to squish everything to the same level. At least that's what I've found when compressing after downmixing to stereo. Boosting the centre channel first doesn't help much. Compressing multichannel audio might be a different story (I never do it myself).

I find the compressors based on R128 scanning work quite well (LoudNorm for ffmpeg, or there's a foobar2000 DSP, etc.), but they're a little slow to respond to volume changes. I prefer using something like the Dynamic Audio Normalizer. It's built into ffmpeg these days, and the following in the ffmpeg command line works pretty well (at least for stereo):

-af dynaudnorm=f=150

or, with a higher chance of volume "pumping", you can make it respond even faster:

-af dynaudnorm=f=75:g=11

2nd Sep 2017 07:04 #21
hydra3333

Update regarding post #12:
https://forum.doom9.org/showthread.php?p=1817171#post1817171

I'll probably change it to also use dynaudnorm=f=150
