I have been given a movie file and asked to clean up the sound. The volume is low in some places and high in others, and I need a quick way to bring the audio to a constant level. Is there a quick way to do this in Premiere Pro? I also have Audacity and Adobe Audition.
thanks in advance
“He who makes a beast of himself gets rid of the pain of being a man.”
Should work OK.
@ffmpeg -y -i "%1" -vn -c:a pcm_f32le -f wav -af "dynaudnorm=p=1/sqrt(2):m=100:s=12:g=15" "%~n1_dn.wav"
Does this code work on Windows XP?
I used this command line, based on the code above and another one
ffmpeg -i input.mkv -af "dynaudnorm=p=0.71:m=100:s=12:g=15" -vcodec copy output.mkv
s=12 causes FFmpeg to stop after just a few seconds. When I tried to set other values to s - the result was the same, so I had to remove it.
The volume is not always constant, even within the same stretch of dialogue. (The problematic input doesn't have this issue in the same dialogue.)
Overall, the result is alright - it harmonizes the loud parts and the quiet parts pretty well.
Last edited by rowjekto; 19th Apr 2017 at 05:21.
Try it like this:
The result is better now. Thanks again!
Last edited by rowjekto; 19th Apr 2017 at 08:41.
Another issue I noticed: sometimes, when there is background sound and talking at the same time, the talking volume is too low.
I haven't used Dynamic Audio Normaliser a lot. I use RockSteady regularly (on playback) but the principle is the same. I almost never use "compression" with multi-channel audio without downmixing to stereo first (I don't know what you have), so I can only guess.
About the only time I've noticed background sounds having too much of an adverse effect on foreground speech is when they contain a lot of low frequency content. Depending on your sound system it mightn't appear to be loud, but the effect can be an apparent reduction of the speech level compared to speech on its own. If you're downmixing to stereo, try not including the LFE channel (I never do), or if there is a lot of low frequency in the background sounds, try filtering a bit out first.
You could also try reducing the frame size even further. Try a few different values to see if it helps or makes it worse, e.g. "dynaudnorm=f=75". Or use the "m" option to increase the maximum amplification. The default is m=10. Maybe try 15 or 20. You might need to experiment, or even have to try compressing/filtering problem sections of the video differently.
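For the downmix idea, here's a rough Python sketch that just builds an ffmpeg filter string. It doesn't run ffmpeg. It assumes the source uses the 5.1 layout FL FR FC LFE BL BR, and the function name, the 0.707 gains and the 100 Hz cut-off are my own placeholders, not anything from the posts above. The LFE channel is simply left out of the pan expression, and the optional highpass is one way of "filtering a bit out first".

```python
def stereo_downmix_filter(center_gain=0.707, surround_gain=0.707, highpass_hz=None):
    """Build an ffmpeg -af chain that downmixes 5.1 to stereo without the LFE channel.

    Assumption: the input layout is FL FR FC LFE BL BR. The gains are illustrative.
    """
    # '<' instead of '=' asks the pan filter to renormalise the gains to avoid clipping.
    left = f"FL<FL+{center_gain}*FC+{surround_gain}*BL"
    right = f"FR<FR+{center_gain}*FC+{surround_gain}*BR"
    filters = [f"pan=stereo|{left}|{right}"]
    if highpass_hz is not None:
        # Optionally trim low-frequency rumble before any compression is applied.
        filters.append(f"highpass=f={highpass_hz}")
    return ",".join(filters)


# Example: downmix without LFE, cut below 100 Hz, then normalise.
chain = stereo_downmix_filter(highpass_hz=100) + ",dynaudnorm=f=150"
print(f'ffmpeg -i input.mkv -af "{chain}" -c:v copy output.mkv')
```

Raising center_gain is also a cheap way to experiment with a dialogue boost before the compression stage, though whether it helps will depend on the source.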
The list of options is here: https://ffmpeg.org/ffmpeg-all.html#dynaudnorm
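If you want to try several frame sizes and maximum gains in one go, something like this Python sketch could generate the command lines for you. It only prints them. The value lists, the function name and the input/output filenames are placeholders I picked for illustration; f is the frame length in milliseconds and m is the maximum gain, as in the options list linked above.

```python
from itertools import product


def dynaudnorm_variants(frame_lengths_ms=(500, 150, 75), max_gains=(10, 15, 20)):
    """Return one dynaudnorm filter string per (frame length, max gain) combination."""
    return [f"dynaudnorm=f={f}:m={m}" for f, m in product(frame_lengths_ms, max_gains)]


# Print one test command per variant; the file names are hypothetical.
for index, af in enumerate(dynaudnorm_variants(), start=1):
    print(f'ffmpeg -i input.mkv -af "{af}" -c:v copy test_{index:02d}.mkv')
```

Listening to a short problem section encoded with each variant is probably quicker than re-encoding the whole movie every time.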
A problem with compression is "volume pumping", which causes the volume of background sounds (say, in between speech) to increase and decrease noticeably as the speech comes and goes. Sometimes by the time the speech is the perfect level, the background sounds will be "pumping" away. If your TV has a night mode for the audio, put it on maximum and have a listen.
Other than that, maybe post a couple of samples of the source (before compression). One where the speech is okay and one where it seems to end up a bit quiet. Someone may be able to come up with settings that work better.
Last edited by hello_hello; 22nd Apr 2017 at 00:44.
I suppose if you wanted to use ffmpeg's loudnorm to adjust audio loudness, you could do so a bit like this (extracted from a .bat and modified slightly to use a fixed filename)
EBU R128 loudness normalization. Includes both dynamic and linear normalization modes. Support for both single pass (livestreams, files) and double pass (files) modes. This algorithm can target IL, LRA, and maximum true peak.
@echo on
REM delayed expansion is needed for the !VAR! syntax used below
setlocal EnableDelayedExpansion
REM ffmpegexex64 (path to ffmpeg.exe) and xpause are assumed to be set earlier in the .bat
set inp=.\filename.mpg
set tempaudio=%inp%.aac.mp4
SET jsonFile=%inp%.json
SET lI=-16
SET lTP=0.0
SET lLRA=11
REM first pass: measure only, loudnorm writes its JSON report to stderr
"%ffmpegexex64%" -threads 0 -nostats -nostdin -y -hide_banner -i "%inp%" -vn -threads 0 -af loudnorm=I=%lI%:TP=%lTP%:LRA=%lLRA%:print_format=json -f null - 2> "%jsonFile%"
SET EL=!ERRORLEVEL!
IF /I "!EL!" NEQ "0" (
   Echo ********* Error !EL! was found
   Echo ********* Error !EL! was found
   Echo ********* Error !EL! was found
   Echo ********* ABORTING ...
   %xpause%
   EXIT !EL!
)
REM all the trickery below is simply to remove quotes and tabs and spaces from the json single-level response
set input_i=
set input_tp=
set input_lra=
set input_thresh=
set target_offset=
for /f "tokens=1,2 delims=:, " %%a in (' find ":" ^< "%jsonFile%" ') do (
   set "var="
   for %%c in (%%~a) do set "var=!var!,%%~c"
   set var=!var:~1!
   set "val="
   for %%d in (%%~b) do set "val=!val!,%%~d"
   set val=!val:~1!
   REM echo .!var!.=.!val!.
   IF "!var!" == "input_i" set !var!=!val!
   IF "!var!" == "input_tp" set !var!=!val!
   IF "!var!" == "input_lra" set !var!=!val!
   IF "!var!" == "input_thresh" set !var!=!val!
   IF "!var!" == "target_offset" set !var!=!val!
)
echo input_i=%input_i%
echo input_tp=%input_tp%
echo input_lra=%input_lra%
echo input_thresh=%input_thresh%
echo target_offset=%target_offset%
REM
REM later, in a second encoding pass we MUST down-convert from 192k (loudnorm upsamples it to 192k which is way way too high ... use -ar 48k or -ar 48000
REM
set loudnormfilter=loudnorm=I=%lI%:TP=%lTP%:LRA=%lLRA%:measured_I=%input_i%:measured_LRA=%input_lra%:measured_TP=%input_tp%:measured_thresh=%input_thresh%:offset=%target_offset%:linear=true:print_format=summary
ECHO --------------------------------------------------------------------------------------------
ECHO --------------------------------------------------------------------------------------------
set audiofreq=48000
set audiobitrate=384k
REM second pass: apply the measured values and encode
"%ffmpegexex64%" -threads 0 -i "%inp%" -vn -threads 0 -map_metadata -1 -af %loudnormfilter% -c:a libfdk_aac -cutoff 18000 -ab %audiobitrate% -ar %audiofreq% -y "%tempaudio%"
SET EL=!ERRORLEVEL!
IF /I "!EL!" NEQ "0" (
   Echo ********* Error !EL! was found
   Echo ********* ABORTING ...
   EXIT !EL!
)
pause
exit
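For anyone who'd rather avoid the batch string trickery, the same two-pass idea can be sketched in Python: pull the JSON report out of the first pass's stderr text, then build the second-pass loudnorm string from it. This is only a sketch. The function names are mine, and the sample report below uses made-up numbers purely to show the shape of the data (loudnorm reports the values as strings).

```python
import json


def parse_loudnorm_json(stderr_text):
    """Extract the flat JSON object that loudnorm prints at the end of ffmpeg's stderr."""
    start = stderr_text.rindex("{")
    end = stderr_text.rindex("}") + 1
    return json.loads(stderr_text[start:end])


def second_pass_filter(measured, target_i=-16, target_tp=0.0, target_lra=11):
    """Build the second-pass loudnorm filter from the first-pass measurements."""
    return (
        f"loudnorm=I={target_i}:TP={target_tp}:LRA={target_lra}"
        f":measured_I={measured['input_i']}"
        f":measured_LRA={measured['input_lra']}"
        f":measured_TP={measured['input_tp']}"
        f":measured_thresh={measured['input_thresh']}"
        f":offset={measured['target_offset']}"
        ":linear=true:print_format=summary"
    )


# Made-up sample of what the end of the first pass's stderr might look like.
sample_stderr = """
(other ffmpeg log lines)
{
    "input_i" : "-27.61",
    "input_tp" : "-4.47",
    "input_lra" : "18.06",
    "input_thresh" : "-39.20",
    "target_offset" : "0.58"
}
"""

measured = parse_loudnorm_json(sample_stderr)
print(second_pass_filter(measured))
```

In real use the stderr text would come from running the first pass (for example with subprocess.run and stderr captured as text), and the resulting string would go after -af in the second pass along with -ar 48000.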
Last edited by hydra3333; 24th Apr 2017 at 08:04.
Enable channels coupling. By default is enabled. By default, the Dynamic Audio Normalizer will amplify all channels by the same amount. This means the same gain factor will be applied to all channels, i.e. the maximum possible gain factor is determined by the "loudest" channel. However, in some recordings, it may happen that the volume of the different channels is uneven, e.g. one channel may be "quieter" than the other one(s). In this case, this option can be used to disable the channel coupling. This way, the gain factor will be determined independently for each channel, depending only on the individual channel’s highest magnitude sample. This allows for harmonizing the volume of the different channels.
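To make the coupling description a bit more concrete, here's a toy Python sketch of the gain calculation only. It is a simplification and not the filter's actual code: it ignores the maximum gain limit and the smoothing over time, and the function name and example peaks are made up. In ffmpeg the coupling is controlled with dynaudnorm's "n" option.

```python
def peak_gains(channel_peaks, target_peak=0.95, coupled=True):
    """Return one gain factor per channel from each channel's highest magnitude sample.

    Simplified model: no maximum-gain clamp and no smoothing between frames.
    """
    if coupled:
        # One shared gain, limited by the loudest channel.
        gain = target_peak / max(channel_peaks)
        return [gain] * len(channel_peaks)
    # Independent gains: each channel is scaled according to its own peak.
    return [target_peak / peak for peak in channel_peaks]


# Example frame where the right channel is much quieter than the left.
peaks = [0.5, 0.25]
print("coupled:    ", peak_gains(peaks, coupled=True))
print("independent:", peak_gains(peaks, coupled=False))
```

With coupling the quiet channel stays quiet relative to the loud one. Without it, both channels are pushed towards the same peak, which evens them out but also changes the stereo balance.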
I haven't used ffmpeg to speed up/slow down audio, as I generally use Avisynth for that (I don't use ffmpeg a great deal myself), but I tried to work it out:
Only atempo doesn't alter the audio pitch, which is great if that's what you want, but the opposite of what I'd normally call "normal" as the pitch is rarely corrected when 23.976 is sped up for PAL, so generally you'd want to apply a similar lack of pitch correction if you're converting PAL back to NTSC so the pitch is the same as it would have been originally. Unless of course you know you need to correct the pitch because it's your own video/audio etc....
-af rubberband=tempo='1001/960': pitch='1001/960'
or without changing the pitch
The above seems to do the job of adjusting the duration without correcting the pitch (it's raised or lowered), and the following as a full command line seems to work fine (I'm terrible with ffmpeg syntax). The two filters are chained in a single -af, because when -af is given twice for the same output ffmpeg only uses the last one. Maybe there's a better way?
"C:\Program Files\ffmpeg\ffmpeg.exe" -report -i "E:\input.wav" -y -threads 1 -vn -af "dynaudnorm=f=150,rubberband=tempo='1001/960':pitch='1001/960'" -acodec flac -sn "E:\output.flac"
I don't know how much difference it makes, but 1001/960 is probably technically more accurate than 25/23.976 etc.
23.976 -> 25 speedup - 1001/960 should be the same as 25 / (24000 / 1001)
25 -> 23.976 slowdown - 960/1001 should be the same as (24000 / 1001) / 25
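The two ratios above can be checked with exact fractions. This is just a quick Python sanity check, with variable names of my own choosing, and it also shows how far off the rounded 23.976 figure is.

```python
from fractions import Fraction

ntsc_film = Fraction(24000, 1001)   # the exact rate behind "23.976"
pal = Fraction(25)

speedup = pal / ntsc_film           # 23.976 -> 25
slowdown = ntsc_film / pal          # 25 -> 23.976

print("speedup :", speedup, "=", float(speedup))
print("slowdown:", slowdown, "=", float(slowdown))

# Using the rounded 23.976 instead gives a slightly different ratio.
rounded = pal / Fraction("23.976")
print("rounded :", rounded, "=", float(rounded))

# Drift over a 2 hour (7200 s) source if the rounded ratio is used instead.
drift_seconds = abs(7200 / speedup - 7200 / rounded)
print("drift over 2 hours:", float(drift_seconds), "seconds")
```

The difference is tiny, but using the exact fraction costs nothing.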
Oh well.... I learned something new for future reference. It'll probably come in handy at times, rather than using Avisynth. I'll save that as a ffmpeg conversion preset in foobar2000. Anyone know what the quality is like compared to using Avisynth? Thinking about it, ffmpeg generally accepts avs scripts as input these days, doesn't it? Something like the way MeGUI does it:
# 23.976 -> 25 without pitch correction
# 25 -> 23.976 without pitch correction
Last edited by hello_hello; 24th Apr 2017 at 12:13.
I ran some test encodes. Am I missing something, or is there a problem with "rubber band", or does the tempo adjustment on its own suck?
02 ffmpeg -af "atempo='1001/960'"
03 ffmpeg -af "rubberband=tempo='1001/960'"
04 ffmpeg -af "rubberband=tempo='1001/960': pitch='1001/960'"
05 Avisynth (no pitch correction) - script from previous post
06 Avisynth (with pitch correction) - TimeStretchPlugin(tempo=(1001.0/9.6))
Edit: Added Avisynth samples
Last edited by hello_hello; 24th Apr 2017 at 12:37.
Is it possible to make a movie's quiet parts (voices) louder than the loud parts (background sounds)?
Is your source stereo or with a center channel (5.1/6.1/etc.)?
With a center channel.
Assuming the dialogue is mainly in the centre channel you could give it a boost, but if you compress the audio as a whole there's probably not much point because compression tries to squish everything to the same level. At least that's what I've found when compressing after downmixing to stereo. Boosting the centre channel first doesn't help much. Compressing multichannel audio might be a different story (I never do it myself).
I find the compressors based on R128 scanning work quite well (LoudNorm for ffmpeg, or there's a foobar2000 DSP etc.) but they're a little slow to respond to volume changes. I prefer using something like the Dynamic Audio Normalizer. It's built into ffmpeg these days and the following in the ffmpeg command line works pretty well (at least for stereo):
or with a higher chance of volume "pumping" you can make it respond even faster.