Converting from WAV to AAC introduces clipping

21st Apr 2025 16:21 #1
elektro

Member

Join Date
May 2011

Location
Sweden
The original WAV is on top and the resulting AAC (224 kbps) is on the bottom:

I tried many different solutions and got the same result. This is driving me crazy. The original doesn't have clipping, so where does the clipping come from? Both files have the same sampling rate (48 kHz) and both are 16-bit. Is there any way to get an AAC out of this WAV with zero clipping without messing with the volume level?

Attached Files

sample.wav (52.56 MB, 3 views)
21st Apr 2025 20:22 #2
johns0

I'm a Super Moderator

Join Date
Jun 2002

Location
canada
Before you save the AAC track, adjust the level so the clipping is gone. This means reducing the volume a bit, since the track got amplified a bit.

I think,therefore i am a hamster.

22nd Apr 2025 01:02 #3
davexnet

Member

Join Date
Mar 2008

Location
United States
Compressing to a lossy format can introduce some clipping if the source levels are close to 0 dB.
Page 39 of the iZotope Ozone mastering guide (PDF) says:

"When mastering for compressed audio formats like AAC and MP3,
it’s a good idea to set the Ceiling between -1 dB and -1.5 dB to prevent clipping due to file compression."

http://downloads.izotope.com/guides/iZotopeMasteringGuide_MasteringWithOzone.pdf

I did notice that the problem doesn't occur in Audacity when exporting to MP3.
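
One way to picture why a lossy encode can push peaks past full scale: the encoder alters or discards part of the spectrum, and what remains no longer adds up to the original waveform. The sketch below is only an illustration, not a model of AAC or of any real encoder. It assumes an ideal full-scale square wave and keeps just its first few harmonics (the function name and the harmonic count are made up for the example). The band-limited version overshoots 1.0 even though the original never exceeds it.

```python
import math

def bandlimited_square_peak(harmonics: int, points: int = 20000) -> float:
    """Peak of a unit square wave rebuilt from its first odd harmonics."""
    peak = 0.0
    for n in range(points):
        x = 2.0 * math.pi * n / points
        # Truncated Fourier series of a square wave with amplitude 1.0
        y = (4.0 / math.pi) * sum(
            math.sin((2 * k - 1) * x) / (2 * k - 1)
            for k in range(1, harmonics + 1)
        )
        peak = max(peak, abs(y))
    return peak

peak = bandlimited_square_peak(harmonics=5)
overshoot_db = 20.0 * math.log10(peak)
print(f"peak = {peak:.3f} ({overshoot_db:+.2f} dB relative to full scale)")
```

Real material and real encoders will behave differently, but the same effect is why leaving some headroom before encoding helps.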

22nd Apr 2025 08:55 #4
elektro

Member

Join Date
May 2011

Location
Sweden
Thanks. So, I'll have to reduce the amplification level.

22nd Apr 2025 10:28 #5
cholla

Member

Join Date
Oct 2010

Location
USA
I used ffmpeg with the Apple encoder aac_at.
This was my first time using ffmpeg's loudnorm filter, so I used the suggested settings.
Many are the defaults.
Here is the result, if it is acceptable to you.

Code:

ffmpeg -i input.wav -filter:a loudnorm=linear=true:i=-24.0:lra=7.0:tp=-2.0:offset=0.0:measured_I=-24.01:measured_tp=-10.11:measured_LRA=18.80:measured_thresh=-34.44 -ar 48000 -c:a aac_at -aac_at_quality 0 -aac_at_mode cbr -b:a 320k -ac 2 output.aac

If you use the .m4a extension, the file will be constant bitrate; with .aac it comes out variable.

This is using Audacity 3.7.3 with the Limiter effect:

[Attachment 86690]
This is the export:

[Attachment 86691]
This removed the clipping. I had to use -2.0, as -1.5 still had some clipping.
This is the sample1.m4a attachment.

Attached Files

sample.aac (11.04 MB, 3 views)

sample.m4a (11.01 MB, 1 views)

sample1.m4a (11.72 MB, 1 views)
Last edited by cholla; 22nd Apr 2025 at 11:43. Reason: Added Audacity conversion.
22nd Apr 2025 13:10 #6
pandy

Member

Join Date
Sep 2008
Normalization should target a level of -3.0103 dBFS. This should prevent clipping (inter-sample peaks, and clipping as their outcome) in most cases.
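
The -3.0103 dB figure can be seen with a worst-case sketch. Assuming a sine at exactly one quarter of the sample rate whose phase puts every sample 45 degrees away from a crest (both choices are assumptions made for illustration), the stored samples never exceed about 0.7071 even though the underlying waveform reaches 1.0:

```python
import math

SAMPLE_RATE = 48000            # Hz, as in the original poster's file
FREQ = SAMPLE_RATE / 4         # assumed worst case: a tone at fs/4
PHASE = math.pi / 4            # assumed 45 degree offset from the crest

# A handful of samples is enough to show the repeating pattern.
samples = [
    math.sin(2.0 * math.pi * FREQ * n / SAMPLE_RATE + PHASE)
    for n in range(16)
]

sample_peak = max(abs(s) for s in samples)   # what a sample-peak meter sees
true_peak = 1.0                              # amplitude of the sine itself
hidden_db = 20.0 * math.log10(true_peak / sample_peak)

print(f"sample peak = {sample_peak:.4f}")
print(f"true peak is {hidden_db:.4f} dB above the sample peak")
```

If such a signal were normalized to 0 dBFS by sample value, the reconstructed waveform would peak roughly 3 dB over full scale, which is the headroom a -3.0103 dBFS target leaves.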

23rd Apr 2025 12:06 #7
cholla

Member

Join Date
Oct 2010

Location
USA
Originally Posted by pandy

Normalization should target a level of -3.0103 dBFS. This should prevent clipping (inter-sample peaks, and clipping as their outcome) in most cases.

If you would like to elaborate on the way you do this I would like to read it.
Especially if it is for the ffmpeg code.

I went a different route with Audacity & did not use the Limiter on this .m4a.
Using the Effects
Normalize peak amplitude to -5.2
Noise reduction (dB): 6
Sensitivity: 6
Frequency smoothing (bands):0

Attached Files

sample.m4a (11.68 MB, 1 views)
23rd Apr 2025 14:06 #8
pandy

Member

Join Date
Sep 2008
Originally Posted by cholla

Originally Posted by pandy

Normalization should target a level of -3.0103 dBFS. This should prevent clipping (inter-sample peaks, and clipping as their outcome) in most cases.

If you would like to elaborate on the way you do this I would like to read it.
Especially if it is for the ffmpeg code.

I went a different route with Audacity & did not use the Limiter on this .m4a.
Using the Effects
Normalize peak amplitude to -5.2
Noise reduction (dB): 6
Sensitivity: 6
Frequency smoothing (bands):0

A detailed explanation is in Annex 2 of https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1770-5-202311-I!!PDF-E.pdf . Less mathematical explanations are available, for example, here (a series of application notes dedicated to inter-sample peaks): https://benchmarkmedia.com/blogs/application_notes/tagged/inter-sample-overs . I highly recommend reading this paper first, though: https://service-tcgroup.tcelectronic.com/media/Level_paper_AES109(1).pdf

In general, sampling a signal involves not only the signal level (sample peak) but also the signal phase (the relation between a sample and its neighboring samples). By normalizing to -3.0103 dBFS you lose half a bit of resolution, but in exchange you get a signal free from clipping.

Dynamics processing (loudness) is something different from normalization, and in some cases dynamics processing may not be acceptable. Normalization does not change the signal's dynamics, so it is a quasi-transparent process (provided there is sufficient signal resolution), whereas dynamics processing unavoidably alters the signal.

In ffmpeg I use something like:

Code:

dynaudnorm=p=1/sqrt(2):m=100:s=12

to normalize level and loudness in order to prevent clipping.
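
As a rough sketch of how that filter string could sit in a complete command, the snippet below builds an ffmpeg argument list from Python. The file names, the `build_command` helper and the choice of the native `aac` encoder at 224k are assumptions for illustration, not the poster's actual command. Since dynaudnorm's `p` option takes a linear peak value between 0 and 1 rather than decibels, the helper converts a dB target to linear form first.

```python
import math
import shlex

def db_to_linear(db: float) -> float:
    """Convert a level in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)

def build_command(src: str, dst: str, peak_db: float = -3.0103) -> list[str]:
    """Hypothetical helper: ffmpeg arguments for a dynaudnorm pass."""
    peak = db_to_linear(peak_db)          # about 0.7071 for -3.0103 dB
    audio_filter = f"dynaudnorm=p={peak:.4f}:m=100:s=12"
    return [
        "ffmpeg", "-i", src,
        "-af", audio_filter,
        "-c:a", "aac", "-b:a", "224k",    # assumed encoder settings
        dst,
    ]

cmd = build_command("input.wav", "output.m4a")
print(shlex.join(cmd))
```

Passing a plain number also sidesteps having to quote the parentheses of 1/sqrt(2) when the command is typed into a shell.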
24th Apr 2025 11:05 #9
cholla

Member

Join Date
Oct 2010

Location
USA
@pandy,
I looked at the 3 links.
I understand some of the information in them.

Can you explain the ffmpeg code?
p=1/sqrt(2)
I tried Google and found the code you used and some others as well.
I'm especially curious about the sqrt(2).
What does this do?

24th Apr 2025 15:43 #10
pandy

Member

Join Date
Sep 2008
Originally Posted by cholla

@pandy,
I looked at the 3 links.
I understand some of the information in them.

Can you explain the ffmpeg code?
p=1/sqrt(2)
I tried Google and found the code you used and some others as well.
I'm especially curious about the sqrt(2).
What does this do?

sqrt is the square root; the square root of 2 is 1.414213562373095

1/sqrt(2) = 1/1.414213562373095 = 0.7071067811865475 = -3.0103 dB

25th Apr 2025 09:46 #11
cholla

Member

Join Date
Oct 2010

Location
USA
Originally Posted by pandy

p=VALUE: Sets the target peak level as a linear value between 0 and 1 (not in dB).

sqrt is the square root; the square root of 2 is 1.414213562373095

1/sqrt(2) = 1/1.414213562373095 = 0.7071067811865475 = -3.0103 dB

I thought sqrt was the abbreviation for square root but wanted to make sure.
The / in a fraction means divided by.
So 1/1.414213562373095 = 0.7071067811865475, and I get the math up to there.
How does this get to -3.0103 dB?
How does 0.7071067811865475 = -3.0103 dB?

m=maxgain. The maximum setting is 100, so the code sets this at the maximum gain.
s=compress. This is the dynaudnorm filter's compression setting.
s=12 is a little less than the middle of the range for this setting.

If "p=1/sqrt(2)" does equal -3.0103 dB, which is considered the "sweet spot" for normalization, I get why it is there. I'm not sure why ffmpeg needs the code this way.

25th Apr 2025 12:58 #12
pandy

Member

Join Date
Sep 2008
Originally Posted by cholla

How does this get to -3.0103 dB?
How does 0.7071067811865475 = -3.0103 dB?

20*log10(1/sqrt(2)) = -3.0103 dB (approx.).
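
The conversion can be checked numerically. This small sketch uses base-10 logarithms, which is what the formula means; the function names are made up for the example. It also reproduces the half-bit figure mentioned in this thread, under the usual assumption that one bit corresponds to a factor of two in amplitude.

```python
import math

def ratio_to_db(ratio: float) -> float:
    """Amplitude ratio to decibels: 20 * log10(ratio)."""
    return 20.0 * math.log10(ratio)

def db_to_ratio(db: float) -> float:
    """Decibels back to an amplitude ratio."""
    return 10.0 ** (db / 20.0)

level_db = ratio_to_db(1.0 / math.sqrt(2.0))
db_per_bit = ratio_to_db(2.0)             # about 6.02 dB per bit
bits_lost = -level_db / db_per_bit

print(f"1/sqrt(2) -> {level_db:.4f} dB")
print(f"{level_db:.4f} dB -> {db_to_ratio(level_db):.4f}")
print(f"resolution given up: {bits_lost:.2f} bit")
```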

Originally Posted by cholla

m=maxgain. The maximum setting is 100, so the code sets this at the maximum gain.
s=compress. This is the dynaudnorm filter's compression setting.
s=12 is a little less than the middle of the range for this setting.

If "p=1/sqrt(2)" does equal -3.0103 dB, which is considered the "sweet spot" for normalization, I get why it is there. I'm not sure why ffmpeg needs the code this way.

-3 dBFS is the optimal target normalization level, and it is optimal in many ways. Effectively you lose only half a bit of resolution, and this is usually not a problem because your codec usually works with 32-bit floats (the exceptions are some codec implementations, such as libfdk_aac, which uses 16-bit integers).

ffmpeg does not need this code. I usually use it to perform single-pass normalization with mild compression (loudness normalization). Alternatively, you can perform true normalization, but in ffmpeg this is a two-pass process: in the first pass the audio is statistically evaluated and the highest (peak) level found is reported, and in the second pass you apply the desired "amplification" level.
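
The two-pass approach described above might look like this. The sketch assumes ffmpeg's `volumedetect` filter for the measuring pass and the `volume` filter for the gain pass; the helper names and file names are hypothetical, and the sample log line only mimics the shape of volumedetect's output rather than a real measurement of the attached file.

```python
import re

def parse_max_volume(log_text: str) -> float:
    """Pull the reported peak level (dB) out of volumedetect's log output."""
    match = re.search(r"max_volume:\s*(-?\d+(?:\.\d+)?)\s*dB", log_text)
    if match is None:
        raise ValueError("no max_volume line found")
    return float(match.group(1))

def gain_for_target(max_volume_db: float, target_db: float = -3.0103) -> float:
    """Gain in dB that moves the measured peak to the target level."""
    return target_db - max_volume_db

# Pass 1 (run separately): ffmpeg -i input.wav -af volumedetect -f null -
# Illustrative log line, not a real measurement:
sample_log = "[Parsed_volumedetect_0 @ 0x0] max_volume: -0.1 dB"

gain_db = gain_for_target(parse_max_volume(sample_log))

# Pass 2: apply the computed gain with the volume filter.
pass2 = ["ffmpeg", "-i", "input.wav", "-af", f"volume={gain_db:.4f}dB", "output.m4a"]
print(" ".join(pass2))
```

As far as I know, volumedetect reports sample peaks rather than inter-sample peaks, so the -3.0103 dB target is what provides the safety margin here.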
