VideoHelp Forum
  1. hey there,
    I sometimes do audio conversions from AAC to AC3 when changing a video container.
    Code:
    -i %INPUTFILE% -c:v copy -acodec ac3 -b:a 384k -f matroska %1%.mkv
    I just found out that no matter what you do, it always encodes the audio with a bit depth of 32, which doesn't make much sense.
    I tried a lot of commands and most wouldn't work at all. For example, as listed in this thread:

    https://forum.videohelp.com/threads/373264-FFMpeg-List-of-working-sample-formats-per-f...at-and-encoder
    Code:
    - ac3 -sample_fmt="fltp"
    - ac3_fixed -sample_fmt="s16p"
    The -sample_fmt option wasn't accepted by either ac3 or ac3_fixed.

    The only command that was accepted, but didn't do what it was supposed to and instead encoded to 32 bit anyway, was
    Code:
    -i %INPUTFILE% -c:v copy -acodec ac3 -af aresample=48000:osf=s16p -b:a 384k -f matroska %1%.mkv
    So this seems to be a bug in ffmpeg. Is there any way to encode to WAV (-acodec pcm_s16le) as an intermediate step before encoding to AC3 at the end? That would probably work, since WAV to AC3 always works, but I have no idea how that could be integrated into my batch - something like the sketch below is what I mean.
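    This is an untested sketch only - the temp file name is a placeholder and the exact options may need adjusting:
    Code:
    REM untested sketch: first decode the audio track to 16-bit PCM WAV,
    REM then encode that WAV to AC3 and mux it with the copied video stream
    ffmpeg -i %INPUTFILE% -vn -acodec pcm_s16le -ar 48000 temp_audio.wav
    ffmpeg -i %INPUTFILE% -i temp_audio.wav -map 0:v -map 1:a -c:v copy -acodec ac3 -b:a 384k -f matroska %1%.mkv
    del temp_audio.wav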

    thanks in advance for looking into this
    regards
  2.
    AAC files do not have a set bit depth; see the info here:
    https://superuser.com/questions/553552/how-to-determine-aac-bit-depth
  3. But to be honest, I don't understand what your problem is - why do you want to push the AC3 encoder to work in 16-bit PCM?

    Btw, it seems the ffmpeg AC3 encoder supports 32-bit float only (command: "ffmpeg -h encoder=ac3 >ac3.txt"):

    Code:
    Encoder ac3 [ATSC A/52A (AC-3)]:
        General capabilities: dr1 
        Threading capabilities: none
        Supported sample rates: 48000 44100 32000
        Supported sample formats: fltp
        Supported channel layouts: mono stereo 3.0(back) 3.0 quad(side) quad 4.0 5.0(side) 5.0 2 channels (FC+LFE) 2.1 4 channels (FL+FR+LFE+BC) 3.1 4.1 5.1(side) 5.1
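    For comparison, the fixed-point encoder can be checked the same way - as in the thread linked above, it is expected to list s16p as its only supported sample format, though the exact output may differ per build:
    Code:
    ffmpeg -h encoder=ac3_fixed >ac3_fixed.txt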
    Last edited by pandy; 5th Mar 2024 at 03:46.
  4. I don't see the point in converting the audio to 32 bit. Even for WAV PCM I find that a maximum of 16 bit is enough. If you do a WAV to AC3 conversion for an audio file in ffmpeg, it is always 16 bit.
    So if a direct AAC to AC3 conversion is only possible in 32 bit due to the unset bit depth of AAC, I wonder if there is a way to make a batch with an additional conversion step: copy the video without re-encoding, re-encode the AAC audio to 16-bit WAV, and after that re-encode it to AC3 and mux it into an MKV container.
  5. Originally Posted by skh View Post
    I don't see the point in converting the audio to 32 bit. Even for WAV PCM I find that a maximum of 16 bit is enough. If you do a WAV to AC3 conversion for an audio file in ffmpeg, it is always 16 bit.
    So if a direct AAC to AC3 conversion is only possible in 32 bit due to the unset bit depth of AAC, I wonder if there is a way to make a batch with an additional conversion step: copy the video without re-encoding, re-encode the AAC audio to 16-bit WAV, and after that re-encode it to AC3 and mux it into an MKV container.
    16 bit is OK for reproduction but not for processing - the AAC decoder decodes audio as 32-bit float (about 24 bits of resolution), so the AC3 encoder uses the same float representation. Both codecs (i.e. AAC and AC3) are transform codecs where audio is transformed from the time domain into the frequency domain and is represented within 0..1 (or -0.5..0.5, or -1..1), so it is no longer PCM but closer to a float representation. This also provides the highest quality, and it is not contradictory to a 16-bit PCM time-domain representation. Btw, I hope you are aware of something called inter-sample peaks - using a float representation with higher bit-depth resolution gives the opportunity to deal with this and other problems without sacrificing signal quality.
    16-bit PCM is fine if you immediately convert it to an analog signal in a DAC, but if your goal is signal processing then it is not OK.
    Last edited by pandy; 5th Mar 2024 at 13:26.
  6. First of all, lossy audio codecs are designed to be final stages, not intermediate ones, so re-encoding is already a mistake. Secondly, there is no benefit to forcing a bit-depth reduction. It can only hurt.
  7. Re-encoding is inevitable.
    In a perfect world everything you're working with is in a lossless codec, and when you're done you make an encode for the final stage. But that is not the reality. The majority of the stuff I have to deal with is already encoded, so when I change things I have no choice but to re-encode it.

    I always work with 16-bit PCM, and in the end convert that to AC3 at 192 kb/s (if it's an old mono source) or 384/448 kb/s if it's a clean stereo source. Sure, I made some comparisons with 24 bit, but I really hear no difference, so I never understood why I should bloat everything up to 24.
    I don't see how a mono TV source originally recorded on VHS over 40 years ago, wrongly encoded in some weird AAC 96 kb/s 24-bit format, will sound any worse when I re-convert it to 16-bit PCM and after that to 192 kb/s 16-bit AC3. There really can't be any audible difference for the human ear.

    I'm not doing any high-end produced music masters here, just some random old TV crap which doesn't have the best quality to begin with.

    I also do a lot of DVDs or Blu-rays for myself which have to be compatible with my standalone Blu-ray players, or single-file media for their internal USB ports and meant for streaming. AAC or any sort of 24 bit just doesn't make sense to use there, space- and compatibility-wise. Even for official stuff: did you ever see a DVD with a 24-bit AC3 audio track? I even find Blu-rays using 24-bit LPCM for mono sources from an already degraded, hiss-and-crackle 35mm print from the 70s a total joke and a waste of disc space which could be better used for good video bitrate. Those audio tracks would sound the same in AC3 192, so what's the point other than marketing some voodoo?

    In the case I have here at the moment (a TV show from the 80s purchased as a digital download), the audio is out of sync on my standalone player and only in sync when played on a computer. After converting it to WAV with my batch listed above, it runs fine and is in sync, so whatever AAC setting the provider used is a bit weird. Space-wise it doesn't make sense to use WAV audio, and compatibility- and quality-wise there is no purpose in using 24-bit AC3. That's why proper 16-bit AC3 in a two-stage conversion would be best, and I would be eternally thankful if somebody could tell me how to make the batch two-staged for the audio conversion.
  8. Originally Posted by skh View Post
    Re-encoding is inevitable.
    In a perfect world everything you're working with is in a lossless codec, and when you're done you make an encode for the final stage. But that is not the reality. The majority of the stuff I have to deal with is already encoded, so when I change things I have no choice but to re-encode it.

    I always work with 16-bit PCM, and in the end convert that to AC3 at 192 kb/s (if it's an old mono source) or 384/448 kb/s if it's a clean stereo source. Sure, I made some comparisons with 24 bit, but I really hear no difference, so I never understood why I should bloat everything up to 24.
    I don't see how a mono TV source originally recorded on VHS over 40 years ago, wrongly encoded in some weird AAC 96 kb/s 24-bit format, will sound any worse when I re-convert it to 16-bit PCM and after that to 192 kb/s 16-bit AC3. There really can't be any audible difference for the human ear.
    Well... 16 bit is fine for reproduction but not for processing. Nowadays there is no justification for forcing 16-bit integer for encoding - the encoder will convert the 16-bit samples into something internal anyway (float or, for example, 32-bit integer). There are many reasons to process a signal at higher bit depth, and there is no speed benefit either, since today's CPUs prefer 64-bit-aligned data. I would have understood your objections 35 years ago, but not today. Such a radical approach usually affects audio quality negatively, and human ears are rarely able to hear distortions below about 0.5..1%. For at least 30 years the most commonly used audio DACs have been 1..4 bit anyway, with massive oversampling, noise shaping and dithering - but that is OK for listening, not for processing.


    Originally Posted by skh View Post
    I'm not doing any high-end produced music masters here, just some random old TV crap which doesn't have the best quality to begin with.

    I also do a lot of DVDs or Blu-rays for myself which have to be compatible with my standalone Blu-ray players, or single-file media for their internal USB ports and meant for streaming. AAC or any sort of 24 bit just doesn't make sense to use there, space- and compatibility-wise. Even for official stuff: did you ever see a DVD with a 24-bit AC3 audio track? I even find Blu-rays using 24-bit LPCM for mono sources from an already degraded, hiss-and-crackle 35mm print from the 70s a total joke and a waste of disc space which could be better used for good video bitrate. Those audio tracks would sound the same in AC3 192, so what's the point other than marketing some voodoo?
    Once again - 16 bit is fine (even 10..14 bit with proper noise shaping and oversampling will be more than enough). The problem is that if you do signal processing, in many cases even 32 bit is not enough - some dedicated audio signal processing uses 56 bits or more (in some cases more than 80 bits). This is justified not by voodoo but by math: for example, recursive filters (IIR filters) need high bit depth to prevent instability, and small level adjustments require re-quantization, which introduces new errors, so you need a higher bit depth to prevent signal loss and error accumulation...
    So if you see an audio DAC with a 32-bit word, it is not there to provide 32-bit audio but rather to allow, for example, digital volume control.

    Originally Posted by skh View Post
    In the case I have here at the moment (a TV show from the 80s purchased as a digital download), the audio is out of sync on my standalone player and only in sync when played on a computer. After converting it to WAV with my batch listed above, it runs fine and is in sync, so whatever AAC setting the provider used is a bit weird. Space-wise it doesn't make sense to use WAV audio, and compatibility- and quality-wise there is no purpose in using 24-bit AC3. That's why proper 16-bit AC3 in a two-stage conversion would be best, and I would be eternally thankful if somebody could tell me how to make the batch two-staged for the audio conversion.
    If your source is VHS then I assume you apply some lowpass filtering (to remove, for example, the horizontal line frequency present in the recorded audio), maybe some denoising, perhaps level equalization, perhaps some form of spectrum replication, etc. - all of this requires more than 16 bits of depth.
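    Just as a rough illustration (a sketch only - the filter choices and cutoff values are examples and the file names are placeholders), ffmpeg's audio filters do this kind of processing internally in float/double as far as I know, and only the final output needs to be reduced to 16 bit:
    Code:
    ffmpeg -i vhs_capture.wav -af "highpass=f=20,lowpass=f=15000,afftdn=nf=-30" -acodec pcm_s16le cleaned.wav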
    Last edited by pandy; 6th Mar 2024 at 11:03.
  9. OK, so you would say it is fine to use 16 bit as an output bit depth, just not as a processing format. That's fine - I have no problem if ffmpeg automatically processes everything at higher than 16 bit to guarantee a minimum loss in quality. So how would it be if I decode the AAC to a sample_fmt="s32" WAV and after that use 16-bit AC3 for the final conversion stage? Would that be good?
  10. Originally Posted by skh View Post
    OK, so you would say it is fine to use 16 bit as an output bit depth, just not as a processing format. That's fine - I have no problem if ffmpeg automatically processes everything at higher than 16 bit to guarantee a minimum loss in quality. So how would it be if I decode the AAC to a sample_fmt="s32" WAV and after that use 16-bit AC3 for the final conversion stage? Would that be good?
    My impression is that ffmpeg performs automatic data type conversion where needed to ensure maximum quality, at least in the audio area - video is a different topic, where ffmpeg frequently performs multiple unnecessary or even wrong data type conversions. If you start pushing ffmpeg towards a particular sample format it may be sub-optimal, and ffmpeg will be forced to change the data type anyway to satisfy the encoder's expectations.
    If I can advise something: unless you need some particular behaviour, leave it to ffmpeg's decision. In the case of conversion from AAC to AC3, float32 seems to be optimal (on one side it provides 24-bit depth, so enough from an audio quality perspective; on the other side it has sufficient margin to deal with some problems).
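    So something as simple as this should be enough, and you can verify what actually ended up in the file with ffprobe (file names are placeholders):
    Code:
    ffmpeg -i input.mp4 -c:v copy -c:a ac3 -b:a 384k output.mkv
    ffprobe -v error -select_streams a:0 -show_entries stream=codec_name,sample_fmt,sample_rate,bit_rate output.mkv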
  11. Originally Posted by skh View Post
    I always work with 16-bit PCM, and in the end convert that to AC3 at 192 kb/s (if it's an old mono source) or 384/448 kb/s if it's a clean stereo source. Sure, I made some comparisons with 24 bit, but I really hear no difference, so I never understood why I should bloat everything up to 24.
    The input bitdepth makes absolutely no difference to the size of lossy audio. The size is determined only by the bitrate and the duration. If you encode as 192kb/s AC3 while decoding the source as 32 bit, the size of the encoded AC3 file will be exactly the same as it would be if you decode the same source as 24 bit, or 16 bit, or 8 bit, because no matter what the bitdepth of the source audio, it's being re-encoded at 192kb/s.
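    As a quick sanity check on the arithmetic: an hour of 192kb/s AC3 is always roughly 192,000 bits/s ÷ 8 × 3,600 s ≈ 86 MB (plus a little container overhead), whether the encoder was fed 16-bit or 32-bit samples.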

    For a lossy source, the higher the bitdepth it's decoded to, the more accurately it's decoded, therefore it makes sense to use the highest bitdepth possible as the input for another lossy encode.

    Lossless audio is different because when encoding at a given sample rate (44.1kHz or 48kHz etc), the greater the bitdepth, the greater the bitrate required to store the extra bits, and therefore the greater the file size.
    Last edited by hello_hello; 17th Mar 2024 at 10:57.
  12. Originally Posted by hello_hello View Post
    The input bitdepth makes absolutely no difference to the size of lossy audio. The size is determined only by the bitrate and the duration. If you encode as 192kb/s AC3 while decoding the source as 32 bit, the size of the encoded AC3 file will be exactly the same as it would be if you decode the same source as 24 bit, or 16 bit, or 8 bit, because no matter what the bitdepth of the source audio, it's being re-encoded at 192kb/s.

    For a lossy source, the higher the bitdepth it's decoded to, the more accurately it's decoded, therefore it makes sense to use the highest bitdepth possible as the input for another lossy encode.

    Lossless audio is different because when encoding at a given sample rate (44.1kHz or 48kHz etc), the greater the bitdepth, the greater the bitrate required to store the extra bits, and therefore the greater the file size.
    Don't forget about quantization noise - bit depth definitely matters for lossy audio compression as well.
  13. Originally Posted by hello_hello View Post
    The input bitdepth makes absolutely no difference to the size of lossy audio. The size is determined only by the bitrate and the duration. If you encode as 192kb/s AC3 while decoding the source as 32 bit, the size of the encoded AC3 file will be exactly the same as it would be if you decode the same source as 24 bit, or 16 bit, or 8 bit, because no matter what the bitdepth of the source audio, it's being re-encoded at 192kb/s.

    For a lossy source, the higher the bitdepth it's decoded to, the more accurately it's decoded, therefore it makes sense to use the highest bitdepth possible as the input for another lossy encode.

    Lossless audio is different because when encoding at a given sample rate (44.1kHz or 48kHz etc), the greater the bitdepth, the greater the bitrate required to store the extra bits, and therefore the greater the file size.
    You are right that the file size encoded in lossy AAC or AC3 will not differ much between 16, 24 and 32 bit, but in an uncompressed format such as WAV it will. Working with 24 or 32-bit WAV files is also very performance-intensive - in Adobe Audition they take ages to load on my PC. 16 bit is much faster and doesn't take as much space either. I am just used to using 16 bit for everything my entire life, and most people, even on pro releases, do that too. You could always argue that as long as every step before the final encode was lossless with as high a bit depth as possible, it is OK to use 16 bit for the final encode, but the reality is that the majority of sources are already lossy, so most people who don't know better would just convert 16-bit AC3 to 16-bit integer WAV and then to 16-bit AC3 or AAC again.
    That would generate the loss pandy warned about. However, working like that for many years, I never actually could tell a difference by ear, such as more compression artefacts or that quantization noise. I have some decent monitors and a very good stereo setup and am pretty picky when it comes to sound quality, so whatever I mess up can't be noticed by my hearing, and comparing before and after in spectral view also doesn't show much of a difference.


