I've been looking for a while for an answer as to which quality settings are generally the better ones.
Audio is sometimes available as both 16 and 24 bit, whether as PCM, DTS-MA or TrueHD.
I was wondering whether it is possible to hear that difference. If it is, why is 16 bit available at all?
Let's assume there is a 24-bit source (PCM); we encode it with a good lossy codec and get, say, a 320 kbit audio file (24 bit).
Now let's convert the source down to 16 bit and use a lossless codec like FLAC, and we again get a 320 kbit audio file, but this time 16 bit.
Which of the two files will be of better quality?
Is it even possible to give an answer to that?
Or to ask the question differently:
If I want a specific file size, is it better quality-wise to convert down to 16 bit or stay at 24 bit?
Thread: 16 vs 24 Bit
Since FLAC is lossless, I don't think you'll get 320kbps. Samples I've seen are typically half the size (in MB)
of the source PCM file. Since it's lossless, it will sound the same as its source.
Meanwhile the 320kbps file (MP3?) will not be as good as its source, since lossy compression was used.
Whether you can hear the difference or not - well that depends.
You've got to separate what the encoder can do from what the spec is capable of. Some encoders that accept 24bit audio will still internally dither/round/truncate down to 16bit (with varying degrees of quality, depending on how they do it).
However, just think about it theoretically, assuming an encoder DOESN'T downrez its source, a lossy encoding of a 24bit file should still sound better than a similar bitrate lossy encoding of a 16bit version of the same source. Why? Because the difference between 16bit and 24bit LPCM audio has to do with dynamic range resolution. 24bit is theoretically resolvable to 144dB (though in practice the max is usually 112-124dB because of limits of other links in the recording chain), while 16bit is theoretically resolvable to 96dB. That's a difference of 48dB. And that's not just the ability of things to be LOUDER, but also the ability of things to be softer without being drowned out in noise.
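To put numbers on those figures: each bit of linear PCM adds about 6.02 dB of theoretical dynamic range (20·log10(2) per bit). A minimal sketch (the function name is just illustrative):

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of an ideal n-bit linear quantizer."""
    return 20 * math.log10(2 ** bits)

print(round(dynamic_range_db(16), 1))   # ~96.3 dB
print(round(dynamic_range_db(24), 1))   # ~144.5 dB
```

This reproduces the roughly 96 dB and 144 dB figures quoted above; the gap between them is the extra headroom/floor a 24-bit master has to work with.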
A lossy encoding uses certain psychoacoustic principles to guide how it drops info. Having a master that has better "resolving power" SHOULD give the encoder more/better info with which to base its decisions on. It would follow that for a given bitrate (say 320kbps), the encode created from the 24bit master SHOULD have better relative quality.
Do they actually do that? Who knows? This could be an empirical question given an encoder that doesn't downrez internally. Simple differencing algorithms could objectively show whether an encode based on 24bit was or was not a better match (percentage-wise) to a similar encode based on 16bit.
Of course, this all also begs the question: WHO can hear this difference? When you are referring to being able to hear it, WHOM are you referring to? What are your criteria? This depends a lot on the type of signal, the user, the environment and the expectations.
I could tell you that oftentimes I can hear the difference between a 24bit LPCM master and a 16bit VERY HIGH QUALITY downrez (which has been dithered with 1/3LSB Gaussian noise, then rounded, then truncated). But not every time.
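For anyone curious what that kind of downconversion looks like, here is a rough numpy sketch of the dither-then-quantize idea, with Gaussian noise at roughly 1/3 LSB as described above. The function name and exact parameters are mine, not from any particular tool:

```python
import numpy as np

def downrez_to_16bit(samples: np.ndarray, seed: int = 0) -> np.ndarray:
    """Dither-then-quantize sketch: floats in [-1, 1) -> the 16-bit grid.

    Adding low-level Gaussian noise (~1/3 LSB of the 16-bit target)
    before rounding decorrelates the quantization error, turning
    distortion into benign noise.
    """
    rng = np.random.default_rng(seed)
    lsb = 1.0 / 2**15                        # one 16-bit step, full scale +/-1.0
    dithered = samples + rng.normal(0.0, lsb / 3.0, size=samples.shape)
    quantized = np.round(dithered / lsb) * lsb
    return np.clip(quantized, -1.0, 1.0 - lsb)

x = np.linspace(-0.5, 0.5, 1000)             # stand-in for 24-bit sample values
y = downrez_to_16bit(x)
print(np.max(np.abs(y - x)))                 # error stays within a few LSBs
```

A real converter would be more careful (noise shaping, TPDF vs Gaussian dither, etc.), but the structure is the same: noise first, then quantize.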
<edit>NOTE that there is really no such thing as 24bit mp3 vs. 16bit mp3, as mp3 and all the other lossy formats NO LONGER use LPCM. Everything is an approximation, REGENERATED from what the encoder decided to keep from its filter banks. Mp3, etc., doesn't count things that way.</edit>
CD Audio uses 16 bits, and some playback devices, like car audio players, can't play anything above 16 bits; those are the reasons 16 bit is available. If you have 16-bit FLAC at 44.1 kHz it's trivial to make an audio CD out of it.
Unless I misunderstand things, the difference between 16 bit and 24 bit is precision bits, so 16 bit and 24 bit files encoded to the same codec and same bitrate should be about the same size. In that case, you may wish to keep the 24 bit one if you are using FLAC or LPCM (WAV) but I agree with Scott that for lossy codecs, it doesn't really matter. In fact, using 24 bits for things like MP3 may result in files that only a PC can play and no other playback device can play.
@jman98, Things like HDCD are attempts to give CD Audio more than 16 bits, but that's being nitpicky.
If it's the same codec, with the same parameters, AND the SAME bitrate, it's going to be the ~same size. Also, I could be wrong about this, but I sincerely doubt that a decoder is going to care which master it came from: as long as it is compliant with the spec, it should be decodable. At that point, how it decodes is TOTALLY dependent upon the decoder. IOW, if both a "24bit" and a "16bit" title are both decoded by a "16bit"-capable decoder, they'll both decode as 16bits.
With audio, what's eroded is small differences in the loudness. So anything below a certain decibel level would be gone and if you turned up the volume you'd hear either distortion or noise. With pictures, banding is solved by dithering so it appears higher quality. With audio, dithering smooths the distortion into noise and the higher the bit depth is, the quieter the noise is.
With 24-bit audio, you can have very loud and very quiet parts of the audio with their quality retained, just like HDR photos with very high bit depth can recover a vibrant picture from a photo taken in a dark room. If you take a photo in a dark alley and save it as 96-bit HDR, then re-save it as 32-bit, you won't see a difference looking at them side by side. However, if you turn up the brightness in the 96-bit photo, it will look as if it was taken in daylight, while brightening up the same 32-bit photo will just turn the blacks bright grey, and the picture will look like shit because all the super-small details were squashed.
So you won't hear a difference between 16 and 24 bit on a regular song, but it can be useful if the song has extremely quiet and extremely loud parts and you wanna play it with normalization pre-processing.
@Mephesto, how "1bit" is encoded determines what its capabilities are. 1bit in the context of LPCM is as LOW a resolution as you can get: full on or full off. "1bit" in DeltaSigma Modulation (aka SACD & DSD & DST) is equivalent to ~20bit LPCM (if the sample rate is high enough).
I agree, 24bits really comes into its own as a source intended for eventual processing/fx, mixing/blending, and/or compression downstream.
However, the OP asked about 16 v. 24 bit, and I personally consider 24 bits to be a waste of storage space for playback.
The thing is, there is not one 24 bit DAC that will actually give you 24 bits of resolution, and there probably never will be. 24 bit has a dynamic range of 144dB. -144dB is well below the noise floor of any electronic circuit you can get. You may be able to get 20 bits of resolution, maybe 21, but you may want to be sitting down when you hear the price. It'd be at least a $5K DAC.
Yes, 24 bit SACDs generally do sound better, but they only release material that was very well recorded on them. That's really why they sound that good.
It is true that 16 bit CDs are recorded on 24 bit ADC machines but that's so they can do all that silly processing and then the quantization artifacts they cause are below 16 bit LSB level.
So I do resample the little 24 bit audio I have to 16 bit, using Audacity with dithering.
But I wouldn't recommend resampling the sample rate from 96 to 44.1. 48 would be better, but I still wouldn't want to do that. Changing the sample rate is evil.
I've done a bunch of tweaking to config files in linux on the machine I play music on just so the mixer doesn't resample 44.1K audio files to 48, which it loves doing. So does Windows. That's what those ASIO/Wasapi filters are for. It makes a big difference.
Actually nowadays I just avoid 24 bit audio for computer playback. You're talking about taking up 3-4X the disk space for no real perceivable difference. A lot of recordings sound like crap anyway, and I convert those to 320K CBR mp3.
Your first example isn't accurate. Lossy compressors don't have a fixed bit depth as lossless sources do. Therefore you don't really have a 24bit 320k lossy file, just a 320k lossy file. It doesn't matter if the source was 24, 16 or 8 bit (lossless). There is no such thing as a 24 or 16 bit MP3 etc. It's just an MP3.
Lossy encoders may have a maximum input bitdepth (ie 24bit for LAME/MP3 and 32bit for NeroAAC, if I remember correctly) and they may need to be decoded to a fixed bitdepth on playback, but lossy encoded files don't have a fixed bitdepth as such.
Your second example isn't accurate either because you won't end up with a 320k lossless flac file. The bitrate will be much higher. That aside, generally your second example would produce the best quality result.... well at least in theory.... because it's lossless. There may be some rounding (quantisation) errors when downsampling to lossless 16 bit, but it shouldn't be anything audible at 16 bit, and often when downsampling (or converting from lossy to lossless) dithering is used.
I just converted a couple of MP3s to flac as a quick test and even using maximum compression they ended up close to 900kb/s when I converted them to 16 bit flac and about 1600kb/s for 24 bit flac. I tried an audio track from an episode of a TV show as it'd be easier to compress and it produced a stereo 540kb/s 16bit flac file (950kb/s for 24bit).
You're probably comparing apples and oranges. Flac will still require large file sizes (even for 16 bit) while at 320k a lossy encode will be much smaller, so if file size is the main issue, I'd be going for whichever gives me the closest to the file size I want. Personally I use either LAME's standard v2 preset (for MP3) or the default q0.5 setting for the Nero AAC encoder. I don't even go for maximum lossy bitrates. I just use variable bitrate presets which are considered to be "transparent".
At a high enough bitrate pretty much all lossy encoders are considered "transparent" anyway, but I guess the main advantage of keeping a lossless copy is in case you ever want to convert it to another format at a later date... you're not re-encoding a file which is already lossy.
Last edited by hello_hello; 2nd Nov 2013 at 14:15.
Could I be so bold as to post one of the concluding paragraphs?
This paper presented listeners with a choice between high-rate DVD-A/SACD content, chosen by high-definition audio advocates to show off high-def's superiority, and that same content resampled on the spot down to 16-bit / 44.1kHz Compact Disc rate. The listeners were challenged to identify any difference whatsoever between the two using an ABX methodology. Boston Audio Society conducted the test using high-end professional equipment in noise-isolated studio listening environments with both amateur and trained professional listeners.
In 554 trials, listeners chose correctly 49.8% of the time. In other words, they were guessing. Not one listener throughout the entire test was able to identify which was 16/44.1 and which was high rate, and the 16-bit signal wasn't even dithered!
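As a sanity check on that conclusion, you can compute how likely a score like 49.8% over 554 trials is under pure guessing. This exact-binomial sketch (the helper name is mine; the per-trial figures beyond the quoted totals are reconstructed from the percentages) shows the result is nowhere near statistically distinguishable from chance:

```python
import math

def binom_sf(k: int, n: int) -> float:
    """P(X >= k) for X ~ Binomial(n, 0.5): one-sided 'beats guessing' p-value."""
    return sum(math.comb(n, i) for i in range(k, n + 1)) / 2**n

n = 554
k = round(0.498 * n)          # ~276 correct answers out of 554
p_value = binom_sf(k, n)
print(p_value)                # far above 0.05, i.e. consistent with coin-flipping
```

Any p-value this large means the listeners' performance is exactly what you'd expect from random guessing, which is the paper's point.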
I'm always amused to the assumptions that studios do X.
Audiophiles (audio nerds) would shit bricks if they knew what really goes on.
Often times they and software devs are more concerned than labels are.
And sources? Yeesh... most are in bad shape.
So...I just read the article & comments.
While it could be considered fairly conclusive for what it checked, I (because of my background) noticed a total lack of checking on dimensionality beyond simple stereo. I could have told them: THAT is where you ought to clearly be able to tell a difference (between standard def rates & high-def rates). Just like where 4k & 8k are proving to make better use of and are better showcasing stereoscopic images, higher definition audio makes better use of and better showcases binaural and/or surround sound imaging. The cues that are seemingly un-noticeable or below the level of human acuity are only that way when analyzed against single-sensor systems. IOW, sure!, it's no surprise that you can only resolve a certain amount of contrast within an arc of certain smallness, just like you can only resolve a certain range of frequencies and amplitudes. But those assume one eye (or both acting as one) and one ear (or both acting as one).
Instead, if the assumption is the dimensional sensory construct generated by (eye L + eye R + Brain) or by (ear L + ear R + Brain), it follows that not just the input data, but also difference (relationship) between 2 input sets, can give a finer demarcation and a richer panoply of awareness. And while an ultrasonic tone might not pass notice alone, the same ultrasonic tone with a micro delay between the ears becomes noticeable. Same holds true with the eyes.
One (controversial) way of saying it is that in those circumstances we can see things we couldn't see and we can hear things we couldn't hear.
@Hoser Rob, 24bit ADCs and DACs are still necessary to set up the structure of the 24bit word, even if they aren't fully 100% utilized. These days IT DOESN'T HURT to have them, and 24bits are always important for processing, mixing & maintaining headroom. Playback, meh (not counting the circumstances I just mentioned above). Most "24bit" ADCs and DACs still provide 20 or 21 true bits, and that still equates to ~120-126dB, which is impressive. Well then, why not just use 20bit words and "SAVE US CONSUMERS THE MONEY"? Good question. Let me ask you this: how do we store digital data? In combinations of 8bit chunks: 8bit, 16bit, 24bit, 32bit... If you wanted to save 20bit data, you'd still have to use 3 storage bytes (24bits), just with the last 4 bits as ZEROs. At least with 24bit systems the data is filled (and is recoverable and/or operable/usable), even if steeped in noise. Also, while there are some boxes that might cost that much, MANY DO NOT! (more like $500-$2000).
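The byte-alignment point is easy to see in code: a 20-bit sample still occupies three storage bytes, with four padding bits left over. (This helper is purely illustrative, and handles non-negative samples only, for simplicity.)

```python
def pack_20bit_as_24bit(sample: int) -> bytes:
    """Left-justify a non-negative 20-bit sample in a 24-bit (3-byte) word."""
    assert 0 <= sample < 2**20
    word = sample << 4                  # shift up; low 4 bits are zero padding
    return word.to_bytes(3, "big")

raw = pack_20bit_as_24bit(0x7FFFF)      # a large 20-bit value
print(raw.hex())                        # '7ffff0' -- note the trailing zero nibble
```

Whether those four bits hold zeros or usable data, the file is the same size, which is the argument for filling them.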
<edit>Also, one area that manufacturers don't mention is that it is hard (=expensive) to provide an ADC or DAC that has low jitter (high timebase linearity) and low quantization distortion (high quantization steps linearity). It's actually cheaper to create a 24bit device that is fair in these areas, than it is to create a 16bit device that is exemplary in them. The finer steps partly mask the shoddier technology. That's why 24bit devices that ARE of exemplary quality are as expensive as they are. They earn their keep.</edit>
Also, resampling is no more EVIL than the original sampling is. Because THAT'S WHAT IT IS DOING. If you sample at 96kHz and play back at 96kHz back to analog, and then using that analog source to sample at 44.1kHz, it still might not really be noticeably different than sampling your original source directly to 44.1kHz. This all depends on HOW GOOD those ADC & DAC filters are. Not perfect straight wire, but DAMN close. So too, resampling performs the same 2 algorithms, except it just keeps it in the digital domain the whole time (with hopefully some accompanying math efficiencies that could improve the transfer). Again, it all depends on the QUALITY of the filtering algorithms in the resampling. Maybe your apps are not made with high quality filtering in mind.
One more thing to clarify: SACDs are not 24bit, they are 1bit (but they aren't LPCM, they are DSD encoded). This ~120dB SNR of SACD is equivalent to 20bit LPCM at ~192kHz.
Last edited by Cornucopia; 3rd Nov 2013 at 02:11.
I'll confess I'm having trouble following....
As an example.... you seem to be saying that if I stick a pair of speakers in front of me and run a 16bit/44.1k source through them it'll reproduce everything I can hear just as well as a source of a higher bitdepth/sample rate, and likewise I can stick a pair of speakers behind me and run a 16bit/44.1k source through them and the whole audio spectrum will be reproduced as accurately as possible, but if I run both sets of speakers simultaneously (ie surround sound) then the individual 16bit/44.1k sources are no longer adequate, because then I can hear frequencies I can't actually hear? I'm not understanding the logic behind that.
Likewise I'm not sure I quite understand how 4k and 8k sources make better use of stereoscopic images (I assume you're referring to 3D?) aside from an ability to display it at a higher definition. How does that not also benefit 2D images in exactly the same way? Surely once you're using a resolution which displays maximum possible detail for a given video, it's still displaying the maximum detail whether it be 2D or 3D.
First, a bit of science:
"Finally, consider 8-bit, four-times-oversampled PCM with noise shaping. This is also a data rate one-half that of DSD and double that of CD, with a sampling rate of 4 × 44,100 = 176,400 Hz. It can achieve a noise floor 120 dB below full scale up to 20 kHz, using 96 dB of noise shaping, and a total noise power of –19 dBFS. Its frequency response would be flat to 80 kHz. This example is perhaps the most instructive of the lot. For a data rate one-half that of DSD, it achieves a comparable signal bandwidth, with a similar noise power density up to 20 kHz, but much lower power above this frequency, and 28 dB lower total noise power. It is fully TPDF-dithered, and so is completely artefact free. At one-half the data rate it outperforms DSD on every count! DSD is a profligate wastrel of capacity."
Second: since there are no ATH (absolute threshold of hearing) curves for high frequencies (over 20kHz), there are no psychoacoustic models for how to reduce (lossily compress) that data, so the author's comparison (assumption) is not valid.
To conclude: high-quality audio can use fewer than 24 (or the more common 16) bits, but it will require additional processing (dithering + noise shaping) and a lossless codec (storage).
The general principle with a sampling system is simple: a higher sampling rate is better, and a higher bit depth is better (however, to fully use a high bit depth, the ADC and DAC must also be highly accurate).
Lossy compression uses psychoacoustic models, and for hearing above 20kHz such models practically don't exist, so high-sampling-rate/high-bit-depth systems can probably only use lossless or uncompressed storage.
You gotta do some more research ... higher sample rates are there to avoid putting a "brick wall" filter in the DAC, which tends to cause huge distortion because steep filters make circuits ring. That's been known for decades.
hello_hello already beat me to it.
Pandy, no psychoacoustic models exist for content over 20kHz because NOBODY CAN HEAR ANYTHING THAT HIGH. There's no good reason for any song to have more than a 32kHz sample rate for casual listening.
The relevant thing the OP needs to know is that he won't notice a difference with 24-bit unless there are very quiet parts of the song he intends to turn up, and it's better to do that while the audio is still 24-bit, otherwise he'll turn up audible noise along with the quiet audio.
There's only one song in my collection I can name off the top of my head that has very quiet and very loud parts and might benefit from 24-bit, since my songs go thru ffdshow normalization pre-processing.
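To make the "turning up quiet parts" point concrete: boosting a quiet passage lifts the quantization noise floor by the same number of dB. Using the theoretical ideal floors (roughly -96 dBFS for 16 bit and -144 dBFS for 24 bit; real converters are worse, as noted earlier in the thread):

```python
# A 30 dB boost of a quiet passage lifts the noise floor by the same 30 dB.
gain_db = 30.0
floor_16 = -96.3 + gain_db      # 16-bit floor after the boost
floor_24 = -144.5 + gain_db     # 24-bit floor after the boost
print(floor_16, floor_24)       # ~-66.3 dBFS vs ~-114.5 dBFS
```

After the boost the 16-bit floor sits in potentially audible territory, while the 24-bit floor is still far below anything you'd hear.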
foobar2000's output metre.
Getting off topic a bit (sorry to the OP) but ffdshow normalisation for music? You haven't discovered ReplayGain?
Or have you tried a compressor plugin? I use RockSteady myself, not for listening to music but for compressing soundtrack audio. It does a good job, but for music it'd have to be better than ffdshow turning the volume up and down. Or there's LoudMax, which I've only briefly played with, but it's very simple to use and seems to sound good. Or there's a couple of compressor DSPs for foobar2000 if you don't want to use a Winamp plugin with ffdshow for some reason.
I adjust music tracks with ReplayGain myself as nothing I listen to needs compressing as such. Anyway, just some thoughts.....
Last edited by hello_hello; 3rd Nov 2013 at 18:44.
And this is only perception; there is a different aspect: the accuracy of the system itself. Even digital filters are approximations of idealized ones. As you know, recursive filters require very high bit depth (some IIR filter studies show 56-72 bits of precision required for the calculations), while non-recursive filters are usually based on the sinc function, which has infinite length, so any practical implementation is a trade-off between resources, latency and accuracy.
Thus I see that high-bit-depth, high-sampling-rate systems (192/24) can be useful to push anything that can go wrong below the level of perception (every imaginable case covered).
My point about lossy coding and the lack of models was that there is no knowledge behind models for lossy compression at those frequencies, so high-depth/high-sampling systems should probably be seen as candidates for lossless compression only.
And with careful processing, 1 bit can be enough; it just needs to be very fast.
Technically, the Nyquist-Shannon sampling theorem requires sampling at twice the maximum frequency in the signal, but that assumes the anti-aliasing filter is an ideal brick-wall type and that the rest of the system is perfect too (no distortion in any domain)... which is quite hard to meet (not realistic in practical implementations). And from a DSP point of view, for correct amplitude representation in non-oversampled (or lightly oversampled) systems a compensation filter is required, which further complicates the overall system (and introduces additional distortion).
For real-life situations the required sampling frequency is somewhere around (at least) 2.5-5 × f max.
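A quick numeric illustration of why the ideal 2 × f max bound matters: once a tone above fs/2 is sampled without an anti-aliasing filter, its samples are literally identical to those of a lower "alias" frequency. The numbers here are arbitrary:

```python
import numpy as np

fs = 1000.0                               # sample rate (Hz)
n = np.arange(64)                         # sample indices
high = np.cos(2 * np.pi * 900 * n / fs)   # 900 Hz tone, above fs/2 = 500 Hz
alias = np.cos(2 * np.pi * 100 * n / fs)  # its alias: |1000 - 900| = 100 Hz
print(np.allclose(high, alias))           # True: indistinguishable once sampled
```

No amount of downstream processing can tell the two apart, which is why the anti-aliasing filter has to do its job before sampling, and why its real-world imperfections push the practical rate above the theoretical minimum.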
Wow, I didn't expect to get that many answers.
The example with the bitmaps was nice; I had tried to explain it to myself that way before, but I wasn't sure whether audio and graphics work the same way. I know that a 16-bit TIFF/PNG can generally look better than a 24-bit JPEG (above all on cheap screens that can't show 16 million different colours at the same time). The 24-bit lossy one will have artefacts; the 16-bit one will have less "contrast & sharpness".
But I wasn't thinking of HDR. I'm also not sure whether ears can be tricked as easily as eyes.
That was the reason I asked here.
The figure in the original post (320kbit) was only an example. I also didn't specify 2-channel audio or VBR/CBR for the lossy codec, so any bitrate would be reachable.
I have several consumer cameras which save the audio either as AC3 or PCM, but always 16 bit. I'm not sure whether there are any camcorders/cameras that support 24 bit.
Knowing that both codecs (I don't want to discuss here whether PCM is a codec or not) exist in 16 and 24 bit, I was wondering whether 16-bit PCM is really a big loss. I know everything is also limited by the microphone.
That's why I 'created' the example of a 24-bit source. And being limited by bandwidth, I can imagine that one day I might have to choose between 16-bit PCM and 24-bit AC3.
I had already read http://xiph.org/~xiphmont/demo/neil-young.html earlier, including the statement:
16 bits is enough to store all we can hear, and will be enough forever.
So I wasn't sure; only my gut feeling told me that 16-bit lossless would have been the better choice.
I simply want to learn how these things work, and what is necessary and what is not.
My "casual listening" test involved picking a random track and converting it to a wave file several times while listening to it through PC speakers (good quality). Admittedly my ears took a battering at work tonight so I might try again tomorrow, but I wasn't able to convince myself I could hear a difference until I dropped the sample rate down to 22050Hz, and even then I'd probably need to ABX it to be certain. At 1600Hz the difference was quite obvious, even when "listening casually".
A quick and simple "casual listening" test for anyone who wants to try it.
Last edited by hello_hello; 4th Nov 2013 at 11:00.
And yes, audio and video very frequently work on the same math and the same principles; however, the details can be very different and sometimes opposite.
No, this can be correct: the DCT can have unlimited resolution (as accurately as you can compute the cosine function's value), but in the real world any function with unlimited resolution needs to be quantized, and this is where quantization error appears. MediaInfo just reports the product of the AC3 decoder, i.e. the bit depth after quantization (in fact this is requantization, since quantization occurs first at the ADC stage).
Last edited by pandy; 4th Nov 2013 at 12:26.