This question is not easy to answer, but we will try. The central question is: how well can an MPEG-1 Layer II audio stream, as normally used on SVCDs and DVDs, reconstruct a complex signal like a Dolby Pro Logic audio signal? We know that the surround information is hidden in particular in the phase relationship between the two channels. To explore how well the MPEG audio encoder handles complex signals, we use the most random signal (the one with the highest entropy): white noise. As the encoder we use TMPGEnc Plus 2.57 from Pegasys Inc.
Two different signals are used to test how well TMPGEnc encodes. As a "hard stereo" signal we use a random number generator with uniform distribution to produce two independent channels of random numbers. We know from theory that random numbers produce white noise, i.e. the signal contains every frequency with the same power, like white light. The second signal (a more realistic one?) was generated from three random signals. The first was mixed at 90% amplitude into both channels (this is the mono component). The second and third random signals were mixed at 10% amplitude separately into the left and right channels. This corresponds to a small stereo effect, or little Dolby Pro Logic information. Let's call this signal "little stereo".
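The two test signals described above can be sketched as follows (a minimal numpy sketch; the sample rate, seed, and amplitude range are my assumptions, not from the original tests):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility (hypothetical choice)
fs = 44100                      # sample rate in Hz (assumption; typical for SVCD/DVD audio)
n = 20 * fs                     # 20 seconds, as in the tests

# "Hard stereo": two fully independent uniform white-noise channels
hard_left = rng.uniform(-1.0, 1.0, n)
hard_right = rng.uniform(-1.0, 1.0, n)

# "Little stereo": 90% common (mono) noise plus 10% independent noise per channel
mono = rng.uniform(-1.0, 1.0, n)
side_l = rng.uniform(-1.0, 1.0, n)
side_r = rng.uniform(-1.0, 1.0, n)
little_left = 0.9 * mono + 0.1 * side_l
little_right = 0.9 * mono + 0.1 * side_r
```

The "hard stereo" pair has (near) zero inter-channel correlation, while the "little stereo" pair is dominated by the shared mono component.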
We used the bitrates 64, 96, 128, 160, 192 and 384 kbit/s and the methods "joint-stereo" and "stereo" to encode both signals. The signals were 20 seconds long. In the following figure we have plotted the correlations between the original signal and the encoded signals. For the calculation of the correlation, a frequency window of 300 Hz was used.
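The per-band correlation measure could look something like this (a sketch under assumptions: magnitude spectra via FFT and a Pearson correlation per 300 Hz window; the original analysis details are not given):

```python
import numpy as np

def band_correlations(original, encoded, fs, bandwidth=300.0):
    """Correlate the magnitude spectra of two signals in windows of
    `bandwidth` Hz. Returns band center frequencies and correlations."""
    n = min(len(original), len(encoded))
    f = np.fft.rfftfreq(n, d=1.0 / fs)
    a = np.abs(np.fft.rfft(original[:n]))
    b = np.abs(np.fft.rfft(encoded[:n]))
    centers, corrs = [], []
    lo = 0.0
    while lo < fs / 2:
        mask = (f >= lo) & (f < lo + bandwidth)
        if mask.sum() > 1:
            corrs.append(np.corrcoef(a[mask], b[mask])[0, 1])
            centers.append(lo + bandwidth / 2)
        lo += bandwidth
    return np.array(centers), np.array(corrs)
```

An encoded band that is reconstructed perfectly yields a correlation of 1; a band cut off by the encoder drops toward 0.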
The higher the bitrate, the better the reconstruction of the original signal, with both encoding methods (64=red, 96=green, 128=blue, 160=cyan, 192=yellow, 384=black). One also sees that the cut-off frequency grows with increasing bitrate (from 6 kHz to 12 kHz and finally 19 kHz).
Looking at the 100% stereo signal, it becomes clear that one should use the method "stereo" if bitrates below 192 kbit/s are used, although at 128 kbit/s the cut-off frequency is lower for stereo than for joint-stereo. At 192 kbit/s or more, however, no difference is visible.
With the 90% mono / 10% stereo signal, the joint-stereo method shows how much it improves quality! It is always better than the stereo method, but as with the 100% stereo signal, from 192 kbit/s upward the results are equal and it makes no difference which method is used.
So, what have we found out?
* Joint-stereo is not always better than stereo!
* At bitrates of 192 kbit/s or more, you can choose either "joint-stereo" or "stereo"; it makes no difference, and you will not lose much information (especially at 384 kbit/s).
* At 160 kbit/s one should always use "stereo"; it reconstructs the signal better than "joint-stereo" (at least below 16 kHz).
* If you need lower bitrates (<=128 kbit/s), it depends on the signal. For hard stereo one should use "stereo" to keep the high frequencies; for signals with less stereo effect one should use "joint-stereo".
But you can never be sure of preserving Dolby Pro Logic sound if you use bitrates below 192 kbit/s! Note: these tests were made with the TMPGEnc encoder, so I don't know how other encoders behave.
For more information, just have a look at my website: http://andreas.welcomes-you.com/projects/dv/index.html
Interesting work, and it's great that you've posted your results here. There are, however, a few things to consider.
Firstly, "correlation" with the original audio does not necessarily imply better audio quality, especially with MPEG audio compression.
Secondly, Dolby Surround information is encoded as phase differences between the left and right channels, and there is nothing in your tests that convincingly demonstrates that stereo is better than joint-stereo (or vice versa). The 100% stereo correlation test showing the superiority of stereo at lower bitrates would SUPPORT the hypothesis that stereo is better than joint stereo for Dolby Surround encodings, but it does not prove it.
Furthermore, "joint stereo" encoding actually switches between joint (i.e., L-R, L+R) AND plain stereo (L and R) coding, depending on the audio. Generally, there are MORE blocks of joint-stereo coding at lower bitrates and when there is a lot of redundancy between the left and right channels.
I think it is fairly well established that low-bitrate joint-stereo encoding is very detrimental to Dolby Surround material (due to the loss of phase information), but what would be interesting to know is whether high-bitrate joint-stereo encoding necessarily has the same effect.
If I recall correctly, Dolby Surround only has a bandwidth of 7 kHz on the surrounds. It looks like all the differences are only at frequencies above 5 kHz. Will these differences have any practical effect on Dolby Surround then?
Yes. Remember that Dolby Surround info is encoded as phase information between the channels. This is why I said the tests are not that useful in themselves.
For example, it would be possible to have two audio recordings with near-identical frequency distributions, but with one carrying the Dolby Surround encoded info and the other carrying NONE (i.e., the Dolby Surround info "destroyed", if you like). You can easily demonstrate this by downmixing an AC3 5.1 audio track to stereo twice: once to Dolby Surround encoded stereo, and once to plain stereo.
That is, the "correlation" in the tests between the original and encoded audio in terms of frequency distribution doesn't mean much with regards to whether Dolby Surround info has been affected as well.
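This point can be made concrete with a toy example (my own numpy sketch, not from the original tests): two stereo pairs whose channels have identical magnitude spectra, but where an L-R surround decode recovers nothing from one and everything from the other.

```python
import numpy as np

rng = np.random.default_rng(1)
s = rng.standard_normal(4096)

# Pair A carries the signal in-phase in both channels ("center" material),
# pair B carries it anti-phase ("surround" material). Per-channel magnitude
# spectra are identical, since negation does not change |FFT|.
A = (s, s.copy())
B = (s, -s)

# Frequency content of the right channels is indistinguishable...
spec_A = np.abs(np.fft.rfft(A[1]))
spec_B = np.abs(np.fft.rfft(B[1]))

# ...yet a simple L-R surround decode gives totally different results:
surround_A = (A[0] - A[1]) / np.sqrt(2)  # all zeros
surround_B = (B[0] - B[1]) / np.sqrt(2)  # the full signal (scaled)
```

A frequency-domain correlation measure sees no difference between the two pairs, even though one has no surround content at all.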
1. You are correct that correlation is not a measure of how good the audio signal sounds. Since we have no measure for that, we use at least the correlation to give some information about how well the signal is reconstructed.
2. Dolby Surround: looking at the Dolby Surround specification, one sees that the surround signal is mixed into the stereo channels with a +90 and a -90 degree phase shift respectively. That is what the definition says. Since I could not find any precise definition of this so-called "phase shift", the encoder is difficult to construct.
Most implementations use the trick of a 0/180 degree "phase shift", which means adding the surround signal to the left channel and subtracting the same surround signal from the right channel. In formulas:
L_T = L + C/sqrt(2) + S/sqrt(2)
R_T = R + C/sqrt(2) - S/sqrt(2)
where L, R, C, S are the true left, right, center and surround channels, and L_T and R_T are the total left and total right channels after mixing. The correct way, with a phase shift, would be something like:
L_T = L + C/sqrt(2) + i*S/sqrt(2)
R_T = R + C/sqrt(2) - i*S/sqrt(2)
where i = sqrt(-1) (reference: http://audiolab.uwaterloo.ca/~jeffb/thesis/node14.html). But then one has the problem of complex values...
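The 0/180 degree downmix above is straightforward to write down in code (a minimal sketch of the formulas in the text; the function name is my own):

```python
import numpy as np

def dolby_downmix_0_180(L, R, C, S):
    """0/180-degree matrix downmix, as used by most implementations:
    center mixed equally into both totals, surround added to the left
    total and subtracted from the right total, all scaled by 1/sqrt(2)."""
    k = 1.0 / np.sqrt(2.0)
    L_T = L + k * C + k * S
    R_T = R + k * C - k * S
    return L_T, R_T
```

A pure surround signal thus lands anti-phase in the two totals, which is what a surround decoder later looks for.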
To decode the signal (as specified by Dolby) one uses:
Left = L_T = L + (C+i*S)/sqrt(2)
Right = R_T = R + (C-i*S)/sqrt(2)
Center = (L_T+R_T)/sqrt(2) = (L+R)/sqrt(2) + C
Surround = (L_T-R_T)/sqrt(2) = (L-R)/sqrt(2) + i*S
The recovered signals always contain an extra term (of magnitude at most 2*(1/sqrt(2)), i.e. +3 dB). This recovery is actually rather poor, but still good enough for our home theater.
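The passive sum/difference decode above can be sketched directly from the formulas (names are my own; this is the real-valued part of the decode, without the complex i*S term):

```python
import numpy as np

def passive_decode(L_T, R_T):
    """Passive matrix decode as described in the text: the channel sum
    recovers center material, the channel difference recovers surround
    material, each scaled by 1/sqrt(2)."""
    center = (L_T + R_T) / np.sqrt(2.0)
    surround = (L_T - R_T) / np.sqrt(2.0)
    return center, surround
```

Feeding it a pure anti-phase (surround-only) pair recovers the surround signal exactly and leaves the center silent; any content in L or R leaks into both outputs, which is the extra term mentioned above.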
Now, what do I want to say with all this: the problem is to find a good encoder; then one could test the loss of the Dolby Surround signal.
Using the first method (0/180 degrees) one runs into problems when a signal fades from the rear (S=surround) to the front (C=center). What happens is the following: the signal starts at the rear and is reconstructed correctly, but while fading to the front, the signal drifts to the left, since:
Left = L_T = L + (C+S)/sqrt(2)
Right = R_T = R + (C-S)/sqrt(2)
so the right channel vanishes (C=S -> Right=0). That is why the 0/180 method is not really the best.
Is there a solution? Well, one has to look at so-called allpass networks, which can shift the phase across the whole frequency range. In analog circuits this works, but I really don't know how to implement it in a "digital" algorithm.
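One common digital stand-in for such a wideband 90-degree phase shifter is the Hilbert transform: the imaginary part of the analytic signal is the input shifted by 90 degrees at every frequency. This is my own sketch of the idea using scipy, not Dolby's actual allpass network:

```python
import numpy as np
from scipy.signal import hilbert

def dolby_downmix_90(L, R, C, S):
    """Sketch of a downmix with a true +/-90 degree surround shift,
    approximated digitally via the Hilbert transform (an assumption;
    real encoders use matched allpass filter pairs)."""
    k = 1.0 / np.sqrt(2.0)
    S90 = np.imag(hilbert(S))  # S shifted by 90 degrees at every frequency
    L_T = L + k * C + k * S90
    R_T = R + k * C - k * S90
    return L_T, R_T
```

Note that the FFT-based Hilbert transform is non-causal and assumes the whole signal is available, so a real-time encoder would instead use an allpass filter pair approximating a constant 90-degree phase difference.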
But I will do some tests with the 0/180 degree method, since nearly all "Dolby Surround downmix" algorithms use it. I will try to do so over the Christmas days.
BTW, for technical information about Dolby, please have a look here: http://www.dolby.com/tech/