Dolby E is a lossy audio codec introduced by Dolby Laboratories in 1999. It achieves approximately 4:1 compression and can withstand multiple encode-decode cycles without audible degradation. Dolby E encoded audio always has a sample rate of 48 kHz. Dolby E offers three data modes, measured in bits, which determine how many channels Dolby E can carry: the 16-bit data mode carries a maximum of 6 channels, while the 20-bit and 24-bit data modes carry a maximum of 8 channels. The bit depth of the Dolby E data mode is carried as information in the metadata and doesn't correlate with the bit depth of the Dolby E encoded audio; the latter is unknown. The bit depth shown in MediaInfo is that of the Dolby E data mode.
A Dolby E stream/signal is carried within an AES3 stream/signal in .ts (Transport Stream) files. An AES3 stream/signal can carry two channels of PCM audio at up to 24 bits / 192 kHz. AES3 is made up of audio blocks, each containing 192 consecutive frames. An AES3 frame is divided into two subframes, one for each of the two channels. Each subframe consists of 32 bits: up to 24 bits of audio sample data, 4 flag bits, and 4 synchronization bits. The Dolby E data mode defines how many bits from each of the two AES3 subframes are utilized by the Dolby E stream/signal.
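The AES3 framing described above can be sanity-checked with a few lines of arithmetic (a plain-Python sketch; the constants are just the numbers from the description):

```python
# AES3 framing arithmetic, as described above (plain Python, no libraries).
SYNC_BITS = 4      # synchronization preamble bits per subframe
FLAG_BITS = 4      # V/U/C/P flag bits per subframe
SAMPLE_BITS = 24   # maximum audio resolution per subframe

subframe_bits = SYNC_BITS + SAMPLE_BITS + FLAG_BITS  # 32 bits per subframe
frame_bits = 2 * subframe_bits                       # 64 bits (two channels)
frames_per_block = 192                               # frames per AES3 audio block

# One AES3 frame is transmitted per sample period, so at 48 kHz the raw
# interface rate is:
aes3_bitrate = 48000 * frame_bits                    # 3,072,000 bit/s

print(subframe_bits, frame_bits, aes3_bitrate)       # 32 64 3072000
```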
Dolby E in (16-bit, 20-bit, 24-bit) data mode is carried as AES3 PCM of 2 channels - 48 kHz - (16, 20, 24) bits, which means that (16, 20, 24) bits from each of the two AES3 subframes are utilized by Dolby E. AES3 PCM of 2 channels - 48 kHz - (16, 20, 24) bits has a bitrate of (1536, 1920, 2304) kbps, which is also the Dolby E bitrate in (16-bit, 20-bit, 24-bit) data mode. Dolby E is represented as AES3 PCM, while in fact it's AES3 NON-PCM. During playback, if Dolby E decoding is not supported, the audio data may be misinterpreted as AES3 PCM data instead of AES3 NON-PCM data, which results in NOISE playback.
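The bitrate figures above follow directly from 2 channels x 48000 samples/s x data-mode bits; a quick check in Python:

```python
# Dolby E bitrate per data mode: 2 AES3 channels x 48000 samples/s x bits.
def dolby_e_bitrate_kbps(data_mode_bits):
    return 2 * 48000 * data_mode_bits // 1000

for bits in (16, 20, 24):
    print(f"{bits}-bit data mode: {dolby_e_bitrate_kbps(bits)} kbps")
# 16-bit data mode: 1536 kbps
# 20-bit data mode: 1920 kbps
# 24-bit data mode: 2304 kbps
```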
The meaning of "frame" is ambiguous in audio technology: it means one thing in the case of AES3 and another thing in the case of Dolby E. Every Dolby E frame contains several audio samples, since the frame rate of Dolby E is equal to the frame rate (X value) of the accompanying video. Consequently, the total number of samples per Dolby E frame is: 48000 (samples/second) / X (frames/second) = 48000/X SPF (Samples Per Frame).
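The SPF formula can be evaluated for common frame rates (a sketch; exact NTSC-family rates are fractional, e.g. 29.97 fps is really 30000/1001):

```python
from fractions import Fraction

# Samples per Dolby E frame (SPF) = 48000 / video frame rate.
def samples_per_frame(frame_rate):
    return Fraction(48000) / Fraction(frame_rate)

print(samples_per_frame(25))                     # 1920 samples per frame (PAL)
print(samples_per_frame(Fraction(30000, 1001)))  # 8008/5, i.e. 1601.6 (NTSC)
```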
Minnetonka Dolby E Encoder/Decoder Manuals have a nice explanation of Programs and Program Configurations, in the Introduction and Appendix A sections.
SMPTE 302M: Television - Mapping of AES3 Data into MPEG-2 Transport Stream
SMPTE 337M: Format for Non-PCM Audio and Data in an AES3 Serial Digital Audio Interface
Since professional software such as Neyrinck SoundCode and Minnetonka SurCode is difficult to obtain, these steps provide a way to decode Dolby E audio with FFmpeg. The FFmpeg Dolby E decoder isn't a licensed Dolby E decoder; it's more of a reverse-engineered decoder. It produces output that is almost identical, if not entirely identical, to the output of the licensed Dolby E decoders. Also, it may miss a few audio samples upon decoding, either from the beginning or from the end of the audio. However, FFmpeg is able to perfectly extract the AES3 (Dolby E) bitstream from the .ts file.
All CREDITS to foo86, for creating this decoder!
On https://wiki.multimedia.cx/index.php/Dolby_E it's stated that the internal sample rate of Dolby E varies between 42.965 kHz and 53.760 kHz, depending on the associated video frame rate, and that this sample rate is converted back to 48 kHz after decoding. This claim is false; see here:
However, the FFmpeg commands below decode Dolby E to a sample rate of 44800 Hz, 53706 Hz, or whatever, which made this claim seem correct. It might be an FFmpeg bug, so the parameter -ar 48000 is used below to 'fix' this issue via resampling. Thanks to Cornucopia!
ffmpeg -formats, ffmpeg -formats | findstr PCM (Windows), ffmpeg -formats | grep PCM (Linux)
Whatever the bit depth of the Dolby E data mode, raw format s24le and codec pcm_s24le are used in command 1 of each combination. These parameters work for every data mode (16-bit, 20-bit, 24-bit). Moreover, if the bit depth of the Dolby E data mode is 16 bits, raw format s16le and codec pcm_s16le could be used too.
Whatever the bit depth of the Dolby E encoded audio, in combination 1 command 3 utilizes codec pcm_s24le, and in combination 2 commands 3 and 4 utilize codec pcm_s24le. These parameters fit 24-bit audio perfectly, and they also work for lower bit depth audio, since it can be represented as 24-bit audio. Moreover, if the bit depth of the Dolby E encoded audio is 16 bits or lower, codec pcm_s16le could be used, for the same reason.
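The reasoning above (lower bit depth audio can be represented as 24-bit audio) boils down to a lossless left shift; here is a minimal sketch (the sample value is hypothetical):

```python
# A 16-bit sample fits losslessly in a 24-bit container: shift left 8 bits
# on the way in, shift right 8 bits on the way out.
def s16_to_s24(sample16):
    return sample16 << 8   # pad with 8 zero bits of extra precision

def s24_to_s16(sample24):
    return sample24 >> 8   # drop the padding, recovering the original

x = -12345                 # hypothetical 16-bit sample value
assert s24_to_s16(s16_to_s24(x)) == x   # round-trip is exact
```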
Hypothetically, in this step Dolby E carries 8 channels. Also, in FFmpeg, 0:0 is the video and 0:1 is the audio (Dolby E). The pattern of the -map parameter is: input file : stream (video, audio, etc.)
Hit one of the two command combinations.
1) ffmpeg -non_pcm_mode copy -i input.ts -map 0:1 -c:a pcm_s24le -f s24le out.dat (or out.wav)
2) ffmpeg -i out.dat (or out.wav)
3) ffmpeg -i out.dat (or out.wav) -c:a pcm_s24le -ar 48000 -map_channel 0.0.0 out0.wav -c:a pcm_s24le -ar 48000 -map_channel 0.0.1 out1.wav -c:a pcm_s24le -ar 48000 -map_channel 0.0.2 out2.wav -c:a pcm_s24le -ar 48000 -map_channel 0.0.3 out3.wav -c:a pcm_s24le -ar 48000 -map_channel 0.0.4 out4.wav -c:a pcm_s24le -ar 48000 -map_channel 0.0.5 out5.wav -c:a pcm_s24le -ar 48000 -map_channel 0.0.6 out6.wav -c:a pcm_s24le -ar 48000 -map_channel 0.0.7 out7.wav
1) ffmpeg -non_pcm_mode copy -i input.ts -map 0:1 -c:a pcm_s24le -f s24le out.dat (or out.wav)
2) ffmpeg -i out.dat (or out.wav)
3) ffmpeg -i out.dat (or out.wav) -c:a pcm_s24le -ar 48000 abc.wav
4) ffmpeg -i abc.wav -c:a pcm_s24le -map_channel 0.0.0 out0.wav -c:a pcm_s24le -map_channel 0.0.1 out1.wav -c:a pcm_s24le -map_channel 0.0.2 out2.wav -c:a pcm_s24le -map_channel 0.0.3 out3.wav -c:a pcm_s24le -map_channel 0.0.4 out4.wav -c:a pcm_s24le -map_channel 0.0.5 out5.wav -c:a pcm_s24le -map_channel 0.0.6 out6.wav -c:a pcm_s24le -map_channel 0.0.7 out7.wav
Parameter explanation of -map_channel x.y.z: x=container (input file), y=stream (audio), z=channel
In the above command combinations, command 2 shows the Program Configuration of Dolby E.
In the case of Configuration 0 (8-channel Dolby E), the commands work as-is. By contrast, in the case of Configuration 15 (6-channel Dolby E), both combinations should omit out6 and out7, and in the case of Configuration 20 (4-channel Dolby E), both combinations should omit out4, out5, out6, and out7.
In both combinations, command 1 utilizes the SMPTE 302M decoder to process the AES3 NON-PCM data in the .ts file and obtain an SMPTE 337M compliant stream (Dolby E).
In combination 1, command 3 decodes Dolby E to a multi-channel LPCM and simultaneously splits the multi-channel LPCM into individual per-channel files.
In combination 2, command 3 decodes Dolby E to a multi-channel LPCM file and command 4 splits the multi-channel LPCM file into individual per-channel files.
It seems that the order of the output channels out0, out1, out2, out3, out4, out5, out6, out7 follows the order of the Channels column of the table in Appendix A of the Minnetonka Manuals, read from left to right and from top to bottom. As it looks, each Dolby E Program follows the SMPTE layout.
Configuration 0: PROGRAM1 out0=L, out1=R, out2=C, out3=LFE, out4=Ls, out5=Rs / PROGRAM2 out6=L, out7=R
Configuration 1: PROGRAM1 out0=L, out1=R, out2=C, out3=LFE, out4=Ls, out5=Rs / PROGRAM2 out6=C / PROGRAM3 out7=C
Name matching of the channels:
(FFmpeg) FC, LFE, FL, FR, SL, SR, BL, BR = C, LFE, L, R, Ls, Rs, Lb, Rb (Minnetonka, MediaInfo)
SMPTE 5.1 layout: L, R, C, LFE, Ls, Rs
Film 5.1 layout: L, R, C, Ls, Rs, LFE
SMPTE 7.1 layout: L, R, C, LFE, Ls, Rs, Lb, Rb
Film 7.1 layout: L, R, C, Ls, Rs, Lb, Rb, LFE
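For reference, the name matching and layouts above can be written as a small lookup table (just the names already listed above, nothing new):

```python
# Channel-name matching: FFmpeg names -> Minnetonka/MediaInfo names.
FFMPEG_TO_MINNETONKA = {
    "FC": "C", "LFE": "LFE", "FL": "L", "FR": "R",
    "SL": "Ls", "SR": "Rs", "BL": "Lb", "BR": "Rb",
}

SMPTE_5_1 = ["L", "R", "C", "LFE", "Ls", "Rs"]
FILM_5_1  = ["L", "R", "C", "Ls", "Rs", "LFE"]

# Same six channels, only the ordering differs:
assert sorted(SMPTE_5_1) == sorted(FILM_5_1)
print(FFMPEG_TO_MINNETONKA["SL"])  # Ls
```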
Depends on the Dolby E Program Configuration.
In the case of Configuration 0, hit these two commands to create a 5.1 surround mix (Dolby_E_P1) and a stereo mix (Dolby_E_P2). In the case of Configuration 1, only the first command is required, and for consistency, out6 and out7 could respectively be renamed Dolby_E_P2 and Dolby_E_P3.
ffmpeg -i out0.wav -i out1.wav -i out2.wav -i out3.wav -i out4.wav -i out5.wav -filter_complex "[0:a][1:a][2:a][3:a][4:a][5:a]join=inputs=6:channel_layout=5.1(side):map=0.0-FL|1.0-FR|2.0-FC|3.0-LFE|4.0-SL|5.0-SR[a]" -map "[a]" -c:a pcm_s24le Dolby_E_P1.wav
ffmpeg -i out6.wav -i out7.wav -filter_complex "[0:a][1:a]join=inputs=2:channel_layout=stereo:map=0.0-FL|1.0-FR[a]" -map "[a]" -c:a pcm_s24le Dolby_E_P2.wav
Since the Dolby E encoded audio (whatever the bit depth) is decoded as 24-bit audio in STEP 1, codec pcm_s24le is used in STEP 2 too.
Taylor Swift - Shake It Off 2014 VMA Awards 1080i HDTV Dolby E 36Mbps MPEG-2-CtrlHD
Download FFmpeg from http://ffmpeg.org/download.html and place the media file in the bin folder. Then hit these commands (implementing combination 1) in cmd:
1) cd C:\Users\Atlas\Downloads\ffmpeg-2021-01-12-git-ca21cb1e36-full_build\bin
2) ffmpeg -non_pcm_mode copy -i "Taylor Swift - Shake It Off 2014 VMA Awards 1080i HDTV Dolby E 36Mbps MPEG-2-CtrlHD".ts -map 0:2 -c:a pcm_s24le -f s24le out.dat
3) ffmpeg -i out.dat
4) ffmpeg -i out.dat -c:a pcm_s24le -ar 48000 -map_channel 0.0.0 out0.wav -c:a pcm_s24le -ar 48000 -map_channel 0.0.1 out1.wav -c:a pcm_s24le -ar 48000 -map_channel 0.0.2 out2.wav -c:a pcm_s24le -ar 48000 -map_channel 0.0.3 out3.wav -c:a pcm_s24le -ar 48000 -map_channel 0.0.4 out4.wav -c:a pcm_s24le -ar 48000 -map_channel 0.0.5 out5.wav -c:a pcm_s24le -ar 48000 -map_channel 0.0.6 out6.wav -c:a pcm_s24le -ar 48000 -map_channel 0.0.7 out7.wav
5) ffmpeg -i out0.wav -i out1.wav -i out2.wav -i out3.wav -i out4.wav -i out5.wav -filter_complex "[0:a][1:a][2:a][3:a][4:a][5:a]join=inputs=6:channel_layout=5.1(side):map=0.0-FL|1.0-FR|2.0-FC|3.0-LFE|4.0-SL|5.0-SR[a]" -map "[a]" -c:a pcm_s24le Dolby_E_P1.wav
6) ffmpeg -i out6.wav -i out7.wav -filter_complex "[0:a][1:a]join=inputs=2:channel_layout=stereo:map=0.0-FL|1.0-FR[a]" -map "[a]" -c:a pcm_s24le Dolby_E_P2.wav
Replace the path in the 1st command with your own.
The 3rd command shows Configuration 0.
Instead of the 4th command, these can be used (implementing combination 2):
ffmpeg -i out.dat -c:a pcm_s24le -ar 48000 abc.wav
Decoding Dolby E to a 7.1 "pseudo surround" mix file with L, R, C, LFE, Lb, Rb, Ls, Rs channel layout.
Channels L, R, C, LFE, Lb, Rb of the 7.1 "pseudo surround" mix make the L, R, C, LFE, Ls, Rs channels of the 5.1 surround mix, and channels Ls, Rs of the 7.1 "pseudo surround" mix make the L, R channels of the stereo mix.
ffmpeg -i abc.wav -c:a pcm_s24le -map_channel 0.0.0 out0.wav -c:a pcm_s24le -map_channel 0.0.1 out1.wav -c:a pcm_s24le -map_channel 0.0.2 out2.wav -c:a pcm_s24le -map_channel 0.0.3 out3.wav -c:a pcm_s24le -map_channel 0.0.4 out4.wav -c:a pcm_s24le -map_channel 0.0.5 out5.wav -c:a pcm_s24le -map_channel 0.0.6 out6.wav -c:a pcm_s24le -map_channel 0.0.7 out7.wav
Splitting the 7.1 "pseudo surround" mix into individual per-channel files.
Dolby E carries important metadata, such as DialNorm, Dynamic Range Control, etc., which are to be applied to the AC-3 encoder for television broadcast. Note that from one Broadcasting Station to another, Dolby E may be transcoded to Dolby E, and thus metadata might be applied there too. Furthermore, regarding the Program Configurations that include a 2-channel (L, R) Program, it might actually not be stereo but Dolby Surround (L, R, C, S) encoded as two channels (Lt, Rt). This information is located in the Dolby E metadata. The (Lt, Rt) channels can regenerate the (L, R, C, S) channels if a Dolby Pro Logic Surround Decoder is available, or be played back as plain stereo (L=Lt, R=Rt).
Dolby E is a tricky codec. In the Broadcasting Industry, hardware Dolby E encoding and decoding equipment is favored over software solutions. However, that hardware equipment introduces a delay to the audio upon encoding and decoding. This delay is equivalent to the time duration of a frame of the accompanying video. To address this issue, two methods are most commonly used.
1. Induce one-frame delay to the video, whenever the audio is encoded or decoded by hardware equipment. Thus, the video and audio will always be in sync.
2. Delay the video by a total of two frames before the audio goes through the hardware encoder and subsequently the hardware decoder. So, in the interval between the hardware encoder and the hardware decoder, the audio will be ahead of the video by a time difference equal to the duration of one video frame.
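The one-frame delay discussed above is easy to quantify (a sketch; the frame rates are examples, not taken from the thread):

```python
from fractions import Fraction

# Duration of one video frame in milliseconds, which is also the audio
# delay introduced by one hardware Dolby E encode or decode pass.
def one_frame_delay_ms(frame_rate):
    return 1000 / Fraction(frame_rate)

print(float(one_frame_delay_ms(25)))                     # 40.0 ms (PAL rates)
print(float(one_frame_delay_ms(Fraction(30000, 1001))))  # ~33.37 ms (NTSC rates)
```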
When a FEED is being transmitted from one Broadcasting Station to another, it's unknown whether the video and Dolby E are in sync or out of sync during the signal transmission. That depends on the method used to manage the Dolby E audio delay. With the 1st method, the video and Dolby E are in sync during transmission. But with the 2nd method, Dolby E is ahead of the video by a time difference equal to the duration of one video frame. Consequently, the only ones who know are the engineers of the Broadcasting Stations.
Now, the Dolby E stream is usually accompanied by one or more stereo MP2 streams in satellite FEED signals. The delay induced upon the Dolby E stream does not affect these MP2 streams. At least one MP2 stream contains the live event main audio mix, and that MP2 stream is almost certainly supposed to be in sync with the video. In addition, Program 1 of the Dolby E stream most definitely contains the live event main audio mix too. Consequently, in the case of Configuration 0, supposing that Program 2 (stereo) also contains the live event main audio mix (which is the most common practice), a comparison between the waveforms (per channel) of this MP2 track and Program 2 may show whether there's any delay (positive or negative) of Dolby E. A useful indicator would be the points in the time domain where the crests and troughs of the waveforms occur.
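The waveform-comparison idea can be sketched as a brute-force cross-correlation; the signals below are synthetic stand-ins for illustration, not real broadcast audio:

```python
import random

# Estimate the lag (in samples) at which `probe` best lines up with `ref`
# by brute-force cross-correlation over a window of candidate lags.
def best_lag(ref, probe, max_lag):
    def score(lag):
        start = max(0, lag)
        stop = min(len(ref), len(probe) + lag)
        return sum(ref[i] * probe[i - lag] for i in range(start, stop))
    return max(range(-max_lag, max_lag + 1), key=score)

random.seed(0)
ref = [random.uniform(-1.0, 1.0) for _ in range(500)]  # stand-in "MP2" channel
probe = ref[40:]   # stand-in "Program 2" channel, starting 40 samples late
print(best_lag(ref, probe, 100))  # 40
```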
In the Broadcasting Industry, the most common specifications for Dolby E are the following.
1. Program Configuration 0, which is a 5.1 surround track and a stereo or (Lt, Rt) track.
2. The bit depth of the Dolby E data mode is 20 bits.
3. The bit depth of the Dolby E encoded audio is 16 bits.
4. The LFE channel is almost silenced (low amplitude levels).
Last edited by delta10; 9th Mar 2021 at 17:59.
I give only 1 wee bit of criticism that I notice so far: the bit about sample rates is misleading, as the possibility of variation there is only WRT trick-play VTRs. And in doing so, it outputs at those rates, giving faster+higher pitched or slower+lower pitched audio, to MATCH the faster or slower playback framerates. Otherwise it is ALWAYS locked to 48kHz.
See this document as reference: https://developer.dolby.com/globalassets/professional/dolby-e/dolby-e-tech-doc_1.2.pdf
Thanks for the feedback! In the document you're referring to, there are descriptions about that matter in:
Chapter 4: Transport Specifications, 4.1 Video Synchronization, 4.1.1 Frame Rate Synchronization
Chapter 5: System Integration Issues, 5.2 Program Play Feature
Chapter 9: Chapter for Specific Product Categories, 9.1 Storage Products, 9.1.1 Video Tape Recorders (VTRs)
FEED and Dolby E specs are attached. If the parameter -ar 48000 is not utilized, the decoded LPCM will have a sample rate of 53706 Hz.
Last edited by delta10; 7th Mar 2021 at 12:30.
Working in the broadcast industry, I have a perfect understanding of how we use it internally (in the form of files, mostly in MXF), but what I haven't figured out yet is how it makes sense to transport it over satellite. I mean, what end customer's device is able to deal with it?
As I am already here... here is how I decode Dolby E from an MXF file that has a 5.1 config (just as described above) in one shot.
There are 2 ffmpeg instances: one demuxing the 2 PCM channels that contain my 16-bit Dolby E and writing to STDOUT, and a second one reading live from STDIN and writing all channels to separate WAV files.
The input is a typical broadcaster's format: MXF XDCAMHD, meaning we have 8 tracks of 1 channel each. In this example, we take the Dolby E from channels 3/4 (which are 2/3 from the ffmpeg perspective when using the 0:a:X syntax).
ffmpeg.exe -i "___INPUT___" -filter_complex "[0:a:2][0:a:3]join=inputs=2:channel_layout=stereo[a]" -map "[a]" -c:a pcm_s16le -ar 48000 -f s16le - | ffmpeg -i - -acodec pcm_s24le -ar 48000 -ac 1 -map_channel 0.0.0 e:\temp\ch0.wav -acodec pcm_s24le -ar 48000 -ac 1 -map_channel 0.0.1 e:\temp\ch1.wav -acodec pcm_s24le -ar 48000 -ac 1 -map_channel 0.0.2 e:\temp\ch2.wav -acodec pcm_s24le -ar 48000 -ac 1 -map_channel 0.0.3 e:\temp\ch3.wav -acodec pcm_s24le -ar 48000 -ac 1 -map_channel 0.0.4 e:\temp\ch4.wav -acodec pcm_s24le -ar 48000 -ac 1 -map_channel 0.0.5 e:\temp\ch5.wav
ffmpeg.exe -i "___INPUT___" -filter_complex "[0:a:2][0:a:3]join=inputs=2:channel_layout=stereo[a]" -map "[a]" -c:a pcm_s16le -ar 48000 -f s16le - | ffmpeg -y -i - c:\temp\out.wav
Last edited by emcodem; 18th May 2021 at 10:07.
Nice! Your input is very interesting. Most people don't work in the Broadcasting Industry, and therefore don't have access to MXF files with Dolby E audio.
Regarding the satellite FEEDS that include Dolby E audio, they're not meant for end consumers. Dolby E is supposed to not degrade the audio quality greatly. Thus, instead of transcoding an MP2 track from the FEED for TV Broadcast, it's preferable to transcode a Dolby E track in order to achieve the highest possible quality. But I think that you already know all of that. However, there are people who manage to intercept and decrypt these signals, using the right equipment and software. BISS encryption (older protection system) has been bypassed, while BISS2 (newer protection system) hasn't yet. BISS is slowly being replaced by BISS2. Finally, Dolby E might be transcoded to FLAC or DTS-HD MA, and the resulting file is then shared on the Internet.
P.S. Old versions of Steinberg Nuendo, Minnetonka SurCode (Dolby E), and DTS-HD MAS have properly been cracked a long time now. That's how some people (end consumers) deal with Dolby E.
Last edited by delta10; 1st Apr 2021 at 08:44.
Thanks for the enlightening information. I had a feeling it could only be about that, but I wouldn't want to believe that the codec was implemented in FFmpeg just for that use case. Thinking about it twice, it makes sense, because no broadcaster would ever pay for such a gray-area codec implementation. You know, Dolby E should always be a filter instead of a decoder, but in FFmpeg it was implemented as a decoder, which is the reason we need 2-step processing (or at least 2-ffmpeg-instance processing).
By the way, I should have mentioned it in my first answer: thanks for the guide, it's the best connection between Dolby E and how it works in FFmpeg I've ever found!
Last edited by emcodem; 1st Apr 2021 at 15:43.