DV file 4Ch audio stream separation

21st Oct 2014 19:27 #1

I have been capturing content from a Sony DVCAM tape device over FireWire with Enosoft DV Processor.
The DVCAM was set to record 4 audio channels at 32 kHz, 12-bit, so that is how the tape was recorded.
After capture, the audio came out as two stereo pairs.

Inside the DV container there is a DV25 video stream and 2 audio streams, each containing one stereo pair.

Code:

Format                                   : DV
File size                                : 743 MiB
Duration                                 : 3mn 36s
Overall bit rate mode                    : Constant
Overall bit rate                         : 28.8 Mbps

Video
Format                                   : DV
Duration                                 : 3mn 36s
Bit rate mode                            : Constant
Bit rate                                 : 24.4 Mbps
Width                                    : 720 pixels
Height                                   : 576 pixels
Display aspect ratio                     : 4:3
Frame rate mode                          : Constant
Frame rate                               : 25.000 fps
Standard                                 : PAL
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Interlaced
Scan order                               : Bottom Field First
Compression mode                         : Lossy
Bits/(Pixel*Frame)                       : 2.357
Stream size                              : 630 MiB (85%)

Audio #1
ID                                       : 0
Format                                   : PCM
Duration                                 : 3mn 36s
Bit rate mode                            : Constant
Bit rate                                 : 768 Kbps
Encoded bit rate                         : 0 bps
Channel(s)                               : 2 channels
Sampling rate                            : 32.0 KHz
Bit depth                                : 12 bits
Stream size                              : 19.8 MiB (3%)

Audio #2
ID                                       : 1
Format                                   : PCM
Duration                                 : 3mn 36s
Bit rate mode                            : Constant
Bit rate                                 : 768 Kbps
Encoded bit rate                         : 0 bps
Channel(s)                               : 2 channels
Sampling rate                            : 32.0 KHz
Bit depth                                : 12 bits
Stream size                              : 19.8 MiB (3%)
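
As a sanity check, the audio figures in this report follow from plain PCM arithmetic. The sketch below is in Python only so that it runs anywhere, and the function names are made up for illustration. The 16-bit case assumes the decoder expands each 12-bit sample to 16 bits, which would match the pcm_s16le streams at 1024 kb/s that ffmpeg lists for this file.

```python
def pcm_bitrate(sample_rate_hz, bit_depth, channels):
    """Bit rate of uncompressed PCM in bits per second."""
    return sample_rate_hz * bit_depth * channels


def stream_size_mib(bitrate_bps, duration_s):
    """Size of a constant-bit-rate stream in MiB (1 MiB = 1024 * 1024 bytes)."""
    return bitrate_bps * duration_s / 8 / (1024 * 1024)


stored = pcm_bitrate(32000, 12, 2)    # as reported by MediaInfo: 768000 bit/s
decoded = pcm_bitrate(32000, 16, 2)   # assumed 16-bit expansion: 1024000 bit/s
duration = 3 * 60 + 36                # "3mn 36s" = 216 seconds

print(stored, decoded, round(stream_size_mib(stored, duration), 1))
# prints: 768000 1024000 19.8
```

The 19.8 MiB result agrees with the stream size reported for each audio stream.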

My goal is to wrap all of this into a .MOV container with four separate audio channels, by splitting the two stereo streams into four separate mono (or four separate dual-mono) audio streams. I want to do this without NLE software, preferably with the ffmpeg command line or something similar, on Windows.

First I tried simply extracting the existing audio streams with ffmpeg and ffmbc, like this:

Code:

ffmpeg -i input.dv -filter_complex channelsplit out.mka

The problem is that only the first audio stream is processed, and there is an error accessing the second.
Note that the input file is not corrupted: I can play it and select and hear both audio streams correctly (two channels at a time).

Code:

Input #0, dv, from 'TEST_21102014_RAWDV.dv':
  Metadata:
    timecode        : 20:15:48:23
  Duration: 00:03:36.36, start: 0.000000, bitrate: 28800 kb/s
    Stream #0:0: Video: dvvideo, yuv420p, 720x576 [SAR 16:15 DAR 4:3], 28800 kb/s, 25 fps, 25 tbr, 25 tbn, 25 tbc
    Stream #0:1: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
    Stream #0:2: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
Output #0, matroska, to 'out.mka':
  Metadata:
    timecode        : 20:15:48:23
    encoder         : Lavf56.9.101
    Stream #0:0: Audio: vorbis (libvorbis) (oV[0][0] / 0x566F), 32000 Hz, 1 channels (FL), fltp
    Metadata:
      encoder         : Lavc56.8.102 libvorbis
    Stream #0:1: Audio: vorbis (libvorbis) (oV[0][0] / 0x566F), 32000 Hz, 1 channels (FR), fltp
    Metadata:
      encoder         : Lavc56.8.102 libvorbis
Stream mapping:
  Stream #0:1 (pcm_s16le) -> channelsplit
  channelsplit:FL -> Stream #0:0 (libvorbis)
  channelsplit:FR -> Stream #0:1 (libvorbis)
Press [q] to stop, [?] for help
TEST_21102014_RAWDV.dv: Input/output error129.3kbits/s
size=    3420kB time=00:03:36.35 bitrate= 129.5kbits/s
video:0kB audio:3277kB subtitle:0kB other streams:0kB global headers:6kB muxing overhead: 4.356943%
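
The stream mapping in this log shows channelsplit being fed from stream 0:1 only. As far as the ffmpeg documentation goes, an unlabeled filter input is connected to the first unused input stream of the matching type, which would explain why the second stereo pair is never touched. A minimal sketch of addressing both pairs by explicit label follows. It only builds the argument list (Python, with a hypothetical helper name), is untested on this material, and does not address the Input/output error. The pcm_s16le codec is an assumption, chosen so the audio is not compressed to Vorbis.

```python
def channelsplit_args(audio_streams=(1, 2)):
    """Build ffmpeg arguments that split each stereo input stream into two mono outputs.

    audio_streams: indexes of the stereo streams in input file 0
    (0:1 and 0:2 in this thread).
    """
    chains = []
    maps = []
    for n, idx in enumerate(audio_streams):
        left, right = f"ch{2 * n + 1}", f"ch{2 * n + 2}"
        # Label the input pad explicitly so the second stream is not skipped.
        chains.append(f"[0:{idx}]channelsplit=channel_layout=stereo[{left}][{right}]")
        maps += ["-map", f"[{left}]", "-map", f"[{right}]"]
    return ["-filter_complex", ";".join(chains)] + maps


args = ["ffmpeg", "-i", "input.dv"] + channelsplit_args() + ["-c:a", "pcm_s16le", "out.mka"]
print(args[4])  # the filter graph on its own
```

The list can be handed to subprocess.run(args) as it is, which avoids any shell quoting of the brackets and semicolons.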

Next I tried to extract each channel of the input to a specific output, like this:

Code:

ffmpeg -i input.dv -map_channel 0.1.0 OUTPUT_CH0.mka -map_channel 0.1.1 OUTPUT_CH1.mka

This gave me the first stream as two separate channels in separate files, again with the Input/output error notification.

Code:

Input #0, dv, from 'input_RAWDV.dv':
  Metadata:
    timecode        : 20:15:48:23
  Duration: 00:03:36.36, start: 0.000000, bitrate: 28800 kb/s
    Stream #0:0: Video: dvvideo, yuv420p, 720x576 [SAR 16:15 DAR 4:3], 28800 kb/s, 25 fps, 25 tbr, 25 tbn, 25 tbc
    Stream #0:1: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
    Stream #0:2: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
File 'OUTPUT_CH0.mka' already exists. Overwrite ? [y/N] y
File 'OUTPUT_CH1.mka' already exists. Overwrite ? [y/N] y
-map_channel is forwarded to lavfi similarly to -af pan=0x4:c0=c0.
[pan @ 02b603e0] This syntax is deprecated. Use '|' to separate the list items.
[pan @ 02b603e0] Pure channel mapping detected: 0
-map_channel is forwarded to lavfi similarly to -af pan=0x4:c0=c1.
[pan @ 02c6cea0] This syntax is deprecated. Use '|' to separate the list items.
[pan @ 02c6cea0] Pure channel mapping detected: 1
Output #0, matroska, to 'OUTPUT_CH0.mka':
  Metadata:
    timecode        : 20:15:48:23
    encoder         : Lavf56.9.101
    Stream #0:0: Audio: vorbis (libvorbis) (oV[0][0] / 0x566F), 32000 Hz, mono, fltp
    Metadata:
      encoder         : Lavc56.8.102 libvorbis
Output #1, matroska, to 'OUTPUT_CH1.mka':
  Metadata:
    timecode        : 20:15:48:23
    encoder         : Lavf56.9.101
    Stream #1:0: Audio: vorbis (libvorbis) (oV[0][0] / 0x566F), 32000 Hz, mono, fltp
    Metadata:
      encoder         : Lavc56.8.102 libvorbis
Stream mapping:
  Stream #0:1 -> #0:0 (pcm_s16le (native) -> vorbis (libvorbis))
  Stream #0:1 -> #1:0 (pcm_s16le (native) -> vorbis (libvorbis))
Press [q] to stop, [?] for help
input_RAWDV.dv: Input/output error 65.0kbits/s
size=    1719kB time=00:03:36.35 bitrate=  65.1kbits/s

I was able to send the output to WAV as well, but I could never get channels 3 and 4 separately, because stream 2 is for some reason not accessible from the original file. So with this:

Code:

ffmpeg -i input_RAWDV.dv -map_channel 0.2.0 OUTPUT_CH3.mka -map_channel 0.2.1 OUTPUT_CH4.mka

Stream #0:0: Video: dvvideo, yuv420p, 720x576 [SAR 16:15 DAR 4:3], 28800 kb/s, 25 fps, 25 tbr, 25 tbn, 25 tbc
Stream #0:1: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
Stream #0:2: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s

the result was the original stream 1 in both files.

There is also a method using panning, but since I couldn't reach stream 2 I didn't try it.

Any suggestions on how to get raw DV with two stereo audio streams rewrapped into a .MOV container with four separate audio streams/channels?

Thanks!

21st Oct 2014 20:36 #2
poisondeathray

It's a bit weird in ffmpegland, but you need to specify -map once for each time you use an input stream, and -map_channel explicitly for each input file.stream.channel and output file.stream. That's the reason you couldn't access the 2nd audio stream, 0:2 (you didn't have a -map 0:2). So if you want 5 output streams (1 video, 4 mono audio), you need 5 -map options and 4 -map_channel options.

Code:

ffmpeg -i input.avi -map 0:0 -map 0:1 -map 0:1 -map 0:2 -map 0:2 -c:v copy -c:a copy -map_channel 0.1.0:0.1 -map_channel 0.1.1:0.2 -map_channel 0.2.0:0.3 -map_channel 0.2.1:0.4 output.mov

Stream #0:0: Video: dvvideo, yuv411p, 720x480 [SAR 8:9 DAR 4:3], 28771 kb/s, 29.97 fps, 29.97 tbr, 29.97 tbn, 29.97 tbc
Stream #0:1: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
Stream #0:2: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
Output #0, mov, to 'out.mov':
Metadata:
encoder : Lavf56.0.100
Stream #0:0: Video: dvvideo (dvcp / 0x70637664), yuv411p, 720x480 [SAR 8:9 DAR 4:3], q=2-31, 28771 kb/s, 29.97 fps, 30k tbn, 29.97 tbc
Stream #0:1: Audio: pcm_s16le (sowt / 0x74776F73), 32000 Hz, stereo, 1024 kb/s
Stream #0:2: Audio: pcm_s16le (sowt / 0x74776F73), 32000 Hz, stereo, 1024 kb/s
Stream #0:3: Audio: pcm_s16le (sowt / 0x74776F73), 32000 Hz, stereo, 1024 kb/s
Stream #0:4: Audio: pcm_s16le (sowt / 0x74776F73), 32000 Hz, stereo, 1024 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Stream #0:1 -> #0:1 (copy)
Stream #0:1 -> #0:2 (copy)
Stream #0:2 -> #0:3 (copy)
Stream #0:2 -> #0:4 (copy)

Seems to work OK here on a sample with similar specs (but NTSC, not PAL DV). If it doesn't for you, it might be that your files are slightly different; in that case post a small sample (direct stream copy video & audio in vdub).
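
The "one -map per output stream, one -map_channel per channel" rule can be sketched as a small argument builder. This is an illustration in Python with a hypothetical helper name, not a tested recipe. The audio codec is left as a parameter and defaults to pcm_s16le, which is an assumption: the logs earlier in the thread show -map_channel being forwarded to a pan filter, and filtered audio cannot be stream-copied. More recent FFmpeg releases also mark -map_channel as deprecated in favor of filters, as far as I know, so this applies to builds like the ones in these logs.

```python
def map_channel_args(stereo_streams=(1, 2), video_stream=0, audio_codec="pcm_s16le"):
    """Build -map / -map_channel arguments: one output stream per input channel."""
    args = ["-map", f"0:{video_stream}"]
    channel_maps = []
    out_index = 1  # output stream 0 is the video
    for stream in stereo_streams:
        for channel in (0, 1):
            # Each output audio stream needs its own -map of the source stream...
            args += ["-map", f"0:{stream}"]
            # ...and a -map_channel of the form in_file.stream.channel:out_file.stream
            channel_maps += ["-map_channel", f"0.{stream}.{channel}:0.{out_index}"]
            out_index += 1
    return args + ["-c:v", "copy", "-c:a", audio_codec] + channel_maps


print(" ".join(map_channel_args()))
```

With the defaults this yields five -map options and four -map_channel options, the same counts as described above.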
Last edited by poisondeathray; 21st Oct 2014 at 21:05.
22nd Oct 2014 05:57 #3
logicom

All right, thanks!

Could you please help me now with panning? I am getting 4 channels, but the first two are copies of the original stream 0:1 and the second two are copies of stream 0:2. How could I incorporate panning, something like this:

Code:

-filter_complex "[0:1]pan=1:c0=c0[left1];[0:1]pan=1:c0=c1[right1]"

into the previous command in order to get 4 channels separated by content?
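
A sketch of how panning could be folded in, assuming the '|' separator that the pan filter asks for in the deprecation warning earlier in the thread. It uses asplit to duplicate each stereo stream before the two pan filters, which is a cautious choice rather than something confirmed in this thread. The helper name and pad labels are hypothetical, the code only builds arguments (Python), and it is untested on this material.

```python
def pan_graph(stereo_streams=(1, 2)):
    """Return (filter_graph, output_labels) extracting every channel as a mono stream."""
    chains, labels = [], []
    for stream in stereo_streams:
        left_in, right_in = f"s{stream}a", f"s{stream}b"
        left_out, right_out = f"s{stream}c0", f"s{stream}c1"
        # Duplicate the stereo stream so each pan filter gets its own copy.
        chains.append(f"[0:{stream}]asplit=2[{left_in}][{right_in}]")
        # '|' separates pan items; c0=c0 keeps the first channel, c0=c1 the second.
        chains.append(f"[{left_in}]pan=mono|c0=c0[{left_out}]")
        chains.append(f"[{right_in}]pan=mono|c0=c1[{right_out}]")
        labels += [left_out, right_out]
    return ";".join(chains), labels


graph, labels = pan_graph()
args = ["ffmpeg", "-i", "input.dv", "-filter_complex", graph, "-map", "0:0", "-c:v", "copy"]
for label in labels:
    args += ["-map", f"[{label}]"]
args += ["-c:a", "pcm_s16le", "output.mov"]
print(graph)
```

The video is mapped directly and copied, while the four filtered audio outputs have to be encoded (pcm_s16le here, as an assumption).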

Regards.
22nd Oct 2014 08:26 #4
poisondeathray

Open up the original in Audacity or another audio editor and look at the waveform. Chances are that L and R are identical in the source for each audio track.

22nd Oct 2014 09:45 #5
logicom

Hi,

That is not the case. The waveforms are different, as there are four different languages, one recorded on each channel. Sony DVCAM can record 4 separate tracks at 32 kHz, 12-bit, but those 4 channels are organized as two stereo pairs, so I now have to use the pan feature to separate them and remux them individually. I managed to do it with "FOCUS HD File Converter Pro", rewrapping to MOV without transcoding, but that tool is not as good as ffmpeg (I get double-speed video, 50 frames/sec, at playout), so I would like to stick with ffmpeg.

Regards.

Last edited by logicom; 22nd Oct 2014 at 09:55.

22nd Oct 2014 09:55 #6
poisondeathray

Post a short sample and I'll have a look at it later

22nd Oct 2014 10:07 #7
logicom

Hi,

Sorry, but that wouldn't be convenient because of the ownership of this material.
I will simply have to figure out the ffmpeg command line myself, unless someone can help formulate it for this particular task.

Regards.

Last edited by logicom; 22nd Oct 2014 at 11:33.

22nd Oct 2014 13:58 #8
poisondeathray

You're right. I looked closer, and the previous code didn't work.

So apparently -acodec copy (or -c:a copy) won't work for this with ffmpeg. You would need to convert up to 16-bit.

Try ffmbc; this should work:

Code:

ffmbc -i input.dv -vcodec copy -an output.mov -acodec pcm_s16le -newaudio -map_audio_channel 0:1:0:0:1:0 -acodec pcm_s16le -newaudio -map_audio_channel 0:1:1:0:2:0 -acodec pcm_s16le -newaudio -map_audio_channel 0:2:0:0:3:0 -acodec pcm_s16le -newaudio -map_audio_channel 0:2:1:0:4:0

If it still doesn't work properly, you can break it out into steps with an ffmpeg / ffmbc batch file.
22nd Oct 2014 15:29 #9
Cornucopia

Is the fact that those DV streams are 12bit and may not be parsed correctly (as there is no "pcm_s12le" option in ffmpeg) a complicating factor?

For this problem, however, I think I would open the file in an NLE that is known to work correctly with 4ch 12bit 32kHz LPCM DV audio, such as PremierePro or the other pro NLEs. Then convert to 16bit 48kHz LPCM x 4 mono channels - it will be much more workable in that form.

Scott

22nd Oct 2014 15:49 #10
poisondeathray

^ Yes, that is why -acodec copy or -c:a copy didn't work.

But that last code with ffmbc should work because it uses 16-bit "-acodec pcm_s16le". I made a fake dual-stereo file to test, with different channel waveforms (so it is easily examined in an audio editor), and it worked OK.

23rd Oct 2014 12:08 #11
logicom

Hi,

This worked with ffmbc.

Thanks!

28th Oct 2014 09:55 #12
logicom

But,

there is a problem with lip sync. When I play the .dv file it is perfect, but in the MOV the audio runs ahead of the picture by around 150-200 ms.
Is there anything that can be done with respect to the timecode, because simply creating new audio and appending it doesn't work well? And there must not be a second pass with -itsoffset, because that is too rigid. So the question is how to maintain perfect sync in the MOV, as in the .dv file, while mapping the channels as separate streams.
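
To put the reported offset in perspective, it can be expressed in video frames and audio samples. This is a small arithmetic sketch in Python; the function names are illustrative, and the 25 fps and 32 kHz defaults are taken from the file report at the top of the thread.

```python
def offset_in_frames(offset_ms, fps=25):
    """Offset expressed in video frames."""
    return offset_ms * fps / 1000


def offset_in_samples(offset_ms, sample_rate_hz=32000):
    """Offset expressed in audio samples, rounded to a whole sample."""
    return round(offset_ms * sample_rate_hz / 1000)


for ms in (150, 200):
    print(ms, offset_in_frames(ms), offset_in_samples(ms))
# prints:
# 150 3.75 4800
# 200 5.0 6400
```

So the reported error is roughly four to five PAL frames.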

Regards.

28th Oct 2014 11:59 #13
poisondeathray

I don't know; I don't have an authentic sample to play with that is long enough to test sync.

Did you try the ffmpeg version, just changing -c:a copy to -c:a pcm_s16le?

Stream mapping works fine (recall that the issue was 12-bit PCM, which ffmpeg/ffmbc has no native implementation for, so it can't be "copied"), but I can't properly test sync.

Code:

ffmpeg -i input.dv -map 0:0 -map 0:1 -map 0:1 -map 0:2 -map 0:2 -c:v copy -c:a pcm_s16le -map_channel 0.1.0:0.1 -map_channel 0.1.1:0.2 -map_channel 0.2.0:0.3 -map_channel 0.2.1:0.4 output2.mov
29th Oct 2014 05:25 #14
logicom

Hi,

Yes, I did try, but I got the same result with ffmpeg and ffmbc on longer samples (approx. 1.5 hours): A/V sync is not there, it drifts.
One of the outputs I made is better in sync (less drift), but I made that one with soxr invoked:

Code:

ffmpeg -i output.dv -map 0:0 -map 0:1 -map 0:1 -map 0:2 -map 0:2 -c:v copy -af "aformat=sample_fmts=fltp,aresample=resampler=soxr:osr=44100:dither_method=0" -acodec pcm_s16le -map_channel 0.1.0:0.1 -map_channel 0.1.1:0.2 -map_channel 0.2.0:0.3 -map_channel 0.2.1:0.4 output.mov

I am not sure how to get perfect sync as in the original .dv file. It seems like a muxer problem?

Regards.
Last edited by logicom; 29th Oct 2014 at 12:03.
30th Oct 2014 18:13 #15
logicom

Hi,

Would it be wise to use the libswresample library with arguments similar to these:

Code:

-filter:a aresample=48000:async=1:min_comp=0.01:comp_duration=1:max_soft_comp=256000000:min_hard_comp=0.1

Maybe I misunderstood max_soft_comp, but I put in a number that reflects the maximum expected A/V length expressed in audio frames.

As stated in the documentation:

-async samples_per_second

Audio sync method. "Stretches/squeezes" the audio stream to match the timestamps, the parameter is the maximum samples per second by which the audio is changed. -async 1 is a special case where only the start of the audio stream is corrected without any later correction.
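
Since the quoted parameter is a number of samples per second, it can help to estimate how many samples per second a given drift actually amounts to. The sketch below (Python, illustrative function names) uses hypothetical figures: 0.2 s of drift spread evenly over 1.5 hours of 32 kHz audio. The thread does not say that the offset was measured over that duration, so the combination is an assumption. For comparison, the libswresample documentation describes max_soft_comp as a stretch factor rather than a frame count, as far as I can tell.

```python
def drift_rate_ppm(drift_s, duration_s):
    """Average drift in parts per million over the programme duration."""
    return drift_s / duration_s * 1e6


def samples_per_second_needed(drift_s, duration_s, sample_rate_hz):
    """Average number of samples per second to add or drop to absorb the drift."""
    return drift_s / duration_s * sample_rate_hz


# Hypothetical figures: 0.2 s of drift accumulated over 1.5 hours of 32 kHz audio.
drift, duration, rate = 0.2, 1.5 * 3600, 32000
print(round(drift_rate_ppm(drift, duration), 1))                    # 37.0
print(round(samples_per_second_needed(drift, duration, rate), 2))   # 1.19
```

Under these assumptions a correction of only a sample or two per second would be enough, far below any audible stretch.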

Just out of curiosity, how would it work to use the async option in a different way, maybe with a value in frames per second, or a value giving how often the alignment occurs per segment? How does this correlate with the comp_duration value?

What about influencing the muxer through AVFormatContext, by reducing the max_delay integer or using the no-buffer option?

Could someone please assist with this?

Regards.
30th Oct 2014 18:18 #16
poisondeathray

I don't know what those audio functions do exactly, but I suggest you open a new thread dedicated to that; otherwise someone who *does* know might miss it.
