Is there a way to tag and mux two AVC video streams and a sound stream into a well-known container with some magic tags that are likely to indicate 3D-ness to applications or devices that may read them?
I can build the described file and ffprobe/mediainfo confirm the desired streams, but none of my apps seem to see the second stream or sense any 3D-ness.
I am starting my workflow with separate MP4-wrapped AVC L/R clips (from a sync'd dual GoPro rig), and trying to re-mux them into some 'well known' dual-stream container (emulating the Fuji 3D AVIs or the newer BD3D m2ts formats), such that the result contains my original AVC video streams and sound, and looks like '3D' to devices/apps that respond to that flag. I especially need the Magix VP-X editor to 'see' the 3D-stereo, the way it sees the Fuji AVIs and m2ts dual-stream clips.
In a nutshell, I'm looking to use remuxing tools and some magic metadata to build a general dual-stream source-file format that behaves like the Fuji or BD3D formats, but contains both of my *original* AVC L/R streams and at least one of the original sound streams. This would be useful for anybody's 3D-stereo production workflow.
While viable, simply converting to a full-frame SbS in a lossless RGB format (Lagarith/Cineform/etc.) seems like a brute-force (and large) solution when a clever remux and tagging of the original data streams may be possible.
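(For scale, the brute-force version is basically a one-liner; the file names and the Ut Video codec here are stand-ins for whatever lossless intermediate you prefer, and it assumes the L/R clips are already frame-synced:)

```sh
# Stack frame-synced L/R clips into a full-frame side-by-side AVI,
# losslessly encoded with Ut Video (names are placeholders).
ffmpeg -i left.mp4 -i right.mp4 \
  -filter_complex "[0:v][1:v]hstack=inputs=2[sbs]" \
  -map "[sbs]" -map 0:a -c:v utvideo -c:a copy sbs_lossless.avi
```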
The thing I seem to be missing isn't the actual construction of the desired dual-stream output file; rather, virtually none of the target apps/devices recognize the presence of the second stream, and they simply behave like it's a normal video clip. Not exactly a failure, but not solving my problem either...
MediaInfo and ffprobe see the various streams, and demuxing the result shows that the streams have maintained their integrity, so I'm looking to find the ultimate container, stream, and metadata combination that would let me package two original video streams (AVC or other?) into a single container and appear 3D-ish enough to any/all/my apps to complement my workflow.
My efforts to add metadata have been mixed: some elements 'take' (e.g. title=yadayada), and other metadata directives don't (e.g. StereoView_Count=2)...
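A concrete example of what I mean (paths are placeholders): remuxing both AVC streams plus audio into one MP4 with ffmpeg, where a generic tag like title survives, but an arbitrary key like StereoView_Count gets silently dropped because no MP4 field maps to it:

```sh
# Remux both AVC streams plus one audio track into a single MP4
# without re-encoding; 'title' maps to a real container field,
# but arbitrary keys like StereoView_Count do not survive.
ffmpeg -i left.mp4 -i right.mp4 \
  -map 0:v -map 1:v -map 0:a -c copy \
  -metadata title="yadayada" \
  -metadata StereoView_Count=2 \
  dual_stream.mp4
```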
Even the most verbose container and stream header viewers don't seem to indicate *obvious* differences between my new files and those that 'work'. Some files show differences, but I can't replicate those flags in my files with the tools I know of.
I'm also very willing to 'cheat' and use 'header foo' or bogus tool options to lie to my apps about what's inside the files, so long as I can browse and preview the files in a fairly normal way. @Cornucopia once mentioned an AVI hack (I think) where he fibbed to a tool about a video stream being a sound stream and tweaked it back later to useful effect. I'm all about creative hacks if needed...
One thing that occurred to me was putting both original AVC video streams in a BD3D m2ts container and, rather than 2D+delta, just having full 2D-L + 2D-R... would that be allowed? And do FRIM/TsMuXeR let you try such a thing? My efforts failed because I didn't feed them the 'MVC' file required for the 3D mode, but maybe there's a trick to try...
Appreciating the wealth of discussion on these forums (and doom9, and dvinfo, and ...), I've combed (for days) and *really* enjoyed reading the many conversations regarding this apparently common interest, but am concluding that I've missed something so basic that nobody bothers to mention it, or perhaps there's no method or consensus, given the usual trade-offs and variety of 'standards'.
After lurking for years here (you folks have collectively tutored me through most of what I describe above), I feel like I've done my homework, and humbly look to the wisdom of this community to help me around this final file-building stage.
so... Is there a way to tag and mux two AVC video streams and a sound stream into a well-known container with some magic tags that are likely to indicate 3D-ness to applications or devices that may read them?
cheers and TIA,
About to head out to work now so can't fully discuss, but couple of things:
1. MKV & WMV have known flags; MP4 & M2TS basically use a combination of NAL/SEI metadata. AVI & MOV have always been a crap shoot.
2. It's never been just about the input/storage/muxing. The other part of the equation is the enhanced/recognizing player/reader, and that's where many apps have never progressed to.
My FUJI3D clips work because I use Stereoscopic Player on PC (and begged & begged the author to include this format we're talking about, until he did), and the built-in media player on my LG 3D TVs, which already supports it.
Off the top of my head, I couldn't say which, but I do know I and a few others (see history) were able to mux into MKV. And a couple of players, incl. Stereoscopic Player, support this.
True - BD3D uses an AVC+MVC (base + dependent/derivative) stream combination, but it outputs via HDMI using standard 3D packing options, and it's certainly possible that the SEI metadata can be added - like 3DBD - to signal HDMI to enable the 3D packing option with dual-muxed files. Of course, that again requires a player that supports dual-muxed full streams, and the player must be able to composite the images as needed to fit the packing mode.
<edit>...Just noticed PowerDVD supports MK3D.</edit>
Last edited by Cornucopia; 21st Feb 2018 at 13:33.
thanks for the quick reply.
re: WMV - right or wrong, the Windows-centric flavor of WMV (and the similar Apple bias of MOV) has me shying away from using either over a more 'standards-based' MP4/M2TS/MKV container, but... if something can wrap my original data (?), has the 3D flags, and has decent device/app support, I have no problem going that way, esp. since I tend to use Windows workflows in spite of my *nix preferences.
Being a crapshoot, I mention AVI mostly because the FUJI3D format seems to 'just work' in more 3D app/device contexts than any of the other formats I've run into. I wish I could just replace the streams with my own and tweak the headers just enough to tell the various apps to at least *try* decoding my AVC streams vs. the native MJPEGs used by Fuji.
I'm wondering if you or anyone has the magic NAL/SEI metadata recipe for MP4/m2ts files that would convince apps and devices that there was great 3D stuff inside. Any guidance on defining and injecting this sort of metadata into a custom file is appreciated. I'm happy to hammer through variations that might work if I can figure out how to tell my muxing tools (ffmpeg/MP4Box/???/etc.) what I'd like the file to look like. It looks like MP4maker was written to solve a similar problem and might be hackable to this end as well. I'd consider this path if it felt likely to work.
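One SEI recipe I do know of is the H.264 frame-packing SEI that x264 can write - though it only applies to an already-packed (e.g. SbS) single stream and requires a re-encode, so it's a signaling experiment rather than the lossless dual-stream answer (file names are placeholders):

```sh
# Encode a packed SbS source while writing the H.264 frame-packing
# SEI (3 = side-by-side); note this re-encodes the video.
ffmpeg -i sbs_source.mp4 -c:v libx264 \
  -x264-params frame-packing=3 \
  -c:a copy sbs_fpa_tagged.mp4
```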
With this in mind, I figure (?) that for my long-term workflow, hacking together a container with specific flags - creating a format that doesn't 'bother' the usual viewing and handling apps, but does 'trigger' 3D features in the devices I use or can get (through attrition) - would be worth the effort as I build and convert my collection for the long term. Clearly no holy grail here, but something that works 'better than most' would be a great baseline, and could be remuxed later to a newer format, so long as the original data streams are preserved.
yeah, an AVI with the FUJI3D formatting stuffed with my original data streams would be quite sufficient...
Sorry about the BD3D m2ts language. I don't think m2ts is 3D-specific, but it is the suffix of the 3D MVC Blu-ray files I've been generating, so I'm trying to find a way of describing an m2ts (suffixed) file that contains the 3D-specific 2D+delta dual-stream data layout being discussed here. I'll happily use 'MVC' if that's the correct/preferred term for Blu-ray 3D. Semantics matter to me. I still wonder if MVC (Multiview Video Coding - a codec, not a container, as I understand it) can successfully carry arbitrary 'multi-videos' in a way that's meaningful to the various 3D apps/devices out there...
I've found that TsMuXeR can generate such a dual-stream (AVC) m2ts file, but it doesn't have any 3D flagging and doesn't trigger any 3D behavior in my apps or devices (without the flags, this doesn't surprise me...). I'd like to 'lie' about the 3D-ness in the file and see if any of the apps happen to do the desired thing...
Last edited by mindsong; 21st Feb 2018 at 14:08. Reason: semantic clarification and added TsMuXeR thought
Matroska does define a way to bind two tracks into one stereo picture: "TrackOperation"/"TrackCombinePlanes". And that is something I haven't seen in any software ever.
You can of course simply mux 2 video tracks into a single file like you can mux 2 audio tracks or 2 subtitle tracks. But there won't be any information in the file about how those 2 tracks are supposed to be 2 views of the same movie.
Maybe there is a player that can play such files as 3D anyways? IDK.
I would cynically speculate that almost all of the FUJI3D support is more closely triggered by the Fuji-specific metadata, rather than the general dual-stream formatting of the file, and/or any general 3D tagging in those same files. This implies that trying to emulate that format is probably a really ugly hack, and given the 1280x720 24fps native format of that device, I wouldn't count on any success trying to stuff 1080p 60fps into that envelope...
Scott, your comment about the packing and presentation dynamics hits the nail on the head. It would be great if the container specs and stream specs were more consistently treated as separate cues to the supporting applications, but the bulk of programmers seem to merge the two to treat such files with a single signature and either recognize/support them or not. E.g. you would think that a 3D-TV that successfully decodes AVC pairs in one container (e.g. MP4) could recognize those same AVCs in another (e.g. MK3D), when it can clearly parse an MK3D with mpeg2 streams.
I think I'll switch from my MP4 dual-stream experiments to MKV/MK3D and see if my specific apps respond any better. It feels like either container has tools and support, and having 3D-specific flags is nice, even if not widely supported (yet).
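For the MKV/MK3D experiments, mkvmerge exposes the StereoMode flag directly (a sketch with placeholder file names; note it flags one packed track - it can't bind two separate tracks into a stereo pair):

```sh
# Remux a packed SbS source to MK3D, setting StereoMode on track 0
# ('0:' is the track ID within the source file). No re-encode.
mkvmerge -o clip.mk3d \
  --stereo-mode 0:side_by_side_left_first \
  sbs_source.mp4
```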
*edit*: that is, unless someone can confirm, or believes, that the MVC-formatted dual-AVC experiments with the right tags might lead to some success. So far, dual AVCs in an MVC container without any 3D tags/hacks don't do what I need (not surprised).
Last edited by mindsong; 21st Feb 2018 at 15:08. Reason: added MVC thought
Much like I use AVI for lossless intermediates, my quest here is preview-able source and intermediate files, and certainly not for efficient presentation outputs. This doesn't really align with most application and device development goals, so I appreciate that this exercise will probably never be optimal. I would bet that if anyone out there knows what's viable, they'd be in this community... Can't hurt to ask, and the feedback is already quite compelling.
bingo - that's where I am now... So I fear, when forced to interleave and otherwise re-format/consolidate the actual streams, that the general goals - format simplicity, keeping the originals 'pure', and merely using the packaging and flags as a trigger for the apps - start to get lost in the hackery. urg. I wonder whether there's anything out there that can see the two separate streams and be told to present them in an SbS or similar mode, rather than forcing a consolidation of the streams, etc.
Probably time to go build some new file variations and see how the apps/devices I've got respond. Any suggestions are welcomed, but I've got a few ideas I can go with right away. Thanks for the insights.
This link from 2013: https://www.google.com/url?sa=t&source=web&rct=j&url=https://mpeg.chiariglione.org/sit...2mEgrvpY0AzzwC
It explains, at the very end, a way to include two possibly full-rez and fully independent image streams in MPEG-2 and AVC encoding, using some stereo program/video descriptors as signaling metadata for the (dual-muxed) service-compatible streams.
I don't think they were envisioning what you and I were hoping for, but they left the door open. Should be able to be supported by m2ts and/or mp4 I would think.
I am assuming HEVC could find a place in this system as well. Now, lossless?...
@Cornucopia, Thanks for that link! I'm going to go digest that next.
So, given an arbitrary pair of L/R clips from separate sources... what do most folks do to manage them before final-use edits (vs. clean-up tweaks)?
- keep them separate or bind them?
- fine-tune the alignment, or do that later?
- color correct/match the pair?
This question relates to the main topic in that maybe I'm solving the entire pre-edit processing issue the wrong way...
If I can quickly do these three things and save the *steps* as a 'take' in Magix, then I don't actually have to export/save a lossless intermediate to use as a clip in an edit later on. It's simply a reference to a file and some rules to tune it, but the editor treats it just like a raw clip. Very appealing paradigm for me, but Magix needs to see the source(s) as a 3D clip, so packaging (like the FUJI3D AVIs work) is pretty important to making this elegant/efficient.
This implies a dependency on the editor, but allows me to create and access a 'pseudo-clip' composed of a reference to an original source-clip and fixit directions, and can be quickly exported as a lossless intermediate if needed for other tools/processing needs.
Do other folks export and archive the ready-to-use clips in a more generic lossless SbS full-frame format, where they can be used anywhere/anyway? I like that idea and started that way, except for managing the size issues, and for finding that I'd done a lot of preparation work, management, and storage on clips I never ended up using (and probably won't).
I could also see using an AviSynth frame-serving model with per-clip-pair tweaking scripts, frameserving the 'fixed' clips to the previewing/editing context - the same advantage as my editing/take approach, but with a bit more of a batch-processing feel, with fine-tuning happening in the edit stage. Once set up, that model would be fairly transparent to any app used to browse or view the clip(s), but it is also limited to an SbS or T/B file-processing paradigm.
Curious what folks (esp. 3D stereographers) are settling on, and if trying to mux these pairs into a container (even if automated someday) is worth the investment.
@Cornucopia, that last section of the document looks like exactly what we'd like (vs. a hacked variation of our more usual formats), albeit with a broadcast flavor - though that intent/use may not matter if the mentioned flags and descriptors can actually be mapped to a standard data structure. Meaning: we could build the thing, and it could be read by most apps, which would use/ingest what they like and ignore/skip what they don't handle/understand.
The description is a mix of very specific flags that I would guess get placed in an already-defined (well-known/used) stream structure, and it probably reads more clearly to someone who understands that framework. Said in English: I bet someone who understands these headers would know exactly how to build the stream/container described in that paragraph. I wonder how many apps and devices are built on libraries that would recognize this format, and how hard it would be to hex-edit something that exists into one of these entities, just to test the concept.
As I mentioned above, TsMuXeR will happily create an m2ts file with two distinct AVC streams and audio streams, and it plays as a single-stream clip in most any player I tried. I wonder if tweaking some of the mentioned flags would be meaningful, or if additional descriptors need to be inserted into the stream to make it meet the described spec. Maybe there's a real specification that goes with this paper's more general description. ffmpeg creates MP4s that may be tweakable in the same way.
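For anyone wanting to reproduce the dual-AVC m2ts, a TsMuXeR .meta file along these lines is what I fed it (from memory, so treat the exact MUXOPT options and file names as approximate). Note that nothing in it signals 3D, which is exactly the gap:

```
MUXOPT --no-pcr-on-video-pid --new-audio-pes --vbr
V_MPEG4/ISO/AVC, "left.264", fps=29.97
V_MPEG4/ISO/AVC, "right.264", fps=29.97
A_AAC, "audio.aac"
```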
I feel inspired and really ignorant in the same breath.
Promising as a solution - hoping it was implemented here and there, even if it never caught on. Gonna look around for follow-up/specs. I wish there were a standards reference number for that feature/variation that would make it more separately identifiable.
cheers and thanks for the lead,
It looks like MK3D has value in the broader scheme of 3D video packaging, but the available flags in the mode I would think is most relevant don't allow for two distinct streams - rather, a single stream that can take on the characteristics of (all of) the best-known single-file modes:
StereoMode field values:
1: side by side (left eye is first)
2: top-bottom (right eye is first)
3: top-bottom (left eye is first)
4: checkerboard (right is first)
5: checkerboard (left is first)
6: row interleaved (right is first)
7: row interleaved (left is first)
8: column interleaved (right is first)
9: column interleaved (left is first)
10: anaglyph (cyan/red)
11: side by side (right eye is first)
12: anaglyph (green/magenta)
13: both eyes laced in one Block (left eye is first) (field sequential mode)
14: both eyes laced in one Block (right eye is first) (field sequential mode)
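For what it's worth, ffmpeg's Matroska muxer can set this same StereoMode field on a packed stream at remux time, via a per-stream stereo_mode tag (no re-encode; 'left_right' corresponds to mode 1 above; file names are placeholders):

```sh
# Remux a packed SbS file to MKV and set StereoMode = 1
# (side by side, left eye first) without re-encoding.
ffmpeg -i sbs_source.mp4 -c copy \
  -metadata:s:v:0 stereo_mode=left_right out.mkv
```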
As I'm trying to preserve my original streams, and this MKV mode looks more architected toward presentation-type outputs, I'm wondering if anyone knows of an alternative MK3D mode, or if perhaps AVC streams can be losslessly muxed into field-sequential (mode 13 or 14) streams. I would guess not, but I wonder if it's not common vs. not possible...
My mistake: I mentioned I had previously successfully created a dual-stream MK3D that was properly decoded as 3D. What actually happened: on one occasion I created a standard dual-stream MKV (which can be opened in something like Stereoscopic Player), and on another occasion I created an MVC-encoded file as MKV that was properly decoded as 3D. I conflated those two occurrences.
Looks to me like MKV needs to get back to the effort of making its 3D settings and metadata complete, thorough & official (including options for MVC, AVC+MVC, 2D+depth, 2D+depth+occlusion+occlusion-depth, anaglyph blue+yellow, field-sequential, frame-sequential, Sisvel tile format, and DUAL STREAM). I know it has lost the focus & enthusiasm of a couple of years ago, but it hasn't and won't go away.
I appreciate your dignifying my journey down this path. It appears the question has been well-considered thus far. Perhaps the discussion will serve to educate others that follow.
As I continue to consider this possibility, it does occur to me that I'm kind of asking for a presentation format for an intermediate-file purpose. I suppose the lack of an easy solution is related to this duality of purpose.
Not quite ready to surrender, but understanding the problem much better for the exploration.
I do like the sound of that specific MPEG standard document you cited, but I fear that, like the options @sneaker mentioned, the lack of SW/libraries makes the point a bit moot.
RE: Matroska development - the VP8/VP9 (WebM) stuff seems to echo/extend the Matroska format and is quite lively, given the Google backing. There was another parallel project, something like AV1 (the Alliance for Open Media effort), coming online as an open media standard that could answer this need, but I don't think we'll see COTS editor support for quite a while, so my current plans should probably assume it has to be done with current remuxing and tagging cleverness - esp. considering the current falloff in 3D industry support/economics. I'm still sold, though, and think good 3D-stereo is the cat's meow... Stereo3D VR360 is within our lifetimes, I'm betting.
Even having Stereoscopic player respond to both separate streams in that first mentioned MKV is useful info. Did you set any flags in the MK3D sense, or was the MKV pretty generic and simply contained dual streams?
Ever forward, gotta go figure out if I need to stuff more flags in a container header, or if changing flags in an existing header will result in anything interesting. More blur to come.
It looks like this toolkit is quite useful for a similar-but-different target result, and doesn't help much in getting to my specific goal. It may find a place in my toolbox someday, but it provides me with what I already have (two separate streams), and lets me re-save those to a format that doesn't preserve both streams without loss (I don't believe two x264 streams converted to an MVC base+deltas pair is lossless).
I appreciate the tip, and have bookmarked the product site for future reference.
Having made very little progress muxing together many variations of my raw H.264 video sources, nothing seems to 'trigger' the 3D modes of my devices or editor. I'm not *really* too surprised, but I was hopeful.
I'm now thinking I may be able to surrender some quality from one of my original H.264 video streams to the BD3D MVC right-eye delta-based stream through re-encoding, *IF* I can use the *original* left-eye stream as the independent MVC base stream.
So the output would be an m2ts Blu-ray 3D MVC file with sound, the left base (my original AVC clip), and a re-encoded version of my right AVC clip as the delta-based stream.
Question: does anyone know if the FRIM MVC encoding wrapper, or any of the other MVC encoders, will allow the base video stream to be simply muxed in, and only transcode the second stream into the delta-based right-eye stream?
I would then have a file with a losslessly passed-through base stream (that edits well and looks good in 2D in any player), and a slightly lossy (?) re-encoded delta-based right-eye stream, all in one fairly standard package that would likely survive the test of time and devices for a good while.
Any thoughts? Especially on whether anyone knows if or how to create a legit MVC 3D file without re-encoding the base video stream?
And how much loss would one expect in that re-encoded, delta-based second video stream (relative to the still-original left-eye stream)? Would it be irritating to watch?