VideoHelp Forum
  1. I'm curious about the process of removing objects from video. I know that it's possible, since Hollywood does it all the time. (The coffee cup from Game of Thrones is always the first example that comes to my mind.) However, poking around the web, it seems like the only real solutions for removing content are a) zooming/cropping the video or b) pulling the individual frames into Photoshop and running Content-Aware Fill on them one by one. Neither option is always practical.

    The only other tool that might do the job is Content-Aware Fill in After Effects, since it apparently has temporal awareness. Can anybody testify to its effectiveness? Adobe products are a bit expensive, and I'd rather not pay for them if I don't have to. Are there any low-cost or even FOSS solutions? I have some opening credits I'd like to remove from some videos and am hoping there's an easy and inexpensive solution.

    I'd also like to know how Hollywood makes these changes. What software do they use?

    Thanks to the community for their help!
  2. It depends mainly on the source characteristics.

    The main determining factor is usable data in nearby frames. For example, if it was static credits over a static background with no camera movement, then a temporal approach would be useless (every frame hides the same pixels) and it would be a completely spatial repair.
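
As a rough illustration of why usable data in nearby frames matters, here is a minimal Python sketch. It is not how AE or ProPainter work internally. It assumes a locked-off camera, grayscale frames stored as nested lists, and one mask per frame; `temporal_fill` and the toy data are hypothetical names made up for this example. Real tools also have to motion-compensate before borrowing pixels from other frames.

```python
from statistics import median

def temporal_fill(frames, masks):
    """Fill masked pixels from the same position in other frames.

    frames: list of 2D lists of grayscale values, all the same size.
    masks:  list of 2D lists of bools, True where a pixel must be replaced.
    Returns (filled, unfilled). `unfilled` lists (t, y, x) positions that are
    masked in every frame, so they have no temporal data and need a spatial
    repair instead.
    """
    height, width = len(frames[0]), len(frames[0][0])
    filled = [[row[:] for row in frame] for frame in frames]
    unfilled = []
    for y in range(height):
        for x in range(width):
            # Values at this position from frames where it is NOT masked.
            clean = [frames[t][y][x] for t in range(len(frames))
                     if not masks[t][y][x]]
            for t in range(len(frames)):
                if masks[t][y][x]:
                    if clean:
                        filled[t][y][x] = median(clean)
                    else:
                        unfilled.append((t, y, x))
    return filled, unfilled

# Toy clip: one row, four pixels, background is 10 20 30 40.
# A moving object (99) covers x=0, then x=1, then x=2.
# A static credit (99) covers x=3 in every frame.
frames = [[[99, 20, 30, 99]], [[10, 99, 30, 99]], [[10, 20, 99, 99]]]
masks = [[[True, False, False, True]],
         [[False, True, False, True]],
         [[False, False, True, True]]]
filled, unfilled = temporal_fill(frames, masks)
```

In this toy clip the moving object is recovered from the other frames, while the static credit at x=3 ends up in `unfilled` because no frame ever shows what is behind it.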

    Sometimes AE Content-Aware Fill works well, but sometimes it does a terrible job. It's something you have to test out.

    There are free machine learning temporal inpainters such as ProPainter. They're more difficult to use (I think selur has a Hybrid version based on dan64's VapourSynth port of ProPainter).
    https://github.com/sczhou/ProPainter
    https://github.com/dan64/vs-propainter

    Sometimes the research-based machine learning methods work better than AE, sometimes worse. Again, it's something you have to test.

    There are probably dozens of automasking/autosegmenting programs, some with GUIs that take some user input. YOLO-based ones are probably the most popular. There might be some text-trained models available somewhere. Search GitHub.

    In general, the tighter and more accurate the mask, the more usable data, and the better the quality of the fill. You have to make sure you zoom in and cover things like compression and edge artifacts. But a sloppy, large mask will generally produce a blurrier, less detailed fill, because it throws away data that could have been used.
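
The mask trade-off can be sketched in a few lines of Python. This is only an illustration under assumptions: the mask is a nested list of bools, `dilate` and `masked_fraction` are hypothetical helper names, and the growth uses a simple square neighbourhood. Growing the mask by a pixel or two covers edge and compression artifacts; growing it a lot eats the surrounding pixels the fill could have used.

```python
def dilate(mask, radius=1):
    """Grow a binary mask by `radius` pixels using a square neighbourhood."""
    height, width = len(mask), len(mask[0])
    out = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                # Mark every pixel within `radius` of this masked pixel.
                for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                        out[ny][nx] = True
    return out

def masked_fraction(mask):
    """Fraction of the frame that is masked, i.e. data the fill cannot use."""
    total = sum(len(row) for row in mask)
    return sum(cell for row in mask for cell in row) / total

# Toy 7x7 frame with a one-pixel object in the centre.
tight = [[(y, x) == (3, 3) for x in range(7)] for y in range(7)]
padded = dilate(tight, radius=1)   # small margin for edge artifacts
sloppy = dilate(tight, radius=3)   # swallows the whole toy frame
```

Here `padded` masks 9 of the 49 pixels, while `sloppy` masks all 49 and leaves nothing to fill from.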

    Training your own custom model for both automasking and inpainting is probably the best way to get clean, accurate results, but it takes lots of hardware and time. It's generally not feasible for a typical home user.

    The gold standard for Hollywood is still rotoscoping / manual repair, usually in Nuke. Some of the "automatic" methods like AE can help reduce the time required, but to make it perfectly seamless you often need some manual touch-up, or to fix some problem sections. Very difficult/complex removes use projection painting in a 3D application, where textures are projected onto 3D geometry.

    But objects are usually not "removed" in Hollywood if it can be avoided. The VFX shots are designed in advance for easy removal, for example by shooting a second clean plate with a locked camera rig that follows the same set path. This makes a 100% perfect remove super easy.
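
To show why a clean plate makes the remove so easy, here is a minimal Python sketch of the composite step. It assumes the plate is already perfectly aligned with the shot, that frames are nested lists of grayscale values, and that the mask is a soft alpha between 0.0 and 1.0; `composite_clean_plate` is a hypothetical name for this example, not a function from Nuke or AE.

```python
def composite_clean_plate(frame, plate, alpha):
    """Blend the clean plate over the frame.

    alpha 1.0 = take the plate pixel, 0.0 = keep the original frame pixel,
    values in between feather the edge of the patch.
    """
    return [
        [a * p + (1.0 - a) * f for f, p, a in zip(frow, prow, arow)]
        for frow, prow, arow in zip(frame, plate, alpha)
    ]

# Toy row: the unwanted object is the 200 in the middle of the frame.
frame = [[50, 200, 80]]
plate = [[50, 60, 70]]
alpha = [[0.0, 1.0, 0.5]]   # hard replace in the middle, feathered on the right
result = composite_clean_plate(frame, plate, alpha)
```

No inpainting is involved at all: the missing pixels are simply copied from the plate, which is why nothing has to be guessed.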
    Last edited by poisondeathray; 21st Jun 2024 at 13:07.
  3. Thank you for all of the wonderful information! I'll take a look at the links you sent me.

    Which method do you think they used to remove the coffee cup, then?
  4. Originally Posted by Spiralagnus View Post

    Which method do you think they used to remove the coffee cup, then?

    Not sure what they used specifically, but likely Nuke, because that is the tool of choice in the VFX industry. That's the answer for 99.99% of cases. Even if other tools were used in the pipeline, Nuke is also used 99.99% of the time.

    The coffee cup scene is a technically simple shot to fix, so you could use almost anything. Even 20-year-old software and 20-year-old techniques, such as a point tracker with rotoscoping, would do the job; it just takes a bit longer.

    But a home consumer today could just as easily use AE and fill
    https://www.youtube.com/watch?v=b7ZvF_sF1eA

    But if you look closely, the repair in that tutorial isn't great. Besides the noise and compression artifacts (partially from YouTube, partially because that tutorial probably used low-quality source material with compression artifacts), the line of the table is not straight and has a bump.

    It's not shown in that tutorial, but you would also need to grain match for high-quality source material, like the original digital intermediate, or even a consumer BD. The filled areas are usually softer and stick out, so you usually need to add finishing touches for a seamless composite.
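
A very simplified Python sketch of grain matching is below. It makes several assumptions: grain is treated as plain Gaussian noise, its strength is estimated from a flat reference area of the same frame (the spread there is assumed to be all grain and no picture detail), and the filled area is assumed to be perfectly smooth. `match_grain` and the toy frame are hypothetical; real grain tools also match the size and colour response of the grain.

```python
import random
from statistics import pstdev

def match_grain(frame, fill_mask, ref_mask, seed=0):
    """Add noise to the filled area so its grain level matches a reference area.

    frame:     2D list of grayscale values.
    fill_mask: True where the (too smooth) fill sits.
    ref_mask:  True over a flat, untouched area used to measure the grain.
    Returns (new_frame, sigma) where sigma is the measured grain strength.
    """
    rng = random.Random(seed)
    ref_values = [v for row, mrow in zip(frame, ref_mask)
                  for v, m in zip(row, mrow) if m]
    sigma = pstdev(ref_values)
    out = [
        [v + rng.gauss(0.0, sigma) if m else v for v, m in zip(row, mrow)]
        for row, mrow in zip(frame, fill_mask)
    ]
    return out, sigma

# Toy frame: the left half is a flat wall (value 100) with grain, the right
# half is a perfectly smooth fill of the same brightness.
noise = random.Random(1)
W, H = 40, 20
frame = [[100 + noise.gauss(0, 5) if x < W // 2 else 100.0 for x in range(W)]
         for _ in range(H)]
ref_mask = [[x < W // 2 for x in range(W)] for _ in range(H)]
fill_mask = [[x >= W // 2 for x in range(W)] for _ in range(H)]

matched, sigma = match_grain(frame, fill_mask, ref_mask)
fill_sigma = pstdev([v for row in matched for v in row[W // 2:]])
```

After the call, `fill_sigma` should sit close to `sigma`, so the patch no longer looks softer than its surroundings.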
  5. I hadn't heard of Nuke before, but I'm glad they have a non-commercial free version. Do you think they used a Content-Aware Fill like the one in After Effects?
  6. Originally Posted by Spiralagnus View Post
    Do you think they used a Content-Aware Fill like the one in After Effects?
    It's possible; you'd have to look at the source they used. It's possible that they used it (they obviously had access to a high-quality DI) and got decent results right away, and maybe just touched it up a bit. Definitely not with the source material that tutorial used; it has too many quality problems to begin with. When your fill uses problem areas like compression artifacts, the problems multiply: the fill area becomes contaminated with artifacts. You'd have to do other things, like cleanup, first.