I've been making fast-paced Minecraft montages and was looking into some clever editing to replace the map (the area around me) in my footage with a 3D scene (e.g. one built in Blender, with fancy lighting and other stuff you wouldn't be able to make in Minecraft).

This video perfectly shows the input video / output result I'm looking for

I've made a texture pack which replaces every texture with something that is (I suppose) easy to track and easy to chroma key out:

What I've been looking into:
⠂Replay Mod exists and has Blender export support, but it's still experimental, isn't available for the version I play on, and is awful at first-person replays.
⠂The first thing I looked into was AI-based video-to-3D scene reconstruction (point cloud), but it was extremely cumbersome and slow (and probably inaccurate lol).
⠂Then I moved on to 3D motion tracking in Blender and After Effects to export the camera path, but it had a hard time following flicks and fast movement in general. I may not have tried hard enough, so let me know if 3D tracking of fast first-person movement is doable at all.

Do you knowledgeable fellas know what I should try?