I broke down an h265 video into frames and tried to merge them back together with its original audio, subtitles and fonts. The resulting video has an issue: when I seek to some points in the video, the audio stops playing. And even when I don't seek, at some point the video hangs while the audio keeps playing. Seconds later the video resumes, but now video and audio are out of sync. This doesn't happen with the original video.
The reason I'm breaking the video into frames and merging them is that I want to upscale each frame. But I'm going to leave that part out because this issue occurs with the original unscaled frames.
Here are the details of the original video. Notice it has video, audio, subtitle and two font streams.
Code:
.\ffmpeg.exe -i "input.mkv"
Input #0, matroska,webm, from 'input.mkv':
  Metadata:
    encoder         : libebml v1.3.10 + libmatroska v1.5.2
    creation_time   : 2021-01-07T00:20:19.000000Z
  Duration: 00:23:02.05, start: 0.000000, bitrate: 320 kb/s
    Stream #0:0: Video: hevc (Main), yuv420p(tv), 1280x720, SAR 1:1 DAR 16:9, 23.98 fps, 23.98 tbr, 1k tbn, 23.98 tbc (default)
    Metadata:
      BPS-eng         : 278671
      DURATION-eng    : 00:23:02.006000000
      NUMBER_OF_FRAMES-eng: 33135
      NUMBER_OF_BYTES-eng: 48140731
      _STATISTICS_WRITING_APP-eng: mkvmerge v46.0.0 ('No Deeper Escape') 64-bit
      _STATISTICS_WRITING_DATE_UTC-eng: 2021-01-07 00:20:19
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:1(jpn): Audio: aac (HE-AAC), 48000 Hz, stereo, fltp
    Metadata:
      BPS-eng         : 36166
      DURATION-eng    : 00:23:02.016000000
      NUMBER_OF_FRAMES-eng: 32391
      NUMBER_OF_BYTES-eng: 6247833
      _STATISTICS_WRITING_APP-eng: mkvmerge v46.0.0 ('No Deeper Escape') 64-bit
      _STATISTICS_WRITING_DATE_UTC-eng: 2021-01-07 00:20:19
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:2(eng): Subtitle: ass (default)
    Metadata:
      BPS-eng         : 76
      DURATION-eng    : 00:21:20.790000000
      NUMBER_OF_FRAMES-eng: 246
      NUMBER_OF_BYTES-eng: 12264
      _STATISTICS_WRITING_APP-eng: mkvmerge v46.0.0 ('No Deeper Escape') 64-bit
      _STATISTICS_WRITING_DATE_UTC-eng: 2021-01-07 00:20:19
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:3: Attachment: ttf
    Metadata:
      filename        : Roboto-Medium.ttf
      mimetype        : application/x-truetype-font
    Stream #0:4: Attachment: ttf
    Metadata:
      filename        : Roboto-MediumItalic.ttf
      mimetype        : application/x-truetype-font
Here's how I break it into frames:
Code:
.\ffmpeg.exe -i "input.mkv" -qscale:v 1 -qmin 1 -qmax 1 -vsync 0 "InputFolder/frame%08d.png"
Here's how I merge the frames back into a video with all the original streams except the video:
Code:
.\ffmpeg.exe -r 23.98 -i "InputFolder\frame%08d.png" -i "input.mkv" -map 0:v:0 -map 1 -map -1:v -c:a copy -c:v libx265 -r 23.98 -pix_fmt yuv420p "output.mkv"
Here are the details of the resulting video:
Code:
.\ffmpeg.exe -i "output.mkv"
Input #0, matroska,webm, from 'output.mkv':
  Metadata:
    ENCODER         : Lavf58.45.100
  Duration: 00:23:02.05, start: 0.000000, bitrate: 245 kb/s
    Stream #0:0: Video: hevc (Main), yuv420p(tv), 1280x720 [SAR 1:1 DAR 16:9], 23.98 fps, 23.98 tbr, 1k tbn, 23.98 tbc (default)
    Metadata:
      ENCODER         : Lavc58.91.100 libx265
      DURATION        : 00:23:01.777000000
    Stream #0:1(jpn): Audio: aac (HE-AAC), 48000 Hz, stereo, fltp (default)
    Metadata:
      BPS-eng         : 36166
      DURATION-eng    : 00:23:02.016000000
      NUMBER_OF_FRAMES-eng: 32391
      NUMBER_OF_BYTES-eng: 6247833
      _STATISTICS_WRITING_APP-eng: mkvmerge v46.0.0 ('No Deeper Escape') 64-bit
      _STATISTICS_WRITING_DATE_UTC-eng: 2021-01-07 00:20:19
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      DURATION        : 00:23:02.046000000
    Stream #0:2(eng): Subtitle: ass (default)
    Metadata:
      BPS-eng         : 76
      DURATION-eng    : 00:21:20.790000000
      NUMBER_OF_FRAMES-eng: 246
      NUMBER_OF_BYTES-eng: 12264
      _STATISTICS_WRITING_APP-eng: mkvmerge v46.0.0 ('No Deeper Escape') 64-bit
      _STATISTICS_WRITING_DATE_UTC-eng: 2021-01-07 00:20:19
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      ENCODER         : Lavc58.91.100 ssa
      DURATION        : 00:21:21.580000000
    Stream #0:3: Attachment: ttf
    Metadata:
      filename        : Roboto-Medium.ttf
      mimetype        : application/x-truetype-font
    Stream #0:4: Attachment: ttf
    Metadata:
      filename        : Roboto-MediumItalic.ttf
      mimetype        : application/x-truetype-font
One thing to note is that I've done this successfully numerous times with h264 videos. No audio issues. Another thing to note, which might be more relevant, is that when I merge the frames with only the original audio stream (as opposed to all original streams except video), the audio issue does not occur. Also, when I merge the frames with the original audio AND subtitle stream, i.e. without the fonts, the issue remains.
Code:
.\ffmpeg.exe -r 23.98 -i "InputFolder\frame%08d.png" -i "input.mkv" -map 0:v:0 -map 1:a:0 -c:a copy -c:v libx265 -r 23.98 -pix_fmt yuv420p "output.mkv"
Produces no audio issues. But this isn't good for me because I want the subtitles and fonts from the original video.
If anyone needs me to upload the original video somewhere so they can reproduce it, let me know.
-
Probably easier to use mkvtoolnix-gui (the GUI) at first. There is an option to show the command line; you can also look at the documentation and examples included there.
Add the original file, uncheck the original video stream, add the replacement stream, and press start multiplexing. The replacement stream can be the same one you made with ffmpeg earlier; you're just using mkvmerge to do the multiplexing step. It's generally more reliable than ffmpeg.
For encoding the replacement stream, I would pay more attention to your encoding settings and parameters. You are using default libx265 settings, and the quality might not be ideal. Also, -pix_fmt yuv420p will use swscale and Rec601 to convert RGB to YUV. Assuming your PNG images were done correctly upstream, this means the output YUV video will have "SD" colors. Maybe you were upscaling to "SD", but I think it's unlikely. Normally Rec709 would be used for HD.
-
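For anyone calling mkvmerge from their own program, here is a rough sketch of the GUI steps above as a command line, built as a Python argument list. The file names are placeholders, and the choice to drop chapters and tags from the replacement file is my assumption, not something the GUI forces. mkvmerge applies per-file options to the file that follows them, so `--no-video` goes before the original and the other `--no-*` options go before the replacement.

```python
def build_mkvmerge_replace_video(original, new_video, output):
    """Return an mkvmerge argument list that keeps everything from `original`
    except its video track and takes only the video track from `new_video`."""
    return [
        "mkvmerge",
        "-o", output,
        # Options placed before a source file apply to that file only.
        "--no-video", original,
        # From the replacement file keep the video track and nothing else.
        "--no-audio", "--no-subtitles", "--no-attachments",
        "--no-chapters", "--no-global-tags", "--no-track-tags",
        new_video,
    ]


if __name__ == "__main__":
    # Placeholder file names; print the command instead of running it.
    print(" ".join(build_mkvmerge_replace_video("input.mkv", "upscaled.mkv", "output.mkv")))
```

The list can be passed to `subprocess.run` once the printed command looks right. Compare it with what the GUI's show command line option produces before relying on it.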
Your problem is likely from ffmpeg muxing. Notice the original file was made with mkvmerge.
Try 1 video first, then use the show command line option to get the CLI command and adapt it to your program. mkvmerge cannot output MP4, but MP4 cannot hold all types of streams anyway, such as your ass subtitle stream. mkvmerge can accept MP4 input or elementary streams. For proper MP4 muxing output, use mp4box. These are all command-line programs. I'm just recommending that you use the GUI first, on 1 quick test, to see if it works, so you don't waste your time. No use learning the 50 pages of documentation if it doesn't work and it's not the issue. If it works, you have your answer.
It's not an issue with encoding, because when you mux the original audio with the new video only (but no other streams) it works ok. You can improve seeking granularity by reducing the max keyframe interval, but the default is 250, and for hundreds of millions of normal videos this works fine. To reduce the keyframe interval, use -g. For example, -g 24 for a 1 sec interval. An original "24p" BD would have this value. But that is not your problem. The problem is ffmpeg muxing.
There is a separate point about quality. If you're going to upscale, IMO you might as well do it correctly. Use a lower -crf value for higher bitrates (less quality loss).
For -pix_fmt, swscale is ok, but you need to specify the 709 matrix for the RGB to YUV conversion. If you don't, the colors will be slightly shifted in most players: there will be a Rec601 vs. Rec709 mismatch. By convention, HD material uses 709. Instead of -pix_fmt, use
Code:
-vf scale=out_color_matrix=bt709,format=yuv420p
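Putting the encoding suggestions together, a video-only encode of the frames could look roughly like the sketch below, again as a Python argument list. Assumptions on my part: the frame rate is 24000/1001 (the thread only shows the rounded 23.98), the CRF value of 18 is just an illustrative "lower" value, and the file names are placeholders.

```python
from fractions import Fraction


def keyframe_interval(frame_rate, seconds):
    """Number of frames between keyframes for roughly `seconds` of video."""
    return max(1, round(Fraction(frame_rate) * Fraction(seconds)))


def build_encode_command(frame_pattern, output, frame_rate="24000/1001",
                         crf=18, gop_seconds=1):
    """ffmpeg argument list: PNG frames -> video-only libx265 stream."""
    return [
        "ffmpeg",
        "-framerate", frame_rate,      # input rate of the image sequence
        "-i", frame_pattern,
        # RGB -> YUV with the Rec709 matrix, then 4:2:0 output
        "-vf", "scale=out_color_matrix=bt709,format=yuv420p",
        "-c:v", "libx265",
        "-crf", str(crf),              # illustrative value, not a recommendation
        "-g", str(keyframe_interval(frame_rate, gop_seconds)),
        output,
    ]


if __name__ == "__main__":
    print(" ".join(build_encode_command("InputFolder/frame%08d.png", "video_only.mkv")))
```

With these inputs the helper yields `-g 24`, the 1 sec interval mentioned above. At the same rate the default of 250 works out to a keyframe about every 10.4 seconds.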
-
Are you sure the frame count of the source and the reencode are the same?
Extract the time codes to see whether the source is vfr (or the audio is stretched).
-
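A quick arithmetic check on the numbers posted earlier in the thread, as a sketch. It assumes the extracted frame count equals the 33135 frames reported for the original, and that the source is constant frame rate at 24000/1001; neither is confirmed in the thread.

```python
from fractions import Fraction

FRAMES = 33135  # NUMBER_OF_FRAMES-eng of the original video stream


def duration_seconds(frames, fps):
    """Exact duration of `frames` frames at a constant rate `fps`."""
    return Fraction(frames) / Fraction(fps)


def as_timestamp(seconds):
    """Format a duration in seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


if __name__ == "__main__":
    ntsc = duration_seconds(FRAMES, "24000/1001")   # assumed true source rate
    rounded = duration_seconds(FRAMES, "23.98")     # rate used in the merge command
    print(as_timestamp(ntsc))       # 00:23:02.006
    print(as_timestamp(rounded))    # 00:23:01.776
    print(float(ntsc - rounded))    # drift in seconds over the whole file
```

The first value matches the original's DURATION-eng of 00:23:02.006. The second is within a millisecond of the re-encode's 00:23:01.777. So passing -r 23.98 makes the new video about 0.23 seconds shorter than the audio over the whole file. Whether that has anything to do with the seeking problem is not established here.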
That shouldn't be an issue, since replacement video + original audio (but no subs / other streams) works ok.
So it suggests an issue with muxing subs with ffmpeg, or with ffmpeg re-writing timestamps.
The thing about mkvmerge is that it preserves the original timestamps (not just video timestamps, other streams as well), so if you replace just the video stream, everything should be the same except for the replaced video. I'd be very surprised if it didn't work. Usually the culprit is ffmpeg.
-
Thank you for your suggestions on the ffmpeg parameters. I will tinker with those. I'll also look into muxing.
I want to try out mkvmerge (which is MKVToolNix, right?), but I still don't see a way to break a video into frames. I need the actual frames so I can upscale them with a different program. I see some options under "splitting", but it doesn't seem like this is what I need.
-
I was able to input the original video and the upscaled video (the one with issues) into the program. I selected the video stream of the upscaled one and everything else from the original, and the result is without issues, i.e. an upscaled video without the audio issues. Is this what you wanted me to try?
I also noticed something that might be relevant.
[Attachment 71981]
The upscaled video (produced by merging frames with ffmpeg) has more files than the original.
Last edited by PeteJobi; 24th Jun 2023 at 14:17. Reason: More info
-
Update: I tried using mkvmerge to remove the extra files I mentioned earlier from the upscaled video. And it worked wonderfully without audio issues.
[Attachment 72052]
So if I can get ffmpeg to not generate those "tags" things, my problem should be solved.
-
In theory, but mkvmerge is doing more than removing tags: it's remuxing the streams.
You can try mkvpropedit to remove the tags to test your theory. If it works, that validates your theory, and you just have to figure out how to do it in ffmpeg next. mkvpropedit does in-place editing with no remuxing (so it's very fast).
Code:
"mkvpropedit" "input.mkv" --tags all:
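If you want to script this test, a minimal sketch follows. The first helper just wraps the mkvpropedit command above. The second counts tag entries from the JSON that `mkvmerge -J file.mkv` prints, so you can compare the file before and after. I'm assuming the JSON has `global_tags` and `track_tags` lists whose items carry a `num_entries` field; check that against the output of your mkvmerge version. The sample JSON in the usage part is made up.

```python
import json


def build_strip_tags_command(path):
    """mkvpropedit argument list that removes all tags in place."""
    return ["mkvpropedit", path, "--tags", "all:"]


def count_tag_entries(identify_json):
    """Sum the tag entries reported in `mkvmerge -J` output (a JSON string).

    Assumes `global_tags` and `track_tags` lists with `num_entries` items.
    """
    info = json.loads(identify_json)
    groups = info.get("global_tags", []) + info.get("track_tags", [])
    return sum(group.get("num_entries", 0) for group in groups)


if __name__ == "__main__":
    # Made-up sample standing in for real `mkvmerge -J output.mkv` output.
    sample = '{"global_tags": [{"num_entries": 1}], "track_tags": [{"num_entries": 7, "track_id": 0}]}'
    print(count_tag_entries(sample))
    print(" ".join(build_strip_tags_command("output.mkv")))
```

Since mkvpropedit edits the file in place, run the test on a copy.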
-
If I don't find an ffmpeg solution, I guess what I can do is use ffmpeg to break the video into frames and to merge the frames back into a video without any other streams at all, then use mkvmerge to merge that video with the streams from the original video.
This adds an extra step to the process, but if there's no other way.....
How do I get mkvmerge cli?
-
There are various issues with the ffmpeg mkv muxer and mp4 muxer. Most GUIs use the actual command-line tools mkvmerge and mp4box for the multiplexing stage.
Learn from the example, have a look at the nice documentation, and adapt it to your program.
-
Thanks. Will do.
I figured I could get rid of ffmpeg completely if I could find another CLI program that could split videos into frames and back. I'm trying to minimize the file size of dependencies, and ffmpeg is too large for what I use it for. Do you know of smaller software that does this? (Or do you think it's a better idea to stick with ffmpeg for that?)
-
I don't have any good ideas for a replacement that makes it simpler with smaller binaries and simpler dependencies. I would stick with ffmpeg and mkvmerge.
When looking for replacement options, you would still need to convert YUV video to RGB to PNG images, run it through the GAN or whatever processing to upscale, then convert PNG to YUV video. You still need an encoder and a muxer too.
There are implementations that don't need PNG images, e.g. you can run RealESRGAN (or other machine learning algorithms) in vapoursynth, or some through avisynth, but they take up file space too, either through an install or as "portable" versions.
If you compile it yourself, you can disable many of the features to make the ffmpeg binary smaller. The precompiled ones that you download usually include many libraries that are never used but balloon the file size. You can strip out all the things you don't need. A commonly used one for ffmpeg is MABS, the ffmpeg autobuild suite.
ffmpeg is nice in that it bundles many encoders (libx265, libx264, etc.), demuxers and muxers. It's like a "swiss army knife", but it still has issues.