VideoHelp Forum




  1. I broke an H.265 video down into frames and am trying to merge them back together with the original audio, subtitles and fonts. The resulting video has a problem: when I seek to certain points, the audio stops playing. Even when I don't seek, at some point the video hangs while the audio keeps playing. Seconds later the video resumes, but now video and audio are out of sync. This doesn't happen with the original video.

    I'm breaking the video into frames and merging them because I want to upscale each frame. I'm leaving that part out because the issue also occurs with the original, unscaled frames.

    Here are the details of the original video. Note that it has video, audio and subtitle streams plus two font attachments.
    Code:
    .\ffmpeg.exe -i "input.mkv"
    
    Input #0, matroska,webm, from 'input.mkv':
      Metadata:
        encoder         : libebml v1.3.10 + libmatroska v1.5.2
        creation_time   : 2021-01-07T00:20:19.000000Z
      Duration: 00:23:02.05, start: 0.000000, bitrate: 320 kb/s
        Stream #0:0: Video: hevc (Main), yuv420p(tv), 1280x720, SAR 1:1 DAR 16:9, 23.98 fps, 23.98 tbr, 1k tbn, 23.98 tbc (default)
        Metadata:
          BPS-eng         : 278671
          DURATION-eng    : 00:23:02.006000000
          NUMBER_OF_FRAMES-eng: 33135
          NUMBER_OF_BYTES-eng: 48140731
          _STATISTICS_WRITING_APP-eng: mkvmerge v46.0.0 ('No Deeper Escape') 64-bit
          _STATISTICS_WRITING_DATE_UTC-eng: 2021-01-07 00:20:19
          _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
        Stream #0:1(jpn): Audio: aac (HE-AAC), 48000 Hz, stereo, fltp
        Metadata:
          BPS-eng         : 36166
          DURATION-eng    : 00:23:02.016000000
          NUMBER_OF_FRAMES-eng: 32391
          NUMBER_OF_BYTES-eng: 6247833
          _STATISTICS_WRITING_APP-eng: mkvmerge v46.0.0 ('No Deeper Escape') 64-bit
          _STATISTICS_WRITING_DATE_UTC-eng: 2021-01-07 00:20:19
          _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
        Stream #0:2(eng): Subtitle: ass (default)
        Metadata:
          BPS-eng         : 76
          DURATION-eng    : 00:21:20.790000000
          NUMBER_OF_FRAMES-eng: 246
          NUMBER_OF_BYTES-eng: 12264
          _STATISTICS_WRITING_APP-eng: mkvmerge v46.0.0 ('No Deeper Escape') 64-bit
          _STATISTICS_WRITING_DATE_UTC-eng: 2021-01-07 00:20:19
          _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
        Stream #0:3: Attachment: ttf
        Metadata:
          filename        : Roboto-Medium.ttf
          mimetype        : application/x-truetype-font
        Stream #0:4: Attachment: ttf
        Metadata:
          filename        : Roboto-MediumItalic.ttf
          mimetype        : application/x-truetype-font
    Here's how I break it into frames
    Code:
    .\ffmpeg.exe -i "input.mkv" -qscale:v 1 -qmin 1 -qmax 1 -vsync 0 "InputFolder/frame%08d.png"
    Here's how I merge the frames back to video with all the original streams except the video

    Code:
    .\ffmpeg.exe -r 23.98 -i "InputFolder\frame%08d.png" -i "input.mkv" -map 0:v:0 -map 1 -map -1:v -c:a copy -c:v libx265 -r 23.98 -pix_fmt yuv420p "output.mkv"
    Here are the details of the resulting video:

    Code:
    .\ffmpeg.exe -i "output.mkv"
    
    Input #0, matroska,webm, from 'output.mkv':
      Metadata:
        ENCODER         : Lavf58.45.100
      Duration: 00:23:02.05, start: 0.000000, bitrate: 245 kb/s
        Stream #0:0: Video: hevc (Main), yuv420p(tv), 1280x720 [SAR 1:1 DAR 16:9], 23.98 fps, 23.98 tbr, 1k tbn, 23.98 tbc (default)
        Metadata:
          ENCODER         : Lavc58.91.100 libx265
          DURATION        : 00:23:01.777000000
        Stream #0:1(jpn): Audio: aac (HE-AAC), 48000 Hz, stereo, fltp (default)
        Metadata:
          BPS-eng         : 36166
          DURATION-eng    : 00:23:02.016000000
          NUMBER_OF_FRAMES-eng: 32391
          NUMBER_OF_BYTES-eng: 6247833
          _STATISTICS_WRITING_APP-eng: mkvmerge v46.0.0 ('No Deeper Escape') 64-bit
          _STATISTICS_WRITING_DATE_UTC-eng: 2021-01-07 00:20:19
          _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
          DURATION        : 00:23:02.046000000
        Stream #0:2(eng): Subtitle: ass (default)
        Metadata:
          BPS-eng         : 76
          DURATION-eng    : 00:21:20.790000000
          NUMBER_OF_FRAMES-eng: 246
          NUMBER_OF_BYTES-eng: 12264
          _STATISTICS_WRITING_APP-eng: mkvmerge v46.0.0 ('No Deeper Escape') 64-bit
          _STATISTICS_WRITING_DATE_UTC-eng: 2021-01-07 00:20:19
          _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
          ENCODER         : Lavc58.91.100 ssa
          DURATION        : 00:21:21.580000000
        Stream #0:3: Attachment: ttf
        Metadata:
          filename        : Roboto-Medium.ttf
          mimetype        : application/x-truetype-font
        Stream #0:4: Attachment: ttf
        Metadata:
          filename        : Roboto-MediumItalic.ttf
          mimetype        : application/x-truetype-font
    One thing to note is that I've done this successfully numerous times with H.264 videos, with no audio issues. Another thing, which might be more relevant: when I merge the frames with only the original audio stream (as opposed to all original streams except video), the audio issue does not occur. When I merge the frames with the original audio AND subtitle streams, i.e. without the fonts, the issue remains.


    Code:
    .\ffmpeg.exe -r 23.98 -i "InputFolder\frame%08d.png" -i "input.mkv" -map 0:v:0 -map 1:a:0 -c:a copy -c:v libx265 -r 23.98 -pix_fmt yuv420p "output.mkv"
    
    Produces no audio issues. But this isn't good for me because I want the subtitles and fonts from the original video.
    If anyone needs me to upload the original video somewhere so they can reproduce it, let me know.
  2. Does the issue occur using mkvmerge instead ?
  3. Instead of -r 23.98 you should use -r 24000/1001
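
To see what the rounded rate changes, here is a small arithmetic sketch. It uses only the frame count reported in the original file's metadata (33135) and compares the duration implied by 24000/1001 with the one implied by 23.98. The variable and function names are illustrative only.

```python
from fractions import Fraction

# NUMBER_OF_FRAMES-eng reported for the original video stream
FRAME_COUNT = 33135


def duration_seconds(frames: int, fps: Fraction) -> float:
    """Duration in seconds of `frames` frames played at `fps` frames per second."""
    return float(frames / fps)


exact_rate = Fraction(24000, 1001)   # the rate suggested above
rounded_rate = Fraction("23.98")     # the rate used in the merge command

exact = duration_seconds(FRAME_COUNT, exact_rate)
approx = duration_seconds(FRAME_COUNT, rounded_rate)

print(f"24000/1001: {exact:.3f} s")
print(f"23.98     : {approx:.3f} s")
print(f"difference: {exact - approx:.3f} s")
```

The two results, roughly 1382.006 s and 1381.777 s, line up with the 00:23:02.006 video duration of the original and the 00:23:01.777 video duration of the re-encoded file, so the rounded rate shortens the video track by about 0.23 s over the whole episode. That is a small drift at the end of the file; it does not by itself explain audio dropping out on seek.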
  4. Originally Posted by poisondeathray View Post
    Does the issue occur using mkvmerge instead ?
    I've installed it. How do I feed it the frames?
  5. Originally Posted by ProWo View Post
    Instead of -r 23.98 you should use -r 24000/1001
    Tried that just now. Same issue.
  6. Originally Posted by PeteJobi View Post
    Originally Posted by poisondeathray View Post
    Does the issue occur using mkvmerge instead ?
    I've installed it. How do I feed it the frames?
    It's probably easier to use mkvtoolnix-gui (the GUI) at first. It has an option to show the command line, and you can also look at the documentation and the examples included there.

    Add the original file, uncheck the original video stream, add the replacement stream, and push "Start multiplexing". The replacement stream can be the same one you made with ffmpeg earlier; you're just using mkvmerge for the multiplexing step. It's generally more reliable than ffmpeg.


    For encoding the replacement stream, I would pay more attention to your encoding settings and parameters. You are using the default libx265 settings, and the quality might not be ideal. Also, -pix_fmt yuv420p will use swscale and Rec601 to convert RGB to YUV. Assuming your PNG images were done correctly upstream, this means the output YUV video will have "SD" colors. Maybe you were upscaling to "SD", but I think that's unlikely. Normally Rec709 would be used for HD.
    It's probably easier to use mkvtoolnix-gui (the GUI) at first. It has an option to show the command line, and you can also look at the documentation and the examples included there.

    Add the original file, uncheck the original video stream, add the replacement stream, and push "Start multiplexing". The replacement stream can be the same one you made with ffmpeg earlier; you're just using mkvmerge for the multiplexing step. It's generally more reliable than ffmpeg.
    Truth is I'm not using ffmpeg manually. I wrote myself a little program that runs the commands for me. All I need ffmpeg for is to split video into frames, and merge frames back into video along with audio and subtitles and whatever the original video contained. If there's something else that can do that for me, and work for MP4 as well as MKV, and has a CLI I can use in my code, I'll gladly replace ffmpeg with it (if it's not larger than ffmpeg).

    For encoding the replacement stream, I would pay more attention to your encoding settings and parameters. You are using the default libx265 settings, and the quality might not be ideal. Also, -pix_fmt yuv420p will use swscale and Rec601 to convert RGB to YUV. Assuming your PNG images were done correctly upstream, this means the output YUV video will have "SD" colors. Maybe you were upscaling to "SD", but I think that's unlikely. Normally Rec709 would be used for HD.
    I also think it's something with the parameters. I've tried libx264 as well and gotten the same result. I'm not an ffmpeg expert (I only know enough to serve my purpose), so I'll take any advice and pointers you have. Though it sounds like the parameters you mentioned affect image quality? What do you suggest I use for -pix_fmt?
  8. Your problem is likely from ffmpeg muxing. Notice the original file was made with mkvmerge.

    Try 1 video first, then use "show command line" to get the CLI command and adapt it to your program. mkvmerge cannot output MP4, but MP4 cannot hold all types of streams anyway, such as your ASS subtitle stream. mkvmerge can accept MP4 input or elementary streams. For proper MP4 muxing output, use mp4box. These are all command-line programs. I'm just recommending that you use the GUI first, for 1 quick test, to see if it works, so you don't waste your time. There's no use learning the 50 pages of documentation if it doesn't work and it's not the issue. If it works, you have your answer.

    It's not an issue with encoding, because when you mux the original audio with only the new video (no other streams) it works OK. You can improve seeking granularity by reducing the maximum keyframe interval, but the default is 250, and for hundreds of millions of normal videos this works fine. To reduce the keyframe interval, use -g. For example, -g 24 gives a 1 second interval at about 24 fps. An original "24p" BD would have this value. But that is not your problem. The problem is ffmpeg muxing.

    There is a separate point about quality. IMO, if you're going to upscale, you might as well do it correctly. Use a lower -crf value for a higher bitrate (less quality loss).

    For -pix_fmt, swscale is OK, but you need to specify the 709 matrix for the RGB to YUV conversion. If you don't, the colors will be slightly shifted in most players, because there will be a Rec601 vs. Rec709 mismatch. By convention, HD material uses 709. Instead of -pix_fmt, use

    Code:
    -vf scale=out_color_matrix=bt709,format=yuv420p
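
Since the merge command is generated by a program, the encoding advice above could be folded into an argument builder along these lines. This is only a sketch: the function name, the file names and the default values (`crf=18`, `keyint=24`) are assumptions chosen for illustration, not recommended settings. The libx265 default CRF is 28, so any lower number trades file size for quality.

```python
def build_encode_args(frames_pattern: str, output: str,
                      crf: int = 18, keyint: int = 24) -> list[str]:
    """Build an ffmpeg argument list that encodes PNG frames with libx265.

    The RGB to YUV conversion is done with the bt709 matrix through the scale
    filter, replacing the plain -pix_fmt yuv420p of the original command.
    crf and keyint defaults are illustrative assumptions.
    """
    return [
        "ffmpeg",
        "-r", "24000/1001",                 # input frame rate for the image sequence
        "-i", frames_pattern,
        "-vf", "scale=out_color_matrix=bt709,format=yuv420p",
        "-c:v", "libx265",
        "-crf", str(crf),                   # lower value = higher bitrate
        "-g", str(keyint),                  # maximum keyframe interval in frames
        output,
    ]


args = build_encode_args("InputFolder/frame%08d.png", "video_only.mkv")
print(" ".join(args))
```

The list can be passed to `subprocess.run(args, check=True)` as-is, which avoids shell quoting problems with the `%08d` pattern.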
  9. Are you sure the frame count of the source and the reencode are the same?
    Extract the timecodes to see whether the source is VFR (or whether the audio is stretched).
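
One way to run that check, sketched under some assumptions: the timestamps are first dumped to a text file (with a recent MKVToolNix this should be something like `mkvextract input.mkv timestamps_v2 0:timestamps.txt`), and the file is assumed to contain a `#` header line followed by one millisecond timestamp per line. The helper names and the 1 ms tolerance are illustrative choices.

```python
def parse_timestamps_v2(text: str) -> list[float]:
    """Parse a timestamps file: '#' lines are comments, the rest are milliseconds."""
    values = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        values.append(float(line))
    return sorted(values)


def frame_intervals(timestamps_ms: list[float]) -> list[float]:
    """Distances between consecutive frame timestamps, in milliseconds."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]


def looks_constant_rate(timestamps_ms: list[float],
                        nominal_ms: float = 1001 / 24,
                        tolerance_ms: float = 1.0) -> bool:
    """True if every interval is within tolerance of the nominal frame duration.

    The tolerance absorbs the rounding caused by Matroska's 1 ms timebase,
    where a 24000/1001 fps stream alternates between 41 and 42 ms steps.
    """
    return all(abs(d - nominal_ms) <= tolerance_ms
               for d in frame_intervals(timestamps_ms))


sample = """# timestamp format v2
0
42
83
125
167
"""
ts = parse_timestamps_v2(sample)
print(len(ts), "frames, constant rate:", looks_constant_rate(ts))
```

The number of parsed timestamps also gives the frame count of each file, so running this on both the source and the re-encode answers the first question as well.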
  10. Originally Posted by Selur View Post
    Are you sure the frame count of the source and the reencode are the same?
    Extract the timecodes to see whether the source is VFR (or whether the audio is stretched).
    That shouldn't be an issue, since replacement + original audio (but no subs / other streams) works ok .


    Originally Posted by PeteJobi View Post
    . Another thing to note which might be more relevant is that when I merge the frames with only the original audio stream (as opposed to all original streams except video), the audio issue does not occur. Also, when I merge the frames with the original audio AND subtitle stream, i.e without the fonts, the issue remains.
    So it suggests an issue with muxing subs with ffmpeg, or re-writing timestamps with ffmpeg


    The thing about mkvmerge is that it preserves the original timestamps (not just the video timestamps, but those of the other streams as well), so if you replace just the video stream, everything should be the same except for the replaced video. I'd be very surprised if it didn't work. Usually the culprit is ffmpeg.
  11. Originally Posted by poisondeathray View Post
    Your problem is likely from ffmpeg muxing. Notice the original file was made with mkvmerge.

    Try 1 video first, then use "show command line" to get the CLI command and adapt it to your program. mkvmerge cannot output MP4, but MP4 cannot hold all types of streams anyway, such as your ASS subtitle stream. mkvmerge can accept MP4 input or elementary streams. For proper MP4 muxing output, use mp4box. These are all command-line programs. I'm just recommending that you use the GUI first, for 1 quick test, to see if it works, so you don't waste your time. There's no use learning the 50 pages of documentation if it doesn't work and it's not the issue. If it works, you have your answer.

    It's not an issue with encoding, because when you mux the original audio with only the new video (no other streams) it works OK. You can improve seeking granularity by reducing the maximum keyframe interval, but the default is 250, and for hundreds of millions of normal videos this works fine. To reduce the keyframe interval, use -g. For example, -g 24 gives a 1 second interval at about 24 fps. An original "24p" BD would have this value. But that is not your problem. The problem is ffmpeg muxing.

    There is a separate point about quality. IMO, if you're going to upscale, you might as well do it correctly. Use a lower -crf value for a higher bitrate (less quality loss).

    For -pix_fmt, swscale is OK, but you need to specify the 709 matrix for the RGB to YUV conversion. If you don't, the colors will be slightly shifted in most players, because there will be a Rec601 vs. Rec709 mismatch. By convention, HD material uses 709. Instead of -pix_fmt, use

    Code:
    -vf scale=out_color_matrix=bt709,format=yuv420p
    Thank you for your suggestions on the ffmpeg parameters. I will tinker with those. I'll also look into muxing.
    I want to try out mkvmerge (which is part of MKVToolNix, right?), but I still don't see a way to break a video into frames. I need the actual frames so I can upscale them with a different program. I see some options under "splitting", but that doesn't seem to be what I need.
  12. I was able to load the original video and the upscaled video (the one with issues) into the program. I selected the video stream of the upscaled one and everything else from the original, and the result has no issues, i.e. an upscaled video without the audio problems. Is this what you wanted me to try?

    I also noticed something that might be relevant.
    [Image: Attachment 71981]


    The upscaled video (produced by merging frames with ffmpeg) lists more entries than the original.
    Last edited by PeteJobi; 24th Jun 2023 at 14:17. Reason: More info
  13. Update: I tried using mkvmerge to remove the extra entries I mentioned earlier from the upscaled video, and the result plays without audio issues.
    [Image: Attachment 72052]


    So if I can get ffmpeg to not generate those "tags", my problem should be solved.
  14. Originally Posted by PeteJobi View Post

    So if I can get ffmpeg to not generate those "tags", my problem should be solved.



    In theory - but mkvmerge is doing more than removing tags - it's remuxing the streams

    You can try mkvpropedit to remove the tags and test your theory. If it works, that validates the theory, and you just have to figure out how to do the same in ffmpeg next. mkvpropedit edits the file in place with no remuxing, so it is very fast.

    Code:
    "mkvpropedit" "input.mkv" --tags all:
  15. In theory - but mkvmerge is doing more than removing tags - it's remuxing the streams
    Yeah, I see that now. I just tried what I did before, but without unchecking the tags or anything else. The resulting video had no issues. So I guess the tags are not the problem.

    If I don't find a ffmpeg solution, I guess what I can do is use ffmpeg to break the video to frames, and to merge the frames back to video without any stream at all, then use mkvmerge to merge that video and the streams from the original video.

    This adds an extra step to the process, but if there's no other way.....
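
The two-step fallback described above could be driven from a program roughly like this. It is a sketch, not a tested pipeline: the function names and file names are made up for illustration, the encoder options are kept to the bare minimum, and the mkvmerge call relies on its `--no-video` option applying to the source file that follows it.

```python
def build_video_only_encode(frames_pattern: str, video_out: str) -> list[str]:
    """Step 1: ffmpeg turns the PNG frames into a video-only file."""
    return [
        "ffmpeg",
        "-r", "24000/1001",
        "-i", frames_pattern,
        "-c:v", "libx265",
        "-pix_fmt", "yuv420p",
        video_out,
    ]


def build_mkvmerge_mux(video_only: str, original: str, output: str) -> list[str]:
    """Step 2: mkvmerge takes the new video plus everything except video
    (audio, subtitles, attachments) from the original file."""
    return [
        "mkvmerge",
        "-o", output,
        video_only,
        "--no-video", original,   # option applies to the file that follows it
    ]


steps = [
    build_video_only_encode("InputFolder/frame%08d.png", "video_only.mkv"),
    build_mkvmerge_mux("video_only.mkv", "input.mkv", "output.mkv"),
]
for step in steps:
    print(" ".join(step))
```

Each list can be handed to `subprocess.run(step, check=True)` in order; the intermediate `video_only.mkv` can be deleted once the second step succeeds.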

    How do I get mkvmerge cli?
  16. Originally Posted by PeteJobi View Post
    This adds an extra step to the process, but if there's no other way.....
    There are various issues with ffmpeg's MKV muxer and MP4 muxer. Most GUIs use the actual command-line tools, mkvmerge and mp4box, for the multiplexing stage.

    How do I get mkvmerge cli?
    Show the commandline that you used in the GUI by using multiplexer => show commandline

    Learn from the example, and have a look at the nice documentation and adapt it to your program
  17. Originally Posted by poisondeathray View Post
    Originally Posted by PeteJobi View Post
    This adds an extra step to the process, but if there's no other way.....
    There are various issues with ffmpeg's MKV muxer and MP4 muxer. Most GUIs use the actual command-line tools, mkvmerge and mp4box, for the multiplexing stage.

    How do I get mkvmerge cli?
    Show the commandline that you used in the GUI by using multiplexer => show commandline

    Learn from the example, and have a look at the nice documentation and adapt it to your program
    Thanks. Will do.

    I figured I could get rid of ffmpeg completely if I could find another cli program that could split videos into frames and back. I'm trying to minimize file size of dependencies, and ffmpeg is too large for what I use it for. Do you know a smaller software that does this? (or do you think it's a better idea to stick with ffmpeg for that?)
  18. I don't have any good ideas for a replacement that makes things simpler with smaller binaries and simpler dependencies. I would stick with ffmpeg and mkvmerge.

    When looking at replacement options, you would still need to convert the YUV video to RGB PNG images, run them through the GAN or whatever processing does the upscale, and convert the PNGs back to YUV video. You still need an encoder and a muxer too.

    There are implementations that don't need PNG images. For example, you can run RealESRGAN (or other machine learning algorithms) in vapoursynth, or some through avisynth, but they take up disk space too, either as an install or as a "portable" version.


    If you compile ffmpeg yourself, you can disable many of the features to make the binary smaller. The precompiled builds that you download usually include many libraries that are never used but balloon the file size. You can strip out everything you don't need. A commonly used build tool for ffmpeg is MABS, the ffmpeg autobuild suite.

    ffmpeg is nice in that it bundles many encoders (libx265, libx264, etc.), demuxers and muxers. It's like a "Swiss army knife", but it still has issues.


