VideoHelp Forum
  1. It's been a while since I had a chance to do any encoding tests, so I figured I would do one with a test source I have never used: the nearly 100 GB Meridian test file that can be found on Xiph's Derf collection.

    There are a number of things I need to explain about this test:

    I'm sure that many people will complain about and criticize this test, and in some ways I will actually agree with them. The source is 1058 Mb/s, 3840x2160, 59.940 fps, JPEG 2000 (BCM@L6), 4:2:2 10-bit, and I used Fedora 33 running on an HP laptop that has an i5-1035G1 CPU with integrated UHD Graphics (ICL GT1).

    I have upgraded this laptop from 8 GB (1x8) @ 2400 MHz to 16 GB DDR4 (2x8) @ 2666 MHz, and I have replaced the cheap 128 GB SSD and 1 TB 5400 RPM HDD with a 1 TB NVMe drive and a 1 TB SSD.

    This codec is brutal to decode, and I have never seen a system that could decode it in real time. In fact, according to MediaInfo, this file was created using Colorfront Transkoder running on Windows 7, and from everything I can find, Colorfront Transkoder is GPU accelerated.

    For the test I used Handbrake with x264 (Very Fast preset, 10-bit) and Intel's QSV HEVC (10-bit, quality setting). Both were set to 5 Mb/s in one-pass mode, targeting 1280x720 @ 59.94 fps.

    Encode speed was 2.98 fps for the x264 encode and 3.02 fps for the Intel HEVC encode. There is a severe decode and resize bottleneck, with both encodes using only about 50% of the CPU for the entire 5+ hours per encode.
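    For anyone who wants to try something similar outside of Handbrake, the two encodes above roughly correspond to these ffmpeg invocations. This is only a sketch: `source.mxf` is a placeholder for the Meridian file, and the exact rate-control flags Handbrake uses internally are assumptions.

```shell
# x264 10-bit, Very Fast preset, one-pass 5 Mb/s, resized to 1280x720
ffmpeg -i source.mxf -vf scale=1280:720 -pix_fmt yuv420p10le \
       -c:v libx264 -preset veryfast -b:v 5M -an out_x264.mkv

# Intel QSV HEVC 10-bit, one-pass 5 Mb/s (decode stays in software,
# since QSV has no JPEG 2000 decoder; frames are uploaded to the GPU)
ffmpeg -i source.mxf -vf scale=1280:720 -pix_fmt p010le \
       -c:v hevc_qsv -b:v 5M -an out_qsv.mkv
```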

    This is why I didn't do a 2-pass encode or use a slower preset for x264: it took so long with these settings that I consider it unusable for normal encodes.

    Furthermore, I tried a CRF encode, but it only used about 2200 kb/s and the quality was atrocious. The interesting thing is that even though I chose one-pass 5 Mb/s, both encoders only used about 4200 kb/s.

    Unless you use software that has hardware decoding for this codec, you will not be able to transcode it in real time. You would need a CPU with 20 times the cores and threads, and even then you would be bottlenecked by the resize filter, so you would need hardware resizing as well.

    Thoughts are welcome.
    Last edited by sophisticles; 6th Jan 2021 at 18:34.
  2. Why not perform all the preprocessing separately and encode from that separate intermediate? That way you could separate the preprocessing steps from the actual encoder(s), and you could determine the actual encoding speeds instead of contaminating the results with pre-processing bottlenecks. Also, you could do many tests quickly, instead of incurring the bottleneck penalty on each encoding run.
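    As a sketch of that split (the filenames and lossless settings here are assumptions, not something from this thread): pay the decode and resize cost once into a fast-to-decode lossless intermediate, then run every encoder test against that file.

```shell
# Step 1 (slow, done once): decode the JPEG 2000 source and resize,
# storing the result losslessly so quality comparisons stay valid
ffmpeg -i source.mxf -vf scale=1280:720 -pix_fmt yuv420p10le \
       -c:v libx264 -qp 0 -preset ultrafast -an intermediate.mkv

# Step 2 (fast, repeatable): each test now reads a cheap-to-decode file,
# so the measured speed reflects the encoder, not the source decode
ffmpeg -i intermediate.mkv -c:v libx264 -preset veryfast -b:v 5M out_test.mkv
```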

    Or is that what you're testing on purpose? i.e. are you testing a specific workflow, such as what happens when faced with bottlenecks?

    Why did you use Handbrake? Does it have a true 10-bit pipeline yet? It's not a common tool for someone using a 10-bit 4:2:2 source who is planning a 10-bit encode, because of its history of converting through an 8-bit intermediate.

    Why did you choose to look at x264? Wouldn't x265 make more sense? HEVC has 10-bit decoding support in hardware, which is much more common; AVC has almost none. Usually the only time 10-bit AVC is used is for intermediates and acquisition (AVC-Intra/Ultra variants).
  3. You raised many of the objections I expected, and in fact grappled with myself.

    In order:

    I did create a number of intermediates, including x264 lossless, UT Video, and Huffyuv. The thing is, each took over 20 hours to create and resulted in such huge bitrates that there was still a bottleneck.

    I had hoped to feed the x264 lossless file into Avidemux, because it supports hardware decode via both VAAPI and VDPAU and hardware resize via VAAPI, but I found it wouldn't work.

    In frustration I ended up deleting all the intermediates I made, only to realize that I had been using the i965 driver instead of the iHD driver. Now that I've checked, not only does hardware decode work but so do the OpenGL and VAAPI resizes, so I may redo it with a lossless x264 intermediate.
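    For reference, with the iHD driver active, ffmpeg can keep decode, resize, and encode on the GPU via VAAPI; something like this sketch (the device path and filenames are assumptions). One caveat: consumer H.264 hardware decoders generally only handle 8-bit 4:2:0 up to High profile, so a 10-bit or lossless x264 intermediate may still fall back to software decode.

```shell
# Decode, resize, and encode entirely through VAAPI;
# scale_vaapi does the 1280x720 resize on the GPU
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
       -hwaccel_output_format vaapi -i intermediate.mkv \
       -vf 'scale_vaapi=w=1280:h=720:format=p010' \
       -c:v hevc_vaapi -b:v 5M out_vaapi.mkv
```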

    I can't stand Handbrake, and no, it still does not support a true 10-bit pipeline, but it does make it easy to create a batch job and it does allow 10-bit Intel HEVC, two things Avidemux does not. Shotcut does not support 10-bit HEVC encoding either, at least not easily; I may be able to do it by manually adding the options, but I haven't tried.

    I didn't use x265 because it's so CPU intensive that the encode tests would have taken twice as much time.

    As for 10-bit HEVC, maybe 10-bit 4:2:0 hardware decode is becoming common, but the only thing that features hardware 10-bit 4:2:2 decode is Apple's new M1.

    I'm going to redo this test using an x264 lossless intermediate and Avidemux hardware decode and resize, but with an 8-bit target for both the HEVC and x265 encodes.

    Check back in a day or two.
  4. Originally Posted by sophisticles
    I did create a number of intermediates, including x264 lossless, UT Video, and Huffyuv. The thing is each took over 20 hours to create and resulted in such huge bitrates that there was still a bottleneck.
    What is your goal? I thought you were encoding 1280x720p59.94? How is that lossless codec bitrate going to be a bottleneck? It's going to decode a lot faster than that UHD JPEG 2000 source.

    Are you encoding 10-bit 4:2:0 or 10-bit 4:2:2? Huffyuv classic does not support 10-bit either way.

    UT Video supports 10-bit 4:2:0 and 10-bit 4:2:2, but not when encoding from libavcodec/ffmpeg; only the Windows VFW/ACM codec supports encoding 10-bit UT Video.

    FFVHuff (ffmpeg) and FFV1 support all of those pixel format configurations.

    I would fold all the preprocessing steps into the intermediate if you set out to test encoders (not some workflow). That way you're testing the actual encoder and encoding speed, not some other bottleneck or other factors like resizing or pixel format conversions (which are not part of the encoder).
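    A quick way to confirm which pixel formats an ffmpeg encoder accepts is to ask ffmpeg itself (encoder names as shipped in ffmpeg):

```shell
# Each command prints a "Supported pixel formats" line;
# ffv1 and ffvhuff both list 10-bit 4:2:0 and 4:2:2 formats
ffmpeg -hide_banner -h encoder=ffv1    | grep -i 'pixel formats'
ffmpeg -hide_banner -h encoder=ffvhuff | grep -i 'pixel formats'
```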
  5. My goal was twofold: I wanted to watch the Meridian movie, and at the same time I wanted to test the quality of Ice Lake's QSV encoder against x264. Initially I was just going to watch the 720p version and run encoding tests using the 4K version, so I created 4K intermediates, and those were the ones whose data rates resulted in bottlenecks just as big as the source.

    Now that I have watched the movie, I am going to make an x264 lossless 4K version and feed that into Avidemux, which hopefully will let me use hardware decode and eliminate that bottleneck, and then do a bunch of test encodes in a reasonable amount of time.
  6. Originally Posted by sophisticles
    This is why I didn't do a 2 pass encode or use a slower preset for x264, it took so long with these settings that I consider it unusable for normal encodes.
    Would a slower preset be slower, though? It's generally only slower if a faster preset is already utilizing 100% of the CPU. If it's not, because there's a bottleneck earlier on, doesn't that mean there are free CPU cycles for a slower preset to use without it actually being slower? If you're running low on memory and a slower preset is looking further ahead, maybe that would make a difference.

    Originally Posted by sophisticles
    Furthermore, I tried a CRF encode but it only used about 2200 kb/s and the quality was atrocious. The interesting thing is that even though I chose one-pass 5 Mb/s, both encoders only used 4200 kb/s.
    Does that mean the CRF value should have been lower?
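    (As a rough rule of thumb for x264, a change of 6 in CRF roughly halves or doubles the bitrate, so moving from the ~2200 kb/s the CRF encode produced up to the ~5000 kb/s target would mean lowering CRF by about 7. This is only a heuristic, not an exact relationship:)

```shell
# CRF delta needed ~= 6 * log2(target_bitrate / observed_bitrate)
awk 'BEGIN { printf "%.1f\n", 6 * log(5000/2200) / log(2) }'
# prints 7.1
```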
    Average bitrate encoding doesn't come close to CRF or 2-pass for quality anyway. Whenever I read "bitrate" without a 2-pass qualification for x264, I lose interest.


