The title is not a typo. I was doing some test encodes on Windows and Linux, using Handbrake and Hybrid, when I ran across an oddity that I am hoping someone can explain.
The source mediainfo says:
Format : JPEG 2000
Format profile : D-Cinema 4k
Format settings, wrapping mode : Frame
Codec ID : 0D010301020C0100-0401020203010104
Duration : 20 s 833 ms
Bit rate : 124 Mb/s
Width : 4 096 pixels
Height : 1 716 pixels
Display aspect ratio : 2.40:1
Frame rate : 24.000 FPS
Color space : XYZ
Chroma subsampling : 4:4:4
Bit depth : 12 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.737
Stream size : 309 MiB (100%)
Title : Picture Track
Color range : Full
I created two x264 test encodes, one with Handbrake and one with Hybrid, both at crf 18, preset slow, tune film. At crf 18 both should be visually lossless relative to the original.
When I checked both files I noticed that the one created by Handbrake was significantly smaller than the one created by Hybrid, because it used about 6.6 Mb/s less bit rate (40.2 vs 46.8 Mb/s).
Here's the mediainfo for both files, Handbrake first, then Hybrid:
Format : AVC
Format/Info : Advanced Video Codec
Format profile : Main@L5.1
Format settings : CABAC / 4 Ref Frames
Format settings, CABAC : Yes
Format settings, Reference frames : 4 frames
Codec ID : V_MPEG4/ISO/AVC
Duration : 20 s 834 ms
Bit rate : 40.2 Mb/s
Width : 4 096 pixels
Height : 1 712 pixels
Display aspect ratio : 2.40:1
Frame rate mode : Constant
Frame rate : 24.000 FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.239
Stream size : 99.7 MiB (98%)
Writing library : x264 core 164 r3100 ed0f7a6
Encoding settings : cabac=1 / ref=2 / deblock=1:-1:-1 / analyse=0x1:0x111 / me=hex / subme=6 / psy=1 / psy_rd=1.00:0.15 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=0 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=-3 / threads=12 / lookahead_threads=2 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=1 / keyint=240 / keyint_min=24 / scenecut=40 / intra_refresh=0 / rc_lookahead=30 / rc=crf / mbtree=1 / crf=18.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / vbv_maxrate=240000 / vbv_bufsize=240000 / crf_max=0.0 / nal_hrd=none / filler=0 / ip_ratio=1.40 / aq=1:1.00
Default : Yes
Forced : No
Color range : Limited
Color primaries : BT.709
Transfer characteristics : BT.709
Matrix coefficients : BT.709
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L5.1
Format settings : CABAC / 5 Ref Frames
Format settings, CABAC : Yes
Format settings, Reference frames : 5 frames
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 20 s 833 ms
Bit rate : 46.8 Mb/s
Maximum bit rate : 96.9 Mb/s
Width : 4 096 pixels
Height : 1 712 pixels
Display aspect ratio : 2.40:1
Frame rate mode : Constant
Frame rate : 24.000 FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.278
Stream size : 116 MiB (100%)
Title : Picture Track
Writing library : x264 core 164 r3094 bfc87b7
Encoding settings : cabac=1 / ref=5 / deblock=1:-1:-1 / analyse=0x3:0x113 / me=hex / subme=8 / psy=1 / psy_rd=1.00:0.15 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=2 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=-3 / threads=12 / lookahead_threads=2 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=3 / weightb=1 / open_gop=0 / weightp=2 / keyint=250 / keyint_min=24 / scenecut=40 / intra_refresh=0 / rc_lookahead=50 / rc=crf / mbtree=1 / crf=18.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / vbv_maxrate=300000 / vbv_bufsize=300000 / crf_max=0.0 / nal_hrd=none / filler=0 / ip_ratio=1.40 / aq=1:1.00
Encoded date : UTC 2022-12-24 00:14:11
Tagged date : UTC 2022-12-24 00:14:17
Color range : Full
Matrix coefficients : BT.2020 constant
Here's the thing: the Hybrid encode used more reference frames (5 vs 4), more bit rate, and a theoretically superior color range (full vs limited). Both are crf 18, so in theory visually lossless compared to the original, yet PSNR, SSIM, and VMAF all favor the Handbrake encode by a significant amount.
total PSNR
Handbrake 33.24382019
Hybrid 26.86223602
min PSNR
Handbrake 28.80747032
Hybrid 25.20804405
max PSNR
Handbrake 35.61754608
Hybrid 28.81113052
mean SSIM
Handbrake 0.946880043
Hybrid 0.902881146
min SSIM
Handbrake 0.333477795
Hybrid 0.333593279
max SSIM
Handbrake 0.975139856
Hybrid 0.943214178
VMAF
Handbrake 58.50056839
Hybrid 45.81596375
min VMAF
Handbrake 45.66186142
Hybrid 35.04164124
What I am wondering is how both encodes could be theoretically visually lossless to the original when the encode with the lower bit rate and limited color range scores higher on three of the most popular objective metrics available.
Source:
https://mango.blender.org/production/4k-dcp-available-for-testing/
If this is ToS, something is wrong with your encoding and/or measuring method. A black frame should have a very high max PSNR, near infinity, but you have 35 and 28. Even if one of them has the wrong range (full vs limited), you have both cases covered, so one of them should score very high for max PSNR.
If the source MXF is converted from XYZ to limited range YUV as the "source", yet one encode is mapped to full, the full version will score lower. Or if you're converting the encodes to 12bit XYZ 444 and the range is mapped incorrectly, that will skew the results too. Basically, the one that is mismatched to whatever your reference is should score lower. Also, one is flagged bt.2020 and the other is flagged rec.709, which can affect some players or measuring software. So depending on how you test the source, or convert the encodes to match the source for testing, the results will be skewed too.
i.e. you're not controlling the variables, the pixel type conversions, or the range conversions. 12bit XYZ is not directly comparable to YUV420P8; there have to be some conversions somewhere. If you specify the exact method to convert to/from XYZ and to/from YUV420P8 (or whatever you're testing), you should get the correct measurements -
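To make the black-frame and range-mismatch points concrete, here is a minimal Python sketch. It is only an illustration, not what MSU VQMT does internally. It assumes 8-bit luma samples and the usual 16-235 limited range mapping, and the names psnr and full_to_limited are made up for this example.

```python
import math

def psnr(ref, test, peak=255):
    """PSNR in dB between two equal-length sequences of 8-bit samples."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    # Identical frames have zero error, so PSNR is unbounded.
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

def full_to_limited(v):
    """Map a full range 8-bit luma code (0-255) to limited range (16-235)."""
    return round(16 + v * 219 / 255)

# A black frame compared with an identical black frame.
black = [0] * 64
print(psnr(black, black))

# A full range ramp compared with its range-compressed copy,
# without expanding the copy back to full range first.
ramp = list(range(256))
ramp_limited = [full_to_limited(v) for v in ramp]
print(round(psnr(ramp, ramp_limited), 2))
```

The first comparison gives infinite PSNR, which is why a max PSNR of 35 on a clip containing black frames looks suspicious. The second lands below 30 dB even though no lossy compression was applied at all; the whole error comes from comparing full range data against limited range data.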
I simply took the ToS mxf file I linked to and loaded it into Handbrake and Hybrid. In Hybrid I chose "full range"; I see no such option in Handbrake. I chose x264 crf 18, tune film, slow preset for both of them.
For testing I used MSU Video Quality Measurement Tool, free version.
I think there's something wrong with both Hybrid and Handbrake. Handbrake, for crf 0 svt-av1 10-bit, produces a file that is 6 MB in size because it uses just 2375 kb/s.
Meanwhile crf 1 creates a file over 400 MB with a bit rate of 171 Mb/s.
The sad truth is that I can barely tell the difference. -
Try to control the variables, or the metrics have no value. An average PSNR of 33 is not great quality, nor is a VMAF of 58. The measurements should be higher. The low max PSNR is a big red flag when you have a black frame.
It would probably help you to convert the source to something more standard first, then use that as the "source". I suspect the 12bit XYZ mxf source is giving MSU problems.
HB: unless something has changed, I think Handbrake always compresses the range whenever the source has a full range flag. You can encode with fullrange=on in the advanced options, but the actual range will still be compressed from a full range source and just flagged as full.
Quote:
"I think there's something wrong with both Hybrid and Handbrake. Handbrake, for crf 0 svt-av1 10-bit, produces a file that is 6 MB in size because it uses just 2375 kb/s.
Meanwhile crf 1 creates a file over 400 MB with a bit rate of 171 Mb/s.
The sad truth is that I can barely tell the difference."
At that bitrate AV1 looks great. x264 would look much worse at that bitrate.
The AV1 encode is "watchable", but there is a significant quality difference between the encodes. The AV1 encode is much softer, with less detail and texture. The result is similar to applying a denoiser that smooths everything over and discards the high frequency details -
If you have full range input, but not flagged as full (e.g. use a lossless intermediate that doesn't flag the range, or strip the full range flag), then use fullrange=on, HB will encode full range data and flag it properly as full range
=> Whenever HB "sees" the full range flag, it will range compress the encode -
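As a rough illustration of why range compression is not a free operation at 8 bits, the sketch below squeezes every full range luma code into 16-235 and expands it back. This is only the textbook luma mapping, assumed here for illustration (chroma uses 16-240, and Handbrake's actual filter chain may round or dither differently). The function names are hypothetical.

```python
def full_to_limited(v):
    """Compress a full range 8-bit luma code (0-255) into 16-235."""
    return round(16 + v * 219 / 255)

def limited_to_full(v):
    """Expand a limited range 8-bit luma code (16-235) back to 0-255."""
    return min(255, max(0, round((v - 16) * 255 / 219)))

codes = list(range(256))
compressed = [full_to_limited(v) for v in codes]

# 256 input codes have to share the 220 codes available in 16-235.
distinct = len(set(compressed))

# Expand again and count how many codes did not survive the round trip.
round_trip = [limited_to_full(c) for c in compressed]
changed = sum(1 for v, r in zip(codes, round_trip) if v != r)

print(distinct, changed)
```

Some input codes collapse onto the same output code, so an encode whose range was compressed can never match a full range reference exactly, even before the encoder itself loses anything.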
Only on mobile for the next few days, but I suspect that your source has no color matrix info.
Hybrid guesses "Matrix coefficients : BT.2020 constant" due to the resolution; Handbrake guesses "Matrix coefficients : BT.709".
Assuming you didn't change that info in either of them, both should pass that info to the muxer.
Depending on your input, one of the flags is wrong, and thus the decoding of the source will use the wrong color matrix, which will probably influence the results of your measurements. -
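A small Python sketch of what a wrong matrix flag does: RGB is converted to YCbCr with one set of coefficients and decoded with another. To keep it short it uses normalized floating point values and the BT.2020 non-constant-luminance coefficients; that is a simplifying assumption, since "BT.2020 constant" in the mediainfo refers to the constant-luminance variant, which works differently. The function names are made up for the example.

```python
# (Kr, Kb) luma coefficients; Kg follows as 1 - Kr - Kb.
BT709 = (0.2126, 0.0722)
BT2020_NCL = (0.2627, 0.0593)  # non-constant luminance, assumed for simplicity

def rgb_to_ycbcr(r, g, b, k):
    """Convert normalized R'G'B' (0..1) to Y'CbCr with the given coefficients."""
    kr, kb = k
    kg = 1 - kr - kb
    y = kr * r + kg * g + kb * b
    cb = (b - y) / (2 * (1 - kb))
    cr = (r - y) / (2 * (1 - kr))
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr, k):
    """Invert rgb_to_ycbcr for the given coefficients."""
    kr, kb = k
    kg = 1 - kr - kb
    r = y + 2 * (1 - kr) * cr
    b = y + 2 * (1 - kb) * cb
    g = (y - kr * r - kb * b) / kg
    return r, g, b

red = (1.0, 0.0, 0.0)
encoded = rgb_to_ycbcr(*red, BT709)

matched = ycbcr_to_rgb(*encoded, BT709)          # same matrix both ways
mismatched = ycbcr_to_rgb(*encoded, BT2020_NCL)  # wrong matrix on decode

print([round(c, 4) for c in matched])
print([round(c, 4) for c in mismatched])
```

With matching matrices pure red comes back unchanged; with the mismatched pair the red channel drops by a few percent and green goes slightly negative. A metric tool that trusts the wrong flag sees that shift as encoding error.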
Yes, depends on your definition of "visually lossless".
Some people call ProRes HQ or Cineform filmscan quality "visually lossless" too, and there is a massive quality difference between those and x264 crf 18. How can both be "visually lossless"?
Yet other people call YouTube "visually lossless" to a Blu-ray and cannot "see" the difference. Go figure...
Another common error in metric testing is a different container timebase, e.g. MP4, MKV and MXF use different timebases. If you don't compensate for that, some tools (e.g. ffmpeg) will give you the wrong results even if the frames match up. Avisynth is CFR only, and if you use AssumeFPS(something), you can also double check that the timestamps and frames will match up. You also have control over the pixel type conversions and the chroma up/down sampling algorithms. -
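A quick Python sketch of the timebase problem. Matroska stores timestamps in 1 ms ticks by default, while MP4 uses a per-track timescale; the 1/24000 value below is only an assumed, typical choice for 24 fps material. The helper names are hypothetical.

```python
from fractions import Fraction

FPS = Fraction(24)
MKV_TIMEBASE = Fraction(1, 1000)   # Matroska default: 1 ms ticks
MP4_TIMEBASE = Fraction(1, 24000)  # assumed MP4 timescale for 24 fps

def pts_ticks(frame_index, timebase):
    """Integer tick count closest to the frame's ideal presentation time."""
    ideal_seconds = Fraction(frame_index) / FPS
    return round(ideal_seconds / timebase)

def timestamps_match(frame_index):
    """True if both containers store exactly the same time for this frame."""
    mkv_time = pts_ticks(frame_index, MKV_TIMEBASE) * MKV_TIMEBASE
    mp4_time = pts_ticks(frame_index, MP4_TIMEBASE) * MP4_TIMEBASE
    return mkv_time == mp4_time

for n in range(4):
    print(n, pts_ticks(n, MKV_TIMEBASE), pts_ticks(n, MP4_TIMEBASE), timestamps_match(n))
```

Frame 1 sits at 41.667 ms, which Matroska has to round to 42 ms, so a tool that pairs frames by timestamp rather than by frame index can compare the wrong frames or drop some. Forcing both clips to the same constant frame rate before measuring avoids this.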
I decided to follow PD's advice and transcode the above linked source to a different format. I used Shotcut running on Ubuntu to transcode it to ProRes ks and HuffYUV. Interestingly, both transcoded to 4:2:2 BT.709:
Format : HuffYUV
Format version : Version 2
Codec ID : V_MS/VFW/FOURCC / HFYU
Duration : 20 s 834 ms
Bit rate : 864 Mb/s
Width : 4 096 pixels
Height : 1 716 pixels
Display aspect ratio : 2.40:1
Frame rate mode : Constant
Frame rate : 24.000 FPS
Color space : YUV
Chroma subsampling : 4:2:2
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 5.124
Stream size : 2.10 GiB (98%)
Default : No
Forced : No
Color range : Full
Color primaries : BT.709
Transfer characteristics : BT.709
Matrix coefficients : BT.709
Format : ProRes
Format version : Version 0
Format profile : 422 HQ
Codec ID : apch
Duration : 20 s 834 ms
Bit rate mode : Variable
Bit rate : 1 085 Mb/s
Width : 4 096 pixels
Height : 1 716 pixels
Display aspect ratio : 2.40:1
Frame rate mode : Constant
Frame rate : 24.000 FPS
Color space : YUV
Chroma subsampling : 4:2:2
Scan type : Progressive
Bits/(Pixel*Frame) : 6.429
Stream size : 2.63 GiB (100%)
Writing library : Apple
Color primaries : BT.709
Transfer characteristics : BT.709
Matrix coefficients : BT.709 -
HuffYUV will be 8bit and 422. Great if you're testing 8bit 422 encodes; not great if you're testing 8bit 420 encodes, because you introduce other confounding variables. If you control those variables, great; otherwise your results will not be correct (even if you use lossless encoding).
Same with ProRes HQ (10bit 422): errors and differences between procedures. Great if you're going to test 10bit 422 encoding; not so great for anything else.
1) Downsampling chroma from 4:4:4 to 4:2:2 or 4:2:0, and upsampling back to 4:4:4 or 4:2:2 if testing against the "source": there are many different algorithms with different results, even if both use lossless encoding. E.g. if one uses bicubic but the other uses bilinear, you're going to get different results.
2) Bit depth conversions, e.g. from higher to lower: if one tool dithers and another does not, you will get different results even with lossless encoding. Even if both dither, there are dozens of different dither algorithms. E.g. if one uses sierra 4a but another uses floyd-steinberg, you get different results.
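Both points can be shown with a toy Python sketch. It is deliberately simplified: plain truncation and rounding stand in for the real dither algorithms, and nearest-neighbour and linear interpolation stand in for bicubic and bilinear chroma resamplers. All function names are made up for the example.

```python
def to_8bit_truncate(v12):
    """12-bit to 8-bit by dropping the low 4 bits."""
    return v12 >> 4

def to_8bit_round(v12):
    """12-bit to 8-bit by rounding to nearest, clamped to 255."""
    return min(255, (v12 + 8) >> 4)

def upsample_nearest(chroma):
    """Double the horizontal chroma resolution by repeating each sample."""
    return [v for v in chroma for _ in range(2)]

def upsample_linear(chroma):
    """Double the horizontal chroma resolution by averaging neighbours."""
    out = []
    for i, v in enumerate(chroma):
        nxt = chroma[min(i + 1, len(chroma) - 1)]  # repeat the last sample at the edge
        out += [v, (v + nxt) / 2]
    return out

src12 = [8, 24, 2000, 2047, 4095]
print([to_8bit_truncate(v) for v in src12])
print([to_8bit_round(v) for v in src12])

chroma = [100, 200]
print(upsample_nearest(chroma))
print(upsample_linear(chroma))
```

The same 12-bit samples and the same chroma plane come out with different values depending on which method a tool happens to use, so two "lossless" intermediates made by different programs are not interchangeable as a reference.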
Shotcut, for example: I think it is 8bit. If you load a 10bit video (e.g. a 10bit smooth gradient) and export 10bit ProRes, it will be 8bit data stored as 10bit (no longer a smooth gradient). That ProRes video will not be true 10bit data, so if you were testing 10bit encodes, not so good.
Ideally, you would properly convert to the same pixel format that you are testing (if you were testing 8bit 4:2:0 encodes, use something lossless; original HuffYUV does not support 4:2:0) and use that as the source to eliminate all those variables, so everything is constant and the same for each encoder input. Then the only thing you are testing is the encoder, not the other processing which may pollute the results. Or you can control all the down/up conversions and test against the 12bit 444 XYZ source, as long as you're consistent. Some programs might not give you that control (e.g. Handbrake), so it's usually better to test the encoder directly (CLI version).
Last edited by poisondeathray; 24th Dec 2022 at 13:55.