VideoHelp Forum
  1. It's me again.

    In this series of encoding tests I tried to see if I could transcode a very high quality source to an h264, hevc, or av1 delivery format file that scored at least 45dB PSNR YUV.

    The answer is yes and no.

    For test source I used this file:

    http://download.blender.org/mango/dcp/tos_dcp_test_04.zip

    I have used this file before and really like it because it's a professional, cinema quality source, made specifically for testing digital cinema servers and frankly it doesn't get much better than this.

    The goal was to see how much bit rate it would take to achieve at least 45dB PSNR with x264, x265 and svt-av1.

    PSNR stands for peak signal-to-noise ratio and, as the name implies, is a measure of the ratio between how much of a source is received from a transmission and how much static or noise is received, or at least that's what it originally was back in the early radio broadcast days.
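    For video, PSNR is usually computed from the mean squared error between the source samples and the encoded samples. A minimal sketch, assuming each plane is available as a flat list of integer samples; the function names are illustrative and not taken from any tool used in this thread:

```python
import math

def mse(ref, test):
    """Mean squared error between two equal-length sample sequences."""
    if len(ref) != len(test):
        raise ValueError("planes must have the same number of samples")
    return sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)

def psnr(ref, test, bit_depth=8):
    """PSNR in dB; the peak is the largest code value for the bit depth."""
    peak = (1 << bit_depth) - 1
    err = mse(ref, test)
    if err == 0:
        return math.inf  # identical planes
    return 10 * math.log10(peak ** 2 / err)

# Every sample off by one code value at 8 bits, so MSE = 1
print(round(psnr([16, 128, 235], [17, 129, 236]), 2))  # 48.13
```

    Because the peak is part of the formula, the same function covers 8, 10 and 12-bit material by changing `bit_depth`.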

    Today this measurement has been extended to video and as far as I'm concerned, it's still the gold standard for video quality measurement.

    A word about PSNR; it has gotten a bad rap over the years, mostly because people that have used it to measure quality difference do not understand it and partly because of certain individuals that capitalized on this lack of understanding to spread FUD with the goal of promoting their own software.

    Back in the day, software like ffmpeg measured PSNR only on the Y channel of I frames. The problem with this should be obvious. If two encoders were compared and one encoder produced really high quality Y channel I frames but the other encoder produced higher quality UV channels and higher quality overall P and B frames, then your eyes would tell you that the second encoder produced the superior video but the PSNR calculation would tell you that the first one was the superior encode.

    Other mistakes include looking at the maximum PSNR each encoder produced; a slightly better, but still problematic method, is looking at the average or even the mean average.

    I have a background in analyzing data, I spent years analyzing tens of millions of dollars worth of yearly shipments, and I have come to think of data differently than most people.

    I never look at the maximum value, and give only passing weight to average and mean average values; I focus on the worst case scenario or minimum values and how many instances occur within a certain radius of the minimum value.

    My reasoning is that the perception of success or failure is defined by how low the lowest performance is; human nature is such that if you have excellent results most of the time but there is one atrocious outcome, then the idea that the thing being judged is of low value starts taking hold and once that happens, it becomes very difficult to change that perception, and in all honesty, there is significant validity in this human trait.
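    The worst-case view described above can be sketched in a few lines. This assumes the per-frame PSNR values have already been exported to a list; the function name, the radius of 1 dB and the sample values are made up for illustration:

```python
def worst_case_summary(values, radius=1.0):
    """Return the minimum and how many values lie within `radius` of it."""
    if not values:
        raise ValueError("need at least one per-frame value")
    lowest = min(values)
    near_min = sum(1 for v in values if v - lowest <= radius)
    return lowest, near_min

# Made-up per-frame PSNR values, for illustration only
frames = [48.2, 47.9, 41.3, 41.9, 50.1, 42.6]
print(worst_case_summary(frames, radius=1.0))  # (41.3, 2)
```

    The count includes the minimum frame itself, so a result of 1 means the worst frame is an isolated outlier.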

    So for this test, I encoded the above referenced file with x264, medium preset, in both tune psnr and tune film variants; x265, 12-bit 444, preset veryfast, with both tune grain and tune psnr; and svt-av1, 10-bit, preset 11.

    The goal was to see if any of these could achieve a PSNR of at least 45dB.

    ToS x265 veryfast tune grain 12bit 444.mp4
    max. val 53.81000519

    ToS x265 veryfast tune psnr 12bit 444.mp4
    max. val 69.64788818

    ToS x264 medium tune film.mkv
    max. val 30.16662788

    ToS x264 medium tune psnr.mkv
    max. val 30.16666794

    ToS svt-av1 10-bit.mkv
    max. val 71.25263977

    If I simply posted these numbers, or graphed them in a chart, many would marvel at the PSNR x265 and svt-av1 achieved, but if you watched the files you would not conclude that either of these is twice as good as x264, thereby further validating the claim that PSNR is a poor predictor of quality.

    But as I alluded to earlier, the devil is in the details:

    ToS x265 veryfast tune grain 12bit 444.mp4
    min. val 29.12540627

    ToS x265 veryfast tune psnr 12bit 444.mp4
    min. val 29.12562561

    ToS x264 medium tune film.mkv
    min. val 26.73370171

    ToS x264 medium tune psnr.mkv
    min. val 26.73874855

    ToS svt-av1 10-bit.mkv
    min. val 33.103508

    This paints a very different picture, doesn't it? Here we see that the minimum PSNR achieved by each encoder is a lot closer than the maximum achieved by each encoder.

    More importantly, if we look at the maximum PSNR achieved by x264, 30.16662788/30.16666794, we see that it is actually LOWER than the minimum PSNR achieved by svt-av1, 33.103508.

    But this is misleading as well, because the x264 encode was 8-bit and the svt-av1 encode is 10-bit and the PSNR scale changes between 8/10/12 bit and between 420/422/444.

    With 500 frames and 5 encodes, for 2500 frames total, there are a lot of ways to parse the data.

    Bottom line is that none of these encodes achieved a PSNR of at least 45dB in any meaningful way.

    Really surprising to say the least, considering I used what can be considered 4k BD bit rates for all encodes. This tells me that the JPEG 2000 format used for digital cinema is incredibly efficient, maybe as efficient as AVC.
    Last edited by sophisticles; 14th Jan 2023 at 19:19.
  2. You could argue it's not a realistic test, because 12bit444 is not going to be supported by many players. It's certainly not an end user pixel format

    Was the bitrate goal ~75 - 80Mb/s ? You could probably use better settings to achieve 45dB min



    When you have errors or inconsistencies in your methodology, you often come to the wrong conclusions

    The encodes don't match in range:
    1) The svt-av1 encode has limited range. The x265 encode has the contrast expanded to full range. The source DCP raw data is actually limited range data, but flagged as full. So the x265 encode will be penalized.

    2) The x265 encode has fluctuating CbCr values on the black frame. Instead of CbCr 2048,2048, it varies +/-4 every pixel. Likely it has been decoded at a higher bit depth, then dithered in the down conversion. This is like adding noise, which will, of course, drop the PSNR.

    If you encode the DCP directly, libx265 12bit444 @ crf12 using the "slow" preset you get 69.7Mb/s , 174MB filesize
    PSNR y:52.401247 u:50.779612 v:55.001162 average:52.398049 min:48.628810 max:85.736507
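    The per-plane figures and the "average" in that line can be related by averaging mean squared errors rather than dB values. A rough sketch, assuming the reporting tool derives its average from the mean of the per-plane MSEs (equal weights, since all planes are the same size in 4:4:4); the function name and the weighting parameter are illustrative:

```python
import math

def combined_psnr(plane_psnrs, plane_weights=None):
    """Combine per-plane PSNR values (dB) by averaging their MSEs.

    Each PSNR is converted to a normalized MSE (MSE / peak^2), the MSEs are
    averaged with weights proportional to plane size, and the result is
    converted back to dB. For 4:4:4 all planes are the same size.
    """
    if plane_weights is None:
        plane_weights = [1.0] * len(plane_psnrs)
    total = sum(plane_weights)
    mean_mse = sum(w * 10 ** (-p / 10)
                   for p, w in zip(plane_psnrs, plane_weights)) / total
    return -10 * math.log10(mean_mse)

# Per-plane figures from the crf 12 result line above
print(round(combined_psnr([52.401247, 50.779612, 55.001162]), 3))
```

    Under that assumption the result should land very close to the reported average of 52.398049, which is noticeably different from a plain mean of the three dB values.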

    You could probably tune it using --tune psnr, and adjust the ip/pb ratios lower if you wanted higher min PSNR

    svt-av1 doesn't support 12bit444, but aomenc should do slightly better than x265
  3. The goal was to see how much bit rate it would take to achieve at least 45dB PSNR with x264, x265 and svt-av1.
    I agree with poisondeathray. 'You could probably use better settings to achieve 45dB min.'
    Since you are aiming for PSNR at a restricted limit, why not use 2pass and slower presets?
    Why not use 10bit x264?

    Did a quick encode with x265, using:
    Code:
    ffmpeg -y -loglevel fatal -noautorotate -nostdin -threads 8 -i "G:\tos_dcp_test_04\tos_video.mxf" -map 0:0 -an -sn -vf  scale=in_range=pc:out_range=pc -sws_flags accurate_rnd+full_chroma_inp -pix_fmt yuv444p12le -strict -1 -vsync 0 -f yuv4mpegpipe - | x265 --input - --output-depth 12 --y4m --profile main444-12 --limit-modes --no-open-gop --opt-ref-list-length-pps --lookahead-slices 0 --pass 1 --no-slow-firstpass --bitrate 80529 --opt-qp-pps --cbqpoffs -2 --crqpoffs -2 --limit-refs 0 --ssim-rd --psy-rd 2.50 --rdoq-level 2 --psy-rdoq 10.00 --aq-mode 4 --sbrc --no-cutree --deblock=-1:-1 --limit-sao --no-repeat-headers --psnr --range full --colormatrix bt2020c --stats "J:\tmp\tos_video_generated_2023-01-15@08_05_23_5810_01.stats" --multi-pass-opt-analysis --multi-pass-opt-distortion --analysis-reuse-file "J:\tmp\tos_video_generated_2023-01-15@08_05_23_5810_01.analysis" --output NUL
    
    ffmpeg -y -loglevel fatal -noautorotate -nostdin -threads 8 -i "G:\tos_dcp_test_04\tos_video.mxf" -map 0:0 -an -sn -vf  scale=in_range=pc:out_range=pc -sws_flags accurate_rnd+full_chroma_inp -pix_fmt yuv444p12le -strict -1 -vsync 0 -f yuv4mpegpipe - | x265 --input - --output-depth 12 --y4m --profile main444-12 --limit-modes --no-early-skip --no-open-gop --opt-ref-list-length-pps --lookahead-slices 0 --pass 2 --bitrate 80529 --opt-qp-pps --cbqpoffs -2 --crqpoffs -2 --limit-refs 0 --ssim-rd --psy-rd 2.50 --rdoq-level 2 --psy-rdoq 10.00 --aq-mode 4 --sbrc --no-cutree --deblock=-1:-1 --limit-sao --no-repeat-headers --psnr --range full --colormatrix bt2020c --stats "J:\tmp\tos_video_generated_2023-01-15@08_05_23_5810_01.stats" --multi-pass-opt-analysis --multi-pass-opt-distortion --analysis-reuse-file "J:\tmp\tos_video_generated_2023-01-15@08_05_23_5810_01.analysis" --output "J:\tmp\2023-01-15@08_05_23_5810_02.265"
    I get:
    Code:
    y4m  [info]: 4096x1716 fps 24/1 i444p12 sar 1:1 unknown frame count
    raw  [info]: output file: J:\tmp\2023-01-15@08_05_23_5810_02.265
    x265 [info]: HEVC encoder version 3.5+83-555752223
    x265 [info]: build info [Windows][GCC 12.2.0][64 bit] 12bit
    x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
    x265 [warning]: --psnr used with psy on: results will be invalid!
    x265 [warning]: --tune psnr should be used if attempting to benchmark psnr!
    x265 [info]: Main 4:4:4 12 profile, Level-5 (High tier)
    x265 [info]: Thread pool created using 32 threads
    x265 [info]: Slices                              : 1
    x265 [info]: frame threads / pool features       : 5 / wpp(27 rows)
    x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
    x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
    x265 [info]: ME / range / subpel / merge         : hex / 57 / 2 / 3
    x265 [info]: Keyframe min / max / scenecut / bias  : 24 / 250 / 40 / 5.00
    x265 [info]: Cb/Cr QP Offset                     : -2 / -2
    x265 [info]: Lookahead / bframes / badapt        : 20 / 4 / 2
    x265 [info]: b-pyramid / weightp / weightb       : 1 / 1 / 0
    x265 [info]: References / ref-limit  cu / depth  : 3 / off / off
    x265 [info]: AQ: mode / str / qg-size / cu-tree  : 4 / 1.0 / 32 / 0
    x265 [info]: Rate Control / qCompress            : ABR-80529 kbps / 0.60
    x265 [info]: tools: limit-modes rd=3 ssim-rd psy-rd=2.50 rdoq=2 psy-rdoq=10.00
    x265 [info]: tools: rskip mode=1 signhide tmvp b-intra strong-intra-smoothing
    x265 [info]: tools: deblock(tC=-1:B=-1) sao stats-read
    x265 [info]: frame I:      8, Avg QP:13.62  kb/s: 161204.33  PSNR Mean: Y:53.617 U:51.702 V:53.932
    x265 [info]: frame P:    113, Avg QP:15.94  kb/s: 93141.56  PSNR Mean: Y:51.575 U:50.705 V:53.125
    x265 [info]: frame B:    379, Avg QP:18.47  kb/s: 71904.40  PSNR Mean: Y:49.690 U:48.715 V:51.575
    x265 [info]: Weighted P-Frames: Y:6.2% UV:1.8%
    encoded 500 frames in 121.29s (4.12 fps), 78132.80 kb/s, Avg QP:17.82, Global PSNR: 50.281
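    As a sanity check on the log above, the overall bitrate can be reproduced from the per-frame-type lines. A small sketch, assuming the per-type kb/s values are averages over the frames of that type, so the overall figure is a frame-count-weighted mean; the function name is illustrative:

```python
def overall_kbps(frame_stats):
    """Frame-count-weighted average of per-frame-type bitrates (kb/s)."""
    total_frames = sum(count for count, _ in frame_stats)
    return sum(count * kbps for count, kbps in frame_stats) / total_frames

# (frame count, kb/s) for I, P and B frames, copied from the x265 log above
x265_stats = [(8, 161204.33), (113, 93141.56), (379, 71904.40)]
print(round(overall_kbps(x265_stats), 2))
```

    The result should agree with the 78132.80 kb/s in the final summary line to within rounding.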
    Using x264 and:
    Code:
    ffmpeg -y -loglevel fatal -noautorotate -nostdin -threads 8 -i "G:\tos_dcp_test_04\tos_video.mxf" -map 0:0 -an -sn -vf  scale=in_range=pc:out_range=pc -sws_flags accurate_rnd+full_chroma_inp -pix_fmt yuv444p10le -strict -1 -vsync 0 -f rawvideo - | x264 --preset veryfast --pass 1 --bitrate 80529 --profile high444 --level 5.2 --direct auto --b-adapt 0 --sync-lookahead 48 --qcomp 0.50 --rc-lookahead 40 --qpmax 51 --aq-mode 0 --sar 1:1 --non-deterministic --range pc --stats "J:\tmp\tos_video_2023-01-15@08_12_11_6310_01.stats" --demuxer raw --input-res 4096x1716 --input-csp i444 --input-range pc --input-depth 10 --fps 24/1 --output-csp i444 --output-depth 10 --output NUL -
    
    ffmpeg -y -loglevel fatal -noautorotate -nostdin -threads 8 -i "G:\tos_dcp_test_04\tos_video.mxf" -map 0:0 -an -sn -vf  scale=in_range=pc:out_range=pc -sws_flags accurate_rnd+full_chroma_inp -pix_fmt yuv444p10le -strict -1 -vsync 0 -f rawvideo - | x264 --preset fast --pass 2 --bitrate 80529 --profile high444 --level 5.2 --direct auto --b-adapt 0 --sync-lookahead 48 --qcomp 0.50 --rc-lookahead 40 --qpmax 51 --partitions i4x4,p8x8,b8x8 --no-fast-pskip --subme 5 --trellis 0 --aq-mode 0 --vbv-maxrate 240000 --vbv-bufsize 720000 --sar 1:1 --non-deterministic --range pc --colormatrix bt2020c --stats "J:\tmp\tos_video_2023-01-15@08_12_11_6310_01.stats" --demuxer raw --input-res 4096x1716 --input-csp i444 --input-range pc --input-depth 10 --fps 24/1 --output-csp i444 --output-depth 10 --output "J:\tmp\2023-01-15@08_12_11_6310_02.264" -
    I get:
    Code:
    raw [info]: 4096x1716p 1:1 @ 24/1 fps (cfr)
    x264 [warning]: --psnr used with psy on: results will be invalid!
    x264 [warning]: --tune psnr should be used if attempting to benchmark psnr!
    x264 [info]: using SAR=1/1
    x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
    x264 [info]: profile High 4:4:4 Predictive, level 5.2, 4:4:4, 10-bit
    x264 [info]: frame I:11    Avg QP:22.68  size:1154335  PSNR Mean Y:54.43 U:53.19 V:55.69 Avg:54.13 Global:53.70
    x264 [info]: frame P:142   Avg QP:26.59  size:509731  PSNR Mean Y:51.06 U:50.56 V:53.44 Avg:51.24 Global:49.91
    x264 [info]: frame B:347   Avg QP:28.16  size:330732  PSNR Mean Y:50.75 U:50.11 V:51.94 Avg:50.64 Global:49.38
    x264 [info]: consecutive B-frames:  7.0%  1.2%  0.6% 91.2%
    x264 [info]: mb I  I16..4:  9.0% 36.8% 54.2%
    x264 [info]: mb P  I16..4: 22.0%  0.0% 21.0%  P16..4: 14.9% 15.7% 16.0%  0.0%  0.0%    skip:10.5%
    x264 [info]: mb B  I16..4:  8.7%  0.0%  8.3%  B16..8: 21.6% 18.6%  4.2%  direct:15.2%  skip:23.4%  L0:36.8% L1:32.7% BI:30.4%
    x264 [info]: 8x8 transform intra:3.1% inter:46.0%
    x264 [info]: direct mvs  spatial:99.7% temporal:0.3%
    x264 [info]: coded y,u,v intra: 71.4% 38.4% 24.1% inter: 42.4% 16.2% 7.0%
    x264 [info]: i16 v,h,dc,p: 16% 18% 24% 42%
    x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 16% 21% 14%  4%  9%  8%  8%  8% 12%
    x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 16% 18% 19%  4% 12%  8%  8%  7%  9%
    x264 [info]: Weighted P-Frames: Y:4.2% UV:0.7%
    x264 [info]: ref P L0: 57.0% 43.0%
    x264 [info]: ref B L0: 78.7% 21.3%
    x264 [info]: ref B L1: 91.5%  8.5%
    x264 [info]: PSNR Mean Y:50.920 U:50.308 V:52.447 Avg:50.888 Global:49.591 kb/s:76739.84
    encoded 500 frames, 11.89 fps, 76739.84 kb/s
    2023-01-15@08_14_19_0510_02_video finished after 00:00:42.333
    Using x264 medium preset with tune psnr:
    Code:
    ffmpeg -y -loglevel fatal -noautorotate -nostdin -threads 8 -i "G:\tos_dcp_test_04\tos_video.mxf" -map 0:0 -an -sn -vf  scale=in_range=pc:out_range=pc -sws_flags accurate_rnd+full_chroma_inp -pix_fmt yuv444p10le -strict -1 -vsync 0 -f rawvideo - | x264 --preset veryfast --pass 1 --bitrate 80529 --profile high444 --level 5.2 --sync-lookahead 48 --rc-lookahead 40 --weightp 2 --sar 1:1 --non-deterministic --range pc --stats "J:\tmp\tos_video_2023-01-15@08_17_48_6410_01.stats" --demuxer raw --input-res 4096x1716 --input-csp i444 --input-range pc --input-depth 10 --fps 24/1 --output-csp i444 --output-depth 10 --output NUL -
    
    ffmpeg -y -loglevel fatal -noautorotate -nostdin -threads 8 -i "G:\tos_dcp_test_04\tos_video.mxf" -map 0:0 -an -sn -vf  scale=in_range=pc:out_range=pc -sws_flags accurate_rnd+full_chroma_inp -pix_fmt yuv444p10le -strict -1 -vsync 0 -f rawvideo - | x264 --tune psnr --pass 2 --bitrate 80529 --profile high444 --level 5.2 --sync-lookahead 48 --no-mbtree --vbv-maxrate 240000 --vbv-bufsize 720000 --sar 1:1 --psnr --non-deterministic --range pc --colormatrix bt2020c --stats "J:\tmp\tos_video_2023-01-15@08_20_50_7110_01.stats" --demuxer raw --input-res 4096x1716 --input-csp i444 --input-range pc --input-depth 10 --fps 24/1 --output-csp i444 --output-depth 10 --output "J:\tmp\2023-01-15@08_20_50_7110_02.264" -
    I got:
    Code:
    raw [info]: 4096x1716p 1:1 @ 24/1 fps (cfr)
    x264 [info]: using SAR=1/1
    x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
    x264 [info]: profile High 4:4:4 Predictive, level 5.2, 4:4:4, 10-bit
    x264 [info]: frame I:11    Avg QP:25.73  size:963908  PSNR Mean Y:52.71 U:55.71 V:57.66 Avg:54.70 Global:54.14
    x264 [info]: frame P:128   Avg QP:27.97  size:515082  PSNR Mean Y:50.89 U:54.37 V:56.52 Avg:52.91 Global:51.80
    x264 [info]: frame B:361   Avg QP:29.95  size:326274  PSNR Mean Y:49.24 U:52.61 V:54.87 Avg:51.25 Global:49.87
    x264 [info]: consecutive B-frames:  2.8%  1.6%  3.6% 92.0%
    x264 [info]: mb I  I16..4: 20.6% 76.6%  2.8%
    x264 [info]: mb P  I16..4: 17.2% 37.0%  1.7%  P16..4: 15.6%  9.2%  3.8%  0.0%  0.0%    skip:15.4%
    x264 [info]: mb B  I16..4:  5.1%  9.7%  0.7%  B16..8: 18.6%  7.9%  1.5%  direct:20.7%  skip:35.9%  L0:34.2% L1:29.0% BI:36.8%
    x264 [info]: 8x8 transform intra:65.5% inter:86.9%
    x264 [info]: coded y,u,v intra: 74.7% 59.0% 41.8% inter: 39.4% 36.0% 30.9%
    x264 [info]: i16 v,h,dc,p:  9% 16% 25% 50%
    x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 15% 16%  6%  9%  9%  9%  9% 11%
    x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 12% 23% 14%  6% 12%  9%  9%  7%  8%
    x264 [info]: Weighted P-Frames: Y:1.6% UV:0.8%
    x264 [info]: ref P L0: 49.3% 35.2% 15.4%  0.1%
    x264 [info]: ref B L0: 78.7% 17.1%  4.3%
    x264 [info]: ref B L1: 92.9%  7.1%
    x264 [info]: PSNR Mean Y:49.734 U:53.128 V:55.351 Avg:51.749 Global:50.354 kb/s:74618.31
    encoded 500 frames, 11.21 fps, 74618.31 kb/s
    Bottom line is that none of these encodes achieved a PSNR of at least 45dB in any meaningful way.
    That seems to be an issue with your settings.

    => With slower presets, you can easily achieve your 45dB PSNR goal.

    Cu
    Selur
    users currently on my ignore list: deadrats, Stears555
  4. Originally Posted by sophisticles View Post

    If I simply posted these numbers, or graphed them in a chart, many would marvel at the PSNR x265 and svt-av1 achieved, but if you watched the files you would not conclude that either of these is twice as good as x264, thereby further validating the claim that PSNR is a poor predictor of quality.

    But as I alluded to earlier, the devil is in the details
    Indeed. Especially as a double PSNR value does not mean "twice as good" anyway, because decibels are a logarithmic measure. A 3dB step is a factor of 2 in power, so in linear terms:

    x264: 30.1667dB = linear ratio of 1'039 => reference
    x265: 69.6479dB = linear ratio of 9'221'229 => 8'874 times better than x264?
    svt-av1: 71.2526dB = linear ratio of 13'343'322 => 12'841 times better than x264, and 1.447 times better than x265?
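    The linear figures above follow from the standard power-ratio conversion, ratio = 10^(dB/10). A short sketch that recomputes them; the function name is illustrative:

```python
def db_to_power_ratio(db):
    """Convert decibels to a linear power ratio."""
    return 10 ** (db / 10)

reference = db_to_power_ratio(30.1667)  # x264 maximum
for name, db in [("x265", 69.6479), ("svt-av1", 71.2526)]:
    ratio = db_to_power_ratio(db)
    print(f"{name}: {ratio:,.0f} linear, {ratio / reference:,.0f}x the x264 ratio")
```

    With this conversion a 3 dB step comes out as a factor of almost exactly 2, and a 10 dB step as a factor of 10.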

    What is double quality? What is double loudness? What is double pain?
    The main issue was testing error. If you adjust the levels before you encode, of course it's going to measure differently. It's important to look at the min values, but if you have errors in the testing methodology => you're going to arrive at the wrong conclusions.

    If you treat the mxf DCP as YUV, it will not quite look like the promo pictures, because XYZ is not YUV. For a DCP XYZ source, what you should be doing is converting XYZ to RGB linear (usually with a LUT, at 16bit or float), then RGB to YUV (using some standardized matrices like 709 or 2020, +/- adjusting primaries), then test that as the YUV source. What you should not be doing is converting the source differently before each encoder, resulting in different colors and levels. DCP is not a common end user format, and is often mishandled.

    PSNR just measures the values, so if you decode the mxf as YUV and encode it as YUV, PSNR will measure the signal to noise ratio of the encode vs. the "source" - so it's valid in that sense, even if it looks slightly different than the final graded promo pictures.

    For the range expanded "ToS x265 veryfast tune psnr 12bit 444.mp4" , the correct values are
    PSNR y:30.913554 u:42.733957 v:42.240891 average:35.117907 min:28.849992 max:36.465649
    Not sure how the max value reported was "max. val 69.64788818". It's a red flag, and doesn't make sense. The levels are completely different from the source (because of the range expansion), so there is no way a single frame can have 69 dB. E.g. the "black" frame has Y=0 in the range expanded x265 version, but Y=256 in the raw mxf source.
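    To put a number on that black frame: if every luma sample is off by the same amount, the MSE is simply the offset squared. A small sketch for the luma plane only, assuming the peak is the maximum 12-bit code value (4095); the function name is illustrative:

```python
import math

def uniform_offset_psnr(offset, bit_depth):
    """PSNR in dB of a plane where every sample is off by `offset` codes."""
    if offset == 0:
        return math.inf
    peak = (1 << bit_depth) - 1
    return 20 * math.log10(peak / abs(offset))

# 12-bit luma: Y=256 in the source versus Y=0 after range expansion
print(round(uniform_offset_psnr(256, 12), 2))  # 24.08
```

    A luma plane that scores around 24 dB on its own cannot be part of a frame scoring anywhere near 69 dB.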

    For "ToS svt-av1 10-bit.mkv", the correct values are
    PSNR y:49.625426 u:50.240122 v:54.620023 average:50.259750 min:41.360727 max:inf
    Notice the max:inf . It gets the solid black frame correct at YUV 64,512,512 for 10bit , when converted to 12bit it's the correct 256,2048,2048 . "Difficult to get a black frame wrong"...
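    The 10-bit to 12-bit correspondence is just a left shift by two bits. A minimal sketch, assuming plain bit-shift scaling with no dithering or rounding tricks; the function name is illustrative:

```python
def rescale_code(value, from_bits, to_bits):
    """Scale a code value to a higher bit depth by a left shift."""
    if to_bits < from_bits:
        raise ValueError("this sketch only scales up")
    return value << (to_bits - from_bits)

black_10bit = (64, 512, 512)
print(tuple(rescale_code(v, 10, 12) for v in black_10bit))  # (256, 2048, 2048)
```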

    But the correct non range adjusted x265 encode max psnr is 85.736507 . I would expect infinity, or some very high number like 99+ . But examining the black frame at the end, it's not pure 256,2048,2048 - there are a few pixels in the top right corner that are a bit off (e.g. 256, 2049,2049), but 99% of the frame is correct. "Difficult to get a black frame wrong", but x265 manages it here...
    PSNR is Peak - even a single error may significantly impact the result. Also, 45dB means you need more than 8 bit, as 45dB is approximately 7.5 bit. PSNR is OK only when you know what you are doing and only if you can control everything.
    Last edited by pandy; 17th Jan 2023 at 13:51.
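    The "45dB is approximately 7.5 bit" figure can be reproduced with the usual rule of thumb of about 6.02 dB per bit (each extra bit doubles the amplitude range). A rough sketch under that assumption only; it ignores the constant offset that a full quantization-noise model would add, and the names are illustrative:

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB for each doubling of amplitude

def db_to_bits(db):
    """Rough bit equivalent of a dB figure at ~6.02 dB per bit."""
    return db / DB_PER_BIT

print(round(db_to_bits(45), 2))  # 7.47
```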
    In my defense, I used standard off-the-shelf encoding front-ends to do these tests, such as Selur's Hybrid and Handbrake, and I tried it with both Win 10 and Ubuntu, and I used MSU's video quality assessment tool for PSNR calculations.

    Reading through the comments above tells me that none of these tools is able to correctly deal with this source.

    @Selur, is it possible to configure Hybrid to properly handle this source?

    Based on your and PDR's comments it sounds like filtering needs to be done prior to feeding this source into an encoder, but since I use both Win10 and Linux I would prefer a cross platform solution, something that doesn't rely on a Windows specific filtering chain, like using avisynth or considering WINE to work with avisynth.
    Last edited by sophisticles; 15th Jan 2023 at 19:40.
    Another way you could do it is take the graded 16bit sRGB tiffs, convert to some other format such as YUV444P12 (or more standardized for end user would be YUV420P10), then use that as the "source." You could use ffv1 for example. If you use standard range Rec709 for the 16bit RGB to YUV conversion it will be easily handled by all software and will look just like the promo images. The tiffs are 4096x1714, so they have 2 vertical pixels less than the DCP version. Another difference is the quality is higher in the tiffs. If you zoom in on the DCP version, you can actually see macroblocking. The tiffs retain more of the original signal including the VFX grain that was added in post.
    https://media.xiph.org/tearsofsteel/tearsofsteel-4k-tiff/
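    For reference, the standard range Rec709 conversion mentioned above looks roughly like this for a single pixel. It is a sketch only: it assumes 16-bit gamma-encoded R'G'B' input and 10-bit output, does no chroma subsampling or dithering, and the function name is illustrative rather than taken from any tool:

```python
KR, KG, KB = 0.2126, 0.7152, 0.0722  # BT.709 luma coefficients

def rgb16_to_yuv10_limited(r, g, b):
    """Convert one 16-bit R'G'B' pixel to 10-bit limited-range BT.709 Y'CbCr."""
    rn, gn, bn = (c / 65535 for c in (r, g, b))
    y = KR * rn + KG * gn + KB * bn
    cb = (bn - y) / (2 * (1 - KB))
    cr = (rn - y) / (2 * (1 - KR))
    # Limited range at 10 bits: luma 64..940, chroma 64..960 centred on 512
    return (round(64 + 876 * y), round(512 + 896 * cb), round(512 + 896 * cr))

print(rgb16_to_yuv10_limited(0, 0, 0))              # (64, 512, 512)
print(rgb16_to_yuv10_limited(65535, 65535, 65535))  # (940, 512, 512)
```

    In practice a tool such as ffmpeg or a VapourSynth resizer would do this for whole frames; the point here is only which numbers "standard range Rec709" implies.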
  9. Originally Posted by poisondeathray View Post
    Another way you could do it is take the graded 16bit sRGB tiffs, convert to some other format such as YUV444P12 (or more standardized for end user would be YUV420P10), then use that as the "source." You could use ffv1 for example. If you use standard range Rec709 for the 16bit RGB to YUV conversion it will be easily handled by all software and will look just like the promo images. The tiffs are 4096x1714, so they have 2 vertical pixels less than the DCP version. Another difference is the quality is higher in the tiffs. If you zoom in on the DCP version, you can actually see macroblocking. The tiffs retain more of the original signal including the VFX grain that was added in post.
    https://media.xiph.org/tearsofsteel/tearsofsteel-4k-tiff/
    LOL, hysterical.

    https://mango.blender.org/

    First off we need to generate an image sequence of the whole short film, using an appropriate image format, such as TIFF 16bit sRGB. In our case we are talking about 17616 frames, taking 850GB of space. The 4K DCI compliant resolution for our aspect ratio is 4096x1716px.
  10. @Selur, is it possible to configure Hybrid to properly handle this source?
    No clue, never looked into handling XYZ color space.
    Atm. Hybrid will tell LWLibavSource to output YUV 4:4:4 12bit so the XYZ->YUV conversion would be done by LWLibavSource .
    My guess is that using 'Filtering->Vapoursynth->Matrix->TimeCube' with 'DCIYZX_to_YUVBT709' or 'DCIYZX_to_BT2020_PQ' might be the right way, but I don't really know.
    (FFMS2 will also convert to YUV444P12)
    -> If someone can share a way to properly handle XYZ in Vapoursynth I can look into adding support for it in Hybrid.

    Like poisondeathray mentioned, since the psnr calculation also would need to be adjusted to this, it shouldn't really matter for the PSNR calculation, only for the actual color representation. (Atm. both the encoding and the psnr calculation would take slightly wrong colors as a basis, but they use the same basis.)


    Cu Selur
  11. Yes it doesn't really matter for an encoding test as long as you are consistent in the handling .

    For the DCI XYZ conversion - I described it above - it's usually done through LUTs. eg. You can do it through Davinci Resolve - DCI to Linear , Linear to sRGB. And from sRGB you can control the YUV conversion. It should be possible in vapoursynth/avisynth too with cube or dgcube using luts

    The DCP version looks denoised (not just jpeg2000 compression, there is evidence of additional processing) and graded slightly differently than the 16bit sRGB tiff version - the hues are slightly shifted, skin tones slightly different - again , it doesn't really matter for encoding tests - just pointing it out for interest

    If I were using the DCP version, but treating it directly as YUV, I would expand the range so it's "normal range", because the content frames are essentially range compressed / low contrast, so they look "washed out" with an elevated black level and depressed white levels. The black frames are actually sub video black. So instead of being "legal range" black at any bit depth (Y=16 at 8bit, Y=64 at 10bit, or Y=256 at 12bit) they will become "zero" if you apply the range expansion. Another reason you would want to do this, instead of treating it as a low contrast YUV, is that it looks more normal, and a larger range of values (instead of a narrow band) makes it more difficult to compress. I.e. you would expect lower psnr levels at a given bitrate/encoder/settings for a normal contrast version vs. a low contrast version.
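    A limited-to-full range expansion for luma can be sketched as a linear stretch from the legal black/white levels to the full code range. This assumes the common 16..235 levels scaled to the bit depth and simple rounding with clipping; real tools may differ slightly in rounding, and the function name is illustrative:

```python
def expand_luma_to_full(y, bit_depth=12):
    """Map limited-range luma to full range, clipping out-of-range values."""
    scale = 1 << (bit_depth - 8)
    black, white = 16 * scale, 235 * scale
    peak = (1 << bit_depth) - 1
    full = round((y - black) * peak / (white - black))
    return max(0, min(peak, full))

print(expand_luma_to_full(256))   # 0
print(expand_luma_to_full(3760))  # 4095
```

    The stretch factor is a little over 1.16, so any difference between source and encode grows by the same factor, which is one reason a range-expanded encode cannot be compared against an unexpanded source.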
  12. For the DCI XYZ conversion - I described it above - it's usually done through LUTs. eg. You can do it through Davinci Resolve - DCI to Linear , Linear to sRGB. And from sRGB you can control the YUV conversion. It should be possible in vapoursynth/avisynth too with cube or dgcube using luts
    Okay, so the 'Filtering->Vapoursynth->Matrix->TimeCube' approach is probably correct.