VideoHelp Forum
  1. Hi !
    Need some help trying to understand what's going on with this sweet test o' mine.

    I've downloaded this 4K HDR10 test video: https://4kmedia.org/lg-new-york-hdr-uhd-4k-demo/
    With MKVToolnix I discarded the audio and put the video stream in an MKV container.
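    For reference, the equivalent remux from the command line (a sketch; the downloaded filename is a guess -- adjust it to whatever the site actually serves):
    Code:
    # input name is an assumption -- use whatever the downloaded file is called
    mkvmerge -o "HDR10.mkv" --no-audio "LG New York HDR UHD 4K Demo.ts"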

    First, I used this command to convert it to FHD SDR:
    Code:
    ffmpeg -hide_banner -loglevel error -stats -r 25 -i "HDR10.mkv" -vf setsar=sar=1,zscale=size=1920x1080:filter=lanczos:primariesin=bt2020:transferin=smpte2084:matrixin=bt2020nc:rangein=tv:transfer=linear:npl=100,format=gbrpf32le,zscale=primaries=bt709,tonemap=tonemap=hable:desat=0,zscale=transfer=bt709:matrix=bt709:range=tv,format=yuv420p10le -map 0:0 -c:v:0 libx265 -x265-params log-level=error:sar=1:fps=25:colorprim=bt709:transfer=bt709:colormatrix=bt709:range=limited:crf=18 -map_metadata -1 -map_chapters -1 -max_muxing_queue_size 1024 "HDR10_to_SDR.mkv"
    Then I used this command to convert it to FHD HDR10:
    Code:
    ffmpeg -hide_banner -loglevel error -stats -r 25 -i "HDR10.mkv" -vf setsar=sar=1,zscale=size=1920x1080:filter=lanczos:primariesin=bt2020:transferin=smpte2084:matrixin=bt2020nc:rangein=tv:primaries=bt2020:transfer=smpte2084:matrix=bt2020nc:range=tv,format=yuv420p10le -map 0:0 -c:v:0 libx265 -x265-params log-level=error:sar=1:fps=25:hdr-opt=1:repeat-headers=1:colorprim=bt2020:transfer=smpte2084:colormatrix=bt2020nc:range=limited:master-display=G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(12000000,500):max-cll=0,0:crf=18 -map_metadata -1 -map_chapters -1 -max_muxing_queue_size 1024 "HDR10_to_HDR10.mkv"
    The resulting SDR video weighs 128677 KB
    The resulting HDR video weighs 80507 KB

    I assumed an HDR video would weigh more, but this test shows these results... and I don't understand why.

    Is this an expected outcome, or did something go wrong?
    Is there something in my commands that caused it?
  2.
    The output uses CRF lossy encoding, so the size is unpredictable.
  3. Originally Posted by davexnet View Post
    The output uses CRF lossy encoding, so the size is unpredictable.
    Thanks for answering!

    Size is relatively predictable. An 8K movie at CRF 18 is gonna weigh more than the same movie at 640x360, CRF 18.

    I believed the same would be true of HDR10 vs SDR (more bits of data in HDR), but the tests I've made show this is not the case.
    As I am far from an expert in this matter, I posted here asking for help in understanding why this is happening, or what I am doing wrong to get these results.

    CRF has a part in this, I'm sure, but there has to be something more I am not understanding... and that's why I'm asking.
  4.
    There may be something else in your very elaborate command line limiting the bitrate.
  5. Originally Posted by davexnet View Post
    There may be something else in your very elaborate command line limiting the bitrate.
    Maybe... but I don't know what.
    Both share the same profile, level, and tier -- Main 10 / L4 / Main
    • Bitrate for the HDR version is about 9 Mb/s
    • Bitrate for the SDR version is about 14 Mb/s
    Obviously, that's where the size difference resides, but AFAIK I am not limiting the bitrate in any way apart from what the level/tier impose on both (see the ffprobe sketch below).
    Color differences aside, image detail and artifacts are mostly at the same level to my eyes.
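    For anyone who wants to reproduce these readings, a sketch with ffprobe (using the output names from the first post):
    Code:
    # prints container size and bitrate, plus the video stream's profile/level and color tags
    ffprobe -v error -select_streams v:0 -show_entries format=size,bit_rate:stream=profile,level,color_primaries,color_transfer,color_space -of default=noprint_wrappers=1 "HDR10_to_SDR.mkv"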

    I've tried other videos with different content, increased length, etc., and the size difference is proportionally higher for the SDR version each and every time.

    So I'm still at a loss.


    PS: a zscale parameter, "nominal peak luminance" or npl, does affect the resulting size. At npl=500, white levels are so low that lots of detail seems lost in the shadows, so the result is more compressible than at npl=100. At npl=1, white levels are so high that colors are oversaturated all over the place, and it reveals lots of noise in the shadows that makes it much less compressible than npl=100. My tests were done at npl=100; a loop to reproduce the comparison is sketched below.
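    In case anyone wants to repeat that npl comparison, a sketch of the loop (same filter chain as the SDR command in the first post; only npl varies):
    Code:
    # same SDR conversion as post 1, sweeping npl across three values
    for npl in 1 100 500; do
      ffmpeg -hide_banner -loglevel error -stats -r 25 -i "HDR10.mkv" -vf setsar=sar=1,zscale=size=1920x1080:filter=lanczos:primariesin=bt2020:transferin=smpte2084:matrixin=bt2020nc:rangein=tv:transfer=linear:npl=$npl,format=gbrpf32le,zscale=primaries=bt709,tonemap=tonemap=hable:desat=0,zscale=transfer=bt709:matrix=bt709:range=tv,format=yuv420p10le -c:v libx265 -x265-params log-level=error:crf=18 "HDR10_to_SDR_npl${npl}.mkv"
    done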

    Anyway, here lies the root of my "Gordian knot": if SDR implies less color data... how can SDR weigh more with less data?
    My hypothesis so far is that the SDR conversion creates... let's call it extra "imaginary" data... when transforming a bigger colorspace into a smaller one, and that this extra data -- something akin to compressing a grainy vs. a non-grainy video -- results in a video weighing more with the same effective detail. Does that make sense?
  6. It's the expected result -

    Compression-wise, an encoder doesn't care whether something is "HDR". It just cares about the Y'CbCr data distribution per frame (spatial) and across frames (temporal).

    Raw HDR Y'CbCr image data will generally have low contrast. When it's not tone mapped, or you're not viewing it on an HDR display, it will look washed out, low contrast, low saturation. There is your answer.

    If you look at the Y' waveform or the CbCr waveforms, the data is generally range compressed for an HDR image compared to its tonemapped SDR version.

    To look at it another way: if you take an SDR video to begin with and increase the contrast, the bitrate goes up for a given QP or CRF. Same if you increase saturation, or local or global contrast. Those changes "consume" more bitrate.
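    An easy way to verify that last point yourself (a sketch -- "SDR.mkv" is a placeholder for any SDR clip): encode the same source twice at the same CRF, once untouched and once with contrast and saturation boosted via ffmpeg's eq filter, then compare the file sizes. The boosted encode should typically come out larger.
    Code:
    # "SDR.mkv" is a placeholder input; both encodes use the same CRF
    ffmpeg -i "SDR.mkv" -c:v libx265 -crf 18 "SDR_plain.mkv"
    ffmpeg -i "SDR.mkv" -vf eq=contrast=1.3:saturation=1.3 -c:v libx265 -crf 18 "SDR_boosted.mkv"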
    Originally Posted by poisondeathray View Post
    It's the expected result - ...

    Thanks for taking the time to explain the root cause.

    Now I need to invest some time to better understand the maths involved.
    I should try to find a waveform monitor and/or vectorscope to help me visualize what you've explained -- once I've learned how to use them. Could you recommend some free tool for this HDR vs SDR purpose?
    Originally Posted by bokeron2020 View Post
    ...Could you recommend some free tool for this HDR vs SDR purpose?

    avisynth, ffmpeg


    I use avisynth because of AvsPmod (it's like a script editor with preview). You can compare versions in tabs and use the number keys to hot swap. The timeline is aligned and shared, so if you navigate to frame 76, all tabs and versions stay at that frame, making it easy to compare with the number keys.

    The Histogram() function is misnamed - it's actually a Y' waveform.
    http://avisynth.nl/index.php/Histogram

    To examine the Y plane, use Histogram(bits=10), since you're working with 10bit data. 10bit data goes from 0 to 1023 code values.

    To examine the Cb or Cr planes (also called U, V), you can use ExtractU (or UToY) and ExtractV (or VToY) before calling the Histogram function. You can also use ConvertToY() or Greyscale() to see the Y plane only, but by default Histogram() analyzes the Y plane anyway; you'd do that only to "see" the Y plane and correlate it with the waveform (you're discarding the color information in the preview).



    In ffmpeg you can use -vf waveform, with many options. It's also 8, 10, and 12 bit compatible. You can use it in ffplay or mpv, for example.

    https://ffmpeg.org/ffmpeg-filters.html#waveform
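    For example, a minimal sketch (the filename is a placeholder): the first command shows the live Y' waveform in ffplay; the second extracts the U plane first, mirroring the ExtractU approach above, so the waveform plots chroma instead.
    Code:
    # waveform analyzes luma by default; extractplanes=u feeds it the U plane instead
    ffplay -vf "waveform=g=green:o=0.5" "HDR10.mkv"
    ffplay -vf "extractplanes=u,waveform=g=green:o=0.5" "HDR10.mkv"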


    Or a free/opensource NLE like Shotcut might be easier for most people, though it's 8bit only (10bit values are converted to 8bit); the trend should still be obvious.
  9. @poisondeathray

    Thank you for all the info you've shared.
    I've read a lot about this already, but I'm still scratching the surface of the more technical stuff, theory and practice... and that's where I have to go. I need to understand why I do what I'm doing and where it all comes from!

    REGARDS!
    Originally Posted by bokeron2020 View Post
    ...I need to understand why I do what I'm doing and where it all comes from!

    What do you need help with, or what doesn't make sense?


    The "disconnect" you had originally was that you were looking at an RGB converted representation, not the underlying YCbCr data that the encoder actually uses to store bits:

    Originally Posted by bokeron2020 View Post
    "Color differences aside, image detail or artifacts are mostly at the same level to my eyes"
    One explanation: HDR video potentially has more (dark) areas where x265 will use higher compression.
    Originally Posted by poisondeathray View Post
    What do you need help with, or what doesn't make sense?
    ...
    The "disconnect" you had originally was that you were looking at an RGB-converted representation, not the underlying YCbCr data that the encoder actually uses to store bits:
    I don't know what I need help with yet.
    Things are starting to make sense now, as your answer made me realize I was looking at this from the wrong point of view (I was "RGB minded", as you point out), so I need to go back to the start and figure it out again from there. Then the questions will start appearing. I'll need/want to understand the theoretical basis while I deal with the practical side of it.
    Originally Posted by bokeron2020 View Post
    I don't know what I need help with yet. ...

    "HDR" is really metadata. Those mastering display, min, max etc... instruct the display... how to "display" it. But displays generally only work in RGB - so you're looking at the RGB converted representation of that YCbCr data, whether or not it's SDR or HDR. There are many different ways to convert to RGB (even SDR), many different ways to tonemap - so how you convert to RGB can affect your results. It's not as accurate. Just look at 16 different brands of TV's , you'll get 16 different results. The encoder settings are using YCbCr, so that is what is important to look at

    Conversely, waveform monitors look at the Y' data directly (or proper ones do). So a Y' waveform looks at the Y' channel and plots the values from 0-1023 for 10bit code values. Those values are exact and perfect. If you look at a CbCr waveform, those channel values are examined directly. Perfect.

    The SDR-converted version has higher contrast - that's the explanation. More YCbCr code values are taken up, so it "costs" more bits. If you look at the waveform of the HDR version, large portions of it are "black" or "empty" (i.e. no data; I don't mean the "color" black, Y=64, which does take up data). It has low contrast and is range compressed compared to the SDR version. No data costs fewer bits than more data. Nothing costs less than something. So those large empty areas on the waveform cost basically nothing. One code value of Y'=512 "costs" less than 5 different Y' code values of 510, 511, 512, 513, 514. Modern compression works by looking for redundancies. The more redundant, the higher the compression, the lower the bits, the lower the filesize. And there are more redundancies in the HDR version, spatially and temporally.
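    If you want to put numbers on "range compressed", a sketch using ffmpeg's signalstats filter (filenames from the first post): it tags each frame with the minimum and maximum Y' code values actually used, so you can compare how much of the 0-1023 range each version occupies.
    Code:
    # prints per-frame YMIN,YMAX; run once per encode and compare the spans
    ffprobe -v error -f lavfi -i "movie=HDR10_to_HDR10.mkv,signalstats" -show_entries frame_tags=lavfi.signalstats.YMIN,lavfi.signalstats.YMAX -of csv=p=0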

    Since you already use ffmpeg, I'll stick with ffmpeg for now. Just so everyone is on the same page, here is 1 frame of that sample encoded with both versions. Same command lines as you posted in the 1st post, except starting at 4 seconds (because the clip starts with "black", and that's not useful)

    i.e.

    Code:
    -ss 00:00:04 -frames:v 1

    For now, just look at the Y' channel; but if you examine CbCr, you will also see that the waveform has more contrast and a wider range of code values on the SDR version.

    Code:
    ffmpeg -i "HDR10_to_SDR_1frame.mkv" -vf waveform=g=green:o=0.5 HDR10_to_SDR_1frame_Ywaveform.png
    [Attachment 60326: Y' waveform of the SDR frame]


    Code:
    ffmpeg -i "HDR10_to_HDR10_1frame.mkv" -vf waveform=g=green:o=0.5 HDR10_to_HDR10_1frame_Ywaveform.png
    [Attachment 60327: Y' waveform of the HDR10 frame]
    The graphs are great for understanding what you meant, and regarding the compression results, they make sense.

    There's something that is counterintuitive to me, though.
    If I'm not understanding the whole HDR thing wrong, its purpose is to allow a broader range of luminance so that more detail can be shown in highlights and shadows, and more detail means more data(?)

    So I don't understand the graphs. I would have expected just the opposite of what I see; that is, I would have thought there would be more data, more spread, values closer to the legal limits, with an HDR image.

    Something's gotta click in my mind yet, it seems.
  15. Originally Posted by bokeron2020 View Post
    If I'm not understanding the whole HDR thing wrong, its purpose is to allow a broader range of luminance so that more detail can be shown in highlights and shadows, and more detail means more data(?)
    It does, in terms of what you "see", i.e. the RGB conversion - HDR on an HDR panel does display more in the end: a wider range (brighter brights, darker darks, more vivid colors, more unique color combinations). Not only is the panel technology different, the underlying math equations are different too - i.e. how the YCbCr data is converted to RGB for display.

    Rec709 for SDR is very limiting - roughly 2/3 of possible YCbCr combinations cannot even be expressed in the RGB color model when using the Rec709 equations (!). There are many "illegal" combinations that result in values >1023 in 10bit, or in negative RGB numbers. And many combinations actually map to the same duplicate pixel value when going from 10bit YCbCr to 10bit RGB.

    Another way to say this: there are more unique color combinations possible with Rec2020. More colors are valid in the YCbCr-to-RGB conversion, instead of being duplicates or being clipped. Look up color gamut diagrams for Rec2020 vs. Rec709 - 2020 encompasses much more area, representing more of the colors visible to the human eye.
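    As a concrete illustration that "the equations are different" (these are the standard published luma coefficients, not something specific to this thread's commands): even the luma weighting changes between the two standards, so the same Y'CbCr triplet decodes to different RGB depending on which matrix you assume.
    Code:
    Rec709:  Y' = 0.2126*R' + 0.7152*G' + 0.0722*B'
    Rec2020: Y' = 0.2627*R' + 0.6780*G' + 0.0593*B'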

    https://en.wikipedia.org/wiki/Rec._2020
    https://en.wikipedia.org/wiki/Rec._709
    https://www.benq.com/en-us/business/resource/trends/understanding-color-gamut.html
    I've been doing some testing, tinkering, and reading, so let me see if I'm getting somewhere now.

    This is ffplay processing HDR to SDR in real time, with their respective waveforms.
    [Attachment 60348: ffplay HDR vs tonemapped SDR comparison with waveforms]


    On the right, the clouds' white levels are so high that they are clipping; on the left, they don't even reach 75% on the IRE scale.
    So, this means that the left signal, shown on the proper hardware instead of the old kind, would show that image as it's meant to be shown instead of this grayish, unsaturated version.

    That signal, as the waveform shows, still has margin to include higher white levels -- which wouldn't have been possible before -- but now, with HDR-capable screens, that extra margin allows the screen to show really bright whites without clipping while showing the rest of the image at a normal white level.

    Did I get this right?
  17. Originally Posted by bokeron2020 View Post
    On the right, the clouds' white levels are so high that they are clipping; on the left, they don't even reach 75% on the IRE scale.
    So, this means that the left signal, shown on the proper hardware instead of the old kind, would show that image as it's meant to be shown instead of this grayish, unsaturated version.
    The left side's "unnatural" grey, low-contrast look is because the YCbCr has been converted to RGB using Rec709 for the preview, and that's not what an HDR display uses. The right side also uses 709, but has been tonemapped beforehand - basically by adjusting contrast and saturation - to simulate something similar to what you might see on an HDR display. You can guess that proper hardware would show the image as it's meant to be shown (and you would probably be right), but be aware there are many different types of tonemapping algorithms and settings, and many variations of hardware HDR display processing, that can yield quite different results.


    Originally Posted by bokeron2020 View Post
    That signal, as the waveform shows, still has margin to include higher white levels -- which wouldn't have been possible before -- but now, with HDR-capable screens, that extra margin allows the screen to show really bright whites without clipping while showing the rest of the image at a normal white level.

    Not exactly;

    That IRE waveform is just showing what levels are in the video (converted to mV), if you assume Rec709. That's all you can definitively say.

    You can infer/guess that on an HDR panel you'd get "really high whites without clipping while showing the rest of the image at a normal white level" (and you'd probably be right), but ffmpeg's IRE waveform does not necessarily prove or disprove that. All ffmpeg's IRE reading actually does is replace Y' 16-235 (or Y' 64-940 in 10bit) with 0-100 mV, i.e. it "cheats" and converts based on the Y' waveform reading. Technically, IRE should be tied to the color model and color science used, to whether full or limited range is used, and to what the black and white reference points are - so it's not a true IRE reading.
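    For reference, the scale the waveform filter draws is just an option (a sketch; the filename is a placeholder): s=digital plots raw code values, while s=ire plots the converted IRE-style reading described above.
    Code:
    # same waveform, two graticule scales: raw code values vs. the IRE-style reading
    ffplay -vf "waveform=g=green:o=0.5:s=digital" "HDR10.mkv"
    ffplay -vf "waveform=g=green:o=0.5:s=ire" "HDR10.mkv"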

    And there is quite a bit of variation in HDR processing and display quality. Indeed, many can "show really high whites without clipping", but some do it much better, some much worse.
  18. Looking at the bright side, at least I took the screen capture properly.


