VideoHelp Forum
  1. As I said, to me the MC encode looks "better" because I feel it looks more "realistic". To explain: for years after AVC first came on the scene I hated it. I thought MPEG-2 gave a more "natural" look, and it was a long time before I saw an AVC encode that I preferred over a good MPEG-2 encode, even for HD content. I tend to like that '70s shot-on-film look over most shot-on-video looks, if you know what I mean. For instance, I love the way Jaws 1 and Rocky 1 look; I hated how Avengers End Game looked. To me it looked like it was way out of focus, and I saw it up close in the theater.

    If possible, would you do a proper Meridian test, with the original 96GB file, unscaled, do a 2 pass x264 and 2 pass MC with AQ All 100, choose a bit rate that will result in a file under the 500MB limit this forum imposes on uploads and let's see who wins.
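
    The bitrate needed to stay under an upload limit is simple arithmetic. A minimal sketch, assuming 1 MB = 1000 kB, no container overhead, and a hypothetical 12-minute clip (the duration is a placeholder, not a figure from this thread):

```python
def max_video_bitrate_kbps(size_limit_mb, duration_s, audio_kbps=0.0):
    """Highest average video bitrate (kbit/s) that keeps a file under size_limit_mb.

    Assumes 1 MB = 1000 kB and ignores container overhead, so leave some margin.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    total_kbits = size_limit_mb * 1000 * 8  # MB -> kbit
    return total_kbits / duration_s - audio_kbps


# Hypothetical example: 500 MB limit, 12-minute video-only clip
print(round(max_video_bitrate_kbps(500, 12 * 60)))  # 5556
```

    In practice you would pick a target a few percent below that figure so muxing overhead does not push the file over the limit.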
  2. Originally Posted by sophisticles
    As I said, to me the MC encode looks "better" because I feel it looks more "realistic".
    Ok, that's a different discussion, but still important to look at

    I asked which is more "similar" to the source, and why. How did you arrive at that decision? Truly, I'm trying to understand "perception" between people better.

    Also say which encode in particular, and if all 3 are too close in terms of similarity, then say which ones

    Thanks

    To explain: for years after AVC first came on the scene I hated it. I thought MPEG-2 gave a more "natural" look, and it was a long time before I saw an AVC encode that I preferred over a good MPEG-2 encode, even for HD content. I tend to like that '70s shot-on-film look over most shot-on-video looks, if you know what I mean. For instance, I love the way Jaws 1 and Rocky 1 look; I hated how Avengers End Game looked. To me it looked like it was way out of focus, and I saw it up close in the theater.
    Same with me for AVC. An oversmoothed, "plasticky doll" look is how I would describe it. Even though MPEG2 had more "noise" it was more natural looking

    If possible, would you do a proper Meridian test, with the original 96GB file, unscaled, do a 2 pass x264 and 2 pass MC with AQ All 100, choose a bit rate that will result in a file under the 500MB limit this forum imposes on uploads and let's see who wins.
    It's too big. Unscaled isn't really useful for AVC either, because it has limited relevance. Netflix doesn't use it, YT doesn't use it. There is some UHD AVC broadcast in Europe, but it's limited. It would make more sense to do HEVC, AV1, those types for the unscaled scenario.
  3. When you look at these, which is more similar to the original "a", and why, or how did you come to that decision? If certain parts are more similar or more different, then describe which

    You can swap between them with the number keys

    http://www.framecompare.com/image-compare/screenshotcomparison/DWK6LNNX

    (Bear in mind this is just a cropped single frame; it doesn't necessarily indicate or represent anything else, for example motion characteristics, other frames, or parts of other frames)

    Thanks
  4. Comparisons - Series 2 and 3 . Same deal; number keys or mouse over

    series 2
    http://www.framecompare.com/image-compare/screenshotcomparison/J0FMMNNU

    series 3
    http://www.framecompare.com/image-compare/screenshotcomparison/DWK6PNNX

    I'm more interested in the "why" or how you came to a decision . The more description, the better. If certain areas are similar say so, but if certain areas are different say so.

    Thanks for your input
  5. lordsmurf (Video Restorer)
    Originally Posted by poisondeathray
    Comparisons - Series 2 and 3 . Same deal; number keys or mouse over

    series 2
    http://www.framecompare.com/image-compare/screenshotcomparison/J0FMMNNU

    series 3
    http://www.framecompare.com/image-compare/screenshotcomparison/DWK6PNNX

    I'm more interested in the "why" or how you came to a decision . The more description, the better. If certain areas are similar say so, but if certain areas are different say so.

    Thanks for your input
    What is this?
    The skimmable, Cliff's Notes, TL;DR version, please.
  6. Originally Posted by lordsmurf View Post
    What is this?
    The skimmable, Cliff's Notes, TL;DR version, please.




    For those, the short version is "Which is more similar to the source a?"

    b,c,d, or e

    Look at each of the three series

    http://www.framecompare.com/image-compare/screenshotcomparison/DWK6LNNX
    http://www.framecompare.com/image-compare/screenshotcomparison/J0FMMNNU
    http://www.framecompare.com/image-compare/screenshotcomparison/DWK6PNNX

    Sometimes it's easier to distinguish similarities / differences if images are superimposed

    So mouse over, or toggle them with the number keys, and you can quickly check back against the source by hitting "1"

    Some of these screenshots are not in the posted videos

    It feels like some kids game, but really it isn't.

    I'm interested in the decision making process, and what characteristics make someone say it's more similar (or less similar). The description of what is similar (or less similar) is important here

    It's also part of feedback for VMAF improvement


    Thanks
  7. I honestly can't see the difference between any of those screenshots, either my eyes are shot, my monitor is crap or the difference between any of those encodes is so small as to be insignificant.
  8. Most similar to the source:
    c (look at the low contrast flower of the tie), d2, d3.

    c3 produces color artifacts (e.g. around the hat) and more jaggies
    Last edited by Sharc; 13th Oct 2019 at 04:19. Reason: added comment for c3
  9. KarMa (Dinosaur Supervisor)
    Originally Posted by sophisticles
    I honestly can't see the difference between any of those screenshots, either my eyes are shot, my monitor is crap or the difference between any of those encodes is so small as to be insignificant.
    There's a fairly noticeable difference between all of them, some more than others, with C being closest to the source, followed by D, and a fight for the worst between B and E.
  10. I would rank:

    Series 1:
    C
    D
    B
    E

    Series 2:
    D
    C
    E
    B

    Series 3:
    D
    C (hand color changes?)
    E
    B

    I think the biggest difference is the noise level. D and C keep it. E and B do not. The sky in B has horrible banding. For series 1 the red flowers/leaves on the tie change a lot.
    Last edited by sneaker; 13th Oct 2019 at 04:21.
  11. KarMa (Dinosaur Supervisor)
    Yeah D won #3 hands down. Everything about the tire in the lower right and the guy's suit. I did notice the color change on C, might be deep into a GOP or it's just that way.
  12. Originally Posted by sophisticles View Post
    I honestly can't see the difference between any of those screenshots, either my eyes are shot, my monitor is crap or the difference between any of those encodes is so small as to be insignificant.


    Are they 100% identical ? When you toggle the number keys or mouse over and flip back to "1" , are there any differences ? Can you see the pictures swap ?

    Could it be a browser issue ?

    Are you looking closely, regular computer distance, or living room TV distance ? Look closely

    Were you able to see differences in the 8bit vs. 10bit "Lighthouse" screenshots in the other thread?

    The problem with most picture viewers is that you usually can't hot swap back to a specific image, such as the original. You can with avspmod/avisynth, or vapoursynth multiviewer, and also with video, so that all versions stay aligned when you navigate to different frames



    Keep in mind these are still images, cropped, they don't capture the full video experience, especially motion and temporal (obviously) . And you're probably sitting at normal computer distance, not living room TV distance. But I'm wondering what aspects human perception "focuses" on for lack of a better term.

    I uploaded the series in case there is some browser issue.

    For the other guys that answered, I added an "f" for each series in the zip too, but not on the website, since there appeared to be clear differences in some of them

    The thing with VMAF is it gives individual frame scores, but it calculates with a temporal and motion component as well (unlike SSIM, PSNR) . I'll post more about VMAF a bit later with the updated results
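
    For contrast, a purely per-frame metric such as PSNR can be sketched as below. This is a minimal illustration, not VMAF: each frame is scored only from its own samples, so nothing about motion or neighbouring frames enters the result. The function name is just for illustration.

```python
import math


def psnr(reference, distorted, peak=255):
    """PSNR in dB between two equal-length sequences of 8-bit sample values."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


# Each frame is scored independently; frame order does not matter.
frame_scores = [psnr(ref, enc) for ref, enc in [
    ([100, 100], [101, 99]),   # small error everywhere
    ([100, 100], [100, 100]),  # identical
]]
print(frame_scores)
```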


    Thanks for the replies
  13. So I downloaded all the zip files and scrolled through them one by one in sequence:

    In the first series, if I simply concentrate on the guy's face, I can see no perceptible difference between any of the screenshots, with the exception of F, in which the colors are a bit "off" compared to the source. If I concentrate on the patterns on the actor's tie, then C seems to be the closest, F is the worst by virtue of being a bit blurry and having colors that appear a bit "off", and the rest are very close to C.

    In all honesty, there's no way that anyone would know anything was amiss unless they set two monitors side by side and played both versions simultaneously.

    In the second series, the only thing I notice is that the color is a bit "off" with F, so in that sense it's the worst from the point of view of reproducing the original as faithfully as possible; however, I actually like the colors a bit more than the original.

    With the third series, under normal conditions I can't tell the difference between any except for F, which again shows a slight color shift.

    If I zoom in on the door, the original seems to have a lot of noise, so for me C and D did the best job, because they seem to have just as much noise, F again has a slight color shift, however if I had to choose which I prefer I would choose B and E because they produced the cleanest encode.

    Again, I think if someone encoded a Blu-Ray with any of these encodes at BD rates, I don't think anyone would notice anything.

    So, which encoder is which?
  14. Thanks for the comments so far

    The assigned "letters" were not necessarily the same between series 1, 2 and 3, i.e. the encoder and/or settings used were not necessarily the same between series; e.g. b1 wasn't necessarily the same as b2 or b3. I swapped things around a bit, and some encodes had not even been posted before. So please be clear about which letter and which set.



    I'm asking you to perform 2 more assessments.

    The 1st one was supposed to be up close , computer distance, from the browser (or I guess picture viewer if looking at the zipped images)

    2) If you sat back at what Netflix calls "3H" or 3 times the screen height - approx. regular viewing distance in the living room from a TV - if you saw differences in (1) above up close - are you able to still see those differences now? Which ones are more similar now? I'm still referring to those cropped image sets.

    3) Were there any that you could exclude almost immediately in being "similar" or kick to the bottom of the list ? If there were, what characteristics or how did you make that determination
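
    The "3H" distance is easy to work out from a screen's diagonal. A minimal sketch, assuming a flat 16:9 panel by default; the 55-inch size is only an example, not something from this thread:

```python
import math


def viewing_distance(diagonal, aspect_w=16, aspect_h=9, heights=3.0):
    """Viewing distance, in the same unit as the diagonal, at `heights` screen heights."""
    screen_height = diagonal * aspect_h / math.hypot(aspect_w, aspect_h)
    return heights * screen_height


# Example: a 55-inch 16:9 TV viewed at 3H
print(round(viewing_distance(55), 1))  # inches
```

    For the cropped images, the relevant height is that of the full frame as displayed, not of the crop.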
  15. Originally Posted by poisondeathray View Post

    I'm asking you to perform 2 more assessments.

    The 1st one was supposed to be up close , computer distance, from the browser (or I guess picture viewer if looking at the zipped images)

    2) If you sat back at what Netflix calls "3H" or 3 times the screen height - approx. regular viewing distance in the living room from a TV - if you saw differences in (1) above up close - are you able to still see those differences now? Which ones are more similar now? I'm still referring to those cropped image sets.

    3) Were there any that you could exclude almost immediately in being "similar" or kick to the bottom of the list ? If there were, what characteristics or how did you make that determination
    'Similarity to the source' for 'normal' viewing distance:

    f loses in all series because it darkens the picture. So I dropped it.

    Series 1: c1 wins, followed by d1

    Series 2: c2, d2 are similar, b2 and e2 lose because of the banding

    Series 3: no clear preference, all about the same

    My impression:
    - From a normal viewing distance the loss of details (e.g. in shades and low contrast areas) becomes less critical, unless the impression of sharpness is also affected
    - Banding is more annoying than loss of some detail (would be even worse in fading scenes)
    - Any color shift makes the picture look dissimilar to the original, even from distant viewing (not saying that it looks bad per se)
  16. At normal viewing distances, with the exception of the above noted color shift, I do not see any difference between any of the pictures.

    This experiment, for me, has simply reinforced my long held belief that the encoder is less important than the source and so long as one uses enough bit-rate you will probably not notice the difference, unless you pixel peep.

    Over the weekend I decided to do my own tests, downloaded the 96GB Meridian source and ran a bunch of encoding tests using HandBrake on Manjaro, with the output unscaled and doing 2 pass encodes of x264 at 15 MB/s, using ultrafast, superfast, veryfast, fast and medium presets as well as nvenc avc encodes with HQ, HP, fast, medium, slow and default presets.

    It took forever to finish the encodes, primarily because JPEG2000 is a beast to decode, meaning the encodes were averaging less than 4 fps, regardless of settings and encoder used, and when all was said and done, viewing the 4k final encode on a 1080p monitor, from a normal viewing distance, I honestly couldn't see any difference between any of them.
    Last edited by sophisticles; 14th Oct 2019 at 11:43.
  17. Code:
    series 1 - frame 2050
    b MC_AVC_7.5Mbps_2passVBR_AQall100
    c x264_veryslow_7.5Mbps_2passVBR
    d x264_2pass_7.7Mbps_veryslow_tunepsnr (video not posted)
    e MC_AVC_7.5Mbps_2passVBR_AQcomplexity(-100)
    f Youtube download (video not posted)
    
    series 2 & 3 - frame 1000 (I frame for MC and x264)
    b MC_AVC_7.5Mbps_2passVBR_AQcontrast(-100)
    c MC_AVC_7.5Mbps_2passVBR_AQcontrast100
    d x264_veryslow_7.5Mbps_2passVBR
    e MC_AVC_7.5Mbps_2passVBR_AQall(-100)
    f Youtube download (video not posted)
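
    For anyone wanting to reproduce something like the "x264_veryslow_7.5Mbps_2passVBR" entry, here is one way to build a two-pass command pair. This is a sketch that assumes ffmpeg with libx264 is used; the thread does not say which front end produced these encodes, and the file names are placeholders.

```python
import os
import shlex


def two_pass_x264_commands(src, dst, bitrate_kbps=7500, preset="veryslow"):
    """Return (first_pass, second_pass) ffmpeg argument lists for a 2-pass x264 encode."""
    common = ["ffmpeg", "-y", "-i", src,
              "-c:v", "libx264", "-preset", preset, "-b:v", f"{bitrate_kbps}k"]
    # Pass 1 only gathers statistics, so its output is discarded.
    first = common + ["-pass", "1", "-an", "-f", "null", os.devnull]
    # Pass 2 writes the real file; audio is dropped here for a video-only comparison.
    second = common + ["-pass", "2", "-an", dst]
    return first, second


# Placeholder file names; the commands are only printed, not executed.
for cmd in two_pass_x264_commands("source.mov", "x264_veryslow_7500k.mp4"):
    print(shlex.join(cmd))
```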



    My thoughts are essentially the same as Sharc's, sneaker's, and KarMa's: there are significant differences up close, but many of those differences are difficult to see at a distance. Rankings were identical between sneaker and me, and the most similar was the same for everyone except sophi, who reported seeing no differences up close other than color shifting (only the most dissimilar, "f", was reported).

    As for Sharc, the difference in banding in series 2 is the worst offender for me, visible from a distance, and also in videos. (It's also the most "annoying" for me too)

    Color differences - interesting that this was the only difference for sophi. Humans are typically less sensitive to color. It's one of the reasons why we have chroma subsampling in video. Netflix doesn't even take color into account in the current VMAF model - I'll post more on VMAF later. Maybe, sophi, you are extra sensitive to color variations? But is that just overall general color balance? You didn't mention or notice the discolored hands and faces in c3?
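
    To put a number on the chroma subsampling point, here is a small sketch that counts samples per frame for the common layouts. The 1920x1080 frame size is just an example.

```python
# Horizontal and vertical chroma divisors for common subsampling layouts.
SUBSAMPLING = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}


def samples_per_frame(width, height, layout="4:2:0"):
    """Total Y + Cb + Cr samples in one frame for the given chroma layout."""
    h_div, v_div = SUBSAMPLING[layout]
    luma = width * height
    chroma = 2 * (width // h_div) * (height // v_div)  # Cb and Cr planes
    return luma + chroma


full = samples_per_frame(1920, 1080, "4:4:4")
sub = samples_per_frame(1920, 1080, "4:2:0")
print(full, sub, sub / full)  # 4:2:0 carries half the samples of 4:4:4
```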

    As I said before, I'm more interested about the PERCEPTION of differences, and WHY or WHAT caused you to identify a difference (or miss "seeing" a difference) .




    series 1)

    Up close I can easily see the differences in the skin detail, and in textures such as the hat, trench collar, and wrinkles in the trench.

    I put a higher "weight" on different types of textures and detail. Foreground actors'/actresses' faces and skin detail should be at the top. Nobody else mentioned the skin specifically, so maybe I'm just seeing things, but it's pretty clear to me there are significant differences in the skin

    Details are blurred away in all of them except c1. d1 retains some of the hat brim textures. c1 retains the most of those fine details, but also has deteriorated edges. eg. If you look at the top of the hat, or the hat band - the interior and exterior lines are not as smooth . I call this "ratty" edges. These are typical x264 artifacts with most AQ modes, and to an extent psy - that's one of the tradeoffs for retaining details. The default values might not be optimal for this.

    "f" is the least similar. It has that smoothing lack of detail look, but also those edge artifacts - double bad. The plants in the background are clearly deteriorated

    The differences are more difficult to see from far, but there is a general impression of more detail, more "noise" for c1. But the original a1 also had more detail, more "noise"




    Series 2 - This one demonstrates the largest differences . Most people should see these differences at a distance, and in video

    Some serious banding in b2,e2,f2. c2 and d2 retain that noise/grain pattern, and d2 is slightly more similar

    f2 - the "darkening" was more noticeable to me first, vs. the "color shift"


    This is a classic AVC high-quantization compression artifact. In MPEG2, the analogous artifact would manifest as a grid of pixelation or macroblocking. In AVC, there is in-loop deblocking and variable block sizes, so the edges along blocks are typically smoother, and the pattern is less symmetrical and less grid-like

    It's essentially the same phenomenon as this

    https://forum.videohelp.com/threads/394569-I-there-any-benefit-to-encoding-to-10-bits#post2562122
    or
    https://forum.videohelp.com/threads/394425-x264-Horizontal-blocky-lines-no-matter-what-preset

    This is a very common issue with all AVC encoders , and it's a very common complaint

    Look at image set 2 png file sizes. They used the same png compressor. That large deviation in size should raise red flags with b2, e2, f2

    a2 406 kb
    b2 15 kb
    c2 327 kb
    d2 315 kb
    e2 49 kb
    f2 45 kb
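
    The size check can be made explicit by comparing each PNG with the source image. A minimal sketch using the sizes listed above; the 0.5 cut-off is an arbitrary assumption chosen for illustration, not an established threshold.

```python
# PNG sizes in kB for image set 2, as listed in the post.
png_kb = {"a2": 406, "b2": 15, "c2": 327, "d2": 315, "e2": 49, "f2": 45}


def size_ratios(sizes, source="a2"):
    """Size of each encode's PNG relative to the source PNG."""
    return {name: kb / sizes[source] for name, kb in sizes.items() if name != source}


def flag_small(sizes, source="a2", threshold=0.5):
    """Names whose PNG is much smaller than the source, hinting at lost noise/detail."""
    return sorted(n for n, r in size_ratios(sizes, source).items() if r < threshold)


for name, ratio in size_ratios(png_kb).items():
    print(f"{name}: {ratio:.2f}")
print(flag_small(png_kb))  # ['b2', 'e2', 'f2']
```

    A lossless compressor needs far fewer bytes for flat, banded areas than for grain, which is why the small files line up with the banded encodes.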

    You can "hide" banding artifacts if dither, or grain, or fine noise is applied, but the encoder has to be able to retain that dither/grain/noise , and it is typically expensive to encode in terms of bitrate

    In motion this is terrible too . I take it back that x264 is the "worst" for keyframe pumping. Look at "MC_AVC_7.5Mbps_2passVBR_AQcontrast(-100).mp4.mp4" (!). I didn't look closely at it earlier in motion. The pattern of the blocking/banding swaps or "clicks" with each new Iframe and GOP. That video was the highest rated VMAF score - which is supposed to take into account temporal differences (and penalize larger deviations more when using "harmonic mean" pooling). More on VMAF later
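
    To illustrate why harmonic mean pooling penalizes dips more than a plain average, here is a sketch with made-up per-frame scores. The +1 shift follows what I understand the VMAF tooling to do to avoid dividing by zero; treat that detail as an assumption.

```python
def arithmetic_mean(scores):
    """Plain average of per-frame scores."""
    return sum(scores) / len(scores)


def harmonic_mean_pooled(scores):
    """Harmonic mean of (score + 1), minus 1; low frames pull the result down harder."""
    return len(scores) / sum(1.0 / (s + 1.0) for s in scores) - 1.0


# Hypothetical per-frame scores with the same arithmetic mean (90).
steady = [90] * 10          # constant quality
dippy = [95] * 9 + [45]     # mostly better, with one bad frame

for name, scores in (("steady", steady), ("dippy", dippy)):
    print(name, arithmetic_mean(scores), round(harmonic_mean_pooled(scores), 2))
```

    Both clips average 90, but the one with a single bad frame pools to roughly 85.6 under the harmonic mean.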




    series 3)

    There are significant quality differences to me up close. Some of these were mentioned earlier -

    b3 and e3 have objects missing, such as rocks under the car, the buttons on the trenchcoat. Textures such as rear tire treads are missing. Window and car door grain/noise pattern is missing, face is discolored and missing the grain/noise

    c3 and d3 are closer in those respects, but c3 has softer shadow detail under the car, and the rocks are blurred. c3 has the discoloring problems mentioned in earlier posts, also around the hat brim and shoe highlights. The car hood highlights are discolored, and the top of the open door is missing the highlight. Fingers on the actor's left hand are discolored with red rimming, and the right hand is desaturated.

    I suspect "c3" is shifting CbCr bitrate into Y.
  18. Originally Posted by sophisticles View Post
    At normal viewing distances, with the exception of the above noted color shift, I do not see any difference between any of the pictures.
    Yes that's a clear distinction I want to re-iterate

    A) What 95% of people can see from normal viewing distance across the room, and B) what you can see looking up close


    This experiment, for me, has simply reinforced my long held belief that the encoder is less important than the source and so long as one uses enough bit-rate you will probably not notice the difference, unless you pixel peep.

    I mostly agree . The problem arises when you have bitrate restrictions or have a scenario that requires better compression ratio . Everything looks great with "enough bitrate", even MPEG2.

    "Enough bitrate" is what encoding tests are typically about - how much is "enough" (and for what purpose)? . eg. If you can get away with using 20% less bitrate for the same level of "quality" in situation (A) , maybe that pushes some CEO farther along to make that decision . Maybe that can help justify cutting average bitrates some more.

    And how important is situation (B)? If 95% of viewers at a distance cannot see the difference between "youtube quality" and something "higher quality" up close, then does it matter what quality you stream ? What about viewers that watch on their computers, or portable devices such as ipads or notebooks ? What is the "minimum" that you can get away with ?

    What types of artifacts or "problems" are easily identifiable ? (And moving away from "similarity" for a second, what types of problems are really "annoying", enough that someone might complain about )

    You hear complaints all the time about Netflix quality (or some Network TV station, HBO, or Youtube, or a certain BD's quality). Are they that 5% group? Are the 95% "happy" with the quality? "Complainers" are the most likely to speak up.

    I'm definitely in that 5%. Details matter to me. If objects are missing, or details are blurred away, that's important to me - that's clearly not "similar". If the source has grain, and the encode suddenly doesn't - that's clearly "not similar". If the source is high quality, and the encode suddenly has banding - that's clearly "not similar". A comprehensive quality assessment looks at everything, not just casually from a distance. So you need to compare other frames up close as well (not just a few cropped ones), and also the motion quality and motion artifacts.

    Originally Posted by sophisticles View Post
    I honestly can't see the difference between any of those screenshots, either my eyes are shot, my monitor is crap or the difference between any of those encodes is so small as to be insignificant.
    If you cannot "see" a problem such as banding or blocking, even up close - it does not mean it has "no problems" . If we assume you are being honest and sincere - It suggests that you cannot assess "quality" properly , or at least that you miss some things. The other responders can identify those types of problems and differences (but people that post in this forum are probably not representative of the general population)

    It explains a lot; I can understand some of your previous posts and where you are coming from, so thanks for sharing. Almost everything looks the "same" to you. A youtube video looks similar to a higher quality mainconcept or x264 encode except for "color shifting" - you cannot identify differences up close. Large differences and common AVC artifacts such as banding and blocking are completely missed.

    I encourage you to "pixel peep" a bit more, and learn about common artifacts. Identifying problems is the first step to learning how to combat them, how to filter or what encoding settings to use etc... But if you can't even identify the problem, then it's difficult to do something about it. In a sense "ignorance is bliss". I enjoy movies less - I notice defects and problems all the time, not just compression related, but director / camera related - analyzing more instead of just being entertained
  19. Would you consider doing a similar test with x265 and MC HEVC? Maybe someone can add Turing and AMD HEVC as well.
    Last edited by sophisticles; 14th Oct 2019 at 14:25.
  20. I'm wondering if the reason I do not see the differences that PDR and others are describing is either age related or OS related. I'm 49, how old are you guys and how is your vision, 20/20, with or without glasses? Also, I have been using Linux for years as my primary OS, usually with an NVIDIA card, almost always with the proprietary drivers and I tend to favor smplayer.

    What version of Windows do you guys use, with what settings, drivers, video card, etc?

    The reason I ask is because I remember years ago, with Win XP, I had noticed that at default settings, Intel graphics resulted in colors that looked "washed out", even if I adjusted them, NVIDIA was better and AMD/ATI had the richest colors.
    Last edited by sophisticles; 14th Oct 2019 at 14:25.
  21. Originally Posted by sophisticles View Post
    Over the weekend I decided to do my own tests, downloaded the 96GB Meridian source and ran a bunch of encoding tests using HandBrake on Manjaro, with the output unscaled and doing 2 pass encodes of x264 at 15 MB/s, using ultrafast, superfast, veryfast, fast and medium presets as well as nvenc avc encodes with HQ, HP, fast, medium, slow and default presets.

    It took forever to finish the encodes, primarily because JPEG2000 is a beast to decode, meaning the encodes were averaging less than 4 fps, regardless of settings and encoder used, and when all was said and done, viewing the 4k final encode on a 1080p monitor, from a normal viewing distance, I honestly couldn't see any difference between any of them.
    Not too surprising - the main reason is probably the oversampling. If you watch some YT videos, the UHD version on a small display looks much better. But if you examine the UHD stream at 1:1 it's full of artifacts. "Soft" or crappy looking UHD can look great when downscaled. It's analogous to how soft HD camcorder recordings can look good when resized to SD.



    Originally Posted by sophisticles
    Would you consider doing a similar test with x265 and MC HEVC? Maybe someone can add Turing and AMD HEVC as well.
    I'm still finishing up the VMAF part on this one (I made some mistakes, and had to double check with other tools to be certain)

    For me, the more test results, the better.

    I would like to see that, but it takes a lot of time to actually do them (properly) . I will contribute if I can, but net bandwidth is a concern too. My home account doesn't have unlimited bandwidth, so 96GB is a lot . 2-3GB chunks are easier to "swallow"
  22. I don't think it has to do with vision for the "easily" visible differences - unless someone has very poor eyesight

    All the graphics cards have settings you can adjust. Make sure monitor is calibrated; very rarely are monitors calibrated from the warehouse.

    What about the website and browser comparison ? Those are already sRGB images, and higher chance of displaying correctly. More things can go "wrong" with a YUV video and video player playback, there are multiple other issues that can affect players such as choice of renderer, renderer settings.



    That series 2 "sky" shouldn't be a problem with monitor or graphics card because it's not at the extremes of bright or dark. It's near the middle in terms of Y waveform values. The typical problems you might encounter for display/graphics settings issues occur around very bright and very dark shades (bright and dark areas might be crushed or not as visible if gamma curve is "off") . Or PC/TV levels is another common issue, or just wrong color balance. None of those should affect the series 2 sky issue
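
    The PC/TV levels issue can be sketched with the usual 8-bit luma mapping between limited range (16-235) and full range (0-255). The sample values below are made up; the point is that a levels mistake crushes shadows while barely moving mid-tones like that sky.

```python
def limited_to_full(y):
    """Expand 8-bit limited-range luma (16-235) to full range (0-255), clamped."""
    return max(0, min(255, round((y - 16) * 255 / 219)))


def full_to_limited(v):
    """Compress 8-bit full-range luma (0-255) into limited range (16-235)."""
    return round(16 + v * 219 / 255)


# Correct single expansion versus a mistaken double expansion.
for y in (30, 126):  # a dark shade and a mid-gray
    once = limited_to_full(y)
    twice = limited_to_full(once)
    print(y, once, twice)
```

    The dark sample ends up at 0 after the second expansion, while the mid-gray only shifts by a couple of code values.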

    Some of the smaller differences might be difficult to see, but the "big" one is the banding/blocking. Make sure you get that one down because it's so common.

    Two tools/methods that helped me learn along the way:

    1) The fast switching technique with number keys, using avspmod/avisynth. I tried to sort of emulate that experience here with the framecompare website. avspmod is better: you can compare other frames easily, do a bunch of manipulations to see things better, etc. A linux equivalent would be vapoursynth multiviewer. Both require learning avisynth or vapoursynth basics

    2) Enhancement techniques to "see" things better . e.g. very common - you might "brighten" up a dark scene to see the compression artifacts better . You can do that right in avspmod or vpy script.

    You shouldn't need that for the series 2 differences. But here is scene 2 with histogram("luma") - which is an enhancement technique that amplifies luminance variations

    [Attachment 50542: series 2 with histogram("luma") applied]


    Notice in 2b, the macroblock edges are enhanced, easier to see. Now correlate those areas with the real image. Again, it would be "nicer" if you could do the fast switch, because of the "superimposed" split second residual memory . 2c should look more similar to 2a, than 2b . The grain/noise pattern is preserved better. You should be able to see that better in the "enhanced version" .
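
    The idea behind that kind of enhancement can be sketched in a few lines. This is a rough approximation, assuming luma is scaled by 16 and folded back into the 0-255 range; it is not AviSynth's actual code, and the function name is made up.

```python
def amplify_luma(y, gain=16):
    """Scale 8-bit luma by `gain` and fold it back into 0-255 (triangle wave)."""
    folded = (y * gain) % 512
    return folded if folded < 256 else 511 - folded


# A nearly flat row of luma values, e.g. from a sky with a faint block edge.
row = [120, 120, 120, 121, 121, 121]
print([amplify_luma(y) for y in row])
```

    A one-step change in the input, invisible to most eyes, becomes a 16-step jump in the output, which is why block edges stand out in the enhanced image.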

    It's something you get used to and/or learn with experience. You start noticing things quickly, you know where to look, where typical problems might arise; what conditions predispose you to certain types of artifacts. It probably involves some sort of pattern recognition, and feedback. You learn to identify signs, and certain encoders have almost "signature" signs.
  23. Originally Posted by sophisticles View Post
    With the third series, under normal conditions I can't tell the difference between any except for F, which again shows a slight color shift.

    If I zoom in on the door, the original seems to have a lot of noise, so for me C and D did the best job, because they seem to have just as much noise, F again has a slight color shift, however if I had to choose which I prefer I would choose B and E because they produced the cleanest encode.
    For me it is hard to understand that you can't see the differences. It can definitely not be attributed to your age .
    What you call 'noise' are probably fine picture details in the original, no? You may argue that you expect the door of the car to be 'clean' in the original. But if an encoder suppresses this 'noise' it will most likely also remove details from other parts of the picture.
    See for example the tire and the shadow below the car in c3 vs b3, and compare with the original a3. Don't you see the loss of details on the tire and in the shadow (grey mash with banding in b3)? The same goes for the face in series 1, for example.
    I suspect that what you call a 'clean encode' is mostly a loss of picture details. If you like it, it is your personal preference. It has however little to do with 'similarity to the source' as I understand it.
    Image Attached Files
    Last edited by Sharc; 15th Oct 2019 at 06:48. Reason: Attachment added
    Quote Quote  
    Sure, with the zoomed in tire treads I notice the loss of detail, but there is no way you guys can tell me that unless you pixel peep you will notice this at normal playback on a normal monitor from normal viewing distances.

    The differences are largely academic, this whole discussion reminds me of being in high school arguing with my friends about which car was "the best". Our arguments invariably came down to which car had the quicker 0-60 time or 1/4 mile time or top speed as reported by some car magazine (this was before the days of the internet).

    If a Mustang was reported by Hot Rod magazine or Car & Driver or Road & Track or Motor Trend to have a quicker 0-60 time, even if only by 2/10ths of a second over an Iroc, or if the Buick GN could do 0-60 in 6 seconds versus 6.5 seconds for another car, that car was judged to be "the best".

    It didn't matter that the maximum speed limit in most states is between 55-65 mph, if a car could do 150+ on a race track, we wanted it. It didn't matter that you could never pass someone at 0 mph, the 0-60 time was king. It didn't matter that you don't only drive 1/4 mile and then park the car, having the quicker time meant everything.

    Things like comfort, reliability, handling, safety, braking, or cost didn't factor into our judgement.

    It didn't matter that under normal conditions it's nearly impossible to tell the difference between accelerating 0-60 in 4.9 seconds instead of 5.4 just by driving the car, or that on the street you will never be able to come close to a car's claimed 0-60 times because of traction issues.

    It's the same with these comparisons, you guys zoom in, pixel peep, take still screenshots and enhance them in order to spot some minute difference that will justify the claim that this encoder is better than the next.

    I did an experiment that I will be posting in the next day or two, I took a 4k source from the Black Magic website, the Fitness Blogger sample, cropped it and encoded it using x264 ultra fast, super fast, very fast, faster, fast, medium and slow, 2 pass, 15 MB/s to see if I could see any differences under normal viewing conditions. I have a very busy day today and tomorrow and will not be home until late tonight, but when I post it I want you guys to tell me if you spot any differences.

    It will be interesting to see what you guys think.
    Quote Quote  
    The differences in treads or rocks can be more difficult to see on some setups - perception of shadow details has a lot to do with your monitor and how the gamma curve is handled. Many monitors will "crush" shadow details, so they are less distinct, i.e. there is typically more variation in what people can "see" with shadow detail than with brighter differences such as the sky banding/blocking in series 2.

    Again - not being able to see the differences , does not mean there are no differences

    If people can't see the differences in a Youtube video vs. a higher quality video, I think we're in trouble. Then really nothing matters.

    You can decide if the differences are important to you, or for some situation, or what you prefer more, or put conditions on it however you like. Those are slightly different topics.




    The actual quality of the video does not change with distance - your perception of it changes with distance. It's important to look at everything - other frames, motion characteristics , etc...
    Quote Quote  
  26. Originally Posted by sophisticles View Post
    I did an experiment that I will be posting in the next day or two, I took a 4k source from the Black Magic website, the Fitness Blogger sample, cropped it and encoded it using x264 ultra fast, super fast, very fast, faster, fast, medium and slow, 2 pass, 15 MB/s to see if I could see any differences under normal viewing conditions. I have a very busy day today and tomorrow and will not be home until late tonight, but when I post it I want you guys to tell me if you spot any differences.

    It will be interesting to see what you guys think.

    That setup has potential issues because it's oversampled. The scaling algorithm used to downsample has a significant effect in that situation. If Sharc's player or TV uses a different one than mine, the same video can look quite different. Usually that test would need stricter protocols to control variables.

    If you're testing to see if someone can see differences on a HD display, you would expect any differences to be minimized by the downsampling. I wouldn't expect to see many differences unless the UHD version had serious issues, maybe something like completely wrong colors or big artifacts. It's analogous to acquisition at UHD and delivery at 1080. A soft, low quality consumer level UHD camcorder can make incredible looking HD footage (in terms of artifact free, clean, sharp; not necessarily in terms of latitude or dynamic range), better than $20-30K 1080p cameras just a few years ago because of the oversampling.

    But the opposite is a more common scenario. Lower quality adaptive streams are viewed on a 1080p display . For example, different versions of 360p, 480p web video. What is the "quality" perceived between those streams? Those are some situations the Netflix metric is supposed to measure. But it's a similar situation there - they have guidelines to control the upsampling algorithm used
    Quote Quote  
  27. Sorry for the long post, just wrapping things up

    Image
    [Attachment 50568]



    I made a mistake with the VMAF calculation earlier, so I updated it and verified it with the official tools (vmafossexec), and added ffmpeg PSNR and SSIM results


    The "winner" according to VMAF is MC AVC AQComplexity(-100)


    Visually I disagree, and even some of the other MC encodes look more similar to the source in terms of overall characteristics, especially in terms of details and grain - basically all the stuff mentioned earlier. The banding is much worse with AQComplexity(-100). Looking at the per frame data, it's not just an averaging or math issue either.

    Some people think disabling psy and AQ is a good idea. Maybe in terms of concept, but in actual usage for x264 it's beneficial in most circumstances. You might have to adjust it for different circumstances (e.g. cartoons, anime would typically lower the values because clean lines have higher priority), but the default values are a good starting point .

    If you look at the --tune psnr (disables psy and AQ) screenshots and video (added to 1st post), it's blurry with missing details too. Banding in the sky is worse. It ends up looking more like a "traditional" MC encode, or something that comes out of Adobe. AQContrast(+) seems to work similarly to x264's AQMode 1 in terms of moving bitrate to flat areas, shadows, but the effect is limited in strength or capped, probably for safety. It's easy to go overboard for AQ for x264 and cause more problems.

    Notice --tune ssim and --tune psnr score the highest for ssim and psnr, respectively. When x264 supposedly "won" those AVC MSU encoder tests, note the commandlines and measurements. They were --tune ssim, psnr - it was pretty much meaningless. If you were to actually look at the videos, they look like mush .




    Nevertheless, running those metrics is still useful in some ways for seeing trends, and sometimes for pointing out problems. Notice the PSNR Min value is abnormally low for NVEnc(pascal), AMD(vce4.3), Apple(QTPro). If you look at the per frame data, all 3 had blips on real frame 1186, which is the 1st frame of the scene change (ffmpeg logs start at "1" not "zero", so the blip is on 1187 in the log). But looking at each of them, that frame is not terribly different from the surrounding frames (they are all "lowish" in quality, but about the same). Sometimes you can get weird results with ffmpeg because of various frame accuracy problems (weird timestamps, open GOP seeks), but indexed avs scripts were used as inputs so I'm sure it's working ok. But I can't explain the drop. vmaf only showed a slight drop, e.g. if it was 92 for the frames around it, it might drop to 90 or so for that 1 frame.

    An excerpt from the NVEnc(Pascal) ssim and psnr logs.

    n:1186 Y:0.930024 U:0.942970 V:0.971175 All:0.939040 (12.149572)
    n:1187 Y:0.571607 U:0.895821 V:0.947226 All:0.688246 (5.061873)
    n:1188 Y:0.925645 U:0.954167 V:0.970581 All:0.937888 (12.068255)

    n:1186 mse_avg:8.69 mse_y:11.49 mse_u:4.19 mse_v:2.01 psnr_avg:38.74 psnr_y:37.53 psnr_u:41.91 psnr_v:45.10
    n:1187 mse_avg:2743.19 mse_y:4086.30 mse_u:76.00 mse_v:37.91 psnr_avg:13.75 psnr_y:12.02 psnr_u:29.32 psnr_v:32.34
    n:1188 mse_avg:14.40 mse_y:20.34 mse_u:3.08 mse_v:2.00 psnr_avg:36.55 psnr_y:35.05 psnr_u:43.24 psnr_v:45.12
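    Spotting a blip like this by eye in a long per frame log is tedious, so here is a rough Python sketch of how it could be automated. It assumes the log lines are space separated "key:value" pairs exactly as in the PSNR excerpt above. The helper names and the 10 dB threshold are made up for illustration and are not part of ffmpeg.

```python
import re
import statistics

# Sample lines copied from the NVEnc(Pascal) PSNR log excerpt above.
LOG = """\
n:1186 mse_avg:8.69 mse_y:11.49 mse_u:4.19 mse_v:2.01 psnr_avg:38.74 psnr_y:37.53 psnr_u:41.91 psnr_v:45.10
n:1187 mse_avg:2743.19 mse_y:4086.30 mse_u:76.00 mse_v:37.91 psnr_avg:13.75 psnr_y:12.02 psnr_u:29.32 psnr_v:32.34
n:1188 mse_avg:14.40 mse_y:20.34 mse_u:3.08 mse_v:2.00 psnr_avg:36.55 psnr_y:35.05 psnr_u:43.24 psnr_v:45.12
"""

FIELD = re.compile(r"(\w+):(\S+)")


def parse_line(line):
    """Turn one 'key:value key:value ...' log line into a dict of floats."""
    return {key: float(value) for key, value in FIELD.findall(line)}


def find_blips(lines, key="psnr_y", drop_db=10.0):
    """Return (log_n, zero_based_frame, value) for frames far below the median.

    drop_db is an arbitrary, hypothetical threshold; tune it per sample.
    """
    rows = [parse_line(line) for line in lines if line.strip()]
    median = statistics.median(row[key] for row in rows)
    return [
        (int(row["n"]), int(row["n"]) - 1, row[key])  # log counts from 1
        for row in rows
        if median - row[key] > drop_db
    ]


blips = find_blips(LOG.splitlines())
for log_n, frame, value in blips:
    print(f"log n={log_n} (frame {frame}): {value:.2f} dB")
```

    On a full log you would read the lines from the stats file instead of the embedded sample. A median is used rather than a mean so that the blip itself does not drag the baseline down.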

    Anyone who has dealt with PSNR in the past knows that a PSNR-Y of 12 is terrible quality; it's basically a destroyed image. An almost completely different image. Just google for pictures of representative dB values
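    To put numbers on that, here is a small Python sketch of the usual 8-bit relations, PSNR = 10*log10(255^2 / MSE), and the dB form of SSIM, -10*log10(1 - SSIM). The function names are my own. That the bracketed SSIM figure uses this second formula is an assumption inferred from the numbers, but both formulas reproduce the values in the log excerpt above to within rounding.

```python
import math


def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for 8-bit samples: 10 * log10(peak^2 / MSE)."""
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


def ssim_to_db(ssim):
    """dB form of an SSIM score: -10 * log10(1 - SSIM)."""
    if ssim >= 1.0:
        return math.inf
    return -10.0 * math.log10(1.0 - ssim)


# Inputs taken from the log excerpt above; compare with the logged
# psnr_y and bracketed SSIM values for the same frames.
print(round(psnr_from_mse(11.49), 2))    # mse_y of n:1186
print(round(psnr_from_mse(4086.30), 2))  # mse_y of n:1187 (the blip)
print(round(ssim_to_db(0.688246), 4))    # SSIM All of n:1187
```

    An MSE of about 4086 means the average luma error is around 64 code values out of 255 per pixel, which is why 12 dB reads as "a different image".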

    Something is wrong here. I suspect something like pixel shifting on the scene change. Something like that looks identical, but causes PSNR or SSIM to say it's a completely different image. I didn't have time to investigate it more, but it might be worth while checking if Turing or HEVC or other cards are affected, or on other samples
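    To illustrate why a pixel shift is a plausible suspect (it is only a guess, not a diagnosis), here is a toy Python sketch. It builds a synthetic grainy frame, shifts it right by one pixel, and measures PSNR against the unshifted version. The frame is random noise invented for the demo, not data from any of the encodes.

```python
import math
import random


def psnr(a, b, peak=255.0):
    """PSNR in dB between two equally sized 2-D lists of 8-bit samples."""
    diffs = [
        (x - y) ** 2
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    ]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)


def shift_right(img, pixels=1):
    """Shift every row right by `pixels`, repeating the left edge sample."""
    return [[row[0]] * pixels + row[:len(row) - pixels] for row in img]


random.seed(0)
# Hypothetical stand-in for a noisy source: mid-grey plus strong grain.
frame = [
    [min(255, max(0, int(random.gauss(128, 40)))) for _ in range(64)]
    for _ in range(64)
]

same = psnr(frame, frame)
shifted = psnr(frame, shift_right(frame))
print(same, round(shifted, 2))
```

    The shifted frame would look the same to a viewer, but because the grain no longer lines up pixel for pixel, the score collapses. On a clean, flat image the same shift would cost far less, so the effect depends heavily on how noisy the content is.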







    VMAF comments -

    VMAF has many issues ; it's discussed in other threads and forums, as well as the Netflix blog post and instructions but here is a small summary:

    Some pros:

    1) Other metrics like ssim and psnr are single frame only. vmaf uses some temporal analysis

    2) Open sourced now, and updated. When an obvious "wrong" result is encountered , you can upload and they will analyze it, hopefully improve it

    3) Doesn't make as many clear mistakes as SSIM/PSNR. But in this test it moved basically in the same direction as SSIM. This is unexpected for VMAF and I'll probably file a report


    Some cons: issues that I don't like (there are more, but these bother me especially)

    1) Other metrics like ssim and psnr are single frame only. vmaf uses some temporal analysis (yes, it can be both good and bad)

    2) Current models do not take color information into account. You could offset U,V bitrate into Y to artificially bump VMAF scores as a "cheat"

    3) some vmaf implementations currently have issues esp. ffmpeg libvmaf

    -ffmpeg libvmaf does not match official scores (vmafossexec and vapoursynth vmaf keep values in float accuracy)
    -has other issues with bit depth conversions (the conversions done by ffmpeg are ok, but the way it's analyzed in ffmpeg libvmaf are not)

    4) doesn't register a "perfect" 100 when testing a video against itself, or against a lossless encode

    5) limited range of testing and applicability (the current VMAF models were trained from CRF 22 to CRF 28 ). It's more applicable for "lower quality" bitrate scenarios . RD plots tend to plateau early and it does not differentiate well between higher quality encodes - they all look the "same" to VMAF, even when (some) human eyes can see the differences even at a distance. So they have a confidence interval calculation included

    In some sense, it's true - eg. CRF 20 and CRF 12 encode might look the same from the regular "3H" viewing distance and setup in motion under normal conditions. That's the setup and premise VMAF is based on. What does "typical audience" perceive under normal viewing conditions.

    => My theory is that it's a tool developed for Netflix to justify bitrate starving their streams . "Well, 95% of viewers can't see the difference so drop the bitrate some more"
    Quote Quote  
  28. Dinosaur Supervisor KarMa's Avatar
    Join Date
    Jul 2015
    Location
    US
    VMAF seems to just keep pumping out crap stats, but since it's associated with Netflix it gives weight to the results and allows for crazy headlines to exist. It pretty much says that my AMD encoding was just about as good as the x264 very slow. Like at this rate why doesn't Netflix just use AMD cards and be done with it, if VMAF is to be trusted. Could encode their entire catalog for 10% the cost and time.

    I'm not sure what to make of your posted stats besides it took a lot of time so thank you for that. Also it was VCE 3.4 not 4.3. 4.0 is the newest one but they don't even have a 4.3 yet . Another note, my AMD encodings come with 100kbit of extra error detection data and I have no idea how to turn it off. So the real video data rate is like 7500kbit according to mediainfo. Not that it matters much.
    Quote Quote  
  29. Originally Posted by KarMa View Post
    VMAF seems to just keep pumping out crap stats, but since it's associated with Netflix it gives weight to the results and allows for crazy headlines to exist. It pretty much says that my AMD encoding was just about as good as the x264 very slow. Like at this rate why doesn't Netflix just use AMD cards and be done with it, if VMAF is to be trusted. Could encode their entire catalog for 10% the cost and time.
    vmaf is not that bad overall, for what its purpose and intended usage is; it's a mixed bag of good/bad with these 1st gen models. The grain/noise in this sample is downscaled grain/noise, and I think it's having problems with it (even though it's a Netflix sourced sample). I might write it up and report it.

    I've done other tests where it clearly did better (ones where PSNR typically fails), and correlation was higher

    It's not included in these results, but I did test PSNR-HVS-M on a few, and it "failed" here too. That was generally considered the "best" metric that takes into account subjective/perceptive differences before VMAF (some still think it's better than VMAF)

    Bottom line - you can't trust them blindly - make sure you look at other measures, other data, and your eyes

    I'm not sure what to make of your posted stats besides it took a lot of time so thank you for that. Also it was VCE 3.4 not 4.3. 4.0 is the newest one but they don't even have a 4.3 yet . Another note, my AMD encodings come with 100kbit of extra error detection data and I have no idea how to turn it off. So the real video data rate is like 7500kbit according to mediainfo. Not that it matters much.
    The data rates in the chart were Total bitrate based on filesize; so including everything and container overhead . That reflects the actual payload in a delivery/streaming situation. Demuxing the ES and reporting that would probably be the "best" way to report video stream only bitrate
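    For reference, a total bitrate "based on filesize" is just file size divided by duration. A minimal Python sketch follows. The function name and the example numbers are made up, and it assumes 1 kbit = 1000 bits.

```python
def total_bitrate_kbps(file_size_bytes, duration_seconds):
    """Average total bitrate in kbit/s (1 kbit = 1000 bits, an assumption).

    Counts everything in the file: video, audio, extra data and
    container overhead.
    """
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return file_size_bytes * 8 / duration_seconds / 1000


# Hypothetical numbers, not taken from the chart: a 57 MB file lasting 60 s.
example = total_bitrate_kbps(57_000_000, 60)
print(f"{example:.0f} kbit/s")
```

    A video-only figure would need the size of the demuxed elementary stream in place of the whole file size.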

    That scenechange issue that PSNR/SSIM is picking up is bugging me . I have an older Maxwell I might try and see if it does the same thing.
    Quote Quote  
  30. Dinosaur Supervisor KarMa's Avatar
    I just remember this being posted here not so long ago on Turing vs x264, with x264 fast beating out x264 slow by a wide margin. And then Turing beating everything. Since this was posted to that website, I've seen people on different forums using these metrics as evidence.

    https://forum.videohelp.com/threads/392427-Turing-NVENC-on-RTX-series
    https://unrealaussies.com/tech/nvenc-x264-obs/
    Quote Quote  


