Howdy, heard a lot about Turing NVENC, especially with the OBS update recently it's been the talk of the town. So I've done some testing and thought I would share:
Long story short, according to VMAF quality scoring, Turing AVC beats x264 in pretty much any circumstance. Its HEVC is approximately as good as x265 Slow. Here's an example FFMPEG command for a 1080p video source to H.264:
Code:
ffmpeg -i Original.mkv -c:v h264_nvenc -preset:v slow -profile high -level 4.2 -b:v 6000k -rc-lookahead:v 32 -g 480 -an -f matroska Turing_MaxQ_6000.mkv
If you have one of these cards, take it for a test!
Edit: It seems like you are comparing NVENC HEVC to x264, which was not very clear in your graph and made me assume you were comparing AVC to AVC encoders. If this is NVENC HEVC then this graph is more believable. You really need to make that clearer in your post instead of just saying "Turing" or "NVENC", as these cards support both AVC and HEVC. And most streamers today still need to use AVC/H.264 encoders, which your graph will not be valid for, and they are the very people this website is geared toward.
Last edited by KarMa; 4th Mar 2019 at 03:32.
One of those has just AVC, the other has both AVC and HEVC included. Just look closer and you'll see it's all written there, there's no trick. The FFMPEG commands are included for you to try out yourself if you don't believe me.
What makes you doubt that x264 Medium gives better VMAF than x264 Slow? Have you tried? I realise that it's not intuitive, I get it, but the images don't lie. When you take one single source video and encode it in FFMPEG changing only ONE SINGLE VARIABLE and that's the -preset variable, and you do it at 15 different bitrates, at what point do you still believe intuition over evidence?
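For anyone wanting to reproduce that kind of sweep, here is a minimal sketch that only builds the FFMPEG command lines, one per preset/bitrate pair, with everything else held constant. The bitrate ladder, the output file names, and the helper name build_sweep are assumptions for illustration; the actual 15 bitrates used in the test are not listed in this thread.

```python
from itertools import product


def build_sweep(source, presets, bitrates_kbps, codec="libx264"):
    """Return one ffmpeg argument list per (preset, bitrate) pair.

    Only the preset and the bitrate vary; every other option is fixed.
    """
    commands = []
    for preset, kbps in product(presets, bitrates_kbps):
        out_name = f"{codec}_{preset}_{kbps}.mkv"  # hypothetical naming scheme
        commands.append([
            "ffmpeg", "-i", source,
            "-c:v", codec,
            "-preset:v", preset,
            "-b:v", f"{kbps}k",
            "-an", "-f", "matroska",
            out_name,
        ])
    return commands


# Hypothetical ladder: 15 bitrates from 1000k to 15000k in 1000k steps.
ladder = list(range(1000, 16000, 1000))
sweep = build_sweep("Original.mkv", ["fast", "medium", "slow"], ladder)

print(len(sweep))           # 3 presets x 15 bitrates
print(" ".join(sweep[0]))   # first command, ready to paste into a shell
```

Each list can be handed to subprocess.run as-is, so the only thing that differs between two encodes at the same bitrate is the preset.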
I get it, NVENC has been touted as an amazing thing since Kepler came out but it wasn't all it was cracked up to be. Maxwell came and x264 was still better. Pascal came and x264 was still better. It's an old story full of empty promises. I was there, I remember the hype and the let-down. I 100% felt all the doubt when NVIDIA was touting Turing NVENC as superior, I doubted it all the way, and I set out to prove it. I ended up proving myself wrong, and them right.
Turing slow preset absolutely does score better on VMAF than x264 Fast, Medium, Slow (and VerySlow but that's not on the same graph) for both Overwatch footage and Apex Legends footage. Will it do the same for IRL footage? I dunno, wanna help me find out? The best theory I have for why Medium beats Slow, is because the search method x264 uses is constructing a format that is 15 years old, and it disagrees with NVENC on what constitutes quality per bit. It would be a MIRACLE if they didn't disagree being so far apart on the timeline of history.
I stand by what I've done, I haven't lied with my results. I did it with Overwatch 1440p60 footage and then again with Apex Legends 1080p60 footage. Please, if you have evidence to the contrary, just show us. We're all here to find information. If you find a game that reverses these results, then the next thing I'll wanna do is figure out why the difference? But if you just wanna say "it can't be, there's no way, that's impossible" then I can't do anything about that, your mind is already set in the face of the VMAF filter's output.
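If you want to check the scores yourself, a rough sketch of the scoring step follows. It assumes an FFMPEG build with libvmaf enabled, and it only builds the command rather than running it. The helper name vmaf_command and the log file name are made up for illustration, and the model file is left at the build's default because the option for choosing one differs between FFMPEG versions.

```python
def vmaf_command(distorted, reference, log_path="vmaf.json"):
    """Build an ffmpeg command that scores `distorted` against `reference`.

    The libvmaf filter expects the distorted clip as the first input and
    the reference clip as the second one.
    """
    return [
        "ffmpeg",
        "-i", distorted,
        "-i", reference,
        "-lavfi", f"libvmaf=log_path={log_path}:log_fmt=json",
        "-f", "null", "-",  # discard the video output, we only want the score
    ]


cmd = vmaf_command("Turing_MaxQ_6000.mkv", "Original.mkv")
print(" ".join(cmd))
```

Both clips need the same resolution and frame rate for the comparison to make sense, so scale the encode back to the source size first if they differ.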
Spoiler alert: x265 has the same counter-intuitive difference between Medium -> Slow -> Slower -> VerySlow presets. Although in x265 it's "slow" that scores the best, not "medium".
Please also see my response to your post in the other thread, regarding test sources and Turing in general.
It's as good an explanation as I've ever heard. And it does seem intuitive the way you describe it. As though increasing the search range means it finds more things to encode, and then the average amount of bits that goes to the OBVIOUS improvements is lowered to allocate some to the "subtle" improvements. Something like that?
I just read your other response and it's sick! I will reply now to keep the same convo in the same thread.
Thanks for sharing the tests. A few posts in different forums are showing similar results. Very promising for Turing and Nvidia.
You have -g 480 in the 1st post, but it differs from the -c:v h264_nvenc command line used in your blog. Can you clarify what was used?
So far only gaming results, and realtime 1pass ABR tests (I realize this is for the fast realtime streaming scenario). Would love to see different source materials, different rate control methods, actual bitstreams.
In fact, I would argue that if you were doing an archival quality encode, without the desire to use lossless, or you were going for a high quality encode that would be distributed via large physical media, such as BD50, then you're probably better off using fast since that is the slowest preset that doesn't use any of that psycho-visual "enhancement" crap.
And you could argue "AQ" falls under the perceptual enhancement modification category.
Or how about just disabling those specific options, if you didn't want them?
FFMPEG command in the above OP includes it, because I realised later it helps with VMAF score by about 0.5-1.0 % and wanted to share.
Here is a teaser of the method variety, I'm currently doing VP9 versions but I've finished x265. Why does it do this? Why is VerySlow so bad, then come so good? Why does Slow somehow beat Medium AND Slower? So many questions...
[Attachment 48268]
A VMAF of 95 or higher looks very close to the original. This graph sits higher up the scale than the H.264 ABR one, which caps around 91.
But as bit rate is allowed to balloon, you reach a point of diminishing returns and eventually the 2 lines, faster preset vs slower preset, if graphed, would meet and cross.
Think about it, if you take a 1080p lossless source and encode it with only 5 MB/s, then yes, one would expect the "very slow" preset to be higher quality than the "very fast" preset because this is a bit rate starved encode and the "bit budget" is at a premium. But take that same source and this time encode it with 50 MB/s, at this bit rate you are likely at the visually-lossless-relative-to-the-source-regardless-of-preset point and so the preset that throws away more data will result in the lower quality, i.e. it's throwing away data needlessly.
The reason I say to use at max the "fast" preset vs a slower preset with the above disabled is because the x264 developers have said that things like psy-rd only have meaning in the context of sub-me 6 and higher and sub-me 6 is first used by the "medium" preset.
I especially recommend using at max the "fast" preset if you are doing any filtering; I see no point in spending time editing a video, applying filters to get the look you want and then choose a setting where you know the encoder is going to sit and try to "optimize" each frame to emphasize parts of the frame it thinks are more important than others, which is how all of the psy "enhancements" work.
This is like taking a car, giving it a custom paint job with some fancy graphics, polish, and multiple coats of paint, spending thousands of dollars and hundreds of hours getting it just right, and then sending it to a car wash that is known to add chrome highlights and pin stripes because they think it somehow "enhances" a car's appearance.
The model that I used in the test was specifically for 1080p where the viewer is sitting 3x as far away from the monitor as the monitor is tall. If the monitor is 20cm tall, and the user is sitting 60cm away, it's accurate. If the user is closer, then ALL results drop down, but the closer to 50, the faster they drop. Note that I don't include any low-bitrate results that are below 50.
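As a quick sanity check of your own setup against that 3x rule, something like the following would do. The function name and the 5% tolerance are arbitrary choices for this sketch and are not part of the VMAF model.

```python
def matches_model_distance(screen_height_cm, viewing_distance_cm,
                           ratio=3.0, tolerance=0.05):
    """True if the viewer sits about `ratio` screen heights away.

    `tolerance` is a relative margin picked for this sketch only.
    """
    actual = viewing_distance_cm / screen_height_cm
    return abs(actual - ratio) <= tolerance * ratio


print(matches_model_distance(20, 60))  # the 20cm / 60cm example above
print(matches_model_distance(20, 40))  # sitting closer than the model assumes
```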
CRF 22 absolutely does not get 100 in this model, and it's the model recommended by Netflix's lead dev on the project. On this model, HEVC made with the x265 VerySlow preset on CRF 22 gets 96.6 at 19978kbps, while the Slower preset with everything else the same scores 96.5 and uses 20140kbps, about 1% more bits.
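The "about 1%" figure can be checked from the two bitrates quoted above. This is just the arithmetic, with a throwaway helper name:

```python
def percent_more_bits(kbps_a, kbps_b):
    """How many percent more bits kbps_b spends compared to kbps_a."""
    return (kbps_b - kbps_a) / kbps_a * 100.0


veryslow_kbps, veryslow_vmaf = 19978, 96.6
slower_kbps, slower_vmaf = 20140, 96.5

extra = percent_more_bits(veryslow_kbps, slower_kbps)
print(f"Slower spends {extra:.2f}% more bits "
      f"and scores {veryslow_vmaf - slower_vmaf:.1f} VMAF lower")
```

It comes out a little under 1%, so Slower spends slightly more bits for a slightly lower score.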
96 means that subjectively, out of every 100 people, only 4 are predicted to rate the video as not as good as the original, with a 95% confidence interval. It can be trained on a different viewing distance or different resolution if you like, which gives a different model file, which gives different scores. I have model files in my possession that are provided by Netflix which are designed for 4k monitors on a desk at 50cm from the viewer's face, and they don't score 98 at 200Mbps. Wherever you heard about the CRF22 mapped to 100 thing, it was with a different model file, and it was either from long ago and is no longer used, or it wasn't one provided by Netflix. It may have been somebody custom training their own model file instead?
It was Netflix, and yes, in the context of viewing 60cm away, for the average user. So you have to be careful how it's interpreted and in what situations you use the metric.
(And sometimes you don't get "100" when testing source against itself)
Netflix: VMAF has been trained using encodes spanning from CRF 22 @ 1080 (highest quality) to CRF 28 @ 240 (lowest quality). The former is mapped to score 100 and the latter is mapped to score 20. Anything in between is mapped in the middle (for example, SD encode at 480 is typically mapped to 40 ~ 70). Does this help?
Whereas the "Near lossless" or "visually lossless" term that sophi used is also relative - but it's usually reserved for post production in the context of high bitrate intermediates like ProRes and CineForm. Completely different scenario. To achieve those, you'd be using much lower CRF values.
Can OP post an untouched source, and sample H264 RTX encodings?