Happy New Year.
I finally had some time and got around to doing a bunch of tests that I have wanted to do for a while.
Any of you who have read my posts over the years know that there are a few things that I have always believed:
1) Most, if not all, so-called psycho-visual optimizations border on being snake oil.
2) CRF is way over rated.
3) I can't stand bit-rate starving encodes. I have always said, and it's the truth, that with enough bit rate all encoders will do a good job, and frankly most home users, considering how cheap storage is, are better off just using more bit rate rather than trying to squeeze out extra quality by using more "optimizations".
But there is another group, the professional content creator that runs a streaming video service and wants to be able to offer the highest quality video possible at bit rates that will not kill the company's profits.
These people might be tempted to spend the additional processing power to encode using more aggressive settings, but for most people it's a waste of time.
Consider the data on this page by the Handbrake people:
https://handbrake.fr/docs/en/1.5.0/technical/performance.html
If you look at the charts where they tested x264 and x265 at CRF 24 with various presets, you will see that there were relatively small changes in the amount of bit rate used once you get past the ultrafast preset for x264.
The significance is that CRF is supposed to be quality based encoding, where you set a desired quality, and the encoder tries to hit that quality using as much bit rate as it needs.
Ignore the obvious logical fallacy in the above belief, because these same people will tell you that objective metrics, such as psnr and ssim, can't be trusted, yet they trust the CRF algorithm to determine quality, oblivious of the fact that all encoders use psnr internally to determine quality and x264 uses psnr, ssim and vqm internally.
For those of you wondering why I dislike the notion of psycho-visual optimizations, consider the following scenario:
Say there are two pizzerias in town, one makes a great pizza but is very expensive and has long wait times, the other makes a decent pizza but is cheap and there are no wait times.
The owner of the decent pizzeria comes up with what he considers a brilliant idea: he's going to "optimize" pizza. He reasons that since most people grab a slice from the crust end and start eating from the other wedge end, he is going to create 2 sauces. One is going to be a cheap sauce that is barely one step above tomato paste with water and salt and one is going to be a fantastic gourmet sauce made with the finest tomatoes, best spring water money can buy, best spices, finest olive oil, cooked to perfection.
He's also going to do the same thing with the cheese, he will have 2 types of pizza cheese, one that is the cheapest stuff he can find on sale from the local supermarket and one that is extremely high end.
Now that he has that, he starts making his pizza pies as follows, in his mind he creates a logical segment in the rolled out dough, in the inner circle he uses the high quality sauce and cheese and in the outer circle he uses the cheap stuff.
He reasons that by doing this, most people will be wowed by the first couple of bites and still have the after taste in their mouth so they won't notice the sudden drop in flavor and quality as they work their way through the slice.
And for a while this works, but eventually people start noticing, word gets around and people are complaining.
So the pizza owner decides to "optimize" even further, and instead of segmenting a pizza into two logical circles, he segments it into four smaller logical circles, so that now there is overlap between the regions of great and decent and it tends to even out most of the noticed shortcomings.
But he decides to take it up a notch and apply this concept to the toppings as well, so he uses two different types of mushrooms, one that is barely above the fungus that grows in wood rot and the other a high end Oyster mushroom, and he alternates pieces of mushroom so that in each bite the average person is getting both mushrooms.
He does the same with the other toppings, and thanks to this convoluted pizza making process, it takes more time to prepare, so the wait times have increased and it costs more because he now has to use multiple suppliers and multiple people to make the pizzas.
Yet people now claim that he has the best pizza in town.
In a nutshell, this is what the x264 developers did and the x265 people have taken it even further, to create these monstrosities of code bases, with numerous "optimizations", that just over-analyze the video and jump through hoops to redistribute the bit rate.
Maybe this is why I find the svt family of encoders so appealing, they seem like a much cleaner, simpler design, much fewer options, simpler optimizations, simpler implementations, but the results are impressive.
Hope you find the test encodes that are attached useful.
-
Last edited by sophisticles; 31st Dec 2022 at 22:43.
-
The significance is that CRF is supposed to be quality based encoding, where you set a desired quality, and the encoder tries to hit that quality using as much bit rate as it needs.
Ignore the obvious logical fallacy in the above belief, because these same people will tell you that objective metrics, such as psnr and ssim, can't be trusted, yet they trust the CRF algorithm to determine quality, oblivious of the fact that all encoders use psnr internally to determine quality and x264 uses psnr, ssim and vqm internally.
A bit simplified (not taking adaptive quantization into account and assuming that each macroblock in a frame is encoded with the same quantizer) I think one could say:
CRF is not an absolute quality, it's more like giving a quantizer (I assume it's understood what quantization is) and telling the encoder to aim for that average quantizer but to allow fluctuations from it to accommodate the complexity of different frames/scenes. At least in x264, the decision-making as to whether the quantizer is increased or decreased is the same as that used during 2pass encoding. I think the main problem is that usually quantizer choice is referred to as quality level to simplify things.
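To make the "aim for a quantizer, but let it fluctuate with complexity" idea concrete, here is a toy Python model. It is only a sketch: the function names, the `base_complexity` reference value and the `qcomp=0.6` default are assumptions for illustration, and real encoders layer adaptive quantization, MB-tree and other adjustments on top of anything this simple.

```python
import math

def qp_to_qscale(qp):
    # Toy mapping between a quantizer parameter and a linear quantizer scale:
    # every +6 in QP doubles the scale.
    return 0.85 * 2 ** ((qp - 12) / 6)

def qscale_to_qp(qscale):
    # Inverse of qp_to_qscale.
    return 12 + 6 * math.log2(qscale / 0.85)

def frame_qp(complexity, crf, base_complexity=1.0, qcomp=0.6):
    """Return the quantizer a frame would get in this toy model.

    A frame whose complexity equals base_complexity gets exactly the CRF
    value; more complex frames get a higher quantizer, simpler frames a
    lower one. qcomp controls how strongly complexity moves the quantizer.
    """
    rate_factor = base_complexity ** (1 - qcomp) / qp_to_qscale(crf)
    qscale = complexity ** (1 - qcomp) / rate_factor
    return qscale_to_qp(qscale)

if __name__ == "__main__":
    for complexity in (0.25, 1.0, 4.0):
        print(complexity, round(frame_qp(complexity, crf=23), 2))
```

The point of the sketch is only that the CRF value anchors the quantizer for an "average" frame, while the actual per-frame quantizer moves around it, which is why a CRF number by itself says little about absolute quality.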
About your pizza analogy, to me, it's more like the pizza owner with the wonderful pizza has to lower the price of the pizza (since customers don't come since they can't afford the original price) and now has to come up with a way to lessen the costs while trying to keep the quality of the pizza as close as the original so that most customers still like it and can afford the pizza.
What he then does is use heuristics that tell him which flavors people recognize, and list what it costs him to achieve all the flavors he currently has on the pizza.
He notices that some subtle flavors, which lots of people do not really seem to notice, cost him quite a lot of money to produce.
So he leaves out some of those flavors in some cases, which lowers his costs and thus allows him to lower the price so more customers can afford the pizza.
So everything here stands or falls on how accurate those heuristics are.
In video compression, a few things usually happen thanks to psycho visual tuning:
a. we use 4:2:0 since the human eye is more sensitive to brightness than it is to color. So halving the required data that needs to be processed ended up being the norm. (luckily more formats nowadays support 4:4:4)
b. we got different sorts of adaptive quantization and rate factor optimizations, for which some of them ended up being labeled 'psycho-visual optimizations' in x264&co.
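The "halving" in point a. can be checked with a little arithmetic. This is a minimal Python sketch; the function name and the 3840x2160 frame size are assumptions for illustration, and it only counts raw samples before any compression.

```python
def samples_per_frame(width, height, subsampling):
    """Count luma plus chroma samples in one uncompressed frame."""
    # (horizontal divisor, vertical divisor) applied to each chroma plane
    chroma_divisors = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}
    h_div, v_div = chroma_divisors[subsampling]
    luma = width * height
    # Two chroma planes (Cb and Cr), each possibly at reduced resolution
    chroma = 2 * (width // h_div) * (height // v_div)
    return luma + chroma

if __name__ == "__main__":
    full = samples_per_frame(3840, 2160, "4:4:4")
    for mode in ("4:4:4", "4:2:2", "4:2:0"):
        count = samples_per_frame(3840, 2160, mode)
        print(mode, count, count / full)
```

In 4:2:0 each chroma plane is stored at half the width and half the height, so the two chroma planes together hold half as many samples as the luma plane and the frame as a whole has half the samples of 4:4:4.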
The main problem I see is that folks tend to be lazy and start to assume that the settings which (seemed) optimal for one source should also be optimal for others. But if you got a setting which simply drops fine details in dark places, this might be fine if those details are not really important. The problem is if you create a file that you later want to further process by increasing the brightness, for example during playback, those details that normally the human eye probably would have missed could now gain in importance and thus signify a noticeable loss.
(additional laziness is to assume that enabling more options will always produce better results)
In a nutshell, this is what the x264 developers did and the x265 people have taken it even further, to create these monstrosities of code bases, with numerous "optimizations", that just over-analyze the video and jump through hoops to redistribute the bit rate.
Cu Selur
users currently on my ignore list: deadrats, Stears555, marcorocchini -
1) Most, if not all, so-called psycho-visual optimizations border on being snake oil.
Does anything in your post, or previous posts, provide any support for that opinion ?
Which psy-vis opts do you consider "not snake oil" ? or "low-fat snake oil" ?
Are the SVT-HEVC tune parameters being passed in your tests ?
The likelihood of groups of 3 different tunes (each triplet using the same preset 5 or preset 11) resulting in 3 files with identical filesizes is extremely low.
If they were being passed, then I would consider SVT-HEVC tuning snake oil (TM) too, because the tunes are doing nothing
2) CRF is way over rated.
In what way is it "over rated" ? Who "rated" it ? Where is the "rating" ?
Did you evaluate the alternative(s) ? How were they "rated" ?
Ignore the obvious logical fallacy in the above belief, because these same people will tell you that objective metrics, such as psnr and ssim, can't be trusted, yet they trust the CRF algorithm to determine quality, oblivious of the fact that all encoders use psnr internally to determine quality and x264 uses psnr, ssim and vqm internally.
CRF is not an absolute measure of quality
In contrast, metrics like PSNR, SSIM are measures of "quality" or similarity to the source. But PSNR, SSIM have low correlation with human perception - that's a fact, not an opinion
Metrics like PSNR, SSIM, are used internally by many codecs because there are not very many other options to "measure" quality quickly for codec testing. Low positive correlation is better than no correlation, or negative correlation. The trends are more important than the absolute value (unless you're testing "losslessness"). VMAF wasn't present during x264 development (VMAF came out in 2016), and V1 was very slow (V2 slightly faster). VMAF isn't perfect either , but it's arguably "less bad" than PSNR, SSIM
3) I can't stand bit-rate starving encodes
To put it into perspective , Netflix uses ~4-5Mb/s for 1080 , ~12-15Mb/s for UHD (and those are considered on the low end, some would call them "bitrate starved") . Disney is typically a bit higher, ~7-8 for 1080, ~16-18 for UHD -
Hello!
A fun test, but where is the lossless original? And it is much easier to see changes in a 4 sec clip in a playlist. And why 4k? You are limiting your audience -
@PDR:
The point of the encodes is to see at what point the encoders fall apart, i.e. what is the lowest bit rate that can be used and still provide decent quality. In previous tests I have done, you are one of the people that claimed my tests were flawed because I used "too much" bit rate.
The psy-vis opts are snake oil because they barely do anything. Look at the Handbrake link I provided: they encoded the same file using different presets, which use different psy-vis opt levels, and the file sizes are within 10% of one another.
But I am going to redo them myself, with both x264 and x265 and post the samples here.
I have no way of knowing if the tunes are being passed to the svt-hevc encoder. There do seem to be minor differences between the various tunes; it may be that they don't balloon the bit rate, maybe they use the same bit rate but distribute it a bit differently.
For me, I love the svt-hevc preset 5 tune psnr/ssim encode.
In contrast, metrics like PSNR, SSIM are measures of "quality" or similarity to the source. But PSNR, SSIM have low correlation with human perception - that's a fact, not an opinion
I really want to point out that PSNR literally means Peak Signal to Noise Ratio and is a measure of how much of the transmitted signal is received vs how much "noise" or non-signal is received.
The "fact" that it has low correlation with human perception does not invalidate the metric, it says more about human preferences.
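For readers who have not seen the metric written out, this is roughly what PSNR computes. It is a minimal Python sketch that works on flat lists of 8-bit sample values; the function name and the `peak=255` default are assumptions for illustration, and real tools work per plane and per frame on decoded video.

```python
import math

def psnr(reference, distorted, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error between the source and the encoded samples
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)

if __name__ == "__main__":
    source = [100, 100, 100, 100]
    encoded = [101, 99, 101, 99]
    print(psnr(source, encoded))
```

Note that the formula only looks at the size of the per-sample error. It has no notion of where in the picture the error sits or whether a viewer would notice it, which is the part of the argument about correlation with perception.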
For instance, Selur said "we use 4:2:0 since the human eye is more sensitive to brightness than it is to color. So halving the required data that needs to be processed ended up being the norm. (luckily more formats nowadays support 4:4:4)" but I tend to favor video with good brightness, high contrast and bright colors, i.e. I place equal importance on all three. -
Yes you can't satisfy everyone...
Difficult to determine "at what point" with only 1 data point.
I'm not saying you should do this (very time consuming), but most people use a range of bitrates to determine those relationships (RD curves). Plot them if you like metrics
https://rigaya.github.io/vq_results/
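As a sketch of what working with an RD curve looks like in practice: encode at several bit rates, measure a metric for each encode, then compare encoders at the same bit rate by interpolating along each curve. The Python below assumes you already have (bit rate, score) pairs; the function name is hypothetical and the numbers in the example are made up purely to show the call, they are not measurements.

```python
import math

def quality_at(points, bitrate_kbps):
    """Interpolate a metric score at bitrate_kbps from (bitrate, score) pairs.

    Interpolation is linear in log(bitrate), which is how RD curves are
    usually plotted. Bit rates in points are assumed to be distinct.
    """
    pts = sorted(points)
    if len(pts) < 2:
        raise ValueError("need at least two measured points")
    if not pts[0][0] <= bitrate_kbps <= pts[-1][0]:
        raise ValueError("bitrate outside the measured range")
    for (b0, q0), (b1, q1) in zip(pts, pts[1:]):
        if b0 <= bitrate_kbps <= b1:
            t = (math.log(bitrate_kbps) - math.log(b0)) / (math.log(b1) - math.log(b0))
            return q0 + t * (q1 - q0)

if __name__ == "__main__":
    # Made-up placeholder points, NOT real results
    curve = [(1000, 30.0), (4000, 40.0)]
    print(quality_at(curve, 2000))
```

With only one encode per encoder there is nothing to interpolate, which is the problem with trying to find "the point where it falls apart" from a single data point.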
The psy-vis opts are snake oil because they barely do anything. Look at the Handbrake link I provided: they encoded the same file using different presets, which use different psy-vis opt levels, and the file sizes are within 10% of one another.
I have no way of knowing if the tunes are being passed to the svt-hevc encoder. There do seem to be minor differences between the various tunes; it may be that they don't balloon the bit rate, maybe they use the same bit rate but distribute it a bit differently.
For me, I love the svt-hevc preset 5 tune psnr/ssim encode.
So you love all SVT-HEVC tunes in that test group, because PSNR says all 3 are the same . Snake oil (TM) indeed. That's one thing PSNR is really good for
Note that to "love" an encode is not necessarily the same as "looks most similar to the source".
In contrast, metrics like PSNR, SSIM are measures of "quality" or similarity to the source. But PSNR, SSIM have low correlation with human perception - that's a fact, not an opinion
The "fact" that it has low correlation with human perception does not invalidate the metric, it says more about human preferences. -
I did a test to get the best 1080p for youtube from a lossless clip. A 1440p export will kick in the hevc, but it looked worse in 1080p than a native 1080p upload that runs in regular avc.
Last edited by Mike_B; 1st Jan 2023 at 12:31.
-
I believe it came from here: https://mango.blender.org/download/ Probably the 4K DCP version.
Last edited by jagabo; 1st Jan 2023 at 20:38.
-
The source is a massive 186 GB, downloadable here:
https://media.xiph.org/tearsofsteel/tearsofsteel-4k.y4m.xz -
-
Just to further show what a waste all the psy optimizations are, here are a bunch of CRF encodes done using x264 and x265 at various presets.
They basically parallel the results shown on Handbrake's performance page.
On the bright side, it does have the potential to save people lots of money by showing that using slower presets is a waste of time and thus there is no need to buy new hardware.
Those psy optimizations are some real good snake oil. -
I can't stand bit-rate starving encodes,...
On the bright side, it does have the potential to save people lots of money by showing that using slower presets is a waste of time and thus there is no need to buy new hardware.
=> I don't get your argumentation and comparison method at all.
Cu Selur -
You are confused .
Psycho-visual optimizations are not for "performance" - they do not refer to speed presets. 2 different things
For example psy-rd, psy-trellis in x264; or psy-rdo in x265 are psycho-visual optimizations
https://x265.readthedocs.io/en/latest/cli.html#psycho-visual-options
Speed presets like faster, slower etc... are generic performance speed vs. quality tradeoff settings
https://x265.readthedocs.io/en/latest/presets.html
Both have an effect on the actual image, at a given bitrate (e.g. 2pass encode)
When using CRF encoding, the visual difference is usually less (the filesize will adjust , but there will still be a visible quality difference between say very slow and very fast) .
The amount of bitrate the file ends up with does not necessarily indicate anything about the quality. (Remember, CRF is not an actual measure of "quality"). You can have a larger filesize, yet worse quality -
CRF 0 may also not work on your Smart TV, but CRF 1 will. If I remember correctly, CRF 0 does not provide a thumbnail in Windows.
Has anyone grabbed any stills from the test where the differences can be seen?
Last edited by Mike_B; 2nd Jan 2023 at 04:09.
-
As I already explained, during past encoding tests I have received numerous comments claiming my tests were invalid because I was using too much bit rate and so naturally all encodes would look good. I was told that a so-called proper test should be at low bit rates to see at what point each encoder falls apart.
If you check the files, you will see the crf 40 encodes are x264 and the crf 34 encodes are x265; I was trying to get roughly the same bit rate for both encoders. -
Okay, so you jumped from one extreme to the other.
If you check the files, you will see the crf 40 encodes are x264 and the crf 34 encodes are x265; I was trying to get roughly the same bit rate for both encoders.
File Type: mkv tearsofsteel-4k x264 sf crf 40.mkv (148.86 MB, 1 views)
File Type: mkv tearsofsteel-4k x264 vf crf 40.mkv (125.97 MB, 1 views)
File Type: mkv tearsofsteel-4k x264 faster crf 40.mkv (142.98 MB, 1 views)
File Type: mkv tearsofsteel-4k x264 fast crf 40.mkv (142.41 MB, 2 views)
File Type: mkv tearsofsteel-4k x264 medium crf 40.mkv (142.52 MB, 1 views)
File Type: mkv tearsofsteel-4k x264 slow crf 40.mkv (141.07 MB, 3 views)
File Type: mkv tearsofsteel-4k x265 vf crf 34.mkv (202.06 MB, 3 views)
File Type: mkv tearsofsteel-4k x265 faster crf 34.mkv (202.05 MB, 2 views)
File Type: mkv tearsofsteel-4k x265 fast crf 34.mkv (210.16 MB, 0 views)
File Type: mkv tearsofsteel-4k x265 medium crf 34.mkv (162.24 MB, 0 views)
File Type: mkv tearsofsteel-4k x265 sf crf 34.mkv (146.41 MB, 2 views)
So this shows that slower presets with a fixed crf target likely create smaller file sizes.
(since crf with different presets will do totally different things it doesn't say anything about the quality, using constant quantizer might have been the better idea here)
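As a rough check on how far those CRF file sizes actually spread, here is a small Python sketch using the attachment sizes listed just above. The preset labels (sf, vf, and so on) are copied as-is from the file names, and the helper name is hypothetical.

```python
# Attachment sizes in MB, copied from the file list quoted above
sizes_mb = {
    "x264 crf 40": {
        "sf": 148.86, "vf": 125.97, "faster": 142.98,
        "fast": 142.41, "medium": 142.52, "slow": 141.07,
    },
    "x265 crf 34": {
        "vf": 202.06, "faster": 202.05, "fast": 210.16,
        "medium": 162.24, "sf": 146.41,
    },
}

def spread_percent(sizes):
    """Gap between the largest and smallest file, as a percentage of the smallest."""
    smallest, largest = min(sizes.values()), max(sizes.values())
    return 100 * (largest - smallest) / smallest

if __name__ == "__main__":
    for group, sizes in sizes_mb.items():
        print(group, round(spread_percent(sizes), 1))
```

With these numbers the gap works out to roughly 18% for the x264 list and roughly 44% for the x265 list, so the same CRF value clearly does not pin the file size down across presets, and by itself it says nothing about which file looks better.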
File Type: mkv ToS svt-hevc preset 5 tune psnr-ssim cq 37.mkv (119.64 MB, 19 views)
File Type: mkv ToS svt-hevc preset 5 tune vmaf cq 37.mkv (119.64 MB, 11 views)
File Type: mkv ToS svt-hevc preset 5 tune visually cq 37.mkv (119.64 MB, 16 views)
File Type: mkv ToS svt-hevc preset 11 tune vmaf cq 37.mkv (138.53 MB, 4 views)
File Type: mkv ToS svt-hevc preset 11 tune visually cq 37.mkv (138.53 MB, 4 views)
File Type: mkv ToS svt-hevc preset 11 tune psnr-ssim cq 37.mkv (138.53 MB, 11 views)
File Type: mkv tearsofsteel-4k svt-av1 preset 10.mkv (122.04 MB, 5 views)
File Type: mkv tearsofsteel-4k svt-av1 preset 11.mkv (124.51 MB, 6 views)
File Type: mkv ToS svt-hevc preset 7 tune visually cq 37.mkv (123.60 MB, 4 views)
File Type: mkv ToS svt-hevc preset 9 tune visually cq 37.mkv (125.43 MB, 4 views)
File Type: mkv tearsofsteel-4k svt-av1 preset 8.mkv (125.68 MB, 5 views)
File Type: mkv tearsofsteel-4k svt-av1 preset 9.mkv (126.36 MB, 4 views)
File Type: mkv tearsofsteel-4k svt-av1 preset 12.mkv (127.21 MB, 3 views)
File Type: mkv tearsofsteel-4k x265 uf.mkv (131.06 MB, 2 views)
File Type: mkv tearsofsteel-4k x265 sf.mkv (131.05 MB, 2 views)
File Type: mkv tearsofsteel-4k x265 vf.mkv (130.63 MB, 2 views)
File Type: mkv tearsofsteel-4k x265 faster.mkv (130.63 MB, 5 views)
File Type: mkv tearsofsteel-4k x264 vf.mkv (131.76 MB, 2 views)
File Type: mkv tearsofsteel-4k x264 faster.mkv (131.53 MB, 2 views)
File Type: mkv tearsofsteel-4k x264 fast.mkv (131.50 MB, 3 views)
File Type: mkv tearsofsteel-4k x264 medium.mkv (131.64 MB, 6 views)
File Type: mkv tearsofsteel-4k x264 slow.mkv (131.49 MB, 4 views)
File Type: mkv Meridian svt-av1 preset 11 1080p.mkv (133.19 MB, 7 views)
File Type: mkv Meridian x264 1080p.mkv (169.28 MB, 8 views)
File Type: mkv Meridian x265 1080p.mkv (169.39 MB, 5 views)
So with these, one can see what effect the presets combined with the chosen encoding mode have on the source for each of the encoders you used.
Cu Selur -
I am not confused at all.
Each speed preset either enables a new psy optimization or makes it more aggressive:
https://gist.github.com/liyonglion/b175a9fee3251893ff0e8329ff9a3541
BTW, the x264 source includes a document that has the following statements:
By default, MB-tree is used instead of qcomp for weighting frame quality based on complexity. MB-tree is effectively a generalization of qcomp to the macroblock level. MB-tree also replaces the constant offsets for B-frame quantizers. The legacy algorithm is still available for low-latency applications.
Adaptive quantization is now used to distribute quality among each frame; frames are no longer constant quantizer, even if MB-tree is off.
The psy optimizations directly affect performance, for instance the fast preset enables sub-me 6, which enables psy-rd for I and P frames; the medium preset uses sub-me 7, which enables psy-rd for I, P, and B frames.
The psy opts are inextricably tied to encoding speed; as more so-called psy optimizations are enabled, the slower the encoding becomes. -
The one "minor" thing that you ignored is that most of the encodes ended up in the 140MB range; only the very fast preset differed by any significant margin.
There's no question that the x265 code base is much more advanced than the x264 code base, even still the vf, faster and fast encodes are pretty close to one another.
I was meaning to bring this to the attention of the developer of Hybrid, since that is the program I used for the svt-hevc; maybe he can determine if the tune parameter is being passed to the encoder.
Problem is I have no idea where to find him, maybe we will get lucky and he will stumble upon this thread.
I succeeded because these are 2-pass bit rate based encodes, so the same bit rate was used for all encodes. Further, since each had different levels of psy-opts thanks to using different presets, it's easier to see what snake oil the psy-opts are.
If you check all the encodes, you will notice little discernible difference between presets that use psy-rd and those that don't, meaning all that the psy-opts are good for is wasting cpu cycles. -
maybe he can determine if the tune parameter is being passed to the encoder.
(I don't use SVT-.. encoders, so I'm not surprised I never noticed.)
Checked Hybrid source and I commented the relevant code part out with 'disabled since it's not working' *gig*
If you check all the encodes, you will notice little discernible difference between presets that use psy-rd and those that don't, meaning all that the psy-opts are good for is wasting cpu cycles. -
Yes, psy-rd , psy-trellis affects speed, but how much? Many other options change when you change a preset.
If you change a dozen other parameters, you cannot attribute the speed change or quantify the speed difference to psy alone
eg. If I increase from 1 reference frame to 16 reference frames from ultrafast to placebo , yet disable psy - it's going to be slower too. Is that a result of psy alone ? Absolutely not.
ie. You're not controlling the variables, it's not very scientific - you don't know what is causing what. You need to follow basic scientific principles
To test the effect of psy-rd or psy-trellis on encoding speed directly - you need to keep the other parameters constant, measure fps. Then change 1 variable. e.g. test disabled, then test enabled measure speed. You can also test the magnitude to see how much of a difference like 0.1 vs. 1, vs. 2 etc.... -
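A controlled test along those lines could be set up like this. The Python sketch below only builds x264 command lines that are identical except for the psy-rd strength. It does not run them or measure fps, and the function name, output file names and the default preset/CRF values are assumptions for illustration.

```python
def psy_rd_test_commands(source, preset="medium", crf=23,
                         psy_rd_values=(0.0, 1.0, 2.0)):
    """Build x264 command lines that differ only in psy-rd strength."""
    commands = []
    for strength in psy_rd_values:
        output = f"psy-rd-{strength}.mkv"  # hypothetical output name
        commands.append([
            "x264",
            "--preset", preset,   # held constant across all runs
            "--crf", str(crf),    # held constant across all runs
            # psy-rd takes "rd:trellis"; psy-trellis is kept at 0.0 here
            "--psy-rd", f"{strength}:0.0",
            "--output", output,
            source,
        ])
    return commands

if __name__ == "__main__":
    for cmd in psy_rd_test_commands("input.y4m"):
        print(" ".join(cmd))
```

Each command can then be timed (x264 reports the encoding fps when it finishes) and the outputs compared, with any speed or quality difference attributable to psy-rd alone because nothing else changed. The same pattern works for psy-trellis by varying the second number instead.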