Hello. I'm new here but not completely new to video stuff in general. During the last couple of years I've been using x265, with ffmpeg, for a variety of video encodes. x265 becoming new-but-usable-enough happened to coincide with the time I needed to do something about the large amount and variety of large video files filling my hard drives. Sometimes it's very straightforward, other times a little less so.
A while ago I was trying to read up on the different pixel formats and found a lot of confusing stuff. For example, every single website explaining 420 subsampling makes it sound like the color were LITERALLY at one quarter the resolution of the luminance... which would look like giant colorful Lego blocks with a more accurate greyscale picture laid over them. They even have diagrams suggesting something like this.
As I understand it however, what subsampling means is only that color differences can't be expressed accurately, NOT that they can't be expressed at all! I don't understand it fully (exactly because I haven't found any explanation of how it actually does work) but hopefully that's somewhat in the right direction.
Now, I also don't know what exactly happens in the encoding process, let's say from h264 to x265. Assuming the h264 source is yuv420, does x265 perform some sort of transforms that could cause even MORE color information to go missing?
I'm getting some interesting results here, encoding from an h264 source file in yuv420p to x265. I'm actually getting a smaller file if I encode to yuv444p instead of keeping the original yuv420p (and yuv422p comes out larger than either of them, for some reason).
-
First: yes, 4:2:0 does indeed literally mean that you only have 1/4th of the pixels for each of the 2 chroma planes. So for a common Blu-Ray (1080p) you have 3 planes:
1920x1080 pixels of luma (Y)
960x540 pixels of chroma (U)
960x540 pixels of chroma (V)
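If you want to verify those plane sizes yourself, here's a minimal VapourSynth sketch (a BlankClip standing in for a real 1080p source):
Code:
#inspect the plane dimensions of a 1080p 4:2:0 clip
import vapoursynth as vs
from vapoursynth import core

clip = core.std.BlankClip(width=1920, height=1080, format=vs.YUV420P8)
fmt = clip.format
print(clip.width, 'x', clip.height)                                            #1920 x 1080 luma (Y)
print(clip.width >> fmt.subsampling_w, 'x', clip.height >> fmt.subsampling_h)  #960 x 540 chroma (U, and the same for V)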
Both H.264/AVC and H.265/HEVC can use the same 4:2:0 YUV color space (among other color spaces), so there is no conversion happening there. BUT since H.265/HEVC itself is a lossy codec* you do of course have some loss. Every time you encode with a lossy codec you suffer additional loss.
Don't try to compare different color spaces' resulting file sizes. Identical CRF does not guarantee identical visual quality across different spaces or settings.
(*) I'm ignoring the lossless modes of H.264/AVC and H.265/HEVC here as they are impractical for the average user. -
Are you absolutely sure? As a literal example: an image (4:2:0) could not then have, e.g., a blue line next to a red line (depending on where the block of pixels is located). At all. The entire 4x2 block of pixels would HAVE to be either red or blue; anything else would be impossible. That's how it's explained everywhere.
That's what I don't get. If it was like that, it would be very clearly visible, no?
Analog TV used to have horrible color bleeding sort of like that, but I've NEVER seen anything like that in any digital video.
Is there something I'm missing here?
-
Just a quick addition to the subsampling thing: sites like Wikipedia suggest in particular that (in 4:2:0) the large chroma blocks would be (exactly) the SAME color. I thought, and have seen, that it's more like gradients: there's interpolation and "anti-aliasing" involved in what is output on the screen after conversion from YUV to the screen format, instead of large blocks of the SAME color.
Other sites do try to explain how the colors of pixels are somehow derived from those of neighboring pixels, but do not explain how that is done. -
Yes, it's correct.
It depends on the chroma upsampling algorithm, and what method is used for the conversion to RGB for display. Think of it as resizing U and V back to 1920x1080.
If you use nearest neighbor, then yes, you would expect red or blue. Any other common method will result in bleeding or some intermediate color, depending on what the surrounding pixel values are. You're sampling more than one pixel with commonly used methods like bicubic (16 pixels, 4x4) or bilinear (4 pixels, 2x2):
https://en.wikipedia.org/wiki/Bicubic_interpolation
https://en.wikipedia.org/wiki/Bilinear_interpolation
https://en.wikipedia.org/wiki/Nearest-neighbor_interpolation
There are dozens of others, each with different settings, pros/cons
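To make that concrete, a short VapourSynth sketch (my illustration; the test clip and its color are made up) doing the same chroma upsample with three different kernels:
Code:
#upsampling the chroma of the same 4:2:0 clip to 4:4:4 with three different kernels
import vapoursynth as vs
from vapoursynth import core

clip = core.std.BlankClip(width=64, height=32, format=vs.YUV420P8, color=(81, 90, 240))  #arbitrary test color
nn  = core.resize.Point(clip, format=vs.YUV444P8)     #nearest neighbor: chroma samples are just duplicated, hard edges
bil = core.resize.Bilinear(clip, format=vs.YUV444P8)  #2x2 neighborhood: soft transitions
bic = core.resize.Bicubic(clip, format=vs.YUV444P8)   #4x4 neighborhood: smoother still, can overshoot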
Yes, sure. But what about loss of color information specifically? For lack of a better way to put it, I was indeed wondering whether or not the color subsampling would happen TWICE. Or something to that effect, or something that could be mitigated by using a more accurate colorspace.
If you started with 4:2:0, there is no further degradation except for the lossy compression, unless some other step is used in the pipeline. For example, you upsample to 4:4:4 or have an intermediate RGB step. Then it depends on what algorithm is used and what bit depth. Nearest neighbor just duplicates pixels on the way up and drops those same duplicates on the way down, so that is a lossless transformation. Others are not lossless.
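If you want to check that on an actual clip, here's a minimal VapourSynth sketch (my own illustration, not from anyone's post; the red/blue test colors are borrowed from _Al_'s script further down) that round-trips through 4:4:4 with Point and confirms nothing changed:
Code:
#verify that a Point 4:2:0 -> 4:4:4 -> 4:2:0 round trip changes nothing
import vapoursynth as vs
from vapoursynth import core

red  = core.std.BlankClip(width=32, height=32, format=vs.YUV420P8, color=(51, 109, 212), length=1)
blue = core.std.BlankClip(width=32, height=32, format=vs.YUV420P8, color=(28, 212, 120), length=1)
src  = core.std.StackHorizontal([red, blue])        #hard red/blue chroma edge in the middle

up    = core.resize.Point(src, format=vs.YUV444P8)  #duplicate the chroma samples
down  = core.resize.Point(up, format=vs.YUV420P8)   #drop the duplicates again
stats = core.std.PlaneStats(src, down, plane=1)     #compare the U planes
print(stats.get_frame(0).props['PlaneStatsDiff'])   #0.0 -> lossless; swap in Spline36 and it no longer is
-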
Do not forget that where the color info is missing, you still have full luma. So our brain fills in whatever color is next anyway. The brain makes our images anyway, and it is susceptible to all kinds of tricks, as you can find on the web and YouTube. Mostly it adds what it thinks should be there.
If you artificially generate an image with only a blue and a red line next to each other, sure, it will be a mess. Because there is no explanation for the brain what it is; it does not have a story for the brain, like sky changing into a red car roof etc. The brain knows the border is exact, and it is not looking for aliasing or uncertainty. It knows it is clear, even if 4:2:0 cannot make it clear. You'd maybe notice bleeding over 2-3 pixels, but not one.
About the SAME color: working in Vapoursynth for example, by choosing the correct command line you can influence that color. For example, having a 4:2:0 image loaded into Vapoursynth and changing it to 4:2:2: if not careful and using Spline36 or other resizers, colors might get changed. If using Point resize, even after upsampling to 4:2:2 they will be the same; pairs would just have the same color value (sure, a 4x bigger value where the bit depth goes up), but under a microscope you'd see the same image. Well, both converted to RGB 8 bit, because you cannot study YUV on screen, just the real YUV values hidden behind the RGB 8 bit conversion.
import vapoursynth as vs

clip = vs.core.ffms2.Source(r'C:\file.mp4') #YUV420P8
clip = vs.core.resize.Point(clip, format = vs.YUV422P10)
#not the same as:
#clip = vs.core.resize.Spline36(clip, format = vs.YUV422P10) -
to add to what pdr says: all those RGB conversions to get a visual (where the PNGs below came from) were made using Point resize,
and the starting clip is always YUV420P8.
So, top RGB image, just one RGB conversion from the original YUV420P8:
rgb = core.resize.Point(clip, matrix_in_s = '709', format = vs.RGB24)
bottom rgb image:
clip = core.resize.Spline36(clip, format=vs.YUV422P10)
rgb = core.resize.Point(clip, matrix_in_s = '709', format = vs.RGB24)
these things can get really tricky, because any presentation might involve two color conversions: first YUV to YUV, and second (usually behind the scenes) that YUV to RGB
ALL images are zoomed in to get visual blocks (yet again Point resize, on top of everything!), so one block means just one pixel in reality. The real resolution of the posted images is 42x16 pixels.
-
resizing with Point keeps the same colors:
clip = core.resize.Point(clip, format=vs.YUV422P10)
rgb = core.resize.Point(clip, matrix_in_s = '709', format = vs.RGB24)
-
if I decided to use for the preview:
rgb = vs.core.resize.Bicubic(clip, matrix_in_s = '709', format = vs.RGB24) -
To get RGB for preview using Bicubic instead of Point; again the starting clip is YUV420P8.
so top rgb image:
rgb = core.resize.Bicubic(clip, matrix_in_s = '709', format = vs.RGB24)
bottom rgb image:
clip = core.resize.Point(clip, format=vs.YUV422P10)
rgb = core.resize.Bicubic(clip, matrix_in_s = '709', format = vs.RGB24)
(yes, that is not a mistake, Bicubic actually fixes that bottom line)
-
again, the original clip is YUV420P8
clip = core.resize.Spline36(clip, format=vs.YUV422P10)
rgb = core.resize.Bicubic(clip, matrix_in_s = '709', format = vs.RGB24)
so this is the reason why getting feedback about what actually IS in those YUV values, using software players that convert to RGB behind the scenes, is very difficult. And I did not go into the matrix at all, just using BT.709 throughout. But players can default to the BT.601 matrix for some reason, and the colors would be yet again different.
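To illustrate that last point, a small sketch (my addition; the clip and color values are made up): converting the same YUV numbers with two different matrices gives two visibly different RGB results.
Code:
#same YUV values in, different RGB out, depending only on the matrix
import vapoursynth as vs
from vapoursynth import core

clip = core.std.BlankClip(width=64, height=32, format=vs.YUV420P8, color=(81, 90, 240))  #arbitrary test color
rgb709 = core.resize.Point(clip, matrix_in_s='709', format=vs.RGB24)   #BT.709, used throughout this post
rgb601 = core.resize.Point(clip, matrix_in_s='170m', format=vs.RGB24)  #BT.601 (SMPTE 170M), what some players default to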
-
Some chroma subsampling examples I posted long ago:
https://forum.videohelp.com/threads/294144-Viewing-tests-and-sample-files#post1792760
https://forum.videohelp.com/threads/319360-DVD-LAB-PRO-color-map#post1977264
Going from x264 4:2:0 to uncompressed YUV 4:2:0 to x265 4:2:0 shouldn't cause additional blurring from chroma subsampling (maybe from the compression, if you compress too much). But be aware that some programs will convert to RGB 4:4:4 in the middle -- that may cause additional blurring.
-
Thanks for the replies! Just quickly before I lose my train of thought --- many of those pictures show kind of what I thought was going on. There are different shades of red and blue, and most importantly the purple in between. There wouldn't be any purple if the chroma "resolution" worked like it's VERY often explained. That's what I was trying to say... something like that anyway.
-
I guess the very short version of the question is: can ffmpeg/x265 cause that? The colorspace by itself is, I suppose, largely irrelevant but I have no possible way of knowing what happens internally in the encoder.
(I've also been reminded that x265 is lossy... which I know it is. But it has been suggested the loss would not apply to color. How so?) -
Not if you use it properly - if there are no other intermediate steps or conversions or filters affecting the chroma
The colorspace by itself is, I suppose, largely irrelevant but I have no possible way of knowing what happens internally in the encoder.
(I've also been reminded that x265 is lossy... which I know it is. But it has been suggested the loss would not apply to color. How so?)
4:2:0 => 4:2:0 is not a lossy transformation. Nothing is done; that step is a no-op. But if you start with 4:2:0 then go to 4:2:2 or 4:4:4, or RGB, then the chroma planes are resampled (resized). Then, depending on how you do it and how you go back to 4:2:0, that can be lossy -
Chroma subsampling causes only the purple blur between the patches. The different shades of red and blue in the large patches are caused by the precision of the RGB to YUV to RGB conversion (chroma subsampled or not). 8 bit, limited range YUV has only about 1/6 the number of unique colors of 8 bit, full range RGB. So 5 out of every 6 RGB colors will come back wrong after an RGB to YUV to RGB conversion. 10 bit YUV has enough unique colors for every 8 bit RGB color, so it's possible for 8 bit RGB to be converted to 10 bit YUV and back to 8 bit RGB losslessly.
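For anyone who wants to sanity-check that 1/6 figure, here's a rough numpy sketch (my own, assuming BT.709 coefficients and studio-swing quantization; other matrices give similar ratios, and it needs on the order of a gigabyte of RAM):
Code:
#count the unique 8-bit limited-range YUV codes produced by all 16.7M 8-bit RGB colors
import numpy as np

r, g, b = np.meshgrid(np.arange(256), np.arange(256), np.arange(256), indexing='ij', sparse=True)
r, g, b = r / 255.0, g / 255.0, b / 255.0
y = 0.2126 * r + 0.7152 * g + 0.0722 * b       #BT.709 luma
u = (b - y) / 1.8556                           #Cb
v = (r - y) / 1.5748                           #Cr
Y = np.round(16 + 219 * y).astype(np.uint32)   #studio swing: Y in 16-235
U = np.round(128 + 224 * u).astype(np.uint32)  #Cb/Cr in 16-240
V = np.round(128 + 224 * v).astype(np.uint32)
codes = (Y << 16) | (U << 8) | V               #pack each YUV triplet into one integer
n = np.unique(codes).size
print(n, n / 256**3)  #~2.7 million codes, roughly 1/6 of the 16.7M RGB colors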
It works exactly as was explained. Whether there's any blurring or not depends on the method used to downscale and upscale the chroma -- as was explained by poisondeathray. Point (aka nearest neighbor) resizing avoids the blur but leads to aliasing artifacts. "Smooth" resizing, like bilinear, bicubic, etc., leads to blurring at the transitions. The latter looks better with "real" video (as opposed to rectilinear test patterns).
-
We're talking about basically the same thing. I didn't say it was explained that way HERE (quite the opposite, in fact). Tons of other websites, on the other hand, only explain the part where you throw away 75% of the color information, with absolutely no mention of bicubic or other methods of upsampling, which are probably the principal reason why it actually WORKS as well as it does. Do you disagree?
-
Using the included Vapoursynth script, I drew a perfect blue rectangle on red in YUV420P8. The YUV values are clean-cut, but when changing to RGB24 (8 bit) I got different results depending on the resize kernel/method. That is the equivalent of simply previewing the YUV 4:2:0 8 bit video in some player. So I can get very different results with these computer generated images, depending on how that YUV is resized or even what matrix is used. So even if the YUV values underneath those RGB pixels are always the same, what you're looking at could be a mirage, not real; the YUV values are hidden underneath.
Code:
import vapoursynth as vs
from vapoursynth import core
import havsfunc

red  = core.std.BlankClip(width=38, height=20, format = vs.YUV420P8, color = (51,109,212), length = 10)
blue = core.std.BlankClip(width=22, height=10, format = vs.YUV420P8, color = (28,212,120), length = 10)

#blue rectangle placement
X1, Y1 = (9, 4)

#making blue in red
clip = havsfunc.Overlay(red, blue, X1, Y1)
clip = core.std.SetFrameProp(clip, prop="_Matrix", intval=1)  #flagging video internally in Vapoursynth as BT.709

#zoom-in, increase pixels 16 times to blocks
def zoom(clip, kernel_string):
    return core.resize.Point(clip, clip.width * 16, clip.height * 16).text.Text(kernel_string)

clips = []
kernels = ['Point', 'Bicubic', 'Bilinear']  #'Spline16', 'Spline36', 'Lanczos'
for kernel_string in kernels:
    _resize = getattr(core.resize, kernel_string)
    clips.append(zoom(_resize(clip, format = vs.RGB24), kernel_string))

clip = core.std.StackVertical(clips)
clip.set_output()
-
notice: if I offset the placement of that blue rectangle to X1,Y1 = (8,3) instead of X1,Y1 = (9,4), to throw off the subsampling a bit, Point resize is not exact anymore:
-
_Al_, thanks, those pictures are more than informative. Just to clear up any confusion, everyone: I never said YUVxxx was better than it is. In my very first post I said I thought chroma subsampling means colors can NOT be accurately expressed. Which it does mean, and which is why I don't like it.
These kinds of fundamentals, while great, should preferably be in a different post. I'm actually still much more interested in how the end result would be affected after doing these conversions multiple times, which could easily happen. -
The reason chroma subsampling was invented was because the human eye has less resolution for color than it does for greyscale. Storing and transmitting color at a lower resolution saves memory and bandwidth. Sure, when you look at extreme enlargements the loss of color resolution is obvious. But when watching normal video at normal viewing distances it's usually not. All commercial distribution formats use chroma subsampling. Try playing YUV 4:4:4 video on DVD or Blu-ray -- you can't. Skipping YUV entirely would be great too. The light sensors on cameras are RGB. The sub-pixels you see on the TV/monitor are RGB. Most of what's in between is YUV. Video is all about compromises.
Each time the chroma is downsampled and upsampled you get more blurring and/or artifacts. That is why you try to avoid multiple conversions.
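A sketch of that generation loss in VapourSynth (my addition; the red/blue test colors are reused from _Al_'s script, and five round trips is an arbitrary choice): each pass through a smooth kernel smears the chroma edge a little more.
Code:
#repeated 4:2:0 <-> 4:4:4 round trips with a smooth kernel accumulate chroma blur
import vapoursynth as vs
from vapoursynth import core

red  = core.std.BlankClip(width=32, height=32, format=vs.YUV420P8, color=(51, 109, 212))
blue = core.std.BlankClip(width=32, height=32, format=vs.YUV420P8, color=(28, 212, 120))
work = core.std.StackHorizontal([red, blue])                  #hard red/blue edge in the middle
work = core.std.SetFrameProp(work, prop="_Matrix", intval=1)  #flag as BT.709 for previewing

for _ in range(5):
    work = core.resize.Bicubic(work, format=vs.YUV444P8)  #upsample the chroma
    work = core.resize.Bicubic(work, format=vs.YUV420P8)  #downsample it again

work.set_output()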
-
Yeah. I've read a little bit about that, the history of color television etc., where these things originate. It's fascinating, although not easy to understand and apparently not easy to explain either.
What I was referring to earlier: color subsampling is most often explained in a way that easily leaves you thinking THIS is what would happen (picture borrowed from the other old thread mentioned earlier):
[Attachment 50737]
from original https://forum.videohelp.com/images/guides/p1885214/original.png
...which is obviously NOT how it actually works, at any stage of the whole process of literal subsampling and later upsampling with interpolation. As said in sneaker's reply, there are two chroma planes, so it's actually not even correct to say that the "color" is only at quarter resolution (in the case of yuv420).
...So that's why it can be confusing.
-
Why not? In particular, are you suggesting one or the other would be significantly worse?
I'm asking because I've read (including from codec developers) that both going from 420 to 444 and from 8 bit to 10 bit are likely to "improve compression efficiency" more than anything else.
(this is x264 and/or x265) -
Luma is just brightness; back then, with B&W CRTs, only Y was used to emit on screen. U and V are differences.
But even the upsampling itself can change color.
If you choose Point resize, though, the U and V planes just have their arrays multiplied, and if the bit depth changes from 8 bit to 10, each value is increased 4x. But visually they are the same:
Code:
#Point: values are only duplicated/shifted, so the round trip is exact
clip_YUV420P8 = clip
clip_YUV444P10 = core.resize.Point(clip_YUV420P8, format = vs.YUV444P10)
clipout_YUV420P8 = core.resize.Point(clip_YUV444P10, format = vs.YUV420P8)
Code:
#Bicubic: interpolates from neighboring samples, so the round trip is not exact
clip_YUV420P8 = clip
clip_YUV444P10 = core.resize.Bicubic(clip_YUV420P8, format = vs.YUV444P10)
clipout_YUV420P8 = core.resize.Bicubic(clip_YUV444P10, format = vs.YUV420P8)
-
the whole script is here, so if you have Vapoursynth you can explore it further:
Code:
import vapoursynth as vs
from vapoursynth import core
import havsfunc

red  = core.std.BlankClip(width=38, height=20, format = vs.YUV420P8, color = (51,109,212), length = 10)
blue = core.std.BlankClip(width=22, height=10, format = vs.YUV420P8, color = (28,212,120), length = 10)

#blue rectangle placement, X must be odd and Y even number, otherwise you construct within subsampling - not a clean cut between colors
X1, Y1 = (9, 4)

#making blue in red
clip = havsfunc.Overlay(red, blue, X1, Y1)

#flagging clip internally in Vapoursynth as BT.709, if there is any conversion to RGB later
#yes there is!, on screen later while previewing clips
clip = core.std.SetFrameProp(clip, prop="_Matrix", intval=1)

clip_YUV420P8 = clip
clip_YUV444P10 = core.resize.Bilinear(clip_YUV420P8, format = vs.YUV444P10)
clipout_YUV420P8 = core.resize.Bilinear(clip_YUV444P10, format = vs.YUV420P8)

#note it can get even trickier if your vsedit or other RGB preview setup for this script
#uses Bicubic for the RGB transfer etc; use Point
#clip_YUV420P8.set_output()
#clip_YUV444P10.set_output()
clipout_YUV420P8.set_output()
-
But that is only because of the choice to use a bicubic interpolation, right? Meaning changing the colorspace alone (in this case) would not in itself cause any change at all. Correct?
10 bit vs 8 bit is something I admit I don't really understand at all when it comes to "compression efficiency". The 420 vs 444 case was explained as the encoder simply having much better methods for discarding "unneeded" information compared to chroma subsampling (which is the simplest and stupidest compression method). That does make sense. Whatever the encoder (x265) does, it seems to be SO good at it that I can encode a yuv420p8 source to yuv444p10, x265 crf 20, and it will still be smaller than the source AND smaller than doing the exact same encode without changing the colorspace.
So one thing I'm trying to figure out is: is there actually a reason I should NOT upsample here? For these files I have kept the original resolution and applied no filters of any kind. I've also disabled SAO and x265's deblock filter -- I'm not sure whether or not I'd need to "deblock" an h264 source (...as I assume it was already deblocked when it was encoded to h264; not sure if that's how I should think about it though?).
(Re-encoding these particular files, TV series episodes, is almost an academic exercise, as they are lowish res and quite small to begin with... 576p @ 1200k. I completely understand why you'd say "it's not really worth the bother". However, it's astounding how well x265 actually works, and I am able to crunch them even smaller, to around 800k, without noticeable degradation in quality.
It's "almost an academic exercise" on purpose, BTW.) -
But that is only because of the choice to use a bicubic interpolation, right? Meaning changing the colorspace alone (in this case) would not in itself cause any change at all. Correct?
So there is a change. Bicubic interpolates using certain algorithms, as do Bilinear, Lanczos, etc. But not Point resize.
Imagine a 4x4 pixel YUV420P8 clip; you'd get planes with data like this (valid values for 8 bit are 0-255):
Y plane:
200,200,200,200
200,200,200,200
200,200,200,200
200,200,200,200
U plane:
100,100
100,100
V plane:
120,120
120,120
resized to a 4x4 YUV444P10 clip using Point, you'd suddenly get this (valid values for 10 bit are 0-1023):
Y plane:
800,800,800,800
800,800,800,800
800,800,800,800
800,800,800,800
U plane:
400,400,400,400
400,400,400,400
400,400,400,400
400,400,400,400
V plane:
480,480,480,480
480,480,480,480
480,480,480,480
480,480,480,480
so you definitely resize two planes, even if it's not apparent on screen
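If you have a recent VapourSynth (API4, where frame planes expose the buffer protocol) you can dump the actual arrays and verify those numbers yourself; a minimal sketch:
Code:
#dump the real plane values to verify the 200 -> 800, 100 -> 400, 120 -> 480 example
import vapoursynth as vs
from vapoursynth import core
import numpy as np

clip = core.std.BlankClip(width=4, height=4, format=vs.YUV420P8, color=(200, 100, 120), length=1)
up = core.resize.Point(clip, format=vs.YUV444P10)
f = up.get_frame(0)
for p in range(f.format.num_planes):
    print(np.asarray(f[p]))  #4x4 planes full of 800, 400 and 480 respectively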