RGB to YUV - Revisited.
Please help me to review my understanding of the following
steps of conversion:
* step 1 - RGB 4 pixel layout - MAIN
* step 2 - RGB to YUV
* step 3 - Sub-Sampling to 4:2:0
* step 4 - stream format of file structure (theory)
* step 5 - block layout (theory, mpeg)
Any advice would be greatly appreciated
-vhelp 3425
STEP 1
RGB - using 4 pixels to represent image:
Code:
  1     2     3     4
[RGB] [RGB] [RGB] [RGB]
STEP 2
YUV - after conversion from RGB, now uses 12 pixels
to represent the image from the above 4 RGB pixels:
Code:
    1         2         3         4
--------  --------  --------  --------
[Y][U][V] [Y][U][V] [Y][U][V] [Y][U][V]
STEP 3
After conversion to YUV, a sub-sampling to the 4:2:0 format is
applied, and the new layout becomes a total
of 6 samples: four 'Y' samples and two chroma samples, 'Cb' and 'Cr',
which would represent the original 4 RGB pixel layout from STEP 1:
Code:
  1    2    3    4
[ Y] [ Y] [ Y] [ Y]
  5    6
[Cb] [Cr]
STEP 4
Storage to a stream file could theoretically look like:
Code:
  1   2   3   4   5   6   7   8   9  10  11  12
[ Y][ Y][ Y][ Y][Cb][Cr][ Y][ Y][ Y][ Y][Cb][Cr] ...

or,

  1   2   3   4   5   6   7   8   9  10  11  12
[ Y][ Y][ Y][ Y][ u][ v][ Y][ Y][ Y][ Y][ u][ v] ...
STEP 5
Using the original 4 pixel RGB layout from STEP 1,
converted to YUV in STEP 2, and then sub-sampled
to 4:2:0 in STEP 3, this could theoretically be viewed in block
form as:
Code:
[Y00] [Y01]
            [Cb4] [Cr5]
[Y02] [Y03]
-
I think your step 2 is going in the wrong direction.
In your example, the original 4 pixels, each defined by three bytes, will still be 4 pixels after conversion to YUV, each still represented by 3 bytes. I think the best possible explanation comes from a relevant book, so I quote:
The human visual system (HVS) is less sensitive to colour than to luminance (brightness). In the RGB colour space the three colours are equally important and so are usually all stored at the same resolution but it is possible to represent a colour image more efficiently by separating the luminance from the colour information and representing luma with a higher resolution than colour.
The YCbCr colour space and its variations (sometimes referred to as YUV) is a popular way of efficiently representing colour images. Y is the luminance (luma) component and can be calculated as a weighted average of R, G and B:
Y = kr R + kg G + kb B    (2.1)
where k are weighting factors.
The colour information can be represented as colour difference (chrominance or chroma) components, where each chrominance component is the difference between R, G or B and the luminance Y:
Cb = B − Y
Cr = R − Y (2.2)
Cg = G − Y
The complete description of a colour image is given by Y (the luminance component) and three colour differences Cb, Cr and Cg that represent the difference between the colour intensity and the mean luminance of each image sample. Figure 2.10 shows the chroma components (red, green and blue) corresponding to the RGB components of Figure 2.9. Here,
mid-grey is zero difference, light grey is a positive difference and dark grey is a negative difference. The chroma components only have significant values where there is a large difference between the colour component and the luma image (Figure 2.1). Note the strong
blue and red difference components.
So far, this representation has little obvious merit since we now have four components instead of the three in RGB. However, Cb + Cr + Cg is a constant, so only two of the three chroma components need to be stored or transmitted since the third can always be
calculated from the other two. In the YCbCr colour space, only the luma (Y) and the blue and red chroma (Cb, Cr) are transmitted. YCbCr has an important advantage over RGB: the Cr and Cb components may be represented with a lower resolution than Y, because the HVS is less sensitive to colour than to luminance. This reduces the amount of data required to represent the chrominance components without having an obvious effect on visual quality.
To the casual observer, there is no obvious difference between an RGB image and a YCbCr image with reduced chrominance resolution. Representing chroma with a lower resolution than luma in this way is a simple but effective form of image compression.
An RGB image may be converted to YCbCr after capture in order to reduce storage and/or transmission requirements. Before displaying the image, it is usually necessary to convert back to RGB. The equations for converting an RGB image to and from the YCbCr colour space are given in Equation 2.3 and Equation 2.4. Note that there is no need
to specify a separate factor kg (because kb + kr + kg = 1) and that G can be extracted from the YCbCr representation by subtracting Cr and Cb from Y, demonstrating that it is not necessary to store or transmit a Cg component.
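To make the equations concrete, here is a minimal sketch in C of the forward and inverse conversion. I am assuming the common BT.601 weighting factors (kr = 0.299, kb = 0.114, hence kg = 0.587) and unscaled, full-range values; practical formats add offsets and scaling on top of this:
Code:
#include <stdio.h>

/* BT.601 weighting factors (an assumption; other standards differ) */
#define KR 0.299
#define KB 0.114
#define KG (1.0 - KR - KB)   /* kg need not be stored: kr + kg + kb = 1 */

/* Forward: RGB -> Y, Cb, Cr (Equations 2.1 and 2.2, unscaled) */
static void rgb_to_ycbcr(double r, double g, double b,
                         double *y, double *cb, double *cr)
{
    *y  = KR * r + KG * g + KB * b;
    *cb = b - *y;
    *cr = r - *y;
}

/* Inverse: Y, Cb, Cr -> RGB; G is recovered with no Cg component */
static void ycbcr_to_rgb(double y, double cb, double cr,
                         double *r, double *g, double *b)
{
    *b = cb + y;
    *r = cr + y;
    *g = (y - KR * *r - KB * *b) / KG;
}

int main(void)
{
    double y, cb, cr, r, g, b;
    rgb_to_ycbcr(200, 100, 50, &y, &cb, &cr);
    ycbcr_to_rgb(y, cb, cr, &r, &g, &b);
    printf("Y=%.1f Cb=%.1f Cr=%.1f -> R=%.1f G=%.1f B=%.1f\n",
           y, cb, cr, r, g, b);
    return 0;
}
The round trip returns the original R, G and B exactly, which is the point of the kb + kr + kg = 1 observation.
The more I learn, the more I come to realize how little it is I know. -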
I forgot to mention the source of the text quoted. It is a book by Iain Richardson titled:
H.264 and MPEG4 video compression.
An excellent book, highly recommended.
The more I learn, the more I come to realize how little it is I know. -
@vhelp
Your "4 pixel layout" needs further definition. For CCIR-601 (D1) sampling, full bandwidth 4:4:4 RGB is represented by a 720x480 raster in 3 layers R, G an B. Conversion to YUV gets you 3 720x480 layers in YUV or in the digital domain Y, Cb, Cr.
SaSi has the math right.
To take it further, you need to think of YUV as a potentially spatially compressed version of RGB, where the "spatial compression" takes advantage of the psychological perception of the human eye and brain. As such, U and V can be compressed (in terms of XY spatial sampling) to 4x or more in both X and Y (for a progressive frame) without the human eye noticing the difference. Detail is perceived in Y (monochrome), so Y must be kept full bandwidth or the human will detect a difference. (Ref. rod and cone perception in the eye.)
If this technique is confined to display, nobody is the wiser and humans never notice that the spatial compression was done.
However ....
Machines (and math algorithms) don't have the same limited perception. Therefore image processing of bandwidth-reduced UV has its limitations and must be factored into the algorithm (effects, filtering or data compression) being contemplated with subsampled U and V.
Almost all video formats are in the form of YUV for good reason: U and V have been bandwidth limited to varying degrees.
I'll pause now while you describe the context of the question. What are you trying to do? -
Originally Posted by vhelp
1 2 3 4
--------
R R R R
G G G G
B B B B
converted to YUV (same number of samples)
1 2 3 4
--------
Y Y Y Y
U U U U
V V V V
4:2:2 sampling is like this (2/3 the number of samples for the same 4 pixels)
1 2 3 4
--------
Y Y Y Y
U_ U_
V_ V_
4:1:1 sampling is like this (half the number of samples for the same 4 pixels)
1 2 3 4
--------
Y Y Y Y
U
V
In the cases above, U and V samples are spatially coincident with the first (and third for 4:2:2) Y sample.
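A quick way to sanity-check those sample-count ratios is to compute them. A trivial sketch (the function name is mine):
Code:
#include <stdio.h>

/* Samples needed for a group of 4 horizontal pixels:
 * always 4 Y samples, plus one U and one V per chroma pair kept. */
static int samples_per_group(int chroma_pairs)
{
    return 4 + 2 * chroma_pairs;
}

int main(void)
{
    printf("4:4:4 -> %2d samples\n", samples_per_group(4)); /* 12        */
    printf("4:2:2 -> %2d samples\n", samples_per_group(2)); /* 8, or 2/3 */
    printf("4:1:1 -> %2d samples\n", samples_per_group(1)); /* 6, or 1/2 */
    return 0;
}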
4:2:0 adds another dimension
U and V are interpolated to fall in the middle of a quadrant of 4 Y pixels (2x2)
See this representation:
http://www.answers.com/topic/ycbcr-sampling
To further complicate matters, 4:2:0 comes in two flavors. One lines up U and V with existing Y pixels (called co-sited), but PAL DV and both PAL and NTSC DVD interpolate U and V positions to fall between Y samples in X and Y. This spatial interpolation results in very poor multigeneration performance for the PAL DV format compared to the 4:1:1 NTSC DV format.
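To illustrate the two flavors, here is a sketch of how one chroma sample could be derived from a full-resolution chroma plane in each case. This is deliberately simplified progressive-frame math (my own formulation; real encoders filter more carefully, and the exact siting details vary by standard):
Code:
#include <stdio.h>

/* c[][] is a full-resolution chroma plane; (cx, cy) selects the 2x2
 * luma quadrant whose top-left sample is at (2*cx, 2*cy). */

/* Co-sited: the chroma sample coincides with an existing Y position,
 * so take the top-left sample of the quadrant. */
static int chroma_cosited(const unsigned char c[4][4], int cx, int cy)
{
    return c[2 * cy][2 * cx];
}

/* Interstitial (the style described above): the sample is centered
 * among four Y positions, so average the 2x2 neighbourhood. */
static int chroma_interstitial(const unsigned char c[4][4], int cx, int cy)
{
    return (c[2 * cy][2 * cx]     + c[2 * cy][2 * cx + 1] +
            c[2 * cy + 1][2 * cx] + c[2 * cy + 1][2 * cx + 1] + 2) / 4;
}

int main(void)
{
    const unsigned char c[4][4] = {
        { 10,  20,  30,  40 },
        { 50,  60,  70,  80 },
        { 90, 100, 110, 120 },
        {130, 140, 150, 160 },
    };
    printf("co-sited: %d, interstitial: %d\n",
           chroma_cosited(c, 0, 0), chroma_interstitial(c, 0, 0));
    return 0;
}
-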
I was busy (after my time limit expired and I was disconnected
from the web) with my new toy: a USB cam (USB 2.0 compatible),
and found a dramatic difference between my digital camera and
my latest USB cam. I can capture 640x480 at 29.970 fps. But
w/ my digital, no can do. Jerks, spits, and dropped frames all
over the place.
.
So, why am I OT'ing, you may ask? Well, because I've been
busy having fun, and I wanted to share my reason for the delayed
response
.
Also, because it has some parts relevant to this RGB / YUV business, and
I was just getting an early taste of it w/ the two USB cams.
And now back to our TOPIC.. Anyways..
Thanks SaSi.. edDV..
But, wait. I was in the middle of composing my response back
to SaSi when my online time limit ran out (10 hrs since
this morning). I'm all tuckered out - pfew. Top that off with
trying to get my last words into the response above, and what-not.., pfew.
I spend most of my time researching. Believe me. But nothing
is easy in the things of video.. is it. hehe.
Anyways..
@ edDV
Did you sneak a peek at my response while I was trying to
respond back?? Because what you said, I already knew. I just
had not updated this page (lost in time, due to lack of response),
and had not gotten to it yet.
Ok. I have this Excel spreadsheet that I made a few months ago; I use
it to keep track of my understanding of these things in RGB and YUV.
And at every chance I get, I update it with new information.
.
What you laid out was basically what was on my sheet already.
I still do not get it all. There is something missing in my
illustration of how things are "put together". My problem is putting
things into words that you all can understand.
.
Perhaps I should revise my Steps above, and then continue. Thanks again
to you both for your shared input. I appreciate it - I do
I will be back shortly w/ more info, and responses.
-vhelp 3460 -
@vhelp,
The above is right (except that in most editing programs RGB is interleaved instead of planar, just like YUV 4:2:2 / YUY2). You might find the following page also worthwhile to read:
http://www.avisynth.org/Sampling
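The interleaved/planar difference is only an indexing change. A sketch (the names are mine):
Code:
#include <stdio.h>
#include <stddef.h>

/* Byte offset of channel c (0, 1, 2) for pixel (x, y), 8 bits/sample */

/* Interleaved: RGBRGBRGB... the channels of one pixel are adjacent */
static size_t idx_interleaved(int x, int y, int c, int width)
{
    return ((size_t)y * width + x) * 3 + c;
}

/* Planar: RRR...GGG...BBB... each channel is stored as its own plane */
static size_t idx_planar(int x, int y, int c, int width, int height)
{
    return (size_t)c * width * height + (size_t)y * width + x;
}

int main(void)
{
    /* pixel (1, 0) of a 4x4 image, channel 1 (G) */
    printf("interleaved: %zu  planar: %zu\n",
           idx_interleaved(1, 0, 1, 4), idx_planar(1, 0, 1, 4, 4));
    return 0;
}
-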
@ Wilbert
Thank you for your comment/correction
I had about 3 different (plus) comeback responses, but they all, in the
end, would not add up, because each one would lead to several more
questions, and things were just getting out of hand (I'm too impatient,
I guess). For now, let me start fresh. Forgive me if I come
off too redundant in my quest for knowledge..
GOAL:
* to gain knowledge and understanding of the mechanics/layout of
the pixel placements when writing out to a RAW bitmap file.
For example, once RGB is converted to YUV (plus a sampling or sub-sampling,
if need be), write this out in RAW form to a file. Later,
restore it through a re-conversion back to RGB, and display it inside
a bitmap on screen. (For the purpose of this topic (LAB), a tool
could come in handy.)
History:
Part of this discussion has to do with some work that I am involved
in, on a DCT application, which deals with YUV. Also, on another
project (a screen capture) where everything seems to be working out
great, speed-wise, but I'm stuck on saving out to a stream
file, because I want to save as RAW YUV data but do NOT want to use the
API calls for VFW, which I am well aware of. I've had some ups and downs
during this ongoing DCT and Screen Capture project. Then there is yet
one more project that I'm slowly working on, ..an MPEG encoder.., if it
ever takes off.
ToDo:
* rewrite this topic into a LAB, so that others can make use of it. +/-
* provide a "learning tool" to aid in the understanding of the process
..(RGB > YUV conversions; and the "sampling" that takes place)
* demo at least a sectional part of my DCT project (may bring things
..into perspective here) +/-
I will try to conjure up a tool to aid in the research/educational
side of this topic/project. I was meaning to add it into this discussion
when things got clearer to me. And, as I learn more, the ongoing
building of such a tool will reflect this knowledge/growth.
A personal note..
This topic/project is driving me crazy. I have spent months on this,
and still, no progress. I cannot take an RGB bitmap; convert it to
the YUV color space; incorporate a sampling; and then write this YUV
format out to a RAW file. Then later, open it and bring it back into
an RGB bitmap for viewing on screen.
.
What eludes me (after the RGB > YUV conversion, which I understand)
is the "sampling" and "pixel arrangement" needed to format a RAW file.
No header is required for the file in this learning exercise.
But I suspect that there is a standard layout for such YUV files.
I just don't know what it is at this time. So I'm here, trying
to figure it out from the bottom up.
So, what I would like to do for this learning exercise is the
following (see the sketch after this list):
* open an RGB bitmap (say, a small grid of 8x8 pixels - my DCT project)
* convert this 8x8 RGB grid bitmap to YUV
* apply a sampling to this YUV (i.e., 4:2:2 / 4:1:1 / 4:2:0 sampling)
* write it out to a RAW file
then..
* read in this RAW YUV file (which should still be an 8x8 YUV grid)
* convert it to an RGB 8x8 grid
* display this RGB 8x8 grid
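Here is a minimal sketch of that whole round trip in C. I am assuming full-range BT.601 conversion with scaled chroma, simple 2x2 box averaging for the 4:2:0 step, YV12 plane order (Y, then V, then U) for the raw file, no header, and a made-up file name; a gradient stands in for the opened bitmap:
Code:
#include <stdio.h>

#define W 8
#define H 8

static unsigned char clamp(double v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (unsigned char)(v + 0.5);
}

int main(void)
{
    unsigned char rgb[H][W][3], out[H][W][3];
    double yp[H][W], up[H][W], vp[H][W];
    unsigned char Y[H][W], U[H / 2][W / 2], V[H / 2][W / 2];
    int x, y;

    /* step 0: fake an 8x8 RGB source (a gradient) instead of a BMP */
    for (y = 0; y < H; y++)
        for (x = 0; x < W; x++) {
            rgb[y][x][0] = (unsigned char)(x * 32);
            rgb[y][x][1] = (unsigned char)(y * 32);
            rgb[y][x][2] = 128;
        }

    /* step 1: RGB -> full 4:4:4 YUV, one Y/U/V per pixel (BT.601) */
    for (y = 0; y < H; y++)
        for (x = 0; x < W; x++) {
            double r = rgb[y][x][0], g = rgb[y][x][1], b = rgb[y][x][2];
            yp[y][x] = 0.299 * r + 0.587 * g + 0.114 * b;
            up[y][x] = (b - yp[y][x]) * 0.564 + 128;  /* scaled Cb */
            vp[y][x] = (r - yp[y][x]) * 0.713 + 128;  /* scaled Cr */
        }

    /* step 2: 4:2:0 sub-sampling - average each 2x2 chroma quadrant */
    for (y = 0; y < H; y++)
        for (x = 0; x < W; x++)
            Y[y][x] = clamp(yp[y][x]);
    for (y = 0; y < H / 2; y++)
        for (x = 0; x < W / 2; x++) {
            U[y][x] = clamp((up[2*y][2*x] + up[2*y][2*x+1] +
                             up[2*y+1][2*x] + up[2*y+1][2*x+1]) / 4);
            V[y][x] = clamp((vp[2*y][2*x] + vp[2*y][2*x+1] +
                             vp[2*y+1][2*x] + vp[2*y+1][2*x+1]) / 4);
        }

    /* step 3: write the raw planar file: Y plane, V plane, U plane */
    FILE *f = fopen("frame.yv12", "wb");   /* hypothetical name */
    if (!f) return 1;
    fwrite(Y, 1, sizeof Y, f);
    fwrite(V, 1, sizeof V, f);
    fwrite(U, 1, sizeof U, f);
    fclose(f);

    /* step 4: read it back in the same order */
    f = fopen("frame.yv12", "rb");
    if (!f) return 1;
    fread(Y, 1, sizeof Y, f);
    fread(V, 1, sizeof V, f);
    fread(U, 1, sizeof U, f);
    fclose(f);

    /* step 5: YUV 4:2:0 -> RGB (nearest chroma, no interpolation) */
    for (y = 0; y < H; y++)
        for (x = 0; x < W; x++) {
            double yy = Y[y][x];
            double cb = (U[y / 2][x / 2] - 128) / 0.564;  /* B - Y */
            double cr = (V[y / 2][x / 2] - 128) / 0.713;  /* R - Y */
            out[y][x][0] = clamp(yy + cr);                /* R */
            out[y][x][2] = clamp(yy + cb);                /* B */
            out[y][x][1] = clamp((yy - 0.299 * (yy + cr)
                                     - 0.114 * (yy + cb)) / 0.587);
        }

    printf("pixel(0,0): in %d,%d,%d  out %d,%d,%d\n",
           rgb[0][0][0], rgb[0][0][1], rgb[0][0][2],
           out[0][0][0], out[0][0][1], out[0][0][2]);
    return 0;
}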
Gosh. I hope I got everything down here. Anyways..
In the meantime, I'll post whatever notes and things I can (I do have
some on hold) or find interesting for this topic/project. But for now,
I am off to start work on the learning tool
-vhelp 3461 -
I'm also very interested in your project.
But I'm not a programmer, so I don't know how to open and write files (avs ImageSource contains the code to open RGB bmp files).
The sampling is easy. Start with RGB 8x8 pixels, and convert to full YUV 8x8 pixels.
Consider 4:4:4 YUV -> 4:2:2 YUV (YUY2) for example (section 2.2 of the link I gave you):
line 1: Y1C1 Y2C2 Y3C3 Y4C4
line 2: Y1C1 Y2C2 Y3C3 Y4C4
etc ...
Sampling:
line 1: Y1C1x Y2 Y3C3x Y4
line 2: Y1C1x Y2 Y3C3x Y4
with
C1x = (C1+C1+C2)/3 // edge
C3x = (C2+C3+C3+C4)/4
C5x = (C4+C5+C5+C6)/4
4:4:4 YUV -> 4:2:0 YUV (YV12) is more difficult. Convert to YUY2 first, and then to YV12. The YUY2 -> 'YV12 progressive conversion' is given in the same section of the link.
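As a concrete sketch of that kernel on one chroma row (exactly the formulas above; the function name is mine):
Code:
#include <stdio.h>

/* Halve one chroma row with the 1-2-1 kernel described above.
 * The left edge repeats its own sample (weights 2,1 -> /3); interior
 * output samples use weights 1,2,1 -> /4. n is the input width. */
static void subsample_chroma_row(const unsigned char *c, int n,
                                 unsigned char *out)
{
    int k;
    out[0] = (unsigned char)((2 * c[0] + c[1]) / 3);   /* C1x (edge) */
    for (k = 1; k < n / 2; k++)                        /* C3x, C5x, ... */
        out[k] = (unsigned char)((c[2*k - 1] + 2 * c[2*k] + c[2*k + 1]) / 4);
}

int main(void)
{
    const unsigned char row[8] = { 100, 110, 120, 130, 140, 150, 160, 170 };
    unsigned char half[4];
    int k;
    subsample_chroma_row(row, 8, half);
    for (k = 0; k < 4; k++)
        printf("%d ", half[k]);   /* prints: 103 120 140 160 */
    printf("\n");
    return 0;
}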
The layout of raw YUV depends on which YUV flavour you want to write.
For YUY2 it's simply (Y1* first line, Y2* second line, etc ...):
Y11 U11 Y12 V11 Y13 U13 Y14 V13 ... Y21 U21 Y22 V21 Y23 U23 Y24 V23 ...
The second frame comes directly after it.
You need to write the YUV samples in the order given above. So when writing 8 pixels per line, you get 16 bytes per line (8 for the luma values and 8 for the chroma).
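A sketch of packing one such 8-pixel line into its 16 bytes (the function name is mine):
Code:
#include <stdio.h>

/* Pack one 8-pixel line of 4:2:2 data into YUY2 byte order:
 * Y U Y V Y U Y V ... Each U,V pair is shared by two pixels. */
static void pack_yuy2_line(const unsigned char *y,  /* 8 luma samples   */
                           const unsigned char *u,  /* 4 chroma samples */
                           const unsigned char *v,  /* 4 chroma samples */
                           unsigned char *out)      /* 16 bytes         */
{
    int i;
    for (i = 0; i < 4; i++) {
        out[4 * i + 0] = y[2 * i];      /* Y, even pixel  */
        out[4 * i + 1] = u[i];          /* U for the pair */
        out[4 * i + 2] = y[2 * i + 1];  /* Y, odd pixel   */
        out[4 * i + 3] = v[i];          /* V for the pair */
    }
}

int main(void)
{
    const unsigned char y[8] = { 16, 17, 18, 19, 20, 21, 22, 23 };
    const unsigned char u[4] = { 128, 129, 130, 131 };
    const unsigned char v[4] = { 64, 65, 66, 67 };
    unsigned char line[16];
    int i;
    pack_yuy2_line(y, u, v, line);
    for (i = 0; i < 16; i++)
        printf("%d ", line[i]);
    printf("\n");
    return 0;
}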
YV12 is even simpler (Y1* first line, Y2* second line, etc ...): it's planar, so first the Y plane, followed by the V plane, followed by the U plane.
Y11 Y12 Y13 Y14 ... Y21 Y22 Y23 Y24 ... Y31 Y32 Y33 Y34 ... Y41 Y42 Y43 Y44 ...
V11 V13 ... V31 V33 ...
U11 U13 ... U31 U33 ...
The second frame comes directly after it.
So when writing the first frame you get: 8*8 = 64 luma samples, followed by 4*4=16 V samples, followed by 4*4=16 U samples.
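A sketch of writing that first frame (the function and file names are mine):
Code:
#include <stdio.h>

/* Append one 8x8 YV12 frame to an open raw file:
 * 64 Y bytes line by line, then 16 V bytes, then 16 U bytes. */
static int write_yv12_frame_8x8(FILE *f, unsigned char y[8][8],
                                unsigned char v[4][4],
                                unsigned char u[4][4])
{
    if (fwrite(y, 1, 64, f) != 64) return -1;  /* Y plane */
    if (fwrite(v, 1, 16, f) != 16) return -1;  /* V plane */
    if (fwrite(u, 1, 16, f) != 16) return -1;  /* U plane */
    return 0;                       /* the next frame follows directly */
}

int main(void)
{
    unsigned char y[8][8] = { { 0 } }, v[4][4] = { { 0 } }, u[4][4] = { { 0 } };
    FILE *f = fopen("clip.yv12", "wb");   /* hypothetical file name */
    if (!f) return 1;
    write_yv12_frame_8x8(f, y, v, u);
    fclose(f);
    return 0;
}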
PS: I didn't say it explicitly, but there are multiple forms of 4:2:2 and 4:2:0.
PS2: The position of the chroma samples in the YUY2 and YV12 (and all other) layouts gives you the weights you should apply when sampling. -
Hi Wilbert
Thanks again for your follow-up. Appreciated
And it pleases me that you have interest/curiosities in my project(s)
However, your great explanation has given me a headache, and
I must step back, take a break, and regroup.
.
.
EDIT:
After rereading my last post, I decided that I did not like the
way it all came out, felt that it was too much, and resorted
to editing it out of this latest response here.
In addition to what I already have from my internet searches, and those
notes / responses I received here, I will take the time to study them all.
Perhaps in 2 or 3 weeks I'll have found the answers to all these puzzle
pieces.
Thank you all for your patience with me, and for helping me out as much
as you all could afford in your spare time, given my lack of understanding
of this YUV Sampling and Layout/Pixel Placement for RAW storage.
-vhelp 3464 -
vhelp, you might want to start with the "easiest" part. Just open RGB24 bitmaps, do R<->G (or something else that is trivial) and write it back to
1) raw RGB24.
2) a bitmap.
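For the R<->G part, a sketch on a raw RGB24 buffer already in memory (a real .bmp adds a header, bottom-up rows and row padding, which I ignore here):
Code:
#include <stdio.h>

/* Swap the R and G channels of a raw RGB24 buffer, in place. */
static void swap_rg(unsigned char *buf, long width, long height)
{
    long i, n = width * height;
    for (i = 0; i < n; i++) {
        unsigned char tmp = buf[3 * i];    /* R */
        buf[3 * i]     = buf[3 * i + 1];   /* R <- G */
        buf[3 * i + 1] = tmp;              /* G <- old R */
    }
}

int main(void)
{
    unsigned char pixel[3] = { 200, 100, 50 };  /* one red-ish pixel */
    swap_rg(pixel, 1, 1);
    printf("%d %d %d\n", pixel[0], pixel[1], pixel[2]);  /* 100 200 50 */
    return 0;
}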
If you can do the above, the YUV conversion/sampling should be pretty trivial. -
@ Wilbert,
I'm finally back for more, hehe
Question.. regarding the fragment below, after I perform the
RGB -> YUV conversion on an image.
( from the AviSynth Sampling page.. sec 2.2 )
Code:
Recall the layout of a 4:4:4 encoded image frame:
Y1C1 Y2C2 Y3C3 Y4C4    line 1
Y1C1 Y2C2 Y3C3 Y4C4    line 2
Y1C1 Y2C2 Y3C3 Y4C4    line 3
Y1C1 Y2C2 Y3C3 Y4C4    line 4
Code:
In AviSynth, the default mode is using a 1-2-1 kernel to
interpolate chroma, that is
C1x = (C1+C1+C2)/4  ???
C3x = (C2+C3+C3+C4)/4
C5x = (C4+C5+C5+C6)/4
From sec 2.1 The color formats: RGB, YUY2 and YV12
.
.
The term 4:4:4 denotes that for every four samples of the
luminance (Y), there are four samples each of U and V. Thus
each pixel has a luminance value (Y), a U value (blue difference
sample or Cb) and a V value (red difference sample or Cr).
Note, "C" is just a chroma sample (UV-sample).
.
.
They seem to have been replaced with Cxx, as in C1 C2 C3 and C4.
(I'm assuming per line of 4 'Y' samples, where every luma group
'YYYY' is considered a line, as you seem to be referring to.)
Am I correct??
So, after looking this over, and over and over ..
I have concluded that Cxx is supposed to refer to
the U and V sampled together as one chroma value, hence..
if the following (simplified) luma samples per line were:
Code:
   Y       Y       Y       Y
 U=100   U=110   U=120   U=130
 V=255   V=250   V=240   V=230
C2 = (U+V)/2 = n -- (or 180 )
C3 = (U+V)/2 = n -- (or 180 )
C4 = (U+V)/2 = n -- (or 180 )
per luma line sample ??
So, the UV-sample seems to indicate to me that these two
chroma values (as they are referred to after the RGB -> YUV
conversion) are merged or averaged by using the formula that
I just noted above.
If this is true, then this is a sampling all on its own, before
the other sampling (i.e., 4:2:2, or sub-sampling to 4:2:0) would take
place. Interesting
Am I correct ??
So now, (if correct) the C1 C2 C3 and C4 become
C C C C, or YC YC YC YC, or from:
YC1 YC2 YC3 YC4 - line 1
YC1 YC2 YC3 YC4 - line 2
YC1 YC2 YC3 YC4 - line 3
YC1 YC2 YC3 YC4 - line 4
to this:
YC YC YC YC - line 1
YC YC YC YC - line 2
YC YC YC YC - line 3
YC YC YC YC - line 4
[ Y(u+v)/2 ] for every pixel, per 4 pixels, in an RGB -> YUV conversion,
as luma pre-sampling, or the first stage of formatting/sampling.
Now, we continue with the sampling you demonstrated in the post
you made earlier, with the following:
The sampling is easy. Start with RGB 8x8 pixels, and convert to full YUV 8x8 pixels.
Consider 4:4:4 YUV -> 4:2:2 YUV (YUY2) for example (section 2.2. of the link i gave you):
line 1: Y1C1 Y2C2 Y3C3 Y4C4
line 2: Y1C1 Y2C2 Y3C3 Y4C4
etc ...
Sampling:
line 1: Y1C1x Y2 Y3C3x Y4
line 2: Y1C1x Y2 Y3C3x Y4
with
C1x = (C1+C1+C2)/3 // edge
C3x = (C2+C3+C3+C4)/4
C5x = (C4+C5+C5+C6)/4
.
.
Am I correct ??
Thanks,
-vhelp 3479