VideoHelp Forum

  1. Member
    Join Date
    Jun 2018
    Location
    East Coast
    I'd appreciate some clarification about digital video capture. A typical consumer camcorder might record 1080p at 30 fps (frames per second) and 60 Mbps (megabits per second). Each frame lasts 1/30 sec, so there are 60/30 = 2 Mb available per frame. A 1080p frame has roughly 2 million pixels, so on average only about one bit of data per pixel. Under the same conditions, a 4K frame, which has four times as many pixels, would average roughly a quarter of a bit per pixel.

    I understand that there is compression going on, but that doesn’t seem like a lot of data for each frame of color video. So how do they do it and end up with what looks like high quality video? Thanks.
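    The back-of-the-envelope arithmetic above can be checked directly (the 60 Mbps / 30 fps figures are the hypothetical values from the post, not a real camera's spec):

    ```python
    # Bits-per-pixel budget for the numbers in the post (hypothetical values).
    bitrate_bps = 60_000_000            # 60 Mbps
    fps = 30
    bits_per_frame = bitrate_bps / fps  # 2,000,000 bits per frame

    pixels_1080p = 1920 * 1080          # ~2.07 million pixels
    pixels_4k = 3840 * 2160             # four times as many

    print(bits_per_frame / pixels_1080p)  # ~0.96 bits per pixel
    print(bits_per_frame / pixels_4k)     # ~0.24 bits per pixel
    ```
    
    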
    Last edited by RealImage; 20th Jun 2018 at 15:10.
  2. Chroma subsampling: The human eye has less color resolution than greyscale resolution. So RGB video is converted to a greyscale image and two color add/subtract images. The color images are reduced in resolution by half in each dimension. So a 1920x1080 video has a 1920x1080 greyscale (luma) channel, and two 960x540 color (chroma) channels. This reduces the size of the data per frame by half.
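    You can verify the "half the data" claim by counting samples (this is the 4:2:0 layout described above):

    ```python
    # Count samples in full RGB vs. 4:2:0 (one full luma plane + two
    # quarter-size chroma planes, as described in the post).
    w, h = 1920, 1080
    full_rgb = 3 * w * h               # three full-resolution planes
    luma = w * h                       # full-resolution Y plane
    chroma = 2 * (w // 2) * (h // 2)   # two 960x540 Cb/Cr planes
    yuv420 = luma + chroma
    print(yuv420 / full_rgb)  # 0.5 -> exactly half the samples
    ```
    
    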

    Interframe encoding: Most frames are not encoded as entire pictures but only the difference between that frame and other frames. So in a talking head shot maybe only the speaker's lips move from one frame to the next. So only that small part of the picture needs to be encoded for that frame. I.e. "copy the last frame, just change this small section..."
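    A toy sketch of the idea, using a flat list standing in for frame pixels (real codecs work on blocks, not single pixels):

    ```python
    # Interframe sketch: store only what changed since the previous frame.
    prev = [0] * 16
    curr = list(prev)
    curr[5] = 9   # "the speaker's lips moved"
    curr[6] = 7

    # Encoder: record just the differing positions and their new values.
    delta = {i: v for i, (p, v) in enumerate(zip(prev, curr)) if p != v}
    print(delta)  # two entries instead of sixteen pixels

    # Decoder: "copy the last frame, just change this small section..."
    rebuilt = list(prev)
    for i, v in delta.items():
        rebuilt[i] = v
    print(rebuilt == curr)
    ```
    
    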

    Discrete Cosine Transform: Blocks of pixels are converted from the spatial domain to the frequency domain. Low amplitude, high frequency data can be removed without adversely affecting the picture. The remaining frequency data is more easily compressed later (see below).
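    A minimal, unoptimized 1-D DCT-II (codecs apply this along the rows and columns of 8x8 blocks; the pixel values here are made up):

    ```python
    import math

    def dct(xs):
        """Naive 1-D DCT-II, unscaled: coefficient k measures how much
        of frequency k is present in the input samples."""
        n = len(xs)
        return [sum(x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, x in enumerate(xs))
                for k in range(n)]

    block = [50, 55, 61, 66, 70, 61, 64, 73]   # one row of pixel values
    coeffs = dct(block)
    print([round(c, 1) for c in coeffs])
    # Smooth image data concentrates energy in the first (low-frequency)
    # coefficients; the small high-frequency ones can be zeroed or coarsely
    # quantized with little visible effect.
    ```
    
    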

    Motion vectors: Often in video something moves from one location to another. Motion vectors, "move this block of pixels from here to there...", are used to reduce the amount of information needed for successive frames.
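    A toy block-matching search, the usual way motion vectors are found (exhaustive SAD search over a tiny 4x4 frame; real encoders search larger windows with smarter heuristics):

    ```python
    # Find where the 2x2 block at (1,1) in the previous frame moved to
    # in the current frame, by minimizing the sum of absolute differences.
    prev = [[0, 0, 0, 0],
            [0, 9, 8, 0],
            [0, 7, 6, 0],
            [0, 0, 0, 0]]
    curr = [[0, 0, 0, 0],   # same block, shifted right 1 and down 1
            [0, 0, 0, 0],
            [0, 0, 9, 8],
            [0, 0, 7, 6]]

    def sad(y, x):
        """Sum of absolute differences between prev's block at (1,1)
        and curr's block at (y,x)."""
        return sum(abs(prev[1 + r][1 + c] - curr[y + r][x + c])
                   for r in range(2) for c in range(2))

    best = min(((y, x) for y in range(3) for x in range(3)),
               key=lambda p: sad(*p))
    vector = (best[0] - 1, best[1] - 1)
    print(vector)  # (1, 1): "move this block of pixels from here to there"
    ```
    
    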

    Entropy encoding: After all the other compression techniques are used the final step is to losslessly compress what's left. Much like a ZIP file is losslessly compressed.
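    The standard library's zlib makes the point (it stands in here for the Huffman/arithmetic coders real codecs use; the quantized values are made up, but the long run of zeros is typical after DCT and quantization):

    ```python
    import zlib

    # One "quantized block": a few significant low-frequency values
    # followed by a long run of zeros.
    quantized = bytes([12, 5, 3, 1] + [0] * 60)

    packed = zlib.compress(quantized)
    print(len(quantized), len(packed))   # far fewer bytes after packing

    restored = zlib.decompress(packed)
    print(restored == quantized)         # lossless: exact bytes come back
    ```
    
    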
    Last edited by jagabo; 20th Jun 2018 at 13:29.
  3. Member
    Join Date
    Jun 2018
    Location
    East Coast
    Interesting and thank you. I can see where a head shot would not require a lot of data from one frame to the next. What about, for instance, a landscape video with trees and vegetation moving in the wind? In that case, almost everything is changing, in a somewhat random way, from one frame to the next. It's impressive to think that only 2 million bits of data is sufficient to more or less faithfully capture each frame of such scenery.
  4. Member
    Join Date
    Nov 2007
    Location
    Minneapolis MN
    Originally Posted by RealImage View Post
    It's impressive to think that only 2 million bits of data is sufficient to more or less faithfully capture each frame of such scenery.
    Actually our ATSC 1.0 is far from faithful when capturing things like a babbling brook or a wheat field blowing in the wind. If you look carefully (most of the time it's not even that hard to spot) you'll see lots of macroblocking in scenes like that. Another thing that's almost impossible to reproduce without macroblocking (and most of the time it's god awful!) is strobes. Things like the Grammys look like absolute garbage once the strobes start, with our current ATSC 1.0. I hope ATSC 3.0 will be better, but I'm skeptical.......
  5. Originally Posted by RealImage View Post
    Interesting and thank you. I can see where a head shot would not require a lot of data from one frame to the next. What about, for instance, a landscape video with trees and vegetation moving in the wind? In that case, almost everything is changing, in a somewhat random way, from one frame to the next. It's impressive to think that only 2 million bits of data is sufficient to more or less faithfully capture each frame of such scenery.
    Yes, more "complex" frames require more bits. This is partially handled by using variable bitrates -- some shots are given more bitrate, some less, as required. But when the complexity of the frame exceeds the allowable bitrate, picture quality starts to suffer. You'll see more loss of detail, blocky artifacts, blurring, etc.
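    A toy sketch of the variable-bitrate idea: a fixed bit budget is split across frames in proportion to estimated complexity (all numbers here are made up for illustration):

    ```python
    # Variable bitrate: give complex frames a bigger slice of a fixed budget.
    budget = 6_000_000                   # bits available for this group of frames
    complexity = [1.0, 0.5, 3.5, 1.0]    # e.g. talking head vs. swaying trees

    total = sum(complexity)
    bits = [round(budget * c / total) for c in complexity]
    print(bits)  # the complex frame gets the lion's share
    ```

    When even the biggest slice isn't enough for a frame's complexity, the encoder quantizes more coarsely and the blocking and blurring described above appear.
    
    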
