VideoHelp Forum
  1. Right now I'm working on a project about "Performance comparison of AV1, VP9, HEVC, AVC, THOR".

    I have some questions. I have read a lot of websites and papers, but I can't find good answers to them.

    I'm using the HEVC HM reference software, the AVC JM reference software, AOM AV1, and VP9.

    Kubuntu 18.04, Intel® Core™ i7-3610QM, 6 GB RAM, AMD 7670M 2 GB DDR3

    Questions:

    - Where can I get 8K YUV? I have only found 8K 360 VR.

    - When I'm using video at 120 fps, should I tell the encoder it's 120 fps? If I enter 50 fps instead, is that a problem, and will I get wrong readings? I'm asking because the AVC JM encoder has a frame rate limit of 100 fps.

    - Why is there no sound in YUV files?

    - What is the better YUV player? I'm using YUView and Vooya, and sometimes I need to edit the settings to make them read the YUV file.

    - I read some IEEE papers that compare the compression rate. When I encode AV1 or VP9 I get a WebM file; with JM I get a .yuv and a .264 for AVC, and with HM I get a .bin for HEVC.
    So the question is: is the size of these files the final result? And how can I convert them to MP4 or MKV?

    - I will compare them with the same settings, then change QP to different values, and then calculate PSNR, SSIM, MS-SSIM, and VQM. Is this the right way to compare them?

    - What are your thoughts on "VMAF - Video Multi-Method Assessment Fusion" from Netflix? Do you recommend using it for the comparison?

    - What is the highest setting for each one of them?

    - I read a paper on a performance comparison between HEVC and AVC, and the author used different QP values rather than the same one, e.g. AVC QP=30 and HEVC QP=29. Is this normal?

    - What happens to the sound in the compression process?

    - Is it good to use all the CPU power? I heard that speeding up the process is not good for video quality.

    - I know you can encode AV1 with FFmpeg or with the AOMedia source code. Which is better?

    Sorry for all these questions, but I haven't found a good site that answers them.
    Last edited by rockerovo; 23rd Jul 2018 at 07:29.
  2. Member, Join Date: Aug 2013, Location: Central Germany
    Don't bother encoding UHD resolutions when your PC has only 6 GB RAM. With 8K UHD sources you would wish you had 32 GB; encoding 4K UHD with x265 already needs 16 GB to avoid swapping to your hard disk all day long.

    Yes, the encoder should know the frame rate. As far as I remember, x264/x265 adjust the meaning of CRF in relation to it, because the shorter a frame is visible, the less you will notice flaws in it.

    YUV is a color space for video frames, similar to RGB. Video only. No audio. But you are testing video encoders only, anyway; none of them will process audio. It will be ignored.

    Video only encoders may produce only raw video streams if they don't contain a multiplexer to wrap them with a container (the raw stream format for VPx and AOM is IVF). If you want to compare the quality at a specific size, you will want to compare the size of the raw video stream only, additional container headers would produce wrong results. To watch the videos in players which need containers (some players can't identify raw video streams properly), use the usual recommended multiplexers (e.g. GPAC MP4Box to multiplex AVC or HEVC into MP4, or MKVtoolnix to multiplex anything into MKV).
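
    As a rough sketch of the multiplexing step, the snippet below only builds the command lines for MP4Box and mkvmerge; it does not run them. It assumes both tools are installed, and the file name `hevc_1080_29.265` is a hypothetical rename of HM's `.bin` output so that the tools can recognize the stream type by extension. Raw streams carry no reliable timing, so the frame rate is passed explicitly.

```python
from pathlib import Path


def mux_commands(raw_stream: str, fps: int) -> dict:
    """Build (not run) argument lists for wrapping a raw AVC/HEVC stream.

    Assumption: MP4Box (GPAC) and mkvmerge (MKVToolNix) are on PATH.
    """
    stem = Path(raw_stream).stem
    return {
        # MP4Box: -add imports the stream, ":fps=" sets the import frame rate,
        # -new forces creation of a fresh output file.
        "mp4": ["MP4Box", "-add", f"{raw_stream}:fps={fps}", "-new", f"{stem}.mp4"],
        # mkvmerge: --default-duration 0:<n>fps sets the frame rate of track 0.
        "mkv": ["mkvmerge", "-o", f"{stem}.mkv",
                "--default-duration", f"0:{fps}fps", raw_stream],
    }


cmds = mux_commands("hevc_1080_29.265", fps=120)  # hypothetical file name
print(" ".join(cmds["mp4"]))
print(" ".join(cmds["mkv"]))
```

    For the size comparison itself, the byte count of the raw stream remains the number to report; the muxed files are only for playback.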

    The subjectively optimal way to compare videos is to produce raw streams of the same output size, as closely as possible, and then watch them in comparison to the original video without knowing which is which (ABX blind test), rating how annoying the differences appear, ideally with thousands of participants. Objective metrics that calculate a difference between the original and the encoded-and-restored result hardly get close to the average opinion of a variety of people. But if you don't have a choice ... SSIM (and its variations), VQM, and VMAF are some of the best objective metrics we have; PSNR is known to be fooled easily with academic samples.
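
    For reference, PSNR is just a logarithmic rescaling of the mean squared error between the original and the decoded picture, which is part of why it can disagree with what viewers see. A minimal sketch, assuming 8-bit samples by default and plain Python sequences as input (the function names are illustrative, not taken from any encoder):

```python
import math


def mse(ref, rec) -> float:
    """Mean squared error between two equally long sample sequences."""
    if len(ref) != len(rec):
        raise ValueError("reference and reconstruction differ in length")
    return sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)


def psnr_from_mse(err: float, bit_depth: int = 8) -> float:
    """PSNR in dB for a given mean squared error and sample bit depth."""
    if err <= 0:
        return math.inf  # identical pictures have no finite PSNR
    peak = (1 << bit_depth) - 1  # 255 for 8-bit video
    return 10.0 * math.log10(peak * peak / err)


# Made-up luma samples, only to show the call sequence.
original = [16, 128, 235, 90]
decoded = [17, 126, 235, 91]
print(f"{psnr_from_mse(mse(original, decoded)):.2f} dB")
```

    Encoders usually report it per plane (Y, U, V) and per frame; how the per-frame values are averaged over a sequence differs between tools, so it is worth stating which average you used.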

    Performance comparisons are a complicated topic. One codec can be more efficient than another because it does not do the same calculations; but how do we compare the speed of obviously different algorithms when they simply don't do the same work? It's hardly possible to make two codecs spend the same calculation effort. I would not even try to achieve that. You can, of course, report that the range of available presets, as designed by the developers of each codec, may result in encoding times that differ by orders of magnitude. But it is better to stay vague and not claim a certain speed without naming the specific hardware.

    A high CPU utilization is good, it means good parallelization, thus good efficiency. But different algorithms are more or less parallelizable, and some codecs use heavily parallelized SIMD / Vector instructions which are not reported as utilizing all cores equally. It even depends on the video material. So don't compare CPU utilization values across several different codecs; you can't get fair results.
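
    If timing is reported at all, one hedged way to do it is to record wall-clock time and CPU time for each encoder run on named hardware, rather than a utilization percentage. The sketch below assumes a Unix-like system (it uses the standard `resource` module, which is not available on Windows); the command it times in the demo is just a placeholder, not an encoder.

```python
import resource
import subprocess
import sys
import time


def run_and_time(argv):
    """Run a command and return (wall_seconds, cpu_seconds) for the child process."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # CPU time = user + system time accumulated by finished child processes.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return wall, cpu


if __name__ == "__main__":
    # Placeholder workload; replace with the real encoder command line.
    wall, cpu = run_and_time([sys.executable, "-c", "sum(range(10**6))"])
    print(f"wall {wall:.3f} s, cpu {cpu:.3f} s, avg cores busy {cpu / wall:.2f}")
```

    The ratio of CPU time to wall time gives the average number of busy cores, which says something about parallelization but nothing about quality per bit.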

    It doesn't matter whether you use a separate aomenc encoder or ffmpeg if both contain exactly the same libaom codec core.
    Last edited by LigH.de; 23rd Jul 2018 at 08:04.
  3. Originally Posted by rockerovo
    Right now I'm working on a project about "Performance comparison of AV1, VP9, HEVC, AVC, THOR".

    I have some questions. I have read a lot of websites and papers, but I can't find good answers to them.

    I'm using the HEVC HM reference software, the AVC JM reference software, AOM AV1, and VP9.

    Kubuntu 18.04, Intel® Core™ i7-3610QM, 6 GB RAM, AMD 7670M 2 GB DDR3

    Questions:
    Based on your questions, I would advise you to gain some background knowledge first, for example by reading video codec developer blogs:
    https://medium.com/@luc.trudeau

    Codec testing can be done in many ways, but the kind of testing you have chosen seems to be the most difficult one.


    Originally Posted by rockerovo
    - Where can I get 8K YUV? I have only found 8K 360 VR.
    You can shoot 8K yourself, or ask camera manufacturers, optical sensor manufacturers, etc. Computer-generated graphics (synthetic patterns and normal graphics) are also an option.

    Originally Posted by rockerovo
    - When I'm using video at 120 fps, should I tell the encoder it's 120 fps? If I enter 50 fps instead, is that a problem, and will I get wrong readings? I'm asking because the AVC JM encoder has a frame rate limit of 100 fps.
    It is not important. If your goal is to compare codecs, then use the same settings for each of them; the common denominator should be the codec with the lowest capabilities. Personally, I doubt that any of them can deliver more than 5 fps at 8K, so real-time encoding speed is not feasible.

    Originally Posted by rockerovo
    - Why is there no sound in YUV files?
    Because a video codec is a video codec: only video data is required, and YUV is raw video data. I'm not aware of any video codec that stores audio data.

    Originally Posted by rockerovo
    - What is the better YUV player? I'm using YUView and Vooya, and sometimes I need to edit the settings to make them read the YUV file.
    You can use ffplay, or you can write your own YUV player (YUV is simply data with a known, regular structure).
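
    To illustrate that regular structure: a raw .yuv file has no header, which is why players need the resolution and pixel format entered by hand. A minimal sketch of a frame reader, assuming 8-bit planar 4:2:0 (a full-size Y plane followed by quarter-size U and V planes); the function names are illustrative only:

```python
def frame_size_420(width: int, height: int) -> int:
    """Bytes per frame of 8-bit planar YUV 4:2:0."""
    luma = width * height
    chroma = ((width + 1) // 2) * ((height + 1) // 2)  # one of the two chroma planes
    return luma + 2 * chroma


def split_frame(buf: bytes, width: int, height: int):
    """Split one frame's bytes into its (Y, U, V) planes."""
    y_len = width * height
    c_len = ((width + 1) // 2) * ((height + 1) // 2)
    return buf[:y_len], buf[y_len:y_len + c_len], buf[y_len + c_len:y_len + 2 * c_len]


def read_frames(path: str, width: int, height: int):
    """Yield (Y, U, V) planes for every complete frame in a raw .yuv file."""
    size = frame_size_420(width, height)
    with open(path, "rb") as f:
        while True:
            buf = f.read(size)
            if len(buf) < size:  # end of file or truncated last frame
                break
            yield split_frame(buf, width, height)


print(frame_size_420(1920, 1080), "bytes per 1080p frame")
```

    Dividing the file size by the frame size gives the number of frames, which is a quick sanity check that the assumed resolution and format are right.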

    Originally Posted by rockerovo
    - I read some IEEE papers that compare the compression rate. When I encode AV1 or VP9 I get a WebM file; with JM I get a .yuv and a .264 for AVC, and with HM I get a .bin for HEVC.
    So the question is: is the size of these files the final result? And how can I convert them to MP4 or MKV?
    You need to use a multiplexer that understands your raw codec stream to create the desired multimedia container. For video comparison, however, I consider this an unnecessary exercise.

    Originally Posted by rockerovo
    - I will compare them with the same settings, then change QP to different values, and then calculate PSNR, SSIM, MS-SSIM, and VQM. Is this the right way to compare them?
    I have no clue what you wish to compare, as most of those codecs are immature and suffer from many problems. Personally, I think (based on your questions) that a thorough comparison may not be possible. You should definitely focus on gaining some basic knowledge before starting such a difficult task. I would not dare to compare those codecs without coding experience and experience with codec implementations, as most of them are available as plain C code that focuses on aspects other than speed.

    Originally Posted by rockerovo
    - What are your thoughts on "VMAF - Video Multi-Method Assessment Fusion" from Netflix? Do you recommend using it for the comparison?
    VMAF focuses on video (picture and motion), whereas PSNR, SSIM, and similar metrics focus on picture quality. No clue whether VMAF was trained for 8K video (AFAIR it was trained for HD only).

    Originally Posted by rockerovo
    - What is the highest setting for each one of them?
    I don't understand your question.


    Originally Posted by rockerovo
    - I read a paper on a performance comparison between HEVC and AVC, and the author used different QP values rather than the same one, e.g. AVC QP=30 and HEVC QP=29. Is this normal?
    No clue. IMHO, with a different codec structure each codec may have a different QP. That's why I wrote that I would not dare to perform a codec comparison without detailed knowledge of how each codec is designed: you need to truly understand how a codec processes data in order to evaluate it, and this is very challenging even for people with many years of experience.

    Originally Posted by rockerovo
    - What happens to the sound in the compression process?
    Nothing. Sound is processed completely independently of video. Audio and video need to be combined and synchronized, but those things are not controlled by the video codec.

    Originally Posted by rockerovo
    - Is it good to use all the CPU power? I heard that speeding up the process is not good for video quality.
    You need to know how the codec is designed. Every codec is designed in a particular way, and some of them can use more cores, for example.
    So a real codec comparison would rather require counting the CPU cycles spent in every codec block. You need to be familiar with development tools and code profilers (if you have this kind of knowledge, you can easily work for some company for $150k+ a year).

    Originally Posted by rockerovo
    - I know you can encode AV1 with FFmpeg or with the AOMedia source code. Which is better?
    You can compare them both... There is also another AV1 encoder, rav1e: https://github.com/xiph/rav1e
  4. Originally Posted by LigH.de View Post
    Do not care about encoding UHD resolutions when your PC has only 6 GB RAM. You wish you had 32 GB if you had 8K UHD sources, you already need 16 GB for 4K UHD to be encoded with x265 to avoid swapping to your harddisk all day long.
    Hi,

    Thank you for the detailed and informative answer.
    However, I'm only encoding short files (5-10 seconds). I have already encoded some files, and I haven't had any problems with 6 GB RAM.
  5. You can shoot 8K yourself, or ask camera manufacturers, optical sensor manufacturers, etc. Computer-generated graphics (synthetic patterns and normal graphics) are also an option.
    Okay, how can I convert Canon raw "CRW" to YUV?

    Thank you for your response.
  6. Originally Posted by rockerovo
    Okay, how can I convert Canon raw "CRW" to YUV?
    https://www.lifewire.com/crw-file-2620390
    http://rawtherapee.com/
    https://helpx.adobe.com/photoshop/using/adobe-dng-converter.html
    http://www.cybercom.net/~dcoffin/dcraw/

    Or use the software provided by Canon (such as the newer Canon Digital Photo Professional).
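
    Raw developers like these produce RGB images, so one more step is needed to reach YUV. In practice a tool such as ffmpeg would do that conversion plus the 4:2:0 chroma subsampling; the sketch below only shows the per-pixel math, and it assumes BT.709 coefficients and 8-bit limited ("video") range, which may not match what a particular test requires.

```python
def rgb_to_ycbcr709(r: int, g: int, b: int) -> tuple:
    """8-bit full-range R'G'B' -> 8-bit limited-range Y'CbCr (BT.709 assumed)."""
    kr, kb = 0.2126, 0.0722          # BT.709 luma coefficients
    kg = 1.0 - kr - kb
    rn, gn, bn = r / 255.0, g / 255.0, b / 255.0
    y = kr * rn + kg * gn + kb * bn              # luma, 0..1
    cb = (bn - y) / (2.0 * (1.0 - kb))           # blue difference, -0.5..0.5
    cr = (rn - y) / (2.0 * (1.0 - kr))           # red difference, -0.5..0.5

    def clamp(v: float) -> int:
        return max(0, min(255, int(round(v))))

    # Limited range: luma 16..235, chroma 16..240 centered on 128.
    return clamp(16 + 219 * y), clamp(128 + 224 * cb), clamp(128 + 224 * cr)


print(rgb_to_ycbcr709(255, 255, 255))
print(rgb_to_ycbcr709(0, 0, 0))
```

    Whatever conversion is used, it should be the same for every codec under test, since all of them are then fed the identical YUV source.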
  7. Hi, sorry for asking so many questions. Right now I'm encoding a 1080p file with:
    HEVC: HM Reference Software
    AVC: JM Reference Software
    VP9: FFmpeg (libvpx-vp9)

    The cfg settings for the HM encoder:

    Code:
    #======== File I/O ===============
    InputBitDepth                 : 8          # Input bitdepth
    InputChromaFormat             : 420         # Ratio of luminance to chrominance samples
    FrameRate                     : 120          # Frame Rate per second
    FrameSkip                     : 0           # Number of frames to be skipped in input
    SourceWidth                   : 1920        # Input  frame width
    SourceHeight                  : 1080        # Input  frame height
    FramesToBeEncoded             : 10         # Number of frames to be coded
    
    PrintFrameMSE         : 1
    PrintSequenceMSE        : 1
    
    #======== Profile ================
    Profile                       : main
    Level                         : 5.2
    
    #======== Unit definition ================
    MaxCUWidth                    : 64          # Maximum coding unit width in pixel
    MaxCUHeight                   : 64          # Maximum coding unit height in pixel
    MaxPartitionDepth             : 4           # Maximum coding unit depth
    QuadtreeTULog2MaxSize         : 5           # Log2 of maximum transform size for
                                                # quadtree-based TU coding (2...6)
    QuadtreeTULog2MinSize         : 2           # Log2 of minimum transform size for
                                                # quadtree-based TU coding (2...6)
    QuadtreeTUMaxDepthInter       : 3
    QuadtreeTUMaxDepthIntra       : 3
    
    #======== Coding Structure =============
    IntraPeriod                   : 32          # Period of I-Frame ( -1 = only first)
    DecodingRefreshType           : 1           # Random Accesss 0:none, 1:CRA, 2:IDR, 3:Recovery Point SEI
    GOPSize                       : 8           # GOP Size (number of B slice = GOPSize-1)
    ReWriteParamSetsFlag          : 1           # Write parameter sets with every IRAP
    
    IntraQPOffset                 : -3
    LambdaFromQpEnable            : 1           # see JCTVC-X0038 for suitable parameters for IntraQPOffset, QPoffset, QPOffsetModelOff, QPOffsetModelScale when enabled
    #        Type POC QPoffset QPOffsetModelOff QPOffsetModelScale CbQPoffset CrQPoffset QPfactor tcOffsetDiv2 betaOffsetDiv2 temporal_id #ref_pics_active #ref_pics reference pictures     predict deltaRPS #ref_idcs reference idcs 
    Frame1:  B    8   1        0.0                      0.0        0          0          0.442    0            0              0           2                3         -8 -12 -16             0
    Frame2:  B    4   2        0.0                      0.0        0          0          0.3536   0            0              1           2                3         -4  -8   4             1       4        4         1 1 0 1
    Frame3:  B    2   3        0.0                      0.0        0          0          0.3536   0            0              2           2                4         -2  -6   2 6           1       2        4         1 1 1 1
    Frame4:  B    1   4        0.0                      0.0        0          0          0.68     0            0              3           2                4         -1   1   3 7           1       1        5         1 0 1 1 1
    Frame5:  B    3   4        0.0                      0.0        0          0          0.68     0            0              3           2                4         -1  -3   1 5           1      -2        5         1 1 1 1 0
    Frame6:  B    6   3        0.0                      0.0        0          0          0.3536   0            0              2           2                3         -2  -6   2             1      -3        5         0 1 1 1 0
    Frame7:  B    5   4        0.0                      0.0        0          0          0.68     0            0              3           2                4         -1  -5   1 3           1       1        4         1 1 1 1
    Frame8:  B    7   4        0.0                      0.0        0          0          0.68     0            0              3           2                4         -1  -3  -7 1           1      -2        5         1 1 1 1 0 
    
    #=========== Motion Search =============
    FastSearch                    : 1           # 0:Full search  1:TZ search
    SearchRange                   : 256         # (0: Search range is a Full frame)
    BipredSearchRange             : 4           # Search range for bi-prediction refinement
    HadamardME                    : 1           # Use of hadamard measure for fractional ME
    FEN                           : 1           # Fast encoder decision
    FDM                           : 1           # Fast Decision for Merge RD cost
    
    #======== Quantization =============
    QP                            : 29          # Quantization parameter(0-51)
    MaxDeltaQP                    : 0           # CU-based multi-QP optimization
    MaxCuDQPDepth                 : 0           # Max depth of a minimum CuDQP for sub-LCU-level delta QP
    DeltaQpRD                     : 0           # Slice-based multi-QP optimization
    RDOQ                          : 1           # RDOQ
    RDOQTS                        : 1           # RDOQ for transform skip
    SliceChromaQPOffsetPeriodicity: 0           # Used in conjunction with Slice Cb/Cr QpOffsetIntraOrPeriodic. Use 0 (default) to disable periodic nature.
    SliceCbQpOffsetIntraOrPeriodic: 0           # Chroma Cb QP Offset at slice level for I slice or for periodic inter slices as defined by SliceChromaQPOffsetPeriodicity. Replaces offset in the GOP table.
    SliceCrQpOffsetIntraOrPeriodic: 0           # Chroma Cr QP Offset at slice level for I slice or for periodic inter slices as defined by SliceChromaQPOffsetPeriodicity. Replaces offset in the GOP table.
    
    #=========== Deblock Filter ============
    LoopFilterOffsetInPPS         : 1           # Dbl params: 0=varying params in SliceHeader, param = base_param + GOP_offset_param; 1 (default) =constant params in PPS, param = base_param)
    LoopFilterDisable             : 0           # Disable deblocking filter (0=Filter, 1=No Filter)
    LoopFilterBetaOffset_div2     : 0           # base_param: -6 ~ 6
    LoopFilterTcOffset_div2       : 0           # base_param: -6 ~ 6
    DeblockingFilterMetric        : 0           # blockiness metric (automatically configures deblocking parameters in bitstream). Applies slice-level loop filter offsets (LoopFilterOffsetInPPS and LoopFilterDisable must be 0)
    
    #=========== Misc. ============
    InternalBitDepth              : 8           # codec operating bit-depth
    
    #=========== Coding Tools =================
    SAO                           : 1           # Sample adaptive offset  (0: OFF, 1: ON)
    AMP                           : 1           # Asymmetric motion partitions (0: OFF, 1: ON)
    TransformSkip                 : 1           # Transform skipping (0: OFF, 1: ON)
    TransformSkipFast             : 1           # Fast Transform skipping (0: OFF, 1: ON)
    SAOLcuBoundary                : 0           # SAOLcuBoundary using non-deblocked pixels (0: OFF, 1: ON)
    
    #============ Slices ================
    SliceMode                : 0                # 0: Disable all slice options.
                                                # 1: Enforce maximum number of LCU in an slice,
                                                # 2: Enforce maximum number of bytes in an 'slice'
                                                # 3: Enforce maximum number of tiles in a slice
    SliceArgument            : 1500             # Argument for 'SliceMode'.
                                                # If SliceMode==1 it represents max. SliceGranularity-sized blocks per slice.
                                                # If SliceMode==2 it represents max. bytes per slice.
                                                # If SliceMode==3 it represents max. tiles per slice.
    
    LFCrossSliceBoundaryFlag : 1                # In-loop filtering, including ALF and DB, is across or not across slice boundary.
                                                # 0:not across, 1: across
    
    #============ PCM ================
    PCMEnabledFlag                      : 0                # 0: No PCM mode
    PCMLog2MaxSize                      : 5                # Log2 of maximum PCM block size.
    PCMLog2MinSize                      : 3                # Log2 of minimum PCM block size.
    PCMInputBitDepthFlag                : 1                # 0: PCM bit-depth is internal bit-depth. 1: PCM bit-depth is input bit-depth.
    PCMFilterDisableFlag                : 0                # 0: Enable loop filtering on I_PCM samples. 1: Disable loop filtering on I_PCM samples.
    
    #============ Tiles ================
    TileUniformSpacing                  : 0                # 0: the column boundaries are indicated by TileColumnWidth array, the row boundaries are indicated by TileRowHeight array
                                                           # 1: the column and row boundaries are distributed uniformly
    NumTileColumnsMinus1                : 0                # Number of tile columns in a picture minus 1
    TileColumnWidthArray                : 2 3              # Array containing tile column width values in units of CTU (from left to right in picture)   
    NumTileRowsMinus1                   : 0                # Number of tile rows in a picture minus 1
    TileRowHeightArray                  : 2                # Array containing tile row height values in units of CTU (from top to bottom in picture)
    
    LFCrossTileBoundaryFlag             : 1                # In-loop filtering is across or not across tile boundary.
                                                           # 0:not across, 1: across 
    
    #============ WaveFront ================
    WaveFrontSynchro                    : 0                # 0:  No WaveFront synchronisation (WaveFrontSubstreams must be 1 in this case).
                                                           # >0: WaveFront synchronises with the LCU above and to the right by this many LCUs.
    
    #=========== Quantization Matrix =================
    ScalingList                   : 0                      # ScalingList 0 : off, 1 : default, 2 : file read
    ScalingListFile               : scaling_list.txt       # Scaling List file name. If file is not exist, use Default Matrix.
    
    #============ Lossless ================
    TransquantBypassEnableFlag : 0                         # Value of PPS flag.
    CUTransquantBypassFlagForce: 0                         # Force transquant bypass mode, when transquant_bypass_enable_flag is enabled
    
    #============ Rate Control ======================
    RateControl                         : 0                # Rate control: enable rate control
    TargetBitrate                       : 1000000          # Rate control: target bitrate, in bps
    KeepHierarchicalBit                 : 2                # Rate control: 0: equal bit allocation; 1: fixed ratio bit allocation; 2: adaptive ratio bit allocation
    LCULevelRateControl                 : 1                # Rate control: 1: LCU level RC; 0: picture level RC
    RCLCUSeparateModel                  : 1                # Rate control: use LCU level separate R-lambda model
    InitialQP                           : 0                # Rate control: initial QP
    RCForceIntraQP                      : 0                # Rate control: force intra QP to be equal to initial QP
    
    ### DO NOT ADD ANYTHING BELOW THIS LINE ###
    ### DO NOT DELETE THE EMPTY LINE BELOW ###
    Then I ran this command:

    Code:
     ./TAppEncoderStatic -c encoder_randomaccess_main.cfg -i /home/siraj/Desktop/Project/Samples/Jockey_1920x1080_120fps_420_8bit_YUV.yuv -b hevc_1080_29.bin -o hevc_1080_29.yuv >> hevc_1080_29.txt
    The "hevc_1080_29.txt" output:

    Code:
    HM software: Encoder Version [16.18] (including RExt)[Linux][GCC 7.3.0][64 bit] 
    
    
    Input          File                    : /home/siraj/Desktop/Project/Samples/Jockey_1920x1080_120fps_420_8bit_YUV.yuv
    Bitstream      File                    : hevc_1080_29.bin
    Reconstruction File                    : hevc_1080_29.yuv
    Real     Format                        : 1920x1080 120Hz
    Internal Format                        : 1920x1080 120Hz
    Sequence PSNR output                   : Linear average only
    Sequence MSE output                    : Enabled
    Frame MSE output                       : Enabled
    MS-SSIM output                         : Disabled
    Cabac-zero-word-padding                : Enabled
    Frame/Field                            : Frame based coding
    Frame index                            : 0 - 9 (10 frames)
    Profile                                : main
    CU size / depth / total-depth          : 64 / 4 / 4
    RQT trans. size (min / max)            : 4 / 32
    Max RQT depth inter                    : 3
    Max RQT depth intra                    : 3
    Min PCM size                           : 8
    Motion search range                    : 256
    Intra period                           : 32
    Decoding refresh type                  : 1
    QP                                     : 29
    Max dQP signaling depth                : 0
    Cb QP Offset                           : 0
    Cr QP Offset                           : 0
    QP adaptation                          : 0 (range=0)
    GOP size                               : 8
    Input bit depth                        : (Y:8, C:8)
    MSB-extended bit depth                 : (Y:8, C:8)
    Internal bit depth                     : (Y:8, C:8)
    PCM sample bit depth                   : (Y:8, C:8)
    Intra reference smoothing              : Enabled
    diff_cu_chroma_qp_offset_depth         : -1
    extended_precision_processing_flag     : Disabled
    implicit_rdpcm_enabled_flag            : Disabled
    explicit_rdpcm_enabled_flag            : Disabled
    transform_skip_rotation_enabled_flag   : Disabled
    transform_skip_context_enabled_flag    : Disabled
    cross_component_prediction_enabled_flag: Disabled
    high_precision_offsets_enabled_flag    : Disabled
    persistent_rice_adaptation_enabled_flag: Disabled
    cabac_bypass_alignment_enabled_flag    : Disabled
    log2_sao_offset_scale_luma             : 0
    log2_sao_offset_scale_chroma           : 0
    Cost function:                         : Lossy coding (default)
    RateControl                            : 0
    WPMethod                               : 0
    Max Num Merge Candidates               : 5
    
    TOOL CFG: IBD:0 HAD:1 RDQ:1 RDQTS:1 RDpenalty:0 LQP:0 SQP:0 ASR:0 MinSearchWindow:8 RestrictMESampling:0 FEN:1 ECU:0 FDM:1 CFM:0 ESD:0 RQT:1 TransformSkip:1 TransformSkipFast:1 TransformSkipLog2MaxSize:2 Slice: M=0 SliceSegment: M=0 CIP:0 SAO:1 PCM:0 TransQuantBypassEnabled:0 WPP:0 WPB:0 PME:2  WaveFrontSynchro:0 WaveFrontSubstreams:1 ScalingList:0 TMVPMode:1 AQpS:0 SignBitHidingFlag:1 RecalQP:0
    
    Non-environment-variable-controlled macros set as follows: 
    
                                    RExt__DECODER_DEBUG_BIT_STATISTICS =   0
                                          RExt__HIGH_BIT_DEPTH_SUPPORT =   0
                                RExt__HIGH_PRECISION_FORWARD_TRANSFORM =   0
                                            O0043_BEST_EFFORT_DECODING =   0
                                             ME_ENABLE_ROUNDING_OF_MVS =   1
    
                       Input ChromaFormatIDC =   4:2:0
           Output (internal) ChromaFormatIDC =   4:2:0
    
    POC    0 TId: 0 ( I-SLICE, nQP 26 QP 26 )     418088 bits [Y 42.8646 dB    U 43.7874 dB    V 44.1285 dB] [Y MSE 3.3621  U MSE 2.7185  V MSE 2.5132] [ET    12 ] [L0 ] [L1 ]
    POC    8 TId: 0 ( B-SLICE, nQP 30 QP 30 )      83864 bits [Y 41.7588 dB    U 43.2854 dB    V 43.7040 dB] [Y MSE 4.3371  U MSE 3.0517  V MSE 2.7713] [ET    27 ] [L0 0 ] [L1 0 ]
    POC    4 TId: 1 ( B-SLICE, nQP 31 QP 31 )      32248 bits [Y 41.7168 dB    U 43.3183 dB    V 43.7337 dB] [Y MSE 4.3792  U MSE 3.0287  V MSE 2.7524] [ET    26 ] [L0 0 8 ] [L1 8 0 ]
    POC    2 TId: 2 ( B-SLICE, nQP 32 QP 32 )      20880 bits [Y 41.8175 dB    U 43.3482 dB    V 43.7739 dB] [Y MSE 4.2789  U MSE 3.0078  V MSE 2.7270] [ET    26 ] [L0 0 4 ] [L1 4 8 ]
    POC    1 TId: 3 ( B-SLICE, nQP 33 QP 33 )      10656 bits [Y 41.9152 dB    U 43.4544 dB    V 43.8358 dB] [Y MSE 4.1837  U MSE 2.9352  V MSE 2.6884] [ET    25 ] [L0 0 2 ] [L1 2 4 ]
    POC    3 TId: 3 ( B-SLICE, nQP 33 QP 33 )      11448 bits [Y 41.6468 dB    U 43.3309 dB    V 43.7687 dB] [Y MSE 4.4504  U MSE 3.0199  V MSE 2.7303] [ET    28 ] [L0 2 0 ] [L1 4 8 ]
    POC    6 TId: 2 ( B-SLICE, nQP 32 QP 32 )      21800 bits [Y 41.5797 dB    U 43.2551 dB    V 43.7030 dB] [Y MSE 4.5197  U MSE 3.0730  V MSE 2.7719] [ET    27 ] [L0 4 0 ] [L1 8 4 ]
    POC    5 TId: 3 ( B-SLICE, nQP 33 QP 33 )      12456 bits [Y 41.5673 dB    U 43.3008 dB    V 43.6896 dB] [Y MSE 4.5326  U MSE 3.0409  V MSE 2.7805] [ET    27 ] [L0 4 0 ] [L1 6 8 ]
    POC    7 TId: 3 ( B-SLICE, nQP 33 QP 33 )      11824 bits [Y 41.5538 dB    U 43.2267 dB    V 43.6695 dB] [Y MSE 4.5467  U MSE 3.0932  V MSE 2.7934] [ET    24 ] [L0 6 4 ] [L1 8 6 ]
    POC    9 TId: 3 ( B-SLICE, nQP 33 QP 33 )      19696 bits [Y 41.4120 dB    U 43.1222 dB    V 43.5762 dB] [Y MSE 4.6977  U MSE 3.1686  V MSE 2.8541] [ET    17 ] [L0 8 ] [L1 8 ]
    
    
    SUMMARY --------------------------------------------------------
    	Total Frames |   Bitrate     Y-PSNR    U-PSNR    V-PSNR    YUV-PSNR  Y-MSE     U-MSE     V-MSE    YUV-MSE 
    	       10    a    7715.5200   41.7833   43.3429   43.7583   42.2823    4.3288    3.0138    2.7383    3.8445
    
    
    I Slices--------------------------------------------------------
    	Total Frames |   Bitrate     Y-PSNR    U-PSNR    V-PSNR    YUV-PSNR  Y-MSE     U-MSE     V-MSE    YUV-MSE 
    	        1    i   50170.5600   42.8646   43.7874   44.1285   43.1985    3.3621    2.7185    2.5132    3.1134
    
    
    P Slices--------------------------------------------------------
    	Total Frames |   Bitrate     Y-PSNR    U-PSNR    V-PSNR    YUV-PSNR  Y-MSE     U-MSE     V-MSE    YUV-MSE 
    	        0    p         -nan      -nan      -nan      -nan      -nan      -nan      -nan      -nan      -nan
    
    
    B Slices--------------------------------------------------------
    	Total Frames |   Bitrate     Y-PSNR    U-PSNR    V-PSNR    YUV-PSNR  Y-MSE     U-MSE     V-MSE    YUV-MSE 
    	        9    b    2998.2933   41.6631   43.2936   43.7172   42.1915    4.4362    3.0466    2.7633    3.9258
    
    RVM: 0.000
    Bytes written to file: 80370 (7715.520 kbps)
    
     Total Time:      239.852 sec.
    Because the encoder is so slow, I chose to encode only 10 frames.
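    As a cross-check on the log above, the dB columns can be re-derived from the MSE columns. This is only a sketch: the helper names are mine, and it assumes 8-bit samples (peak 255) and 4:2:0 sampling, so luma has four times as many samples as each chroma plane. The numbers are copied from the POC 0 / I-slice lines.

```python
import math

def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for a given mean squared error (8-bit peak assumed)."""
    return 10.0 * math.log10(peak * peak / mse)

def yuv_mse_420(y_mse, u_mse, v_mse):
    """Sample-weighted MSE for 4:2:0: Y has 4x the samples of U or V."""
    return (4.0 * y_mse + u_mse + v_mse) / 6.0

# Values copied from the single I slice (POC 0) in the log.
y_mse, u_mse, v_mse = 3.3621, 2.7185, 2.5132
combined = yuv_mse_420(y_mse, u_mse, v_mse)

print(psnr_from_mse(y_mse))     # compare with the logged Y-PSNR (42.8646)
print(combined)                 # compare with the logged YUV-MSE (3.1134)
print(psnr_from_mse(combined))  # compare with the logged YUV-PSNR (43.1985)
```

    Small differences in the last digit are expected, because the MSE values in the log are already rounded.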
    Last edited by rockerovo; 24th Jul 2018 at 12:27.
  8. For AVC, this is the CFG file for JM:

    "attachments"

    Then I ran this command:

    Code:
    $ ./lencod.exe -d mine.cfg >> AVC_1080_29.txt
    AVC_1080_29.txt output:

    Code:
     -------------------------------------------------------------- 
      This file contains statistics for the last encoded sequence   
     -------------------------------------------------------------- 
     Sequence                     : /home/siraj/Desktop/Project/Samples/Jockey_1920x1080_120fps_420_8bit_YUV.yuv
     No.of coded pictures         :   10
     Freq. for encoded bitstream  :  120
     I Slice Bitrate(kb/s)        : 4459.01
     P Slice Bitrate(kb/s)        : 3454.94
     B Slice Bitrate(kb/s)        : 2158.56
     Total Bitrate(kb/s)          : 10076.64
     ME Level 0 Metric            : SAD
     ME Level 1 Metric            : Hadamard SAD
     ME Level 2 Metric            : Hadamard SAD
     Mode Decision Metric         : Hadamard SAD
     ME for components            : Y
     Image format                 : 1920x1080
     Error robustness             : Off
     Search range                 : 32
     Total number of references   : 5
     References for P slices      : 5
     List0 refs for B slices      : 5
     List1 refs for B slices      : 1
     Profile/Level IDC            : (100,52)
     Entropy coding method        : CABAC
     EPZS Pattern                 : Extended Diamond
     EPZS Dual Pattern            : Extended Diamond
     EPZS Fixed Predictors        : Aggressive
     EPZS Aggressive Predictors   : Disabled
     EPZS Temporal Predictors     : Enabled
     EPZS Spatial Predictors      : Enabled
     EPZS Threshold Multipliers   : (1 0 2)
     EPZS Subpel ME               : Basic
     EPZS Subpel ME BiPred        : Basic
     Search range restrictions    : none
     RD-optimized mode decision   : used
    
     ---------------------|----------------|---------------|
         Item             |     Intra      |   All frames  |
     ---------------------|----------------|---------------|
     SNR Y(dB)            | 41.26          | 40.74         |
     SNR U/V (dB)         | 42.23/42.73    | 42.09/42.62   |
     ---------------------|----------------|---------------|
    
     ---------------------|----------------|---------------|---------------|
         SNR              |        I       |       P       |       B       |
     ---------------------|----------------|---------------|---------------|
     SNR Y(dB)            |      41.256    |     40.900    |     40.574    |
     SNR U(dB)            |      42.228    |     42.100    |     42.068    |
     SNR V(dB)            |      42.733    |     42.625    |     42.601    |
     ---------------------|----------------|---------------|---------------|
    
     ---------------------|----------------|---------------|---------------|
         Ave Quant        |        I       |       P       |       B       |
     ---------------------|----------------|---------------|---------------|
            QP            |      29.000    |     29.000    |     29.000    |
     ---------------------|----------------|---------------|---------------|
    
     ---------------------|----------------|
       Intra              |   Mode used    |
     ---------------------|----------------|
     Mode 0  intra 4x4    |    278         |
     Mode 1  intra 8x8    |   6284         |
     Mode 2+ intra 16x16  |   1598         |
     Mode    intra IPCM   |      0         |
     ---------------------|----------------|-----------------|
       P Slice            |   Mode used    | MotionInfo bits |
     ---------------------|----------------|-----------------|
     Mode  0  (copy)      |  16182         |        0.00     |
     Mode  1  (16x16)     |   3621         |     7626.00     |
     Mode  2  (16x8)      |    515         |     2712.00     |
     Mode  3  (8x16)      |    623         |     2770.67     |
     Mode  4  (8x8)       |    238         |     3136.33     |
     Mode  5  intra 4x4   |    111         |-----------------|
     Mode  6  intra 8x8   |   2303         |
     Mode  7+ intra 16x16 |    887         |
     Mode     intra IPCM  |      0         |
     ---------------------|----------------|-----------------|
       B Slice            |   Mode used    | MotionInfo bits |
     ---------------------|----------------|-----------------|
     Mode  0  (copy)      |  42128         |        0.00     |
     Mode  1  (16x16)     |   4937         |     5343.00     |
     Mode  2  (16x8)      |    654         |     1342.00     |
     Mode  3  (8x16)      |    736         |     1661.17     |
     Mode  4  (8x8)       |    212         |      641.50     |
     Mode  5  intra 4x4   |     19         |-----------------|
     Mode  6  intra 8x8   |    150         |
     Mode  7+ intra 16x16 |    124         |
     Mode     intra IPCM  |      0         |
     ---------------------|----------------|
    
     ---------------------|----------------|----------------|----------------|----------------|
      Bit usage:          |      Intra     |      Inter     |    B frame     |    SP frame    |
     ---------------------|----------------|----------------|----------------|----------------|
     Header               |      32.00     |      32.00     |      32.00     |
     Mode                 |   66375.00     |   18769.67     |    7295.83     |
     Motion Info          |        ./.     |   16245.00     |    8987.67     |
     CBP Y/C              |   27092.00     |   10668.00     |    2928.67     |
     Coeffs. Y            |  216124.00     |   41784.67     |    9169.50     |       0.00     |
     Coeffs. C            |   61226.00     |    8127.00     |    1330.67     |       0.00     |
     Coeffs. CB           |       0.00     |       0.00     |       0.00     |       0.00     |
     Coeffs. CR           |       0.00     |       0.00     |       0.00     |       0.00     |
     Delta quant          |     456.00     |     113.00     |      26.67     |
     Stuffing Bits        |       7.00     |       6.33     |       5.67     |
     ---------------------|----------------|----------------|----------------|
     average bits/frame   |  371312.00     |   95745.66     |   29776.67     |
     ---------------------|----------------|----------------|----------------|
    For VP9, I ran these commands:

    Code:
    ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -vf fps=fps=120 -keyint_min 50 -g 50 -pass 1 -passlogfile jockey-2160 -c:v libvpx-vp9 -threads 8 -cpu-used 4 -tile-columns 3 -frame-parallel 1 -b:v 0 -crf 29 -an -f webm -y NUL
    Code:
    ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -vf fps=fps=120 -keyint_min 50 -g 50 -pass 2 -passlogfile jockey-2160 -c:v libvpx-vp9 -threads 8 -cpu-used 3 -tile-columns 3 -frame-parallel 1 -auto-alt-ref 1 -b:v 0 -crf 29 -an -f webm -y jockey.webm
    Then I passed the WebM file to the decoder with this command:

    Code:
    ./vpxdec jockey.webm --i420 -o vp9.yuv
  9. I'm still studying video coding, but I want to know how things work.

    The results, comparing each decoded YUV with the original YUV:

    https://drive.google.com/drive/folders/1kvBg-_FuOkVefb-sylVx2qx_7D6AGEP1?usp=sharing

    As you can see, VP9 looks the best of the three. But I can't really compare JM and HW with FFmpeg's VP9, since VP9 is the fastest right now. How can I make a fair comparison?
    Do you recommend that I switch from HW and JM to x265 and x264?


    One last question: how can I determine the bitrate of a YUV file, so that I can compare the bitrate savings?
  10. Originally Posted by rockerovo View Post
    One last question: how can I determine the bitrate of a YUV file, so that I can compare the bitrate savings?
    I can only provide a correct answer to this question. The YUV (YCbCr) bandwidth calculation is quite simple. For 8-bit YCbCr 4:2:0: (HX*VY*1.5*FPS)/125000 = Mbps, where:
    HX - number of pixels per line
    VY - number of lines
    FPS - frame rate (frames per second)

    For 1920x1080 at 30 fps, the required bandwidth is 746.496 Mbps.
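    The same formula as a small sketch (the function and parameter names are only illustrative; the 1.5 factor is the Y plane plus two quarter-size chroma planes, and 1 Mbit is taken as 1,000,000 bits):

```python
def raw_yuv420_mbps(width, height, fps, bits_per_sample=8):
    """Uncompressed 4:2:0 bitrate in Mbit/s (1 Mbit = 1,000,000 bits)."""
    samples_per_frame = width * height * 1.5  # Y + quarter-size U and V
    return samples_per_frame * bits_per_sample * fps / 1_000_000

print(raw_yuv420_mbps(1920, 1080, 30))   # 746.496, the figure quoted above
print(raw_yuv420_mbps(1920, 1080, 120))  # the 120 fps clips used in this thread
```

    Dividing bytes per second by 125000 and multiplying bits per second by 1/1,000,000 are the same thing for 8-bit samples.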
  11. Originally Posted by pandy View Post
    Originally Posted by rockerovo View Post
    One last question: how can I determine the bitrate of a YUV file, so that I can compare the bitrate savings?
    I can only provide a correct answer to this question. The YUV (YCbCr) bandwidth calculation is quite simple. For 8-bit YCbCr 4:2:0: (HX*VY*1.5*FPS)/125000 = Mbps, where:
    HX - number of pixels per line
    VY - number of lines
    FPS - frame rate (frames per second)

    For 1920x1080 at 30 fps, the required bandwidth is 746.496 Mbps.

    Look at this, from "Performance Comparison of High-Efficiency Video Coding (HEVC) with H.264 AVC":

    [Image: Attachment 46174]

    Compression factor

    How did he get this from HW and JM? Did he convert the outputs to MP4 or MKV?

    I'm asking all this because the whole point of this coding is to get a file with good quality at a lower bitrate. So how can I calculate the bitrate of the original YUV and of the encoder's output file?
  12. Originally Posted by rockerovo View Post
    Look at this, from "Performance Comparison of High-Efficiency Video Coding (HEVC) with H.264 AVC":

    [Image: Attachment 46174]

    Compression factor

    How did he get this from HW and JM? Did he convert the outputs to MP4 or MKV?

    I'm asking all this because the whole point of this coding is to get a file with good quality at a lower bitrate. So how can I calculate the bitrate of the original YUV and of the encoder's output file?
    Nope. Just take the raw H.264 file size in bits (multiply bytes by 8) and divide it by the duration; this gives you the bitrate. It is the same as for YUV; the only difference is that YUV is uncompressed and thus far bigger.

    MKV and MP4 are containers. Their main purpose is to organise video, audio, and other substreams as a single unit, i.e. to encapsulate multiple streams within a single file. Each container adds some overhead of its own.

    As you are comparing only video, you can compare the raw compressed video sizes.
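    A minimal sketch of that calculation, using the figures from the HM run posted earlier in the thread (80370 bytes for 10 frames). The helper names are made up, and I am assuming the HM input was the same 1920x1080, 120 fps, 8-bit 4:2:0 clip as in the JM run:

```python
def bitrate_kbps(size_bytes, frames, fps):
    """Average bitrate of a raw elementary stream in kbit/s."""
    duration_s = frames / fps
    return size_bytes * 8 / duration_s / 1000

def compression_factor(raw_kbps, coded_kbps):
    """How many times smaller the coded stream is than the raw video."""
    return raw_kbps / coded_kbps

coded = bitrate_kbps(80370, frames=10, fps=120)
raw = 1920 * 1080 * 1.5 * 8 * 120 / 1000  # uncompressed 4:2:0, in kbit/s

print(coded)                           # HM itself reported 7715.520 kbps
print(compression_factor(raw, coded))  # raw bitrate divided by coded bitrate
```

    No container is involved here: the size of the .bin or .264 file is used directly.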
  13. Originally Posted by rockerovo View Post

    For VP9, I ran these commands:

    Code:
    ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -vf fps=fps=120 -keyint_min 50 -g 50 -pass 1 -passlogfile jockey-2160 -c:v libvpx-vp9 -threads 8 -cpu-used 4 -tile-columns 3 -frame-parallel 1 -b:v 0 -crf 29 -an -f webm -y NUL
    Code:
    ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -vf fps=fps=120 -keyint_min 50 -g 50 -pass 2 -passlogfile jockey-2160 -c:v libvpx-vp9 -threads 8 -cpu-used 3 -tile-columns 3 -frame-parallel 1 -auto-alt-ref 1 -b:v 0 -crf 29 -an -f webm -y jockey.webm
    For VP9 I recommend using something like:
    Code:
    ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -frames:v 10 -vf fps=fps=120 -keyint_min 50 -g 50 -c:v libvpx-vp9 -pass 1 -passlogfile jockey-2160 -b:v 0 -crf 29 -threads 8 -deadline good -cpu-used 4 -tile-columns 0 -tile-rows 0 -frame-parallel 0 -qmax 29 -auto-alt-ref 1 -aq-mode 0 -an -f null NUL
    ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -frames:v 10 -vf fps=fps=120 -keyint_min 50 -g 50 -c:v libvpx-vp9 -pass 2 -passlogfile jockey-2160 -b:v 0 -crf 29 -row-mt 1 -threads 8 -deadline good -cpu-used 1 -tile-columns 3 -tile-rows 0 -frame-parallel 0 -qmax 29 -auto-alt-ref 1 -aq-mode 0 -an -f webm -y jockey_1.webm
    That is basically the configuration I use with VP9. For testing it is also interesting to use "-deadline best" and "-cpu-used 0" in the second pass, which is the slowest and best mode of VP9, but I always recommend "-deadline good -cpu-used 1" for practical purposes. row-mt is recommended in the second pass, as it doesn't hurt quality and boosts compression speed.

    Also, the quantizer scale is not the same across video codecs, so q29 in x265 or x264 is not the same as q29 in VP9 (the equivalent is more like somewhere in the range 36-46).
  14. Originally Posted by gdx View Post
    Originally Posted by rockerovo View Post

    For VP9, I ran these commands:

    Code:
    ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -vf fps=fps=120 -keyint_min 50 -g 50 -pass 1 -passlogfile jockey-2160 -c:v libvpx-vp9 -threads 8 -cpu-used 4 -tile-columns 3 -frame-parallel 1 -b:v 0 -crf 29 -an -f webm -y NUL
    Code:
    ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -vf fps=fps=120 -keyint_min 50 -g 50 -pass 2 -passlogfile jockey-2160 -c:v libvpx-vp9 -threads 8 -cpu-used 3 -tile-columns 3 -frame-parallel 1 -auto-alt-ref 1 -b:v 0 -crf 29 -an -f webm -y jockey.webm
    For VP9 I recommend using something like:
    Code:
    ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -frames:v 10 -vf fps=fps=120 -keyint_min 50 -g 50 -c:v libvpx-vp9 -pass 1 -passlogfile jockey-2160 -b:v 0 -crf 29 -threads 8 -deadline good -cpu-used 4 -tile-columns 0 -tile-rows 0 -frame-parallel 0 -qmax 29 -auto-alt-ref 1 -aq-mode 0 -an -f null NUL
    ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -frames:v 10 -vf fps=fps=120 -keyint_min 50 -g 50 -c:v libvpx-vp9 -pass 2 -passlogfile jockey-2160 -b:v 0 -crf 29 -row-mt 1 -threads 8 -deadline good -cpu-used 1 -tile-columns 3 -tile-rows 0 -frame-parallel 0 -qmax 29 -auto-alt-ref 1 -aq-mode 0 -an -f webm -y jockey_1.webm
    That is basically the configuration I use with VP9. For testing it is also interesting to use "-deadline best" and "-cpu-used 0" in the second pass, which is the slowest and best mode of VP9, but I always recommend "-deadline good -cpu-used 1" for practical purposes. row-mt is recommended in the second pass, as it doesn't hurt quality and boosts compression speed.

    Also, the quantizer scale is not the same across video codecs, so q29 in x265 or x264 is not the same as q29 in VP9 (the equivalent is more like somewhere in the range 36-46).
    Okay, how can I get nearly equivalent settings across AVC, HEVC, VP9, and AV1 to make the comparison fair?
  15. Originally Posted by rockerovo View Post
    Okay, how can I get nearly equivalent settings across AVC, HEVC, VP9, and AV1 to make the comparison fair?
    You need to understand each codec and verify your settings. This is very difficult and very challenging. Everything I wrote earlier was not meant to bash you but to trigger some thoughts about your work. You can, for example, search for settings where PSNR or SSIM is as close as possible, but even a small difference in the numbers will make the results incomparable. Bitrate seems to be less biased, so IMHO you should set a target bitrate and adjust the settings to reach it. However, your concern should be the maturity of each evaluated codec: some of them are mature, some are not, and codec efficiency depends on maturity. Of course there is a learning curve, so every new codec profits from knowledge acquired during the development of older codecs, but not everything can be easily carried over.
  16. Originally Posted by gdx View Post
    For VP9 I recommend using something like:
    Code:
    ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -frames:v 10 -vf fps=fps=120 -keyint_min 50 -g 50 -c:v libvpx-vp9 -pass 1 -passlogfile jockey-2160 -b:v 0 -crf 29 -threads 8 -deadline good -cpu-used 4 -tile-columns 0 -tile-rows 0 -frame-parallel 0 -qmax 29 -auto-alt-ref 1 -aq-mode 0 -an -f null NUL
    ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -frames:v 10 -vf fps=fps=120 -keyint_min 50 -g 50 -c:v libvpx-vp9 -pass 2 -passlogfile jockey-2160 -b:v 0 -crf 29 -row-mt 1 -threads 8 -deadline good -cpu-used 1 -tile-columns 3 -tile-rows 0 -frame-parallel 0 -qmax 29 -auto-alt-ref 1 -aq-mode 0 -an -f webm -y jockey_1.webm
    That is basically the configuration I use with VP9. For testing it is also interesting to use "-deadline best" and "-cpu-used 0" in the second pass, which is the slowest and best mode of VP9, but I always recommend "-deadline good -cpu-used 1" for practical purposes. row-mt is recommended in the second pass, as it doesn't hurt quality and boosts compression speed.

    Also, the quantizer scale is not the same across video codecs, so q29 in x265 or x264 is not the same as q29 in VP9 (the equivalent is more like somewhere in the range 36-46).
    Thank you for the hint with respect to row-mt

    I haven't seen it before and it seems to be an interesting improvement!

    But also with -row-mt you need to consider that there is a relationship between horizontal resolution (width), -tile-columns and -threads. It is described at the following url:
    http://permalink.gmane.org/gmane.comp.multimedia.webm.devel/2339

    As I see it, -row-mt now allows doubling the number of threads. See:
    https://groups.google.com/a/webmproject.org/forum/#!topic/codec-devel/oiHjgEdii2U
    https://developers.google.com/media/vp9/live-encoding/

    So, if you set "-row-mt 1", this means:
    • width < 512 => tile-columns = 0, threads = 2
    • 512 <= width < 1024 => tile-columns = 1, threads = 4
    • 1024 <= width < 2048 => tile-columns = 2, threads = 8
    • width >= 2048 => tile-columns = 3, threads = 16

    For a resolution of 1920 * 1080 px you should use:
    -tile-columns 2 -threads 8

    If you set -tile-columns to 0 (as in your 1st pass), then the encoder can only use 2 threads. With a width of 1920 you can set -tile-columns to a maximum of 2.
  17. Originally Posted by rockerovo View Post
    Okay, how can I get nearly equivalent settings across AVC, HEVC, VP9, and AV1 to make the comparison fair?
    That is difficult. The correct way is to compare across a set of quality/bitrate targets instead of a single target. You also have to keep in mind what you want to compare, as some tuning can worsen visual quality but boost metrics, or the reverse: hurt metrics but boost perceived quality. One example of what tuning can do: in x265, using the "veryslow" preset with "-x265-params keyint=250:ref=6:limit-refs=3:rc-lookahead=60:amp=1:subme=7:aq-mode=3:aq-strength=1.25:psy-rd=0.75:psy-rdoq=2.00" makes a crf25 file approximately the same size as a crf23 file without tuning. The quality degradation in the bright parts of the video is unnoticeable, but the dark parts look better in the tuned crf25 version than in the untuned crf23 version, so the tuned version is perceived as having better visual quality.
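    To illustrate what "compare across a set of targets" can look like in numbers, here is a rough sketch that averages the bitrate difference between two rate/quality curves at equal quality. It is a simplified stand-in for the usual BD-rate calculation (which fits a cubic curve; this one only interpolates linearly), and the function names and the sample points are invented for the example, not measurements:

```python
import math

def interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    x = min(max(x, xs[0]), xs[-1])  # clamp against rounding at the ends
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x outside curve")

def avg_rate_diff_percent(ref, test, steps=100):
    """Average bitrate change of `test` vs `ref` at equal quality, in percent.

    ref and test are lists of (bitrate_kbps, quality_db) sorted by quality.
    """
    rq, rl = [q for _, q in ref], [math.log(r) for r, _ in ref]
    tq, tl = [q for _, q in test], [math.log(r) for r, _ in test]
    lo, hi = max(rq[0], tq[0]), min(rq[-1], tq[-1])  # overlapping quality range
    diffs = []
    for k in range(steps + 1):
        q = lo + (hi - lo) * k / steps
        diffs.append(interp(q, tq, tl) - interp(q, rq, rl))
    # trapezoid average of the log-rate difference over the quality range
    mean = (sum(diffs) - 0.5 * (diffs[0] + diffs[-1])) / steps
    return (math.exp(mean) - 1.0) * 100.0

# Hypothetical curves: codec B needs half the bitrate of codec A everywhere.
codec_a = [(1000, 34.0), (2000, 36.0), (4000, 38.0), (8000, 40.0)]
codec_b = [(500, 34.0), (1000, 36.0), (2000, 38.0), (4000, 40.0)]
print(avg_rate_diff_percent(codec_a, codec_b))  # about -50 (percent)
```

    A negative result means the second codec needs less bitrate for the same quality, averaged over the quality range both curves cover.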

    For example, in this comparison https://wyohknott.github.io/video-formats-comparison/ I find that if he had used "--cpu-used=1" for VP9, the metrics would have improved and the encoding speed would have been closer to that of x265. Also, for AV1 he used "--cpu-used=4", which lowers quality but is understandable, as it is painfully slow for now.

    I also recommend reading this comparison http://goughlui.com/2016/08/27/video-compression-testing-x264-vs-x265-crf-in-handbrake-0-10-5/ ; it is not perfect either, but at least it serves its purpose.

    Originally Posted by fornit View Post
    Thank you for the hint with respect to row-mt

    I haven't seen it before and it seems to be an interesting improvement!

    But also with -row-mt you need to consider that there is a relationship between horizontal resolution (width), -tile-columns and -threads. It is described at the following url:
    http://permalink.gmane.org/gmane.comp.multimedia.webm.devel/2339

    As I see it, -row-mt now allows doubling the number of threads. See:
    https://groups.google.com/a/webmproject.org/forum/#!topic/codec-devel/oiHjgEdii2U
    https://developers.google.com/media/vp9/live-encoding/

    So, if you set "-row-mt 1", this means:
    • width < 512 => tile-columns = 0, threads = 2
    • 512 <= width < 1024 => tile-columns = 1, threads = 4
    • 1024 <= width < 2048 => tile-columns = 2, threads = 8
    • width >= 2048 => tile-columns = 3, threads = 16

    For a resolution of 1920 * 1080 px you should use:
    -tile-columns 2 -threads 8

    If you set -tile-columns to 0 (as in your 1st pass), then the encoder can only use 2 threads. With a width of 1920 you can set -tile-columns to a maximum of 2.
    You are wrong on several points. tile-columns actually limits the maximum number of tiles the encoder can use, each having a width of 256 pixels; the encoder uses what it needs, not the specified value. Also, row-mt doesn't just double the number of threads, it more than doubles them (for tile-columns 0 it at least triples the threads). Its performance gains come in two flavors: it adds more threads and it makes the threads more efficient.

    One thing that demonstrates you are wrong: with row-mt 1 and tile-columns 1, encoding a 1280p video on my machine, I'm able to max out the 8 threads my CPU is capable of. Your table says I would have 4 threads, while in reality I was using 8.

    Also, a 2x speedup is not the same as 2x threads. You need to take Amdahl's law into consideration, as parallelization is not perfect in most cases.
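    A quick illustration of Amdahl's law. The parallel fraction of 0.9 is a made-up value for the example, not a measurement of libvpx, and the function name is only illustrative:

```python
def amdahl_speedup(parallel_fraction, threads):
    """Upper bound on speedup when only part of the work runs in parallel."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / threads)

# parallel_fraction = 0.9 is hypothetical, chosen only to show the trend.
for n in (2, 4, 8, 16):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

    Under that assumption, going from 8 to 16 threads gives far less than a 2x gain, because the serial part starts to dominate.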
    Last edited by gdx; 25th Jul 2018 at 09:34.
  18. Originally Posted by gdx View Post
    You are wrong on several points. tile-columns actually limits the maximum number of tiles the encoder can use, each having a width of 256 pixels; the encoder uses what it needs, not the specified value.
    Sorry, but you are mixing things. Tiles have a minimum width of 256 px. But please note that the -tile-columns option is in log2 format. So -tile-columns 3 means: 8 tiles (2^3 = 8). And 8 * 256 = 2048. That's why it seems unlikely that your encoding suggestion of 3 tile-columns (8 tiles) for 1920 px can really make sense.

    The post I've linked above, and which is confirming that, has been written by VP9 developer Yinqing Wang. If that isn't enough for you, please find a more detailed explanation at:
    https://stackoverflow.com/questions/41372045/vp9-encoding-limited-to-4-threads
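    The arithmetic above as a small sketch. It only encodes the two rules stated here (tiles are at least 256 px wide, and -tile-columns is a log2 value); the function name is mine, and the real encoder may apply further limits that are ignored:

```python
MIN_TILE_WIDTH = 256  # pixels, as stated above

def max_tile_columns_log2(width):
    """Largest n such that 2**n tiles of at least 256 px fit into `width`."""
    n = 0
    while (2 ** (n + 1)) * MIN_TILE_WIDTH <= width:
        n += 1
    return n

for w in (426, 854, 1280, 1920, 3840):
    n = max_tile_columns_log2(w)
    print(w, "-> -tile-columns", n, "=", 2 ** n, "tile column(s)")
```

    For a width of 1920 this gives 2 (four tile columns), since eight tiles would need at least 2048 px.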

    Originally Posted by gdx View Post
    Also, row-mt doesn't just double the number of threads, it more than doubles them (for tile-columns 0 it at least triples the threads). Its performance gains come in two flavors: it adds more threads and it makes the threads more efficient.
    I don't know that and have to test it. The Google documentation at https://developers.google.com/media/vp9/live-encoding/ says that it's doubling the number of threads. Maybe it's wrong. If you can really have 8 threads with 2 tiles then this will surely make me happy
  19. Originally Posted by fornit View Post
    Sorry, but you are mixing things. Tiles have a minimum width of 256 px. But please note that the -tile-columns option is in log2 format. So -tile-columns 3 means: 8 tiles (2^3 = 8). And 8 * 256 = 2048. That's why it seems unlikely that your encoding suggestion of 3 tile-columns (8 tiles) for 1920 px can really make sense.

    The post I've linked above, and which is confirming that, has been written by VP9 developer Yinqing Wang. If that isn't enough for you, please find a more detailed explanation at:
    https://stackoverflow.com/questions/41372045/vp9-encoding-limited-to-4-threads
    True, I had been mixing things up. Ironically, in nearly all my scripts I use "-tile-columns 2". On the other hand, if you put "-tile-columns 3" in a 1080p ffmpeg command, it is treated as if you had put "-tile-columns 2", so no damage is done. One explanation for what I did is that the ffmpeg command was originally written for 4K and then altered for 1080p.

    Originally Posted by fornit View Post
    I don't know that and have to test it. The Google documentation at https://developers.google.com/media/vp9/live-encoding/ says that it's doubling the number of threads. Maybe it's wrong. If you can really have 8 threads with 2 tiles then this will surely make me happy
    Apart from that, most of the VP9 documentation is obsolete or incorrect for the latest versions; really, the VP9 documentation is a mess. Actually, neither ffmpeg nor vpxenc is a good way to verify the threads: ffmpeg creates more threads for itself, and vpxenc limits the threads to the number of CPU threads. In the latter case, if vpxenc without row-mt uses 4 threads and your CPU supports 8, enabling it does double the threads, but if your CPU supports 12 threads it is going to spawn more threads, possibly 12 of them (this may explain why they wrote that row-mt doubles the threads).

    P.S.:

    Revisiting this: https://groups.google.com/a/webmproject.org/forum/#!topic/codec-devel/oiHjgEdii2U you can read:
    In tests[1] of encoding HD videos with 4 column tiles, the improved VP9 MT encoder achieved speedups over the original of 11% with 2 threads, 27% with 4 threads, 101% with 8 threads, and 135% with 16 threads.
    So it is telling you that a 1080p video with 4 tile columns and row-mt enabled spawns 16 threads; basically it multiplies the threads by 4, contradicting the https://developers.google.com/media/vp9/live-encoding/ page.
    Last edited by gdx; 26th Jul 2018 at 06:41.
  20. Originally Posted by gdx View Post
    Revisiting this: https://groups.google.com/a/webmproject.org/forum/#!topic/codec-devel/oiHjgEdii2U you can read:
    In tests[1] of encoding HD videos with 4 column tiles, the improved VP9 MT encoder achieved speedups over the original of 11% with 2 threads, 27% with 4 threads, 101% with 8 threads, and 135% with 16 threads.
    So it is telling you that a 1080p video with 4 tile columns and row-mt enabled spawns 16 threads; basically it multiplies the threads by 4, contradicting the https://developers.google.com/media/vp9/live-encoding/ page.
    That sounds great. Thank you for digging that out!

    To give a summary of our discussion and to keep my table up-to-date, this will mean the following.

    With the setting of "-row-mt 1" tile-columns and threads can be set as follows:
    • width < 512 => tile-columns = 0, threads = 4
    • 512 <= width < 1024 => tile-columns = 1, threads = 8
    • 1024 <= width < 2048 => tile-columns = 2, threads = 16
    • width >= 2048 => tile-columns = 3, threads = 32

    I'll provide feedback after I've tested it myself. I will also have a look at the behaviour on Windows.
  21. It is not as simple as that table. vpxenc with tile-columns 0 and row-mt spawns 8 threads, at least from a height of 428 pixels upwards. But the number of threads and thread efficiency are not the same thing: in a previous test of mine with a 720p video and 8 threads, I found the following speed ranking: [tiles 2 row-mt 1] = [tiles 1 row-mt 1] > [tiles 0 row-mt 1] > [tiles 2 row-mt 0] > [tiles 1 row-mt 0] > [tiles 0 row-mt 0].
  22. Originally Posted by pandy View Post
    Originally Posted by rockerovo View Post
    Look at this, from "Performance Comparison of High-Efficiency Video Coding (HEVC) with H.264 AVC":

    [Image: Attachment 46174]

    Compression factor

    How did he get this from HW and JM? Did he convert the outputs to MP4 or MKV?

    I'm asking all this because the whole point of this coding is to get a file with good quality at a lower bitrate. So how can I calculate the bitrate of the original YUV and of the encoder's output file?
    Nope. Just take the raw H.264 file size in bits (multiply bytes by 8) and divide it by the duration; this gives you the bitrate. It is the same as for YUV; the only difference is that YUV is uncompressed and thus far bigger.

    MKV and MP4 are containers. Their main purpose is to organise video, audio, and other substreams as a single unit, i.e. to encapsulate multiple streams within a single file. Each container adds some overhead of its own.

    As you are comparing only video, you can compare the raw compressed video sizes.
    Thank you.
    But what is the bitstream format for VP9 and AV1? WebM? I thought WebM was a container.
  23. Originally Posted by rockerovo View Post
    But what is the bitstream format for VP9 and AV1? WebM? I thought WebM was a container.
    A bitstream is a sequence of bits that carries data (for example compressed video) plus the additional information that allows the decoder to restore, in our case, the video. If you look at almost any sane codec specification, you will immediately realize that video codecs (especially standardised ones) are very complex structures, and to handle that complexity they use particular solutions such as sequences, headers, even primitive commands, etc., so at some point they resemble a primitive quasi programming language...
  24. Generalized, the bitstream is the content inside any container.

    The common "raw" (or, minimalistic container) format for VPx and AV1 is called IVF, the vintage "Indeo Video File" format.

    Additionally, AV1 supports an "OBU" format (Open Bitstream Units) which is more similar to AVC and HEVC raw video streams.
    Last edited by LigH.de; 30th Jul 2018 at 04:28.
  25. Originally Posted by LigH.de View Post
    Generalized, the bitstream is the content inside any container.

    The common "raw" (or, minimalistic container) format for VPx and AV1 is called IVF, the vintage "Indeo Video File" format.

    Additionally, AV1 supports an "OBU" format (Open Bitstream Units) which is more similar to AVC and HEVC raw video streams.
    Thank you for explaining to me the differences between them

    I'm trying to output an IVF bitstream with vpxenc (VP9), but I think there is a problem or my command may be wrong, because the IVF file is larger than the WebM file made with the same settings.

    for ivf output :

    Code:
    ./vpxenc --output=ReadySteadyGo_1920x1080_120fps_420_8bit.ivf --codec=vp9 --passes=2 --good --verbose --psnr --i420 --threads=8 --width=1920 --height=1080 --profile=0 --fps=50000/1001 --min-q=29 --max-q=29 --kf-min-dist=50 --kf-max-dist=50 --cpu-used=1 --auto-alt-ref=1 --input-bit-depth=8 --tile-columns=2 /home/siraj/Desktop/Project/Samples/1080p/ReadySteadyGo_1920x1080_120fps_420_8bit_YUV.yuv --bit-depth=8 --row-mt=1 --end-usage=q --ivf --limit=100
    for webm output :

    Code:
    ./vpxenc --output=ReadySteadyGo_1920x1080_120fps_420_8bit_webm.webm --codec=vp9 --passes=2 --good --verbose --psnr --i420 --threads=8 --width=1920 --height=1080 --profile=0 --fps=50000/1001 --min-q=29 --max-q=29 --kf-min-dist=50 --kf-max-dist=50 --cpu-used=1 --auto-alt-ref=1 --input-bit-depth=8 --tile-columns=2 /home/siraj/Desktop/Project/Samples/1080p/ReadySteadyGo_1920x1080_120fps_420_8bit_YUV.yuv --bit-depth=8 --row-mt=1 --end-usage=q --webm --limit=100
    Is it normal for the IVF file to be larger than the WebM file?
  26. Member
    Join Date
    Aug 2013
    Location
    Central Germany
    Well, unfortunately I lack practical experience.

    IVF still has a container header and additional headers to every video frame. It is indeed possible that Matroska (WebM is a Matroska subformat) can store it more efficiently. Apparently there is no real raw format vpxenc would produce, because it may be impossible to read it back into a container without losing important information.

    I guess the OBU format aomenc can create comes close to a "raw" video format without container and with minimal headers. But vpxenc does not create it.
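To put a number on the IVF headers mentioned above: the IVF layout used by libvpx has a 32-byte file header (signature "DKIF") and a 12-byte header in front of every frame (4-byte payload size, 8-byte timestamp). The sketch below builds a tiny synthetic IVF in memory and measures how much of it is container overhead. The helper names and the fake frame payloads are made up for illustration.

```python
import struct

# 32-byte file header: signature, version, header length, FourCC, width, height,
# timebase denominator, timebase numerator, frame count, 4 unused bytes.
IVF_FILE_HEADER = struct.Struct("<4sHH4sHHIII4x")
# 12-byte frame header: payload size in bytes, presentation timestamp.
IVF_FRAME_HEADER = struct.Struct("<IQ")


def build_ivf(frames, fourcc=b"VP90", width=1920, height=1080, rate=50000, scale=1001):
    """Wrap already-encoded frame payloads in an IVF container (illustrative helper)."""
    out = IVF_FILE_HEADER.pack(b"DKIF", 0, IVF_FILE_HEADER.size, fourcc,
                               width, height, rate, scale, len(frames))
    for pts, frame in enumerate(frames):
        out += IVF_FRAME_HEADER.pack(len(frame), pts) + frame
    return out


def ivf_overhead(data: bytes):
    """Return (payload_bytes, overhead_bytes, frame_count) for IVF data."""
    fields = IVF_FILE_HEADER.unpack_from(data, 0)
    signature, header_len = fields[0], fields[2]
    if signature != b"DKIF":
        raise ValueError("not an IVF file")
    pos, payload, count = header_len, 0, 0
    while pos + IVF_FRAME_HEADER.size <= len(data):
        size, _pts = IVF_FRAME_HEADER.unpack_from(data, pos)
        pos += IVF_FRAME_HEADER.size + size
        payload += size
        count += 1
    return payload, len(data) - payload, count


# Three fake "frames" of 1000, 200 and 300 bytes.
demo = build_ivf([b"\x00" * 1000, b"\x00" * 200, b"\x00" * 300])
payload, overhead, count = ivf_overhead(demo)
print(f"{count} frames, {payload} payload bytes, {overhead} container bytes")
```

For a real file, `ivf_overhead(open(path, "rb").read())` would report the same split. With 100 frames the container part is 32 + 100 * 12 = 1232 bytes, so it may be worth checking whether a large size gap between two outputs really comes from the container.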
  27. Hm,

    I've just tried the jockey.webm example I created for you last week:

    ffmpeg -i jockey.webm -c:v copy -f ivf jockey.ivf

    Size seems to be the same (or nearly the same) as with webm:
    jockey.webm: 15.282 KB
    jockey.ivf: 15.283 KB
  28. Originally Posted by LigH.de View Post
    Well, unfortunately I lack practical experience.

    IVF still has a container header and additional headers to every video frame. It is indeed possible that Matroska (WebM is a Matroska subformat) can store it more efficiently. Apparently there is no real raw format vpxenc would produce, because it may be impossible to read it back into a container without losing important information.

    I guess the OBU format aomenc can create comes close to a "raw" video format without container and with minimal headers. But vpxenc does not create it.
    Thank you for clarifying the difference
  29. Originally Posted by fornit View Post
    Hm,

    I've just tried the jockey.webm example I created for you last week:

    ffmpeg -i jockey.webm -c:v copy -f ivf jockey.ivf

    Size seems to be the same (or nearly the same) as with webm:
    jockey.webm: 15.282 KB
    jockey.ivf: 15.283 KB

    So if I want to compare bitstreams, should I use IVF alongside .264 and .265?
    And what's the difference between .265 and .hevc, are they the same?
  30. I'm a newbie in video coding, so I have some new questions.
    As I understand the quantization level, or QP, choosing a higher value gives better compression but worse quality? I thought it was the same as the quantizer in PCM, where a higher level means higher quality and less error?

    I still don't know how to measure the bitrate of a bitstream. Let's say I have a .265 file with a size of 100 kB and a duration of 3 sec and 30 ms: I multiply 100 by 8 and divide by the duration, which is 3.33, and then I get the average bitrate, right?

    I'm using x265 and x264. To output the bitstream I write -o example.265, but if I change .265 to a container like .mp4 or .mkv I get the same file size? The -o example.265 output works fine when I open the file with an HEVC analyzer. Do they have the same size because I don't multiplex the bitstream with an audio file? Is that what makes a file bigger?

    Which of these two is the better way to compare: using a fixed QP for the whole process with every codec and measuring the bitrate along with PSNR, SSIM, ..., or targeting a certain bitrate and comparing PSNR, SSIM, ...?

    By the way, I'm using x265, x264, libvpx-vp9 and aomenc (AV1) without ffmpeg; I don't want to use ffmpeg.

    What does the lag-in-frames option do?

    Why do some encoders ask for the fps as a fraction like --fps=50000/1001 rather than a plain number like 49 or 50?

    What is the purpose of the JM and HM reference encoders?
    Last edited by rockerovo; 31st Jul 2018 at 06:45.
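On the fraction question, a small sketch may help. A rate such as 50000/1001 has no exact finite decimal form, so passing it as numerator/denominator lets the encoder keep frame durations and timestamps exact. The snippet below only does the arithmetic with Python's `fractions` module; the helper name and the one-hour frame count are made up for illustration, and it does not call any encoder.

```python
from fractions import Fraction

fps = Fraction(50000, 1001)   # the value given as --fps=50000/1001 in this thread
frame_duration = 1 / fps      # exactly 1001/50000 of a second per frame


def duration_seconds(frame_count: int) -> Fraction:
    """Exact duration of frame_count frames at the fractional rate."""
    return frame_count * frame_duration


# 100 frames, matching the --limit=100 used in the commands above.
clip = duration_seconds(100)

# What happens if the rate is rounded to two decimals instead.
rounded_fps = Fraction("49.95")
frames_per_hour = 180000      # hypothetical long clip
drift = duration_seconds(frames_per_hour) - frames_per_hour / rounded_fps

print("exact fps as decimal  :", float(fps))
print("100 frames last       :", clip, "s =", float(clip), "s")
print("drift with rounded fps:", float(drift), "s")
```

The exact duration of the encoded clip is also the number to divide by when working out the average bitrate from the file size.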