Some questions about AV1, VP9, HEVC, AVC

23rd Jul 2018 08:16 #1

rockerovo

Member

Right now I'm working on a project about "Performance comparison of AV1, VP9, HEVC, AVC, THOR".

I have some questions. I have read a lot of websites and papers, but I can't find good answers to them.

I'm using the HEVC HM reference software, the AVC JM reference software, AOM AV1, and VP9.

Kubuntu 18.04, Intel® Core™ i7-3610QM, 6 GB RAM, AMD 7670M (2 GB DDR3)

Questions:

- Where can I get 8K YUV samples? I have only found 8K 360 VR footage.

- When I'm using a 120 fps video, should I tell the encoder it's 120 fps? If I enter 50 fps instead, is that a problem, and will I get wrong readings? I'm asking because the AVC JM encoder has a frame rate limit of 100 fps.

- Why is there no sound in YUV files?

- What's the best YUV player? I'm using YUView and Vooya, and sometimes I need to edit the settings to make them read the YUV file.

- I read some IEEE papers that compare compression rates. When I encode AV1 or VP9 I get a WebM file; with JM I get a .yuv and a .264 file for AVC, and with HM I get a .bin file for HEVC.
So my question is: is the size of these files the final result? And how can I convert them to MP4 or MKV?

- I will compare them with the same settings, vary the QP, and then calculate PSNR, SSIM, MS-SSIM, and VQM. Is this the right way to compare them?

- What are your thoughts on Netflix's VMAF (Video Multi-Method Assessment Fusion)? Do you recommend using it for the comparison?

- What's the highest setting for each of them?

- I read a paper comparing the performance of HEVC and AVC, and the author used different QP values for the two, e.g. AVC QP=30 and HEVC QP=29. Is this normal?

- What happens to the sound in the compression process?

- Is it good to use all the CPU power? I heard that speeding up the process is bad for video quality.

- I know you can encode AV1 with FFmpeg or with the AOMedia source code. Which is better?

Sorry for all these questions, but I can't find a good site that answers them.

Last edited by rockerovo; 23rd Jul 2018 at 08:29.

23rd Jul 2018 08:59 #2

LigH.de

Member

Don't bother encoding UHD resolutions when your PC has only 6 GB RAM. With 8K UHD sources you would wish you had 32 GB; you already need 16 GB to encode 4K UHD with x265 without swapping to your hard disk all day long.

Yes, the encoder should know the frame rate. As far as I remember, x264/x265 adjust the meaning of CRF in relation to it, because the shorter a frame is visible, the less you will notice flaws in it.
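The declared frame rate also changes the bitrate an encoder reports, even when the file size stays the same. Here is a minimal sketch of that arithmetic, assuming the reported bitrate is simply total bits divided by playback duration; the function name `bitrate_kbps` is only illustrative. The byte and frame counts are the ones from the HM log posted later in this thread.

```python
def bitrate_kbps(total_bytes: int, frames: int, fps: float) -> float:
    """Average bitrate in kbit/s for `frames` frames played back at `fps`."""
    duration_s = frames / fps          # playback time implied by the frame rate
    return total_bytes * 8 / duration_s / 1000.0


# 80370 bytes for 10 frames, as in the HM run shown later in the thread.
at_120 = bitrate_kbps(80370, 10, 120)
at_50 = bitrate_kbps(80370, 10, 50)
print(f"{at_120:.2f} kbps at 120 fps, {at_50:.2f} kbps at 50 fps")
```

So if a 120 fps clip is declared as 50 fps, the same bitstream is reported at a much lower bitrate, and bitrate figures are only comparable between encoders when all of them were given the same frame rate.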

YUV is a color space for video frames, similar to RGB. Video only. No audio. But you are testing video encoders only anyway; none of them will process audio. It will be ignored.

Video-only encoders may produce only raw video streams if they don't contain a multiplexer to wrap them in a container (the raw stream format for VPx and AOM is IVF). If you want to compare the quality at a specific size, compare the size of the raw video stream only; additional container overhead would skew the results. To watch the videos in players which need containers (some players can't identify raw video streams properly), use the usual recommended multiplexers (e.g. GPAC MP4Box to multiplex AVC or HEVC into MP4, or MKVToolNix to multiplex anything into MKV).

The subjectively optimal way to compare videos is to produce raw streams of the same output size, as closely as possible, and then watch them in comparison to the original video without knowing which is which (ABX blind test), rating how annoying the differences subjectively appear, using thousands of participants. Objective metrics, which calculate a difference between the original and the encoded-and-decoded result, hardly get close to the average opinion of a variety of people. But if you don't have a choice ... SSIM (and its variations), VQM and VMAF are some of the best objective metrics we have; PSNR is known to be fooled easily with academic samples.
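For reference, PSNR is just a logarithmic restatement of the mean squared error between the original and the decoded frame, which is part of why it is so easy to fool. A minimal sketch, assuming 8-bit samples (peak value 255); the function name `psnr_db` is only illustrative.

```python
import math


def psnr_db(mse: float, bit_depth: int = 8) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    peak = (1 << bit_depth) - 1        # 255 for 8-bit video
    if mse == 0:
        return math.inf                # identical frames
    return 10.0 * math.log10(peak * peak / mse)


# Luma MSE of the first frame in the HM log posted later in this thread.
print(f"{psnr_db(3.3621):.2f} dB")
```

The HM log lists that frame with a luma MSE of 3.3621 and a luma PSNR of 42.8646 dB, which this formula reproduces to within the rounding of the printed MSE.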

Performance comparisons are a complicated topic. One codec can be more efficient than another because it does not do the same calculations; but how do we compare the speed of obviously different algorithms when they simply don't do the same work? It's hardly possible to make two codecs spend the same calculation effort, and I would not even try to achieve that. You can, of course, report that the range of available presets, as designed by the developers of each codec, may result in encoding times that differ by orders of magnitude. But better stay vague and don't claim a certain speed without naming the specific hardware.

High CPU utilization is good: it means good parallelization, thus good efficiency. But different algorithms are more or less parallelizable, and some codecs make heavy use of SIMD / vector instructions, whose parallelism does not show up as utilization across all cores. It even depends on the video material. So don't compare CPU utilization values across different codecs; you can't get fair results.
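If you still want to record timing for your own notes, wall-clock time and CPU time are two different numbers, and their ratio roughly tells you how many cores were busy on average. A minimal sketch for Linux, assuming Python's `resource` module is available (it is Unix-only); `time_command` is a hypothetical helper, and the command timed here is only a stand-in for a real encoder call.

```python
import resource
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd and return (wall_seconds, cpu_seconds) for the child process."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # User + system CPU time consumed by the child that just finished.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return wall, cpu


# Stand-in workload; replace the list with the encoder command line to time.
wall, cpu = time_command([sys.executable, "-c", "sum(range(10**6))"])
print(f"wall {wall:.2f} s, cpu {cpu:.2f} s, average cores busy {cpu / wall:.2f}")
```

A single-threaded reference encoder should show a ratio close to 1, while a multi-threaded encoder can show several cores busy; neither number alone says which codec is more efficient.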

It doesn't matter whether you use a separate aomenc encoder or ffmpeg if both contain exactly the same libaom codec core.

Last edited by LigH.de; 23rd Jul 2018 at 09:04.

23rd Jul 2018 10:29 #3

pandy

Member

Originally Posted by rockerovo

Right now I'm working on a project about "Performance comparison of AV1, VP9, HEVC, AVC, THOR".

I have some questions. I have read a lot of websites and papers, but I can't find good answers to them.

I'm using the HEVC HM reference software, the AVC JM reference software, AOM AV1, and VP9.

Kubuntu 18.04, Intel® Core™ i7-3610QM, 6 GB RAM, AMD 7670M (2 GB DDR3)

Questions:

Based on your questions, I would advise you to gain some background knowledge first, for example by reading video codec developers' blogs:
https://medium.com/@luc.trudeau

Codec testing can be done in many ways, but the kind of codec testing you have chosen seems to be the most difficult one.

Originally Posted by rockerovo

- Where can I get 8K YUV samples? I have only found 8K 360 VR footage.

You can shoot 8K yourself, or ask camera manufacturers, optical sensor manufacturers, etc. Computer-generated graphics (synthetic patterns and normal graphics) are also an option.

Originally Posted by rockerovo

- When I'm using a 120 fps video, should I tell the encoder it's 120 fps? If I enter 50 fps instead, is that a problem, and will I get wrong readings? I'm asking because the AVC JM encoder has a frame rate limit of 100 fps.

It is not important. If your goal is to compare codecs, use the same settings for each of them; the common denominator should be the codec with the lowest capabilities. Personally, I doubt that any of them can deliver more than 5 fps at 8K, so real-time encoding speed is not feasible.

Originally Posted by rockerovo

- Why is there no sound in YUV files?

Because a video codec is a video codec: only video data is required, and YUV is raw video data. I'm not aware of any video codec that stores audio data.

Originally Posted by rockerovo

- What's the best YUV player? I'm using YUView and Vooya, and sometimes I need to edit the settings to make them read the YUV file.

You can use ffplay, or you can write your own YUV player (YUV is simply data with a known, regular structure).
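To illustrate how regular that structure is, here is a minimal sketch of reading one frame from a raw file. It assumes 8-bit planar 4:2:0 (I420) layout, i.e. a full-resolution Y plane followed by quarter-size U and V planes, with no header at all; the function names are only illustrative. This is also why players need to be told the resolution and pixel format by hand.

```python
import os


def frame_size_420(width, height, bytes_per_sample=1):
    """Bytes in one planar 4:2:0 frame: Y plane plus two subsampled chroma planes."""
    luma = width * height
    chroma = ((width + 1) // 2) * ((height + 1) // 2)
    return (luma + 2 * chroma) * bytes_per_sample


def frame_count(path, width, height):
    """Number of complete 8-bit 4:2:0 frames stored in a raw file."""
    return os.path.getsize(path) // frame_size_420(width, height)


def read_frame(path, index, width, height):
    """Return the (Y, U, V) planes of frame `index` as bytes objects."""
    size = frame_size_420(width, height)
    luma = width * height
    chroma = (size - luma) // 2
    with open(path, "rb") as f:
        f.seek(index * size)           # frames are stored back to back
        data = f.read(size)
    if len(data) != size:
        raise ValueError("frame index is past the end of the file")
    return data[:luma], data[luma:luma + chroma], data[luma + chroma:]


print(frame_size_420(1920, 1080), "bytes per 1080p frame")
```

Calling `read_frame(path, 0, 1920, 1080)` on a 1080p 4:2:0 8-bit file would return the three planes of the first frame; a wrong width or height gives the scrambled picture you see when a player guesses the settings incorrectly.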

Originally Posted by rockerovo

- I read some IEEE papers that compare compression rates. When I encode AV1 or VP9 I get a WebM file; with JM I get a .yuv and a .264 file for AVC, and with HM I get a .bin file for HEVC.
So my question is: is the size of these files the final result? And how can I convert them to MP4 or MKV?

You need a multiplexer that understands your raw codec stream to create the desired multimedia container. For video comparison, however, I consider this an unnecessary exercise.

Originally Posted by rockerovo

- I will compare them with the same settings, vary the QP, and then calculate PSNR, SSIM, MS-SSIM, and VQM. Is this the right way to compare them?

I have no clue what you wish to compare, as most of those codecs are immature and suffer from many problems. Personally I think (based on your questions) that a thorough comparison may not be possible. You should definitely focus on gaining some basic knowledge before starting such a difficult task. I would not dare to compare those codecs without coding experience and experience with codec implementations, as most of them are available as plain C code that focuses on aspects other than speed.

Originally Posted by rockerovo

- What are your thoughts on Netflix's VMAF (Video Multi-Method Assessment Fusion)? Do you recommend using it for the comparison?

VMAF focuses on video (picture and motion), whereas PSNR, SSIM and similar metrics focus on picture quality. No clue whether VMAF was trained for 8K videos (AFAIR it is trained for HD only).

Originally Posted by rockerovo

- What's the highest setting for each of them?

I don't understand your question.

Originally Posted by rockerovo

- I read a paper comparing the performance of HEVC and AVC, and the author used different QP values for the two, e.g. AVC QP=30 and HEVC QP=29. Is this normal?

No clue. IMHO, with a different codec structure each codec may use a different QP. That's why I wrote that I would not dare to perform a codec comparison without in-depth knowledge of how the codec is designed: you need to understand how to evaluate codecs by truly understanding how each codec processes data, and this is very challenging even for people with many years of experience.
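Since the same QP number does not mean the same thing in two codecs, one common way around it is to encode each codec at several QPs and then compare bitrates at equal measured quality. Below is a simplified sketch of that idea, assuming straight-line interpolation in log(bitrate) between measured points; the Bjøntegaard delta rate (BD-rate) used in many papers is a more elaborate version of the same comparison. The function names are illustrative and the numbers are made-up placeholders, not real measurements.

```python
import math


def rate_at_quality(points, target_psnr):
    """Interpolate bitrate (kbps) at target_psnr from (kbps, psnr_db) points.

    Interpolation is linear in log(bitrate) between neighbouring points.
    """
    pts = sorted(points, key=lambda p: p[1])
    for (r0, q0), (r1, q1) in zip(pts, pts[1:]):
        if q0 < q1 and q0 <= target_psnr <= q1:
            t = (target_psnr - q0) / (q1 - q0)
            return math.exp(math.log(r0) + t * (math.log(r1) - math.log(r0)))
    raise ValueError("target quality is outside the measured range")


def bitrate_saving_percent(reference, test, target_psnr):
    """How much less bitrate `test` needs than `reference` at equal PSNR."""
    ref_rate = rate_at_quality(reference, target_psnr)
    test_rate = rate_at_quality(test, target_psnr)
    return 100.0 * (ref_rate - test_rate) / ref_rate


# Hypothetical (kbps, PSNR dB) points, one per QP. Placeholders only.
codec_a = [(2000.0, 36.0), (4000.0, 39.0), (8000.0, 42.0)]
codec_b = [(1000.0, 36.0), (2000.0, 39.0), (4000.0, 42.0)]
print(f"{bitrate_saving_percent(codec_a, codec_b, 40.0):.1f} % saving at 40 dB")
```

With this approach the QP values themselves never have to match between codecs; they are only a knob for producing several points on each rate-quality curve.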

Originally Posted by rockerovo

- What happens to the sound in the compression process?

Nothing. Sound is processed completely independently of video. Audio and video need to be combined and synchronized, but those things are not controlled by the video codec.

Originally Posted by rockerovo

- Is it good to use all the CPU power? I heard that speeding up the process is bad for video quality.

You need to know how the codec is designed. Every codec is designed in a particular way, and some of them are able to use more cores, for example.
So a real codec comparison would rather require counting the CPU cycles spent in every codec block. You need to be familiar with development tools and code profilers (if you have this kind of knowledge, you can easily work for some company for $150k+ a year).

Originally Posted by rockerovo

- I know you can encode AV1 with FFmpeg or with the AOMedia source code. Which is better?

You can compare them both... There is also another AV1 encoder: rav1e https://github.com/xiph/rav1e

24th Jul 2018 08:33 #4

rockerovo

Member

Originally Posted by LigH.de

Don't bother encoding UHD resolutions when your PC has only 6 GB RAM. With 8K UHD sources you would wish you had 32 GB; you already need 16 GB to encode 4K UHD with x265 without swapping to your hard disk all day long.

Hi,

Thank you for the detailed and informative answer.
But I'm only trying to encode short files (5-10 seconds). I have already encoded some files and didn't have any problems with 6 GB RAM.

24th Jul 2018 08:42 #5

rockerovo

Member

You can shoot 8K yourself, or ask camera manufacturers, optical sensor manufacturers, etc. Computer-generated graphics (synthetic patterns and normal graphics) are also an option.

Okay, how can I convert Canon raw "CRW" to YUV?

Thank you for your response

24th Jul 2018 09:36 #6

pandy

Member

Originally Posted by rockerovo

Okay, how can I convert Canon raw "CRW" to YUV?

https://www.lifewire.com/crw-file-2620390
http://rawtherapee.com/
https://helpx.adobe.com/photoshop/using/adobe-dng-converter.html
http://www.cybercom.net/~dcoffin/dcraw/

Or use the software provided by Canon (or the newer Canon Digital Photo Professional software).

24th Jul 2018 13:19 #7

rockerovo

Member

Hi, sorry for asking so many questions. Right now I'm encoding a 1080p file with:
HEVC: HM reference software
AVC: JM reference software
VP9: FFmpeg with libvpx-vp9

The cfg settings for the HM encoder:

Code:

#======== File I/O ===============
InputBitDepth                 : 8          # Input bitdepth
InputChromaFormat             : 420         # Ratio of luminance to chrominance samples
FrameRate                     : 120          # Frame Rate per second
FrameSkip                     : 0           # Number of frames to be skipped in input
SourceWidth                   : 1920        # Input  frame width
SourceHeight                  : 1080        # Input  frame height
FramesToBeEncoded             : 10         # Number of frames to be coded

PrintFrameMSE         : 1
PrintSequenceMSE        : 1

#======== Profile ================
Profile                       : main
Level                         : 5.2

#======== Unit definition ================
MaxCUWidth                    : 64          # Maximum coding unit width in pixel
MaxCUHeight                   : 64          # Maximum coding unit height in pixel
MaxPartitionDepth             : 4           # Maximum coding unit depth
QuadtreeTULog2MaxSize         : 5           # Log2 of maximum transform size for
                                            # quadtree-based TU coding (2...6)
QuadtreeTULog2MinSize         : 2           # Log2 of minimum transform size for
                                            # quadtree-based TU coding (2...6)
QuadtreeTUMaxDepthInter       : 3
QuadtreeTUMaxDepthIntra       : 3

#======== Coding Structure =============
IntraPeriod                   : 32          # Period of I-Frame ( -1 = only first)
DecodingRefreshType           : 1           # Random Accesss 0:none, 1:CRA, 2:IDR, 3:Recovery Point SEI
GOPSize                       : 8           # GOP Size (number of B slice = GOPSize-1)
ReWriteParamSetsFlag          : 1           # Write parameter sets with every IRAP

IntraQPOffset                 : -3
LambdaFromQpEnable            : 1           # see JCTVC-X0038 for suitable parameters for IntraQPOffset, QPoffset, QPOffsetModelOff, QPOffsetModelScale when enabled
#        Type POC QPoffset QPOffsetModelOff QPOffsetModelScale CbQPoffset CrQPoffset QPfactor tcOffsetDiv2 betaOffsetDiv2 temporal_id #ref_pics_active #ref_pics reference pictures     predict deltaRPS #ref_idcs reference idcs 
Frame1:  B    8   1        0.0                      0.0        0          0          0.442    0            0              0           2                3         -8 -12 -16             0
Frame2:  B    4   2        0.0                      0.0        0          0          0.3536   0            0              1           2                3         -4  -8   4             1       4        4         1 1 0 1
Frame3:  B    2   3        0.0                      0.0        0          0          0.3536   0            0              2           2                4         -2  -6   2 6           1       2        4         1 1 1 1
Frame4:  B    1   4        0.0                      0.0        0          0          0.68     0            0              3           2                4         -1   1   3 7           1       1        5         1 0 1 1 1
Frame5:  B    3   4        0.0                      0.0        0          0          0.68     0            0              3           2                4         -1  -3   1 5           1      -2        5         1 1 1 1 0
Frame6:  B    6   3        0.0                      0.0        0          0          0.3536   0            0              2           2                3         -2  -6   2             1      -3        5         0 1 1 1 0
Frame7:  B    5   4        0.0                      0.0        0          0          0.68     0            0              3           2                4         -1  -5   1 3           1       1        4         1 1 1 1
Frame8:  B    7   4        0.0                      0.0        0          0          0.68     0            0              3           2                4         -1  -3  -7 1           1      -2        5         1 1 1 1 0 

#=========== Motion Search =============
FastSearch                    : 1           # 0:Full search  1:TZ search
SearchRange                   : 256         # (0: Search range is a Full frame)
BipredSearchRange             : 4           # Search range for bi-prediction refinement
HadamardME                    : 1           # Use of hadamard measure for fractional ME
FEN                           : 1           # Fast encoder decision
FDM                           : 1           # Fast Decision for Merge RD cost

#======== Quantization =============
QP                            : 29          # Quantization parameter(0-51)
MaxDeltaQP                    : 0           # CU-based multi-QP optimization
MaxCuDQPDepth                 : 0           # Max depth of a minimum CuDQP for sub-LCU-level delta QP
DeltaQpRD                     : 0           # Slice-based multi-QP optimization
RDOQ                          : 1           # RDOQ
RDOQTS                        : 1           # RDOQ for transform skip
SliceChromaQPOffsetPeriodicity: 0           # Used in conjunction with Slice Cb/Cr QpOffsetIntraOrPeriodic. Use 0 (default) to disable periodic nature.
SliceCbQpOffsetIntraOrPeriodic: 0           # Chroma Cb QP Offset at slice level for I slice or for periodic inter slices as defined by SliceChromaQPOffsetPeriodicity. Replaces offset in the GOP table.
SliceCrQpOffsetIntraOrPeriodic: 0           # Chroma Cr QP Offset at slice level for I slice or for periodic inter slices as defined by SliceChromaQPOffsetPeriodicity. Replaces offset in the GOP table.

#=========== Deblock Filter ============
LoopFilterOffsetInPPS         : 1           # Dbl params: 0=varying params in SliceHeader, param = base_param + GOP_offset_param; 1 (default) =constant params in PPS, param = base_param)
LoopFilterDisable             : 0           # Disable deblocking filter (0=Filter, 1=No Filter)
LoopFilterBetaOffset_div2     : 0           # base_param: -6 ~ 6
LoopFilterTcOffset_div2       : 0           # base_param: -6 ~ 6
DeblockingFilterMetric        : 0           # blockiness metric (automatically configures deblocking parameters in bitstream). Applies slice-level loop filter offsets (LoopFilterOffsetInPPS and LoopFilterDisable must be 0)

#=========== Misc. ============
InternalBitDepth              : 8           # codec operating bit-depth

#=========== Coding Tools =================
SAO                           : 1           # Sample adaptive offset  (0: OFF, 1: ON)
AMP                           : 1           # Asymmetric motion partitions (0: OFF, 1: ON)
TransformSkip                 : 1           # Transform skipping (0: OFF, 1: ON)
TransformSkipFast             : 1           # Fast Transform skipping (0: OFF, 1: ON)
SAOLcuBoundary                : 0           # SAOLcuBoundary using non-deblocked pixels (0: OFF, 1: ON)

#============ Slices ================
SliceMode                : 0                # 0: Disable all slice options.
                                            # 1: Enforce maximum number of LCU in an slice,
                                            # 2: Enforce maximum number of bytes in an 'slice'
                                            # 3: Enforce maximum number of tiles in a slice
SliceArgument            : 1500             # Argument for 'SliceMode'.
                                            # If SliceMode==1 it represents max. SliceGranularity-sized blocks per slice.
                                            # If SliceMode==2 it represents max. bytes per slice.
                                            # If SliceMode==3 it represents max. tiles per slice.

LFCrossSliceBoundaryFlag : 1                # In-loop filtering, including ALF and DB, is across or not across slice boundary.
                                            # 0:not across, 1: across

#============ PCM ================
PCMEnabledFlag                      : 0                # 0: No PCM mode
PCMLog2MaxSize                      : 5                # Log2 of maximum PCM block size.
PCMLog2MinSize                      : 3                # Log2 of minimum PCM block size.
PCMInputBitDepthFlag                : 1                # 0: PCM bit-depth is internal bit-depth. 1: PCM bit-depth is input bit-depth.
PCMFilterDisableFlag                : 0                # 0: Enable loop filtering on I_PCM samples. 1: Disable loop filtering on I_PCM samples.

#============ Tiles ================
TileUniformSpacing                  : 0                # 0: the column boundaries are indicated by TileColumnWidth array, the row boundaries are indicated by TileRowHeight array
                                                       # 1: the column and row boundaries are distributed uniformly
NumTileColumnsMinus1                : 0                # Number of tile columns in a picture minus 1
TileColumnWidthArray                : 2 3              # Array containing tile column width values in units of CTU (from left to right in picture)   
NumTileRowsMinus1                   : 0                # Number of tile rows in a picture minus 1
TileRowHeightArray                  : 2                # Array containing tile row height values in units of CTU (from top to bottom in picture)

LFCrossTileBoundaryFlag             : 1                # In-loop filtering is across or not across tile boundary.
                                                       # 0:not across, 1: across 

#============ WaveFront ================
WaveFrontSynchro                    : 0                # 0:  No WaveFront synchronisation (WaveFrontSubstreams must be 1 in this case).
                                                       # >0: WaveFront synchronises with the LCU above and to the right by this many LCUs.

#=========== Quantization Matrix =================
ScalingList                   : 0                      # ScalingList 0 : off, 1 : default, 2 : file read
ScalingListFile               : scaling_list.txt       # Scaling List file name. If file is not exist, use Default Matrix.

#============ Lossless ================
TransquantBypassEnableFlag : 0                         # Value of PPS flag.
CUTransquantBypassFlagForce: 0                         # Force transquant bypass mode, when transquant_bypass_enable_flag is enabled

#============ Rate Control ======================
RateControl                         : 0                # Rate control: enable rate control
TargetBitrate                       : 1000000          # Rate control: target bitrate, in bps
KeepHierarchicalBit                 : 2                # Rate control: 0: equal bit allocation; 1: fixed ratio bit allocation; 2: adaptive ratio bit allocation
LCULevelRateControl                 : 1                # Rate control: 1: LCU level RC; 0: picture level RC
RCLCUSeparateModel                  : 1                # Rate control: use LCU level separate R-lambda model
InitialQP                           : 0                # Rate control: initial QP
RCForceIntraQP                      : 0                # Rate control: force intra QP to be equal to initial QP

### DO NOT ADD ANYTHING BELOW THIS LINE ###
### DO NOT DELETE THE EMPTY LINE BELOW ###

Then I run this command:

Code:

 ./TAppEncoderStatic -c encoder_randomaccess_main.cfg -i /home/siraj/Desktop/Project/Samples/Jockey_1920x1080_120fps_420_8bit_YUV.yuv -b hevc_1080_29.bin -o hevc_1080_29.yuv >> hevc_1080_29.txt

The "hevc_1080_29.txt" output:

Code:

HM software: Encoder Version [16.18] (including RExt)[Linux][GCC 7.3.0][64 bit] 


Input          File                    : /home/siraj/Desktop/Project/Samples/Jockey_1920x1080_120fps_420_8bit_YUV.yuv
Bitstream      File                    : hevc_1080_29.bin
Reconstruction File                    : hevc_1080_29.yuv
Real     Format                        : 1920x1080 120Hz
Internal Format                        : 1920x1080 120Hz
Sequence PSNR output                   : Linear average only
Sequence MSE output                    : Enabled
Frame MSE output                       : Enabled
MS-SSIM output                         : Disabled
Cabac-zero-word-padding                : Enabled
Frame/Field                            : Frame based coding
Frame index                            : 0 - 9 (10 frames)
Profile                                : main
CU size / depth / total-depth          : 64 / 4 / 4
RQT trans. size (min / max)            : 4 / 32
Max RQT depth inter                    : 3
Max RQT depth intra                    : 3
Min PCM size                           : 8
Motion search range                    : 256
Intra period                           : 32
Decoding refresh type                  : 1
QP                                     : 29
Max dQP signaling depth                : 0
Cb QP Offset                           : 0
Cr QP Offset                           : 0
QP adaptation                          : 0 (range=0)
GOP size                               : 8
Input bit depth                        : (Y:8, C:8)
MSB-extended bit depth                 : (Y:8, C:8)
Internal bit depth                     : (Y:8, C:8)
PCM sample bit depth                   : (Y:8, C:8)
Intra reference smoothing              : Enabled
diff_cu_chroma_qp_offset_depth         : -1
extended_precision_processing_flag     : Disabled
implicit_rdpcm_enabled_flag            : Disabled
explicit_rdpcm_enabled_flag            : Disabled
transform_skip_rotation_enabled_flag   : Disabled
transform_skip_context_enabled_flag    : Disabled
cross_component_prediction_enabled_flag: Disabled
high_precision_offsets_enabled_flag    : Disabled
persistent_rice_adaptation_enabled_flag: Disabled
cabac_bypass_alignment_enabled_flag    : Disabled
log2_sao_offset_scale_luma             : 0
log2_sao_offset_scale_chroma           : 0
Cost function:                         : Lossy coding (default)
RateControl                            : 0
WPMethod                               : 0
Max Num Merge Candidates               : 5

TOOL CFG: IBD:0 HAD:1 RDQ:1 RDQTS:1 RDpenalty:0 LQP:0 SQP:0 ASR:0 MinSearchWindow:8 RestrictMESampling:0 FEN:1 ECU:0 FDM:1 CFM:0 ESD:0 RQT:1 TransformSkip:1 TransformSkipFast:1 TransformSkipLog2MaxSize:2 Slice: M=0 SliceSegment: M=0 CIP:0 SAO:1 PCM:0 TransQuantBypassEnabled:0 WPP:0 WPB:0 PME:2  WaveFrontSynchro:0 WaveFrontSubstreams:1 ScalingList:0 TMVPMode:1 AQpS:0 SignBitHidingFlag:1 RecalQP:0

Non-environment-variable-controlled macros set as follows: 

                                RExt__DECODER_DEBUG_BIT_STATISTICS =   0
                                      RExt__HIGH_BIT_DEPTH_SUPPORT =   0
                            RExt__HIGH_PRECISION_FORWARD_TRANSFORM =   0
                                        O0043_BEST_EFFORT_DECODING =   0
                                         ME_ENABLE_ROUNDING_OF_MVS =   1

                   Input ChromaFormatIDC =   4:2:0
       Output (internal) ChromaFormatIDC =   4:2:0

POC    0 TId: 0 ( I-SLICE, nQP 26 QP 26 )     418088 bits [Y 42.8646 dB    U 43.7874 dB    V 44.1285 dB] [Y MSE 3.3621  U MSE 2.7185  V MSE 2.5132] [ET    12 ] [L0 ] [L1 ]
POC    8 TId: 0 ( B-SLICE, nQP 30 QP 30 )      83864 bits [Y 41.7588 dB    U 43.2854 dB    V 43.7040 dB] [Y MSE 4.3371  U MSE 3.0517  V MSE 2.7713] [ET    27 ] [L0 0 ] [L1 0 ]
POC    4 TId: 1 ( B-SLICE, nQP 31 QP 31 )      32248 bits [Y 41.7168 dB    U 43.3183 dB    V 43.7337 dB] [Y MSE 4.3792  U MSE 3.0287  V MSE 2.7524] [ET    26 ] [L0 0 8 ] [L1 8 0 ]
POC    2 TId: 2 ( B-SLICE, nQP 32 QP 32 )      20880 bits [Y 41.8175 dB    U 43.3482 dB    V 43.7739 dB] [Y MSE 4.2789  U MSE 3.0078  V MSE 2.7270] [ET    26 ] [L0 0 4 ] [L1 4 8 ]
POC    1 TId: 3 ( B-SLICE, nQP 33 QP 33 )      10656 bits [Y 41.9152 dB    U 43.4544 dB    V 43.8358 dB] [Y MSE 4.1837  U MSE 2.9352  V MSE 2.6884] [ET    25 ] [L0 0 2 ] [L1 2 4 ]
POC    3 TId: 3 ( B-SLICE, nQP 33 QP 33 )      11448 bits [Y 41.6468 dB    U 43.3309 dB    V 43.7687 dB] [Y MSE 4.4504  U MSE 3.0199  V MSE 2.7303] [ET    28 ] [L0 2 0 ] [L1 4 8 ]
POC    6 TId: 2 ( B-SLICE, nQP 32 QP 32 )      21800 bits [Y 41.5797 dB    U 43.2551 dB    V 43.7030 dB] [Y MSE 4.5197  U MSE 3.0730  V MSE 2.7719] [ET    27 ] [L0 4 0 ] [L1 8 4 ]
POC    5 TId: 3 ( B-SLICE, nQP 33 QP 33 )      12456 bits [Y 41.5673 dB    U 43.3008 dB    V 43.6896 dB] [Y MSE 4.5326  U MSE 3.0409  V MSE 2.7805] [ET    27 ] [L0 4 0 ] [L1 6 8 ]
POC    7 TId: 3 ( B-SLICE, nQP 33 QP 33 )      11824 bits [Y 41.5538 dB    U 43.2267 dB    V 43.6695 dB] [Y MSE 4.5467  U MSE 3.0932  V MSE 2.7934] [ET    24 ] [L0 6 4 ] [L1 8 6 ]
POC    9 TId: 3 ( B-SLICE, nQP 33 QP 33 )      19696 bits [Y 41.4120 dB    U 43.1222 dB    V 43.5762 dB] [Y MSE 4.6977  U MSE 3.1686  V MSE 2.8541] [ET    17 ] [L0 8 ] [L1 8 ]


SUMMARY --------------------------------------------------------
	Total Frames |   Bitrate     Y-PSNR    U-PSNR    V-PSNR    YUV-PSNR  Y-MSE     U-MSE     V-MSE    YUV-MSE 
	       10    a    7715.5200   41.7833   43.3429   43.7583   42.2823    4.3288    3.0138    2.7383    3.8445


I Slices--------------------------------------------------------
	Total Frames |   Bitrate     Y-PSNR    U-PSNR    V-PSNR    YUV-PSNR  Y-MSE     U-MSE     V-MSE    YUV-MSE 
	        1    i   50170.5600   42.8646   43.7874   44.1285   43.1985    3.3621    2.7185    2.5132    3.1134


P Slices--------------------------------------------------------
	Total Frames |   Bitrate     Y-PSNR    U-PSNR    V-PSNR    YUV-PSNR  Y-MSE     U-MSE     V-MSE    YUV-MSE 
	        0    p         -nan      -nan      -nan      -nan      -nan      -nan      -nan      -nan      -nan


B Slices--------------------------------------------------------
	Total Frames |   Bitrate     Y-PSNR    U-PSNR    V-PSNR    YUV-PSNR  Y-MSE     U-MSE     V-MSE    YUV-MSE 
	        9    b    2998.2933   41.6631   43.2936   43.7172   42.1915    4.4362    3.0466    2.7633    3.9258

RVM: 0.000
Bytes written to file: 80370 (7715.520 kbps)

 Total Time:      239.852 sec.

Because the encoder is so slow, I chose to encode only 10 frames.
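As a cross-check on the HM summary above, the logged figures follow from two simple relations: for 8-bit samples PSNR = 10*log10(255^2 / MSE), and average bitrate = bytes * 8 / duration. The sketch below is only an illustration; the function names are made up (they are not part of HM), and the 120 fps value is an assumption, chosen because it is the frame rate that reproduces the logged 7715.52 kbps.

```python
import math

def psnr_from_mse(mse, max_value=255):
    """PSNR in dB for a given mean squared error (8-bit samples by default)."""
    return 10 * math.log10(max_value ** 2 / mse)

def bitrate_kbps(num_bytes, num_frames, fps):
    """Average bitrate in kbit/s of num_frames frames played at fps."""
    duration_s = num_frames / fps
    return num_bytes * 8 / duration_s / 1000

# Values copied from the HM log above; fps=120 is an assumption.
print(round(psnr_from_mse(4.2789), 2))         # Y PSNR of POC 2, about 41.82 dB
print(round(bitrate_kbps(80370, 10, 120), 2))  # 7715.52, matching the log
```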

Last edited by rockerovo; 24th Jul 2018 at 13:27.

Quote

24th Jul 2018 13:26 #8

rockerovo

Member

For AVC, the CFG file for JM:

"attachments"

Then I ran this command:

Code:

$ ./lencod.exe -d mine.cfg >> AVC_1080_29.txt

AVC_1080_29.txt output :

Code:


	 -------------------------------------------------------------- 
  This file contains statistics for the last encoded sequence   
 -------------------------------------------------------------- 
 Sequence                     : /home/siraj/Desktop/Project/Samples/Jockey_1920x1080_120fps_420_8bit_YUV.yuv
 No.of coded pictures         :   10
 Freq. for encoded bitstream  :  120
 I Slice Bitrate(kb/s)        : 4459.01
 P Slice Bitrate(kb/s)        : 3454.94
 B Slice Bitrate(kb/s)        : 2158.56
 Total Bitrate(kb/s)          : 10076.64
 ME Level 0 Metric            : SAD
 ME Level 1 Metric            : Hadamard SAD
 ME Level 2 Metric            : Hadamard SAD
 Mode Decision Metric         : Hadamard SAD
 ME for components            : Y
 Image format                 : 1920x1080
 Error robustness             : Off
 Search range                 : 32
 Total number of references   : 5
 References for P slices      : 5
 List0 refs for B slices      : 5
 List1 refs for B slices      : 1
 Profile/Level IDC            : (100,52)
 Entropy coding method        : CABAC
 EPZS Pattern                 : Extended Diamond
 EPZS Dual Pattern            : Extended Diamond
 EPZS Fixed Predictors        : Aggressive
 EPZS Aggressive Predictors   : Disabled
 EPZS Temporal Predictors     : Enabled
 EPZS Spatial Predictors      : Enabled
 EPZS Threshold Multipliers   : (1 0 2)
 EPZS Subpel ME               : Basic
 EPZS Subpel ME BiPred        : Basic
 Search range restrictions    : none
 RD-optimized mode decision   : used

 ---------------------|----------------|---------------|
     Item             |     Intra      |   All frames  |
 ---------------------|----------------|---------------|
 SNR Y(dB)            | 41.26          | 40.74         |
 SNR U/V (dB)         | 42.23/42.73    | 42.09/42.62   |
 ---------------------|----------------|---------------|

 ---------------------|----------------|---------------|---------------|
     SNR              |        I       |       P       |       B       |
 ---------------------|----------------|---------------|---------------|
 SNR Y(dB)            |      41.256    |     40.900    |     40.574    |
 SNR U(dB)            |      42.228    |     42.100    |     42.068    |
 SNR V(dB)            |      42.733    |     42.625    |     42.601    |
 ---------------------|----------------|---------------|---------------|

 ---------------------|----------------|---------------|---------------|
     Ave Quant        |        I       |       P       |       B       |
 ---------------------|----------------|---------------|---------------|
        QP            |      29.000    |     29.000    |     29.000    |
 ---------------------|----------------|---------------|---------------|

 ---------------------|----------------|
   Intra              |   Mode used    |
 ---------------------|----------------|
 Mode 0  intra 4x4    |    278         |
 Mode 1  intra 8x8    |   6284         |
 Mode 2+ intra 16x16  |   1598         |
 Mode    intra IPCM   |      0         |
 ---------------------|----------------|-----------------|
   P Slice            |   Mode used    | MotionInfo bits |
 ---------------------|----------------|-----------------|
 Mode  0  (copy)      |  16182         |        0.00     |
 Mode  1  (16x16)     |   3621         |     7626.00     |
 Mode  2  (16x8)      |    515         |     2712.00     |
 Mode  3  (8x16)      |    623         |     2770.67     |
 Mode  4  (8x8)       |    238         |     3136.33     |
 Mode  5  intra 4x4   |    111         |-----------------|
 Mode  6  intra 8x8   |   2303         |
 Mode  7+ intra 16x16 |    887         |
 Mode     intra IPCM  |      0         |
 ---------------------|----------------|-----------------|
   B Slice            |   Mode used    | MotionInfo bits |
 ---------------------|----------------|-----------------|
 Mode  0  (copy)      |  42128         |        0.00     |
 Mode  1  (16x16)     |   4937         |     5343.00     |
 Mode  2  (16x8)      |    654         |     1342.00     |
 Mode  3  (8x16)      |    736         |     1661.17     |
 Mode  4  (8x8)       |    212         |      641.50     |
 Mode  5  intra 4x4   |     19         |-----------------|
 Mode  6  intra 8x8   |    150         |
 Mode  7+ intra 16x16 |    124         |
 Mode     intra IPCM  |      0         |
 ---------------------|----------------|

 ---------------------|----------------|----------------|----------------|----------------|
  Bit usage:          |      Intra     |      Inter     |    B frame     |    SP frame    |
 ---------------------|----------------|----------------|----------------|----------------|
 Header               |      32.00     |      32.00     |      32.00     |
 Mode                 |   66375.00     |   18769.67     |    7295.83     |
 Motion Info          |        ./.     |   16245.00     |    8987.67     |
 CBP Y/C              |   27092.00     |   10668.00     |    2928.67     |
 Coeffs. Y            |  216124.00     |   41784.67     |    9169.50     |       0.00     |
 Coeffs. C            |   61226.00     |    8127.00     |    1330.67     |       0.00     |
 Coeffs. CB           |       0.00     |       0.00     |       0.00     |       0.00     |
 Coeffs. CR           |       0.00     |       0.00     |       0.00     |       0.00     |
 Delta quant          |     456.00     |     113.00     |      26.67     |
 Stuffing Bits        |       7.00     |       6.33     |       5.67     |
 ---------------------|----------------|----------------|----------------|
 average bits/frame   |  371312.00     |   95745.66     |   29776.67     |
 ---------------------|----------------|----------------|----------------|

For VP9, I ran these commands:

Code:

ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -vf fps=fps=120 -keyint_min 50 -g 50 -pass 1 -passlogfile jockey-2160 -c:v libvpx-vp9 -threads 8 -cpu-used 4 -tile-columns 3 -frame-parallel 1 -b:v 0 -crf 29 -an -f webm -y NUL

Code:

ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -vf fps=fps=120 -keyint_min 50 -g 50 -pass 2 -passlogfile jockey-2160 -c:v libvpx-vp9 -threads 8 -cpu-used 3 -tile-columns 3 -frame-parallel 1 -auto-alt-ref 1 -b:v 0 -crf 29 -an -f webm -y jockey.webm

Then I passed the WebM file to the decoder with this command:

Code:

./vpxdec  jockey.webm --420 -o vp9.yuv

Attached Files

mine.cfg.zip (12.0 KB, 19 views)

Quote

24th Jul 2018 13:36 #9

rockerovo

Member

I'm still studying video coding, but I want to know how things work.

The results, comparing each decoded YUV with the original YUV:

https://drive.google.com/drive/folders/1kvBg-_FuOkVefb-sylVx2qx_7D6AGEP1?usp=sharing

As you can see, VP9 looks the best of them. But I don't think I can fairly compare JM and HW with FFmpeg's VP9, since VP9 is the fastest right now. How can I make a fair comparison?
Do you recommend that I switch from HW and JM to x265 and x264?

One last question: how can I determine the bitrate of a YUV file, so that I can compare the bitrate savings?
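For the decoded-YUV versus original-YUV comparison mentioned above, a per-frame luma PSNR can be computed directly from the two raw files. This is a minimal pure-Python sketch, assuming both files are 8-bit yuv420p with the same resolution; the function name is hypothetical, and a NumPy-based version would be much faster on 1080p material.

```python
import math

def luma_psnr_per_frame(ref_path, test_path, width, height):
    """Yield the Y-PSNR (dB) of each frame for two 8-bit yuv420p files."""
    luma_size = width * height
    frame_size = luma_size * 3 // 2  # Y plane plus quarter-size U and V planes
    with open(ref_path, "rb") as ref, open(test_path, "rb") as test:
        while True:
            a = ref.read(frame_size)
            b = test.read(frame_size)
            if len(a) < frame_size or len(b) < frame_size:
                break  # stop at the end of the shorter file
            # Only the first luma_size bytes of each frame are luma samples.
            sq_err = sum((x - y) ** 2 for x, y in zip(a[:luma_size], b[:luma_size]))
            mse = sq_err / luma_size
            yield math.inf if mse == 0 else 10 * math.log10(255 ** 2 / mse)
```

It could be used as `for i, p in enumerate(luma_psnr_per_frame("original.yuv", "vp9.yuv", 1920, 1080)): print(i, p)`, where the file names are placeholders. Stopping at the shorter file matters here because only part of the source sequence was encoded.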

Quote

24th Jul 2018 13:58 #10

pandy

Member

Originally Posted by rockerovo

One last question: how can I determine the bitrate of a YUV file, so that I can compare the bitrate savings?

I can only give a correct answer to this question. The YUV (YCbCr) bandwidth calculation is quite simple. For 8-bit YCbCr 4:2:0: (HX*VY*1.5*FPS)/125000 = Mbps, where:
HX - number of pixels per line
VY - number of lines
FPS - frames per second

For 1920x1080 at 30 fps the required bandwidth is 746.496 Mbps.
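The formula above can be sketched in a few lines of Python. The function name and the `bytes_per_pixel` parameter are illustrative assumptions; 1.5 bytes per pixel corresponds to 8-bit 4:2:0, and dividing bytes per second by 125000 is the same as multiplying by 8 and dividing by 1,000,000.

```python
def raw_yuv_mbps(width, height, fps, bytes_per_pixel=1.5):
    """Uncompressed bitrate in Mbit/s (1.5 bytes per pixel = 8-bit 4:2:0)."""
    bytes_per_second = width * height * bytes_per_pixel * fps
    return bytes_per_second * 8 / 1_000_000

print(raw_yuv_mbps(1920, 1080, 30))   # 746.496
print(raw_yuv_mbps(1920, 1080, 120))  # 2985.984, the 120 fps Jockey case
```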

Quote

24th Jul 2018 14:16 #11

rockerovo

Member

Originally Posted by pandy

Originally Posted by rockerovo

One last question: how can I determine the bitrate of a YUV file, so that I can compare the bitrate savings?

I can only give a correct answer to this question. The YUV (YCbCr) bandwidth calculation is quite simple. For 8-bit YCbCr 4:2:0: (HX*VY*1.5*FPS)/125000 = Mbps, where:
HX - number of pixels per line
VY - number of lines
FPS - frames per second

For 1920x1080 at 30 fps the required bandwidth is 746.496 Mbps.

Look at this, from "Performance Comparison of High-Efficiency Video Coding (HEVC) with H.264 AVC":

[Attachment 46174]

Compression factor

How did he get this from HW and JM? Did he convert the outputs to MP4 or MKV?

I ask because, as I understand it, the point of all this coding is to get a file with good quality at a lower bitrate. So how can I calculate the bitrate of the original YUV and of the encoder's output file?

Quote

24th Jul 2018 18:24 #12

pandy

Member

Originally Posted by rockerovo

Look at this, from "Performance Comparison of High-Efficiency Video Coding (HEVC) with H.264 AVC":

[Attachment 46174]

Compression factor

How did he get this from HW and JM? Did he convert the outputs to MP4 or MKV?

I ask because, as I understand it, the point of all this coding is to get a file with good quality at a lower bitrate. So how can I calculate the bitrate of the original YUV and of the encoder's output file?

No. Just take the size of the raw H.264 file in bits (multiply bytes by 8) and divide by the duration; that gives you the bitrate. It works the same way for YUV; the only difference is that YUV is uncompressed and therefore far bigger.

MKV and MP4 are containers. Their main purpose is to organise video, audio and other substreams into a single unit, i.e. to encapsulate multiple streams within one file. Each container adds some overhead of its own.

As you are comparing only video, you can compare the raw compressed video sizes.
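Comparing raw sizes, as described above, boils down to one division. The sketch below is only an illustration with a hypothetical function name; it assumes 8-bit 4:2:0 input and computes the raw size from the number of frames actually encoded, which matters when only part of the source YUV was encoded.

```python
import os

def compression_factor(bitstream_path, width, height, num_frames):
    """Raw 8-bit 4:2:0 size of the encoded frames divided by the coded size."""
    raw_bytes = width * height * 3 // 2 * num_frames
    coded_bytes = os.path.getsize(bitstream_path)
    return raw_bytes / coded_bytes
```

Because both sizes cover the same frames and therefore the same duration, this ratio is also the ratio of the raw bitrate to the compressed bitrate.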

Quote

24th Jul 2018 20:01 #13

gdx

Member

Originally Posted by rockerovo
for VP9 , i run this command :
Code:
ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -vf fps=fps=120 -keyint_min 50 -g 50 -pass 1 -passlogfile jockey-2160 -c:v libvpx-vp9 -threads 8 -cpu-used 4 -tile-columns 3 -frame-parallel 1 -b:v 0 -crf 29 -an -f webm -y NUL
Code:
ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -vf fps=fps=120 -keyint_min 50 -g 50 -pass 2 -passlogfile jockey-2160 -c:v libvpx-vp9 -threads 8 -cpu-used 3 -tile-columns 3 -frame-parallel 1 -auto-alt-ref 1 -b:v 0 -crf 29 -an -f webm -y jockey.webm
For VP9 I recommend using something like:
Code:
ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -frames:v 10 -vf fps=fps=120 -keyint_min 50 -g 50 -c:v libvpx-vp9 -pass 1 -passlogfile jockey-2160 -b:v 0 -crf 29 -threads 8 -deadline good -cpu-used 4 -tile-columns 0 -tile-rows 0 -frame-parallel 0 -qmax 29 -auto-alt-ref 1 -aq-mode 0 -an -f null NUL
ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -frames:v 10 -vf fps=fps=120 -keyint_min 50 -g 50 -c:v libvpx-vp9 -pass 2 -passlogfile jockey-2160 -b:v 0 -crf 29 -row-mt 1 -threads 8 -deadline good -cpu-used 1 -tile-columns 3 -tile-rows 0 -frame-parallel 0 -qmax 29 -auto-alt-ref 1 -aq-mode 0 -an -f webm -y jockey_1.webm
That is basically the configuration I use with VP9. For testing it is also interesting to use "-deadline best" and "-cpu-used 0" in the second pass; this is the slowest and best-quality mode of VP9, but I always recommend "-deadline good -cpu-used 1" for practical purposes. row-mt is recommended in the second pass, as it doesn't hurt quality and it boosts compression speed.

Also, the quantization scale is not the same across video codecs, so q29 in x265 or x264 is not the same as q29 in VP9 (the equivalent is more likely somewhere in the range 36-46).

Quote

24th Jul 2018 23:36 #14

rockerovo

Member

Originally Posted by gdx
Originally Posted by rockerovo
for VP9 , i run this command :
Code:
ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -vf fps=fps=120 -keyint_min 50 -g 50 -pass 1 -passlogfile jockey-2160 -c:v libvpx-vp9 -threads 8 -cpu-used 4 -tile-columns 3 -frame-parallel 1 -b:v 0 -crf 29 -an -f webm -y NUL
Code:
ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -vf fps=fps=120 -keyint_min 50 -g 50 -pass 2 -passlogfile jockey-2160 -c:v libvpx-vp9 -threads 8 -cpu-used 3 -tile-columns 3 -frame-parallel 1 -auto-alt-ref 1 -b:v 0 -crf 29 -an -f webm -y jockey.webm
For VP9 I recommend using something like:
Code:
ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -frames:v 10 -vf fps=fps=120 -keyint_min 50 -g 50 -c:v libvpx-vp9 -pass 1 -passlogfile jockey-2160 -b:v 0 -crf 29 -threads 8 -deadline good -cpu-used 4 -tile-columns 0 -tile-rows 0 -frame-parallel 0 -qmax 29 -auto-alt-ref 1 -aq-mode 0 -an -f null NUL
ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -frames:v 10 -vf fps=fps=120 -keyint_min 50 -g 50 -c:v libvpx-vp9 -pass 2 -passlogfile jockey-2160 -b:v 0 -crf 29 -row-mt 1 -threads 8 -deadline good -cpu-used 1 -tile-columns 3 -tile-rows 0 -frame-parallel 0 -qmax 29 -auto-alt-ref 1 -aq-mode 0 -an -f webm -y jockey_1.webm
That is basically the configuration I use with VP9. For testing it is also interesting to use "-deadline best" and "-cpu-used 0" in the second pass; this is the slowest and best-quality mode of VP9, but I always recommend "-deadline good -cpu-used 1" for practical purposes. row-mt is recommended in the second pass, as it doesn't hurt quality and it boosts compression speed.

Also, the quantization scale is not the same across video codecs, so q29 in x265 or x264 is not the same as q29 in VP9 (the equivalent is more likely somewhere in the range 36-46).
Okay, how can I get roughly equivalent settings across AVC, HEVC, VP9 and AV1 to make the comparison fair?

Quote

25th Jul 2018 03:33 #15

pandy

Member

Originally Posted by rockerovo

Okay, how can I get roughly equivalent settings across AVC, HEVC, VP9 and AV1 to make the comparison fair?

You need to understand each codec and verify your settings. This is very difficult and challenging. Everything I wrote earlier was not meant to bash you but to trigger some thoughts about your work. You could, for example, search for settings where PSNR or SSIM is as close as possible, but even a small difference in the numbers makes the results hard to compare. Bitrate seems to be less biased, so IMHO you should set a target bitrate and tune the settings to reach it. However, you should also be concerned about the maturity of each evaluated codec: some are mature and some are not, and codec efficiency depends on maturity. Of course there is a learning curve, so every new codec profits from the knowledge acquired during the development of older codecs, but not everything can be carried over easily.

Quote

25th Jul 2018 05:52 #16

fornit

Member

Originally Posted by gdx
For VP9 I recommend using something like:
Code:
ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -frames:v 10 -vf fps=fps=120 -keyint_min 50 -g 50 -c:v libvpx-vp9 -pass 1 -passlogfile jockey-2160 -b:v 0 -crf 29 -threads 8 -deadline good -cpu-used 4 -tile-columns 0 -tile-rows 0 -frame-parallel 0 -qmax 29 -auto-alt-ref 1 -aq-mode 0 -an -f null NUL
ffmpeg -f rawvideo -c:v rawvideo -s 1920x1080 -r 120 -pix_fmt yuv420p -i Jockey_1920x1080_120fps_420_8bit_YUV.yuv -frames:v 10 -vf fps=fps=120 -keyint_min 50 -g 50 -c:v libvpx-vp9 -pass 2 -passlogfile jockey-2160 -b:v 0 -crf 29 -row-mt 1 -threads 8 -deadline good -cpu-used 1 -tile-columns 3 -tile-rows 0 -frame-parallel 0 -qmax 29 -auto-alt-ref 1 -aq-mode 0 -an -f webm -y jockey_1.webm
That is basically the configuration I use with VP9. For testing it is also interesting to use "-deadline best" and "-cpu-used 0" in the second pass; this is the slowest and best-quality mode of VP9, but I always recommend "-deadline good -cpu-used 1" for practical purposes. row-mt is recommended in the second pass, as it doesn't hurt quality and it boosts compression speed.

Also, the quantization scale is not the same across video codecs, so q29 in x265 or x264 is not the same as q29 in VP9 (the equivalent is more likely somewhere in the range 36-46).
Thank you for the hint with respect to row-mt

I haven't seen it before and it seems to be an interesting improvement!

But also with -row-mt you need to consider that there is a relationship between horizontal resolution (width), -tile-columns and -threads. It is described at the following url:
http://permalink.gmane.org/gmane.comp.multimedia.webm.devel/2339

As far as I can see, -row-mt now allows you to double the number of threads. See:
https://groups.google.com/a/webmproject.org/forum/#!topic/codec-devel/oiHjgEdii2U
https://developers.google.com/media/vp9/live-encoding/

So, if you set "-row-mt 1", this means:
width < 512 => tile-columns = 0, threads = 2

512 <= width < 1024 => tile-columns = 1, threads = 4

1024 <= width < 2048 => tile-columns = 2, threads = 8

width >= 2048 => tile-columns = 3, threads = 16

For a resolution of 1920 * 1080 px you should use:
-tile-columns 2 -threads 8

If you set -tile-columns to 0 (as in your 1st pass), then the encoder can only use 2 threads. With a width of 1920 you can set -tile-columns to a maximum of 2.

Quote

25th Jul 2018 10:25 #17

gdx

Member


Originally Posted by rockerovo

Okay, how can I get roughly equivalent settings across AVC, HEVC, VP9 and AV1 to make the comparison fair?

That is difficult. The correct way is to compare across a set of quality/bitrate targets instead of a single target. You also have to keep in mind what you want to compare, as some tuning can worsen visual quality while boosting the metrics, or conversely hurt the metrics while boosting perceived quality. One example of what tuning can do: in x265, using the "veryslow" preset with "-x265-params keyint=250:ref=6:limit-refs=3:rc-lookahead=60:amp=1:subme=7:aq-mode=3:aq-strength=1.25:psy-rd=0.75:psy-rdoq=2.00" makes a crf25 file approximately the same size as a crf23 file without tuning. The quality degradation in the bright parts of the video is unnoticeable, but the dark parts look better in the tuned crf25 version than in the untuned crf23 version, so the tuned version is perceived as having better visual quality.

For example, in this comparison https://wyohknott.github.io/video-formats-comparison/ I find that if he had used "--cpu-used=1" for VP9, the metrics would have improved and the encoder speed would have been closer to that of x265. Also, for AV1 he used "--cpu-used=4", which lowers the quality, but that is understandable as AV1 encoding is painfully slow for now.

I also recommend reading this comparison: http://goughlui.com/2016/08/27/video-compression-testing-x264-vs-x265-crf-in-handbrake-0-10-5/ It is not perfect either, but at least it serves its purpose.
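Comparing across a set of bitrate targets, as suggested above, means measuring several (bitrate, quality) points per codec and then comparing the resulting curves rather than single encodes. The sketch below shows one simple way to read a curve at a common bitrate, by interpolating PSNR linearly over log(bitrate). The function name is hypothetical and the numbers are placeholders for illustration only, not measurements from this thread.

```python
import math

def psnr_at_bitrate(rd_points, target_kbps):
    """Interpolate PSNR linearly over log(bitrate) between measured points.

    rd_points: list of (bitrate_kbps, psnr_db) tuples from several encodes.
    """
    pts = sorted(rd_points)
    for (r0, p0), (r1, p1) in zip(pts, pts[1:]):
        if r0 <= target_kbps <= r1:
            t = (math.log(target_kbps) - math.log(r0)) / (math.log(r1) - math.log(r0))
            return p0 + t * (p1 - p0)
    raise ValueError("target bitrate is outside the measured range")

# Hypothetical placeholder points, NOT results from any real encode.
codec_a = [(1000, 36.0), (2000, 38.0), (4000, 40.0)]
codec_b = [(1000, 37.0), (2000, 39.0), (4000, 41.0)]
gain_db = psnr_at_bitrate(codec_b, 3000) - psnr_at_bitrate(codec_a, 3000)
print(f"PSNR difference at 3000 kbps: {gain_db:.2f} dB")
```

The same idea works with SSIM or VMAF in place of PSNR; the point is that each codec contributes a curve, not a single number.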

Originally Posted by fornit

Thank you for the hint with respect to row-mt

I haven't seen it before and it seems to be an interesting improvement!

But also with -row-mt you need to consider that there is a relationship between horizontal resolution (width), -tile-columns and -threads. It is described at the following url:
http://permalink.gmane.org/gmane.comp.multimedia.webm.devel/2339

As far as I can see, -row-mt now allows you to double the number of threads. See:
https://groups.google.com/a/webmproject.org/forum/#!topic/codec-devel/oiHjgEdii2U
https://developers.google.com/media/vp9/live-encoding/

So, if you set "-row-mt 1", this means:
width < 512 => tile-columns = 0, threads = 2

512 <= width < 1024 => tile-columns = 1, threads = 4

1024 <= width < 2048 => tile-columns = 2, threads = 8

width >= 2048 => tile-columns = 3, threads = 16

For a resolution of 1920 * 1080 px you should use:
-tile-columns 2 -threads 8

If you set -tile-columns to 0 (as in your 1st pass), then the encoder can only use 2 threads. With a width of 1920 you can set -tile-columns to a maximum of 2.

You are wrong about several things. tile-columns actually limits the maximum number of tiles the encoder can use, each having a width of 256 pixels; the encoder uses what it needs, not the specified value. Also, row-mt doesn't merely double the number of threads, it more than doubles it (for tile-columns 0 it at least triples the threads). Its performance gains come in two flavors: it adds more threads and it makes the threads more efficient.

One thing that demonstrates you are wrong is that with row-mt 1 and tile-columns 1, encoding a 1280p video on my machine, I'm able to max out the 8 threads my CPU is capable of. Your table says I would have 4 threads, while in reality I was using 8.

Also, a 2x speedup is not the same as 2x threads. You need to take Amdahl's law into consideration, as parallelization is not perfect in most cases.
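Amdahl's law, mentioned above, says that if a fraction p of the work can be parallelized, the speedup with n threads is 1 / ((1 - p) + p / n). A minimal sketch follows; the 0.9 parallel fraction is an arbitrary assumption for illustration, not a measured property of libvpx.

```python
def amdahl_speedup(parallel_fraction, threads):
    """Theoretical speedup when only parallel_fraction of the work scales."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / threads)

# parallel_fraction=0.9 is an assumed value, chosen only for illustration.
for n in (2, 4, 8, 16):
    print(n, "threads ->", round(amdahl_speedup(0.9, n), 2), "x")
```

With any serial portion at all, doubling the threads gives less than double the speed, and the gap widens as the thread count grows.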

Last edited by gdx; 25th Jul 2018 at 10:34.

Quote

25th Jul 2018 11:48 #18

fornit

Member

Originally Posted by gdx

You are wrong about several things. tile-columns actually limits the maximum number of tiles the encoder can use, each having a width of 256 pixels; the encoder uses what it needs, not the specified value.

Sorry, but you are mixing things up. Tiles have a minimum width of 256 px. But please note that the -tile-columns option is in log2 format. So -tile-columns 3 means: 8 tiles (2^3 = 8). And 8 * 256 = 2048. That's why it seems unlikely that your encoding suggestion of 3 tile-columns (8 tiles) for 1920 px can really make sense.

The post I linked above, which confirms this, was written by VP9 developer Yunqing Wang. If that isn't enough for you, please find a more detailed explanation at:
https://stackoverflow.com/questions/41372045/vp9-encoding-limited-to-4-threads
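The log2 relationship described above can be written down as a small calculation. This is only a sketch of the rule as stated in this post (minimum tile width of 256 px, -tile-columns given as log2 of the tile count); the function name is hypothetical and any additional limits the encoder applies are not modelled.

```python
MIN_TILE_WIDTH = 256  # pixels, as stated in the post above

def max_tile_columns_log2(width):
    """Largest -tile-columns value (log2 of the tile count) that fits width."""
    max_tiles = max(1, width // MIN_TILE_WIDTH)
    return max_tiles.bit_length() - 1  # floor(log2(max_tiles)), integer only

for w in (640, 1280, 1920, 3840):
    print(w, "px wide ->", "-tile-columns", max_tile_columns_log2(w))
```

For a 1920 px wide frame only 7 tiles of 256 px fit, so the largest power of two is 4 tiles, i.e. -tile-columns 2.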

Originally Posted by gdx

Also, row-mt doesn't merely double the number of threads, it more than doubles it (for tile-columns 0 it at least triples the threads). Its performance gains come in two flavors: it adds more threads and it makes the threads more efficient.

I don't know that and have to test it. The Google documentation at https://developers.google.com/media/vp9/live-encoding/ says that it's doubling the number of threads. Maybe it's wrong. If you can really have 8 threads with 2 tiles then this will surely make me happy

Quote

26th Jul 2018 07:09 #19

gdx

Member

Originally Posted by fornit

Sorry, but you are mixing things up. Tiles have a minimum width of 256 px. But please note that the -tile-columns option is in log2 format. So -tile-columns 3 means: 8 tiles (2^3 = 8). And 8 * 256 = 2048. That's why it seems unlikely that your encoding suggestion of 3 tile-columns (8 tiles) for 1920 px can really make sense.

The post I linked above, which confirms this, was written by VP9 developer Yunqing Wang. If that isn't enough for you, please find a more detailed explanation at:
https://stackoverflow.com/questions/41372045/vp9-encoding-limited-to-4-threads

True, I had been mixing things up. Ironically, in nearly all my scripts I use "-tile-columns 2". On the other hand, if you put "-tile-columns 3" in a 1080p ffmpeg command, it is treated as if you had put "-tile-columns 2", so no damage is done. One explanation for what I did is that the ffmpeg command was originally written for 4K and then altered for 1080p.

Originally Posted by fornit

I don't know that and have to test it. The Google documentation at https://developers.google.com/media/vp9/live-encoding/ says that it's doubling the number of threads. Maybe it's wrong. If you can really have 8 threads with 2 tiles then this will surely make me happy

Besides, most of the VP9 documentation is obsolete or incorrect for recent versions; really, the VP9 documentation is a mess. In fact, neither ffmpeg nor vpxenc is a good way to verify the thread count: ffmpeg creates additional threads for itself, and vpxenc limits the threads to the number of CPU threads. In the latter case, if vpxenc without row-mt uses 4 threads and your CPU supports 8, enabling row-mt does double the threads, but if your CPU supports 12 threads it will spawn more, possibly all 12 (this may explain why they wrote that row-mt doubles the threads).

P.S.:

Revisiting this: https://groups.google.com/a/webmproject.org/forum/#!topic/codec-devel/oiHjgEdii2U you can read:

In tests[1] of encoding HD videos with 4 column tiles, the improved VP9 MT encoder achieved speedups over the original of 11% with 2 threads, 27% with 4 threads, 101% with 8 threads, and 135% with 16 threads.

So it is telling you that a 1080p video with 4 tile columns and row-mt enabled can spawn 16 threads, which basically multiplies the threads by 4, contradicting the https://developers.google.com/media/vp9/live-encoding/ page.

Last edited by gdx; 26th Jul 2018 at 07:41.

Quote

26th Jul 2018 19:33 #20

fornit

Member

Originally Posted by gdx

Revisiting this: https://groups.google.com/a/webmproject.org/forum/#!topic/codec-devel/oiHjgEdii2U you can read:

In tests[1] of encoding HD videos with 4 column tiles, the improved VP9 MT encoder achieved speedups over the original of 11% with 2 threads, 27% with 4 threads, 101% with 8 threads, and 135% with 16 threads.

So it is telling you that a 1080p video with 4 tile columns and row-mt enabled can spawn 16 threads, which basically multiplies the threads by 4, contradicting the https://developers.google.com/media/vp9/live-encoding/ page.

That sounds great. Thank you for digging that out!

To give a summary of our discussion and to keep my table up-to-date, this will mean the following.

With the setting of "-row-mt 1" tile-columns and threads can be set as follows:
width < 512 => tile-columns = 0, threads = 4

512 <= width < 1024 => tile-columns = 1, threads = 8

1024 <= width < 2048 => tile-columns = 2, threads = 16

width >= 2048 => tile-columns = 3, threads = 32

I'll report back after I've tested it myself. I will also have a look at the behaviour on Windows.

Quote

26th Jul 2018 20:33 #21

gdx

Member

It is not as simple as that table. With vpxenc, tile-columns 0 plus row-mt spawns 8 threads, at least from 428 pixels of height upwards. But the number of threads and the efficiency of those threads are not the same thing: in a previous test of mine, the speed ranking for a 720p video with 8 threads was: [tiles 2 row-mt 1] = [tiles 1 row-mt 1] > [tiles 0 row-mt 1] > [tiles 2 row-mt 0] > [tiles 1 row-mt 0] > [tiles 0 row-mt 0].

Quote

29th Jul 2018 21:02 #22

rockerovo

Member

Originally Posted by pandy

Originally Posted by rockerovo

Look at this, from "Performance Comparison of High-Efficiency Video Coding (HEVC) with H.264 AVC":

[Attachment 46174]

Compression factor

How did he get this from HW and JM? Did he convert the outputs to MP4 or MKV?

I ask because, as I understand it, the point of all this coding is to get a file with good quality at a lower bitrate. So how can I calculate the bitrate of the original YUV and of the encoder's output file?

No. Just take the size of the raw H.264 file in bits (multiply bytes by 8) and divide by the duration; that gives you the bitrate. It works the same way for YUV; the only difference is that YUV is uncompressed and therefore far bigger.

MKV and MP4 are containers. Their main purpose is to organise video, audio and other substreams into a single unit, i.e. to encapsulate multiple streams within one file. Each container adds some overhead of its own.

As you are comparing only video, you can compare the raw compressed video sizes.

Thank you.
But what is the bitstream format for VP9 and AV1? WebM? I thought WebM was a container.

Quote

30th Jul 2018 04:45 #23

pandy

Member

Originally Posted by rockerovo

But what is the bitstream format for VP9 and AV1? WebM? I thought WebM was a container.

A bitstream is a sequence of bits that carries data (for example compressed video) plus the additional information that allows a decoder to restore, in our case, the video. If you look at almost any sane codec specification you will immediately realize that video codecs (especially the standardised ones) are very complex structures. To handle that complexity they use particular mechanisms such as sequences, headers and even primitive commands, so at some point they resemble a primitive quasi programming language...

Quote

30th Jul 2018 04:58 #24

LigH.de

Member

Generalized, the bitstream is the content inside any container.

The common "raw" (or, minimalistic container) format for VPx and AV1 is called IVF, the vintage "Indeo Video File" format.

Additionally, AV1 supports an "OBU" format (Open Bitstream Units) which is more similar to AVC and HEVC raw video streams.

Last edited by LigH.de; 30th Jul 2018 at 05:28.

Quote

30th Jul 2018 09:30 #25

rockerovo

Member

Originally Posted by LigH.de

Generalized, the bitstream is the content inside any container.

The common "raw" (or, minimalistic container) format for VPx and AV1 is called IVF, the vintage "Indeo Video File" format.

Additionally, AV1 supports an "OBU" format (Open Bitstream Units) which is more similar to AVC and HEVC raw video streams.

Thank you for explaining to me the differences between them

I'm trying to output an IVF bitstream with vpxenc (VP9), but I think there is a problem, or my command may be wrong, because the IVF file is larger than the WebM file produced with the same settings.

For IVF output:
Code:
./vpxenc --output=ReadySteadyGo_1920x1080_120fps_420_8bit.ivf --codec=vp9 --passes=2 --good --verbose --psnr --i420 --threads=8 --width=1920 --height=1080 --profile=0 --fps=50000/1001 --min-q=29 --max-q=29 --kf-min-dist=50 --kf-max-dist=50 --cpu-used=1 --auto-alt-ref=1 --input-bit-depth=8 --tile-columns=2 /home/siraj/Desktop/Project/Samples/1080p/ReadySteadyGo_1920x1080_120fps_420_8bit_YUV.yuv --bit-depth=8 --row-mt=1 --end-usage=q --ivf --limit=100
For WebM output:
Code:
./vpxenc --output=ReadySteadyGo_1920x1080_120fps_420_8bit_webm.webm --codec=vp9 --passes=2 --good --verbose --psnr --i420 --threads=8 --width=1920 --height=1080 --profile=0 --fps=50000/1001 --min-q=29 --max-q=29 --kf-min-dist=50 --kf-max-dist=50 --cpu-used=1 --auto-alt-ref=1 --input-bit-depth=8 --tile-columns=2 /home/siraj/Desktop/Project/Samples/1080p/ReadySteadyGo_1920x1080_120fps_420_8bit_YUV.yuv --bit-depth=8 --row-mt=1 --end-usage=q --webm --limit=100
Is it normal for the IVF file to be larger than the WebM file?

Quote

30th Jul 2018 16:56 #26

LigH.de

Member

Well, unfortunately I lack practical experience.

IVF still has a container header and additional headers to every video frame. It is indeed possible that Matroska (WebM is a Matroska subformat) can store it more efficiently. Apparently there is no real raw format vpxenc would produce, because it may be impossible to read it back into a container without losing important information.

I guess the OBU format aomenc can create comes close to a "raw" video format without container and with minimal headers. But vpxenc does not create it.
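To see how much of an IVF file is container overhead rather than coded video, the per-frame headers mentioned above can be walked directly. This is a minimal sketch, assuming the IVF layout written by libvpx: a file header starting with the signature `DKIF` whose length (normally 32 bytes) is stored at byte offset 6, followed for each frame by a 12-byte header (4-byte little-endian frame size, 8-byte timestamp) and the frame data. The function name is hypothetical.

```python
import struct

def ivf_payload_bytes(data):
    """Return (frame_count, payload_bytes, overhead_bytes) for IVF file bytes."""
    if data[:4] != b"DKIF":
        raise ValueError("not an IVF file")
    header_len = struct.unpack_from("<H", data, 6)[0]  # normally 32
    pos, frames, payload = header_len, 0, 0
    while pos + 12 <= len(data):
        # 4-byte little-endian frame size, followed by an 8-byte timestamp.
        frame_size = struct.unpack_from("<I", data, pos)[0]
        pos += 12 + frame_size
        frames += 1
        payload += frame_size
    return frames, payload, header_len + 12 * frames
```

Called as `ivf_payload_bytes(open("jockey.ivf", "rb").read())`, the second value is the size of the coded frames alone, which is the number to use when comparing against a raw .264 or .265 stream.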

Quote

30th Jul 2018 17:30 #27

fornit

Member

Hm,

I've just tried the jockey.webm example I created for you last week:

ffmpeg -i jockey.webm -c:v copy -f ivf jockey.ivf

Size seems to be the same (or nearly the same) as with webm:
jockey.webm: 15.282 KB
jockey.ivf: 15.283 KB

Quote

31st Jul 2018 07:11 #28

rockerovo

Member

Originally Posted by LigH.de

Well, unfortunately I lack practical experience.

IVF still has a container header and additional headers to every video frame. It is indeed possible that Matroska (WebM is a Matroska subformat) can store it more efficiently. Apparently there is no real raw format vpxenc would produce, because it may be impossible to read it back into a container without losing important information.

I guess the OBU format aomenc can create comes close to a "raw" video format without container and with minimal headers. But vpxenc does not create it.

Thank you for clarifying the difference

31st Jul 2018 07:16 #29

rockerovo

Member

Originally Posted by fornit

Hm,

I've just tried the jockey.webm example I created for you last week:

ffmpeg -i jockey.webm -c:v copy -f ivf jockey.ivf

Size seems to be the same (or nearly the same) as with webm:
jockey.webm: 15.282 KB
jockey.ivf: 15.283 KB

So if I want to compare bitstreams, should I use IVF alongside .264 and .265?
And what is the difference between .265 and .hevc? Are they the same?

31st Jul 2018 07:39 #30

rockerovo

Member

I am a newbie in video coding, so I have some new questions.
As I understand the quantization level (QP), choosing a higher value gives better compression but worse quality? I thought it was the same as the quantizer in PCM, where a higher level means higher quality and less error?
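As a rough illustration of the QP question: in H.264 and HEVC the QP is an index into the quantizer step size, not a count of levels, and the step size roughly doubles every 6 QP. The sketch below uses the commonly quoted approximation Qstep ≈ 2^((QP − 4) / 6); the function name `qstep` is only for this example, and the formula does not apply to the 0..63 values passed to vpxenc via --min-q/--max-q, which use a different scale.

```python
def qstep(qp):
    """Approximate H.264/HEVC quantizer step size for a given QP.

    Assumption: the usual approximation Qstep = 2 ** ((QP - 4) / 6).
    A larger step means coarser quantization: fewer bits, more error.
    """
    return 2 ** ((qp - 4) / 6)

# Print a few values to show the doubling every 6 QP.
for qp in (22, 28, 34, 40):
    print(qp, round(qstep(qp), 2))
```

So a higher QP means a coarser step, which is the opposite direction to "more levels" in PCM: more PCM levels correspond to a smaller step and therefore to a lower QP.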

I still don't know how to measure the bitrate of a bitstream. Let's say I have a .265 file with a size of 100 kB and a duration of 3 sec and 30 ms. So I need to multiply 100 by 8 and divide it by the duration, which is 3.33, and then I get the average bitrate, right?
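The calculation described here can be sketched as follows. This is only an illustration: it assumes 1 kB = 1000 bytes and 1 kbit = 1000 bits, and the helper names are invented for the example. For a raw bitstream without timestamps, the duration can be derived from the frame count and the frame rate given to the encoder.

```python
def average_bitrate_kbps(size_bytes, duration_s):
    """Average bitrate in kbit/s: total bits divided by duration."""
    return size_bytes * 8 / duration_s / 1000

def duration_from_frames(frame_count, fps_num, fps_den):
    """Duration in seconds of frame_count frames at fps_num/fps_den fps."""
    return frame_count * fps_den / fps_num

# The figures from the post: 100 kB and a duration taken as 3.33 s.
print(round(average_bitrate_kbps(100_000, 3.33), 2))

# 100 frames (--limit=100) at --fps=50000/1001.
print(duration_from_frames(100, 50000, 1001))
```

With the numbers from the post this gives roughly 240 kbit/s. Note that the result depends on the frame rate the encoder was told, so a 120 fps clip encoded as 50000/1001 fps reports a different bitrate than the same frames at 120 fps.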

I'm using x265 and x264. When I want to output the bitstream I write -o example.265, but if I change .265 to a container like .mp4 or .mkv, I get the same file size? The -o example.265 output works fine when I open the file with an HEVC analyzer. Do they have the same size because I don't multiplex the bitstream with an audio file? Is that what makes the file bigger?

Which of these two is the better comparison method: using a fixed QP during the whole process with every codec and measuring the bitrate along with PSNR, SSIM, ..., or using a fixed target bitrate and comparing PSNR and SSIM?

By the way, I'm using x265, x264, libvpx-vp9 and aomenc (AV1) without ffmpeg; I don't want to use ffmpeg.

What does the lag-in-frames option do?

Why do some encoders ask for the fps as a fraction like --fps=50000/1001 rather than a value like 49 or 50?
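One way to see what the fraction form buys: a frame rate such as 50000/1001 (about 49.95 fps) or the NTSC-style 30000/1001 (about 29.97 fps) has no exact decimal representation, while a numerator/denominator pair keeps frame timing exact. A small sketch using Python's standard fractions module; `frame_time` is a name made up for this example.

```python
from fractions import Fraction

def frame_time(n, fps):
    """Exact presentation time in seconds of frame n at a rational frame rate."""
    return n / fps

fps = Fraction(50000, 1001)
print(float(fps))               # decimal approximation of the rate
print(frame_time(1, fps))       # duration of one frame as an exact fraction
print(frame_time(50000, fps))   # 50000 frames last exactly 1001 seconds
```

Rounding the rate to 49 or 50 would make the timestamps drift slightly over a long clip, which is presumably why the encoders accept the exact ratio.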

What is the purpose of the JM and HM reference encoders?

Last edited by rockerovo; 31st Jul 2018 at 07:45.
