[Ask for guidance] Effect of -bframes on decoding speed

31st Oct 2016 10:59 #1
SuNova

Member

Join Date
Sep 2016

Location
Iran
I use these settings in my encoding (I don't care about encoding speed):
--preset placebo --crf 21.5 --qpstep 30 --rc-lookahead 80 --aq-mode 2 --psy-rd 1:0.15 --keyint infinite --bframes 1 --partitions p8x8,b8x8,i8x8,i4x4 --merange 36 --ref 5 --threads 1
Do you have any advice on how to improve decoding speed with almost no quality loss? I think I'm being really aggressive with these settings.
To explain: I was using p4x4 with --bframes 0 because I thought decoding B-frames requires more processing power, but I read elsewhere on doom9 that they may even improve decoding speed by lowering the bitrate in CRF mode. Also, p4x4 can hurt decoding speed for very little gain in quality, so I turned it off and set --bframes to 1. Now, for example, if I set --bframes 2 or even --bframes 16 with --b-adapt 2, what is the effect on decoding speed? Better or worse?
Sorry guys, I'm really confused. Just give me tips, because I'm encoding for mobile processors.

31st Oct 2016 12:18 #2
KarMa

Dinosaur Supervisor

Join Date
Jul 2015

Location
US
Assuming this is for x264. Prolly going to want to turn CABAC off as that is a big decoding hog, but that will hurt compression efficiency by 10-20%. Turning off CABAC makes it use the faster CAVLC. Can also turn off weighted prediction for B-frames and/or P-frames, which can also be harder to decode; that will also hurt efficiency. Or maybe set b-pyramid to strict or disable it.

Last edited by KarMa; 31st Oct 2016 at 12:35.

31st Oct 2016 15:13 #3
SuNova

Member

Join Date
Sep 2016

Location
Iran
Originally Posted by KarMa

Assuming this is for x264. Prolly going to want to turn CABAC off as that is a big decoding hog, but that will hurt compression efficiency by 10-20%. Turning off CABAC makes it use the faster CAVLC. Can also turn off weighted prediction for B-frames and/or P-frames, which can also be harder to decode; that will also hurt efficiency. Or maybe set b-pyramid to strict or disable it.

Thanks for the nice answer. About CABAC: I encode at relatively low bitrates, I've heard that the processing overhead of CABAC depends on bitrate, and CABAC is lossless, so I prefer to keep it turned on. As for weighted prediction, can you give me some numbers for its overhead? Maybe if I turn it off but use --bframes 3, I'd see an overall improvement in PSNR with smoother decoding. What do you think? I really don't know how it works, that's why I'm asking. If one could simply say hey using --bframes 1 has x percent decoding overhead while using --bframes 16 has y percent overhead and using weighted prediction for b/p has z percent overhead, and also specifies gain for each parameter, I can decide better.

31st Oct 2016 16:55 #4
poisondeathray

Member

Join Date
Sep 2007

Location
Canada
Originally Posted by SuNova

If one could simply say hey using --bframes 1 has x percent decoding overhead while using --bframes 16 has y percent overhead and using weighted prediction for b/p has z percent overhead, and also specifies gain for each parameter, I can decide better.

You can't really establish a firm relationship, only a general trend, because max b-frames are also source dependent.

i.e., setting --bframes 16 doesn't necessarily mean your actual encode uses 16 consecutive b-frames. On some sources with low motion or long strings of duplicate frames it might, but the majority of sources might only use 5-6 at most.

But you can measure it yourself. Encode several versions with whatever settings you want to compare and check the CPU usage and speed when decoding. You can use something like AVSMeter, which reports both, but there are other methods such as a VirtualDub analysis pass, ffmpeg, etc.
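A rough sketch of the ffmpeg route in Python, assuming `ffmpeg` is on the PATH. It decodes each file to the null muxer and times the run by wall clock. The file names are placeholders, and the numbers only describe the software decoder on the machine that runs it, not a phone.

```python
import os
import shutil
import subprocess
import time


def decode_command(path):
    """Build an ffmpeg command that decodes the video in `path` and discards it."""
    return ["ffmpeg", "-v", "error", "-i", path, "-an", "-f", "null", "-"]


def time_decode(path):
    """Return wall-clock seconds taken to decode `path` once."""
    start = time.perf_counter()
    subprocess.run(decode_command(path), check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    for name in ("b1.mkv", "b16.mkv"):  # hypothetical file names
        # Only run when ffmpeg and the test file are actually present.
        if shutil.which("ffmpeg") and os.path.exists(name):
            print(name, round(time_decode(name), 2), "s")
```

Running each file several times and taking the fastest run usually gives steadier numbers than a single pass.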

The problem is there are so many other variables, different decoders, different HW architectures

In this example: default medium settings, CRF 22, --b-adapt 2; 1 vs. 16 b-frames. Generally you wouldn't use CRF encoding when testing another variable, because the file size won't be the same. But that's what you wanted to test in this case.

As you can see for the b16 run, the majority of consecutive b-frame runs are 5 or fewer (the buckets for runs of 0 to 5 b-frames add up to 93.5%).
x264 [info]: consecutive B-frames: 9.7% 19.8% 29.1% 10.7% 8.4% 15.8% 2.9% 1.5% 0.8% 0.0% 0.5% 0.0% 0.0% 0.0% 0.0% 0.0% 0.8%
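To check that figure, here is a small Python sketch that parses the log line above and sums the buckets. It assumes the first value is the bucket for 0 consecutive b-frames, the second for 1, and so on; the function names are made up for this example.

```python
LOG_LINE = (
    "x264 [info]: consecutive B-frames: 9.7% 19.8% 29.1% 10.7% 8.4% 15.8% "
    "2.9% 1.5% 0.8% 0.0% 0.5% 0.0% 0.0% 0.0% 0.0% 0.0% 0.8%"
)


def parse_consecutive_bframes(line):
    """Return the percentages; index i is the bucket for runs of i b-frames."""
    values = line.split("consecutive B-frames:")[1].split()
    return [float(v.rstrip("%")) for v in values]


def cumulative_share(shares, max_run):
    """Sum the buckets for runs of 0..max_run b-frames."""
    return round(sum(shares[: max_run + 1]), 1)


shares = parse_consecutive_bframes(LOG_LINE)
print(len(shares))                  # 17 buckets: runs of 0 through 16
print(cumulative_share(shares, 5))  # 93.5
```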

The fps are min/max/avg

#avsmeter b1.avs
45.1MB
186/1020/416

#avsmeter b16.avs
43.0MB
182/853/394

So the b16 test for this particular source was about 5.3% slower on average on this setup. Different sources or hardware setups might be only 1% slower, others 10%. It varies. The trend is: more b-frames, higher CPU usage, lower fps.
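The arithmetic behind that comparison, as a short Python sketch using only the file sizes and average fps reported above (the helper name is made up):

```python
def percent_change(baseline, value):
    """Relative change of `value` against `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


b1 = {"size_mb": 45.1, "avg_fps": 416}
b16 = {"size_mb": 43.0, "avg_fps": 394}

speed_change = percent_change(b1["avg_fps"], b16["avg_fps"])
size_change = percent_change(b1["size_mb"], b16["size_mb"])

print(round(speed_change, 1))  # -5.3 (average decode fps)
print(round(size_change, 1))   # -4.7 (file size)
```

So on this source the 16 b-frame encode was about 4.7% smaller and decoded about 5.3% slower.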

IMO, you should just use settings appropriate for the target device. This includes proper --vbv-maxrate and --vbv-bufsize settings. --keyint infinite is a bad idea for many sources: very poor seek performance. If you're targeting very slow or older devices, or are in doubt about decoding performance, use --tune fastdecode (which disables CABAC, deblocking, weightb and weightp).
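As an illustration only, this Python sketch builds an x264 command line with either `--tune fastdecode` or the individual switches for the four features listed above. The VBV and keyint numbers are placeholders, not recommendations for any particular device, and the file names are hypothetical.

```python
# Individual x264 switches for the features --tune fastdecode turns off.
FASTDECODE_FLAGS = ["--no-cabac", "--no-deblock", "--no-weightb", "--weightp", "0"]


def x264_command(src, dst, crf, maxrate_kbps, bufsize_kbit, keyint, explicit=False):
    """Return an x264 argument list with VBV limits and fast-decode settings."""
    cmd = [
        "x264",
        "--crf", str(crf),
        "--vbv-maxrate", str(maxrate_kbps),
        "--vbv-bufsize", str(bufsize_kbit),
        "--keyint", str(keyint),
    ]
    cmd += FASTDECODE_FLAGS if explicit else ["--tune", "fastdecode"]
    cmd += ["-o", dst, src]
    return cmd


# Placeholder values; pick limits that match the real target device.
print(" ".join(x264_command("in.y4m", "out.264", 21.5, 2000, 2000, 250)))
```

Spelling the switches out makes it possible to keep one of them, for example leaving CABAC on while dropping weighted prediction.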

Last edited by poisondeathray; 31st Oct 2016 at 17:03.

31st Oct 2016 17:12 #5
LigH.de

Member

Join Date
Aug 2013

Location
Central Germany
Do not use --preset placebo as base for experiments. It is just the maximum possible, but not the maximum sensible. Or in brief, a waste of time and energy.

Many consecutive B frames may have some additional overhead. But you can't state a general penalty for all possible video material, because in practice an encoder will hardly ever use 16 B frames consecutively; it decides depending on the motion and other details in the video material. Comparing statistics of many different videos, you will only rarely see more than 6 consecutive B frames in reality (best chances are probably in cartoons with long static scenes).

Furthermore, talking about improvements in PSNR metrics ... really? Then you must have missed several years of video encoder development proving that the PSNR metric can be fooled easily with academic examples, it tells very little about subjectively felt quality impressions. Not even SSIM matches the result of an ABX test with hundreds of participants.

Improving decoding speed requires reducing the complexity. This will also lead to a reduced overall efficiency. You are trying to compensate for the lack of horsepower with a spoiler; you will hardly notice the difference. Or in other words: you can have the encoder either look at small parts of the video to improve decoding speed, or look at larger parts to improve compression. But both at the same time won't work. Efficiency requires complexity, speed requires the lack of it.

Summary: You are wasting a lot of time and energy for a tiny little advantage; better invest it in base knowledge about video compression.

31st Oct 2016 22:53 #6
vhelp

Member

Join Date
Mar 2001

Location
New York
In my experience, placebo makes a noticeable difference if the source video is noisy, like VHS, or if there is a fair amount of grain in the video. Most sources like DVD are clean, so there is no noticeable gain in video quality. Otherwise, everything that has already been said (in the posts above) is true: it's a waste of time.

1st Nov 2016 02:17 #7
LigH.de

Member

Join Date
Aug 2013

Location
Central Germany
Still, you should not start with preset "placebo"; instead you should start with a sensible preset, think about the differences to the insane preset "placebo" (which exists merely as "bad role model", as "upper limit you should try to avoid"), adding only such differences which will probably increase quality for a specific use case, but not adding differences with a too bad ratio between achievable advantage and required efforts.

1st Nov 2016 02:40 #8
kuka1

Member

Join Date
Apr 2008

Location
California
Although the previous author was absolutely right ("You are wasting a lot of time and energy for a tiny little advantage; better invest it in base knowledge about video compression"), I guess there is a way to solve your problem.
First of all, what is the resolution of your video? What kind of problems do you see during playback? Why do you think they are due to insufficient processor power?
Almost any modern mobile processor has a built-in AVC decoder, and thus will smoothly play almost any file encoded with more or less common parameters.
Does it play normally if you encode it with something like this: --preset veryslow --crf 21.5 --keyint 300 ?

1st Nov 2016 04:22 #9
SuNova

Member

Join Date
Sep 2016

Location
Iran
Well, thanks all for your suggestions and advice. I read everything carefully and it helped a lot (especially AVSMeter as a measurement tool).
But let me answer some points. First of all, I don't care about encoding speed, because I encode short vines (but many per day), not long movies at extremely high resolution. And:
- Most of audiences use mobile data so I want to achieve best compression possible with acceptable quality
- Most of users use mobile devices
- Many resolutions, many fps, many sources.
Why am I complaining about decoding speed? Because I've had many cases where users complained about it. Sometimes I encode the same vine at different frame rates to get feedback from users by comparison. They sometimes say, for example, that the 30 fps vine plays slower than the 15 fps version (same content, same encoding settings), or sometimes the opposite (30 fps plays smoother than 21 fps). Well, as you mentioned, I don't have good knowledge of video encoding.
One day I encoded a video file with another encoder (MAGIX Pro) and enforced an IBB...BI pattern, and the playback was too slow on mobile devices. I thought B-frames were causing the trouble, so I set --bframes 0 in my vine encoding settings. There was really no improvement in playback speed (according to user reports). After some time I realized that I was wrong, so I turned p4x4 off and set --bframes to 1. Users reported a better experience, but the problem still persists. That's why I'm asking here: they see some material play back abnormally and report it to me.
Again, thanks everyone. I'll start testing different settings myself, but your guidance will light my path.

1st Nov 2016 08:56 #10
poisondeathray

Member

Join Date
Sep 2007

Location
Canada
Testing decoding speed on a computer is of limited value, since you're not targeting computer playback; you're targeting mobile devices. You can get a rough idea of which settings have an impact, but it's not that important. What is important is using appropriate settings for the target device.

Smooth streaming is also dependent on bandwidth, both the server's upstream and the client's downstream, so it's not necessarily encoding settings only.

Using 16 reference frames is a big problem. It will break compatibility with many hardware decoder chips in devices; some won't play it at all. Same with 16 b-frames (which might actually be used on some sources). In your tests, unless you check what the actual b-frame pattern of the encode was, your observations might not even be valid. Use buffer settings (--vbv-bufsize and --vbv-maxrate) appropriate for the device. A buffer underrun will cause stutter and choppy playback on some devices. You can't have the "best" compression/quality, yet make it compatible with everything. You need to choose the lowest common denominator, or exclude those devices which cannot play certain settings, or have separate fallback videos/sites for weaker devices.
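To make the buffer underrun idea concrete, here is a deliberately simplified leaky-bucket sketch in Python. It is not x264's actual VBV model: it assumes the buffer refills at a constant `maxrate`, is capped at `bufsize`, and that each frame must be fully in the buffer when it is due. All names and numbers are invented for illustration.

```python
def find_underruns(frame_bits, fps, maxrate_bps, bufsize_bits, initial_fill=0.9):
    """Return indices of frames that are larger than the buffer contents."""
    fill = bufsize_bits * initial_fill
    refill_per_frame = maxrate_bps / fps
    underruns = []
    for i, bits in enumerate(frame_bits):
        if bits > fill:
            underruns.append(i)  # decoder would have to wait: stutter
            fill = 0.0
        else:
            fill -= bits
        # Data keeps arriving at maxrate until the buffer is full.
        fill = min(bufsize_bits, fill + refill_per_frame)
    return underruns


steady = [80_000] * 30
spiky = [80_000] * 5 + [4_000_000] + [80_000] * 5

print(find_underruns(steady, 30, 3_000_000, 3_000_000))  # []
print(find_underruns(spiky, 30, 3_000_000, 3_000_000))   # [5]
```

An encoder that is given the device's real maxrate and bufsize avoids producing the oversized frame in the first place.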

1st Nov 2016 09:50 #11
jagabo

Member

Join Date
Dec 2005
Yes. For example, 1080p Blu-ray discs are limited to 3 b-frames, 4 reference frames, and ~1 second GOPs (ie, 24 frames at 24 fps, 60 frames at 60 fps, etc.). Many hardware decoders have similar limitations.
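A small Python sketch of checking settings against the limits quoted above (3 b-frames, 4 reference frames, roughly 1-second GOPs). The thresholds come from this post only, real hardware decoders each have their own limits, and the function name is made up.

```python
import math


def bluray_like_violations(bframes, ref, keyint, fps):
    """List which settings exceed the limits quoted in the post.

    `keyint=None` stands for --keyint infinite.
    """
    problems = []
    if bframes > 3:
        problems.append(f"bframes {bframes} > 3")
    if ref > 4:
        problems.append(f"ref {ref} > 4")
    max_gop = math.ceil(fps)  # about one second of frames
    if keyint is None:
        problems.append("keyint is unbounded")
    elif keyint > max_gop:
        problems.append(f"keyint {keyint} > {max_gop}")
    return problems


# The settings from the first post: --bframes 1 --ref 5 --keyint infinite
print(bluray_like_violations(bframes=1, ref=5, keyint=None, fps=30))
# ['ref 5 > 4', 'keyint is unbounded']
```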

1st Nov 2016 15:32 #12
KarMa

Dinosaur Supervisor

Join Date
Jul 2015

Location
US
To test out the effects of CABAC I used "Test Source 2.mkv" as a source, which is about 90 seconds long (1080p), and encoded it twice at 2000 kbps: once with CABAC and once with CAVLC, with everything else the same. Using AVSMeter, the CABAC encode was decoded in 9.23 seconds while the CAVLC encode was decoded in 8.25 seconds. Nothing else was really running during these tests.

Using CAVLC, disabling weighted B and P prediction, and disabling b-pyramid got it to 7.80 seconds.
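Put as percentages, using only the three timings above (the helper name is made up):

```python
def time_saved_percent(baseline_s, other_s):
    """How much less decode time `other_s` needs compared with `baseline_s`."""
    return (baseline_s - other_s) / baseline_s * 100.0


cabac_s = 9.23
cavlc_s = 8.25
stripped_s = 7.80  # CAVLC, no weighted B/P prediction, no b-pyramid

print(round(time_saved_percent(cabac_s, cavlc_s), 1))     # 10.6
print(round(time_saved_percent(cabac_s, stripped_s), 1))  # 15.5
```

So on this source and machine, dropping CABAC alone cut decode time by roughly a tenth.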

Last edited by KarMa; 1st Nov 2016 at 15:53.

3rd Nov 2016 20:48 #13
kuka1

Member

Join Date
Apr 2008

Location
California
I don't think your users have problems due to 4x4 partitioning or the number of B-frames. Most likely it's due to high bitrate and a large number of reference frames.
Let me try to explain what all your parameters mean.
--crf 21.5 - this defines video quality (roughly, the average quantizer): how much the resulting video will differ from the original. With --crf the output will be more or less the same quality; other parameters will change the file size/bitrate.
--preset placebo - a preset defines the best parameter set for a given encoding speed. Better parameter set, slower encoding. Do not use placebo, it is for development purposes; use veryslow instead. In most cases veryslow will give you lower bitrate than placebo.
--qpstep 30 - maximum change in quantizer between consecutive frames (x264's default is 4), no need to specify. Use the defaults for a given preset.
--rc-lookahead 80 - number of frames to look ahead for rate control (used by mb-tree and VBV). No need for 80 in your case. In most cases use 10; 30 is more than enough.
--aq-mode 2 - adaptive quantization method, no need to change, use default for given preset
--psy-rd 1:0.15 - strength of psychovisual optimization. Use --psy instead.
--keyint infinite - GOP size, the maximum number of frames between key frames. In your case (vines of a few seconds) you may use infinite, but in most cases it is better to limit it to around 150-250 frames. One I-frame every 5 sec will add about 1% to the bitrate, but makes navigation much easier.
--bframes 1 - limit on the number of consecutive B-frames, better to use 2-3; two B-frames may save a few percent of bitrate in comparison to 1 and will not affect compatibility or decoding speed
--partitions p8x8,b8x8,i8x8,i4x4 - block partitioning depth, influences encoding speed, doesn't affect decoding speed, use the defaults for the given preset
--merange 36 - maximum motion search range, influences encoding speed, doesn't affect decoding speed, use the defaults for the given preset
--ref 5 - number of reference frames, influences encoding speed, doesn't matter for the decoder. In most cases 4 is enough; for compatibility reasons it is better to use 2. Some hardware decoders have no decoded picture buffer for more than two references.
--threads 1 - number of threads used for encoding. There are two cases when you may need to limit threads to 1. First, if you have a large queue of files to encode, it's better to run as many encodes in parallel as there are physical cores in your system, each with --threads 1. Say, on a processor with four cores, four encodes in parallel with --threads 1 will run approx. 20-30% faster than four successive encodes with an unlimited number of threads.
The second case is a bit harder to explain. Multithreaded encoding can produce different output if rate control is used. Say four threads are running asynchronously, encoding the same frame. Each thread reports its number of encoded bits to rate control, and each gets this data back from rate control in order to set the quantizer for the current frame. The position of each thread in its macroblock row is unpredictable, so their reports may differ slightly from one run to another.
In some cases, usually for development needs, bitwise-identical output is required when encoding the same source with the same parameters.
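The first --threads 1 case (a queue of files, one single-threaded encode per core) could be scripted roughly like this in Python. It is only a sketch: it assumes `x264` is on the PATH and can read the inputs, the preset and CRF are just examples, and `os.cpu_count()` reports logical rather than physical cores, so the worker count may need adjusting by hand.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def encode_command(src, dst):
    """One single-threaded x264 encode; settings are examples only."""
    return ["x264", "--preset", "veryslow", "--crf", "21.5",
            "--threads", "1", "-o", dst, src]


def encode_all(sources, workers=None):
    """Run one x264 process per worker and return the exit codes in order."""
    workers = workers or os.cpu_count() or 1  # logical cores, not physical

    def run(src):
        dst = os.path.splitext(src)[0] + ".264"
        return subprocess.run(encode_command(src, dst)).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run, sources))


# Example (hypothetical file names):
# exit_codes = encode_all(["vine1.y4m", "vine2.y4m", "vine3.y4m"], workers=4)
```

Threads are enough here because each worker only waits on an external x264 process.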

Again, it is better to use the defaults and change the crf/preset parameters to get better quality or faster encoding. You may also limit the profile, number of reference frames, B-frames and keyframe interval for compatibility.
If you need to encode video for streaming you may need to use rate control instead of --crf, or at least limit the upper bitrate level, depending on resolution.
My favorite settings (ffmpeg options) for 1280x720 at 30 fps, very good quality, very good network conditions, compatible with 100% of mobiles less than 5 years old:
-profile:v main -preset veryslow -crf 19 -bufsize 6000K -maxrate 6000K -force_key_frames 'expr:if(gt(n_forced,3),gte(t,(n_forced-2)*5),gte(t,n_forced*2))'
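To see what that keyframe expression does, here is a small Python simulation. It assumes the expression simply closes after `gte(t, n_forced*2)`, that `t` is the frame time in seconds and that `n_forced` counts the keyframes forced so far; it does not call ffmpeg and the function names are made up.

```python
from fractions import Fraction


def wants_keyframe(t, n_forced):
    """Mirror of: if(gt(n_forced,3), gte(t,(n_forced-2)*5), gte(t,n_forced*2))."""
    if n_forced > 3:
        return t >= (n_forced - 2) * 5
    return t >= n_forced * 2


def forced_keyframe_times(fps, duration_s):
    """Return the times (in seconds) at which a keyframe would be forced."""
    times = []
    n_forced = 0
    for n in range(int(fps * duration_s)):
        t = Fraction(n, fps)  # exact frame time, avoids float rounding
        if wants_keyframe(t, n_forced):
            times.append(float(t))
            n_forced += 1
    return times


print(forced_keyframe_times(30, 21))
# [0.0, 2.0, 4.0, 6.0, 10.0, 15.0, 20.0]
```

Under those assumptions the first few keyframes come every 2 seconds, which helps short clips start and seek quickly, and after that the spacing settles at 5 seconds.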
