Should I buy an Intel video card for AV1 encoding?

25th Dec 2022 23:10 #1
sophisticles

That is the question I have been asking myself for a month now.

Microcenter has the Arc A380 for $140, and I have seen a number of tests that show the AV1 hardware encoder on this card matching x264+veryslow in quality as measured by VMAF. So I have been running a bunch of x264+veryslow test encodes as a way of simulating this card in order to make a decision.

Long story short, SVT-AV1 has convinced me to give up on hardware encoders.

Look at these test encodes, done with the 13 GB ToS 4k DCP:

https://media.xiph.org/tearsofsteel/

There is an (in)famous "scene group" that is known for releasing supposedly "high quality" full movies at very small file sizes, and a number of people have asked in this forum how they do it.

I decided to try to encode the above reference file down to 150 MB, as a way of simulating converting a full 4k release to a watchable 1.2 GB.

For x264 I used a 2-pass, very slow preset, tune film, 10-bit, 4:4:4 from within Hybrid.

For x265, Hybrid kept crashing, so I used Staxrip, 12-bit, 4:2:0, 2-pass and I think I used the fast preset, but I don't remember.

It doesn't really matter, because neither x265 nor x264 is watchable at 1.7 Mb/s.
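For anyone wondering where the 1.7 Mb/s figure comes from, here is a rough sketch of the arithmetic. It assumes the clip is 17616 frames at 24 fps (the numbers the encoder logs further down in this thread report) and that "MB" means MiB; the function name is made up for illustration.

```python
# Sketch: what average bitrate does a 150 MB target imply for this clip?
# Assumptions: 17616 frames at 24 fps (taken from the encoder logs in this
# thread), "MB" read as MiB (1024 * 1024 bytes), kbit as 1000 bits.

def average_kbps(size_mib: float, frames: int, fps: float) -> float:
    """Average bitrate in kbit/s for a file of size_mib MiB."""
    duration_s = frames / fps
    bits = size_mib * 1024 * 1024 * 8
    return bits / duration_s / 1000

target = average_kbps(150, 17616, 24)
print(f"{target:.0f} kbit/s")
```

Under those assumptions a 150 MB file works out to roughly 1.7 Mbit/s for the whole clip.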

For SVT-AV1 I used Handbrake, because both Hybrid and Staxrip crash constantly. The settings were preset 12 (the fastest preset), 10-bit, 1-pass.

I have also done a bunch of test encodes using Meridian, and some other sources, and even if the best hardware encoder can match x264+veryslow, I don't see any reason to use any encoder other than SVT-AV1.

It is that much better than everything else.

Attached Files

ToS Extreme x264.mp4 (144.02 MB, 132 views)

ToS Extreme x265.mkv (151.02 MB, 105 views)

tos_picture.mkv (145.55 MB, 115 views)
27th Dec 2022 07:43 #2
JN-

I don’t have a use case for AV1 but it does fascinate me. I used the ffmpeg SVT-AV1 using CPU and found it good quality and fast if using the appropriate preset.

I’m hoping to get an Nvidia or AMD GPU with AV1 HW encoding, but they are thin on the ground at the moment. I have an Intel i9-9900K, which isn’t compatible with Intel’s Arc GPUs.

When I do get, say, an Nvidia card, I’ll post some quality comparison results here between AV1 CPU and AV1 HW encoding.

I don’t yet see ffmpeg support for AMD AV1 HW encoding, but hopefully it will come soon; in that case going for the AMD card would not be so expensive.

My interest wouldn’t be in very low data rate clips. When I previously tested AV1 CPU encoding I used an FHD clip and a target data rate of ~20 Mbps. It was top of the table for sure.

Last edited by JN-; 27th Dec 2022 at 18:04.


27th Dec 2022 09:44 #3

Selur

Originally Posted by JN-
and fast if using the appropriate preset.

Out of curiosity: what is an appropriate preset for encoding a 4k 12-bit source that one could consider fast?

Using:

Code:

ffmpeg -y -loglevel fatal -noautorotate -nostdin -threads 8 -i "G:\tos_version_05\tos_picture.mxf" -map 0:0 -an -sn -vf  scale=in_range=pc:out_range=pc -sws_flags accurate_rnd+full_chroma_inp -pix_fmt yuv420p10le -strict -1 -vsync 0 -f yuv4mpegpipe - | NVEnc --y4m -i - --fps 24.000 --codec av1 --sar 1:1 --output-depth 10 --vbr 0 --vbr-quality 50.00 --aq --aq-strength 5 --aq-temporal --gop-len 0 --ref 7 --nonrefp --weightp --bframes 7 --bref-mode middle --mv-precision Q-pel --preset quality --colorrange full --colormatrix bt2020c --cuda-schedule sync --output "G:\Temp\tos_picture_2022-12-27@15_51_30_8410_01.av1"

Note the unholy high 'quality level' of 51, which is the max value NVEncC allows! Encoding ran at ~12.7 fps.

Code:

NVEncC (x64) 7.06 (r2388) by rigaya, Dec 10 2022 12:26:56 (VC 1929/Win)
OS Version     Windows 11 x64 (22621) [UTF-8]
CPU            AMD Ryzen 9 7950X 16-Core Processor [5.72GHz] (16C/32T)
GPU            #0: NVIDIA GeForce RTX 4080 (9728 cores, 2505 MHz)[PCIe1x16][527.56]
NVENC / CUDA   NVENC API 12.0, CUDA 12.0, schedule mode: sync
Input Buffers  CUDA, 20 frames
Input Info     y4m(yv12(10bit))->p010 [AVX2], 4096x1714, 24/1 fps
Vpp Filters    copyHtoD
Output Info    AV1 main 10bit @ Level auto
4096x1714p 1:1 24.000fps (24/1fps)
Encoder Preset quality
Rate Control   VBR
Multipass      none
Bitrate        0 kbps (Max: 0 kbps)
Target Quality 51.00
Initial QP     I:20  P:23  B:25
QP Offset      cb:0  cr:0
VBV buf size   auto
Lookahead      off
GOP length     240 frames
B frames       7 frames [ref mode: middle]
Ref frames     7 frames, MultiRef L0:auto L1:auto
AQ             on
Part size      max auto / min auto
Tile num       columns auto / rows auto
TemporalLayers max 1
Refs           forward auto, backward auto
VUI            matrix:bt2020c,range:full
Others         mv:Q-pel nonrefp
encoded 17616 frames, 12.75 fps, 1824.31 kbps, 159.63 MB
encode time 0:23:01, CPU: 0.0, GPU: 12.2, VE: 19.7, GPUClock: 1527MHz, VEClock: 1582MHz
frame type IDR    74
frame type I      74,  total size    4.78 MB
frame type P    2202,  total size    0.01 MB
frame type B   15340,  total size  154.83 MB
2022-12-27@16_11_46_5910_01_video finished after 00:23:02.767
finished...

Looking at the output size, at least at the moment NVIDIA hardware AV1 encoding can't achieve 1700 kBit/s on this source with 4k 10-bit YUV 4:2:0.
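A quick sanity check on the log above, as a hedged sketch: assuming NVEncC reports kbps in units of 1000 bit/s and MB in MiB (an assumption, the log does not say so), the reported bitrate and file size should agree, and the bitrate can be compared against the 1700 kBit/s target. The function name is hypothetical.

```python
# Sketch: does the bitrate in the NVEncC log match the reported file size,
# and how far is it from the 1700 kBit/s target?
# Assumptions: kbps = 1000 bit/s, MB = MiB (1024 * 1024 bytes).

def size_mib(kbps: float, frames: int, fps: float) -> float:
    """File size in MiB implied by an average bitrate over the clip."""
    duration_s = frames / fps
    return kbps * 1000 * duration_s / 8 / (1024 * 1024)

logged_kbps = 1824.31  # from the log line "encoded 17616 frames, ..."
size = size_mib(logged_kbps, 17616, 24)
overshoot_pct = (logged_kbps / 1700 - 1) * 100

print(f"{size:.2f} MiB, {overshoot_pct:+.1f}% vs. the 1700 kBit/s target")
```

With these assumptions the computed size matches the 159.63 MB in the log, and the run lands about 7% above the target.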

Originally Posted by sophisticles
For x265, Hybrid kept crashing,

It eats ~11.5 GB RAM, but works fine here.

Cu Selur

PS: By the way, at such low bitrates VMAF & co. don't really say much; it's more about how to properly hide issues (blurring vs. blocking).

Attached Files

tos_picture_51.mp4 (159.67 MB, 94 views)

Last edited by Selur; 27th Dec 2022 at 09:52.

users currently on my ignore list: deadrats, Stears555, marcorocchini


27th Dec 2022 09:51 #4
JN-

Using SVT-AV1 there are 14 presets, from 0 to 13. I would simply experiment, check the time taken to render and the quality, then pick what suits. Final size may also be something to consider.

Best to start with Medium, preset 6 Cpu, 15 Nvenc; if it gives you what you want, then stay with it.
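The "experiment and compare" approach could be scripted roughly like this. It is only a sketch: it assumes an ffmpeg build with libsvtav1 that accepts -preset and -crf, and "input.mkv", the CRF value and the output names are placeholders, not settings anyone in this thread used.

```python
# Sketch: build ffmpeg/SVT-AV1 command lines for a few presets to compare.
# Assumptions: ffmpeg is built with libsvtav1 and accepts -preset and -crf;
# "input.mkv", the CRF value and the output names are placeholders.
import shlex

def svtav1_command(source: str, preset: int, crf: int = 35) -> list[str]:
    """Return an ffmpeg argument list for one SVT-AV1 test encode."""
    return [
        "ffmpeg", "-i", source,
        "-an",                      # video only, like the tests in this thread
        "-c:v", "libsvtav1",
        "-preset", str(preset),
        "-crf", str(crf),
        f"test_preset{preset}.mkv",
    ]

commands = [svtav1_command("input.mkv", p) for p in (4, 6, 8)]
for cmd in commands:
    # Print only; run each line by hand and note encode time and file size.
    print(shlex.join(cmd))
```

The script only prints the command lines, so nothing is encoded until you run them yourself and compare time, size and quality.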

Last edited by JN-; 27th Dec 2022 at 10:20.

27th Dec 2022 09:55 #5
poisondeathray

Originally Posted by Selur
NVIDIA hardware encoding AV-1 can't achieve 1700kBit/s on this source with 4k 10bit YUV 4:2:0.

Selur, can you test explicitly "forcing" --qp-min and --qp-max to some high value for NVEncC to see if it makes a difference?

--qp-min <int> or <int>:<int>:<int>    set min QP (default: unset)

--qp-max <int> or <int>:<int>:<int>    set max QP (default: unset)

27th Dec 2022 09:56 #6
Selur

The target was 1700 kBit/s, so the final file size is given. But I suspect that with hardware encoding and normal CPUs you can probably forget the "fast".

27th Dec 2022 10:09 #7
Selur

@poisondeathray:

Selur, can you test explicitly "forcing" --qp-min and --qp-max to some high value for NVEncC to see if it makes a difference ?

With what goal?
NVEncC used 'Initial QP I:20 P:23 B:25', so it starts at 20-25 trying to guess the right quantizer.
Not knowing which quantizers were actually used, setting a lower limit might speed things up a (tiny) bit by restricting the possible choices, and/or might lower the resulting quality.
-> Trying to hit 1700 kBit/s seems too much of a hassle and the speedup is probably below 1 fps, so probably not worth the trouble.

Cu Selur

27th Dec 2022 10:12 #8
poisondeathray

Maybe I misunderstood - I thought the goal was to go lower? The log says 1824.31 kbps.

27th Dec 2022 10:20 #9
Selur

Yeah, but having to encode that clip again tons of times to find the right combination seems like a lot of hassle, especially since I doubt that anybody would reencode a 4k source multiple times for this. But since it's rather cold outside, I'll use my gpu some more for heating,...

trying:

Code:

Target Quality 51.00
Initial QP     I:20  P:23  B:25
QP range       I:25-51  P:28-51  B:30-51

now,... (doesn't seem to change speed at all)

27th Dec 2022 10:48 #10

Selur


Okay, that didn't change the output:

Code:

NVEncC (x64) 7.06 (r2388) by rigaya, Dec 10 2022 12:26:56 (VC 1929/Win)
OS Version     Windows 11 x64 (22621) [UTF-8]
CPU            AMD Ryzen 9 7950X 16-Core Processor [5.52GHz] (16C/32T)
GPU            #0: NVIDIA GeForce RTX 4080 (9728 cores, 2505 MHz)[PCIe1x16][527.56]
NVENC / CUDA   NVENC API 12.0, CUDA 12.0, schedule mode: sync
Input Buffers  CUDA, 20 frames
Input Info     y4m(yv12(10bit))->p010 [AVX2], 4096x1714, 24/1 fps
Vpp Filters    copyHtoD
Output Info    AV1 main 10bit @ Level auto
4096x1714p 1:1 24.000fps (24/1fps)
Encoder Preset quality
Rate Control   VBR
Multipass      none
Bitrate        0 kbps (Max: 0 kbps)
Target Quality 51.00
Initial QP     I:20  P:23  B:25
QP range       I:25-51  P:28-51  B:30-51
QP Offset      cb:0  cr:0
VBV buf size   auto
Lookahead      off
GOP length     240 frames
B frames       7 frames [ref mode: middle]
Ref frames     7 frames, MultiRef L0:auto L1:auto
AQ             on
Part size      max auto / min auto
Tile num       columns auto / rows auto
TemporalLayers max 1
Refs           forward auto, backward auto
VUI            matrix:bt2020c,range:full
Others         mv:Q-pel nonrefp
encoded 17616 frames, 12.70 fps, 1824.31 kbps, 159.63 MB
encode time 0:23:06, CPU: 0.0, GPU: 12.7, VE: 19.5, GPUClock: 1533MHz, VEClock: 1589MHz
frame type IDR    74
frame type I      74,  total size    4.78 MB
frame type P    2202,  total size    0.01 MB
frame type B   15340,  total size  154.83 MB
2022-12-27@17_22_25_8810_01_video finished after 00:23:07.687

-> Seems like either those values get ignored, or the min quantizer used is higher.


27th Dec 2022 11:16 #11

Selur


Same result with 30, 33, 35:

Code:

NVEncC (x64) 7.06 (r2388) by rigaya, Dec 10 2022 12:26:56 (VC 1929/Win)
OS Version     Windows 11 x64 (22621) [UTF-8]
CPU            AMD Ryzen 9 7950X 16-Core Processor [5.78GHz] (16C/32T)
GPU            #0: NVIDIA GeForce RTX 4080 (9728 cores, 2505 MHz)[PCIe1x16][527.56]
NVENC / CUDA   NVENC API 12.0, CUDA 12.0, schedule mode: sync
Input Buffers  CUDA, 20 frames
Input Info     y4m(yv12(10bit))->p010 [AVX2], 4096x1714, 24/1 fps
Vpp Filters    copyHtoD
Output Info    AV1 main 10bit @ Level auto
4096x1714p 1:1 24.000fps (24/1fps)
Encoder Preset quality
Rate Control   VBR
Multipass      none
Bitrate        0 kbps (Max: 0 kbps)
Target Quality 51.00
Initial QP     I:20  P:23  B:25
QP range       I:30-51  P:33-51  B:35-51
QP Offset      cb:0  cr:0
VBV buf size   auto
Lookahead      off
GOP length     240 frames
B frames       7 frames [ref mode: middle]
Ref frames     7 frames, MultiRef L0:auto L1:auto
AQ             on
Part size      max auto / min auto
Tile num       columns auto / rows auto
TemporalLayers max 1
Refs           forward auto, backward auto
VUI            matrix:bt2020c,range:full
Others         mv:Q-pel nonrefp
encoded 17616 frames, 12.80 fps, 1824.31 kbps, 159.63 MB
encode time 0:22:56, CPU: 0.1, GPU: 17.6, VE: 19.5, GPUClock: 1501MHz, VEClock: 1590MHz
frame type IDR    74
frame type I      74,  total size    4.78 MB
frame type P    2202,  total size    0.01 MB
frame type B   15340,  total size  154.83 MB
2022-12-27@17_48_59_1310_01_video finished after 00:22:56.773

will do a last test with "--qp-min 40:43:45"


27th Dec 2022 11:58 #12

Selur


Code:

vspipe "G:\Temp\encodingTempSynthSkript_2022-12-27@18_25_25_0710.vpy" - -c y4m | NVEnc --y4m -i - --fps 24.000 --codec av1 --sar 1:1 --output-depth 10 --vbr 0 --vbr-quality 51.00 --aq --aq-strength 5 --aq-temporal --gop-len 0 --ref 7 --nonrefp --weightp --bframes 7 --bref-mode middle --mv-precision Q-pel --preset quality --colorrange full --colormatrix bt2020c --cuda-schedule sync --qp-min 40:43:45 --output "G:\Temp\tos_picture_51_40_43_45_2022-12-27@18_25_25_0710_02.av1"

results in:

Code:

NVEncC (x64) 7.06 (r2388) by rigaya, Dec 10 2022 12:26:56 (VC 1929/Win)
OS Version     Windows 11 x64 (22621) [UTF-8]
CPU            AMD Ryzen 9 7950X 16-Core Processor [5.77GHz] (16C/32T)
GPU            #0: NVIDIA GeForce RTX 4080 (9728 cores, 2505 MHz)[PCIe1x16][527.56]
NVENC / CUDA   NVENC API 12.0, CUDA 12.0, schedule mode: sync
Input Buffers  CUDA, 20 frames
Input Info     y4m(yv12(10bit))->p010 [AVX2], 4096x1714, 24/1 fps
Vpp Filters    copyHtoD
Output Info    AV1 main 10bit @ Level auto
4096x1714p 1:1 24.000fps (24/1fps)
Encoder Preset quality
Rate Control   VBR
Multipass      none
Bitrate        0 kbps (Max: 0 kbps)
Target Quality 51.00
Initial QP     I:20  P:23  B:25
QP range       I:40-51  P:43-51  B:45-51
QP Offset      cb:0  cr:0
VBV buf size   auto
Lookahead      off
GOP length     240 frames
B frames       7 frames [ref mode: middle]
Ref frames     7 frames, MultiRef L0:auto L1:auto
AQ             on
Part size      max auto / min auto
Tile num       columns auto / rows auto
TemporalLayers max 1
Refs           forward auto, backward auto
VUI            matrix:bt2020c,range:full
Others         mv:Q-pel nonrefp
encoded 17616 frames, 9.93 fps, 1628.03 kbps, 142.45 MB
encode time 0:29:34, CPU: 0.0, GPU: 9.6, VE: 15.2, GPUClock: 1441MHz, VEClock: 1563MHz
frame type IDR    74
frame type I      74,  total size    4.28 MB
frame type P    2202,  total size    0.01 MB
frame type B   15340,  total size  138.16 MB
2022-12-27@18_25_25_0710_02_video finished after 00:29:39.308
finished...

Cu Selur
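To put the four NVEncC runs side by side, here is a small sketch that computes how far each one landed from the 1700 kBit/s target. The kbps values are copied from the logs in this thread; the labels are only descriptive (taken from the "QP range" log lines), not the exact command-line flags.

```python
# Sketch: how far each NVEncC run in this thread landed from 1700 kBit/s.
# The kbps values are copied from the logs; the labels are descriptive only.
TARGET_KBPS = 1700

runs = {
    "no QP range": 1824.31,
    "min QP 25:28:30": 1824.31,
    "min QP 30:33:35": 1824.31,
    "min QP 40:43:45": 1628.03,
}

# Percentage deviation from the target for every run.
deviation = {label: (kbps / TARGET_KBPS - 1) * 100 for label, kbps in runs.items()}

for label, pct in deviation.items():
    print(f"{label:<18} {runs[label]:>8.2f} kbps  {pct:+.1f}%")
```

Only the last run ends up below the target, and according to its log it also ran slower (9.93 fps instead of about 12.7 fps).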

Attached Files

tos_picture_51_40_43_45.mp4 (142.49 MB, 59 views)


27th Dec 2022 17:30 #13
RogerTango

I apologize in advance if I deviate from the OP's question, but what does AV1 offer that HEVC does not?

The reason, for me, that I picked HEVC was because my 1050ti already supports it, and so many products (Micca media player, Roku 4k, etc..) support HEVC... but not AV1.

Whatever the reason, good luck with your project!
Andrew

Last edited by RogerTango; 27th Dec 2022 at 18:52.

27th Dec 2022 18:00 #14
JN-

Did u mean AV1 ?

27th Dec 2022 18:52 #15
RogerTango

Originally Posted by JN-

Did u mean AV1 ?

Yes.

27th Dec 2022 22:34 #16
Selur

Originally Posted by RogerTango
what does AV1 offer that HEVC does not?

AV1 potentially has fewer licensing issues in commercial use.
AV1 is newer, and thus:
- it is not as widely supported as older formats,
- at least the software encoders contain new features like noise modelling (the general idea is to remove grain/noise during encoding and add similar noise back during playback),
- its implementations are less optimized,
- it's not from the MPEG group.

In general, AV1 aims to offer better compression than older formats (like HEVC) and is thus interesting to users.

Cu Selur

28th Dec 2022 01:31 #17

Selur


@ sophisticles: Something is wrong with your x265 encode.

Using:

Code:

x265 --input - --output-depth 12 --y4m --profile main444-12 --tu-intra-depth 3 --tu-inter-depth 3 --limit-tu 4 --subme 4 --limit-modes --max-merge 4 --bframes 8 --weightb --rc-lookahead 40 --lookahead-slices 0 --pass 1 --no-slow-firstpass --bitrate 1700 --opt-qp-pps --qpfile GENERATED_QP_FILE --rdoq-level 2 --psy-rdoq 1.00 --range full --colormatrix bt2020c --stats "G:\Temp\tos_picture_x265_slower_generated.stats" --output NUL

x265 --preset slower --input - --output-depth 12 --y4m --profile main444-12   --pass 2 --bitrate 1700 --opt-qp-pps --range full --colormatrix bt2020c --stats "G:\Temp\tos_picture_x265_slower_generated.stats" --output "G:\Temp\tos_picture_x265_slower.265"

Code:

y4m  [info]: 4096x1714 fps 24/1 i444p12 sar 1:1 unknown frame count
raw  [info]: output file: G:\Temp\2022-12-28@05_31_05_4210_02.265
x265 [info]: HEVC encoder version 3.5+69-dc12b9de0
x265 [info]: build info [Windows][GCC 12.2.0][64 bit] 12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [warning]: halving the quality when psy-rd is enabled for 444 input. Setting cbQpOffset = 6 and crQpOffset = 6
x265 [info]: Main 4:4:4 12 profile, Level-5 (Main tier)
x265 [info]: Thread pool created using 32 threads
x265 [info]: Slices                              : 1
x265 [info]: frame threads / pool features       : 5 / wpp(27 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 3 inter / 3 intra
x265 [info]: ME / range / subpel / merge         : star / 57 / 4 / 4
x265 [info]: Keyframe min / max / scenecut / bias  : 24 / 250 / 40 / 5.00
x265 [info]: Cb/Cr QP Offset                     : 6 / 6
x265 [info]: Lookahead / bframes / badapt        : 40 / 8 / 2
x265 [info]: b-pyramid / weightp / weightb       : 1 / 1 / 1
x265 [info]: References / ref-limit  cu / depth  : 5 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree  : 2 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress            : ABR-1700 kbps / 0.60
x265 [info]: tools: rect amp limit-modes rd=6 psy-rd=2.00 rdoq=2 psy-rdoq=1.00
x265 [info]: tools: rskip mode=1 limit-tu=4 signhide tmvp b-intra
x265 [info]: tools: strong-intra-smoothing deblock sao stats-read
x265 [info]: frame I:    132, Avg QP:32.87  kb/s: 14030.59
x265 [info]: frame P:   3447, Avg QP:36.82  kb/s: 4823.51
x265 [info]: frame B:  14037, Avg QP:41.89  kb/s: 811.35
x265 [info]: Weighted P-Frames: Y:7.1% UV:2.0%
x265 [info]: Weighted B-Frames: Y:5.8% UV:1.1%
encoded 17616 frames in 6040.69s (2.92 fps), 1695.48 kb/s, Avg QP:40.83
2022-12-28@05_31_05_4210_02_video finished after 01:40:41.638

I get way better results than your file.
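As a cross-check of the x265 log, the overall bitrate can be reproduced from the per-frame-type lines. This sketch assumes the per-type kb/s figures are per-frame averages scaled by the frame rate, so that the overall figure is their frame-count-weighted average; the variable names are made up.

```python
# Sketch: the overall bitrate in the x265 log as a frame-count-weighted
# average of the per-frame-type bitrates (values copied from the log above).
frame_stats = {
    "I": (132, 14030.59),    # (frame count, kb/s)
    "P": (3447, 4823.51),
    "B": (14037, 811.35),
}

total_frames = sum(count for count, _ in frame_stats.values())
overall_kbps = sum(count * kbps for count, kbps in frame_stats.values()) / total_frames

print(total_frames, f"{overall_kbps:.2f} kb/s")
```

The frame counts add up to the 17616 frames in the log and the weighted average comes out at about 1695.48 kb/s, which is the figure x265 reports and close to the requested 1700.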

Cu Selur

PS: Reading about your crashes and looking at the broken output, I would recommend checking your system (heat, memory, disabling any overclocking, ...).

Attached Files

tos_picture_x265_slower.mp4 (148.59 MB, 64 views)

Last edited by Selur; 28th Dec 2022 at 02:16.
