VideoHelp Forum
  1. Hi there, I've recently tested encoding performance on a friend's MSI Stealth GS77-12U laptop: when the process was done with one of the GPUs, the other remained unused - although it still consumed some power (neither goes totally idle) and of course produced heat - and vice versa.

    As for the title: how, if possible, do I write a command line that assigns the Intel iGPU to decoding the source video stream and the Nvidia dGPU to the encoding workload only (and, why not, vice versa)?

    Thanks in advance to anyone who can/will help.
  2. Decoding and encoding use different HW blocks, so I doubt you'd gain anything by trying to decode on one SoC and encode on another.
    If you check NVidia's and Intel's guidelines for ffmpeg encoding, both vendors point out that the fastest transcoding is achieved when there are no transfers beyond the vendor's SoC (i.e. decoding/scaling/processing/encoding is performed entirely on the one SoC).
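    For illustration, a minimal sketch of such a single-SoC transcode, keeping decode, scaling and encode on the Intel GPU so frames never leave it (standard ffmpeg QSV options; file names are placeholders):

    Code:
    ffmpeg -y -hide_banner ^
        -hwaccel qsv -hwaccel_output_format qsv ^
        -c:v h264_qsv -i input.mp4 ^
        -vf scale_qsv=1920:1080 ^
        -c:v h264_qsv -preset fast output.mp4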
  3. JN- (Member, Dublin)
    The only time I think it makes sense to use two GPUs rather than the single most powerful one is when decoding, say, 4:2:2 10-bit. AFAIK only Intel can do that in HW. So in that case an NLE such as Vegas Pro can be set to Intel for decoding and, say, Nvidia or AMD for encoding. Because the Intel device is typically a lot less powerful than a high-end Nvidia device, it makes sense to split the render across both; using the Intel device alone would be slower.

    It would be nice to have an ffmpeg syntax to do that for sure, i.e. use both.

    Actually some of the above re: VP is incorrect. The Intel device can be set to assist playback within the NLE for the 10-bit 4:2:2 clip, but the render only selects a single GPU for encoding output. The render times might be faster though (due to using Intel to decode frames); I must check that out.
  4. JN- (Member, Dublin)
    Checked it out: in VP, rendering out using the Nvidia GPU but with the Intel GPU as decoder really does go faster.

    So getting that to work using ffmpeg would be something. Beyond my capabilities though.

    Note that the Intel GPU is only in a 4x PCIe slot.

    Render times:
    Without Intel used as decoder: 91 seconds (Nvidia was used as decoder)
    With Intel used as decoder: 23 seconds

    A 4x difference.

    Test clip was a 27-second UHD HEVC 4:2:2 10-bit.

    Playback in VP was of course vastly better using the Intel decoder. The last 25% of the clip played at less than 1 fps without the Intel decoder; with it, the clip played at the full 25 fps for its whole length.

    Output rendered files were UHD 4:2:0 8-bit.

    Output quality was the same ...
    Start Date and Time .. 27/07/2024 .. 12:30:47.26

    "Video Help Test-with Using Intel as decoder.mp4" metrics:
    SSIM All ....... 0.697930 (5.198917)
    PSNR Average ... 20.644672
    VMAF ........... 47.087881

    "Video Help Test-without using Intel as decoder.mp4" metrics:
    SSIM All ....... 0.697954 (5.199270)
    PSNR Average ... 20.644662
    VMAF ........... 47.090000

    End Date and Time .... 27/07/2024 .. 12:32:04.59 ... Duration [00:01:18.62] ... Number of input files [2]
    Above file(s) processed by [RQM.bat], reference file [Source.mov]
    ffmpeg version ....... 2023-09-07-git-9c9f48e7f2-full_build-www.gyan.dev

    The overall low RQM results are because I didn't aim for high values. The output h264 file size and data rates were about 3.4 times smaller than the source HEVC file.


    NVEnc encoding in VP (latest version 21 b315) still suffers from the poor NVEnc encoding behaviour - stop/start, heartbeat, hacksaw, or whatever you want to call it. Since it happens with both renders it can be ignored here. Other, non-Nvidia encoding in VP wouldn't suffer this issue.
  5. You can use qsv for decoding and nvenc for encoding with ffmpeg. Just specify the decoder before the input, and the encoder after.


    Code:
    ffmpeg -y -hide_banner -benchmark ^
        -c:v h264_qsv -i "%~dpnx1" ^
        -c:v h264_nvenc -preset fast -cq:v 23 ^
        "%~dpn1.h264.nvenc.mkv"
    That's from a Windows drag/drop batch file. With a 4:2:2 10 bit source you may need to coerce the pix_fmt.
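    For instance, a hedged sketch of that coercion for a 10-bit 4:2:2 HEVC source (assumes your iGPU can decode it; QSV's H.264 decoder generally handles only 8-bit 4:2:0, so this applies to HEVC inputs):

    Code:
    ffmpeg -y -hide_banner -benchmark ^
        -c:v hevc_qsv -i "%~dpnx1" ^
        -vf format=yuv420p ^
        -c:v h264_nvenc -preset fast -cq:v 23 ^
        "%~dpn1.h264.nvenc.mkv"
    Here the format filter converts the decoded frames to 8-bit 4:2:0 in system memory before they are handed to NVEnc.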
  6. JN- (Member, Dublin)
    Thanks for the syntax. I've struggled with it for some time but cannot get it to work if the input is not 8-bit 4:2:0.

    Using an 8-bit 4:2:0 input, Intel decoding is a fraction slower.
    I'd expect that to reverse with a 10-bit 4:2:2 input, if I could get it to work.

    A 10-bit 4:2:2 input file causes errors.
    [Attachment 81011 - screenshot of the ffmpeg errors]


    This is what I am using ...


    Code:
    @echo off
    mode CON:cols=228 lines=80
    color 0A
    SETLOCAL EnableDelayedExpansion
    TITLE A test

    SET _INPUT_FILE="%~n1%~x1"
    SET _INPUT_FILENAME="%~n1"
    SET _INPUT_EXTN=%~x1

    SET "_PIXELFORMAT=-pix_fmt yuv420p"

    REM Without Intel decoding.
    REM ffmpeg -y -hide_banner -benchmark -i "%~n1%~x1" -c:v libx264 -preset fast -crf 23 "%~n1-[A-TEST].mkv"

    REM With Intel decoding.
    REM ffmpeg -y -hide_banner -benchmark -c:v h264_qsv -i "%~n1%~x1" -c:v libx264 -preset fast -crf 23 "%~n1-[A-TEST-uses-Intel-decoder].mp4"

    ffmpeg -y -hide_banner -benchmark -c:v h264_qsv -i "%~n1%~x1" -c:v libx264 %_PIXELFORMAT% -preset fast -crf 23 "%~n1-[A-TEST-uses-Intel-decoder].mp4"

    echo.
    echo.
    echo.
    pause
  7. Do you have a short sample I can test with?
  8. JN- (Member, Dublin)
    Hi jagabo, I only saw your reply a little while ago.

    Here is a Dropbox link with two 10-bit 4:2:2 samples.

    https://www.dropbox.com/scl/fi/topo8mks0kfl91ou5d5ok/Short-10-bit-422-pieces.zip?rlkey...10depuesc&dl=0
  9. It turns out that my CPU (i9-9900K, Coffee Lake) doesn't support 10-bit 4:2:2 HEVC decoding.

    https://en.wikipedia.org/wiki/Intel_Quick_Sync_Video

    I don't think I can help you further, since I won't be able to tell whether a command line would work on a CPU that does support it.
  10. JN- (Member, Dublin)
    Aok. Thanks for the suggestions. I think 10-bit 4:2:2 support arrived with 11th gen onwards.

    I had a list of command lines for HW encoding, with many variations that I tested in the past, although not for this particular scenario, i.e. using two GPUs. It might have given me some idea of what I'm missing, but unfortunately after much searching I cannot find it. I'll have another look tomorrow.
  11. Please post the command line here if you find something that works. I keep a log of such things for reference, even if I can't use it right now.
  12. JN- (Member, Dublin)
    It would always be my intention to do so.

    I already burned the midnight oil on this last night, so for now I am out of ideas. There may be other users here who have suggestions. Even if they don't have QSV 10-bit 4:2:2 HW capability, I am happy to try any alternative ffmpeg syntax offered.

    The last thing to do shortly is to try an older ffmpeg version, say from the end of last year.
  13. JN- (Member, Dublin)
    [Attachment 81019 - screenshot of the ffmpeg error]


    "Shortly" just arrived: no luck, just a more concise error message. I used an ffmpeg.exe from November 2023.

    Used ... ffmpeg -y -hide_banner -c:v h264_qsv -i "%~n1%~x1" -c:v libx264 -preset fast -crf 23 "%~n1-[A TEST, uses Intel decoder].mp4"


    The following inputs were h264 files.
    When the input is 8-bit 4:2:0, the above works.
    When the input is 8-bit 4:2:2, 10-bit 4:2:0, or 10-bit 4:2:2, the above doesn't work.

    Tried one 10-bit 4:2:2 HEVC file; still errors.
  14. Query both the decoder and the encoder for their supported pixel formats (use ffmpeg -h encoder=xxx_nvenc, and similarly for the Intel encoder; a decoder may need ffmpeg -h decoder=xxx). If I recall correctly, GPUs natively use different pixel formats than common video formats. Your goal is to avoid CPU interaction as much as possible, preferably copying only video data between cards (from Intel to NVidia); since you are using PCIe, that transfer is the operation that impacts performance most. Check the vendors' papers - they explain some transcoding limitations (a concrete example of those queries follows the links):
    https://duckduckgo.com/?q=intel+ffmpeg+quicksync+filetype%3Apdf
    https://duckduckgo.com/?q=nvenc+ffmpeg+filetype%3Apdf
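    For example, these queries list each codec's capabilities on your ffmpeg build (standard ffmpeg help syntax):

    Code:
    ffmpeg -hide_banner -h decoder=hevc_qsv
    ffmpeg -hide_banner -h encoder=h264_nvenc
    The encoder help in particular lists its supported pixel formats, i.e. what the frames must be converted to before they reach the encoder.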
  15. JN- (Member, Dublin)
    Thanks for the suggestions, pandy. My use case requirements may be different from the OP's.

    "your goal is to avoid CPU interaction as much as possible, preferably copying only video data between cards (from Intel to NVidia)"

    My interest in this is to use ffmpeg to improve rendering/encoding of 10-bit 4:2:2 input to typically 8-bit 4:2:0 output.

    So far I have only used the CPU to render and QSV to decode the 10-bit 4:2:2 input.

    I'll try using Nvidia instead of the CPU to see if I have better luck. Thanks for the links.

    "GPUs natively use different pixel formats than common video formats"

    Correct.

    I will insert the appropriate pixel formats for QSV and NVEnc.
  16. Originally Posted by JN-
    My interest in this is to use ffmpeg to improve rendering/encoding of 10-bit 4:2:2 input to typically 8-bit 4:2:0 output. [...] I'll try using Nvidia instead of the CPU to see if I have better luck.
    Your goal isn't necessarily so different. By trying to push as much video processing as possible onto the GPU HW and reducing the overall CPU load, you can use complex filters that would be impractical on the CPU - OpenCL or CUDA based processing may be beneficial to your goal.

    You have some GPU accelerated processing options (a small example follows the links):
    https://ffmpeg.org/ffmpeg-filters.html#libplacebo
    https://ffmpeg.org/ffmpeg-filters.html#OpenCL-Video-Filters
    https://ffmpeg.org/ffmpeg-filters.html#Vulkan-Video-Filters
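    As a hedged illustration of that idea (standard ffmpeg CUDA options; assumes a source NVDEC can decode), this keeps decode, scaling and encode on the Nvidia card so only compressed bitstreams cross the PCIe bus:

    Code:
    ffmpeg -y -hide_banner ^
        -hwaccel cuda -hwaccel_output_format cuda ^
        -i input.mp4 ^
        -vf scale_cuda=1920:1080 ^
        -c:v h264_nvenc -preset fast output.mp4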
  17. JN- (Member, Dublin)
    I tried using the Nvidia encoder instead of the CPU but get similar errors.

    ffmpeg -y -init_hw_device qsv=intel,child_device=2 -c:v h264_qsv -i "%~n1%~x1" -map 0:v -c:v h264_nvenc "%~n1-[ Uses Intel decoder-YES ].mp4"

    I am using an AMD CPU with an iGPU plus Nvidia and Intel cards, so the GPU number above ("2") is correct, i.e. 0=Nvidia, 1=AMD and 2=Intel.

    As mentioned previously, if the input is 8-bit 4:2:0 it works OK.
  18. JN- (Member, Dublin)
    I didn't insert the pixel formats for QSV or Nvidia because neither supports 10-bit 4:2:2. I figure the complexities of this are above me.

    However, maybe this is all a bit of a brain fart on my behalf, looking for something that's already mostly there.

    When I found that I was getting a 4x render speed increase in Vegas Pro by using Intel for decoding and Nvidia to encode the render, I thought why not aim for that with ffmpeg.

    Thing is, you can feed 10-bit 4:2:2 to Nvidia alone (no QSV used) using ffmpeg and it will output, say, any of its available pixel formats, 8-bit 4:2:0 and 8-bit 4:4:4. Typically I would only need 8-bit 4:2:0 as output.

    I just assumed that if the Intel decoder is also used in the command line it would be faster still? Maybe not by much.
  19. JN- (Member, Dublin)
    My assumption up to now has been that when using ffmpeg, utilising Intel's QSV to decode 10-bit 4:2:2 input would improve encode times with, say, ffmpeg's CPU encoders (x264/hevc) or NVEnc. But maybe ffmpeg converts the frames from the 10-bit 4:2:2 input very fast anyway. Without getting the syntax working I'll never know.
  20. JN- (Member, Dublin)
    OK. I used the following, with and without Intel decoding, using NVEnc and x264 (CPU) for rendering. It works on a 1m 56s UHD 10-bit 4:2:2 HEVC input clip.

    SET "_PIXELFORMAT=-pix_fmt yuv420p"

    Input is UHD 10-bit 4:2:2 HEVC; output is UHD 8-bit 4:2:0 h264.

    REM ffmpeg -y -init_hw_device qsv=intel,child_device=2 -i "%~n1%~x1" -map 0:v -c:v h264_nvenc %_PIXELFORMAT% -map 0:a -c:a copy "%~n1-[ Uses Intel decoder-YES ]-Nvenc encoder.mp4"
    REM ffmpeg -y -i "%~n1%~x1" -map 0:v -c:v h264_nvenc %_PIXELFORMAT% -map 0:a -c:a copy "%~n1-[ Uses Intel decoder-NO ]-Nvenc encoder.mp4"

    REM ffmpeg -y -init_hw_device qsv=intel,child_device=2 -i "%~n1%~x1" -map 0:v -c:v libx264 %_PIXELFORMAT% -map 0:a -c:a copy "%~n1-[ Uses Intel decoder-YES ]-CPU encoder.mp4"
    REM ffmpeg -y -i "%~n1%~x1" -map 0:v -c:v libx264 %_PIXELFORMAT% -map 0:a -c:a copy "%~n1-[ Uses Intel decoder-NO ]-CPU encoder.mp4"

    The render times are near identical within each pair of tests, and they are the opposite of what I would expect: using the Intel device is slower in both cases. So I'm not sure it is really using the Intel device to decode. The ffmpeg on-screen display does clearly show the device when running with the Intel syntax, though.

    Duration = [00:00:52.43] ... "Grafton Street Music, 10 bit 422.mp4" Nvenc encoding, WITH Intel decoder
    Duration = [00:00:52.18] ... "Grafton Street Music, 10 bit 422.mp4" Nvenc encoding, WITHOUT Intel decoder

    Duration = [00:01:41.37] ... "Grafton Street Music, 10 bit 422.mp4" CPU x264 encoding, WITH Intel decoder
    Duration = [00:01:41.09] ... "Grafton Street Music, 10 bit 422.mp4" CPU x264 encoding, WITHOUT Intel decoder
  21. JN- (Member, Dublin)
    SET "_PIXELFORMAT=-pix_fmt yuv420p"

    ffmpeg -y -init_hw_device qsv=intel,child_device=2 -i "%~n1%~x1" -c:v h264_nvenc %_PIXELFORMAT% -c:a copy "%~n1-[ Uses Intel decoder-YES ]-Nvenc encoder-.mp4"

    I see that, as per Task Manager, the Intel GPU is doing nothing - not decoding.

    Although it gets initialised, that's it.

    Oddly, the GPU numbers in Task Manager are 0=Nvidia, 1=Intel and 2=AMD.

    When I use 2 for Intel I get no errors, but using 1 does give errors.

    I don't have an ffmpeg syntax to list these numbers.
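    One hedged guess as to why it only gets initialised: -init_hw_device on its own just creates the device, and nothing in the command line tells ffmpeg to decode on it. Something along these lines, naming both the QSV device and a QSV decoder, might actually engage it (untested sketch; -hwaccel_device can refer to a device created with -init_hw_device):

    Code:
    ffmpeg -y -init_hw_device qsv=intel,child_device=2 ^
        -hwaccel qsv -hwaccel_device intel ^
        -c:v hevc_qsv -i input.mp4 ^
        -c:v h264_nvenc -pix_fmt yuv420p -c:a copy output.mp4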
  22. Well, another option is to pipe QSV/NV-Enc into FFMPEG (or vice-versa): https://github.com/rigaya/QSVEnc/issues/207
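    A hedged sketch of that pattern, decoding with ffmpeg's QSV decoder and piping raw y4m into rigaya's NVEncC (assumes NVEncC is installed and takes y4m on stdin via --y4m; check NVEncC --help for the exact flags):

    Code:
    ffmpeg -hide_banner -c:v hevc_qsv -i input.mp4 -vf format=yuv420p -f yuv4mpegpipe - | NVEncC --y4m -i - -c h264 -o output.264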
  23. JN- (Member, Dublin)
    Another option for me is to just use the single Intel GPU to do both the decoding and the encoding. I realise that this isn't what the OP queried, and it would still be nice to get the multiple-GPU setup working.

    In the case of a 10-bit 4:2:2 input clip I would expect a big improvement using Intel's QSV decoding capabilities, since this format is currently not handled in HW by AMD/Nvidia.

    Why it took me so long to get here (no excuse) was because in VP I have to set I/O to Intel for decoding and, say, CPU/Nvidia for encoding. I just didn't see the obvious when using ffmpeg.

    I tried comparing ffmpeg encodes of a 10-bit 4:2:2 clip using NVEnc vs QSV, expecting a difference in favour of Intel despite the Intel GPU running in a 4x slot.

    However there was none. The reason is that no decoding was taking place on either GPU (per Task Manager), only on the CPU.

    So I have to find out how to enable ffmpeg HW decoding for NVEnc and QSV.
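    For reference, the usual ways to force HW decoding in ffmpeg (standard options; pick the line matching the device, bearing in mind NVDEC generally won't take 10-bit 4:2:2):

    Code:
    REM QSV decode on the Intel GPU, QSV encode:
    ffmpeg -hwaccel qsv -c:v hevc_qsv -i input.mp4 -vf format=yuv420p -c:v h264_qsv output.mp4

    REM NVDEC decode on the Nvidia GPU, NVEnc encode:
    ffmpeg -hwaccel cuda -i input.mp4 -c:v h264_nvenc output.mp4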
  24. JN- (Member, Dublin)
    I used this ... -hwaccel_output_format qsv -c:v h264_qsv -i "%_INPUT_FILE%" -map 0:v -c:v h264_qsv etc etc

    It was a slight bit slower with the above than without it. Maybe that's because the GPU is in a 4x slot and I use a 16-core CPU. So I won't be using it; also, it doesn't handle 10-bit 4:2:2 input, while leaving the above syntax off allows 10-bit 4:2:2 input in ffmpeg. Surprising, since the latest Intel should handle 10-bit 4:2:2. I guess there is something else I should add to the above?
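    One hedged guess at the missing piece: with -hwaccel_output_format qsv the decoded surfaces are handed to h264_qsv as-is, and a 10-bit 4:2:2 surface is not something it can encode, so a GPU-side format conversion may be needed. ffmpeg's vpp_qsv filter has a format option for exactly that (untested sketch):

    Code:
    ffmpeg -y -hwaccel qsv -hwaccel_output_format qsv -c:v hevc_qsv -i input.mp4 ^
        -vf vpp_qsv=format=nv12 -c:v h264_qsv output.mp4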

    I did another test in VP and found that with an h264 input file, enabling Intel decoding gave no benefit. However, when the input file was HEVC there was a nearly 2x render speed improvement. I got a 4x speed improvement previously with a different file.

    So back to square one. I do know that enabling the Intel decoder is worth doing if the input files are typically HEVC and 10-bit 4:2:2.

    As to the OP's query, I don't know how to enable one GPU for decoding and another for encoding using ffmpeg.


