To GPU process, or not to GPU process ? 450fps HD -> SD

16th Jan 2017 23:34 #1

Member

I had a temporary fixation on GPU based decoding/resizing/sharpening/encoding, using ffmpeg, for standardising non-vital videos to mp4/avc/aac for playback across a range of my home devices.

To GPU, or not to GPU, that was the question. Elapsed time is valuable.

That said, having been spoiled by AviSynth and VapourSynth, I found there isn't much choice of inbuilt ffmpeg filters for denoising, sharpening and whatnot, and much less choice again in GPU/OpenCL filters.

Was it worth it ? Yes and no. Maybe you have better suggestions (I hope so).
Many videos just aren't worth enhancing beyond a tad of sharpening anyway.

So, with a lot of assistance I built an ffmpeg x64 with OpenCL as well as the usual bits, and then did some testing.
Build related matters here https://ffmpeg.zeranoe.com/forum/viewtopic.php?f=5&t=1787&p=11908#p11908
Test setup: i3820, nvidia 750 Ti 2GB, 16GB RAM.
I ended up giving up on ffmpeg CUVID GPU decoding (-hwaccel cuvid -c:v h264_cuvid), apparently called "nvdec" by Nvidia now, as too unreliable: it threw errors and sometimes crashed. I wonder what I did wrong.

So ... after a bit of testing, once-off results seemed to indicate
- DXVA2 for decoding mpeg2 was not quicker than vanilla ffmpeg decoding (it was consistently slower for me)
- Vanilla ffmpeg internal mpeg2 decoding was pretty reasonable speed, it didn't appear to be worth doing fancy GPU decoding by itself
- Using OpenCL in ffmpeg's unsharp mask can make a huge difference for larger dimensioned sources like 1080i, end-to-end rate 5.08x vs 2.01x
- ffmpeg deinterlacing with yadif was pretty reasonable (the only way I could readily find to GPU deinterlace was to open with VapourSynth/dgdecode_NV and then pipe into ffmpeg for unsharp-OpenCL/encoding)
- GPU deinterlacing/resizing with VapourSynth/dgdecode_NV (i.e. nvidia's PureVideo GPU deinterlacer) and then piped into ffmpeg for unsharp-OpenCL/encoding, beat the ffmpeg-only combination by an absolutely whopping margin, end-to-end rate 19.5x vs 5.49x

Some of the test results below from conversion of a 1080i mpeg2 video into h264.
Source: 3hr 1080i mpeg2. Each line shows the configuration (unsharp without/with OpenCL, standard vs DXVA2 decoding) followed by elapsed time, fps and speed.

4.0 OpenCL=0 homebuilt ffmpeg_x64 NOOpenCL_unsharp + nvenc + standard_ffmpeg_decoding - 01:29:38, fps= 50, speed=2.01x
4.1 OpenCL=1 homebuilt ffmpeg_x64 OpenCL_unsharp + nvenc + standard_ffmpeg_decoding - 00:35:27, fps=127, speed=5.08x
4.2 OpenCL=1 homebuilt ffmpeg_x64 OpenCL_unsharp + nvenc + DXVA2_GPU_decoding - 00:48:32, fps= 93, speed=3.71x
Interestingly, DXVA2 always slows it down.
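
As a rough sanity check on the three results above: the speed figure should be close to the encode fps divided by the source frame rate, and the elapsed time close to the source duration divided by the speed. A minimal Python sketch, assuming the source is exactly 3 hours at 25 fps (the post only says "3hr", so small differences are expected). The function names are illustrative only.

```python
def speed_factor(encode_fps, source_fps=25.0):
    """Realtime multiple implied by an encode rate (compare with ffmpeg's speed= figure)."""
    return encode_fps / source_fps


def elapsed_hms(duration_s, speed):
    """Estimated wall-clock time, as HH:MM:SS, for a source lasting duration_s seconds."""
    total = round(duration_s / speed)
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


SOURCE_SECONDS = 3 * 3600  # assumption: "3hr" taken as exactly 3 hours

# (test label, reported fps, reported speed) copied from the results above
for label, fps, speed in [("4.0", 50, 2.01), ("4.1", 127, 5.08), ("4.2", 93, 3.71)]:
    print(label, round(speed_factor(fps), 2), elapsed_hms(SOURCE_SECONDS, speed))
```

Under that assumption the estimates land within a few seconds of the reported elapsed times, so the fps, speed and elapsed columns are at least consistent with each other.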

Other test results for: first 2 minutes of the 3hr 1080i mpeg2 :

1. deinterlace to 25fps, unsharp with OpenCL=1, encode with nvenc in ffmpeg

deinterlace to 25fps using standard_ffmpeg_decoding and yadif, homebuilt ffmpeg_x64 OpenCL_unsharp + nvenc
frame= 3000 fps=138 q=21.0 Lsize= 267317kB time=00:01:59.96 bitrate=18254.9kbits/s speed=5.52x
Tue 17/01/2017 14:27:01.50 2. non-VS opencl=1 homebuilt x64 opencl + nvenc in ffmpeg
Tue 17/01/2017 14:27:24.14 2. end non-VS

deinterlace to 25fps using GPU dgdecode_NV_decoding in vapoursynth_x64 piped to ffmpeg_x64, homebuilt ffmpeg_x64 OpenCL_unsharp + nvenc
frame= 3000 fps=106 q=21.0 Lsize= 248463kB time=00:01:59.96 bitrate=16967.4kbits/s speed=4.23x
Tue 17/01/2017 14:26:31.59 1. VS - This will use the 64bit version of everything - hopefully. No Yadif since nvidia PureVideo deinterlacing is done by VS/DGDecodeNV
Tue 17/01/2017 14:27:01.47 1. end VS

2. deinterlace to 25fps, resize to 720:576, unsharp with OpenCL=1, encode with nvenc in ffmpeg

deinterlace to 25fps using standard_ffmpeg_decoding and yadif, ffmpeg scale=720:576, homebuilt ffmpeg_x64 OpenCL_unsharp + nvenc in ffmpeg
frame= 3000 fps=137 q=21.0 Lsize= 70775kB time=00:01:59.96 bitrate=4833.2kbits/s speed=5.49x
Tue 17/01/2017 14:38:56.46 2. non-VS opencl=1 homebuilt x64 opencl + nvenc in ffmpeg
Tue 17/01/2017 14:39:19.09 2. end non-VS

deinterlace to 25fps using GPU dgdecode_NV_decoding in vapoursynth_x64 GPU scaling to 720:576 by dgdecode_NV at the same time, piped to ffmpeg_x64, homebuilt ffmpeg_x64 OpenCL_unsharp + nvenc in ffmpeg
frame= 3000 fps=487 q=21.0 Lsize= 65897kB time=00:01:59.96 bitrate=4500.1kbits/s speed=19.5x
Tue 17/01/2017 14:38:48.22 1. VS - This will use the 64bit version of everything - hopefully. No Yadif since nvidia PureVideo deinterlacing is done by VS/DGDecodeNV
Tue 17/01/2017 14:38:56.43 1. end VS
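
The start/end stamps logged by the batch file give a second, cruder measure: frames divided by total wall-clock time. A small Python sketch of that arithmetic, assuming the stamps are in HH:MM:SS.cc form on the same day and that each run encoded the 3000 frames shown by ffmpeg. The function name is illustrative only.

```python
from datetime import datetime


def wall_clock_fps(start, end, frames=3000):
    """Frames per second over the whole run, from HH:MM:SS.cc start/end stamps."""
    fmt = "%H:%M:%S.%f"
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return frames / elapsed.total_seconds()


# (description, start stamp, end stamp) copied from the logs above
runs = [
    ("test 1, ffmpeg only", "14:27:01.50", "14:27:24.14"),
    ("test 1, VS + DGDecodeNV", "14:26:31.59", "14:27:01.47"),
    ("test 2, ffmpeg only", "14:38:56.46", "14:39:19.09"),
    ("test 2, VS + DGDecodeNV", "14:38:48.22", "14:38:56.43"),
]

for name, start, end in runs:
    print(f"{name}: {wall_clock_fps(start, end):.0f} fps wall-clock")
```

These wall-clock figures come out lower than the fps ffmpeg reports, presumably because the stamps also cover process startup and anything else the batch file does between them, so the ordering of the four runs is more meaningful than the absolute numbers.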

If you have suggestions and/or criticisms, please feel very free to post them.

... one could be: do anything non-trivial inside VapourSynth, assuming one could find working 64bit GPU based filters, and pass the result to ffmpeg for nvenc encoding.
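
To make that suggestion concrete, here is a hedged Python sketch of the same "VapourSynth does the filtering, ffmpeg only sharpens and encodes" pipeline, driven with subprocess instead of a .bat file. It assumes executables named vspipe and ffmpeg are on the PATH and that the ffmpeg build accepts the same options used elsewhere in this thread (unsharp with opencl=1, h264_nvenc); other builds may need different option names. The function names and file names are made up for illustration.

```python
import subprocess


def build_pipeline(script_vpy, output_mp4, qp=22):
    """Return the (vspipe, ffmpeg) argument lists; nothing is executed here."""
    vspipe_cmd = ["vspipe", "--y4m", script_vpy, "-"]
    ffmpeg_cmd = [
        "ffmpeg", "-i", "pipe:", "-an",
        # options copied from the script in this thread; they depend on the ffmpeg build
        "-filter:v", "unsharp=opencl=1:luma_msize_x=3:luma_msize_y=3:luma_amount=0.5",
        "-c:v", "h264_nvenc", "-preset", "hq",
        "-rc:v", "constqp", "-global_quality", str(qp),
        "-movflags", "+faststart", "-y", output_mp4,
    ]
    return vspipe_cmd, ffmpeg_cmd


def run_pipeline(script_vpy, output_mp4):
    """Pipe vspipe's y4m output into ffmpeg and return both exit codes."""
    vspipe_cmd, ffmpeg_cmd = build_pipeline(script_vpy, output_mp4)
    producer = subprocess.Popen(vspipe_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(ffmpeg_cmd, stdin=producer.stdout)
    producer.stdout.close()  # so vspipe sees a broken pipe if ffmpeg exits early
    consumer_rc = consumer.wait()
    producer_rc = producer.wait()
    return producer_rc, consumer_rc


if __name__ == "__main__":
    # Only print the commands; call run_pipeline() yourself once the paths are real.
    for cmd in build_pipeline("resize.vpy", "out.mp4"):
        print(" ".join(cmd))
```

Keeping the command construction separate from the execution makes it easy to print and eyeball the exact command lines before committing to a long encode.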

P.S. I'm not fixated on the unsharp sharpening filter, it's the only GPU (OpenCL) based one that I could find.

Last edited by hydra3333; 26th Jan 2017 at 00:57.


26th Jan 2017 01:13 #2

hydra3333

Member

OK, a hacked-up example Windows .bat "script" showing how to convert PAL 1080i50 mpeg2 to PAL 576p25 h264 (in .mp4) with minor sharpening @ circa 450fps using an old nvidia 750-Ti 2Gb.
It isn't definitive, only a test script which works.
It assumes

portable vapoursynth x64 available with the associated x64 portable python and related vapoursynth x64 plugins (eg dgdecodenv x64)
an OpenCL-enabled ffmpeg (a home-built ffmpeg using a modified version of rdp's script)
portable mp4box (built along with the openCL enabled ffmpeg)
Donald Graft's DGdecodeNV x64 ($15 donation) to decode and resize in-GPU using cuda
an nvidia GPU (decoding, resizing and deinterlacing are done on the GPU, unsharp uses OpenCL on the GPU to sharpen, and encoding is done with ffmpeg's inbuilt nvidia nvenc encoder)

Speed is great and quality is acceptable for non-critical videos, depending on where you sit on the speed vs fussiness tradeoff.

Denoising can also be done inside VapourSynth (post resizing) via KNLMeansCL, which uses OpenCL on the GPU; however, it reduces 450fps to circa 45fps.

Code:

@echo on
@setlocal ENABLEDELAYEDEXPANSION
REM
REM convert all into .MP4, one at a time
REM
CALL ".\000-setup-exe-paths.bat"

REM ***** PREVENT PC FROM GOING TO SLEEP *****
REM header comes from ".\000-setup-exe-paths.bat"
set iFile=%header%-Insomnia.exe
copy "C:\SOFTWARE\Insomnia\64-bit\Insomnia.exe" ".\%iFile%"
start /min "%iFile%" ".\%iFile%"
REM ***** PREVENT PC FROM GOING TO SLEEP *****

SET sourcePath=.\
SET DonePath=.\done\
SET targetMPGpathRoot=T:\HDTV\autoTVS-mpg\
SET ConvertedPath=%targetMPGpathRoot%Converted\
SET SCRATCHPATH=D:\temp\SCRATCH\
REM --------- resolve any relative paths into absolute paths --------- 
REM --------- ensure no spaces between brackets and SET statement --------- 
echo before sourcePath="%sourcePath%"
FOR /F %%i IN ("%sourcePath%") DO (SET sourcePath=%%~fi)
echo after sourcePath="%sourcePath%"
echo before MPGpath="%MPGpath%"
FOR /F %%i IN ("%MPGpath%") DO (SET MPGpath=%%~fi)
echo after MPGpath="%MPGpath%"
echo before DonePath="%DonePath%"
FOR /F %%i IN ("%DonePath%") DO (SET DonePath=%%~fi)
echo after DonePath="%DonePath%"
echo before ConvertedPath="%ConvertedPath%"
FOR /F %%i IN ("%ConvertedPath%") DO (SET ConvertedPath=%%~fi)
echo after ConvertedPath="%ConvertedPath%"
echo before SCRATCHPATH="%SCRATCHPATH%"
FOR /F %%i IN ("%SCRATCHPATH%") DO (SET SCRATCHPATH=%%~fi)
echo after SCRATCHPATH="%SCRATCHPATH%"
REM ---------------------------------------
md "%DonePath%"
md "%targetMPGpathRoot%"
md "%ConvertedPath%"
md "%SCRATCHPATH%"

for %%f in ("%sourcePath%*.mpg") do (
REM the subname test is from another test suite, leave it in just as an example of how to exclude even though its useless here
   set fname=%%~nxf
   set subfname=!fname:~-8!
   CALL :LoCase subfname
   IF NOT "!subfname!" == ".aac.mp4" (
      CALL :NV "%%f" "%ConvertedPath%"
      MOVE "%%f" "%DonePath%"
   )
)

REM ***** ALLOW PC TO GO TO SLEEP AGAIN *****
"C:\000-PStools\pskill.exe" -t "%iFile%"
del ".\%iFile%"
REM ***** ALLOW PC TO GO TO SLEEP AGAIN *****

pause
exit

:NV
@setlocal ENABLEDELAYEDEXPANSION
@setlocal enableextensions
@ECHO Start -----------------------------------------------------------------------------------------
@echo on

REM note: parameter modifiers such as ~dpn1 take no trailing percent sign
SET vs_file1=%~dpnx0-resize-!header!.vpy
set PARF1=%~f1
set PARF2=%~dpn1
set pard2v=%~dpn1.d2v
set pardgi=%~dpn1.dgi

REM update: set the output path to a subfolder
set PARaacmp4=%~dp2%~n1.aac.MP4
set PARmp3mp4=%~dp2%~n1.mp3.MP4
REM pause

set PARtemp=%SCRATCHPATH%%~nx1-temp.MP4
set paraac=%SCRATCHPATH%%~nx1-temp.aac
set parmp3=%SCRATCHPATH%%~nx1-temp.mp3

DEL "!vs_file1!"
DEL "!PARF2!"
DEL "!pard2v!"
DEL "!pardgi!"

REM quote the paths so that file names containing spaces are handled
DEL "!PARaacmp4!"
DEL "!PARmp3mp4!"
DEL "!PARtemp!"
DEL "!paraac!"
DEL "!parmp3!"

REM ------------------------------ audio parameters ------------------------------ 
set audiofreq=48000
set audiobitrate=256k
REM no leading zeros here: SET /A would treat 010 as octal (i.e. 8)
SET audiodelayadjms=10
SET AudioDelayms=000
REM audiodelayadjms delays audio just a tad extra .01s so that it comes out after the lips move
REM AudioDelayms    is the final calculated delay for audio with audiodelayadjms added to it
SET lI=-16
SET lTP=0.0
SET lLRA=11
REM ------------------------------ audio parameters ------------------------------ 
REM -------------------- find the audio delay (usually negative) in the .TS file --------------------
REM audiodelayadjms delays audio just a tad extra .01s so that it comes out after the lips move
REM AudioDelayms    is the final calculated delay for audio with audiodelayadjms added to it
"!mediainfoexe!" "--Inform=Audio;%%Video_Delay%%" "%~dpnx1"  > "%tempfile%"
set /p AudioDelayms=< "%tempfile%"
type  "%tempfile%" 
IF /I "!AudioDelayms!" == "" (set AudioDelayms=0)
ECHO unadjusted AudioDelayms=!AudioDelayms! in file "%~dpnx1" 
REM seems to need a 10ms adjustment, so add 10ms to the already negative delay
ECHO AudioDelayms="!AudioDelayms!" 
ECHO audiodelayadjms="!audiodelayadjms!" 
set /a AudioDelayms+=!audiodelayadjms!
ECHO adjusted AudioDelayms=!AudioDelayms! in file "%~dpnx1" 
REM since we use .mp4 intermediary files, simply use the notional 10ms adjustment and nothing more
set AudioDelayms=
set AudioDelayms=!audiodelayadjms!
ECHO re-adjusted AudioDelayms=!AudioDelayms! back to !audiodelayadjms! in file "%~dpnx1" 
REM --------------------------------------------------------------------------

set Fps=25
set FF=TFF
SET CRF=22

REM +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
REM +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

set PARF1tmp=%~dpn1%-temp.txt
"%mediainfoexe%" --Inform=Video;%%FrameCount%% "%PARF1%" 
"%mediainfoexe%" --Inform=Video;%%FrameCount%% "%PARF1%" > "%PARF1tmp%"
FOR /F "usebackq tokens=1" %%G IN ( "%PARF1tmp%" ) DO SET FRAMES=%%G
REM --frames "%FRAMES%" 

"%mediainfoexe%" "--Inform=Video;%%FrameRate%%" "%~dpnx1"
"%mediainfoexe%" "--Inform=Video;%%FrameRate%%" "%~dpnx1" > "%PARF1tmp%"
set /p fr=< "%PARF1tmp%"
type "%PARF1tmp%"
set fr
REM
"%mediainfoexe%" "--Inform=Video;%%Width%%" "%~dpnx1"
"%mediainfoexe%" "--Inform=Video;%%Width%%" "%~dpnx1" > "%PARF1tmp%"
set /p w=< "%PARF1tmp%"
type "%PARF1tmp%"
set w
REM
"%mediainfoexe%" "--Inform=Video;%%Height%%" "%~dpnx1"
"%mediainfoexe%" "--Inform=Video;%%Height%%" "%~dpnx1" > "%PARF1tmp%"
set /p h=< "%PARF1tmp%"
type "%PARF1tmp%"
set h
REM
"%mediainfoexe%" "--Inform=Video;%%DisplayAspectRatio/String%%" "%~dpnx1" 
"%mediainfoexe%" "--Inform=Video;%%DisplayAspectRatio/String%%" "%~dpnx1" > "%PARF1tmp%"
set /p darS=< "%PARF1tmp%"
type "%PARF1tmp%"
set darS
DEL "%PARF1tmp%"

@echo off
set vbs2=.\evalAR.vbs
del "%vbs2%" > NUL: 2>&1
echo option explicit >> "%vbs2%"
echo dim inp, ninp >> "%vbs2%"
echo '''WScript.Echo Eval(WScript.Arguments(0)) >> "%vbs2%"
echo inp = rtrim(ltrim(WScript.Arguments(0))) >> "%vbs2%"
echo if inp="16:9" then >> "%vbs2%"
echo    inp = "16/9" >> "%vbs2%"
echo elseif inp="4:3" then >> "%vbs2%"
echo    inp = "4/3" >> "%vbs2%"
echo else >> "%vbs2%"
echo    ninp = CDbl(inp) >> "%vbs2%"
echo    if ninp ^> 1.6 then >> "%vbs2%"
echo       inp = "16/9" >> "%vbs2%"
echo    else >> "%vbs2%"
echo       inp = "4/3" >> "%vbs2%"
echo    end if >> "%vbs2%"
echo end if >> "%vbs2%"
echo WScript.Echo inp >> "%vbs2%"
for /f %%n in ('cscript //nologo "%vbs2%" "%darS%"') do (set theAR=%%n)
echo theAR="%theAR%"
del "%vbs2%"

REM set the SAR depending on PAL or not (eg NTSC framerate)
REM IF "%fr%"=="25.000" (
REM    IF "%theAR%"=="16/9" (SET theSAR=64:45)
REM    IF "%theAR%"=="16/9" (SET theSARs=64_45)
REM    IF "%theAR%"=="4/3"  (SET theSAR=12:11)
REM    IF "%theAR%"=="4/3"  (SET theSARs=12_11)
REM ) ELSE (
REM    IF "%theAR%"=="16/9" (SET theSAR=40:33)
REM    IF "%theAR%"=="16/9" (SET theSARs=40_33)
REM    IF "%theAR%"=="4/3"  (SET theSAR=10:11)
REM    IF "%theAR%"=="4/3"  (SET theSARs=10_11)
REM )
REM IF %w% GEQ 720 (SET theSAR=1:1)
REM IF %w% GEQ 720 (SET theSARs=1_1)
IF "%fr%"=="25.000" (
   IF "%theAR%"=="16/9" (SET theSAR=64:45)
   IF "%theAR%"=="16/9" (SET theSARs=64_45)
   IF "%theAR%"=="4/3"  (SET theSAR=12:11)
   IF "%theAR%"=="4/3"  (SET theSARs=12_11)
REM over-ride 16:9 for the time being with 1:1
   IF "%theAR%"=="16/9" (SET theSAR=1:1)
   IF "%theAR%"=="16/9" (SET theSARs=1_1)
) ELSE (
   IF "%theAR%"=="16/9" (SET theSAR=40:33)
   IF "%theAR%"=="16/9" (SET theSARs=40_33)
   IF "%theAR%"=="4/3"  (SET theSAR=10:11)
   IF "%theAR%"=="4/3"  (SET theSARs=10_11)
REM over-ride 16:9 for the time being with 1:1
   IF "%theAR%"=="16/9" (SET theSAR=1:1)
   IF "%theAR%"=="16/9" (SET theSARs=1_1)
)
ECHO adjusting theSAR based on clip width - w="%w%"  h="%h%"
ECHO fr="%fr%"
ECHO darS="%darS%"
ECHO theAR="%theAR%"
ECHO theSAR="%theSAR%"
ECHO theSARs="%theSARs%"
IF %w% GEQ 720 (SET theSAR=1:1)
IF %w% GEQ 720 (SET theSARs=1_1)
ECHO after theSAR="%theSAR%"
ECHO after theSARs="%theSARs%"
REM +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
REM +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

REM ------------------------------ audio conversion ------------------------------ 
@echo on
REM adjust Audio volume.
SET jsonFile=%paraac%.json
REM find the loudness parameters in a first pass
"%ffmpegexex64%" -nostdin -y -hide_banner -threads 0 -i "%PARF1%" -nostats -vn -af loudnorm=I=%lI%:TP=%lTP%:LRA=%lLRA%:print_format=json -nostats -f null - 2> "%jsonFile%"  
SET EL=!ERRORLEVEL!
IF /I "!EL!" NEQ "0" (
   Echo *********  Error !EL! was found 
   Echo *********  Error !EL! was found 
   Echo *********  Error !EL! was found 
   Echo *********  ABORTING ... 
   EXIT !EL!
)
REM
@echo off
REM all the trickery below is simply to remove quotes and tabs and spaces from the json single-level response
set input_i=
set input_tp=
set input_lra=
set input_thresh=
set target_offset=
for /f "tokens=1,2 delims=:, " %%a in (' find ":" ^< "%jsonFile%" ') do (
   set "var="
   for %%c in (%%~a) do set "var=!var!,%%~c"
   set var=!var:~1!
   set "val="
   for %%d in (%%~b) do set "val=!val!,%%~d"
   set val=!val:~1!
REM   echo .!var!.=.!val!.
   IF "!var!" == "input_i"         set !var!=!val!
   IF "!var!" == "input_tp"        set !var!=!val!
   IF "!var!" == "input_lra"       set !var!=!val!
   IF "!var!" == "input_thresh"    set !var!=!val!
   IF "!var!" == "target_offset"   set !var!=!val!
)
echo.input_i=%input_i%
echo.input_tp=%input_tp%
echo.input_lra=%input_lra%
echo.input_thresh=%input_thresh%
echo.target_offset=%target_offset%

REM later, in a second encoding pass we MUST down-convert from 192k (loudnorm upsamples it to 192k which is way way too high) ... use  -ar 48k or -ar 48000

set loudnormfilter=loudnorm=I=%lI%:TP=%lTP%:LRA=%lLRA%:measured_I=%input_i%:measured_LRA=%input_lra%:measured_TP=%input_tp%:measured_thresh=%input_thresh%:offset=%target_offset%:linear=true:print_format=summary
echo %loudnormfilter%

@echo on

"%ffmpegexex64%" -threads 0 -nostats -threads 0 -i "%~1" -threads 0 -vn -threads 0 -map_metadata -1 -af %loudnormfilter% -c:a libfdk_aac -cutoff 18000 -ab %audiobitrate% -ar %audiofreq% -threads 0 -y "%paraac%" 
SET EL=!ERRORLEVEL!
IF /I "!EL!" NEQ "0" (
   Echo *********  Error !EL! was found 
   Echo *********  Error !EL! was found 
   Echo *********  Error !EL! was found 
   Echo *********  ABORTING ... 
   EXIT !EL!
)
REM
REM ------------------------------ audio conversion ------------------------------ 

REM +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
REM +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
REM video conversion
REM
REM change - index using both non-NV and NV, it doesn't take long
REM --- non-NV -----------------------------------------------------------------
REM "%VSdgindexEXE%" -IF=[%PARF1%] -OF=[%PARF2%] -IA=3 -OM=2 -DRC=0 -DSD=0 -EXIT 
REM --- non-NV -----------------------------------------------------------------
REM --- NV -----------------------------------------------------------------
"%VSdgindexnvEXE%" -i "%PARF1%" -a -o "%pardgi%" -e  
del .\*.aac > NUL:
del .\*.mp2 > NUL:
del .\*.mp3 > NUL:
REM --- NV -----------------------------------------------------------------

REM --- create the .vpy -----------------------------------------------------------------
ECHO off
ECHO #%vs_file1% > "%vs_file1%"
ECHO import vapoursynth as vs >> "%vs_file1%"
ECHO ##import havsfunc as haf # http://forum.doom9.org/showthread.php?t=166582 >> "%vs_file1%"
ECHO import havsfuncTS as haf # this version uses vanilla TemporalSoften instead of TemporalSoften2, as it will be "better" over time >> "%vs_file1%"
ECHO import mvsfunc as mvs  # http://forum.doom9.org/showthread.php?t=172564 >> "%vs_file1%"
ECHO import finesharp as finesharp # http://forum.doom9.org/showthread.php?p=1777815#post1777815 http://avisynth.nl/index.php/FineSharp >> "%vs_file1%"
ECHO import Plum # http://forum.doom9.org/showthread.php?t=173775 >> "%vs_file1%"
ECHO #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO # INITIALIZE - load the core and then DLLs >> "%vs_file1%"
ECHO core = vs.get_core(accept_lowercase=True) # leave off threads=8 so it auto-detects threads >> "%vs_file1%"
ECHO #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO # LOAD NATIVE plugins which aren't already autoloaded (we don't autoload any) >> "%vs_file1%"
ECHO # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO # The functions of MaskTools are included in the Vapoursynth core - http://forum.doom9.org/showthread.php?p=1753688#post1753688 >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\AddGrain.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO ##core.std.LoadPlugin(r'%VSpathDLL64%\Bilateral.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\BM3D.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\d2vsource.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\Deblock.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\DFTTest.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO ##core.std.LoadPlugin(r'%VSpathDLL64%\ffms2.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\fmtconv.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\KNLMeansCL.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\libawarpsharp2.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO ##core.std.LoadPlugin(r'%VSpathDLL64%\libbifrost.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO ##core.std.LoadPlugin(r'%VSpathDLL64%\libfftw3f-3.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\libmvtools.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\libnnedi3.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO ##core.std.LoadPlugin(r'%VSpathDLL64%\temporalsoften2.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\scenechange.dll') # http://forum.doom9.org/showthread.php?t=166769 >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\vcfreq.dll') # http://forum.doom9.org/showthread.php?t=171413 http://www.avisynth.nl/users/vcmohan/index.html >> "%vs_file1%"
ECHO ##core.std.LoadPlugin(r'%VSpathDLL64%\vsavsreader.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\vsdctfilter.dll')  # http://vfrmaniac.fushizen.eu/works >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\vsfft3dfilter.dll')  # http://vfrmaniac.fushizen.eu/works >> "%vs_file1%"
ECHO ##core.std.LoadPlugin(r'%VSpathDLL64%\vslsmashsource.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO core.std.LoadPlugin(r'%VSpathDLL64%\Yadifmod.dll') # the r'' indicates do not treat special characters and accept backslashes >> "%vs_file1%"
ECHO #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO # LOAD 64 bit AVISYNTH  plugins  into the avs namespace >> "%vs_file1%"
ECHO core.avs.LoadPlugin(r'%VSdgdecodenvDLL64%')  >> "%vs_file1%"
ECHO #core.avs.LoadPlugin(r'%VSdgdecodeDLL64%')  >> "%vs_file1%"
ECHO #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO # MAIN CODE GOES BELOW, MY FUNCTIONS ARE DEFINED AT THE END, and the MAIN IF test below that invokes main >> "%vs_file1%"
ECHO # NOTE:: INDENTING IS CRITICAL AND MUST BE PRECISELY THE SAME WITHIN EACH FUNCTION  >> "%vs_file1%"
ECHO # >> "%vs_file1%"
ECHO def main(): >> "%vs_file1%"
ECHO     #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO     # LOAD VIDEO >> "%vs_file1%"
ECHO     #  >> "%vs_file1%"
ECHO     # 0. example ffms2 (remember to loadplugin ffms2 if doing this option) >> "%vs_file1%"
ECHO     # video = core.ffms2.Source(source=r'%PARF1%') >> "%vs_file1%"
ECHO     #  >> "%vs_file1%"
ECHO     # 1. example d2vsource from plain dgindex >> "%vs_file1%"
ECHO     # core.std.LoadPlugin(r'%VSpathDLL64%\d2vsource.dll') then >> "%vs_file1%"
ECHO     # video = core.d2v.Source(r'%pard2v%') >> "%vs_file1%"
ECHO     #  >> "%vs_file1%"
ECHO     # 2. example LWLibavSource for mp4s etc >> "%vs_file1%"
ECHO     # core.std.LoadPlugin(r'%VSpathDLL64%\vslsmashsource.dll') then >> "%vs_file1%"
ECHO     # video = core.lsmas.LWLibavSource(r'%PARF1%') >> "%vs_file1%"
ECHO     #  >> "%vs_file1%"
ECHO     # 3. a DGIndexNV file (remember to loadplugin avs DGDecodeNV.dll if doing this option) >> "%vs_file1%"
ECHO     # core.avs.LoadPlugin(r'%VSdgdecodenvDLL64%') >> "%vs_file1%"
ECHO     #video = core.avs.DGSource(r'%pardgi%',deinterlace=0) # deinterlace=0 means no deinterlacing >> "%vs_file1%"
ECHO     #video = core.avs.DGSource(r'%pardgi%',deinterlace=1) # deinterlace=1 means single rate deinterlacing >> "%vs_file1%"
ECHO     #  >> "%vs_file1%"
ECHO     video = core.avs.DGSource(r'%pardgi%',deinterlace=1,resize_w=720,resize_h=576) # deinterlace=1 means single rate deinterlacing >> "%vs_file1%"
ECHO     #  >> "%vs_file1%"
ECHO     #video = core.avs.DGSource(r'%pardgi%',deinterlace=2) # deinterlace=2 means double rate deinterlacing, beware extra frame >> "%vs_file1%"
ECHO     # video = core.avs.DGSource(r'%pardgi%',deinterlace=2,resize_w=720,resize_h=576) # deinterlace=2 means double rate deinterlacing, beware extra frame 0 >> "%vs_file1%"
ECHO     # If using double-framerate NV, fix the double-framerate bug in NV per http://forum.doom9.org/showthread.php?p=1391556#post1391556 >> "%vs_file1%"
ECHO     #video = core.std.Trim(video,first=1) # *** fix a double-framerate bug in NV per http://forum.doom9.org/showthread.php?p=1391556#post1391556 *** >> "%vs_file1%"
ECHO     #  >> "%vs_file1%"
ECHO     # 4. a DGIndex file (remember to loadplugin avs DGDecode.dll if doing this option) >> "%vs_file1%"
ECHO     # video = core.avs.MPEG2Source(r'%pard2v%',info=0,iPP=True,cpu=6)  # DEBLOCK and DERING >> "%vs_file1%"
ECHO     # video = core.avs.MPEG2Source(r'%pard2v%',info=0,iPP=True,cpu=4)  # DEBLOCK >> "%vs_file1%"
ECHO     # video = core.avs.MPEG2Source(r'%pard2v%',info=0,iPP=True,cpu=0) >> "%vs_file1%"
ECHO     #    ipp=true interlaced post-processing >> "%vs_file1%"
ECHO     #    cpu=4    DEBLOCK_Y_H, DEBLOCK_Y_V, DEBLOCK_C_H, DEBLOCK_C_V >> "%vs_file1%"
ECHO     #    cpu=6    DEBLOCK_Y_H, DEBLOCK_Y_V, DEBLOCK_C_H, DEBLOCK_C_V, DERING_Y, DERING_C >> "%vs_file1%"
ECHO     #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO     # INITIAL TRIM AND DEBUG IF REQUIRED >> "%vs_file1%"
ECHO     # Display the clip information in the top left corner of the video output >> "%vs_file1%"
ECHO     # video = core.text.ClipInfo(video) >> "%vs_file1%"
ECHO     # --- Trim  >> "%vs_file1%"
ECHO     # video = core.std.Trim(video,first=50,last=100) >> "%vs_file1%"
ECHO     # video = core.std.Trim(video,first=0,length=3000) # 2 minutes >> "%vs_file1%"
ECHO     # --- Crop >> "%vs_file1%"
ECHO     # video = core.std.CropRel(video, left=leftv, right=rightv, top=topv, bottom=bottomv) >> "%vs_file1%"
ECHO     # --- Add borders >> "%vs_file1%"
ECHO     # video = core.std.AddBorders(video, left=leftv, right=rightv, top=topv, bottom=bottomv)     # defaults to black >> "%vs_file1%"
ECHO     #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO     #video = core.knlm.KNLMeansCL(video, device_type="gpu", d=1, a=2) >> "%vs_file1%"
ECHO     ##video = core.knlm.KNLMeansCL(video, device_type="gpu", device_id=0, d=1, a=2, info=True) >> "%vs_file1%"
ECHO     #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO     # FINISHED - output the video >> "%vs_file1%"
ECHO     video.set_output() >> "%vs_file1%"
ECHO     return True >> "%vs_file1%"
ECHO     #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO     # END OF INDENTING  >> "%vs_file1%"
ECHO     #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO     # ready for processing by X264 >> "%vs_file1%"
ECHO     #    vspipe --y4m script.vpy - "pipe symbol" x264 --demuxer y4m - --output encoded.mkv >> "%vs_file1%"
ECHO     # ready for processing by FFMPEG >> "%vs_file1%"
ECHO     #    vspipe --y4m script.vpy - "pipe symbol" ffmpeg -i pipe: encoded.mkv >> "%vs_file1%"
ECHO # >> "%vs_file1%"
ECHO #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO #if __name__ == "__main__": >> "%vs_file1%"
ECHO #    # execute main only if run as a script >> "%vs_file1%"
ECHO #    main() >> "%vs_file1%"
ECHO main() >> "%vs_file1%"
ECHO #---------------------------------------------------------------------------------------------------- >> "%vs_file1%"
ECHO # >> "%vs_file1%"
@ECHO on
REM --- create the .vpy -----------------------------------------------------------------

REM --- encode the .vpy -----------------------------------------------------------------

echo resized by DGDecodeNV 
SET bfflags=-bf 2 -g 50 -refs 3
REM -refs 3 = We will also limit to 3 reference frames
REM -bf 2 = add B-frames. These are the most efficient frames in the H.264 standard. 
REM -g 50 = limit to GOP size 50
REM
"%VSpipeEXE64%" --y4m "%vs_file1%" - | "%ffmpegexex64%" -threads 0 -i pipe: -threads 0 -an -threads 0 -map_metadata -1 -sws_flags lanczos+accurate_rnd+full_chroma_int+full_chroma_inp -filter:v unsharp=opencl=1:luma_msize_x=3:luma_msize_y=3:luma_amount=0.5:chroma_msize_x=3:chroma_msize_y=3:chroma_amount=0.5,setdar=dar=16/9 -c:v h264_nvenc -preset hq %bfflags% -rc:v constqp -global_quality %CRF% -profile:v high -level 4.1 -pixel_format yuv420p  -threads 0 -movflags +faststart -y "%PARtemp%"
SET EL=!ERRORLEVEL!
IF /I "!EL!" NEQ "0" (
   Echo *********  Error !EL! was found 
   Echo *********  Error !EL! was found 
   Echo *********  Error !EL! was found 
   Echo *********  ABORTING ... 
   EXIT !EL!
)

REM +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

REM MUX the result

"%mp4boxexex32%" -add "%PARtemp%":lang=eng -add "%paraac%":lang=eng:delay=!AudioDelayms! -isma -new "%PARaacmp4%" 
SET EL=!ERRORLEVEL!
IF /I "!EL!" NEQ "0" (
   Echo *********  Error !EL! was found 
   Echo *********  Error !EL! was found 
   Echo *********  Error !EL! was found 
   Echo *********  ABORTING ... 
   EXIT !EL!
)
REM pause

REM +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
REM +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

DEL "!jsonFile!"
DEL "!vs_file1!"
DEL "!PARF2!"
DEL "!pard2v!"
DEL "!pardgi!"
DEL "!PARtemp!"
DEL "!paraac!"
DEL "!parmp3!"

REM SAR (sample aspect ratio) = DAR / (width / height), e.g. 720x576 at 16:9: (16/9) / (720/576) = 64/45
REM for 16:9 720x576i   use --sar=64:45 (most 4:3 are in 16:9 shell frame nowadays, so use this)  
REM for 16:9 1440x1080i use --sar=4:3  
REM for 16:9 1920x1080i use --sar=1:1  
REM
REM http://developer.divx.com/docs/divx_plus_hd/Creation_with_x264/
REM Interlaced resolution    Supported SARs
REM 1920x1080i50                  1:1   (16:9 frame)
REM 1440x1080i50                  1:1   (4:3 frame),  4:3 (16:9 frame)
REM 720x576i50              1:1, 64:45 (16:9 frame), 12:11 (4:3 frame)
REM 704x576i50              1:1, 64:45 (16:9 frame), 12:11 (4:3 frame)
REM 480x576i50              1:1, 24:11 (16:9 frame), 18:11 (4:3 frame)
REM 352x576i50              1:1, 32:11 (16:9 frame), 24:11 (4:3 frame)
REM 
REM 1920x1080i60                  1:1   (16:9 frame)
REM 1440x1080i60                  1:1   (4:3 frame),  4:3 (16:9 frame)
REM 720x480i60              1:1, 40:33 (16:9 frame), 10:11 (4:3 frame)
REM 704x480i60              1:1, 40:33 (16:9 frame), 10:11 (4:3 frame)
REM 640x480i60                    1:1   (4:3 frame)
REM 480x480i60              1:1, 20:11 (16:9 frame), 15:11 (4:3 frame)
REM 352x480i60              1:1, 80:33 (16:9 frame), 20:11 (4:3 frame)
REM 
@echo off
@ECHO End -----------------------------------------------------------------------------------------
GOTO :EOF

:LoCase
:: Subroutine to convert a variable VALUE to all lower case.
:: The argument for this subroutine is the variable NAME.
FOR %%i IN ("A=a" "B=b" "C=c" "D=d" "E=e" "F=f" "G=g" "H=h" "I=i" "J=j" "K=k" "L=l" "M=m" "N=n" "O=o" "P=p" "Q=q" "R=r" "S=s" "T=t" "U=u" "V=v" "W=w" "X=x" "Y=y" "Z=z") DO CALL SET "%1=%%%1:%%~i%%"
GOTO:EOF
:UpCase
:: Subroutine to convert a variable VALUE to all UPPER CASE.
:: The argument for this subroutine is the variable NAME.
FOR %%i IN ("a=A" "b=B" "c=C" "d=D" "e=E" "f=F" "g=G" "h=H" "i=I" "j=J" "k=K" "l=L" "m=M" "n=N" "o=O" "p=P" "q=Q" "r=R" "s=S" "t=T" "u=U" "v=V" "w=W" "x=X" "y=Y" "z=Z") DO CALL SET "%1=%%%1:%%~i%%"
GOTO:EOF
:TCase
:: Subroutine to convert a variable VALUE to Title Case.
:: The argument for this subroutine is the variable NAME.
FOR %%i IN (" a= A" " b= B" " c= C" " d= D" " e= E" " f= F" " g= G" " h= H" " i= I" " j= J" " k= K" " l= L" " m= M" " n= N" " o= O" " p= P" " q= Q" " r= R" " s= S" " t= T" " u= U" " v= V" " w= W" " x= X" " y= Y" " z= Z") DO CALL SET "%1=%%%1:%%~i%%"
GOTO:EOF

EXIT

REM 50fps
REM removed -ar %audiofreq% due to a bug in ffmpeg: it still reports the audio as 48000 even though it was downconverted to 44100
REM "%NVENCFFMPEGEXE%" -i "%fINPUT%" -map_metadata -1 -filter:v yadif=1:0:0,setdar=dar=16/9,unsharp -r 50 -c:v nvenc -preset hq -b:v 3500k -minrate 500k -maxrate 6000k -profile:v high -level 4.1 -coder 0 -bf 3 -g 50 -movflags +faststart -c:a libmp3lame -ar %audiofreq% -ab %audiobitrate%  -y "%fOUTPUT%"

REM 25 FPS
REM unsharp masking
REM removed -ar %audiofreq% due to a bug in ffmpeg: it still reports the audio as 48000 even though it was downconverted to 44100
"%NVENCFFMPEGEXE%" -i "%fINPUT%" -map_metadata -1 -filter:v yadif=0:0:0,setdar=dar=16/9,unsharp -r 25 -c:v nvenc -preset hq -b:v 3500k -minrate 500k -maxrate 6000k -profile:v high -level 4.1 -coder 0 -bf 3 -g 50 -movflags +faststart -c:a libmp3lame -ar %audiofreq% -ab %audiobitrate% -y "%fOUTPUT%"

--qpmin 8 --qpmax 36 --qpstep 4 

  -preset            <string>     E..V.... Set the encoding preset (one of slow = hq 2pass, medium = hq, fast = hp, hq, hp, bd, ll, llhq, llhp, default) (default "hq")
  -profile           <string>     E..V.... Set the encoding profile (high, main, baseline or high444p)
  -level             <string>     E..V.... Set the encoding level restriction (auto, 1.0, 1.0b, 1.1, 1.2, ..., 4.2, 5.0, 5.1)
  -tier              <string>     E..V.... Set the encoding tier (main or high)
  -cbr               <boolean>    E..V.... Use cbr encoding mode (default false)
  -2pass             <boolean>    E..V.... Use 2pass encoding mode (default auto)
  -gpu               <int>        E..V.... Selects which NVENC capable GPU to use. First GPU is 0, second is 1, and so on. (from 0 to INT_MAX) (default 0)
  -delay             <int>        E..V.... Delays frame output by the given amount of frames. (from 0 to INT_MAX) (default INT_MAX)

REM use -filter:v yadif=0:0:0,setdar=16:9 -r 25
REM use -filter:v yadif=1:1:0,setdar=16:9 -r 50 

REM
REM for interlaced (do not use, tablet doesn't like it)
REM -r rate             set frame rate (Hz value, fraction or abbreviation)
REM -s size             set frame size (WxH or abbreviation)
REM yadif
REM To have a constant quality (but a variable bitrate), use the option ’-qscale n’ when ’n’ is between 1 (excellent quality) and 31 (worst quality)
REM -coder 0 means nocabac
Rem Tff=1 Bff=0
REM set FieldFirst=1
rem INTERLACED FLAGS -ilme -ildct -flags +ildct+ilme -ildctcmp satd -top %FieldFirst% 
REM "C:\SOFTWARE\ffmpeg\0-latest\bin\ffmpeg.exe" -y -i "%fINPUT%" -c:v h264 -preset fast -flags +ildct+ilme -ildctcmp satd -top %FieldFirst% -profile:v high -level 4.1 -qscale:v %qscale% -coder 0 -an "%fOUTPUTv%"
REM
REM http://ffmpeg.org/ffmpeg-filters.html#yadif
REM 8.58 yadif
REM 
REM Deinterlace the input video ("yadif" means "yet another deinterlacing filter").
REM It accepts the optional parameters: mode:parity:auto.
REM mode specifies the interlacing mode to adopt, accepts one of the following values:
REM ‘0’    output 1 frame for each frame 
REM ‘1’    output 1 frame for each field 
REM ‘2’    like 0 but skips spatial interlacing check 
REM ‘3’    like 1 but skips spatial interlacing check 
REM Default value is 0.
REM parity specifies the picture field parity assumed for the input interlaced video, accepts one of the following values:
REM ‘0’    assume top field first 
REM ‘1’    assume bottom field first 
REM ‘-1’   enable automatic detection 
REM Default value is -1. If interlacing is unknown or decoder does not export this information, top field first will be assumed.
REM auto specifies if deinterlacer should trust the interlaced flag and only deinterlace frames marked as interlaced
REM ‘0’    deinterlace all frames 
REM ‘1’    only deinterlace frames marked as interlaced 
REM Default value is 0. 
REM -filter:v yadif=1:0:0 -r 50
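
The --sar values noted in the script comments for 16:9 frames (64:45, 4:3, 1:1) can be checked with a little arithmetic, since DAR = SAR x (width / height). Below is a minimal Python sketch of that calculation, assuming a plain 16:9 target. The function name sample_aspect_ratio is made up for illustration and is not part of ffmpeg, x264 or any other tool. Table entries such as 12:11 follow a different convention and will not come out of this simple formula.

```python
from fractions import Fraction


def sample_aspect_ratio(width: int, height: int, dar: Fraction) -> Fraction:
    """Return the SAR that makes a width x height frame display at `dar`.

    Hypothetical helper for illustration only.
    DAR = SAR * (width / height)  =>  SAR = DAR * height / width
    """
    return dar * Fraction(height, width)


if __name__ == "__main__":
    widescreen = Fraction(16, 9)
    # The three 16:9 cases listed in the script comments.
    for w, h in [(720, 576), (1440, 1080), (1920, 1080)]:
        sar = sample_aspect_ratio(w, h, widescreen)
        print(f"{w}x{h}: --sar={sar.numerator}:{sar.denominator}")
```

Using exact fractions rather than floats keeps the result in the reduced N:D form that the --sar option expects.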

Comments welcomed on links to and views on other GPU based filters or facilities such as sharpeners or denoisers.

Last edited by hydra3333; 26th Jan 2017 at 02:09. Reason: clarified x64

26th Jan 2017 08:55 #3

poisondeathray

Member

Originally Posted by hydra3333

Comments welcomed on links to and views on other GPU based filters or facilities such as sharpeners or denoisers.

I didn't look through everything closely, but there is another category: VPP filters. I don't think they have made it into ffmpeg yet, but qsvenc has them. Or maybe they are in ffmpeg now; I haven't checked recently. I think they are either fully or partially accelerated by the iGPU. There might be more discussion at doom9, maybe in the software playback forum (they are filters meant for playback).

--vpp-detail-enhance would be similar to sharpening, and --vpp-denoise would be denoising. I only tested them briefly, and the quality isn't very good (but I'm coming from avisynth/vapoursynth). Neither is the deinterlacing (compared to QTGMC).

QSVEncC

Code:

VPP Options:
   --vpp-denoise <int>          use vpp denoise, set strength (0-100)
   --vpp-detail-enhance <int>   use vpp detail enahancer, set strength (0-100)
   --vpp-deinterlace <string>   set vpp deinterlace mode
                                 - none     disable deinterlace
                                 - normal   normal deinterlace
                                 - it       inverse telecine
                                 - bob      double framerate
   --vpp-image-stab <string>    set image stabilizer mode
                                 - none, upscale, box
   --vpp-sub [<int>] or [<string>]
                                burn in subtitle into frame
                                set sub track number in input file by integer
                                or set external sub file path by string.
   --vpp-sub-charset [<string>] set subtitle char set
   --vpp-sub-shaping <string>   simple(default), complex
   --vpp-delogo <string>        set delogo file path
   --vpp-delogo-select <string> set target logo name or auto select file
                                 or logo index starting from 1.
   --vpp-delogo-pos <int>:<int> set delogo pos offset
   --vpp-delogo-depth <int>     set delogo depth [default:128]
   --vpp-delogo-y  <int>        set delogo y  param
   --vpp-delogo-cb <int>        set delogo cb param
   --vpp-delogo-cr <int>        set delogo cr param
   --vpp-delogo-add             add logo mode
   --vpp-rotate <int>           rotate image
                                 90, 180, 270.
   --vpp-mirror <string>        mirror image
                                 - h   mirror in horizontal direction
                                 - v   mirror in vertical   direction
   --vpp-scaling <string>       set scaling quality
                                 - auto(default)
                                 - simple   use simple scaling
                                 - fine     use high quality scaling
   --vpp-half-turn              half turn video image
                                 unoptimized and very slow.

NVEncC

I don't know if all of them run on the GPU for NV. Deinterlace and resize do for sure. Not sure about gauss, knn.

Code:

   --vbv-bufsize <int>          set vbv buffer size (kbit) / Default: auto
   --vpp-deinterlace <string>   set deinterlace mode / Default: none
                                  none, bob, adaptive (normal)
                                  available only with avcuvid reader
   --vpp-resize <string>        default, nn, npp_linear, cubic
                                cubic_bspline, cubic_catmull, cubic_b05c03
                                super, lanczos, bilinear, spline36
                                 default: default
   --vpp-gauss <int>            disabled, 3, 5, 7
                                 default: disabled
   --vpp-knn [<param1>=<value>][,<param2>=<value>][...]     enable denoise filter by K-nearest neighbor.
    params
      radius=<int>              set radius of knn (default=3)
      strength=<float>          set strength of knn (default=0.08, 0.0-1.0)
      lerp=<float>              set balance of orig and blended pixel (default=0.20)
                                  lower value results strong denoise.
      th_lerp=<float>           set threshold for detecting edge (default=0.80, 0.0-1.0)
                                  higher value will preserve edge.
   --vpp-pmd [<param1>=<value>][,<param2>=<value>][...]     enable denoise filter by pmd.
    params
      apply_count=<int>         set count to apply pmd denoise (default=2)
      strength=<float>          set strength of pmd (default=100.00, 0.0-100.0)
      threshold=<float>         set threshold of pmd (default=100.00, 0.0-255.0)
                                  lower value will preserve edge.
   --vpp-delogo <string>        set delogo file path
   --vpp-delogo-select <string> set target logo name or auto select file
                                 or logo index starting from 1.
   --vpp-delogo-pos <int>:<int> set delogo pos offset
   --vpp-delogo-depth <int>     set delogo depth [default:144]
   --vpp-delogo-y  <int>        set delogo y  param
   --vpp-delogo-cb <int>        set delogo cb param
   --vpp-delogo-cr <int>        set delogo cr param
   --vpp-perf-monitor           check duration of each filter.
                                  may decrease overall transcode performance.

Also, I've found that GPU decoding (not indexed with dgindex; I mean directly with ffmpeg, or avcuvid for qsvencc/nvencc) sometimes has errors, mixed-up frames or wrong frames. For example, dropped frames are semi-common. It's probably not because of system instability or drivers, because I checked on several different stock (non-OC) computers. CPU decoding works mostly reliably (especially if checked through an avs or vpy), and DGNV works.

Last edited by poisondeathray; 26th Jan 2017 at 09:04.

26th Jan 2017 21:17 #4

hydra3333

Member

Originally Posted by poisondeathray

... I don't know if all of them run on the GPU for NV. Deinterlace and resize do for sure. Not sure about gauss, knn ...

Ok thank you I'll go look.
I hope vpp is not like nvidia's npp where they apparently provide little or no useful support to the extent that someone said it should be declared deprecated.
https://ffmpeg.zeranoe.com/forum/viewtopic.php?f=3&t=858&p=11934&hilit=npp#p11934

Originally Posted by poisondeathray

... Also, I've found that GPU decoding (not indexed with dgindex; I mean directly with ffmpeg, or avcuvid for qsvencc/nvencc) sometimes has errors, mixed-up frames or wrong frames. For example, dropped frames are semi-common. It's probably not because of system instability or drivers, because I checked on several different stock (non-OC) computers. CPU decoding works mostly reliably (especially if checked through an avs or vpy), and DGNV works.

Yes, I noticed similar horrible issues with ffmpeg's nvidia cuvid hardware accelerated mpeg2 decoding, to the extent that it was totally unusable almost all of the time.
DGdecodeNV works fine though, go figure.

Yes, ffmpeg's cpu decoding generally works, although it sometimes unexpectedly spits errors all the way through some mpeg2 outputs from "quick-stream-fixed" .TS files from VideoReDo (but ffmpeg still manages to produce a usable re-encoded file).

27th Jan 2017 04:00 #5

hydra3333

Member

Originally Posted by hydra3333

I hope vpp is not like nvidia's npp where they apparently provide little or no useful support to the extent that someone said it should be declared deprecated.
https://ffmpeg.zeranoe.com/forum/viewtopic.php?f=3&t=858&p=11934&hilit=npp#p11934

Well, just on that - rigaya has a website about the tool NVEncC as you know, http://rigaya34589.blog135.fc2.com/blog-entry-739.html
and that makes no bones about using nvidia's npp facilities based on nvidia cuda toolkit v8 at http://docs.nvidia.com/cuda/index.html#axzz4WxBhA7k1 and http://docs.nvidia.com/cuda/npp/index.html#axzz4WxBhA7k1
which may well be about or after the date of complaints about npp in http://ffmpeg.org/pipermail/ffmpeg-trac/2016-September/036160.html and http://ffmpeg.org/pipermail/ffmpeg-trac/2016-September/036609.html
... so nvidia cuda toolkit v8 may have addressed those issues and npp may be "ok" after all ?

I also see you can build NVEncC yourself from code at github https://github.com/rigaya/NVEnc which seems attractive.

28th Jan 2017 06:06 #6

videoh

Banned

Originally Posted by hydra3333

Yes, I noticed similar horrible issues with ffmpeg's nvidia cuvid hardware accelerated mpeg2 decoding, to the extent that it was totally unusable almost all of the time. DGdecodeNV works fine though, go figure.

This may not be a deficiency of ffmpeg. There is a regression in CUVID in recent nVidia driver versions that breaks MPEG2 streams containing sequence header changes. That includes inconsequential changes such as the bit_rate_value for the sequence. It's a totally brain-dead thing they did. A bug was reported and it will be fixed in a future driver. Meanwhile, DGDecNV implemented a workaround, but it cannot be complete; for example, if the stream changes quant matrices in the sequence header, the GOP will be broken and artifacts produced. See here for more details:

http://rationalqm.us/board/viewtopic.php?f=8&t=541

Of course, there could be issues in ffmpeg itself, but for linear play it seems unlikely.

Last edited by videoh; 28th Jan 2017 at 06:17.

28th Jan 2017 16:24 #7

hydra3333

Member

Ah. Thank you. We await a good driver, then.

27th Sep 2017 20:46 #8

hydra3333

Member

Originally Posted by videoh

This may not be a deficiency of ffmpeg. There is a regression in CUVID in recent nVidia driver versions that breaks MPEG2 streams containing sequence header changes. That includes inconsequential changes such as the bit_rate_value for the sequence. It's a totally brain-dead thing they did. A bug was reported and it will be fixed in a future driver.

Hey @videoh, do you know if the bug has been fixed in a "future driver" which has been released now ?

Not sure if this makes any difference to ffmpeg:
http://docs.nvidia.com/cuda/video-decoder/index.html

Video Decoder (PDF) - v9.0.176 (older) - Last updated September 22, 2017

NVIDIA Video Decoder (NVCUVID) is deprecated. Instead, use the NVIDIA Video Codec SDK.

Attempting to cross-compile a static ffmpeg using a variation of rdp's script under ubuntu, with an nvidia windows sdk was a right tight fair b*st*rd of a task for a beginner like me to try out and I gave up on it in the end. If I read this right, a windows target was not supported anyway http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#cross-platform

Table 5. Supported Target Arch/OS Combinations
              TARGET OS
TARGET ARCH   linux   darwin   android   qnx
x86_64        YES     YES      NO        NO
armv7l        YES     NO       YES       YES
aarch64       NO      NO       YES       NO
ppc64le       YES     NO       NO        NO

Last edited by hydra3333; 27th Sep 2017 at 21:23.

28th Sep 2017 08:22 #9

videoh

Banned

Originally Posted by hydra3333

Originally Posted by videoh

This may not be a deficiency of ffmpeg. There is a regression in CUVID in recent nVidia driver versions that breaks MPEG2 streams containing sequence header changes. That includes inconsequential changes such as the bit_rate_value for the sequence. It's a totally brain-dead thing they did. A bug was reported and it will be fixed in a future driver.

Hey @videoh, do you know if the bug has been fixed in a "future driver" which has been released now ?

It's fixed in the 385.41 driver I am running. I don't know which was the first driver to get the fix.

28th Nov 2017 07:01 #10

hydra3333

Member

This is just an fyi for ffmpeg users who use OpenCL for simple GPU accelerated decoding/encoding/filtering.

As at 2017.11.28,
- ffmpeg git master has changed the way OpenCL is used on the commandline
- additionally, NVDEC now supports decoding all of the following: H.264, HEVC, MPEG-1/2/4, VC1, VP8/9

Links:
https://ffmpeg.zeranoe.com/forum/viewtopic.php?f=7&t=5230&p=12840#p12840
https://ffmpeg.zeranoe.com/forum/viewtopic.php?f=42&p=12837#p12837

So, the old OpenCL approach no longer works, e.g. unsharp=opencl=1 is gone and the OpenCL unsharp mask is now a separate filter, unsharp_opencl.
An example of a commandline which does work (even if the options are not ideal) :-

".\ffmpeg.exe" -hide_banner -nostats -v verbose -init_hw_device opencl=ocl:1.0 -filter_hw_device ocl -i ".\test.mpg.mpg" -an -map_metadata -1 -sws_flags lanczos+accurate_rnd+full_chroma_int+full_chroma_inp -filter_complex "[0:v]yadif=0:0:0,hwupload,unsharp_opencl=lx=3:ly=3:la=0.5:cx=3:cy=3:ca=0.5,hwdownload,setdar=dar=16/9" -r 25 -c:v h264_nvenc -preset slow -bf 2 -g 50 -refs 3 -rc:v vbr_hq -rc-lookahead:v 32 -cq 22 -qmin 16 -qmax 25 -coder cabac -movflags +faststart -profile:v high -level 4.1 -pixel_format yuv420p -y ".\test.mpg-temp.MP4"

Note the use of hwupload and hwdownload around opencl usage to avoid abortworthy errors.
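
For anyone scripting this, here is a rough Python sketch of how the filter chain could be assembled so the OpenCL filter always ends up between hwupload and hwdownload. It only builds strings and does not run ffmpeg. The function names (opencl_device_args, opencl_unsharp_graph) and their parameters are made up for illustration; the option and filter names are the ones from the commandline in this post, and the device id 1.0 is just the value from my machine.

```python
from typing import List


def opencl_device_args(platform: int = 1, device: int = 0) -> List[str]:
    """Arguments that create an OpenCL device named 'ocl' and hand it to the filters.

    Hypothetical helper; platform/device are the numbers shown as e.g. 1.0.
    """
    return ["-init_hw_device", f"opencl=ocl:{platform}.{device}",
            "-filter_hw_device", "ocl"]


def opencl_unsharp_graph(luma_amount: float = 0.5, chroma_amount: float = 0.5,
                         msize: int = 3, dar: str = "16/9") -> str:
    """Build a -filter_complex string with unsharp_opencl wrapped in hwupload/hwdownload."""
    cpu_before = "yadif=0:0:0"            # deinterlace on the CPU first
    gpu_filter = (f"unsharp_opencl=lx={msize}:ly={msize}:la={luma_amount}"
                  f":cx={msize}:cy={msize}:ca={chroma_amount}")
    cpu_after = f"setdar=dar={dar}"       # back on the CPU after hwdownload
    chain = [cpu_before, "hwupload", gpu_filter, "hwdownload", cpu_after]
    return "[0:v]" + ",".join(chain)


if __name__ == "__main__":
    print(" ".join(opencl_device_args()))
    print(opencl_unsharp_graph())
```

Keeping the upload and download steps inside the helper means you can change the sharpening amounts without accidentally dropping one of them.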

The following commandlines are handy to know about (don't worry about the errors they generate) :-
Code:
".\ffmpeg.exe" -hide_banner -v verbose -init_hw_device list 
".\ffmpeg.exe" -hide_banner -v verbose -init_hw_device opencl 
".\ffmpeg.exe" -hide_banner -v verbose -init_hw_device opencl:1.0
This one in particular :-
Code:
".\ffmpeg.exe" -hide_banner -v verbose -init_hw_device opencl
is used to find out your nvidia graphics card's device id (eg 1.0) to be used in this option :-
Code:
-init_hw_device opencl=ocl:1.0
Cheers
