VideoHelp Forum
  1. ...now it's back to running at .25fps encode...
    And what was it beforehand?

    Did you change the torch add-on you were using, or did it stay the same?

    The video encoding call seems fine.
    1st thing I see:
    I suggested: CAS, DPIRDeblock, DeHalo_alpha, NNEDI3CL
    You used: CAS, DPIRDeblock, QTGMC, DeHalo_alpha, NNEDI3CL
    Not sure whether you did that in your previous encode.
    (You might want to enable OpenCL in QTGMC, but I doubt that is the reason for much of the slowdown.)
    Also, you are using DPIRDeblock with strength 30, not 15 as I suggested.

    You can also try clearing your temp folder, so that any saved TensorRT engine&co files will get recreated.

    Cu Selur
    users currently on my ignore list: deadrats, Stears555
  2. Member
    Join Date
    Jul 2024
    Location
    Tacoma, WA
    I was getting about 2.5-3fps for a full video encode I tested last night, which, while not great, was a lot better than the .25 obviously. I can live with that since most of my encoding is done in the middle of the night and while I’m at work. I used the 30 strength and the QTGMC denoiser on it for that one too and it seemed to work great as far as encoding speed. I don’t think I changed anything in the torch add-on (didn’t go into the folder). I uninstalled and deleted everything off my computer and am working on a reinstall tonight. Probably overkill but I wanted to clear everything out. Shouldn’t have messed with it this morning.
    Last edited by SVAPrjm; 28th Jul 2024 at 02:10.
  3. Member
    Join Date
    Jul 2024
    Location
    Tacoma, WA
    I reinstalled everything and set the filters exactly as you recommended, but I'm still not replicating what I did last night. I know DPIR is a bottleneck, but I can't figure out how I configured it to run decently well (for my hardware anyway), even with the recommended filter order.

    Code:
    x264 --preset veryfast --crf 18.00 --profile high --level 5.1 --ref 3 --direct auto --b-adapt 0 --sync-lookahead 18 --qcomp 0.50 --rc-lookahead 40 --qpmax 51 --partitions i4x4,p8x8,b8x8 --no-fast-pskip --subme 5 --aq-mode 0 --vbv-maxrate 300000 --vbv-bufsize 300000 --sar 32:27 --qpfile GENERATED_QP_FILE --non-deterministic --range tv --colormatrix bt470bg --demuxer raw --input-res 1920x1280 --input-csp i420 --input-range tv --input-depth 8 --fps 30000/1001 --output-depth 8 --output "C:\Users\Computer\AppData\Local\Temp\Visitors test13.264" -
    
    # Imports
    import vapoursynth as vs
    # getting Vapoursynth core
    import sys
    import os
    core = vs.core
    # Import scripts folder
    scriptPath = 'C:/Program Files/Hybrid/64bit/vsscripts'
    sys.path.insert(0, os.path.abspath(scriptPath))
    # loading plugins
    core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/ResizeFilter/nnedi3/vsznedi3.dll")
    core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/GrainFilter/RemoveGrain/RemoveGrainVS.dll")
    core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/Support/fmtconv.dll")
    core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/SharpenFilter/CAS/CAS.dll")
    core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/SourceFilter/LSmashSource/LSMASHSource.dll")
    # Import scripts
    import edi_rpow2
    import havsfunc
    import validate
    # Source: 'C:\Users\Computer\Videos\RJ videos\Videograss\(2017) Visitors (1)-003.mkv'
    # Current color space: YUV420P8, bit depth: 8, resolution: 720x480, frame rate: 29.97fps, scanorder: progressive, yuv luminance scale: limited, matrix: 470bg
    # Loading C:\Users\Computer\Videos\RJ videos\Videograss\(2017) Visitors (1)-003.mkv using LWLibavSource
    clip = core.lsmas.LWLibavSource(source="C:/Users/Computer/Videos/RJ videos/Videograss/(2017) Visitors (1)-003.mkv", format="YUV420P8", stream_index=0, cache=0, prefer_hw=0)
    frame = clip.get_frame(0)
    # Setting detected color matrix (470bg).
    clip = core.std.SetFrameProps(clip=clip, _Matrix=5)
    # setting color transfer (170), if it is not set.
    if validate.transferIsInvalid(clip):
      clip = core.std.SetFrameProps(clip=clip, _Transfer=6)
    # setting color primaries info (to 470), if it is not set.
    if validate.primariesIsInvalid(clip):
      clip = core.std.SetFrameProps(clip=clip, _Primaries=5)
    # setting color range to TV (limited) range.
    clip = core.std.SetFrameProps(clip=clip, _ColorRange=1)
    # making sure frame rate is set to 29.97fps
    clip = core.std.AssumeFPS(clip=clip, fpsnum=30000, fpsden=1001)
    # making sure the detected scan type is set (detected: progressive)
    clip = core.std.SetFrameProps(clip=clip, _FieldBased=0) # progressive
    # contrast sharpening using CAS
    clip = core.cas.CAS(clip=clip, sharpness=0.700)
    from vsdpir import dpir as DPIR
    # adjusting color space from YUV420P8 to RGBS for vsDPIRDeblock
    clip = core.resize.Bicubic(clip=clip, format=vs.RGBS, matrix_in_s="470bg", range_s="limited")
    # deblocking using DPIRDeblock
    clip = DPIR(clip=clip, strength=15.000, task="deblock", device_index=0, num_streams=3, trt=True, trt_cache_dir="")
    # adjusting color space from RGBS to YUV444P16 for vsDeHalo_Alpha
    clip = core.resize.Bicubic(clip=clip, format=vs.YUV444P16, matrix_s="470bg", range_s="limited", dither_type="error_diffusion")
    # applying dehalo using DeHalo_alpha
    clip = havsfunc.DeHalo_alpha(clip, rx=2.50)
    # resizing using ZNEDI3
    # current: 720x480 target: 1920x1280 -> pow: 4
    clip = edi_rpow2.nnedi3_rpow2(clip=clip, rfactor=4, nsize=3, nns=4) # 2880x1920
    # adjusting resizing
    clip = core.fmtc.resample(clip=clip, w=1920, h=1280, kernel="spline64", interlaced=False, interlacedd=False)# before YUV444P16 after YUV444P16
    # adjusting output color from: YUV444P16 to YUV420P8 for x264Model
    clip = core.resize.Bicubic(clip=clip, format=vs.YUV420P8, range_s="limited", dither_type="error_diffusion")
    # set output frame rate to 29.97fps (progressive)
    clip = core.std.AssumeFPS(clip=clip, fpsnum=30000, fpsden=1001)
    # output
    clip.set_output()
    Short of buying new hardware, I'm open to any suggestions as to what I am doing wrong.
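    For reference, the `pow: 4` in the resize step of the script above follows from how nnedi3_rpow2 works: it only scales by powers of two, so the clip is upscaled past the target and then resampled down with fmtc. A minimal sketch of that arithmetic, assuming the factor is simply the smallest power of two that covers the target size (the helper name `rpow2_factor` is made up for illustration and is not part of edi_rpow2 or Hybrid):

```python
def rpow2_factor(src_w, src_h, dst_w, dst_h):
    """Smallest power-of-two factor whose upscale covers the target size."""
    factor = 1
    while src_w * factor < dst_w or src_h * factor < dst_h:
        factor *= 2
    return factor


src = (720, 480)
dst = (1920, 1280)
factor = rpow2_factor(*src, *dst)
# Intermediate size before the spline64 downscale to the target.
print(factor, (src[0] * factor, src[1] * factor))  # 4 (2880, 1920)
```

    Note that DPIR runs before this step, so it only ever sees the 720x480 frames; the upscale itself is not what makes the deblocking slow.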
  4. Enable FP16 to get it faster.
  5. Member
    Join Date
    Jul 2024
    Location
    Tacoma, WA
    Thanks. I was just playing around with DPIRdeblockMLRT and was able to get it going with FP16 enabled, but then it crashed.
    Image Attached Files
  6. According to the debug output:
    Code:
    clip = vsmlrt.DPIR(clip, strength=30.000, overlap=16, model=3, backend=Backend.TRT(fp16=False, device_id=0,verbose=True,use_cuda_graph=True, num_streams=3,builder_optimization_level=3,engine_folder="C:/Users/Computer/AppData/Local/Temp"))
    FP16 wasn't enabled.
    Just noticed a bug: the fp16 setting is switched (inverted) in DPIR DeBlock (mlrt) *gig*
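    A quick way to confirm this from the generated script rather than from the GUI is to look for the `fp16=` argument in the vsmlrt line. A minimal sketch (the helper name `backend_fp16` is made up for illustration, and the regex only assumes the flag is written literally, as in the call above):

```python
import re


def backend_fp16(script_line):
    """Return True/False for a literal fp16=... argument, or None if there is none."""
    match = re.search(r"\bfp16\s*=\s*(True|False)\b", script_line)
    if match is None:
        return None
    return match.group(1) == "True"


line = (
    'clip = vsmlrt.DPIR(clip, strength=30.000, overlap=16, model=3, '
    'backend=Backend.TRT(fp16=False, device_id=0,verbose=True,use_cuda_graph=True, '
    'num_streams=3,builder_optimization_level=3,'
    'engine_folder="C:/Users/Computer/AppData/Local/Temp"))'
)
print(backend_fp16(line))  # False
```

    A result of `None` means the line has no fp16 argument at all, which is what the non-mlrt `DPIR(...)` call from the earlier script would give.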

    In the debug output I see:
    Code:
    BuilderFlag::kTF32 is set but hardware does not support TF32. Disabling TF32.
    and a few on_stopJobPushButton_clicked,...
    No clue what you are doing, but my guess is that starting and stopping the GPU threads is causing problems.
    Last edited by Selur; 28th Jul 2024 at 14:26.
  7. Member
    Join Date
    Jul 2024
    Location
    Tacoma, WA
    Code:
    # deblocking using DPIRDeblock
    clip = DPIR(clip=clip, strength=30.000, task="deblock", device_index=0, num_streams=3, trt=True, trt_cache_dir="")
    I cleared my temp folder and reset everything to start from scratch. Clicked on DPIR FP16 (no mlrt) and noticed it doesn't show up in the script. Is that normal?

    and a few on_stopJobPushButton_clicked,...
    no clue what you are doing, but my guess is that starting and stopping the gpu threads is causing problems.
    Yeah, one thing that keeps happening with this is the encoding takes several minutes to start whereas the other night when I got it working, it didn't take but a minute or two for the video encode to start. The first time I enabled FP16 today it still wasn't encoding after about 25+ minutes. So I kept starting and stopping jobs while trying different settings to see if that would kickstart it.
    Depending on the filter settings and the resolution you feed it, a new engine file needs to be created when using TRT. Depending on your hardware, that can take quite a while.
    Best clear your temp folder and create fresh engine files.
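    To illustrate why changing a setting or the input resolution triggers a new (slow) engine build, while a repeat run with identical settings can reuse what is already there, here is a toy model of an engine cache. This is only a sketch under assumptions: `engine_key` and the fields it hashes are hypothetical and are not how TensorRT, vsmlrt or Hybrid actually name or look up engine files.

```python
import hashlib


def engine_key(model, width, height, fp16, num_streams):
    """Hypothetical cache key: any change in these inputs means a new engine build."""
    text = f"{model}|{width}x{height}|fp16={fp16}|streams={num_streams}"
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


first = engine_key("dpir_deblock", 720, 480, False, 3)
same = engine_key("dpir_deblock", 720, 480, False, 3)
fp16_on = engine_key("dpir_deblock", 720, 480, True, 3)

print(first == same)     # True: identical settings, an existing engine could be reused
print(first == fp16_on)  # False: one setting changed, a fresh engine has to be built
```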
  9. Member
    Join Date
    Jul 2024
    Location
    Tacoma, WA
    Dumb question, but how does one "create fresh engine files" and what do the files look like that I should be clearing?
    The engine files get created automatically (inside the temp folder) for each filter that uses TRT.
    Depending on the filter they have one of the following extensions:
    • .ts
    • .engine
    • .engine.cache
    • .ep
    I would recommend:
    a. cleaning the temp folder
    b. not aborting engine&co creations
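    If you would rather not pick the files out by hand, step (a) can be scripted. A minimal sketch, assuming the four extensions listed above are the only ones that matter; the function names are made up for illustration, and the default is a dry run that only lists what it would delete. Be careful with `.ts`: that extension is also used for MPEG transport stream videos, so check the list before deleting anything.

```python
from pathlib import Path
import tempfile

ENGINE_SUFFIXES = (".ts", ".engine", ".engine.cache", ".ep")


def find_engine_files(folder):
    """Return files directly inside folder whose names end with an engine-type suffix."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.name.endswith(ENGINE_SUFFIXES)
    )


def clear_engine_files(folder, dry_run=True):
    """List matching files; delete them only when dry_run is False."""
    found = find_engine_files(folder)
    if not dry_run:
        for path in found:
            path.unlink()
    return found


# Demo on a throwaway folder so nothing real gets touched.
with tempfile.TemporaryDirectory() as demo:
    for name in ("model.engine", "model.engine.cache", "notes.txt"):
        (Path(demo) / name).touch()
    print([p.name for p in clear_engine_files(demo, dry_run=True)])
```

    To use it for real, point `clear_engine_files` at the temp folder Hybrid is configured to use, review the dry-run list, and only then call it again with `dry_run=False` while no job is running.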

    Cu Selur
    [Attached image: enginefiles.png]

  11. Member
    Join Date
    Jul 2024
    Location
    Tacoma, WA
    Ah ok. I did both earlier. Let an encode create an engine and run, took about an hour (again my hardware is subpar for this) for the encode to start but it ran fine, albeit slow, about .65fps lol, but worked and no crashes. Ran a 2nd one that doubled in speed. Not sure why that would be. I guess the main thing is not shutting it down or stopping it. I can live with the slow speed for now but I guess I need to upgrade my hardware.

    How often would you recommend clearing out my temp folder, if at all?
  12. I usually clear the engine files whenever I update the torch or mlrt addon.
    The other files I usually clean whenever something crashes and I don't need the temp files.
    In general, I would recommend:
    a. specifying a separate, dedicated folder for Hybrid's temp files
    b. excluding it from virus scanner checks (so it doesn't slow things down)

    Cu Selur
  13. Member
    Join Date
    Jul 2024
    Location
    Tacoma, WA
    Where in Hybrid would I specify that those files be stored in a separate folder?
    There is currently no option to save the engine & co. files in a separate folder (it's planned).
  15. Member
    Join Date
    Jul 2024
    Location
    Tacoma, WA
    Is there a prefetch equivalent in vapoursynth?
  16. Not sure, never needed one so far.
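    For what it's worth, as far as I know VapourSynth has no per-script `Prefetch()` call like AviSynth+; it threads the filter graph on its own, and the closest knobs are the core's `num_threads` and `max_cache_size` (in megabytes) attributes. A minimal sketch of setting them: the helper `tune_core` and the values 8 and 4096 are arbitrary illustrations, and a stand-in object is used so the snippet runs without VapourSynth installed.

```python
from types import SimpleNamespace


def tune_core(core, threads=None, cache_mb=None):
    """Set thread count and cache limit on a VapourSynth-style core object."""
    if threads is not None:
        core.num_threads = threads
    if cache_mb is not None:
        core.max_cache_size = cache_mb  # value is in megabytes
    return core.num_threads, core.max_cache_size


# Stand-in for vs.core, only for demonstration.
fake_core = SimpleNamespace(num_threads=4, max_cache_size=1024)
print(tune_core(fake_core, threads=8, cache_mb=4096))  # (8, 4096)

# In a real script this would be:
#   import vapoursynth as vs
#   tune_core(vs.core, threads=8, cache_mb=4096)
```

    These settings only change how the CPU side of the script is scheduled; they do not make a GPU filter such as DPIR run any faster.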
  17. Member
    Join Date
    Jul 2024
    Location
    Tacoma, WA
    Reason I ask is I'm barely getting any CPU usage on x264 encodes. I read somewhere that higher usage doesn't necessarily mean better fps, because x264 uses a predetermined number of threads based on my logical core count. Would setting the process priority to a higher value help?
    Processing of your VapourSynth script is so slow that there is almost nothing for the video encoder to do.
    Something similar to prefetch wouldn't help at all, and neither would tweaking your x264 settings.
    You either need to use other filters or get more processing power. (DPIR is probably the main culprit.)
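    The reasoning can be put into numbers: the chain runs at the speed of its slowest stage, so x264 sits idle whenever the script delivers frames slower than it could encode them. A rough sketch using the speeds reported in this thread; the 30-minute clip length and the 60 fps figure for x264 on its own are assumptions picked purely for illustration.

```python
def pipeline_fps(filter_fps, encoder_fps):
    """Throughput of the chain is capped by its slowest stage."""
    return min(filter_fps, encoder_fps)


def encode_hours(frame_count, fps):
    """Wall-clock hours needed to process frame_count frames at the given fps."""
    return frame_count / fps / 3600


frames = round(30 * 60 * 30000 / 1001)  # assumed: 30 minutes of 29.97 fps video
assumed_x264_fps = 60.0                 # assumed speed of x264 alone

for script_fps in (0.25, 0.65, 2.5):    # script speeds mentioned in the thread
    fps = pipeline_fps(script_fps, assumed_x264_fps)
    print(f"{script_fps} fps script -> {fps} fps overall, "
          f"{encode_hours(frames, fps):.1f} h for {frames} frames")
```

    Raising the encoder figure changes nothing in the output as long as the script speed stays below it, which is why thread counts or process priority for x264 do not help here.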