VideoHelp Forum

  1. The script below does what I want, but it runs at 7-9fps. I expected this on an older Athlon dual core, but when I dropped it on my Core i5 and saw the same speed, I assumed something was wrong. CPU usage on both platforms doesn't exceed 50%, and I have tried moving the source to an SSD, so a storage bottleneck isn't the issue.

    I welcome any improvements. Be kind; this is my first time using AviSynth.

    I'm running AviSynth 2.6.0.5 with the 2015-02-20 build of MT (avisynth.dll), using HCenc 0.27 (21-10-2013) on 32-bit XP (on both the Athlon and the i5).

    I've attached a sample in case anyone has input on that as well. I know it's oversaturated, but I don't have the time for meticulous correction, especially since every scene in a ~2hr video is very different.

    Thanks in advance!

    Code:
    #Denoiser script for interlaced video using MDegrain2
    
    SetMemoryMax(768)
    
    Loadplugin("C:\Program Files\AviSynth 2.5\plugins\MVTools\mvtools2.dll")
    LoadPlugin("c:\Program Files\AviSynth 2.5\plugins\CNR\Cnr2.dll")
    LoadPlugin("C:\Program Files\AviSynth 2.5\plugins\aWarpSharp\aWarpSharp.dll")
    LoadPlugin("c:\Program Files\AviSynth 2.5\plugins\AutoAdjust\AutoAdjust.dll")
    
    SetMTMode(5,6)
    source=AVISource("test.avi").ConvertToYV12().LanczosResize(720,480).killaudio().AssumeBFF()
    SetMTMode(2)
    
    
    
    # Fix Chroma bleed
    bandaged=MergeChroma(source,awarpsharp2(source,depth=40))
    
    # Auto adjust levels and color balance
    adjusted=bandaged.AutoAdjust(auto_gain=true, auto_balance=true)
    
    chroma=bandaged.Cnr2("oxx",8,16,191,100,255,32,255,false) #VHS
    output=MDegrain2i2(chroma,8,2,0)
    
    return output
    
    #-------------------------------
    
    function MDegrain2i2(clip source, int "blksize", int "overlap", int "dct") 
    {
      Vshift=0 # 2 lines per bobbed-field per tape generation (PAL); original=2; copy=4 etc
      Hshift=0 # determine experimentally 
      overlap=default(overlap,0) # overlap value (0 to 4 for blksize=8)
      dct=default(dct,0) # use dct=1 for clip with light flicker
    
      fields=source.SeparateFields() # separate by fields
    
      #This line gets rid of vertical chroma halo. Don't use unless you have the problem
      #fields=MergeChroma(fields,crop(fields,Hshift,Vshift,0,0).addborders(0,0,Hshift,Vshift))
    
      super = fields.MSuper(pel=2, sharp=1)
      backward_vec2 = super.MAnalyse(isb = true, delta = 2, blksize=blksize, overlap=overlap, dct=dct)
      forward_vec2 = super.MAnalyse(isb = false, delta = 2, blksize=blksize, overlap=overlap, dct=dct)
      backward_vec4 = super.MAnalyse(isb = true, delta = 4, blksize=blksize, overlap=overlap, dct=dct)
      forward_vec4 = super.MAnalyse(isb = false, delta = 4, blksize=blksize, overlap=overlap, dct=dct)
    
      #Increase thSAD for more denoising. Won't do much beyond about 1500
      MDegrain2(fields,super, backward_vec2,forward_vec2,backward_vec4,forward_vec4,thSAD=400) 
    
      Weave()
    }
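
    One thing worth checking in the script above: adjusted is assigned, but nothing downstream reads it (Cnr2 is applied to bandaged), so the clip the script returns never passes through AutoAdjust. A minimal sketch of chaining the stages so that each one feeds the next, assuming the same plugins, paths, and parameters as the post (the paths are the poster's and will differ on other systems):

    ```avisynth
    # Sketch only: plugin paths and test.avi are taken from the post and are assumptions elsewhere.
    LoadPlugin("C:\Program Files\AviSynth 2.5\plugins\CNR\Cnr2.dll")
    LoadPlugin("C:\Program Files\AviSynth 2.5\plugins\aWarpSharp\aWarpSharp.dll")
    LoadPlugin("C:\Program Files\AviSynth 2.5\plugins\AutoAdjust\AutoAdjust.dll")

    source = AviSource("test.avi").ConvertToYV12().LanczosResize(720,480).KillAudio().AssumeBFF()

    # Each stage reads the previous stage's result, so none is skipped.
    bandaged = MergeChroma(source, aWarpSharp2(source, depth=40))
    adjusted = bandaged.AutoAdjust(auto_gain=true, auto_balance=true)
    chroma   = adjusted.Cnr2("oxx",8,16,191,100,255,32,255,false)

    # The MDegrain2i2 call from the post would take chroma as its input.
    return chroma
    ```

    Chaining this way also means the measured speed includes AutoAdjust, which it does not in the script as posted.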
  2. I don't have AutoAdjust so I changed that line to adjusted=bandaged. Without any encoding on my i5 2500K the script ran at about 60 frames per second, ~100 percent cpu usage, with a random 720x576 video.
  3. Originally Posted by jagabo
    I don't have AutoAdjust so I changed that line to adjusted=bandaged. Without any encoding on my i5 2500K the script ran at about 60 frames per second, ~100 percent cpu usage, with a random 720x576 video.
    Thanks for the feedback. I'll try it on mine without AutoAdjust and see how it goes.

    Is it possible something about my avisynth/hcenc setup isn't proper, or at least the same as yours?
  4. hcenc isn't very fast and isn't well multithreaded. I suspect that's your problem.

    Oh, in your script I changed the AviSource() line to ffVideoSource() for the video I was using. I left out the resizing and other processing on that line. If I include those other options the script runs even faster, about 70 fps (the smaller frame processes faster in the rest of the script). But encoding it in HcGUI only got about 18 fps.
  5. Originally Posted by jagabo
    hcenc isn't very fast and isn't well multithreaded. I suspect that's your problem.

    Oh, in your script I changed the AviSource() line to ffVideoSource() for the video I was using. I left out the resizing and other processing on that line. If I include those other options the script runs even faster, about 70 fps (the smaller frame processes faster in the rest of the script). But encoding it in HcGUI only got about 18 fps.
    Hmmm.

    I know I've run HC much faster than 7-8fps on my i5, I want to say around 50fps... but you're saying it slows down dramatically when you use HCgui, which I am using. But isn't that just a frontend to the same encoder? Also, is ffVideoSource() inherently faster than AviSource()?

    I'm pretty confused now.
  6. I don't think there's much difference in speed between the CLI and GUI hc encoders. But different settings can make a difference in speed.

    ffVideoSource() and AviSource() don't differ much in speed. Decompression codecs can make a difference. The source I was using was MPEG2 video in an MKV container. If your source were large-frame AVC video, decoding it would be slower.

    Try opening your script in VirtualDub. Then select File -> Run Video Analysis Pass. What fps do you get?
  7. Originally Posted by jagabo
    I don't think there's much difference in speed between the CLI and GUI hc encoders. But different settings can make a difference in speed.

    ffVideoSource() and AviSource() don't differ much in speed. Decompression codecs can make a difference. The source I was using was MPEG2 video in an MKV container. If your source were large-frame AVC video, decoding it would be slower.
    My source is uncompressed YUY2. At first I suspected disk IO limits, but again, it was still slow even on a modern SSD.

    I will tinker with the script to see what may be slowing it down.
  8. Groucho2004
    Guest
    Originally Posted by diprotic
    Originally Posted by jagabo
    hcenc isn't very fast and isn't well multithreaded. I suspect that's your problem.

    Oh, in your script I changed the AviSource() line to ffVideoSource() for the video I was using. I left out the resizing and other processing on that line. If I include those other options the script runs even faster, about 70 fps (the smaller frame processes faster in the rest of the script). But encoding it in HcGUI only got about 18 fps.
    Hmmm.

    I know I've run HC much faster than 7-8fps on my i5, I want to say around 50fps... but you're saying it slows down dramatically when you use HCgui, which I am using. But isn't that just a frontend to the same encoder? Also, is ffVideoSource() inherently faster than AviSource()?

    I'm pretty confused now.
    HCgui is just that: a GUI frontend for the HC encoder. The only reason it would run slower is that HCgui runs hcenc at idle priority by default, so any other busy process takes CPU cycles away from it.

    The encoder can only run as fast as AviSynth delivers frames to it. Test the script without the encoder as already suggested, either with VirtualDub's analysis pass or, even better, with AVSMeter.
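
    One way to narrow this down further is to benchmark the script one stage at a time. The following is a minimal sketch, not a tested configuration: it reuses the file name, plugin paths, and filter parameters from the first post (all assumptions on any other system), runs single-threaded, and relies on AviSynth stopping at the first return it reaches.

    ```avisynth
    # Benchmark sketch: measure one stage, then comment out its "return"
    # to include the next stage and measure again.
    # Paths and test.avi come from the first post and are assumptions elsewhere.
    LoadPlugin("C:\Program Files\AviSynth 2.5\plugins\CNR\Cnr2.dll")
    LoadPlugin("C:\Program Files\AviSynth 2.5\plugins\aWarpSharp\aWarpSharp.dll")

    source = AviSource("test.avi").ConvertToYV12().LanczosResize(720,480).KillAudio().AssumeBFF()

    # Stage 1: decode, colorspace conversion and resize only.
    return source

    # Stage 2: chroma-bleed fix (reached only once the return above is commented out).
    bandaged = MergeChroma(source, aWarpSharp2(source, depth=40))
    return bandaged

    # Stage 3: chroma noise reduction.
    chroma = bandaged.Cnr2("oxx",8,16,191,100,255,32,255,false)
    return chroma
    ```

    Open it in VirtualDub's analysis pass or AVSMeter, note the fps, then comment out the first return and repeat. The stage where the fps drops sharply is the one to look at.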
  9. Originally Posted by Groucho2004
    ....or, even better, with AVSMeter.
    Thanks for the tip, I'll give that a shot. It will help isolate whether AviSynth or HC is the bottleneck.
  10. Originally Posted by jagabo
    Try opening your script in VirtualDub. Then select File -> Run Video Analysis Pass. What fps do you get?
    Well, I tried both VirtualDub and AVSMeter. On my i5 they both top out at around 20fps and push all four cores to ~75-80%.

    Running the script through both HCenc and ffmpeg yields ~10fps and ~7fps respectively :/

    AviSource() and ffVideoSource() appear to have no effect on speed, but here's an interesting part: if I disable the denoise function, it runs faster (expected), BUT the CPU usage also goes up.

    Changing source material has no apparent effect.

    Ultimately I'm getting the results I want. If I just have to wait longer, that's fine, although I still suspect something is off about my setup.

    Here's my final script:

    Code:
    #Denoiser script for interlaced video using MDegrain2
    
    SetMemoryMax(1024)
    
    Loadplugin("C:\Program Files\AviSynth 2.5\plugins\MVTools\mvtools2.dll")
    LoadPlugin("c:\Program Files\AviSynth 2.5\plugins\CNR\Cnr2.dll")
    LoadPlugin("C:\Program Files\AviSynth 2.5\plugins\aWarpSharp\aWarpSharp.dll")
    LoadPlugin("c:\Program Files\AviSynth 2.5\plugins\AutoAdjust\AutoAdjust.dll")
    LoadPlugin("c:\Program Files\AviSynth 2.5\plugins\ffms2\ffms2.dll")
    
    SetMTMode(3,0)
    source=AviSource("C:\Documents and Settings\Austin\Desktop\test.avi")
    source=source.LanczosResize(720,480)
    source=source.ConvertToYV12()
    source=source.killaudio()
    source=source.AssumeBFF()
    SetMTMode(2)
    
    # Auto adjust levels and color balance
    source=AutoAdjust(source, auto_gain=true, auto_balance=true)
    
    # Fix Chroma bleed
    source=MergeChroma(source,awarpsharp2(source,depth=40))
    
    chroma=source.Cnr2("oxx",8,16,191,100,255,32,255,false) #VHS
    output=MDegrain2i2(chroma,8,2,0)
    
    return output
    
    #-------------------------------
    
    function MDegrain2i2(clip source, int "blksize", int "overlap", int "dct") 
    {
      Vshift=0 # 2 lines per bobbed-field per tape generation (PAL); original=2; copy=4 etc
      Hshift=0 # determine experimentally 
      overlap=default(overlap,0) # overlap value (0 to 4 for blksize=8)
      dct=default(dct,0) # use dct=1 for clip with light flicker
    
      fields=source.SeparateFields() # separate by fields
    
      #This line gets rid of vertical chroma halo. Don't use unless you have the problem
      #fields=MergeChroma(fields,crop(fields,Hshift,Vshift,0,0).addborders(0,0,Hshift,Vshift))
    
      super = fields.MSuper(pel=2, sharp=1)
      backward_vec2 = super.MAnalyse(isb = true, delta = 2, blksize=blksize, overlap=overlap, dct=dct)
      forward_vec2 = super.MAnalyse(isb = false, delta = 2, blksize=blksize, overlap=overlap, dct=dct)
      backward_vec4 = super.MAnalyse(isb = true, delta = 4, blksize=blksize, overlap=overlap, dct=dct)
      forward_vec4 = super.MAnalyse(isb = false, delta = 4, blksize=blksize, overlap=overlap, dct=dct)
    
      #Increase thSAD for more denoising. Won't do much beyond about 1500
      MDegrain2(fields,super, backward_vec2,forward_vec2,backward_vec4,forward_vec4,thSAD=400) 
    
      Weave()
    }
    Last edited by diprotic; 27th Feb 2015 at 20:31. Reason: Post final script
  11. Try adding Distributor() just before return output. Also try setting the thread count manually to 6 with SetMTMode(3,6), or maybe 8.
  12. Originally Posted by jagabo
    Try adding Distributor() just before return output. Also try setting the thread count manually to 6 with SetMTMode(3,6), or maybe 8.
    Distributor() seems to have slowed things down, and no real change on SetMTMode(). Thanks though!
  13. I get 20 fps using a single thread (no encoding). And only 25 percent CPU usage, as expected. What kind of i5 are you using?
  14. Groucho2004
    Guest
    Originally Posted by diprotic
    Distributor() seems to have slowed things down, and no real change on SetMTMode().
    Neither of these observations is surprising. VirtualDub and AVSMeter both add the Distributor call internally. Adding the call in the script as well floods the thread pool with at least twice the number of threads, increases memory usage and slows things down.
    It's rarely necessary to set the number of threads in SetMTMode manually. The more threads you specify, the more memory is used. Sometimes specifying fewer threads than available cores even hits the sweet spot.
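
    The two points above can be sketched as follows. This is only an illustration: the thread count of 3 is an arbitrary assumption (one fewer than the four cores mentioned earlier in the thread), test.avi is the file name from the first post, and the resize stands in for the real filter chain.

    ```avisynth
    # Sketch only: the thread count is illustrative, not a tuned value.
    SetMemoryMax(1024)

    # Second argument is the thread count; 0, as in the final script above,
    # leaves the choice to AviSynth MT.
    SetMTMode(3,3)
    source = AviSource("test.avi").ConvertToYV12().KillAudio().AssumeBFF()

    # Switch to mode 2 for the filters that follow.
    SetMTMode(2)
    output = source.LanczosResize(720,480)   # stand-in for the real filter chain

    # No Distributor() call: per the post above, VirtualDub and AVSMeter add it internally.
    return output
    ```

    Trying a few thread counts (for example 2, 3 and 4) and comparing the fps and memory use reported by AVSMeter is probably the quickest way to find that sweet spot for a given script.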