Without distributor(), speed drops back down to 3 fps. I read somewhere that HCenc requires distributor().
I have a decent GPU, a GTX 760. I should definitely try KNLMeansCL and NNEDI3ocl. Does anyone know what settings to use for my video?
HCenc 1-pass took 4 hours and 2-pass took 5 hours, so not a huge difference. I'll stick with 2-pass for better quality.
HCenc doesn't call the distributor function itself, so you need to call it in the script.
When using a slow script and needing to run multiple passes in your MPEG-2 encoder (I use 5 usually, in CCE), it often pays to run only one slow pass in VDub to create a lossless AVI, followed by using AVISource in a new AviSynth script on your new AVI. Then the two passes you're using in HCEnc might go so quickly that even though you're running a total of three passes, you'll still save time in the end. Or run the two passes overnight and not worry about how long it's taking.
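A rough back-of-the-envelope sketch of that trade-off, in Python. The numbers are made up for illustration (they are not measurements from this thread), and the model assumes every encoder pass costs one full run of whatever it reads from, which won't hold exactly for every encoder. `total_hours` is just a hypothetical helper name.

```python
def total_hours(slow_pass_h, fast_pass_h, encoder_passes, use_intermediate):
    """Rough wall-clock hours for a multi-pass encode (simplified model)."""
    if use_intermediate:
        # One slow filtering pass to a lossless AVI, then every encoder
        # pass reads the already-filtered AVI at the fast rate.
        return slow_pass_h + encoder_passes * fast_pass_h
    # Otherwise every encoder pass re-runs the slow script.
    return encoder_passes * slow_pass_h


# Hypothetical figures: 4 h per pass through the slow script,
# 0.5 h per pass when reading the lossless intermediate.
slow_h, fast_h = 4.0, 0.5
for passes in (2, 5):
    direct = total_hours(slow_h, fast_h, passes, use_intermediate=False)
    staged = total_hours(slow_h, fast_h, passes, use_intermediate=True)
    print(f"{passes} passes: direct {direct:.1f} h, via lossless AVI {staged:.1f} h")
```

The more passes the encoder makes, the more the lossless intermediate pays off, at the cost of the disk space for the AVI.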
Temporaldegrain has partial GPU acceleration if you set GPU=true: it will use FFT3DFilterGPU. But that's only a small part of the calculations (marginal speedup), and it's not as stable as the CPU version (lots of people have various problems with FFT3DFilterGPU).
KNLMeansCL is stable and fast, but alone it won't do as good a job of temporal filtering as mvtools2-based tools like temporaldegrain, smdegrain, etc. (But finally there is a capable GPU avisynth filter; it's been a long, long time in waiting. The ones in the past were either flaky, prone to crashing, or had various other problems.) For VHS and other noisy sources, the temporal "calming" won't be as good as with the mvtools2-based tools. Period. I.e. I don't think you'll be able to get a similar effect with any settings, at least not with KNLMeansCL alone. And if you crank up the settings, too much detail will be lost IMO. The benefit of using it is almost zero CPU usage, which leaves the CPU free for other things, like encoding faster (provided you don't have other bottlenecks). So if speed is the utmost concern, there is an option to make some sacrifices. If you want to play with it and you're wondering why it's so slow: on most i7 setups the dedicated GPU will have an id of "1" and the iGPU (Intel) an id of "0", so you would probably use device_type="GPU", device_id=1.
But a side effect of the mvtools2-based denoising is the "stuck grain pattern" effect. It's an unnatural, "floating" type of grain pattern, and it also sometimes changes the shapes of some edges. Overall, though, the mvtools2-based denoisers are probably more "pleasing" to look at IMO. If you brighten up the shadows a bit more, you will notice lots of "junk" and a tremendous amount of noise, and the "stuck" pattern will be more obvious then.
I've never tried NNEDI3ocl. After reading the comments it seemed too flaky and apparently sometimes slower, so I haven't tested it yet.
You're right, it is slower in avisynth MT. IIRC the author mentioned that GPU parallelization and threading aren't optimized in avisynth MT. In vapoursynth (64-bit, native threading) the speed delta is more like what you would expect.
The speed is also going to depend on what settings you use (the more temporal frames and the larger the spatial search radius, the slower it gets, and it gets slower very quickly).
Either way it gives lower temporal quality (fluttering noise, unless you crank up the temporal frames and spatial strength, but then it's going to erode the details). You will see significant differences if you brighten up the shadows a bit. But it's useful for cleaner sources that don't require as much temporal filtering. The NLMeans algorithm is really meant for spatial cleaning. On a midrange GPU, the GPU variant of NLMeans is about 10-20x faster than the CPU one (TNLMeans).
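To put rough numbers on how the temporal frames and search radius affect the workload, here is a small Python sketch. It assumes, from my reading of the KNLMeansCL documentation, that `d` frames on each side means 2d+1 frames and that a search radius `a` means a (2a+1) x (2a+1) window per frame. Treat the formula as an assumption and the result as a proxy only: real speed also depends on `s`, the resolution, and the GPU. `search_window_size` is a hypothetical helper name.

```python
def search_window_size(d, a):
    """Candidate pixels examined per output pixel (assumed model)."""
    frames = 2 * d + 1            # current frame plus d before and d after
    positions = (2 * a + 1) ** 2  # square spatial window of radius a
    return frames * positions


# Compare a few settings against the d=1, a=4 case used in this thread.
baseline = search_window_size(d=1, a=4)
for d, a in [(1, 4), (2, 4), (1, 8), (2, 8)]:
    size = search_window_size(d, a)
    print(f"d={d}, a={a}: {size} candidates ({size / baseline:.1f}x the d=1, a=4 case)")
```

Under that assumption the work grows linearly with the temporal frames and with the square of the search radius, so doubling `a` costs far more than adding one more frame on each side.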
Using jagabo's script with setmtmode(3,8) and distributor, with progressive encoding (not hcenc) to ffmpeg libx264 crf 18, I get about 30 FPS, but 25 FPS with KNLMeansCL(device_type="GPU", device_id=1, d=1, a=4, s=4, h=2).
In vapoursynth, on a slightly different script, I get about 25 FPS with smdegrain and 40 FPS with KNLMeansCL. (It's not comparable because of the different scripts and the different quality of KNLMeans vs. SMDegrain, but it gives a very rough idea.)
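For what it's worth, here are those FPS figures as relative changes, in a quick Python sketch. It only restates the rough numbers quoted above, so the same caveats apply: different scripts, different quality. `percent_change` is a hypothetical helper name.

```python
def percent_change(baseline_fps, new_fps):
    """Relative throughput change in percent (negative means slower)."""
    return (new_fps - baseline_fps) / baseline_fps * 100.0


# avisynth MT: about 30 FPS with the mvtools2-based script, 25 with KNLMeansCL
avs_change = percent_change(30, 25)
# vapoursynth: about 25 FPS with smdegrain, 40 with KNLMeansCL
vs_change = percent_change(25, 40)

print(f"avisynth MT: {avs_change:+.0f}% when switching to KNLMeansCL")
print(f"vapoursynth: {vs_change:+.0f}% when switching to KNLMeansCL")
```

So on these rough figures KNLMeansCL costs about a sixth of the throughput in avisynth MT, while in vapoursynth it is well over half again as fast.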
By the way, instead of calling a separate noise reduction filter you can try using QTGMC's built in denoiser. For example:
avisynth is still the most commonly used and the easier to use, especially for people who have been using it already. I'm way more comfortable using it, and despite its "quirkiness" and initial learning curve it seems a lot more straightforward.
vapoursynth is way more "geeky". It's all Python based. It's much pickier than avisynth about things like syntax and case sensitivity. I'm not a programmer, and I'm having a tough time doing anything beyond the very basic things. It's like learning avisynth all over again. There aren't as many plugins or functions as in avisynth, although you can use avs scripts within it. Even simple manipulations and operations seem to take more lines of code to write (at least to me) because of the strict syntax. Some things are faster and some are slower (e.g. QTGMC is about 10-20% slower than in MT avisynth), but as soon as you stack a few filters it becomes faster than avisynth MT; its threading model is apparently better. Native higher bit depth support is one of the big reasons why some people use it, and x64 means no memory-related crashes, especially when dealing with HD or above content. (There are x64 avisynth versions, but they too have fewer plugins.)