VideoHelp Forum

  1. I'm sharing some DirectX pixel shader (.hlsl) control code I've written recently: https://github.com/butterw/bShaders
    They are tested in mpc-hc on an Intel GPU (integrated graphics) and should be easy to use and modify.

    In mpc-hc, you need to install them in the Shaders subfolder (you need write permission on that folder to be able to modify them).
    Modifying the source file is necessary to change parameter values (compilation is triggered automatically after saving). For this, it's best to use a text editor with C syntax highlighting, like Notepad++. You can create multiple versions of the same shader based on your preferred parameter values/use cases and switch between them as required.
    With mpc-hc, you can save Shader Presets (a pre/post-resize shader chain), which you can access directly via Right Click > Shaders. Once set, shaders stay active until you disable them, so creating an empty "Disable" preset is useful.

    So Far:
    - barMask (Custom Mask, Aspect Ratio Mask, + Shift) v1.2:
    no programming required: select the mode and adjust the parameters in the #define lines.

    - MaskCrop: applies a custom mask (Top, Bottom, Left, Right) to the sides of the screen and recenters the image. You could use this to mask a logo at the bottom of the screen, for instance (mpc-hc doesn't have a built-in crop feature).
    - MaskBox: a rectangular masking box.
    - RatioLetterbox: view video at a custom aspect ratio, ex: simulate a 21/9 display on a standard monitor.
    - OffsetPillarbox: view video at a vertical aspect ratio (ex: 9/16, 1/1, etc.) on a standard landscape monitor, with borders on the sides. You can offset the window to compensate for a non-centered video.
    - Image Shift.
    The borders are image zones, which means you can apply any effect to them (not just colored borders).

    - bStrobe (repeating effect with configurable timings) v1.0
    A ready-to-use timed color frame effect; a sketch of the timing logic follows below.
    Parameters: ScreenColor, time of the first run (Tfirst), duration (Td), repetition period (Tcycle), end condition (number of runs, end time or no end).
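    For illustration, a minimal sketch of how such timings can be driven from the player clock (this is not the bStrobe source; the Clock constant is provided by the player, see the intro to pixel shaders in the next post, and the end condition is omitted):
    Code:
    sampler s0;
    float4 p0 : register(c0); // (W, H, Counter, Clock)
    #define clock (p0.w)      // seconds since the player was opened
    
    #define ScreenColor float4(1, 1, 1, 1) // color of the timed frames
    #define Tfirst 5.  // time of the first run (s)
    #define Td     0.2 // duration of each run (s)
    #define Tcycle 10. // repetition period (s)
    
    float4 main(float2 tex : TEXCOORD0) : COLOR {
        float t = clock - Tfirst;
        if (t >= 0 && fmod(t, Tcycle) < Td) return ScreenColor; // during a run
        return tex2D(s0, tex); // otherwise pass the frame through unchanged
    }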


    More shaders at https://gist.github.com/butterw/
    - bSide (ZoneShader_mpc) v1.0
    Side-by-side and box display of effects.

    - bTimeEffect_DePixelate
    a pixelation/de-pixelation effect/glitch.


    An example with SBS (side-by-side) 3D input: https://forum.videohelp.com/showthread.php?t=398955
    Last edited by butterw; 5th Oct 2020 at 10:16.
  2. intro to pixel shaders (.hlsl, .fx)

    HLSL is a specialized programming language (simple and powerful, but limited), with C syntax and vector operations, which runs in real time on the GPU.
    The input video is processed frame by frame. Each frame is processed pixel by pixel by the main function of the .hlsl file.
    The basic scalar datatype is float (float32); float2 and float4 are vector datatypes.
    A pixel has float2 (x, y) coordinates and a float4 (r, g, b, a) color (the alpha channel isn't used by mpc-hc).
    Input/output values are in the range [0, 1.].

    /* Code for the Invert.hlsl Shader in dx9 */
    Code:
    sampler s0; //the input frame
    
    float4 main(float2 tex: TEXCOORD0): COLOR { //tex: current output pixel coordinates (x, y)   
    	float4 c0 = tex2D(s0, tex);     //sample rgba pixel color (of pixel tex)
    	
    	return float4(1, 1, 1, 1) - c0; //return output pixel color (r, g, b, a)   
    }
    If required, you can also access (all floats):
    - the frame dimensions W, H and the pixel dimensions px, py
    - the frame Counter or the seconds Clock, which start at 0 and are updated each frame.
    There is not much more to it, except that you can use #define expressions and other preprocessor instructions, ex:
    #define Red float4(1., 0, 0, 0) //defines the constant color Red.
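    For reference, these values are passed by the player as shader constants. The declarations below follow the boilerplate used in the shaders bundled with mpc-hc (quoted from memory, so check any bundled .hlsl file for the exact layout):
    Code:
    sampler s0 : register(s0); // input frame
    float4 p0 : register(c0);  // (W, H, Counter, Clock)
    float4 p1 : register(c1);  // (1/W, 1/H, 0, 0)
    
    #define width   (p0.x)
    #define height  (p0.y)
    #define counter (p0.z)
    #define clock   (p0.w)
    #define px (p1.x) // pixel width = 1/W
    #define py (p1.y) // pixel height = 1/H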

    Note: dx11 pixel shaders have a more verbose boilerplate syntax (see the sketch below). They have to be adapted to run on a dx9 renderer (ex: the mpc-hc default EVR-CP renderer).
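    For comparison, a minimal sketch of the Invert shader in dx11 syntax (generic shader model 4 boilerplate; an actual dx11 renderer may expect a specific constant buffer layout):
    Code:
    Texture2D tex : register(t0);     // input frame
    SamplerState samp : register(s0);
    
    float4 main(float4 pos : SV_POSITION, float2 coord : TEXCOORD) : SV_Target {
        float4 c0 = tex.Sample(samp, coord); // sample rgba pixel color
        return float4(1, 1, 1, 1) - c0;      // inverted output pixel color
    }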


    Limitations:
    - runs on uncompressed video in real time; the GPU needs to be able to handle the texture lookups and arithmetic operations used. Not a problem at 1080p even on integrated graphics, unless you need to access many other pixels to calculate each output pixel.
    - no built-in random number generator (PRNG).
    - if hardware linear sampling is not supported by the GPU, nearest-neighbor sampling is used (as a silent fallback).


    In mpc-hc (EVR-CP):
    - you can only access the current frame!
    - you cannot access any external (texture) file.
    - no external parameters: all parameter values are defined in the source file (and there is no GUI slider option).
    - no imports in the shader source file: #include "filename.hlsl" is not supported.
    - no user-defined persistent variables or textures!
    - in a post-resize shader, coordinate calculation is off if not in fullscreen mode; also, you can't access the video file resolution, only the screen resolution.
    . . . a post-resize shader applies to the black bars in fullscreen.
    - Clock starts when the player is opened; there is no way of getting the current playtime (this would be useful to trigger transition effects in a playlist, for instance).
    - a shader can't modify the resolution/aspect ratio of the frame provided by the video player (no user-defined resize shaders).
    - a shader file is single-pass only. You can however create a preset with chained shaders and even pass data between them using the alpha channel (see the sketch after this list).
    - the dx11 renderer doesn't provide backwards compatibility with existing dx9 shaders!
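    A sketch of the alpha-channel trick (illustrative, untested; the luma key is just an example): the first shader in the chain stores a value in the unused alpha channel and the next shader reads it back:
    Code:
    /* Shader1.hlsl: store luma in the unused alpha channel */
    sampler s0;
    float4 main(float2 tex : TEXCOORD0) : COLOR {
        float4 c0 = tex2D(s0, tex);
        c0.a = dot(c0.rgb, float3(0.299, 0.587, 0.114)); // luma -> alpha
        return c0;
    }
    
    /* Shader2.hlsl (a separate file, next in the chain): use the stored value,
       ex: desaturate dark pixels */
    sampler s0;
    float4 main(float2 tex : TEXCOORD0) : COLOR {
        float4 c0 = tex2D(s0, tex);
        if (c0.a < 0.25) c0.rgb = c0.a; // luma written by Shader1
        return c0;
    }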


    mpc-hc/be pre- or post-resize shader chain (can be saved as a shader preset):
    renderer (RGB) > Shader1 > Shader2 > Shader3 > Output (RGB)
    Last edited by butterw; 26th Oct 2020 at 09:03. Reason: Updated Limitations
  3. Thanks for that.

    I know nothing about hlsl, but I did manage to fiddle with one of the existing pixel shaders. Feel free to add it to your collection.... or not.....

    BT.601 to BT.709 [SD][HD]
    Converts the colors regardless of resolution. The original pixel shader only worked for HD, as it was intended to correct a video card driver problem around 700 years ago, but it's handy for correcting SD video encoded from HD sources as rec.709, when the renderer is ignoring any colorimetry information, or when it was encoded with Xvid and so doesn't carry any.
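    For reference, a minimal sketch of the underlying math (this is not the attached shader): if the renderer decoded 709-coded YCbCr with the 601 matrix, the fix is to invert the 601 step and redo it with the 709 coefficients:
    Code:
    sampler s0;
    
    float4 main(float2 tex : TEXCOORD0) : COLOR {
        float3 rgb = tex2D(s0, tex).rgb;
        // undo the renderer's step: RGB -> YCbCr with the BT.601 matrix
        float y  = dot(rgb, float3(0.299, 0.587, 0.114));
        float cb = dot(rgb, float3(-0.168736, -0.331264, 0.5));
        float cr = dot(rgb, float3(0.5, -0.418688, -0.081312));
        // redo it correctly: YCbCr -> RGB with the BT.709 matrix
        float3 c = float3(y + 1.5748 * cr,
                          y - 0.1873 * cb - 0.4681 * cr,
                          y + 1.8556 * cb);
        return float4(c, 1);
    }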
    [Attached: the modified shader file]
  4. I'm guessing at least some AviSynth methods could be ported to pixel shaders.

    I saw your FrostyBorders script in your sig. May I ask what method you use to fill these borders https://i.postimg.cc/yNPx5BKV/Frosty-Borders-2.jpg ?

    A heavy 2D gaussian blur would be too GPU-intensive for integrated graphics, so I am looking for cheaper alternatives.
  5. Originally Posted by butterw View Post
    A heavy 2D gaussian blur would be too gpu intensive for integrated graphics, I am looking for cheaper alternatives.
    The final resizing and blurring steps for one edge are below (the script defaults).
    The main filtering prior to that is TemporalSoften(7, 255, 255, 40, 2).

    Code:
    Frosty_Left = Blend_Left.GaussResize(BL, OutHeight)\
    .FastBlur(Blur=50, iterations=3, dither=true)\
    .AddGrain(var=5.0, constant=true)

    AddGrainC.
    http://avisynth.nl/index.php/AddGrainC
    FastBlur.
    https://forum.doom9.org/showthread.php?t=176564
    http://avisynth.nl/index.php/FastBlur
    It's closed source though. I can't remember if there's a reason for that. You might have to contact wonkey_monkey to see if he'll share source code if you need it.

    The script originally blurred by running the borders through the QGaussBlur function a couple of times (buried here), but then I discovered FastBlur and the result was better.
    Last edited by hello_hello; 29th Jun 2020 at 13:35.
  6. The unfocused effect looks nice, but big-radius blurs don't come cheap (a gaussian requires a huge kernel size, > 5*sigma).

    FastBlur uses 3 passes of box blur to approximate a gaussian blur.
    The implementation of box blur can be heavily optimized (the optimizations differ between CPU and GPU).
    Also, for a gaussian blur you would need to hardcode the filter kernel coefficients in a pixel shader, which is troublesome when you want to try different parameters.
    Game engines have heavily optimized multipass shader implementations of blur, but even then they rely on the power of a discrete GPU.
    I don't know if a perf-optimized implementation can run even at 720p30 on old integrated graphics.

    Naive implementation of a single-pass 2D box blur (3x3 kernel) with a GPU pixel shader (sketch below):
    - mean function: the effect is barely visible in a single pass with such a small kernel.
    - per pixel: 9 texture lookups, 15 arithmetic operations.
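    A minimal sketch of that naive shader (untested; px, py follow the mpc-hc constant convention from post 2):
    Code:
    sampler s0;
    float4 p1 : register(c1); // (1/W, 1/H, 0, 0)
    #define px (p1.x)
    #define py (p1.y)
    
    float4 main(float2 tex : TEXCOORD0) : COLOR {
        float4 acc = 0;
        for (int j = -1; j <= 1; j++)     // 3x3 neighborhood:
            for (int i = -1; i <= 1; i++) // 9 texture lookups
                acc += tex2D(s0, tex + float2(i * px, j * py));
        return acc / 9.; // mean
    }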

    Better results are obtained with an optimized 2-pass gaussian blur (9-tap): bShaders\blurGauss.hlsl >> blurGauss_Y.hlsl
    - per pixel: 2 passes * (5 texture lookups, 8 arithmetic operations)
    To increase the blur, you can apply the shader multiple times (ex: 5 times would be equivalent to a 45-tap shader); one such pass is sketched below.
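    For illustration, one horizontal pass of such a 9-tap gaussian, using the classic linear-sampling trick (hardware bilinear filtering lets 5 texture lookups cover 9 pixels). These are textbook weights, not necessarily those of blurGauss.hlsl; the second pass repeats this vertically with py offsets:
    Code:
    sampler s0;
    float4 p1 : register(c1);
    #define px (p1.x)
    
    // offsets/weights of a 9-tap gaussian collapsed to 5 bilinear samples
    static const float offset[3] = {0., 1.3846153846, 3.2307692308};
    static const float weight[3] = {0.2270270270, 0.3162162162, 0.0702702703};
    
    float4 main(float2 tex : TEXCOORD0) : COLOR {
        float4 c = tex2D(s0, tex) * weight[0];
        for (int i = 1; i < 3; i++) {
            c += tex2D(s0, tex + float2(offset[i] * px, 0)) * weight[i];
            c += tex2D(s0, tex - float2(offset[i] * px, 0)) * weight[i];
        }
        return c;
    }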

    For a heavier blur with pixel shaders, the Kawase (or dual-Kawase) method seems to be the way to go:
    https://software.intel.com/content/www/us/en/develop/blogs/an-investigation-of-fast-re...lgorithms.html
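    For illustration, a single Kawase pass (untested sketch): 4 diagonal taps at half-texel offsets are averaged, and the pass is repeated with an increasing iteration index i:
    Code:
    sampler s0;
    float4 p1 : register(c1);
    #define px (p1.x)
    #define py (p1.y)
    #define i 1. // iteration index, increased on each successive pass
    
    float4 main(float2 tex : TEXCOORD0) : COLOR {
        float2 d = float2(px, py) * (i + 0.5);
        return 0.25 * (tex2D(s0, tex + d)
                     + tex2D(s0, tex - d)
                     + tex2D(s0, tex + float2( d.x, -d.y))
                     + tex2D(s0, tex + float2(-d.x,  d.y)));
    }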

    Optimized box blur (3x) separable filter, radius parameter k, kernel size 4*k-1:
    (2*k texture lookups, 17 arithmetic operations) * 6 iterations, with downscaling/upscaling.
    Last edited by butterw; 7th Jul 2020 at 18:45. Reason: added intel link, boxblur info
  7. intro to glsl fragment shaders (mpv .hook, .glsl, .frag)

    Fragment shaders (.glsl) are the OpenGL equivalent of DirectX pixel shaders (.hlsl).
    A significant amount of open-source .glsl shader code is available, because fragment shaders are used on linux, on android and in WebGL.
    There are some differences between glsl and hlsl, but the syntax is similar and porting code between the two is typically possible.
    - ex, vector type names: vec4 instead of float4

    The cross-platform mpv video player supports several types of user pixel shaders (in the glsl-based mpv .hook format). There is currently only a limited library of effect shaders available for mpv however, probably because some adaptation work is required to turn existing glsl code into a .hook shader.
    More information on .hook shader syntax in the mpv manual: https://mpv.io/manual/master/#options-glsl-shader

    mpv processing pipeline:
    Video file, ex: yuv420 720p > gpu hw-dec >> LUMA, CHROMA source > NATIVE (resizable) >> yuv-to-rgb conversion >> MAIN (resizable) >> (LINEAR) > PREKERNEL >> scale >> OUTPUT > Screen, ex: rgb32, 1080p


    The following example is an Inversion pass applied to source LUMA. Tested with mpv v0.32 (Win10, gpu-api=d3d11):

    Code:
    //!HOOK LUMA
    //!BIND HOOKED
    //!DESC Invert Luma
    
    /* mpv: "./shaders/InvertLuma.hook" */
    vec4 hook() { // main function
        float luma = HOOKED_texOff(0).x;  
        return vec4(1.0 - luma); // return output pixel value
    }
    // available uniform parameters, updated every frame: "random" is a PRNG value in [0, 1.0], "frame" is an integer frame counter.

    Mpv shaders have fewer limitations than user shaders in mpc-hc/be:
    - selective pass execution (HOOK point, WHEN condition)
    - multipass shaders (a pass can use the output textures of previous passes), see the sketch after this list
    - source shaders (ex: a LUMA shader working on yuv420 straight from the video decoder)
    - chroma upsampling shaders
    - pre-scalers (custom high-quality resizers)
    - compute shaders (the code runs per thread, in workgroup blocks with shared memory).
    . . . . CS5.0 gpu limitations: number of threads (ex: 32*32=1024), 32KB of shared memory (ex: 88x88*float or 48x48*float3) !!!
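    A minimal two-pass sketch (untested) showing the //!SAVE, //!BIND and //!WHEN directives: a separable 3-tap mean blur, only applied when the source is wider than 1280 (//!WHEN uses reverse polish notation):
    Code:
    //!HOOK LUMA
    //!BIND HOOKED
    //!SAVE HBLUR
    //!DESC horizontal 3-tap mean (pass 1, saved to HBLUR)
    //!WHEN HOOKED.w 1280 >
    
    vec4 hook() {
        return (HOOKED_texOff(vec2(-1.0, 0.0))
              + HOOKED_texOff(vec2( 0.0, 0.0))
              + HOOKED_texOff(vec2( 1.0, 0.0))) / 3.0;
    }
    
    //!HOOK LUMA
    //!BIND HBLUR
    //!DESC vertical 3-tap mean (pass 2, reads the saved HBLUR texture)
    //!WHEN HOOKED.w 1280 >
    
    vec4 hook() {
        return (HBLUR_texOff(vec2(0.0, -1.0))
              + HBLUR_texOff(vec2(0.0,  0.0))
              + HBLUR_texOff(vec2(0.0,  1.0))) / 3.0;
    }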

    mpv features:
    - different configuration [profile] sections can be triggered based on input

    Limitations:
    - pre-scaler textures can be resized at the input or output of a pass (with the WIDTH and HEIGHT parameters; I assume using hardware linear sampling). The size of the output frame cannot be changed however (textures are resized by the selected built-in scaler to fit the frame).
    - shaders can read an embedded hex-encoded texture (ravu uses this for a 2D-LUT), but it looks more like a proof-of-concept than an actual feature. An external png texture would be much easier to use.

    mpv disadvantages:
    - steeper learning curve; it's not as easy to make changes (ex: there is no GUI to switch shader presets), and applying a shader (parameter) modification requires a player restart.
    - on windows, the compiler output is only displayed when there is an error in the shader (blue screen) and mpv was launched from the command line. It also doesn't report on performance (texture fetches, number of math ops).
    Last edited by butterw; 12th Nov 2020 at 16:42. Reason: +more details
  8. Compute Shader pass (mpv glsl .hook)

    To demonstrate compute shader workgroups, we color the block with ID (2, 1) of the output image blue.
    Each thread of a workgroup has local and global (integer vector) IDs and gets executed in parallel.
    As many workgroups as required are created to process the input frame.

    Code:
    //!HOOK MAIN
    //!BIND HOOKED
    //!COMPUTE 16 16
    //!DESC mpv Compute Example
    
    // The workgroup block size (x, y) is defined as 16*16 threads (//!COMPUTE 16 16),
    // with 1 thread per input pixel.
    #define WkgID gl_WorkGroupID
    #define LoID gl_LocalInvocationID
    #define GlobID gl_GlobalInvocationID
    #define Blue vec4(0., .5, 1., 1.)
    
    void hook() { // per thread execution, in workgroup blocks
        vec4 color = HOOKED_tex(HOOKED_pos); // read the input texel
        if (WkgID.x == 2 && WkgID.y == 1) color = Blue;
        ivec2 coords = ivec2(GlobID.xy); // global (x, y) ID of the pixel's thread, ex: (100, 240)
        imageStore(out_image, coords, color);
    }

    Compute shaders have a slightly different syntax vs fragment shaders (they don't return a pixel color value, but instead write to an output image) and threads have integer IDs. Compute shaders (CS) have more control over threads than fragment shaders, which leads to better performance in some applications. However, this also means they are more complex to write and optimize. The main added feature is fast per-workgroup shared memory, which is useful for convolution kernels for instance.

    CS in mpv:
    - input frame resolution / workgroup size = number of workgroups.
    . . . you choose the number of threads in the workgroup (it can be less than 1 thread per pixel)
    - input resolution vs output resolution:
    . . . it's possible to do multiple writes to the output image
    - workgroup shared memory must be initialized before it can be used.
    - the threads of a group are executed in parallel: thread concurrency must be handled (see the sketch below).
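    A minimal (untested) sketch of workgroup shared memory: each thread initializes one texel of a shared tile, a barrier() handles the concurrency, and the tile interior is then filtered from fast shared memory:
    Code:
    //!HOOK MAIN
    //!BIND HOOKED
    //!COMPUTE 16 16
    //!DESC shared memory tile example
    
    shared vec4 tile[16][16]; // visible to all 16*16 threads of the workgroup
    
    void hook() {
        ivec2 lid = ivec2(gl_LocalInvocationID.xy);
        tile[lid.y][lid.x] = HOOKED_tex(HOOKED_pos); // initialize: 1 texel per thread
        barrier(); // wait until every thread of the group has written its texel
        vec4 c = tile[lid.y][lid.x];
        if (lid.x > 0 && lid.x < 15 && lid.y > 0 && lid.y < 15) { // tile interior only
            c = vec4(0.);
            for (int j = -1; j <= 1; j++)
                for (int i = -1; i <= 1; i++)
                    c += tile[lid.y + j][lid.x + i]; // reads hit shared memory, not the texture
            c /= 9.;
        }
        imageStore(out_image, ivec2(gl_GlobalInvocationID.xy), c);
    }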
    Last edited by butterw; 12th Nov 2020 at 16:38. Reason: +more details


