VideoHelp Forum
  1. Ok, so I have the following code extracted from ESRGAN:
    Code:
    output = model(torch.from_numpy(np.transpose((cv2.imread(path, cv2.IMREAD_COLOR) * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    I'm looking to replace the cv2.imread() part with the VapourSynth clip.
    I'm having trouble doing so and was wondering if anyone knows how?
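    For reference, here is the same one-liner unrolled into steps with comments (my own annotation of what each call does, using the same names as above; not code from ESRGAN itself):
    Code:
    img = cv2.imread(path, cv2.IMREAD_COLOR)                 # H x W x 3 uint8, BGR order
    img = img * 1.0 / 255                                     # normalize to floats in [0, 1]
    img = img[:, :, [2, 1, 0]]                                # BGR -> RGB
    tensor = torch.from_numpy(np.transpose(img, (2, 0, 1)))   # HWC -> CHW
    tensor = tensor.float().unsqueeze(0).to(cuda)             # add batch dimension, move to device
    output = model(tensor).data.squeeze().float().cpu().clamp_(0, 1).numpy()  # CHW float numpy in [0, 1]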
    numpy does not work with the vs.VideoNode type of object (VapourSynth's clip), so you'd need to convert the VapourSynth clip into a numpy array.

    Take this as a starter; it could not be tested. Not sure what that * 1.0 / 255 is, 16-bit to 8-bit? ffms2 would load it as 8-bit with vs.RGB24.
    Code:
    import numpy as np
    import cv2
    import vapoursynth as vs
    
    path = r'\some_image.jpg'
    
    rgb_clip = vs.core.ffms2.Source(path, format=vs.RGB24, alpha=False) #1frame clip
    
    #number of planes , for rgb should be 3 anyway
    planes = rgb_clip.format.num_planes
    
    #making numpy array from vapoursynths videonode
    list_of_arrays = [np.array(rgb_clip.get_frame(0).get_read_array(i), copy=False) for i in range(planes)]
    numpy_array = np.dstack(list_of_arrays)
    
    #not sure what that *1/255 is, transcoding from 16bit to 8bit?
    #output = model(torch.from_numpy(np.transpose((cv2.imread(path, cv2.IMREAD_COLOR) * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    output = model(torch.from_numpy(np.transpose((numpy_array * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    Last edited by _Al_; 8th Jul 2019 at 14:46.
    Originally Posted by _Al_
    numpy does not work with the vs.VideoNode type of object (VapourSynth's clip), so you'd need to convert the VapourSynth clip into a numpy array.

    Take this as a starter; it could not be tested. Not sure what that * 1.0 / 255 is, 16-bit to 8-bit?
    Code:
    import numpy as np
    import cv2
    import vapoursynth as vs
    
    path = r'\some_image.jpg'
    
    rgb_clip = vs.core.ffms2.Source(path, format=vs.RGB24, alpha=False) #1frame clip
    
    #number of planes , for rgb should be 3 anyway
    planes = rgb_clip.format.num_planes
    
    #making numpy array from vapoursynths videonode
    list_of_arrays = [np.array(rgb_clip.get_frame(0).get_read_array(i), copy=False) for i in range(planes)]
    numpy_array = np.dstack(list_of_arrays)
    
    #not sure what that *1/255 is, transcoding from 16bit to 8bit? ffms would load it as 8bit (vs.RGB24)
    #output = model(torch.from_numpy(np.transpose((cv2.imread(path, cv2.IMREAD_COLOR) * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    output = model(torch.from_numpy(np.transpose(numpy_array[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    Great! I'll test that in a second. Also, is it possible to have this work for whatever the current frame is?
    You just make it into a function and pass it a frame number, the number of frames, whatever:
    Code:
    clip = vs.core.ffms2.Source(path_video , format = ....)
    for frame in range(0, len(clip)):
        call_your_function(frame, ....)
    That should run in a threaded function where you could set up a queue; if it's supposed to be for preview etc., you'd also need some timer going if it really is for a preview.
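    A minimal sketch of that queue idea, assuming some process_frame(clip, n) function that does the per-frame work (names are placeholders, untested):
    Code:
    import queue
    import threading
    
    frame_queue = queue.Queue()
    
    def worker(clip):
        while True:
            n = frame_queue.get()       # block until a frame number is requested
            if n is None:               # sentinel value to stop the worker
                break
            process_frame(clip, n)      # whatever per-frame work is needed
            frame_queue.task_done()
    
    threading.Thread(target=worker, args=(clip,), daemon=True).start()
    
    # the preview (or anything else) then only queues the frames it actually wants:
    frame_queue.put(42)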
    Originally Posted by _Al_
    You just make it into a function and pass it a frame number, the number of frames, whatever:
    Code:
    clip = vs.core.ffms2.Source(path_video , format = ....)
    for frame in range(0, len(clip)):
        call_your_function(frame, ....)
    That should run in a threaded function where you could set up a queue; if it's supposed to be for preview etc., you'd also need some timer going if it really is for a preview.
    Yeah, the goal is for it to work via the preview and not end up doing all the frames of the clip whenever I open the preview or change frames.
    Also, with the resulting "output", I need to return it back as an image; this is how I do so with cv2:
    Code:
    cv2.imwrite('[] OUTPUT/{:s}.png'.format(base), (np.transpose(output[[2, 1, 0], :, :], (1, 2, 0)) * 255.0).round())
    Not sure how I'm meant to do that without imwrite.
    Also, I believe where it says get_frame(0), that should be get_frame(frame), not 0?
    And when that's fixed, I get the following error:
    "Python exception: all the input array dimensions except for the concatenation axis must match exactly"
    Stack:
    Code:
    File "src\cython\vapoursynth.pyx", line 1942, in vapoursynth.vpy_evaluateScript
    File "src\cython\vapoursynth.pyx", line 1943, in vapoursynth.vpy_evaluateScript
    File "C:/Users/PRAGMA/VapourSynth/Scripts/Deinterlace.vpy", line 50, in 
    numpy_array = np.dstack([np.array(c.get_frame(0).get_read_array(i), copy=False) for i in range(planes)])
    File "C:\Program Files\Python37\lib\site-packages\numpy\lib\shape_base.py", line 699, in dstack
    return _nx.concatenate([atleast_3d(_m) for _m in tup], 2)
    ValueError: all the input array dimensions except for the concatenation axis must match exactly
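    (That error makes sense for a YUV 4:2:0 source: the chroma planes are half the luma plane's size in each dimension, so np.dstack cannot stack them with the luma plane. A quick illustration, assuming a 1920x1080 YUV420P8 clip:)
    Code:
    import numpy as np
    
    y = np.zeros((1080, 1920), np.uint8)  # plane 0 (Y), full size
    u = np.zeros((540, 960), np.uint8)    # plane 1 (U), half size in each dimension for 4:2:0
    np.dstack([y, u])                      # ValueError: all the input array dimensions ... must match exactly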
  8. This is my current code:
    Code:
    import vapoursynth as vs
    import functools
    core = vs.get_core()
    
    # --------------------------------------------------
    # CONFIGURATION
    # --------------------------------------------------
    SourcePath = r"C:\Users\PRAGMA\Videos\American Dad Stuff\S01E04 [framerate bit weird].mkv"
    c = core.ffms2.Source(SourcePath)
    
    # --------------------------------------------------
    # Tensor
    # --------------------------------------------------
    import cv2
    import numpy as np
    import torch
    import RRDBNet_arch as arch
    
    #craft the model
    cuda = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = arch.RRDBNet(3, 3, 64, 23, gc=32)
    model.load_state_dict(torch.load(r"H:\2; Models\ad 160k_tf.pth"), strict=True)
    model.eval()
    model = model.to(cuda)
    #send the frames through pytorch, through the model, through cuda
    for frame in range(0, len(c)):
    	#number of planes , for rgb should be 3 anyway
    	planes = c.format.num_planes
    	#making numpy array from vapoursynths videonode
    	numpy_array = np.dstack([np.array(c.get_frame(frame).get_read_array(i), copy=False) for i in range(planes)])
    	#run it against pytorch with the model
    	with torch.no_grad():
    		output = model(torch.from_numpy(np.transpose(numpy_array[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    	#transpose the result and write it to a file
    	cv2.imwrite('{}.png'.format(frame), (np.transpose(output[[2, 1, 0], :, :], (1, 2, 0)) * 255.0).round())
    
    #this still only writes files to disk; I'm not sure how to put the result back into the clip
    c.set_output()
    The clip is named "c" here just to keep the script short.
    RRDBNet_arch.py can be found here:
    https://github.com/xinntao/ESRGAN/blob/master/RRDBNet_arch.py
    Oh, it seems the RGB format and alpha=False mattered (RGB gives three equally sized planes, so the dstack error goes away).
    So that works. However, when you start a preview, it will run all the frames at once, which obviously isn't ideal :/
    I'm unaware how to proceed here, so if you can help I'd totally appreciate it! :L


    The result, though, has way broken color haha xd
    I fixed the color by simply doing reversed(range(planes)); it seems it expected them in reversed order.
    But it still requires me to convert to RGB, which seems to cause artifacts and worse quality :/
    Ok!
    So, to fix the color issues:
    Add back in the * 1.0 / 255 (it normalizes the 8-bit values to floats in the 0-1 range the model expects).
    Then we load via ffms2 in the original YUV420P8 and only after that convert to RGB24. No idea why, but it only seems to work by doing the following (even though there are many other ways):
    Code:
    c = core.resize.Point(c, format=vs.RGB24)
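    (A likely reason the plain call works here: ffms2 tags frames with a _Matrix property, so resize knows how to convert the YUV to RGB. Being explicit about the input matrix should also work and doesn't rely on the frame props; an untested guess:)
    Code:
    c = core.resize.Point(c, format=vs.RGB24, matrix_in_s="709")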
    Here's my script:
    Code:
    import vapoursynth as vs
    import functools
    import havsfunc
    core = vs.get_core()
    
    # --------------------------------------------------
    # CONFIGURATION
    # --------------------------------------------------
    SourcePath = r"C:\Users\PRAGMA\Videos\American Dad Stuff\S01E04 [framerate bit weird].mkv"
    
    # --------------------------------------------------
    # SOURCE
    # --------------------------------------------------
    c = core.ffms2.Source(SourcePath, alpha=False)
    c = core.resize.Point(c, format=vs.RGB24)
    
    # --------------------------------------------------
    # ESRGAN
    # --------------------------------------------------
    import cv2
    import numpy as np
    import torch
    import RRDBNet_arch as arch
    def ESRGAN(c, cuda, model, frame_num):
    	#making numpy array from vapoursynths videonode
    	numpy_array = np.dstack([np.array(c.get_frame(frame_num).get_read_array(i), copy=False) for i in reversed(range(c.format.num_planes))])
    	#run it against pytorch with the model
    	with torch.no_grad():
    		output = model(torch.from_numpy(np.transpose((numpy_array * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    	return (np.transpose(output[[2, 1, 0], :, :], (1, 2, 0)) * 255.0).round()
    
    #craft the model
    cuda = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = arch.RRDBNet(3, 3, 64, 23, gc=32)
    model.load_state_dict(torch.load(r"H:\2; Models\ad 160k_tf.pth"), strict=True)
    model.eval()
    model = model.to(cuda)
    #send the frames through pytorch, through the model, through cuda
    for frame_num in range(0, len(c)):
    	cv2.imwrite(str(frame_num) + ".png", ESRGAN(c, cuda, model, frame_num))
    
    c.set_output()
    I even went ahead and put it into a function as advised earlier, but I do not know how to "queue" it so it only does the requested frame.
    I also don't know how to get the result from ESRGAN's return and put it back into the clip in a format the clip expects (a VapourSynth VideoFrame?).

    For putting it back into the clip, maybe something like: clip[frame_num] = ConvertBackToVSVideoFrame(ESRGAN(...))
  14. Code:
    # --------------------------------------------------
    # ESRGAN
    # --------------------------------------------------
    import cv2
    import numpy as np
    import torch
    import RRDBNet_arch as arch
    
    #craft the model
    cuda = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = arch.RRDBNet(3, 3, 64, 23, gc=32)
    model.load_state_dict(torch.load(r"H:\2; Models\ad 160k_tf.pth"), strict=True)
    model.eval()
    model = model.to(cuda)
    
    def ESRGAN(n, c, cuda, model):
    	#making numpy array from vapoursynths videonode
    	numpy_array = np.dstack([np.array(c.get_frame(n).get_read_array(i), copy=False) for i in reversed(range(c.format.num_planes))])
    	#run it against pytorch with the model
    	with torch.no_grad():
    		output = model(torch.from_numpy(np.transpose((numpy_array * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    	cv2.imwrite("test.png", (np.transpose(output[[2, 1, 0], :, :], (1, 2, 0)) * 255.0).round())
    	return c
    
    c = core.std.FrameEval(c, functools.partial(ESRGAN, c=c, cuda=cuda, model=model))
    c.set_output()
    Got the per-frame processing working! FrameEval was the holy grail.

    Now I just need to figure out how to get the numpy array back into the clip :/
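    (For anyone reading along: the point of FrameEval is that the supplied function is only called for frames that are actually requested, and frame n of the clip it returns becomes output frame n. A toy illustration, untested:)
    Code:
    import functools
    
    def select(n, clip):
        print("evaluating frame", n)   # only printed when frame n is actually requested
        return clip                    # frame n of this returned clip is used as output frame n
    
    lazy = core.std.FrameEval(clip, functools.partial(select, clip=clip))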
    I really can't figure out what to do here.
    numpy works in BGR, VapourSynth has the planes lined up as RGB; I forgot that.

    So I assume that instead of returning c you need to return that new output as a vs.VideoNode, not just store images on disk? That is what cv2.imwrite does.

    That is why I dropped using numpy early on as well, because of that other conversion back to a VideoNode from numpy. I use numpy only for something "visual", something quick on screen, an effect etc., but it's never actually needed as the final outcome.

    There is someone who published code for this though; I was looking at that code a year ago. I have to find it again. I'll find it.
    I found it here: https://github.com/KotoriCANOE/MyTF/blob/master/utils/vshelper.py
    So, what is your output, float? # convert float32 np.ndarray to float32 vs.VideoNode

    Something like:
    Code:
    import vshelper
    .
    .
    #within that eval()
    out = vshelper.float32_vsclip(output, clip=c)
    return out
    .
    .
    c = core.std.FrameEval(c, functools.partial(ESRGAN, c=c, cuda=cuda, model=model))
    #conversion from float to YUV420P8 or whatever 
    c.set_output()
    That most likely would not work as-is; you might need to create a copy of that c clip first, new = c.copy(), within that eval, or just a one-frame copy.
    Last edited by _Al_; 8th Jul 2019 at 21:41.
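    (For context, here is roughly what a float32 ndarray -> RGBS clip helper like that does under the hood, using the standard BlankClip + ModifyFrame pattern; this is an illustrative sketch, not the actual vshelper code:)
    Code:
    import numpy as np
    import vapoursynth as vs
    core = vs.core
    
    def ndarray_to_rgbs_clip(array):
        #array: H x W x 3 float32 ndarray with values in 0..1
        h, w = array.shape[:2]
        blank = core.std.BlankClip(width=w, height=h, format=vs.RGBS, length=1)
        def write_frame(n, f):
            fout = f.copy()
            for p in range(fout.format.num_planes):
                #copy one plane of the ndarray into the writable frame plane
                np.copyto(np.asarray(fout.get_write_array(p)), array[:, :, p])
            return fout
        return core.std.ModifyFrame(blank, blank, write_frame)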
    This gives me OK output, using FrameEval() to change a vs.VideoNode to numpy and back:
    Code:
    import vapoursynth as vs
    from vapoursynth import core
    import numpy as np
    import functools
    import vshelper
    
    path = r'videofile.mp4'
    clip = core.ffms2.Source(path)
    clip = core.resize.Point(clip, format = vs.RGBS)  #float32 RGB, so all three planes are the same size and already in 0..1
    
    def to_numpy_and_back(n, clip):
       #read the three RGBS planes of frame n as numpy arrays and stack them into H x W x 3
       list_of_arrays = [np.array(clip.get_frame(n).get_read_array(i), copy=False) for i in range(3)]
       numpy_array = np.dstack(list_of_arrays)
       #wrap the float32 array back into a one-frame vs.VideoNode
       out = vshelper.float32_vsclip(numpy_array, clip=None)
       return out
    
    clip = core.std.FrameEval(clip, functools.partial(to_numpy_and_back, clip=clip))
    clip = core.resize.Point(clip, matrix_s = '709', format = vs.YUV420P8)  #back to YUV for output
    clip.set_output()
    So if there were real processing of the numpy image, the planes would need to be read in reverse order, then ShufflePlanes used to put them back in the correct order before clip.set_output().
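    (For reference, reversing the plane order of an RGB clip is a one-liner with std.ShufflePlanes:)
    Code:
    clip = core.std.ShufflePlanes(clip, planes=[2, 1, 0], colorfamily=vs.RGB)  #output plane 0 takes input plane 2, etc.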
    Originally Posted by _Al_
    So if there were real processing of the numpy image, the planes would need to be read in reverse order, then ShufflePlanes used to put them back in the correct order before clip.set_output().
    Thank you, the above worked; I had to edit it a bit to fix the planes, but it works.
    I put my resulting optimized code onto GitHub in case others are looking to do the same thing as I was:
    https://github.com/imPRAGMA/VSGAN
    Looking at that GitHub link, I guess it would work without reversing the plane order while creating the numpy array, and then without trying to shuffle it back to the original order.

    Sorry for the confusion on my part: you were using OpenCV for feedback, and it is actually OpenCV that works with BGR, not numpy; numpy is just a bunch of data.
    If there is no use of OpenCV in the middle, the plane order can stay the same through the whole flow.


