Ok, so I have the following code extracted from ESRGAN:
I'm looking to replace the cv2.imread() bit with the VapourSynth clip.
Code:
output = model(torch.from_numpy(np.transpose((cv2.imread(path, cv2.IMREAD_COLOR) * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
I'm having trouble doing so and was wondering if anyone knows how?
-
numpy does not work with the vs.VideoNode type of object (VapourSynth's clip), so you'd need to change the VapourSynth clip into a numpy array.
Take this as a starter, it could not be tested. Not sure what that * 1.0 / 255 is, 16-bit to 8-bit? ffms2 would load it as 8-bit with vs.RGB24.
Code:
import numpy as np
import cv2
import vapoursynth as vs

path = r'\some_image.jpg'
rgb_clip = vs.core.ffms2.Source(path, format=vs.RGB24, alpha=False)  # 1-frame clip

# number of planes, for RGB it should be 3 anyway
planes = rgb_clip.format.num_planes

# making a numpy array from VapourSynth's VideoNode
list_of_arrays = [np.array(rgb_clip.get_frame(0).get_read_array(i), copy=False) for i in range(planes)]
numpy_array = np.dstack(list_of_arrays)

# not sure what that * 1.0 / 255 is, transcoding from 16bit to 8bit?
# output = model(torch.from_numpy(np.transpose((cv2.imread(path, cv2.IMREAD_COLOR) * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
output = model(torch.from_numpy(np.transpose((numpy_array * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
Last edited by _Al_; 8th Jul 2019 at 14:46.
-
You just make it a function and pass it a frame, number of frames, whatever:
Code:
clip = vs.core.ffms2.Source(path_video, format=....)
for frame in range(0, len(clip)):
    call_your_function(frame, ....)
-
Also, with the resulting "output", I need to return that back as an image. This is how I do so with cv2:
Code:
cv2.imwrite('[] OUTPUT/{:s}.png'.format(base), (np.transpose(output[[2, 1, 0], :, :], (1, 2, 0)) * 255.0).round())
-
Also, I believe where it says get_frame(0), that should be get_frame(frame), not 0?
And when that's fixed, I get the following error:
"Python exception: all the input array dimensions except for the concatenation axis must match exactly"
Stack:
Code:File "src\cython\vapoursynth.pyx", line 1942, in vapoursynth.vpy_evaluateScript File "src\cython\vapoursynth.pyx", line 1943, in vapoursynth.vpy_evaluateScript File "C:/Users/PRAGMA/VapourSynth/Scripts/Deinterlace.vpy", line 50, in numpy_array = np.dstack([np.array(c.get_frame(0).get_read_array(i), copy=False) for i in range(planes)]) File "C:\Program Files\Python37\lib\site-packages\numpy\lib\shape_base.py", line 699, in dstack return _nx.concatenate([atleast_3d(_m) for _m in tup], 2) ValueError: all the input array dimensions except for the concatenation axis must match exactly
-
This is my current code:
Code:
import vapoursynth as vs
import functools
core = vs.get_core()

# --------------------------------------------------
# CONFIGURATION
# --------------------------------------------------
SourcePath = r"C:\Users\PRAGMA\Videos\American Dad Stuff\S01E04 [framerate bit weird].mkv"
c = core.ffms2.Source(SourcePath)

# --------------------------------------------------
# Tensor
# --------------------------------------------------
import cv2
import numpy as np
import torch
import RRDBNet_arch as arch

# craft the model
cuda = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = arch.RRDBNet(3, 3, 64, 23, gc=32)
model.load_state_dict(torch.load(r"H:\2; Models\ad 160k_tf.pth"), strict=True)
model.eval()
model = model.to(cuda)

# send the frames through pytorch, through the model, through cuda
for frame in range(0, len(c)):
    # number of planes, for rgb should be 3 anyway
    planes = c.format.num_planes
    # making numpy array from vapoursynth's videonode
    numpy_array = np.dstack([np.array(c.get_frame(frame).get_read_array(i), copy=False) for i in range(planes)])
    # run it against pytorch with the model
    with torch.no_grad():
        output = model(torch.from_numpy(np.transpose(numpy_array[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    # transpose result, and write it to a file
    cv2.imwrite('{:s}.png'.format(base), (np.transpose(output[[2, 1, 0], :, :], (1, 2, 0)) * 255.0).round())
    # still wont have the resulting files, as the line above, im not sure how to appropriate with clip

c.set_output()
RRDBNet_arch.py can be gotten here:
https://github.com/xinntao/ESRGAN/blob/master/RRDBNet_arch.py -
So that works; however, when you start a preview, it will run all frames at once, which obviously isn't ideal :/
I'm unaware how to proceed here, so if you can help I'd totally appreciate it! :L -
I fixed the color by simply doing reversed(range(planes)); it seems it expected them in reversed order.
But it still requires me to convert to RGB, which seems to cause artifacts and worse quality :/ -
Ok!
So, to fix the color issues:
Add back in the * 1.0 / 255 (it seems to be the float conversion: scaling the 8-bit 0-255 values to the 0.0-1.0 range the model expects, which the * 255.0 on output undoes).
Then, we load via ffms2 in the original YUV420P8 and after that convert to RGB24. No idea why, but it only seems to work by doing the following (even though there are many other ways):
Code:
c = core.resize.Point(c, format=vs.RGB24)
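(Side note, untested: resize.Point is nearest-neighbour, so the 4:2:0 chroma gets upsampled blockily on the way to RGB, which may be part of the artifacts mentioned earlier. A smoother resizer is a possible alternative; the matrix name here is an assumption about the source.)
Code:
# hypothetical alternative to the Point conversion above; Bicubic interpolates
# the subsampled chroma instead of just duplicating pixels
c = core.resize.Bicubic(c, format=vs.RGB24, matrix_in_s='709')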
Code:
import vapoursynth as vs
import functools
import havsfunc
core = vs.get_core()

# --------------------------------------------------
# CONFIGURATION
# --------------------------------------------------
SourcePath = r"C:\Users\PRAGMA\Videos\American Dad Stuff\S01E04 [framerate bit weird].mkv"

# --------------------------------------------------
# SOURCE
# --------------------------------------------------
c = core.ffms2.Source(SourcePath, alpha=False)
c = core.resize.Point(c, format=vs.RGB24)

# --------------------------------------------------
# ESRGAN
# --------------------------------------------------
import cv2
import numpy as np
import torch
import RRDBNet_arch as arch

def ESRGAN(c, cuda, model, frame_num):
    # making numpy array from vapoursynth's videonode
    numpy_array = np.dstack([np.array(c.get_frame(frame_num).get_read_array(i), copy=False) for i in reversed(range(c.format.num_planes))])
    # run it against pytorch with the model
    with torch.no_grad():
        output = model(torch.from_numpy(np.transpose((numpy_array * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    return (np.transpose(output[[2, 1, 0], :, :], (1, 2, 0)) * 255.0).round()

# craft the model
cuda = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = arch.RRDBNet(3, 3, 64, 23, gc=32)
model.load_state_dict(torch.load(r"H:\2; Models\ad 160k_tf.pth"), strict=True)
model.eval()
model = model.to(cuda)

# send the frames through pytorch, through the model, through cuda
for frame_num in range(0, len(c)):
    cv2.imwrite(str(frame_num) + ".png", ESRGAN(c, cuda, model, frame_num))

c.set_output()
I also don't know how to get the result from ESRGAN's return and put it back into the clip in a format the clip expects (a vapoursynth.VideoFrame?).
For putting it back into the clip, maybe something like: clip[frame_num] = ConvertBackToVSVideoFrame(ESRGAN(...)) -
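ConvertBackToVSVideoFrame is just a made-up name, but for the record, one real pattern for writing a numpy array back into frames (a rough, untested sketch, not what this thread ends up using) is std.ModifyFrame, copying each channel into the frame's write arrays. run_esrgan is a hypothetical helper here, and for an upscaler the destination clip would first have to be created at the output resolution (e.g. with std.BlankClip), since the planes must match in size:
Code:
import numpy as np
import vapoursynth as vs

core = vs.get_core()

def write_array_to_frame(n, f):
    # run_esrgan(n) is assumed to return an HxWx3 uint8 numpy array that
    # matches this frame's dimensions and format (RGB24 here)
    arr = run_esrgan(n)
    fout = f.copy()  # writable copy of the source frame
    for i in range(fout.format.num_planes):
        # copy one channel of the array into the corresponding plane
        np.asarray(fout.get_write_array(i))[:] = arr[:, :, i]
    return fout

# c is the RGB24 clip from the script above
c = core.std.ModifyFrame(clip=c, clips=c, selector=write_array_to_frame)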
Code:
# --------------------------------------------------
# ESRGAN
# --------------------------------------------------
import cv2
import numpy as np
import torch
import RRDBNet_arch as arch

# craft the model
cuda = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = arch.RRDBNet(3, 3, 64, 23, gc=32)
model.load_state_dict(torch.load(r"H:\2; Models\ad 160k_tf.pth"), strict=True)
model.eval()
model = model.to(cuda)

def ESRGAN(n, c, cuda, model):
    # making numpy array from vapoursynth's videonode
    numpy_array = np.dstack([np.array(c.get_frame(n).get_read_array(i), copy=False) for i in reversed(range(c.format.num_planes))])
    # run it against pytorch with the model
    with torch.no_grad():
        output = model(torch.from_numpy(np.transpose((numpy_array * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    cv2.imwrite("test.png", (np.transpose(output[[2, 1, 0], :, :], (1, 2, 0)) * 255.0).round())
    return c

c = core.std.FrameEval(c, functools.partial(ESRGAN, c=c, cuda=cuda, model=model))
c.set_output()
FrameEval was the holy grail; it only runs the function when a frame is actually requested, instead of processing every frame up front when the script loads.
Now I just need to figure out how to get the numpy array back into a clip :/ -
numpy works in BGR, VapourSynth has the planes lined up as RGB, I forgot that.
So I assume instead of returning c you need to return that new output as a vs.VideoNode, besides just storing images on disk? That is what cv2.imwrite does.
That is why I dropped using numpy early on as well, because of that other conversion back to a VideoNode from numpy. I use numpy only for something "visual", something quick on screen, an effect etc., but it is actually never needed as the outcome.
There is someone who published code for this though. I was looking at that code a year ago; I'll have to find it again. I'll find it. -
I found it here, https://github.com/KotoriCANOE/MyTF/blob/master/utils/vshelper.py
So, what is your output, float? # convert float32 np.ndarray to float32 vs.VideoNode
Something like:
Code:
import vshelper
.
.
# within that eval()
out = vshelper.float32_vsclip(output, clip=c)
return out
.
.
c = core.std.FrameEval(c, functools.partial(ESRGAN, c=c, cuda=cuda, model=model))
# conversion from float to YUV420P8 or whatever
c.set_output()
Last edited by _Al_; 8th Jul 2019 at 21:41.
-
This would give me OK output, using FrameEval() to change vs.VideoNode to numpy and back:
Code:
import vapoursynth as vs
from vapoursynth import core
import numpy as np
import functools
import vshelper

path = r'videofile.mp4'
clip = core.ffms2.Source(path)
clip = core.resize.Point(clip, format=vs.RGBS)

def to_numpy_and_back(n, clip):
    list_of_arrays = [np.array(clip.get_frame(n).get_read_array(i), copy=False) for i in range(3)]
    numpy_array = np.dstack(list_of_arrays)
    out = vshelper.float32_vsclip(numpy_array, clip=None)
    return out

clip = core.std.FrameEval(clip, functools.partial(to_numpy_and_back, clip=clip))
clip = core.resize.Point(clip, matrix_s='709', format=vs.YUV420P8)
clip.set_output()
-
So if there was real processing of the numpy image, the planes would need to be read in reverse order and then shuffled back to the correct order before clip.set_output().
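If the plane order does end up reversed, something like this (a small sketch, assuming clip is an RGB clip) swaps it back:
Code:
# output plane 0 takes input plane 2 and vice versa, i.e. R and B are swapped back
clip = core.std.ShufflePlanes(clip, planes=[2, 1, 0], colorfamily=vs.RGB)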
-
Thank you, the above worked. I had to edit it a bit to fix the planes, but it works.
I put my resulting optimized code onto GitHub in case others are looking to do the same thing as I am:
https://github.com/imPRAGMA/VSGAN -
Looking at that GitHub link, I guess it would work without reversing the plane order while creating the numpy array, and then not trying to shuffle it back to the original order.
Sorry for the confusion on my part: you were using OpenCV for feedback, and it is actually OpenCV that works with BGR, not numpy; numpy is just a bunch of data.
So if there is no use for OpenCV in the middle, the plane order could stay the same through the whole flow.
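To put that last point as code (a tiny sketch with stand-in data): the channel flip only matters at the OpenCV boundary, because cv2 interprets the last axis as BGR while everything else in the chain is just data.
Code:
import cv2
import numpy as np

rgb_array = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)  # stand-in HxWx3 RGB data
cv2.imwrite("frame.png", rgb_array[:, :, ::-1])  # flip RGB -> BGR only for cv2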