VideoHelp Forum
  1. Ok, so I have the following code extracted from ESRGAN:
    Code:
    output = model(torch.from_numpy(np.transpose((cv2.imread(path, cv2.IMREAD_COLOR) * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    I'm looking to replace the cv2.imread() part with the VapourSynth clip.
    I'm having trouble doing so and was wondering if anyone knows how?
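    For reference, here is the same one-liner unrolled into steps with comments (my own annotation of what each call does, using the same names as above; not code from ESRGAN itself):
    Code:
    img = cv2.imread(path, cv2.IMREAD_COLOR)                 # H x W x 3 uint8, BGR order
    img = img * 1.0 / 255                                     # normalize to floats in [0, 1]
    img = img[:, :, [2, 1, 0]]                                # BGR -> RGB
    tensor = torch.from_numpy(np.transpose(img, (2, 0, 1)))   # HWC -> CHW
    tensor = tensor.float().unsqueeze(0).to(cuda)             # add batch dimension, move to device
    output = model(tensor).data.squeeze().float().cpu().clamp_(0, 1).numpy()  # CHW float numpy in [0, 1]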
    numpy does not work with the vs.VideoNode type of object (VapourSynth's clip), so you'd need to convert the VapourSynth clip into a numpy array.

    Take this as a starter; it could not be tested. Not sure what that * 1.0 / 255 is, 16-bit to 8-bit? ffms2 would load it as 8-bit with vs.RGB24.
    Code:
    import numpy as np
    import cv2
    import vapoursynth as vs
    
    path = r'\some_image.jpg'
    
    rgb_clip = vs.core.ffms2.Source(path, format=vs.RGB24, alpha=False) #1frame clip
    
    #number of planes , for rgb should be 3 anyway
    planes = rgb_clip.format.num_planes
    
    #making numpy array from vapoursynths videonode
    list_of_arrays = [np.array(rgb_clip.get_frame(0).get_read_array(i), copy=False) for i in range(planes)]
    numpy_array = np.dstack(list_of_arrays)
    
    #not sure what that *1/255 is, transcoding from 16bit to 8bit?
    #output = model(torch.from_numpy(np.transpose((cv2.imread(path, cv2.IMREAD_COLOR) * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    output = model(torch.from_numpy(np.transpose((numpy_array * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    Last edited by _Al_; 8th Jul 2019 at 14:46.
    Originally Posted by _Al_
    numpy does not work with the vs.VideoNode type of object (VapourSynth's clip), so you'd need to convert the VapourSynth clip into a numpy array.

    Take this as a starter; it could not be tested. Not sure what that * 1.0 / 255 is, 16-bit to 8-bit?
    Code:
    import numpy as np
    import cv2
    import vapoursynth as vs
    
    path = r'\some_image.jpg'
    
    rgb_clip = vs.core.ffms2.Source(path, format=vs.RGB24, alpha=False) #1frame clip
    
    #number of planes , for rgb should be 3 anyway
    planes = rgb_clip.format.num_planes
    
    #making numpy array from vapoursynths videonode
    list_of_arrays = [np.array(rgb_clip.get_frame(0).get_read_array(i), copy=False) for i in range(planes)]
    numpy_array = np.dstack(list_of_arrays)
    
    #not sure what that *1/255 is, transcoding from 16bit to 8bit? ffms would load it as 8bit (vs.RGB24)
    #output = model(torch.from_numpy(np.transpose((cv2.imread(path, cv2.IMREAD_COLOR) * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    output = model(torch.from_numpy(np.transpose(numpy_array[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    Great! I'll test that in a second. Also, is it possible to have this work for whatever the current frame is?
    You just make it into a function and pass it a frame number, the number of frames, whatever:
    Code:
    clip = vs.core.ffms2.Source(path_video , format = ....)
    for frame in range(0, len(clip)):
        call_your_function(frame, ....)
    That should run in a threaded function where you could set up a queue; if it's supposed to be for preview etc., you'd also need some timer going if it really is for a preview.
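    A minimal sketch of that queue idea, assuming some process_frame(clip, n) function that does the per-frame work (names are placeholders, untested):
    Code:
    import queue
    import threading
    
    frame_queue = queue.Queue()
    
    def worker(clip):
        while True:
            n = frame_queue.get()       # block until a frame number is requested
            if n is None:               # sentinel value to stop the worker
                break
            process_frame(clip, n)      # whatever per-frame work is needed
            frame_queue.task_done()
    
    threading.Thread(target=worker, args=(clip,), daemon=True).start()
    
    # the preview (or anything else) then only queues the frames it actually wants:
    frame_queue.put(42)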
    Originally Posted by _Al_
    You just make it into a function and pass it a frame number, the number of frames, whatever:
    Code:
    clip = vs.core.ffms2.Source(path_video , format = ....)
    for frame in range(0, len(clip)):
        call_your_function(frame, ....)
    That should run in a threaded function where you could set up a queue; if it's supposed to be for preview etc., you'd also need some timer going if it really is for a preview.
    Yeah, the goal is for it to work via the preview and not end up doing all the frames of the clip whenever I open the preview or change frames.
    Also, with the resulting "output", I need to return it back as an image; this is how I do so with cv2:
    Code:
    cv2.imwrite('[] OUTPUT/{:s}.png'.format(base), (np.transpose(output[[2, 1, 0], :, :], (1, 2, 0)) * 255.0).round())
    Not sure how I'm meant to do that without imwrite.
    Also, I believe where it says get_frame(0), that should be get_frame(frame), not 0?
    And when that's fixed, I get the following error:
    "Python exception: all the input array dimensions except for the concatenation axis must match exactly"
    Stack:
    Code:
    File "src\cython\vapoursynth.pyx", line 1942, in vapoursynth.vpy_evaluateScript
    File "src\cython\vapoursynth.pyx", line 1943, in vapoursynth.vpy_evaluateScript
    File "C:/Users/PRAGMA/VapourSynth/Scripts/Deinterlace.vpy", line 50, in 
    numpy_array = np.dstack([np.array(c.get_frame(0).get_read_array(i), copy=False) for i in range(planes)])
    File "C:\Program Files\Python37\lib\site-packages\numpy\lib\shape_base.py", line 699, in dstack
    return _nx.concatenate([atleast_3d(_m) for _m in tup], 2)
    ValueError: all the input array dimensions except for the concatenation axis must match exactly
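    (That error makes sense for a YUV 4:2:0 source: the chroma planes are half the luma plane's size in each dimension, so np.dstack cannot stack them with the luma plane. A quick illustration, assuming a 1920x1080 YUV420P8 clip:)
    Code:
    import numpy as np
    
    y = np.zeros((1080, 1920), np.uint8)  # plane 0 (Y), full size
    u = np.zeros((540, 960), np.uint8)    # plane 1 (U), half size in each dimension for 4:2:0
    np.dstack([y, u])                      # ValueError: all the input array dimensions ... must match exactly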
  8. This is my current code:
    Code:
    import vapoursynth as vs
    import functools
    core = vs.get_core()
    
    # --------------------------------------------------
    # CONFIGURATION
    # --------------------------------------------------
    SourcePath = r"C:\Users\PRAGMA\Videos\American Dad Stuff\S01E04 [framerate bit weird].mkv"
    c = core.ffms2.Source(SourcePath)
    
    # --------------------------------------------------
    # Tensor
    # --------------------------------------------------
    import cv2
    import numpy as np
    import torch
    import RRDBNet_arch as arch
    
    #craft the model
    cuda = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = arch.RRDBNet(3, 3, 64, 23, gc=32)
    model.load_state_dict(torch.load(r"H:\2; Models\ad 160k_tf.pth"), strict=True)
    model.eval()
    model = model.to(cuda)
    #send the frames through pytorch, through the model, through cuda
    for frame in range(0, len(c)):
    	#number of planes , for rgb should be 3 anyway
    	planes = c.format.num_planes
    	#making numpy array from vapoursynths videonode
    	numpy_array = np.dstack([np.array(c.get_frame(frame).get_read_array(i), copy=False) for i in range(planes)])
    	#run it against pytorch with the model
    	with torch.no_grad():
    		output = model(torch.from_numpy(np.transpose(numpy_array[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    	#transpose the result and write it to a file
    	cv2.imwrite('{}.png'.format(frame), (np.transpose(output[[2, 1, 0], :, :], (1, 2, 0)) * 255.0).round())
    
    #this still only writes files to disk; I'm not sure how to put the result back into the clip
    c.set_output()
    The clip is named "c" here just to keep the script short.
    RRDBNet_arch.py can be found here:
    https://github.com/xinntao/ESRGAN/blob/master/RRDBNet_arch.py
    Oh, it seems the RGB format and alpha=False mattered (RGB gives three equally sized planes, so the dstack error goes away).
    So that works. However, when you start a preview, it will run all the frames at once, which obviously isn't ideal :/
    I'm unaware how to proceed here, so if you can help I'd totally appreciate it! :L


    The result, though, has way broken color haha xd
    I fixed the color by simply doing reversed(range(planes)); it seems it expected them in reversed order.
    But it still requires me to convert to RGB, which seems to cause artifacts and worse quality :/
    Ok!
    So, to fix the color issues:
    Add back in the * 1.0 / 255 (it normalizes the 8-bit values to floats in the 0-1 range the model expects).
    Then we load via ffms2 in the original YUV420P8 and only after that convert to RGB24. No idea why, but it only seems to work by doing the following (even though there are many other ways):
    Code:
    c = core.resize.Point(c, format=vs.RGB24)
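    (A likely reason the plain call works here: ffms2 tags frames with a _Matrix property, so resize knows how to convert the YUV to RGB. Being explicit about the input matrix should also work and doesn't rely on the frame props; an untested guess:)
    Code:
    c = core.resize.Point(c, format=vs.RGB24, matrix_in_s="709")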
    Here's my script:
    Code:
    import vapoursynth as vs
    import functools
    import havsfunc
    core = vs.get_core()
    
    # --------------------------------------------------
    # CONFIGURATION
    # --------------------------------------------------
    SourcePath = r"C:\Users\PRAGMA\Videos\American Dad Stuff\S01E04 [framerate bit weird].mkv"
    
    # --------------------------------------------------
    # SOURCE
    # --------------------------------------------------
    c = core.ffms2.Source(SourcePath, alpha=False)
    c = core.resize.Point(c, format=vs.RGB24)
    
    # --------------------------------------------------
    # ESRGAN
    # --------------------------------------------------
    import cv2
    import numpy as np
    import torch
    import RRDBNet_arch as arch
    def ESRGAN(c, cuda, model, frame_num):
    	#making numpy array from vapoursynths videonode
    	numpy_array = np.dstack([np.array(c.get_frame(frame_num).get_read_array(i), copy=False) for i in reversed(range(c.format.num_planes))])
    	#run it against pytorch with the model
    	with torch.no_grad():
    		output = model(torch.from_numpy(np.transpose((numpy_array * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    	return (np.transpose(output[[2, 1, 0], :, :], (1, 2, 0)) * 255.0).round()
    
    #craft the model
    cuda = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = arch.RRDBNet(3, 3, 64, 23, gc=32)
    model.load_state_dict(torch.load(r"H:\2; Models\ad 160k_tf.pth"), strict=True)
    model.eval()
    model = model.to(cuda)
    #send the frames through pytorch, through the model, through cuda
    for frame_num in range(0, len(c)):
    	cv2.imwrite(str(frame_num) + ".png", ESRGAN(c, cuda, model, frame_num))
    
    c.set_output()
    I even went ahead and put it into a function as advised earlier, but I do not know how to "queue" it so it only does the requested frame.
    I also don't know how to get the result from ESRGAN's return and put it back into the clip in a format the clip expects (a VapourSynth VideoFrame?).

    For putting it back into the clip, maybe something like: clip[frame_num] = ConvertBackToVSVideoFrame(ESRGAN(...))
  14. Code:
    # --------------------------------------------------
    # ESRGAN
    # --------------------------------------------------
    import cv2
    import numpy as np
    import torch
    import RRDBNet_arch as arch
    
    #craft the model
    cuda = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = arch.RRDBNet(3, 3, 64, 23, gc=32)
    model.load_state_dict(torch.load(r"H:\2; Models\ad 160k_tf.pth"), strict=True)
    model.eval()
    model = model.to(cuda)
    
    def ESRGAN(n, c, cuda, model):
    	#making numpy array from vapoursynths videonode
    	numpy_array = np.dstack([np.array(c.get_frame(n).get_read_array(i), copy=False) for i in reversed(range(c.format.num_planes))])
    	#run it against pytorch with the model
    	with torch.no_grad():
    		output = model(torch.from_numpy(np.transpose((numpy_array * 1.0 / 255)[:, :, [2, 1, 0]], (2, 0, 1))).float().unsqueeze(0).to(cuda)).data.squeeze().float().cpu().clamp_(0, 1).numpy()
    	cv2.imwrite("test.png", (np.transpose(output[[2, 1, 0], :, :], (1, 2, 0)) * 255.0).round())
    	return c
    
    c = core.std.FrameEval(c, functools.partial(ESRGAN, c=c, cuda=cuda, model=model))
    c.set_output()
    Got the per-frame processing working! FrameEval was the holy grail.

    Now I just need to figure out how to get the numpy array back into the clip :/
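    (For anyone reading along: the point of FrameEval is that the supplied function is only called for frames that are actually requested, and frame n of the clip it returns becomes output frame n. A toy illustration, untested:)
    Code:
    import functools
    
    def select(n, clip):
        print("evaluating frame", n)   # only printed when frame n is actually requested
        return clip                    # frame n of this returned clip is used as output frame n
    
    lazy = core.std.FrameEval(clip, functools.partial(select, clip=clip))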
    I really can't figure out what to do here.
    numpy works in BGR, VapourSynth has the planes lined up as RGB; I forgot that.

    So I assume that instead of returning c you need to return that new output as a vs.VideoNode, not just store images on disk? That is what cv2.imwrite does.

    That is why I dropped using numpy early on as well, because of that other conversion back to a VideoNode from numpy. I use numpy only for something "visual", something quick on screen, an effect etc., but it's never actually needed as the final outcome.

    There is someone who published code for this though; I was looking at that code a year ago. I have to find it again. I'll find it.
    I found it here: https://github.com/KotoriCANOE/MyTF/blob/master/utils/vshelper.py
    So, what is your output, float? # convert float32 np.ndarray to float32 vs.VideoNode

    Something like:
    Code:
    import vshelper
    .
    .
    #within that eval()
    out = vshelper.float32_vsclip(output, clip=c)
    return out
    .
    .
    c = core.std.FrameEval(c, functools.partial(ESRGAN, c=c, cuda=cuda, model=model))
    #conversion from float to YUV420P8 or whatever 
    c.set_output()
    That most likely would not work as-is; you might need to create a copy of that c clip first, new = c.copy(), within that eval, or just a one-frame copy.
    Last edited by _Al_; 8th Jul 2019 at 21:41.
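    (For context, here is roughly what a float32 ndarray -> RGBS clip helper like that does under the hood, using the standard BlankClip + ModifyFrame pattern; this is an illustrative sketch, not the actual vshelper code:)
    Code:
    import numpy as np
    import vapoursynth as vs
    core = vs.core
    
    def ndarray_to_rgbs_clip(array):
        #array: H x W x 3 float32 ndarray with values in 0..1
        h, w = array.shape[:2]
        blank = core.std.BlankClip(width=w, height=h, format=vs.RGBS, length=1)
        def write_frame(n, f):
            fout = f.copy()
            for p in range(fout.format.num_planes):
                #copy one plane of the ndarray into the writable frame plane
                np.copyto(np.asarray(fout.get_write_array(p)), array[:, :, p])
            return fout
        return core.std.ModifyFrame(blank, blank, write_frame)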
    This gives me OK output, using FrameEval() to change a vs.VideoNode to numpy and back:
    Code:
    import vapoursynth as vs
    from vapoursynth import core
    import numpy as np
    import functools
    import vshelper
    
    path = r'videofile.mp4'
    clip = core.ffms2.Source(path)
    clip = core.resize.Point(clip, format = vs.RGBS)  #float32 RGB, so all three planes are the same size and already in 0..1
    
    def to_numpy_and_back(n, clip):
       #read the three RGBS planes of frame n as numpy arrays and stack them into H x W x 3
       list_of_arrays = [np.array(clip.get_frame(n).get_read_array(i), copy=False) for i in range(3)]
       numpy_array = np.dstack(list_of_arrays)
       #wrap the float32 array back into a one-frame vs.VideoNode
       out = vshelper.float32_vsclip(numpy_array, clip=None)
       return out
    
    clip = core.std.FrameEval(clip, functools.partial(to_numpy_and_back, clip=clip))
    clip = core.resize.Point(clip, matrix_s = '709', format = vs.YUV420P8)  #back to YUV for output
    clip.set_output()
    So if there were real processing of the numpy image, the planes would need to be read in reverse order, then ShufflePlanes used to put them back in the correct order before clip.set_output().
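    (For reference, reversing the plane order of an RGB clip is a one-liner with std.ShufflePlanes:)
    Code:
    clip = core.std.ShufflePlanes(clip, planes=[2, 1, 0], colorfamily=vs.RGB)  #output plane 0 takes input plane 2, etc.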
    Originally Posted by _Al_
    So if there were real processing of the numpy image, the planes would need to be read in reverse order, then ShufflePlanes used to put them back in the correct order before clip.set_output().
    Thank you, the above worked; I had to edit it a bit to fix the planes, but it works.
    I put my resulting optimized code onto GitHub in case others are looking to do the same thing as I was:
    https://github.com/imPRAGMA/VSGAN
    Looking at that GitHub link, I guess it would work without reversing the plane order while creating the numpy array, and then without trying to shuffle it back to the original order.

    Sorry for the confusion on my part: you were using OpenCV for feedback, and it is actually OpenCV that works with BGR, not numpy; numpy is just a bunch of data.
    If there is no use of OpenCV in the middle, the plane order can stay the same through the whole flow.


