I have a Blackmagic UltraStudio SDI. It captures video in one of these formats:

AVI 8-bit YUV
Code:
Format : YUV
Codec ID : UYVY
Codec ID/Info : Uncompressed 16bpp. YUV 4:2:2 (Y sample at every pixel, U and V sampled at every second pixel horizontally on each line). A macropixel contains 2 pixels in 1 u_int32.
Width : 1 280 pixels
Height : 720 pixels
Display aspect ratio : 16:9
Frame rate : 59.940 fps
Color space : YUV
Chroma subsampling : 4:2:2
Compression mode : Lossless
Bits/(Pixel*Frame) : 16.000

AVI 10-bit YUV
Code:
Format : YUV
Codec ID : v210
Codec ID/Hint : AJA Video Systems Xena
Bit rate : 1 193 Mbps
Width : 1 280 pixels
Height : 720 pixels
Display aspect ratio : 16:9
Frame rate : 59.940 fps
Color space : YUV
Chroma subsampling : 4:2:2
Bit depth : 10 bits
Compression mode : Lossless

AVI 10-bit RGB
Code:
Format : r210
Codec ID : r210
Bit rate : 1 768 Mbps
Width : 1 280 pixels
Height : 720 pixels
Display aspect ratio : 16:9
Frame rate : 59.940 fps
Bits/(Pixel*Frame) : 32.000

It's not practical to store and work with such video files (more than 300 GB for 1 hour), so I'm looking for the right way to compress them without visible loss of quality. FFmpeg works great, but I'd like to use the Nvidia graphics card (K600) instead of the CPU. FFmpeg with nvenc enabled works excellently for non-raw video files: up to 5x faster than the CPU.

I'm trying to find the right parameters for ffmpeg with nvenc to use with any of the listed formats (raw video). Something like this (this example produces only interference):
Code:
ffmpeg_nvenc.exe -f rawvideo -vcodec rawvideo -s 1280x720 -r 59.94 -pix_fmt yuv422p10le -i inputfile.avi -b:v 5000k -an -vcodec nvenc_h264 outfile.mp4

If I just use
Code:
ffmpeg inputfile.avi -b 5000k -an -vcodec nvenc_h264 output.mp4
then I get an error:
No NVENC capable devices found. Error while opening encoder for output stream - maybe incorrect parameters such as bit_rate, rate, width or height.

Using
Code:
ffmpeg -vcodec h264
works fine, but not as fast as the GPU can.
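To put the three capture formats in perspective, here is a rough Python sketch of the arithmetic behind the bit rates MediaInfo reports. It assumes 59.94 fps means 60000/1001, that v210 packs 6 pixels into 16 bytes with rows padded to 128 bytes, and it ignores any row padding for UYVY and r210 (which should not matter at a width of 1280). The function names are mine, not part of any tool.

```python
from fractions import Fraction

WIDTH, HEIGHT = 1280, 720
FPS = Fraction(60000, 1001)  # assumption: "59.940 fps" is the NTSC-style 60000/1001


def row_bytes(fmt: str, width: int) -> int:
    """Bytes per picture row for the three capture formats (padding simplified)."""
    if fmt == "UYVY":   # 8-bit 4:2:2, 2 bytes per pixel
        return width * 2
    if fmt == "v210":   # 10-bit 4:2:2, 6 pixels per 16 bytes, rows padded to 128 bytes
        return ((width + 47) // 48) * 128
    if fmt == "r210":   # 10-bit RGB, 4 bytes per pixel
        return width * 4
    raise ValueError(f"unknown format: {fmt}")


def bitrate_mbps(fmt: str, width: int = WIDTH, height: int = HEIGHT, fps=FPS) -> float:
    """Uncompressed video bit rate in Mbps (10^6 bits per second)."""
    return float(row_bytes(fmt, width) * height * 8 * fps) / 1e6


def gb_per_hour(fmt: str) -> float:
    """Storage needed for one hour of video, in GB (10^9 bytes), video only."""
    return bitrate_mbps(fmt) * 1e6 / 8 * 3600 / 1e9


for fmt in ("UYVY", "v210", "r210"):
    print(f"{fmt}: {bitrate_mbps(fmt):.0f} Mbps, {gb_per_hour(fmt):.0f} GB/hour")
```

The v210 and r210 figures line up with the 1 193 Mbps and 1 768 Mbps that MediaInfo shows above, and even the smallest format is well over the 300 GB per hour mentioned.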
The reason is that nvenc only supports 8-bit 4:2:0. Your other successful tests probably used 8-bit 4:2:0 input formats.
You would have to convert; add this:
-pix_fmt yuv420p
If I just add -pix_fmt yuv420p
Code:
ffmpeg_nvenc -pix_fmt yuv420p -i input.avi -an -vcodec nvenc_h264 -b 5000k output.mp4
I get
Code:
Option pixel_format not found
Code:
ffmpeg_nvenc -f rawvideo -s:v 1280x720 -pix_fmt yuv420p -i input.avi -an -vcodec nvenc_h264 -b 5000k output.mp4
Add it after the -i. It cannot go on the input side; it has to be an output option, because you are converting from input to output.
Options placed before the -i specify parameters of the input(s).
The order matters on the ffmpeg command line.
It should look something like this
ffmpeg -i input.avi -pix_fmt yuv420p -c:v nvenc -b:v 50000k -an output.mp4
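If you end up scripting the conversions, one way to keep the ordering straight is to build the command from separate input and output option lists. This is only a sketch: `build_ffmpeg_cmd` is a hypothetical helper, and the encoder name (`nvenc` here, as in the command above) depends on the ffmpeg build; newer builds call it `h264_nvenc`.

```python
import shlex


def build_ffmpeg_cmd(input_path, output_path, input_opts=(), output_opts=(), exe="ffmpeg"):
    """Assemble an ffmpeg argument list.

    Options before -i describe the input; options after it describe the output.
    """
    return [exe, *input_opts, "-i", input_path, *output_opts, output_path]


cmd = build_ffmpeg_cmd(
    "input.avi",
    "output.mp4",
    # -pix_fmt is an output option here: convert to 8-bit 4:2:0 before encoding
    output_opts=["-pix_fmt", "yuv420p", "-c:v", "nvenc", "-b:v", "5000k", "-an"],
)
print(shlex.join(cmd))
```

The resulting list can be passed straight to `subprocess.run(cmd)`, which also avoids quoting problems with paths that contain spaces.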
Is this a Quadro K600 (Kepler generation)? Those cards are relatively slow for nvenc compared to newer-generation cards. Encoding speed is independent of CUDA cores or shaders, because the encoding is done on a separate part of the card. A $50 card will encode as fast as a $1000 card of the same generation.
Another option is you can run CPU encoding a few times faster than the default "medium" by using a faster preset. eg.
-c:v libx264 -preset:v superfast
What is the usage scenario? How are you going to be using this or what applications ? For example, if RGB full color was required, nvenc wouldn't be an option. Or if you needed 4:2:2 (maybe for broadcast scenarios or better interlaced handling), or to keep 10bit, nvenc wouldn't be an option. x264 can be compiled with all those options
Quality/compression-wise, nvenc is one of the worst for H.264 encoding. For similar quality you need much higher bitrates than CPU x264, which is best in class for H.264. You can improve NVEnc slightly by adjusting b-frames (-bf) and GOP size (-g) to suit your scenario; the defaults might not be optimal. It also has a -preset switch; I think medium is the default. Look at -h full under the nvenc section for the available switches. But really, the quality isn't very good: you need roughly 1.5-2x the file size (bitrate) for equivalent quality.
Can you do a quick test?
With the device connected to an active signal, can you run this and see if ffmpeg "sees" the BM UltraStudio SDI ?
ffmpeg -list_devices true -f dshow -i dummy
It should work for 10bit RGB as well, are you saying it didn't ?
Let me explain the usage scenario.
At the hospital, surgery is carried out every day using laparoscopic equipment. All surgical procedures are recorded. About 8 hours can be recorded in one day, sometimes less.
At the moment, the only equipment available for video capture is the Blackmagic UltraStudio SDI.
As I wrote, the Blackmagic UltraStudio captures raw video at a bitrate of more than 50 Mb/sec.
Storing these amounts is unrealistic. And there is only one night between working days to compress the video, so compression has to run faster than 1x the speed of the original video. We also need some time to move the files from the capture workstation over the 1 Gb LAN.
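The overnight budget can be sketched in a few lines of Python. The numbers below are assumptions for illustration only: 300 GB per recorded hour (the lower bound from the first post), a hypothetical 16-hour window between working days, a 1 Gb link running at 90% efficiency, and the transfer finishing before encoding starts.

```python
def transfer_hours(size_gb: float, link_gbps: float = 1.0, efficiency: float = 0.9) -> float:
    """Hours needed to move size_gb (10^9 bytes) over a link of link_gbps (10^9 bits/s)."""
    seconds = size_gb * 8 / (link_gbps * efficiency)
    return seconds / 3600


def required_speed(recorded_hours: float, window_hours: float, transfer_h: float) -> float:
    """Encoding speed (multiple of real time) needed if transfer and encode run back to back."""
    encode_window = window_hours - transfer_h
    if encode_window <= 0:
        raise ValueError("transfer alone does not fit into the window")
    return recorded_hours / encode_window


recorded = 8                 # hours recorded per day (from the post)
size_gb = recorded * 300     # assumed 300 GB per hour
t = transfer_hours(size_gb)  # assumed 1 Gb LAN at 90% efficiency
print(f"transfer: {t:.1f} h, required encode speed: {required_speed(recorded, 16, t):.2f}x")
```

With larger files (the 10-bit formats) the transfer share grows quickly, so it may be worth encoding on the capture workstation or overlapping transfer and encode.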
We have a server with two E5-2609 v2 CPUs and some HDDs in RAID 0 for temporary storage and conversion. I also added an Nvidia Quadro K600 to this server.
I compared different variants, and the main results are:
-c:v libx264 -preset:v superfast gives the same conversion speed and output file size as -pix_fmt yuv420p -c:v nvenc: about 180 fps (the original video is 59.94 fps).
I get the best conversion speed when converting AVI 8-bit YUV.
It will take some time to compare the quality, but at the moment I can't see a real difference. And the very best quality is not the main concern.
I also tested conversion speed on some non-raw video files (camera captures and other medical equipment). On this server, -c:v nvenc converts up to 5 times faster than plain -vcodec h264.
NVEnc should be able to run 2 instances in parallel if it's not saturated (I've done this before for SD encodes, but for 720p59.94 you might be close to saturated with 1 instance).
Just to put things in perspective, if Kepler is 1x speed, Maxwell is about 2x speed, Maxwell2 is about 3.75x speed
I noticed that converting a 700 GB file is 2-3 times slower than converting a 70 GB file (the same video input, just a shorter duration). Neither the processor nor the graphics card is loaded above 50% when converting the 700 GB file, versus 80-90% load when converting the 70 GB one.
Is this 1pass VBR ? same settings ?
What OS ?
You said Raid-0 , but is it possible there are some I/O issues? What are src/destination setups ?
How are you measuring CPU/GPU load ?
You might try NVEncC by rigaya, the standalone commandline implementation version to see if it makes a difference
I compare this:
ffmpeg_nvenc -i input.avi -pix_fmt yuv420p -c:v nvenc -an -b:v 5000k output.mp4
ffmpeg_nvenc -i input.avi -pix_fmt yuv420p -c:v libx264 -preset:v superfast -an -b:v 5000k output.mp4
on the same file.
OS: Windows Server 2012 R2 Standard
No, there are no I/O issues. Moreover, RAID 0 provides better performance than a single disk (but without the fault tolerance that RAID 5, 6 or 10 gives if one of the disks fails).
I'm monitoring CPU/GPU load with a lightweight system tool, for example Open Hardware Monitor. Load is about 80-90% at the start of the conversion, and then it falls to about 30% within 1-2 minutes.
Ok, I'll try NVEncC by rigaya in a moment.
Is it possible to use -pix_fmt yuv420p in NVEncC by rigaya?
I realize raid-0 offers better throughput , but you can still run into I/O issues . For example , if you are near capacity even a 4 disk raid-0 can be insufficient. Your observations suggest a bottleneck, you have to rule out things, including transfer I/O
Have you looked at cooling / thermal load and throttling by CPU or GPU ?
I can't recall offhand if NVEncC will do the 4:2:0 conversion for you, there are some new versions (it's more frequently updated than ffmpeg nvenc). I'll have a look later today if I have time. One possible issue for you is it only outputs elementary streams, so muxing is required (you can do a batch file for example)
Try a random youtube video 1080p30 or even 1080p60, that is more than 10minutes long. If duration/throttling is an issue, you should experience that after a while. The purpose is not to emulate your source characteristics, but to test how robust your encoding setup is
What was the content of your tests ? ie. were they similar? Higher complexity content can encode at a slower rate because motion vectors are more complex. For example, a still image or duplicate frames will typically encode faster than some action scene in a movie
Last edited by poisondeathray; 6th Apr 2016 at 09:25.
The video shows the movement of the laparoscopic surgical instruments inside the patient's body during surgery. The videos cannot be called very dynamic, but they are not static either.
I will closely monitor RAID I/O, but for now there are no issues.
There are no cooling or temperature problems.
OK, give me some time; I'll test CPU/GPU with ffmpeg and NVEncC by rigaya on a 30-minute 1080p30 YouTube video.
OK, I've got a random 1080p30 YouTube video with no audio.
youtube_1080p.mp4 size:988 MB
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L4
Format settings, CABAC : Yes
Format settings, ReFrames : 3 frames
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 35mn 16s
Bit rate : 3 915 Kbps
Width : 1 920 pixels
Height : 1 080 pixels
Display aspect ratio : 16:9
Frame rate mode : Constant
Frame rate : 30.000 fps
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.063
Stream size : 988 MiB (100%)
ffmpeg_nvenc -i D:\youtube_1080p.mp4 -c:v nvenc D:\youtube_1080p_converted.mp4
gpu load 98% fps=79 bitrate about 1900 kb/sec speed=2.64x
ffmpeg_nvenc -i D:\youtube_1080p.mp4 -c:v libx264 D:\youtube_1080p_converted.mp4
cpu load 97% fps=35 bitrate about 4500 kb/sec speed=1.12x
==Official FFmpeg build without nvenc==
ffmpeg -i D:\youtube_1080p.mp4 -c:v libx264 D:\youtube_1080p_converted.mp4
cpu load 99% fps=38 bitrate about 4700 kb/sec speed=1.2x
==NVEncC by rigaya==
NVEncC64 -i D:\youtube_1080p.mp4 -c h264 -o D:\youtube_1080p_converted.mp4
gpu load 98% fps=93 bitrate about 10500 kb/sec
Maybe the problem is the size of the raw video files I need to convert? They range from 250 GB up to 1.5 TB.
Even with an infinitely fast encoder you can only encode as fast as it can read the source file.
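To make that concrete, here is a small Python sketch of how much sustained read throughput a given encoding speed demands from the source disk. It assumes the 8-bit UYVY source (1280x720, 2 bytes per pixel) and ignores container overhead and audio; the function names are just for illustration.

```python
def required_read_mb_per_s(encode_fps: float, width: int = 1280, height: int = 720,
                           bytes_per_pixel: float = 2) -> float:
    """Sustained read rate (MB/s, 10^6 bytes) needed to feed the encoder at encode_fps."""
    return width * height * bytes_per_pixel * encode_fps / 1e6


def max_fps_for_read_rate(read_mb_per_s: float, width: int = 1280, height: int = 720,
                          bytes_per_pixel: float = 2) -> float:
    """Upper bound on encoding fps when the disk sustains read_mb_per_s."""
    return read_mb_per_s * 1e6 / (width * height * bytes_per_pixel)


# The ~180 fps reported earlier in the thread for the 8-bit UYVY source
print(f"{required_read_mb_per_s(180):.0f} MB/s needed for 180 fps")
# A disk that sustains only 100 MB/s caps the encode at under 1x real time (59.94 fps)
print(f"{max_fps_for_read_rate(100):.1f} fps at 100 MB/s")
```

So the 180 fps figure already implies the array delivering over 300 MB/s for the whole length of the file, not just for the first cached gigabytes.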
Did you watch the FPS as it went by for the YT tests? Was it pretty much constant or a dip as it progressed ?
Obviously the YT videos were smaller in filesize, but if your raid-0 setup cannot sustain the transfer rates for the uncompressed files you would expect a dip as well. The accompanying drop in utilization % suggests bottleneck, and most likely culprit is I/O. Initial speed might be faster because of buffer and cached read, then it might dip
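If watching the console is impractical for a multi-hour job, the fps values can be pulled out of ffmpeg's status lines and checked for a dip afterwards. This is a sketch only: the sample lines are made up for illustration, and the warm-up length and 0.7 threshold are arbitrary assumptions. ffmpeg writes these status lines to stderr, separated by carriage returns.

```python
import re

FPS_RE = re.compile(r"fps=\s*([\d.]+)")


def fps_samples(lines):
    """Extract the fps value from each ffmpeg status line that has one."""
    samples = []
    for line in lines:
        match = FPS_RE.search(line)
        if match:
            samples.append(float(match.group(1)))
    return samples


def has_dip(samples, warmup=3, ratio=0.7):
    """True if the mean fps after the warm-up is below ratio x the warm-up mean."""
    if len(samples) <= warmup:
        return False
    early = sum(samples[:warmup]) / warmup
    late = samples[warmup:]
    return sum(late) / len(late) < ratio * early


# Illustrative status lines, not taken from a real run
log = [
    "frame=  900 fps=180 q=-0.0 size=    9216kB time=00:00:15.01 bitrate=5029.1kbits/s speed=3.0x",
    "frame= 1800 fps=179 q=-0.0 size=   18432kB time=00:00:30.03 bitrate=5028.3kbits/s speed=2.99x",
    "frame= 2700 fps=178 q=-0.0 size=   27648kB time=00:00:45.04 bitrate=5028.7kbits/s speed=2.97x",
    "frame= 3300 fps= 70 q=-0.0 size=   33792kB time=00:00:55.05 bitrate=5028.4kbits/s speed=1.17x",
    "frame= 3900 fps= 62 q=-0.0 size=   39936kB time=00:01:05.06 bitrate=5028.5kbits/s speed=1.03x",
]
print(fps_samples(log), has_dip(fps_samples(log)))
```

Note that the fps ffmpeg prints is an average since the start of the run, so a real dip shows up more gradually than in this toy log.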
Another process to look at is the 4:2:2 => 4:2:0 conversion (I'm assuming the UYVY 4:2:2 files). But it's less likely a bottleneck
You can do a NUL test to test if it dips and is the culprit (ie. speed test with no encoding, just everything prior to NVEnc), just let it run and watch speed
ffmpeg -i input.avi -c:v rawvideo -pix_fmt yuv420p -an -f null NUL
But is your 250 R/W sustained STR ? usually the access pattern of large files like uncompressed videos are sequential, but you need sustained rates
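Another way to check sustained sequential throughput on the actual source file, without ffmpeg in the loop, is to simply read it front to back and time it. A minimal sketch, with the function name and chunk size chosen arbitrarily; keep in mind that the OS file cache will inflate the result for small files or repeated runs, so test on a large file that has not just been read.

```python
import time


def sequential_read_rate(path, chunk_bytes=8 * 1024 * 1024, max_bytes=None):
    """Read a file sequentially and return (bytes_read, seconds_elapsed).

    If max_bytes is given, stop once at least that many bytes were read
    (the last chunk may overshoot the limit).
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while max_bytes is None or total < max_bytes:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


# Example usage (path is a placeholder):
# n, secs = sequential_read_rate(r"D:\capture\input.avi", max_bytes=50 * 10**9)
# print(f"{n / 1e6 / secs:.0f} MB/s sustained over {n / 1e9:.0f} GB")
```

If the rate over tens of gigabytes is clearly lower than what a short benchmark reports, the storage is the likely bottleneck rather than the encoder.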
Last edited by jagabo; 6th Apr 2016 at 12:23.
When converting the 1080p30 YouTube file (MP4, no audio, 988 MB), the FPS was constant.
OK, let's run some tests.
Quadro K600 Nvidia driver 361.91 & 2 x Intel xeon e5-2609 v2
youtube 1080p30 no audio mp4 size:988 MB video file
ffmpeg -i input.avi -c:v rawvideo -an -f null NUL
ffmpeg_nvenc -i input.avi -c:v rawvideo -an -f null NUL
my work raw video file: YUV 4:2:2, 59.94 fps, 70+Mb/sec bitrate, size:1 TB
ffmpeg -i 1tb.avi -c:v libx264 -pix_fmt yuv420p -an -f null NUL
ffmpeg_nvenc -i 1tb.avi -c:v nvenc -pix_fmt yuv420p -an -f null NUL
Code:
-----------------------------------------------------------------------
CrystalDiskMark 5.1.2 x64 (C) 2007-2016 hiyohiyo
Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes
Sequential Read (Q= 32,T= 1) : 754.907 MB/s
Sequential Write (Q= 32,T= 1) : 557.620 MB/s
Random Read 4KiB (Q= 32,T= 1) : 10.746 MB/s [ 2623.5 IOPS]
Random Write 4KiB (Q= 32,T= 1) : 5.211 MB/s [ 1272.2 IOPS]
Sequential Read (T= 1) : 618.010 MB/s
Sequential Write (T= 1) : 102.340 MB/s
Random Read 4KiB (Q= 1,T= 1) : 0.972 MB/s [ 237.3 IOPS]
Random Write 4KiB (Q= 1,T= 1) : 0.615 MB/s [ 150.1 IOPS]
Test : 1024 MiB [D: 24.4% (1820.4/7449.9 GiB)] (x5) [Interval=5 sec]
Date : 2016/04/06 22:31:30
OS : Windows Server 2012 R2 [6.3 Build 9600] (x64)

Code:
-----------------------------------------------------------------------
CrystalDiskMark 5.1.2 x64 (C) 2007-2016 hiyohiyo
Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes
Sequential Read (Q= 32,T= 1) : 575.726 MB/s
Sequential Write (Q= 32,T= 1) : 551.552 MB/s
Random Read 4KiB (Q= 32,T= 1) : 4.243 MB/s [ 1035.9 IOPS]
Random Write 4KiB (Q= 32,T= 1) : 3.480 MB/s [ 849.6 IOPS]
Sequential Read (T= 1) : 566.860 MB/s
Sequential Write (T= 1) : 102.541 MB/s
Random Read 4KiB (Q= 1,T= 1) : 0.524 MB/s [ 127.9 IOPS]
Random Write 4KiB (Q= 1,T= 1) : 0.448 MB/s [ 109.4 IOPS]
Test : 32768 MiB [D: 24.4% (1820.4/7449.9 GiB)] (x1) [Interval=5 sec]
Date : 2016/04/06 22:40:26
OS : Windows Server 2012 R2 [6.3 Build 9600] (x64)
And what about using -pix_fmt yuv420p with NVEncC by rigaya? Is it possible?
No, it doesn't look like NVEncC can do it internally. If ffmpeg nvenc turns out to be the problem, you can let ffmpeg do the conversion and pipe the result to NVEncC.
Something is inconsistent with your results, or maybe there is a typo or I'm reading it incorrectly : the ffmpeg build should give similar result as the ffmpeg_nvenc when using no codec. There should be no GPU usage either, it's just a pure reading/decoding test. Nothing is being encoded. GPU isn't being used. If input.avi was small, then it might be a cached read skewing the observation. Input.avi shouldn't be a YT video.
When you add -pix_fmt yuv420p to that speed test as I had it written above, it's read + 4:2:0 conversion, that should be slower . The difference in time will be the effect of the 422=>420 conversion only
Not sure what's going on in your case. Seems to work ok for large files
721GB low motion content test 1280x720p59.94 8bit 4:2:2 UYVY
source SSD R/W ~500MB/s to a destination slow 5400rpm mechanical HDD, Maxwell 1 card
~250-255 FPS per instance, but only ~30-35% GPU load. No slowdown.
Hmmm, 35% load? So I started another instance when the first was about 75% through. ~250-255 FPS per instance, ~60-70% GPU load in total. Total additive FPS would be ~500-510. No slowdowns. IIRC 2 instances is the max per card for NVEnc, even if you have idle/spare GPU capacity.
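For a nightly batch, running two encodes side by side can be scripted. A minimal Python sketch, assuming the two-instances-per-card limit mentioned above holds for your card and driver; `run_parallel` is a hypothetical helper, and the file names in the job list are placeholders.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_parallel(commands, max_parallel=2):
    """Run each command (an argument list) with at most max_parallel at a time.

    Returns the exit codes in the same order as the commands.
    """
    def run(cmd):
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run, commands))


# Placeholder job list; options follow the commands used in this thread
jobs = [
    ["ffmpeg_nvenc", "-i", name + ".avi", "-pix_fmt", "yuv420p",
     "-c:v", "nvenc", "-b:v", "5000k", "-an", name + ".mp4"]
    for name in ("surgery_01", "surgery_02", "surgery_03")
]
# exit_codes = run_parallel(jobs)  # uncomment on a machine with ffmpeg_nvenc on PATH
```

Two parallel reads from the same array also double the demand on the source disks, so this only helps if storage is not already the limit.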
I used the default ffmpeg nvenc settings: no b-frames (which speeds up encoding but hurts compression), a 12-frame GOP length (likewise, a shorter keyframe interval speeds up encoding but compresses worse), and 5000 kbps, same as yours (higher bitrates slow encoding too).
I think the reason for the low GPU% is the low motion and the low compression settings of the ffmpeg nvenc defaults. NVEncC's defaults are higher, for higher quality.
I ran two separate conversions. GPU load rose from 40% (one conversion) to 55%. Total fps rose by only about 20 FPS.
I will look into HDD speed optimisation. But there are no SSDs at the moment.