VideoHelp Forum
  1. Hello,

    I've got Blackmagic UltraStudio SDI. It captures video in one of these formats:

    AVI 8-bit YUV
    MediaInfo
    Code:
    Format                      : YUV
    Codec ID                    : UYVY
    Codec ID/Info               : Uncompressed 16bpp. YUV 4:2:2 (Y sample at every pixel, U and V sampled at every second pixel horizontally on each line). A macropixel contains 2 pixels in 1 u_int32.
    Width                       : 1 280 pixels
    Height                      : 720 pixels
    Display aspect ratio        : 16:9
    Frame rate                  : 59.940 fps
    Color space                 : YUV
    Chroma subsampling          : 4:2:2
    Compression mode            : Lossless
    Bits/(Pixel*Frame)          : 16.000
    AVI 10-bit YUV
    MediaInfo
    Code:
    Format                      : YUV
    Codec ID                    : v210
    Codec ID/Hint               : AJA Video Systems Xena
    Bit rate                    : 1 193 Mbps
    Width                       : 1 280 pixels
    Height                      : 720 pixels
    Display aspect ratio        : 16:9
    Frame rate                  : 59.940 fps
    Color space                 : YUV
    Chroma subsampling          : 4:2:2
    Bit depth                   : 10 bits
    Compression mode            : Lossless
    AVI 10-bit RGB
    MediaInfo
    Code:
    Format                      : r210
    Codec ID                    : r210
    Bit rate                    : 1 768 Mbps
    Width                       : 1 280 pixels
    Height                      : 720 pixels
    Display aspect ratio        : 16:9
    Frame rate                  : 59.940 fps
    Bits/(Pixel*Frame)          : 32.000
    It's not practical to store and work with such video files (more than 300 GB for 1 hour), so I'm looking for the right way to compress them without visible loss of quality. FFmpeg works great, but I'd like to use the Nvidia graphics card (K600) instead of the CPU. FFmpeg with nvenc enabled works excellently on non-raw video files: up to 5x faster than the CPU.
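For reference, the storage per hour follows directly from the MediaInfo figures above. This is only a rough sketch in Python: the function names are made up for illustration, the 8-bit bit rate is derived from 16 bits per pixel (MediaInfo does not list it), and the two 10-bit bit rates are taken as reported.

```python
def raw_mbps(width, height, bits_per_pixel, fps):
    """Bit rate of uncompressed video in Mbps (1 Mbps = 1,000,000 bit/s)."""
    return width * height * bits_per_pixel * fps / 1e6


def gb_per_hour(mbps):
    """Storage needed for one hour at the given bit rate, in GB (10**9 bytes)."""
    return mbps * 1e6 / 8 * 3600 / 1e9


formats = {
    # 16 bits per pixel, from "Bits/(Pixel*Frame) : 16.000"
    "AVI 8-bit YUV (UYVY)": raw_mbps(1280, 720, 16, 59.94),
    # Bit rates below are the values MediaInfo reported
    "AVI 10-bit YUV (v210)": 1193.0,
    "AVI 10-bit RGB (r210)": 1768.0,
}

for name, mbps in formats.items():
    print(f"{name}: {mbps:.0f} Mbps, about {gb_per_hour(mbps):.0f} GB per hour")
```

So the three capture formats come out at roughly 400, 540 and 800 GB per hour, which is consistent with "more than 300 GB for 1 hour".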

    I'm trying to find the right ffmpeg/nvenc parameters to use with any of the listed raw formats. Something like this (this example produces only noise):
    Code:
    ffmpeg_nvenc.exe -f rawvideo -vcodec rawvideo -s 1280x720 -r 59.94 -pix_fmt yuv422p10le -i inputfile.avi -b:v 5000k -an -vcodec nvenc_h264 outfile.mp4
    If I just use
    Code:
    ffmpeg -i inputfile.avi -b 5000k -an -vcodec nvenc_h264 output.mp4
    then I get an error
    No NVENC capable devices found. Error while opening encoder for output stream - maybe incorrect parameters such as bit_rate, rate, width or height.
    Using
    Code:
    ffmpeg -vcodec h264
    works fine, but not as fast as the GPU can.

    Please help.
  2. The reason is nvenc only supports 8bit 4:2:0 . Your other successful tests probably used 8bit 4:2:0 input formats

    You would have to convert, add this
    Code:
    -pix_fmt yuv420p
  3. Originally Posted by poisondeathray View Post
    The reason is nvenc only supports 8bit 4:2:0 . Your other successful tests probably used 8bit 4:2:0 input formats

    You would have to convert, add this
    Code:
    -pix_fmt yuv420p
    Which of these three formats can this be applied to: AVI 8-bit YUV, AVI 10-bit YUV or AVI 10-bit RGB?

    If I just add -pix_fmt yuv420p and run
    Code:
    ffmpeg_nvenc -pix_fmt yuv420p -i input.avi -an -vcodec nvenc_h264 -b 5000k output.mp4
    then I get an error
    Option pixel_format not found
    If I use
    Code:
    ffmpeg_nvenc -f rawvideo -s:v 1280x720 -pix_fmt yuv420p -i input.avi -an -vcodec nvenc_h264 -b 5000k output.mp4
    on any of the captured files, then I get something like this:
    [Attached image: 1.png]
  4. Add it after the -i (ie. you cannot add it as an input command, but as an output command, because you are converting from input to output)

    Adding commands before the -i deal with specifying parameters about the input(s)

    The order matters in the ffmpeg commandline

    It should look something like this
    Code:
    ffmpeg -i input.avi -pix_fmt yuv420p -c:v nvenc -b:v 5000k -an output.mp4
  5. Originally Posted by poisondeathray View Post
    Add it after the -i
    Oh, my mistake. Thanks a lot, now it works both for AVI 8 & 10-bit YUV
  6. Originally Posted by vedroide2e4 View Post
    Originally Posted by poisondeathray View Post
    Add it after the -i
    Oh, my mistake. Thanks a lot, now it works both for AVI 8 & 10-bit YUV
    It should work for 10bit RGB as well, are you saying it didn't ?





    Is this a quadro k600 (kepler generation)? Because they are relatively slow for nvenc compared to newer generation cards. It's independent of cuda cores or shaders, because the encoding is done on a separate part of the card. A $50 card will encode as fast as a $1000 card of the same generation

    Another option is to run CPU encoding a few times faster than the default "medium" by using a faster preset, e.g.

    -c:v libx264 -preset:v superfast

    What is the usage scenario? How are you going to be using this or what applications ? For example, if RGB full color was required, nvenc wouldn't be an option. Or if you needed 4:2:2 (maybe for broadcast scenarios or better interlaced handling), or to keep 10bit, nvenc wouldn't be an option. x264 can be compiled with all those options

    Quality/compression wise, nvenc is one of the worst for h.264 encoding. You need much higher bitrates than CPU x264 which is best in class for h.264, for similar quality. You can improve NVEnc slightly by modifying b-frames (-bf) and GOP (-g) size according to your scenario, they might not be optimal at defaults. It also has a -preset switch, I think medium at default. Look at -h full under the nvenc section for available switches. But really the quality isn't very good. You really need almost 1.5-2x the filesizes (bitrate) for equivalent compression/quality






    Can you do a quick test?

    With the device connected to an active signal, can you run this and see if ffmpeg "sees" the BM UltraStudio SDI ?

    Code:
    ffmpeg -list_devices true -f dshow -i dummy
    If your device is listed, it should be possible to compress directly if your card/system is fast enough. If it's not, it's safer to do what you're doing now, otherwise you risk framedrops
  7. Yes,
    Code:
    ffmpeg -list_devices true -f dshow -i dummy
    shows me some devices, including Blackmagic WDM Capture. But I'm not sure the Intel Core i3-4130 is fast enough to compress while capturing in real time.

    It should work for 10bit RGB as well, are you saying it didn't ?
    Yes, it works for 10bit RGB, but about 10 times slower.

    Let me explain the usage scenario.
    At the hospital, surgery with laparoscopic equipment is carried out every day, and all surgical procedures are recorded. Up to about 8 hours can be recorded in one day, sometimes less.

    At the moment, the only equipment available for video capture is the Blackmagic UltraStudio SDI.
    As I wrote, the Blackmagic UltraStudio SDI captures raw video at a bitrate of more than 50 Mb/sec.
    Storing these volumes is unrealistic, and there is only one night between working days to compress the video. So the compression has to run faster than 1x the speed of the original video. We also need some time to move the files from the capture workstation over the 1 Gb LAN.
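To put rough numbers on that overnight budget, here is a small Python sketch. It rests on several assumptions: the capture is 8-bit UYVY at about 884 Mbps (1280 x 720 x 16 bits x 59.94 fps), the LAN delivers its full nominal 1000 Mbps, the copy finishes before the encode starts, and the night window is 14 hours. The window length and the function names are hypothetical, chosen only for illustration.

```python
def transfer_hours(recorded_hours, video_mbps, link_mbps, efficiency=1.0):
    """Hours needed to copy the recording over the network link."""
    return recorded_hours * video_mbps / (link_mbps * efficiency)


def required_encode_speed(recorded_hours, window_hours, transfer_h):
    """Encode speed (multiple of real time) needed once the copy has finished.

    Assumes copy and encode run one after the other, not in parallel.
    """
    remaining = window_hours - transfer_h
    if remaining <= 0:
        raise ValueError("the copy alone does not fit into the window")
    return recorded_hours / remaining


copy_h = transfer_hours(recorded_hours=8, video_mbps=884, link_mbps=1000)
speed = required_encode_speed(recorded_hours=8, window_hours=14, transfer_h=copy_h)
print(f"copy: {copy_h:.2f} h, required encode speed: {speed:.2f}x")
```

Under these assumptions the copy of a full 8-hour day takes about 7 hours on its own, so the network transfer is as much of a constraint as the encode speed. Encoding on the capture workstation, or overlapping copy and encode, would change the picture.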

    We have a server with two E5-2609v2 CPUs and several HDDs in RAID 0 for temporary storage and conversion. I also added an Nvidia Quadro K600 to this server.

    I compared different variants, and the main results are:
    -c:v libx264 -preset:v superfast gives the same conversion speed and output file size as -pix_fmt yuv420p -c:v nvenc: about 180 fps (the original video is 59.94 fps)
    I get the best conversion speed when converting AVI 8-bit YUV

    It will take some time to compare the quality, but at the moment I can't see a real difference. And the very best quality is not the main goal.
  8. I also tested conversion speed on some non-raw video files (camera captures and output from other medical equipment). -c:v nvenc converts up to 5 times faster than plain -vcodec h264 on this server
  9. NVEnc should be able to run 2 instances in parallel if it's not saturated (I've done this before for SD encodes, but for 720p59.94 you might be close to saturated with 1 instance).

    Just to put things in perspective, if Kepler is 1x speed, Maxwell is about 2x speed, Maxwell2 is about 3.75x speed
  10. I noticed that converting a 700 GB file is 2-3 times slower than converting a 70 GB file (same video input, just shorter). And neither the processor nor the graphics card is loaded above 50% when converting the 700 GB file, compared with 80-90% load when converting the 70 GB one.
  11. Is this 1pass VBR ? same settings ?

    What OS ?

    You said Raid-0 , but is it possible there are some I/O issues? What are src/destination setups ?

    How are you measuring CPU/GPU load ?

    You might try NVEncC by rigaya, the standalone commandline implementation version to see if it makes a difference
  12. I compare this:
    ffmpeg_nvenc -i input.avi -pix_fmt yuv420p -c:v nvenc -an -b:v 5000k output.mp4
    and
    ffmpeg_nvenc -i input.avi -pix_fmt yuv420p -c:v libx264 -preset:v superfast -an -b:v 5000k output.mp4
    on the same file.

    OS: Windows Server 2012 R2 Standard

    No, there are no I/O issues. Moreover, RAID 0 gives better performance than a single disk (though, unlike RAID 5, 6 or 10, without fault tolerance if one of the disks fails).

    I'm monitoring cpu/gpu load with any lightweight system tool. For example Open hardware monitor. There is about 80-90% load on conversion start. And then load falls to about 30% in 1-2 minutes.

    Ok, I'll try NVEncC by rigaya in a moment.
  13. Is it possible to use -pix_fmt yuv420p in NVEncC by rigaya?
  14. I realize raid-0 offers better throughput , but you can still run into I/O issues . For example , if you are near capacity even a 4 disk raid-0 can be insufficient. Your observations suggest a bottleneck, you have to rule out things, including transfer I/O

    Have you looked at cooling / thermal load and throttling by CPU or GPU ?

    I can't recall offhand if NVEncC will do the 4:2:0 conversion for you, there are some new versions (it's more frequently updated than ffmpeg nvenc). I'll have a look later today if I have time. One possible issue for you is it only outputs elementary streams, so muxing is required (you can do a batch file for example)

    Try a random youtube video 1080p30 or even 1080p60, that is more than 10minutes long. If duration/throttling is an issue, you should experience that after a while. The purpose is not to emulate your source characteristics, but to test how robust your encoding setup is

    What was the content of your tests ? ie. were they similar? Higher complexity content can encode at a slower rate because motion vectors are more complex. For example, a still image or duplicate frames will typically encode faster than some action scene in a movie
    Last edited by poisondeathray; 6th Apr 2016 at 09:25.
  15. The video shows the movement of the laparoscopic surgical instruments inside the patient's body during surgery. The videos can't be called very dynamic, but they are not static either.

    I will monitor RAID I/O closely, but so far there are no issues.

    There are no cooling or temperature problems.

    OK, give me some time, I'll test CPU/GPU ffmpeg and NVEncC by rigaya on a 30-minute YouTube 1080p30 video.
  16. OK, I've got a random YouTube video, 1080p30, no audio

    youtube_1080p.mp4 size:988 MB
    Code:
    Format                      : AVC
    Format/Info                 : Advanced Video Codec
    Format profile              : High@L4
    Format settings, CABAC      : Yes
    Format settings, ReFrames   : 3 frames
    Codec ID                    : avc1
    Codec ID/Info               : Advanced Video Coding
    Duration                    : 35mn 16s
    Bit rate                    : 3 915 Kbps
    Width                       : 1 920 pixels
    Height                      : 1 080 pixels
    Display aspect ratio        : 16:9
    Frame rate mode             : Constant
    Frame rate                  : 30.000 fps
    Color space                 : YUV
    Chroma subsampling          : 4:2:0
    Bit depth                   : 8 bits
    Scan type                   : Progressive
    Bits/(Pixel*Frame)          : 0.063
    Stream size                 : 988 MiB (100%)
    ==FFmpeg with nvenc enabled build==
    GPU
    ffmpeg_nvenc -i D:\youtube_1080p.mp4 -c:v nvenc D:\youtube_1080p_converted.mp4
    gpu load 98% fps=79 bitrate about 1900 kb/sec speed=2.64x

    CPU
    ffmpeg_nvenc -i D:\youtube_1080p.mp4 -c:v libx264 D:\youtube_1080p_converted.mp4
    cpu load 97% fps=35 bitrate about 4500 kb/sec speed=1.12x
    =====

    ==FFmpeg official build without nvenc==
    ffmpeg -i D:\youtube_1080p.mp4 -c:v libx264 D:\youtube_1080p_converted.mp4
    cpu load 99% fps=38 bitrate about 4700 kb/sec speed=1.2x
    =====

    ==NVEncC by rigaya==
    GPU
    NVEncC64 -i D:\youtube_1080p.mp4 -c h264 -o D:\youtube_1080p_converted.mp4
    gpu load 98% fps=93 bitrate about 10500 kb/sec
    =====


    Maybe the problem is the size of the raw video files I need to convert? From 250 GB up to 1.5 TB
  17. Even with an infinitely fast encoder you can only encode as fast as it can read the source file.
  18. Did you watch the FPS as it went by for the YT tests? Was it pretty much constant or a dip as it progressed ?

    Obviously the YT videos were smaller in filesize, but if your raid-0 setup cannot sustain the transfer rates for the uncompressed files you would expect a dip as well. The accompanying drop in utilization % suggests bottleneck, and most likely culprit is I/O. Initial speed might be faster because of buffer and cached read, then it might dip

    Another process to look at is the 4:2:2 => 4:2:0 conversion (I'm assuming the UYVY 4:2:2 files). But it's less likely a bottleneck

    You can do a NUL test to test if it dips and is the culprit (ie. speed test with no encoding, just everything prior to NVEnc), just let it run and watch speed

    Code:
    ffmpeg -i input.avi -c:v rawvideo -pix_fmt yuv420p -an -f null NUL
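To separate plain disk reading from decoding and conversion, the file can also be read start to end with no ffmpeg involved at all. The following is a minimal Python sketch of that idea, not a replacement for a real benchmark: the function name, chunk size and window size are arbitrary choices, and on a file small enough to sit in the OS cache the figures will be inflated.

```python
import time


def sequential_read_rates(path, chunk_bytes=8 * 1024 * 1024, window_bytes=1024 ** 3):
    """Read `path` from start to end and return MB/s for each window of data.

    One figure is recorded per `window_bytes` read, plus one for any remainder.
    """
    rates = []
    in_window = 0
    # buffering=0 skips Python's own buffer; the OS file cache still applies.
    with open(path, "rb", buffering=0) as f:
        window_start = time.perf_counter()
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            in_window += len(chunk)
            if in_window >= window_bytes:
                elapsed = max(time.perf_counter() - window_start, 1e-9)
                rates.append(in_window / 1e6 / elapsed)
                in_window = 0
                window_start = time.perf_counter()
        if in_window:
            elapsed = max(time.perf_counter() - window_start, 1e-9)
            rates.append(in_window / 1e6 / elapsed)
    return rates
```

Calling `sequential_read_rates(r"D:\1tb.avi")` (a hypothetical path) returns one MB/s figure per GiB read. If those figures fall off the same way the encode FPS does, the disks are the likely bottleneck; if they stay flat, look elsewhere.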
  19. Originally Posted by jagabo View Post
    Even with an infinitely fast encoder you can only encode as fast as it can read the source file.
    What do you mean? Do I have a problem with HDD read speed? Is there any difference between reading a 1 GB file and a 1 TB file?
    I've tested the HDDs and get more than 250 MB/s read/write speed
  20. Is your 250 MB/s read/write a sustained transfer rate, though? Large files like uncompressed video are usually accessed sequentially, but you need sustained rates
  21. Originally Posted by vedroide2e4 View Post
    I've tested hdd and get more than 250 MB/s read/write speed
    Uncompressed, 1920x1080, 8 bit YUV 4:2:2 , 30 fps is about 125 MB/s. So the best you can hope for is to encode at about 60 fps. If the source is compressed with a lossless codec to half that size, an infinitely fast encoder will double that throughput.
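The arithmetic behind that ceiling, as a short Python sketch (function names are illustrative only; 8-bit 4:2:2 is taken as 16 bits per pixel, and 1 MB as 1,000,000 bytes):

```python
def frame_bytes(width, height, bits_per_pixel):
    """Size of one uncompressed frame in bytes."""
    return width * height * bits_per_pixel // 8


def max_fps(read_mb_per_s, width, height, bits_per_pixel):
    """Highest frame rate an encoder can reach if disk reads are the only limit."""
    return read_mb_per_s * 1e6 / frame_bytes(width, height, bits_per_pixel)


for label, (w, h) in {"1920x1080": (1920, 1080), "1280x720": (1280, 720)}.items():
    per_frame_mb = frame_bytes(w, h, 16) / 1e6
    print(f"{label}: {per_frame_mb:.2f} MB per frame, "
          f"at most {max_fps(250, w, h, 16):.0f} fps from a 250 MB/s source")
```

For the 1280x720 8-bit 4:2:2 captures discussed in this thread, the same 250 MB/s would cap the encode at roughly 135 fps, however fast the encoder is.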
    Last edited by jagabo; 6th Apr 2016 at 12:23.
  22. When I converted the YouTube 1080p30 file (no audio, mp4, 988 MB), the FPS was constant.

    OK, let's run some tests

    Quadro K600 Nvidia driver 361.91 & 2 x Intel xeon e5-2609 v2


    youtube 1080p30 no audio mp4 size:988 MB video file
    Code:
    ffmpeg -i input.avi -c:v rawvideo -an -f null NUL
    FPS= 36, CPU load=99%

    Code:
    ffmpeg_nvenc -i input.avi -c:v rawvideo -an -f null NUL
    FPS=78, GPU load=98%

    my work raw video file: YUV 4:2:2, 59.94 fps, 70+Mb/sec bitrate, size:1 TB
    Code:
    ffmpeg -i 1tb.avi -c:v libx264 -pix_fmt yuv420p -an -f null NUL
    FPS=85, CPU load=99% suddenly

    Code:
    ffmpeg_nvenc -i 1tb.avi -c:v nvenc -pix_fmt yuv420p -an -f null NUL
    FPS starts at 160, falls to 110 and keeps falling; GPU load starts at 97% and falls to 48%

    Disk benchmarks
    Code:
    -----------------------------------------------------------------------
    CrystalDiskMark 5.1.2 x64 (C) 2007-2016 hiyohiyo
                               Crystal Dew World : http://crystalmark.info/
    -----------------------------------------------------------------------
    * MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
    * KB = 1000 bytes, KiB = 1024 bytes
    
       Sequential Read (Q= 32,T= 1) :   754.907 MB/s
      Sequential Write (Q= 32,T= 1) :   557.620 MB/s
      Random Read 4KiB (Q= 32,T= 1) :    10.746 MB/s [  2623.5 IOPS]
     Random Write 4KiB (Q= 32,T= 1) :     5.211 MB/s [  1272.2 IOPS]
             Sequential Read (T= 1) :   618.010 MB/s
            Sequential Write (T= 1) :   102.340 MB/s
       Random Read 4KiB (Q= 1,T= 1) :     0.972 MB/s [   237.3 IOPS]
      Random Write 4KiB (Q= 1,T= 1) :     0.615 MB/s [   150.1 IOPS]
    
      Test : 1024 MiB [D: 24.4% (1820.4/7449.9 GiB)] (x5)  [Interval=5 sec]
      Date : 2016/04/06 22:31:30
        OS : Windows Server 2012 R2  [6.3 Build 9600] (x64)
    Code:
    -----------------------------------------------------------------------
    CrystalDiskMark 5.1.2 x64 (C) 2007-2016 hiyohiyo
                               Crystal Dew World : http://crystalmark.info/
    -----------------------------------------------------------------------
    * MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
    * KB = 1000 bytes, KiB = 1024 bytes
    
       Sequential Read (Q= 32,T= 1) :   575.726 MB/s
      Sequential Write (Q= 32,T= 1) :   551.552 MB/s
      Random Read 4KiB (Q= 32,T= 1) :     4.243 MB/s [  1035.9 IOPS]
     Random Write 4KiB (Q= 32,T= 1) :     3.480 MB/s [   849.6 IOPS]
             Sequential Read (T= 1) :   566.860 MB/s
            Sequential Write (T= 1) :   102.541 MB/s
       Random Read 4KiB (Q= 1,T= 1) :     0.524 MB/s [   127.9 IOPS]
      Random Write 4KiB (Q= 1,T= 1) :     0.448 MB/s [   109.4 IOPS]
    
      Test : 32768 MiB [D: 24.4% (1820.4/7449.9 GiB)] (x1)  [Interval=5 sec]
      Date : 2016/04/06 22:40:26
        OS : Windows Server 2012 R2  [6.3 Build 9600] (x64)
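As a cross-check, the encode rates above can be turned into the read rate they imply and compared with the benchmark. This is a rough Python sketch that assumes the 1 TB file is 8-bit UYVY (16 bits per pixel, 1280x720); a 10-bit v210 file would need more per frame. The function name is made up for illustration.

```python
def read_mb_per_s(fps, width, height, bits_per_pixel):
    """Disk read rate (MB/s, 10**6 bytes) needed to feed uncompressed frames at `fps`."""
    return fps * width * height * bits_per_pixel / 8 / 1e6


# Sequential read (T=1) from the 32768 MiB CrystalDiskMark run above
benchmark_read = 566.860

for fps in (160, 110):
    need = read_mb_per_s(fps, 1280, 720, 16)
    print(f"{fps} fps needs about {need:.0f} MB/s "
          f"({need / benchmark_read:.0%} of the benchmark figure)")
```

If the benchmark rate held for the whole file, reading would need only about half of it even at 160 fps. So either the sustained rate over a 1 TB read is well below the benchmark, or something other than plain reading is the limit.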
  23. And what about using -pix_fmt yuv420p with NVEncC by rigaya? Is it possible?
  24. No, it doesn't look like NVEncC can do it internally. You can let ffmpeg do the conversion and pipe to NVEncC, if ffmpeg nvenc was the problem

    Something is inconsistent with your results, or maybe there is a typo or I'm reading it incorrectly : the ffmpeg build should give similar result as the ffmpeg_nvenc when using no codec. There should be no GPU usage either, it's just a pure reading/decoding test. Nothing is being encoded. GPU isn't being used. If input.avi was small, then it might be a cached read skewing the observation. Input.avi shouldn't be a YT video.

    When you add -pix_fmt yuv420p to that speed test as I had it written above, it's read + 4:2:0 conversion, that should be slower . The difference in time will be the effect of the 422=>420 conversion only
  25. Not sure what's going on in your case. Seems to work ok for large files



    721GB low motion content test 1280x720p59.94 8bit 4:2:2 UYVY

    source SSD R/W ~500MB/s to a destination slow 5400rpm mechanical HDD, Maxwell 1 card


    ~250-255 FPS per instance, but only ~30-35% GPU load. No slowdown.

    Hmmm 35% load? So I started another instance when the first was about 75% through. ~250-255 FPS per instance, ~60-70% GPU load in total. Total additive FPS would be ~500-510. No slowdowns. IIRC 2 instances is the max per card for NVEnc, even if you have idle/spare GPU capacity

    I used default ffmpeg nvenc settings, which are no b-frames (no b-frames speeds up encoding, but negatively impacts compression), 12 frame GOP length (same: a shorter keyframe interval speeds up encoding, but gives worse compression), and 5000kbps, same as yours (higher bitrates slow encoding too)

    I think the reason for the low GPU% is the low motion and the low compression settings of the ffmpeg nvenc defaults. NVEncC defaults are higher, higher quality.
  26. I ran two separate conversions. GPU load rose from 40% (one conversion) to 55%. Total FPS rose by only about 20.

    I will look into HDD speed optimisation. But there is no SSD at the moment.


