VideoHelp Forum
  1. I'm creating another thread as the original one might have been too confusing...
    What I want to do is: create screenshots / thumbnails / previews / contact sheets for a large number of video files, with 4 screenshots taken at approximately even intervals and a header with a few pieces of technical information.
    My issue is: among the few tools I've tried for that purpose, I have noticed discrepancies with regard to the colors and contrast/brightness of the screenshots.
    I've done some tests with a video I created which, at some point near the beginning, shows a still picture I took of a red car. Here's the native picture (just resized by 50% and saved with XnView to JPG q90 with the best sub-sampling setting):
    [Attached image: P1250282 [50% q90].jpg]
    Now, I opened the video in question (which I had encoded in 1280x720 using MeGUI with “ConvertToYV12(matrix="Rec709")” in the Avisynth script) with several video players / converters and made a screenshot : it turns out that they treat the color conversion differently, which would explain the issue described above.
    VLC Media Player: [Attached image: 20140821.mp4 - 00_00_04 -2018-10-30-03h53m37s212 [VLC Media Player].png]
    SMPlayer: [Attached image: 20140821_00_00_04_001 [SMPlayer].png]
    MPC-HC: [Attached image: 20140821.mp4_snapshot_00.00.04_[2018.10.30_03.53.57] [MPC-HC].png]
    ffmpeg (-i input.mp4 -vframes 1 -ss 00:00:05 output.png): [Attached image: 20140821 0m05s [ffmpeg].png]
    SMPlayer and MPC-HC reproduce the colors accurately, VLC and ffmpeg do not (the car's color is a darker shade of red, while the leaves in the background are a lighter shade of green).
    With the tools I've tried for creating thumbnail previews, I get different and seemingly more accurate colors with SMPlayer than with the others, some of which are based on ffmpeg (even the commercial one...).
    Why this behaviour? According to what I've read here repeatedly, “HD” video is supposed to be converted with the Bt.709 matrix, yet apparently several prominent programs do not respect this convention.
    Is there an ffmpeg switch which could make it generate screenshots with accurate colors?

    As a side question: I notice a significant discrepancy between the sizes of the PNG screenshots, going from 829KB to 1314KB; although I didn't go to the trouble of making them all at the exact same frame, they all show the same still picture at the same resolution, so shouldn't they have approximately the same size, within a few % of each other?
    Another strange discrepancy is the number of used colors: with XnView I get the values 75587 (VLC), 75594 (ffmpeg), 166581 (SMPlayer), 164201 (MPC-HC).

    Thanks.
    Last edited by abolibibelot; 29th Oct 2018 at 22:03. Reason: side question about the sizes of the PNG screenshots
  2. You would either use -vf scale with out_color_matrix , or -vf zscale with matrix, or -vf colormatrix . You can look them up in the documentation, those are the 3 main methods

    I'll give one example here with zscale to 8bit RGB using 709 . You can look up the others, it's been discussed a zillion times before, pros/cons , switches etc...
    Code:
    ffmpeg -i "INPUT.ext" -vf zscale=matrix=709,format=rgb24 "OUTPUT.bmp"
    What hasn't been discussed that much... notice I used "BMP". A potential problem with ffmpeg is that the PNG output can sometimes specify a PNG gAMA tag (but not always). This means it can look completely different in different programs; it depends on the program and how it handles the tag. You can examine PNGs with a program called TweakPNG.
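
    For illustration, here is a minimal sketch (my own addition, not from the thread) of stripping the gAMA and cHRM ancillary chunks from a PNG in Python, which is essentially what one would do by hand in TweakPNG; the file names are placeholders. Viewers that honour those chunks and viewers that ignore them should then show the same colors.
    Code:
    # Sketch: copy a PNG while dropping the gAMA and cHRM chunks, so programs
    # that honour those tags display it the same way as programs that ignore them.
    import struct

    def strip_color_chunks(src, dst, drop=(b"gAMA", b"cHRM")):
        with open(src, "rb") as f:
            data = f.read()
        sig, pos = data[:8], 8
        assert sig == b"\x89PNG\r\n\x1a\n", "not a PNG file"
        out = [sig]
        while pos < len(data):
            length, = struct.unpack(">I", data[pos:pos + 4])   # chunk data length
            ctype = data[pos + 4:pos + 8]                      # 4-byte chunk type
            chunk = data[pos:pos + 12 + length]                # length + type + data + CRC
            if ctype not in drop:
                out.append(chunk)
            pos += 12 + length
        with open(dst, "wb") as f:
            f.write(b"".join(out))

    strip_color_chunks("screenshot.png", "screenshot notags.png")  # placeholder names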


    As a side question: I notice a significant discrepancy between the sizes of the PNG screenshots, going from 829KB to 1314KB; although I didn't go to the trouble of making them all at the exact same frame, they all show the same still picture at the same resolution, so shouldn't they have approximately the same size, within a few % of each other?
    Another strange discrepancy is the number of used colors: with XnView I get the values 75587 (VLC), 75594 (ffmpeg), 166581 (SMPlayer), 164201 (MPC-HC).
    1) Different chroma upsampling algorithms can yield a different number of colors, e.g. if one is using bilinear but another is using bicubic, etc.

    2) Different PNG compression settings can result in different file sizes.
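
    To take the PNG settings out of the comparison, a rough sketch (my own addition, assuming Pillow is installed via pip install Pillow; the file names are placeholders): re-save each screenshot at the same zlib level and count the unique colors, similar to what XnView reports.
    Code:
    # Sketch: recompress each screenshot at a fixed PNG compression level and
    # count unique colors, so the size comparison is not skewed by each
    # program's own PNG encoder settings.
    import os
    from PIL import Image

    shots = ["vlc.png", "smplayer.png", "mpc-hc.png", "ffmpeg.png"]   # placeholder names
    for name in shots:
        img = Image.open(name).convert("RGB")
        colors = img.getcolors(maxcolors=256 ** 3)                    # list of (count, color)
        out = os.path.splitext(name)[0] + " level9.png"
        img.save(out, "PNG", compress_level=9)                        # same compression for all
        print(name, len(colors), "colors,", os.path.getsize(out), "bytes recompressed")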
  3. Thanks for the quick reply.

    You would either use -vf scale with out_color_matrix , or -vf zscale with matrix, or -vf colormatrix . You can look them up in the documentation, those are the 3 main methods
    Well, those switches seem to be nowhere in the (awfully long) integrated help...

    I'll give one example here with zscale to 8bit RGB using 709 . You can look up the others, it's been discussed a zillion times before, pros/cons , switches etc...
    And there are yet *other* methods - aren't three enough? In a nutshell, so I don't have to read a zillion threads, what are the pros/cons, and are any of them relevant for such a task? And why doesn't ffmpeg respect the convention by default to begin with?

    With -vf zscale=matrix=709,format=rgb24 I get this error :
    “code 3074: no path between colorspaces
    Error while filtering: Generic error in an external library
    Failed to inject frame into filter network: Generic error in an external library
    Error while processing the decoded data for stream #0:0”


    Regarding the file size: I re-compressed the 4 screenshots in PNG level 9 with XnView, and interestingly the sizes of the re-compressed VLC and ffmpeg screenshots are very close to one another (746KB, 747KB), while the sizes of the re-compressed SMPlayer and MPC-HC screenshots are very close to one another but significantly higher than the other two (889KB, 881KB). That would seem consistent with the higher number of colors (about twice as many, which seems like a big difference). Can this behaviour (the chroma upsampling algorithm) also be altered in ffmpeg with the right switches? What are the pros/cons of the most commonly used algorithms?
    Originally Posted by abolibibelot
    Thanks for the quick reply.

    You would either use -vf scale with out_color_matrix , or -vf zscale with matrix, or -vf colormatrix . You can look them up in the documentation, those are the 3 main methods
    Well, those switches seem to be nowhere in the (awfully long) integrated help...
    https://ffmpeg.org/ffmpeg-filters.html#scale-1
    https://ffmpeg.org/ffmpeg-filters.html#colormatrix
    https://ffmpeg.org/ffmpeg-filters.html#zscale-1


    I'll give one example here with zscale to 8bit RGB using 709 . You can look up the others, it's been discussed a zillion times before, pros/cons , switches etc...
    And there are yet *other* methods - aren't three enough? In a nutshell, so I don't have to read a zillion threads, what are the pros/cons, and are any of them relevant for such a task? And why doesn't ffmpeg respect the convention by default to begin with?
    The convention for general use is Rec.601. You can make a request about the behaviour handling / patches etc. on their message board if you want HD detection, or maybe width or height detection, etc.

    -vf colormatrix is generally considered the worst for a number of reasons; zscale is the newest and most popular these days. swscale (-vf scale) is the most common, because it has been used since almost the beginning. A few years ago additions were made, including out_color_matrix. Back then, the only option was -vf colormatrix.


    With -vf zscale=matrix=709,format=rgb24 I get this error :
    “code 3074: no path between colorspaces
    Error while filtering: Generic error in an external library
    Failed to inject frame into filter network: Generic error in an external library
    Error while processing the decoded data for stream #0:0”
    Maybe an older ffmpeg build? Try updating. Or what is your "input"?

    Or you can use colormatrix or scale instead:

    -vf scale=out_color_matrix=bt709
    -vf colormatrix=bt601:bt709
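
    For illustration, a hedged sketch (my own addition, written in the same subprocess style as the Python script discussed later in this thread) assembling the three approaches into complete one-frame grabs; the input path and the 5-second offset are just placeholders:
    Code:
    # Sketch: grab the same frame with each of the three colour-matrix approaches.
    import subprocess

    src = r"E:\videos\example.mp4"   # placeholder input path
    variants = {
        "scale":       "scale=out_color_matrix=bt709",
        # on untagged sources zscale may also need matrixin=709 (see later in the thread)
        "zscale":      "zscale=matrix=709,format=rgb24",
        "colormatrix": "colormatrix=bt601:bt709",
    }
    for name, vf in variants.items():
        subprocess.run(["ffmpeg", "-y", "-ss", "00:00:05", "-i", src,
                        "-vframes", "1", "-vf", vf,
                        "snapshot %s.png" % name])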


    Regarding the file size: I re-compressed the 4 screenshots in PNG level 9 with XnView, and interestingly the sizes of the re-compressed VLC and ffmpeg screenshots are very close to one another (746KB, 747KB), while the sizes of the re-compressed SMPlayer and MPC-HC screenshots are very close to one another but significantly higher than the other two (889KB, 881KB). That would seem consistent with the higher number of colors (about twice as many, which seems like a big difference). Can this behaviour (the chroma upsampling algorithm) also be altered in ffmpeg with the right switches? What are the pros/cons of the most commonly used algorithms?
    You can read about scaling algorithms; it's a big, complex topic that is discussed often. Use the default if you don't know.

    Yes, you can use switches for the algorithm - for example it's "filter" for zscale

    Code:
    filter, f
    
        Set the resize filter type.
    
        Possible values are:
    
        point
        bilinear
        bicubic
        spline16
        spline36
        lanczos
    
        Default is bilinear.
    swscale has more algorithms (-vf scale uses swscale); you can set them with -sws_flags:
    https://ffmpeg.org/ffmpeg-scaler.html#scaler_005foptions

    Code:
    sws_flags
    
        Set the scaler flags. This is also used to set the scaling algorithm. Only a single algorithm should be selected. Default value is ‘bicubic’.
    
        It accepts the following values:
    
        ‘fast_bilinear’
    
            Select fast bilinear scaling algorithm.
        ‘bilinear’
    
            Select bilinear scaling algorithm.
        ‘bicubic’
    
            Select bicubic scaling algorithm.
        ‘experimental’
    
            Select experimental scaling algorithm.
        ‘neighbor’
    
            Select nearest neighbor rescaling algorithm.
        ‘area’
    
            Select averaging area rescaling algorithm.
        ‘bicublin’
    
            Select bicubic scaling algorithm for the luma component, bilinear for chroma components.
        ‘gauss’
    
            Select Gaussian rescaling algorithm.
        ‘sinc’
    
            Select sinc rescaling algorithm.
        ‘lanczos’
    
            Select Lanczos rescaling algorithm.
        ‘spline’
    
            Select natural bicubic spline rescaling algorithm.

    Some of them have sub-options (but not necessarily in ffmpeg); for example you can use a 3-tap lanczos, or 4-tap, 5-tap, etc. - it becomes sharper, with more ringing. Bicubic has many varieties because of its b,c parameters, which can take different values. You can read about some common Avisynth ones here:
    http://avisynth.nl/index.php/Resize


    Scaling is complex stuff; there are dozens of slightly different variations on scaling, chroma, chroma location, chroma interpretation. There is a lot of academic work and there are research papers on different scaling algorithms. Not all options are necessarily available in ffmpeg; for example, some more options/switches are available in Avisynth and VapourSynth plugins/filters.

    That list above for zscale is also roughly in increasing order of sharpness (e.g. lanczos will be sharper than bilinear). Point is the same as nearest neighbor. But sharp chroma scaling is usually very bad; a blurrier filter is usually preferred because of chroma aliasing artifacts.
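
    As a concrete illustration (my own addition; path and sizes are placeholders), the same downscaled grab with two different resizers, so the sharpness difference can be compared side by side - flags= selects the swscale algorithm inside -vf scale, while zscale uses filter= instead:
    Code:
    # Sketch: produce the same 960x540 thumbnail with two different resizers.
    import subprocess

    src = r"E:\videos\example.mp4"   # placeholder input path
    filters = {
        "swscale lanczos": "scale=960:540:flags=lanczos:out_color_matrix=bt709",
        "zscale spline36": "zscale=w=960:h=540:filter=spline36:matrix=709,format=rgb24",
    }
    for label, vf in filters.items():
        subprocess.run(["ffmpeg", "-y", "-ss", "00:00:05", "-i", src,
                        "-vframes", "1", "-vf", vf,
                        "thumb %s.png" % label])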
    And another common difference: some media players may have filters enabled (dithering, noise, etc.) for the decoder and/or renderer - that can also result in differences. E.g. a random dither pattern will result in larger file sizes compared to ordered dither. It's an especially important difference when higher bit depth videos are dithered down.
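
    As a rough sketch of that last point (my own addition, assuming an ffmpeg build with zscale/zimg and some 10-bit clip as a placeholder input): dump the same frame with two dithering modes and compare the PNG sizes; a noisy error-diffusion pattern generally compresses worse than an ordered pattern.
    Code:
    # Sketch: same frame of a 10-bit clip, dithered down to 8-bit RGB two ways.
    import os, subprocess

    src = "some_10bit_clip.mp4"   # placeholder input
    for mode in ("ordered", "error_diffusion"):
        out = "frame %s.png" % mode
        subprocess.run(["ffmpeg", "-y", "-ss", "5", "-i", src, "-vframes", "1",
                        "-vf", "zscale=matrix=709:dither=%s,format=rgb24" % mode,
                        out])
        print(mode, os.path.getsize(out), "bytes")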
  6. So, tried again, using (as earlier) ffmpeg-20181025-bf32435-win64-static (updated very recently) :
    – with -vf zscale=matrix=709,format=rgb24 I still get the same error as above ;
    – with -vf scale=out_color_matrix=bt709 I get the exact same file as without any -vf switch (same size, same CRC) ;
    – with -vf colormatrix=bt601:bt709 the file size is slightly different (1115KB instead of 1116) but the colors are the same (= wrong), only the quality seems slightly reduced in the red areas.
    The input is that same MP4 file. The complete command is :
    Code:
    ffmpeg -i "E:\pathname\filename.mp4" -vframes 1 -ss 00:00:05 [-vf something] "E:\filename ffmpeg [option].png"
    And the complete output :
    Code:
    C:\>ffmpeg -i "E:\...\20140821\20140821.mp4" -vframes 1 -ss 00:00:05 "E:\20140821 ffmpeg.png"
    ffmpeg version N-92266-gbf324359be Copyright (c) 2000-2018 the FFmpeg developers
    
      built with gcc 8.2.1 (GCC) 20181017
      configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfi
    g --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-lib
    freetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amr
    wb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --
    enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-l
    ibwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --
    enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --en
    able-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --en
    able-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --e
    nable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enab
    le-avisynth
      libavutil      56. 20.100 / 56. 20.100
      libavcodec     58. 33.102 / 58. 33.102
      libavformat    58. 19.102 / 58. 19.102
      libavdevice    58.  4.106 / 58.  4.106
      libavfilter     7. 37.100 /  7. 37.100
      libswscale      5.  2.100 /  5.  2.100
      libswresample   3.  2.100 /  3.  2.100
      libpostproc    55.  2.100 / 55.  2.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'E:\...\20140821\20140821.mp4':
      Metadata:
        major_brand     : isom
        minor_version   : 1
        compatible_brands: isomavc1
        creation_time   : 2015-07-01T00:52:57.000000Z
      Duration: 03:03:00.05, start: 0.000000, bitrate: 2881 kb/s
        Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720
    [SAR 1:1 DAR 16:9], 2697 kb/s, 25 fps, 25 tbr, 25k tbn, 50 tbc (default)
        Metadata:
          creation_time   : 2015-07-01T00:52:57.000000Z
        Stream #0:1(fra): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, flt
    p, 179 kb/s (default)
        Metadata:
          creation_time   : 2015-06-30T06:17:02.000000Z
    Stream mapping:
      Stream #0:0 -> #0:0 (h264 (native) -> png (native))
    Press [q] to stop, [?] for help
    Output #0, image2, to 'E:\20140821 ffmpeg.png':
      Metadata:
        major_brand     : isom
        minor_version   : 1
        compatible_brands: isomavc1
        encoder         : Lavf58.19.102
        Stream #0:0(und): Video: png, rgb24, 1280x720 [SAR 1:1 DAR 16:9], q=2-31, 20
    0 kb/s, 25 fps, 25 tbn, 25 tbc (default)
        Metadata:
          creation_time   : 2015-07-01T00:52:57.000000Z
          encoder         : Lavc58.33.102 png
    frame=    1 fps=0.0 q=-0.0 Lsize=N/A time=00:00:00.04 bitrate=N/A speed=0.147x
    
    video:1116kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing
    overhead: unknown
    
    C:\>ffmpeg -i "E:\...\20140821\20140821.mp4" -vframes 1 -ss 00:00:05 -vf scale=out_color_matrix=bt709 "E:\20140821 ffmpeg scale bt709.png"
    [ffmpeg version banner, configuration and input stream info identical to the first run above]
    Output #0, image2, to 'E:\20140821 ffmpeg scale bt709.png':
      Metadata:
        major_brand     : isom
        minor_version   : 1
        compatible_brands: isomavc1
        encoder         : Lavf58.19.102
        Stream #0:0(und): Video: png, rgb24, 1280x720 [SAR 1:1 DAR 16:9], q=2-31, 20
    0 kb/s, 25 fps, 25 tbn, 25 tbc (default)
        Metadata:
          creation_time   : 2015-07-01T00:52:57.000000Z
          encoder         : Lavc58.33.102 png
    frame=    1 fps=0.0 q=-0.0 Lsize=N/A time=00:00:00.04 bitrate=N/A speed=0.153x
    
    video:1116kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing
    overhead: unknown
    
    C:\>ffmpeg -i "E:\...\20140821\20140821.mp4" -vframes 1 -ss 00:00:05 -vf zscale=matrix=709,format=rgb24 "E:\20140821 ffmpeg zscale bt709.png"
    [ffmpeg version banner, configuration and input stream info identical to the first run above]
    code 3074: no path between colorspaces
    Error while filtering: Generic error in an external library
    Failed to inject frame into filter network: Generic error in an external library
    
    Error while processing the decoded data for stream #0:0
    Conversion failed!
    
    C:\Users\Gabriel>ffmpeg -i "E:\...\20140821\20140821.mp4" -vframes 1 -ss 00:00:05 -vf colormatrix=bt601:bt709 "E:\20140821 ffmpeg colormatrix bt709.png"
    [ffmpeg version banner, configuration and input stream info identical to the first run above]
    Output #0, image2, to 'E:\20140821 ffmpeg colormatrix bt709.png':
      Metadata:
        major_brand     : isom
        minor_version   : 1
        compatible_brands: isomavc1
        encoder         : Lavf58.19.102
        Stream #0:0(und): Video: png, rgb24, 1280x720 [SAR 1:1 DAR 16:9], q=2-31, 20
    0 kb/s, 25 fps, 25 tbn, 25 tbc (default)
        Metadata:
          creation_time   : 2015-07-01T00:52:57.000000Z
          encoder         : Lavc58.33.102 png
    frame=    1 fps=0.0 q=-0.0 Lsize=N/A time=00:00:00.04 bitrate=N/A speed=0.123x
    
    video:1114kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing
    overhead: unknown


    That list above for zscale is also roughly in increasing order of sharpness (e.g. lanczos will be sharper than bilinear). Point is the same as nearest neighbor. But sharp chroma scaling is usually very bad; a blurrier filter is usually preferred because of chroma aliasing artifacts.
    So all this, the “scaling” process, only affects the color conversion, right? Or also the general sharpness of the picture?
    Another issue when comparing the output of those thumbnail generation tools is that they have a varying level of sharpness / blurriness. With this 2x2 tile pattern (4 tiles of about 960x540 generated from a 1920x1080 video in a frame of ~1920 pixels width), MPC-HC produces the sharpest captures (despite the fact that the tiles aren't exactly 960x540 but 946x527, so it's not a simple 50% reduction and the aspect ratio is a bit off; there's no control other than the frame width, the lines/columns pattern and the level of compression - and anyway it's not suitable for this task as it provides no way of batch-processing an entire directory). Then SMPlayer, VCSI and Scorp Video Thumbnail Maker are about on par in that respect, and MTN generates quite blurry captures. Could it be related to the resizing algorithm used by each of these tools? The only one I could (relatively) easily manipulate is VCSI, which is a Python script and contains the following command for the generation of the thumbnails:
    Code:
                    ffmpeg_command = [
                        "ffmpeg",
                        "-ss", skip_time,
                        "-i", self.path,
                        "-ss", skip_delay,
                        "-vframes", "1",
                        "-s", "%sx%s" % (width, height),
                    ]
    I don't know much about programming, and just installed Python to get this to work, but I figured that I could improve the output just by adding the right switches to that simple command. Although none of them has worked so far in a basic test...
    Originally Posted by abolibibelot
    So, tried again, using (as earlier) ffmpeg-20181025-bf32435-win64-static (updated very recently) :
    – with -vf zscale=matrix=709,format=rgb24 I still get the same error as above ;
    – with -vf scale=out_color_matrix=bt709 I get the exact same file as without any -vf switch (same size, same CRC) ;
    – with -vf colormatrix=bt601:bt709 the file size is slightly different (1115KB instead of 1116) but the colors are the same (= wrong), only the quality seems slightly reduced in the red areas.

    You're using PNG output, so it's unreliable with ffmpeg because the results vary depending on what you use to view it. So I don't trust what you are "seeing". Go re-read my 1st reply. It's easy to demonstrate and prove the problem. For example Firefox will treat the BMP and PNG differently. The problem is the PNG metadata, the gAMA tag and cHRM chromaticity tag. ImageSource vs. CoronaSequence in Avisynth will treat it differently. Windows picture viewer will treat them the same, etc... the point is it's inconsistent.


    Check if it's been compiled with zscale. Print out the filter list
    ffmpeg -filters 1>filters.txt





    So all this, the “scaling” process, only affects the color conversion, right ? Or also the general sharpness of the picture ?
    Another issue when comparing the output of those thumbnail generation tools is that they have a varying level of sharpness / blurriness.


    If you're scaling down, both, because the color information is scaled too. If it's just the RGB conversion, only the color "sharpness" is affected, because the chroma is scaled back up to full color resolution.

    You have a yuv420p source. This means 4:2:0 chroma subsampling. Your example in the console is 1280x720. 4:2:0 means the color information (the U,V in YUV, or technically Cb,Cr) is actually only 640x360. When you convert to RGB, that gets UPscaled to 1280x720. If it were 4:4:4, that would indicate full color - all Y,U,V planes at the full 1280x720 resolution.
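
    To see that half-resolution chroma directly, here is a small sketch (my own addition; the path is a placeholder) that dumps just the U plane with ffmpeg's extractplanes filter - for a 1280x720 4:2:0 source the resulting grayscale image is 640x360:
    Code:
    # Sketch: extract the U (chroma) plane of a 4:2:0 source as a grayscale PNG.
    import subprocess

    src = r"E:\videos\example.mp4"   # placeholder input path
    subprocess.run(["ffmpeg", "-y", "-ss", "00:00:05", "-i", src, "-vframes", "1",
                    "-vf", "extractplanes=u", "u_plane.png"])
    # The output is 640x360 for a 1280x720 yuv420p clip: that grid is all the
    # color detail the encoder actually stored.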

    When making smaller previews, contact sheets, etc., you are downscaling. You can use a sharp resizer, a blurry resizer, or anything in between. "Sharper" isn't necessarily better - ringing artifacts, haloing.
    If the source is flagged with metadata correctly, it looks like ffmpeg will do the conversion correctly, sometimes, if you write to BMP as-is (i.e. it reads the flags). But if you use colormatrix in that case, it will reverse it, sometimes making it wrong. So that's another reason to avoid -vf colormatrix unless you know for sure whether it's flagged or not, and whether the flag is correct or not. The quality is also lower, because it's actually converted twice (2 generations of rounding errors).


    For -vf scale, it also matters whether the source is flagged or not. To get around that, you can specify both the in and out color matrix explicitly, and that should work in all cases (except full range, but you can specify in_range, out_range too):
    Code:
    -vf scale=in_color_matrix=bt709:out_color_matrix=bt709
    Same with zscale: you can specify matrixin and matrix explicitly if you need to, so most types of videos are covered (flagged and unflagged), and you can specify rangein and range if you need to (for full range):
    Code:
    -vf zscale=matrixin=709:matrix=709,format=rgb24
    Note there are some HD videos that are known to be 601. For example, many Canon DSLRs shoot this way, as a notable exception to the "HD is 709" rule.
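
    One way to handle that automatically (my own sketch, not from this thread; it assumes a recent Python and that ffprobe from the ffmpeg package is in the PATH) is to read the stream's colour metadata with ffprobe and only fall back to the "709 for HD" guess when the file is untagged:
    Code:
    # Sketch: pick a matrix from the stream flags, guessing by resolution only
    # when the file carries no colour metadata.
    import json, subprocess

    def probe_matrix(path):
        out = subprocess.run(
            ["ffprobe", "-v", "error", "-select_streams", "v:0",
             "-show_entries", "stream=color_space,height",
             "-of", "json", path],
            capture_output=True, text=True, check=True).stdout
        stream = json.loads(out)["streams"][0]
        matrix = stream.get("color_space", "unknown")
        if matrix in ("unknown", ""):
            # untagged: fall back to the usual convention (709 for HD, 601 for SD)
            matrix = "bt709" if stream["height"] > 576 else "smpte170m"
        return matrix

    print(probe_matrix(r"E:\videos\example.mp4"))   # placeholder path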
    Last edited by poisondeathray; 30th Oct 2018 at 14:42.
    You're using PNG output, so it's unreliable with ffmpeg because the results vary depending on what you use to view it. So I don't trust what you are "seeing". Go re-read my 1st reply. It's easy to demonstrate and prove the problem. For example Firefox will treat the BMP and PNG differently. The problem is the PNG metadata, the gAMA tag and cHRM chromaticity tag. ImageSource vs. CoronaSequence in Avisynth will treat it differently. Windows picture viewer will treat them the same, etc... the point is it's inconsistent.
    I've read that, but the goal here is to modify an existing Python script which, as I understand it, calls ffmpeg to create temporary PNG screenshots, then assembles them to form the final image (which can be PNG or JPG). I'm not even sure that I can add switches to the ffmpeg command and change nothing else without breaking the script, so changing the format of the temporary files may increase that risk.
    Anyway, the results are exactly the same visually with BMP output and Windows 7 viewer or XnView.

    Check if it's been compiled with zscale. Print out the filter list
    ffmpeg -filters 1>filters.txt
    Apparently yes :
    Code:
     ..C zscale            V->V       Apply resizing, colorspace and bit depth conversion.
    Note there are some HD videos that are known to be 601. For example, many Canon DSLRs shoot this way, as a notable exception to the "HD is 709" rule.
    Yes, I've read this when searching about that issue. But in this particular case I know what that particular frame is supposed to look like (the original source being a still picture that I have).

    -vf zscale=matrixin=709:matrix=709,format=rgb24
    With this switch it worked as expected, the colors are now correct, even in PNG :
    [Attached image: 20140821 ffmpeg zscale bt709.png]
    And with “-vf scale=in_color_matrix=bt709:out_color_matrix=bt709 ” the output is exactly the same in PNG or BMP (same size, same checksum).

    Now, would it be enough to add one of those switches to each ffmpeg call in the script ?
    Here is what seems to be the part relevant to the frame capture :
    Code:
    class MediaCapture(object):
        """Capture frames of a video
        """
    
        def __init__(self, path, accurate=False, skip_delay_seconds=DEFAULT_ACCURATE_DELAY_SECONDS, frame_type=DEFAULT_FRAME_TYPE):
            self.path = path
            self.accurate = accurate
            self.skip_delay_seconds = skip_delay_seconds
            self.frame_type = frame_type
    
        def make_capture(self, time, width, height, out_path="out.png"):
            """Capture a frame at given time with given width and height using ffmpeg
            """
            skip_delay = MediaInfo.pretty_duration(self.skip_delay_seconds, show_millis=True)
    
            ffmpeg_command = [
                "ffmpeg",
                "-ss", time,
                "-i", self.path,
                "-vframes", "1",
                "-s", "%sx%s" % (width, height),
            ]
    
            if self.frame_type is not None:
                select_args = [
                    "-vf", "select='eq(frame_type\\," + self.frame_type + ")'"
                ]
    
            if self.frame_type == "key":
                select_args = [
                    "-vf", "select=key"
                ]
    
            if self.frame_type is not None:
                ffmpeg_command += select_args
    
            ffmpeg_command += [
                "-y",
                out_path
            ]
    
            if self.accurate:
                time_seconds = MediaInfo.pretty_to_seconds(time)
                skip_time_seconds = time_seconds - self.skip_delay_seconds
    
                if skip_time_seconds < 0:
                    ffmpeg_command = [
                        "ffmpeg",
                        "-i", self.path,
                        "-ss", time,
                        "-vframes", "1",
                        "-s", "%sx%s" % (width, height),
                    ]
    
                    if self.frame_type is not None:
                        ffmpeg_command += select_args
    
                    ffmpeg_command += [
                        "-y",
                        out_path
                    ]
                else:
                    skip_time = MediaInfo.pretty_duration(skip_time_seconds, show_millis=True)
                    ffmpeg_command = [
                        "ffmpeg",
                        "-ss", skip_time,
                        "-i", self.path,
                        "-ss", skip_delay,
                        "-vframes", "1",
                        "-s", "%sx%s" % (width, height),
                    ]
    
                    if self.frame_type is not None:
                        ffmpeg_command += select_args
    
                    ffmpeg_command += [
                        "-y",
                        out_path
                    ]
    
            try:
                subprocess.call(ffmpeg_command, stderr=DEVNULL, stdout=DEVNULL)
            except FileNotFoundError:
                error = "Could not find 'ffmpeg' executable. Please make sure ffmpeg/ffprobe is installed and is in your PATH."
                error_exit(error)
    
        def compute_avg_color(self, image_path):
            """Computes the average color of an image
            """
            i = Image.open(image_path)
            i = i.convert('P')
            p = i.getcolors()
    
            # compute avg color
            total_count = 0
            avg_color = 0
            for count, color in p:
                total_count += count
                avg_color += count * color
    
            avg_color /= total_count
    
            return avg_color
    
        def compute_blurriness(self, image_path):
            """Computes the blurriness of an image. Small value means less blurry.
            """
            i = Image.open(image_path)
            i = i.convert('L')  # convert to grayscale
    
            a = numpy.asarray(i)
            b = abs(numpy.fft.rfft2(a))
            max_freq = self.avg9x(b)
    
            if max_freq is not 0:
                return 1 / max_freq
            else:
                return 1
    
        def avg9x(self, matrix, percentage=0.05):
            """Computes the median of the top n% highest values.
            By default, takes the top 5%
            """
            xs = matrix.flatten()
            srt = sorted(xs, reverse=True)
            length = int(math.floor(percentage * len(srt)))
    
            matrix_subset = srt[:length]
            return numpy.median(matrix_subset)
    
        def max_freq(self, matrix):
            """Returns the maximum value in the matrix
            """
            m = 0
            for row in matrix:
                mx = max(row)
                if mx > m:
                    m = mx
    
            return m
    I see that there are already “-vf” arguments, but I don't quite understand how it operates. Perhaps I should try to ask the author...
    The complete script is here (or here) :
    vcsi.zip
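
    (A note on the question above about the existing "-vf" arguments: ffmpeg only honours one -vf per output, so a colour-matrix filter cannot simply be added as a second "-vf" entry when the script already builds a select filter; the two have to share one filtergraph, separated by a comma. Below is a minimal standalone sketch of that idea, my own illustration rather than the script author's code:)
    Code:
    # Sketch: build a single -vf value chaining the script's select filter with
    # a colour-matrix conversion, since a second -vf would override the first.
    def build_vf(frame_type=None,
                 matrix_filter="scale=in_color_matrix=bt709:out_color_matrix=bt709"):
        chain = []
        if frame_type == "key":
            chain.append("select=key")
        elif frame_type is not None:
            chain.append("select='eq(frame_type\\,%s)'" % frame_type)
        chain.append(matrix_filter)
        return ",".join(chain)

    print(build_vf())        # scale=in_color_matrix=bt709:out_color_matrix=bt709
    print(build_vf("key"))   # select=key,scale=in_color_matrix=bt709:...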
    Originally Posted by abolibibelot
    Perhaps I should try to ask the author...
    That's what I would do. Describe the issue briefly, and post an enhancement feature request under issues
    Maybe include a switch for 601/709/2020 , seeing as how there are other optional arguments already

    Another could be a switch to control the scaling algorithm (e.g. lanczos vs. bicubic, etc.)

    https://github.com/amietn/vcsi/issues

    The problem with randomly trying or inserting stuff is you might break other things (or at least I do all the time...)
    zscale in ffmpeg can be tricky (you may need to linearize the video before any processing) - this thread can be interesting: https://forum.doom9.org/showthread.php?t=175125 and this: https://stevens.li/guides/video/converting-hdr-to-sdr-with-ffmpeg/ - I know it seems not directly related to your issue, but it may have information useful for understanding how to deal with zscale in ffmpeg.
    Originally Posted by abolibibelot
    Anyway, the results are exactly the same visually with BMP output and Windows 7 viewer or XnView.

    And with “-vf scale=in_color_matrix=bt709:out_color_matrix=bt709” the output is exactly the same in PNG or BMP
    Beware, this only worked in your case, because your video was unflagged.

    Adding the in_color_matrix (or matrixin for zscale) will make the PNG wrong again for flagged videos, but OK for BMP.

    You can demonstrate this behaviour with known values (e.g. colorbars): 2 versions, one flagged video and one unflagged. Let me know if you want some examples; I can prepare something.
  13. Originally Posted by abolibibelot
    SMPlayer and MPC-HC reproduce the colors accurately, VLC and ffmpeg do not (the car's color is a darker shade of red, while the leaves in the background are a lighter shade of green).
    VLC's color rendition can be seriously affected by the Video/Use hardware YUV->RGB conversions setting and the consequent settings in your graphics driver.
  14. @poisondeathray
    That's what I would do. Describe the issue briefly, and post an enhancement feature request under issues
    Maybe include a switch for 601/709/2020 , seeing as how there are other optional arguments already
    Another could be a switch to control the scaling algorithm (e.g. lanczos vs. bicubic, etc.)
    https://github.com/amietn/vcsi/issues
    The problem with randomly trying or inserting stuff is you might break other things (or at least I do all the time...)
    I did add an issue (couldn't find how to label it as “enhancement” or “help wanted”), but it's unlikely to be addressed in a timely manner, since the previous one got a reply more than a month later (and I haven't been exactly “brief”...).

    Anyway, I managed to get the intended result (screenshots here) by modifying the ffmpeg calls in the script as follows :
    Code:
                    ffmpeg_command = [
                        "ffmpeg",
                        "-i", self.path,
                        "-ss", time,
                        "-vframes", "1",
                        "-vf", "zscale=matrixin=709:matrix=709,format=rgb24",
                        "-s", "%sx%s" % (width, height),
                    ]
    I wanted to install both versions side by side, so I tried to install the modified script as “vcsimod” by replacing all mentions of “vcsi” with “vcsimod” in the “setup.py” file and some other files in the package, but it didn't work: I got a “ModuleNotFoundError” when calling vcsimod (although the modified script was apparently correctly compiled as vcsi.exe in the “Python3\Scripts” directory), so I had to install it over the native one. Problem is, the colors might be inaccurate for “SD” content, which usually legitimately uses the Bt.601 color conversion matrix. So, does anyone know what I should modify, and where, in the original package to install my modified script under a different name?
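
    (For the SD concern, one possible approach - my own illustration, not part of vcsi - is to build the zscale filter string from the source height instead of hard-coding 709, the same way the VapourSynth example later in this thread switches on clip.height; how to feed the script's own height value into it is left open:)
    Code:
    # Sketch: choose the zscale matrix from the frame height, so SD sources
    # keep Rec.601 (470bg) and HD sources get Rec.709.
    def matrix_filter(source_height):
        m = "709" if source_height > 576 else "470bg"
        return "zscale=matrixin=%s:matrix=%s,format=rgb24" % (m, m)

    print(matrix_filter(720))   # zscale=matrixin=709:matrix=709,format=rgb24
    print(matrix_filter(576))   # zscale=matrixin=470bg:matrix=470bg,format=rgb24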

    Beware, this only worked in your case, because your video was unflagged.
    Adding the in_color_matrix (or matrixin for zscale) will make the PNG wrong again for flagged videos, but OK for BMP.
    You can demonstrate this behaviour with known values (e.g. colorbars): 2 versions, one flagged video and one unflagged. Let me know if you want some examples; I can prepare something.
    Yes that would be interesting, if only for the sake of learning something !

    @JVRaines
    VLC's color rendition can be seriously affected by the Video/Use hardware YUV->RGB conversions setting and the consequent settings in your graphics driver.
    I found this setting under “DirectX (DirectDraw)”, but it says that this mode is incompatible with the Aero interface in Windows Vista, so isn't it the same with Windows 7? Currently “modules de sortie” (“output modules”; I don't know the exact name in English, as apparently there's no way of switching the language internally) is set to “Automatic”. How can I know which setting it actually uses? And is there a particular setting which is known to be both generally reliable and to display accurate colors?
    Originally Posted by abolibibelot
    Problem is, the colors might be inaccurate for “SD” content, which usually legitimately uses the Bt.601 color conversion matrix.
    That's why switches would be useful . Also for Bt.2020


    Beware, this only worked in your case, because your video was unflagged.
    Adding the in_color_matrix (or matrixin for zscale) will make the PNG wrong again for flagged videos, but OK for BMP.
    You can demonstrate this behaviour with known values (e.g. colorbars): 2 versions, one flagged video and one unflagged. Let me know if you want some examples; I can prepare something.
    Yes that would be interesting, if only for the sake of learning something !

    In this zip package are the original TIF, 2 videos (flagged and unflagged - the metadata is the only difference), and various screenshots (which you can make yourself):
    Code:
    Color primaries                          : BT.709
    Transfer characteristics                 : BT.709
    Matrix coefficients                      : BT.709

    So your screenshots will be incorrect and inconsistent quite frequently, because as a rough ballpark >50% of random videos will have at least matrix flags (e.g. YouTube, Vimeo, BD, most cameras, etc.)... unless you put in some logic for different commands based on different video characteristics, or only allow certain viewers to be used... or maybe a workaround might be a temp BMP file, then compress that. PNG is just too unreliable in ffmpeg, because various circumstances will cause it to write some tags.

    The issue is the PNG chunk metadata, the cHRM and gAMA specifically, and how certain programs handle that.

    => Programs that ignore them, e.g. XnView, CoronaSequence in Avisynth, a few others, etc.: they will look the same.

    => Programs that read them, e.g. Firefox, Chrome, Windows photo viewer, ImageSource in Avisynth, etc.: they will look different.


    These videos are created from known RGB values, so if you create a screenshot, you should get back the known values +/- 2 or 3 (expected rounding errors from multiple RGB<=>YUV conversions). For example, the "red" bar should be RGB 180,16,16. A proper screenshot shows it close, maybe +/- 1. For example, the avisynth/avspmod screenshot shows 180,15,16, which is within acceptable limits (and the PNG is written without tags, so it looks the same everywhere; that's the key). But specifying in/out matrix as you have in ffmpeg on the flagged video shows it to be significantly off at 187,20,20 - that's more than expected from just rounding errors.

    Open up the screenshots in your browser (e.g. Firefox, Chrome) and swap between them; you will notice a significant shift for the ffmpeg in/out-specified PNGs when the source video is flagged. Or use ImageSource() in Avisynth.
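
    (To check the values numerically rather than by eye, a quick sketch - my own addition, assuming Pillow; the file name and coordinates are placeholders, just pick a point well inside the red bar:)
    Code:
    # Sketch: sample a pixel from a colorbar screenshot and compare it with the
    # known value of the red bar (180,16,16).
    from PIL import Image

    img = Image.open("colorbars screenshot.png").convert("RGB")   # placeholder name
    r, g, b = img.getpixel((100, 100))                            # a point inside the red bar
    print("got %d,%d,%d - expected ~180,16,16 (+/- 2-3 from rounding)" % (r, g, b))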
    [Attached files: zip package (TIF, flagged and unflagged videos, screenshots)]
    Btw, this looks like a nice project for VapourSynth: make it RGB, it can stack more images into one if needed, and export the image with other modules like numpy and opencv.
    That avoids the players out there and their different treatment of color spaces.
    For example:
    Code:
    from vapoursynth import core, RGB24
    from numpy import array
    from cv2 import merge, imshow, imwrite, waitKey
    
    file=r'C:\test\test.mp4'
    #number of the frame to be shown
    FRAME=60
    
    clip = core.lsmas.LibavSMASHSource(file)
    planes = clip.format.num_planes
    if clip.height <= 576:
        parameters = {"format": RGB24, "matrix_in_s":"470bg"}
    else:
        parameters = {"format": RGB24, "matrix_in_s":"709"}
    rgb = core.resize.Point(clip, **parameters)
    rgb_frame = rgb.get_frame(FRAME)
    img = merge([array(rgb_frame.get_read_array(i), copy=False) for i in reversed(range(planes))])
    imshow("uncompressed RGB frame", img)
    waitKey(0)
    The numpy and opencv modules are not part of VapourSynth; those modules need to be installed, but opencv needs matplotlib as well:
    pip3.6 install matplotlib
    pip3.6 install numpy
    That pip is in the Python Scripts folder; you might have a different version like 3.7, or plain pip might just work. But I'm not sure now how I installed the opencv module, check the web for it.


