The simplest possible video has just 1 frame, and this is how scikit-video interprets images. Let’s walk through the following example of loading an image:
import skvideo.io

# a frame from the bigbuckbunny sequence
vid = skvideo.io.vread("vid_luma_frame1.png")
T, M, N, C = vid.shape

print("Number of frames: %d" % (T,))
print("Number of rows: %d" % (M,))
print("Number of cols: %d" % (N,))
print("Number of channels: %d" % (C,))
Running this code yields this output:
Number of frames: 1
Number of rows: 720
Number of cols: 1280
Number of channels: 3
As you can see, the 1280x720 image loads without problems and is treated as an RGB video with 1 frame: 720 rows, 1280 columns, and 3 channels.
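Because the loaded array keeps its leading frame axis, code that expects an ordinary image array has to drop that axis first. The following is a minimal sketch, assuming NumPy is installed. A zero-filled array stands in for the value returned by vread so that the snippet runs without the image file, and the uint8 datatype and the name `image` are illustrative assumptions.

```python
import numpy as np

# Stand-in for the array loaded in the example above: 1 frame, 720 rows,
# 1280 columns, 3 channels. The uint8 dtype is an assumption for this sketch.
vid = np.zeros((1, 720, 1280, 3), dtype=np.uint8)

# Indexing the first (and only) frame removes the frame axis.
image = vid[0]

print("Image shape: %d x %d x %d" % image.shape)
```

Indexing with `vid[0]` returns a view of the same data rather than a copy, so no extra memory is used for the frame.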
If you’d like to upscale this image during loading, you can run the following:
import skvideo.io

# upscale frame from the bigbuckbunny sequence by a factor of 2
vid = skvideo.io.vread("vid_luma_frame1.png",
    outputdict={
        "-sws_flags": "bilinear",
        "-s": "2560x1440"
    }
)
T, M, N, C = vid.shape

print("Number of frames: %d" % (T,))
print("Number of rows: %d" % (M,))
print("Number of cols: %d" % (N,))
print("Number of channels: %d" % (C,))
Running this code yields this output:
Number of frames: 1
Number of rows: 1440
Number of cols: 2560
Number of channels: 3
Notice that the interpolation method is set to “bilinear” simply by naming it in the “-sws_flags” entry, and the target size is given as width x height in the “-s” entry. Other scaling parameters that ffmpeg/avconv support can be passed in the same way.
Note that although ffmpeg/avconv supports relative scaling, scikit-video doesn’t support that yet. Future support can be added by parsing the video filter “-vf” commands, so that scikit-video is aware of the buffer size expected from the ffmpeg/avconv subprocess.
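Until relative scaling is supported, the absolute target size can be computed by hand and passed through “-s”. The helper below is a hypothetical sketch and not part of scikit-video. It only builds the width x height string in the form used by the upscaling example, and it assumes the source dimensions are already known.

```python
def scaled_size(width, height, factor):
    """Return a "WIDTHxHEIGHT" string for a source size scaled by factor.

    Hypothetical helper for illustration; not a scikit-video function.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    # Round to whole pixels, since the size is given in integer dimensions.
    new_width = int(round(width * factor))
    new_height = int(round(height * factor))
    return "%dx%d" % (new_width, new_height)


# Doubling the 1280x720 frame gives the size used in the example above.
size = scaled_size(1280, 720, 2)
print(size)  # prints 2560x1440
```

The resulting string can be used as the value of the “-s” entry in outputdict.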
Of course, images can be written just as easily as they can be read.
import skvideo.io
import numpy as np

# create random data, sized 1280x720
image = np.random.random(size=(720, 1280))*255
print("Random image, shape (%d, %d)" % image.shape)

skvideo.io.vwrite("output.png", image)
Again, the output:
Random image, shape (720, 1280)
First, notice that the shape of the image is height x width. Image and video arrays in scikit-video always put height before width (rows, then columns), which is the standard matrix layout. Second, notice that an image does not need the full four-dimensional video shape to be written: scikit-video interprets the shapes (1, M, N), (M, N), and (M, N, C) as images, where M is the height, N is the width, and C is the number of channels. Internally, scikit-video standardizes shapes and datatypes so that data is read and written accurately through the ffmpeg/avconv subprocess.
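To make the shape handling concrete, here is a rough sketch of how the accepted image shapes could be mapped to the four-dimensional (T, M, N, C) layout. It is not scikit-video's actual implementation. The function name `standardize_shape` is hypothetical, and the rule used to tell (1, M, N) apart from (M, N, C), which treats a trailing axis of at most 4 as channels, is an assumption made for this illustration.

```python
def standardize_shape(shape):
    """Map an image or video shape to a (T, M, N, C) tuple.

    Hypothetical illustration; not a scikit-video function.
    """
    shape = tuple(shape)
    if len(shape) == 2:
        # (M, N): a single grayscale frame.
        m, n = shape
        return (1, m, n, 1)
    if len(shape) == 3:
        a, b, c = shape
        # Assumption: a small trailing axis holds color channels.
        if c <= 4:
            return (1, a, b, c)
        # Otherwise the leading axis is taken to be the frame count.
        return (a, b, c, 1)
    if len(shape) == 4:
        # Already in (T, M, N, C) form.
        return shape
    raise ValueError("expected 2 to 4 dimensions, got %d" % len(shape))


for s in [(720, 1280), (1, 720, 1280), (720, 1280, 3)]:
    print(s, "->", standardize_shape(s))
```

Under this assumed rule, all three image shapes from the text end up with T equal to 1, which matches the idea that an image is a video of 1 frame.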