Image Reading and Writing

scikit-video treats an image as the trivial video: a video with exactly 1 frame. The following example shows how an image is read and interpreted:

import skvideo.io

# a frame from the bigbuckbunny sequence
vid = skvideo.io.vread("vid_luma_frame1.png")
T, M, N, C = vid.shape

print("Number of frames: %d" % (T,))
print("Number of rows: %d" % (M,))
print("Number of cols: %d" % (N,))
print("Number of channels: %d" % (C,))

Running this code yields this output:

Number of frames: 1
Number of rows: 720
Number of cols: 1280
Number of channels: 3

As you can see, the 1280x720 image loads without problems and is treated as an RGB video with 1 frame.
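Because the result is a 4-dimensional array even for a single image, you may want to drop the frame axis before handing the data to image-processing code. The sketch below uses a zero-filled NumPy array as a stand-in for the array returned by vread (the stand-in and its uint8 dtype are assumptions made so that the example runs without an image file):

```python
import numpy as np

# Stand-in for the array returned by skvideo.io.vread on a 1280x720 RGB image.
# The uint8 dtype is an assumption for this illustration.
vid = np.zeros((1, 720, 1280, 3), dtype=np.uint8)

# The first (and only) frame is the image itself, shaped (rows, cols, channels).
image = vid[0]

print("Image shape: %d x %d x %d" % image.shape)
```

Indexing with vid[0] returns a view of the same data, so no pixels are copied.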

If you’d like to upscale this image during loading, you can run the following:

import skvideo.io

# upscale frame from the bigbuckbunny sequence by a factor of 2
vid = skvideo.io.vread("vid_luma_frame1.png",
                       outputdict={
                           "-sws_flags": "bilinear",
                           "-s": "2560x1440"
                       }
)
T, M, N, C = vid.shape

print("Number of frames: %d" % (T,))
print("Number of rows: %d" % (M,))
print("Number of cols: %d" % (N,))
print("Number of channels: %d" % (C,))

Running this code yields this output:

Number of frames: 1
Number of rows: 1440
Number of cols: 2560
Number of channels: 3

Notice that the interpolation method is chosen by name through the “-sws_flags” option, here “bilinear”. Other scaling options that ffmpeg/avconv supports can be passed through outputdict in the same way.

Note that although ffmpeg/avconv supports relative scaling, scikit-video does not support it yet, so the output size must be given in absolute pixels. Future support can be added by parsing the video filter “-vf” commands, so that scikit-video is aware of the buffer size expected from the ffmpeg/avconv subprocess.
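Until relative scaling is supported, one workaround is to compute the absolute size yourself and pass it through the “-s” option. The helper below is a minimal sketch of that idea; scaled_size is a hypothetical name and is not part of scikit-video:

```python
def scaled_size(width, height, factor):
    """Return an absolute "WIDTHxHEIGHT" string for the "-s" option.

    Hypothetical helper, not part of scikit-video.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    # Round each dimension to the nearest whole pixel.
    new_width = int(round(width * factor))
    new_height = int(round(height * factor))
    return "%dx%d" % (new_width, new_height)


size = scaled_size(1280, 720, 2)
print(size)

# The string can then be used as in the upscaling example:
# vid = skvideo.io.vread("vid_luma_frame1.png", outputdict={"-s": size})
```

The original width and height must be known in advance for this to work, for example from a previous unscaled read.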

Of course, images can be written just as easily as they can be read.

import skvideo.io
import numpy as np

# create random data, sized 1280x720
image = np.random.random(size=(720, 1280))*255
print("Random image, shape (%d, %d)" % image.shape)

skvideo.io.vwrite("output.png", image)

Again, the output:

Random image, shape (720, 1280)

First, notice that the shape of the image is height x width. scikit-video always interprets image and video matrices as height first, then width (rows by columns), which is the standard matrix format. Second, notice that images do not need to be in the same 4-dimensional format as videos to be written. scikit-video will interpret shapes of (1, M, N), (M, N), and (M, N, C) as images, where M is height, N is width, and C is the number of channels. Internally, scikit-video standardizes shapes and datatypes for accurate reading and writing through the ffmpeg/avconv subprocess.
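To make the shape handling concrete, here is a minimal sketch of how the accepted shapes could be mapped onto the (T, M, N, C) video layout. It is an illustration only, not scikit-video's internal code: standardize_shape is a hypothetical name, and the rule used to tell (M, N, C) apart from (T, M, N), namely treating a last axis of size 1 to 4 as channels, is an assumption:

```python
def standardize_shape(shape):
    """Map an image or video shape onto (T, M, N, C).

    Hypothetical helper for illustration; not part of scikit-video.
    """
    shape = tuple(shape)
    if len(shape) == 2:
        # (M, N): a single greyscale image.
        m, n = shape
        return (1, m, n, 1)
    if len(shape) == 3:
        a, b, c = shape
        # Assumption: a small last axis holds colour channels.
        if c in (1, 2, 3, 4):
            return (1, a, b, c)  # (M, N, C): one multi-channel image
        return (a, b, c, 1)  # (T, M, N): greyscale frames
    if len(shape) == 4:
        # Already in (T, M, N, C) form.
        return shape
    raise ValueError("expected 2 to 4 dimensions, got %d" % len(shape))


for s in [(720, 1280), (720, 1280, 3), (1, 720, 1280)]:
    print(s, "->", standardize_shape(s))
```

Under this assumption an image whose width is 4 pixels or fewer would be ambiguous, which is one reason to pass the full 4-dimensional shape when in doubt.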