Generating movie barcodes with OpenCV and Python

In last week’s blog post I demonstrated how to count the number of frames in a video file.

Today we are going to use this knowledge to help us with a computer vision and image processing task — visualizing movie barcodes, similar to the one at the top of this post.

I first became aware of movie barcodes a few years back from this piece of software which was used to generate posters and trailers for the 2013 Brooklyn Film Festival.

Since I started PyImageSearch I’ve received a handful of emails regarding generating movie barcodes, and while I normally don’t cover visualization methods, I ended up deciding to write a blog post on it. It is a pretty neat technique after all!

In the remainder of this tutorial I’ll be demonstrating how to write your own Python + OpenCV application to generate movie barcodes of your own.

Generating movie barcodes with OpenCV and Python

In order to construct movie barcodes we need to accomplish three tasks:

  • Task #1: Determine the number of frames in a video file. Computing the total number of frames in a movie allows us to get a sense of how many frames we should be including in the movie barcode visualization. Too many frames and our barcode will be gigantic; too few frames and the movie barcode will be aesthetically unpleasing.
  • Task #2: Generating the movie barcode data. Once we know the total number of video frames we want to include in the movie barcode, we can loop over every N-th frame and compute the RGB average, maintaining a list of averages as we go. This serves as our actual movie barcode data.
  • Task #3: Displaying the movie barcode. Given the list of RGB averages for a set of frames, we can take this data and create the actual movie barcode visualization that is displayed to our screen.

The rest of this post will demonstrate how to accomplish each of these tasks.

Movie barcode project structure

Before we get too far in this tutorial, let’s first discuss our project/directory structure detailed below:

The output directory will store our actual movie barcodes (both the generated movie barcode images and the serialized RGB averages).

We then have the videos folder where our input video files reside on disk.

Finally, we need three helper scripts: count_frames.py, generate_barcode.py, and visualize_barcode.py. We’ll be discussing each of these Python files in the following sections.

Installing prerequisites

I’ll assume you already have OpenCV installed on your system (if not, please refer to this page where I provide tutorials for installing OpenCV on a variety of different platforms).

Besides OpenCV you’ll also need scikit-image and imutils. You can install both using pip, for example by running pip install --upgrade scikit-image imutils.

Take the time to install/upgrade these packages now as we’ll need them later in this tutorial.

Counting the number of frames in a video

In last week’s blog post I discussed how to (efficiently) determine the number of frames in a video file. Since I’ve already discussed the topic in-depth, I’m not going to provide a complete overview of the code today.

That said, you can find the source code for count_frames.py below:

As the name suggests, this script simply counts the number of frames in a video file.
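As a rough sketch of the idea (not the original count_frames.py listing), frame counting can be done by first asking OpenCV for the frame count reported by the video container and falling back to decoding every frame when that value is unavailable. The function names and the override parameter below are hypothetical, chosen only for illustration:

```python
def count_frames_manual(capture):
    """Count frames by decoding the stream until read() reports failure."""
    total = 0
    while True:
        grabbed, _frame = capture.read()
        if not grabbed:
            break
        total += 1
    return total


def count_frames(path, override=False):
    """Return the number of frames in the video at `path`.

    Uses the frame count reported by the container unless `override` is True
    or the reported value is not positive; in that case every frame is decoded.
    """
    import cv2  # imported lazily so count_frames_manual works without OpenCV

    capture = cv2.VideoCapture(path)
    try:
        if not override:
            reported = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
            if reported > 0:
                return reported
        return count_frames_manual(capture)
    finally:
        capture.release()
```

The reported count is fast to obtain but can be off depending on the codec and the video I/O libraries installed, which is why a slower manual count is kept as a fallback.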

As an example, let’s take the trailer to my favorite movie, Jurassic Park:

After downloading the .mp4 file of this trailer (included in the “Downloads” section at the bottom of this tutorial), I can execute count_frames.py on the video:

As my output demonstrates, there are 4,790 frames in the video file.

Why does the frame count matter?

I’ll discuss the answer in the following section.

Generating a movie barcode with OpenCV

At this point we know how to determine the total number of frames in a video file — although the exact reasoning as to why we need to know this information is unclear.

To understand why it’s important to know the total number of frames in a video file before generating your movie barcode, let’s dive into generate_barcode.py:

Lines 2-4 import our required Python packages while Lines 7-14 parse our command line arguments.

We’ll require two command line arguments along with an optional switch, each of which is detailed below:

  • --video: This is the path to our input video file that we are going to generate the movie barcode for.
  • --output: We’ll be looping over the frames in the input video file and computing the RGB average for every N-th frame. These RGB averages will be serialized to a JSON file so we can use this data for the actual movie barcode visualization in the next section.
  • --skip: This parameter controls the number of frames to skip when processing the video. Why might we want to skip frames? Consider the Jurassic Park trailer above: there are over 4,700 frames in a movie clip under 3m30s. If we used only one pixel to visualize the RGB average for each frame, our movie barcode would be over 4,700 pixels wide! Thus, it’s important that we keep only every N-th frame to reduce the output visualization size.
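The three arguments above could be declared with argparse roughly as follows. This is a sketch under assumptions: the function name is hypothetical, and the default of 1 for --skip (meaning "keep every frame") is my choice, since the default is not stated in the text:

```python
import argparse


def parse_generate_args(argv=None):
    """Parse the --video, --output, and --skip arguments described above."""
    ap = argparse.ArgumentParser(description="Compute per-frame color averages")
    ap.add_argument("--video", required=True,
                    help="path to the input video file")
    ap.add_argument("--output", required=True,
                    help="path to the output JSON file of frame averages")
    # Assumption: 1 keeps every frame; the original default is not stated.
    ap.add_argument("--skip", type=int, default=1,
                    help="keep only every N-th frame")
    return vars(ap.parse_args(argv))
```

Passing argv explicitly, for example parse_generate_args(["--video", "clip.mp4", "--output", "clip.json", "--skip", "25"]), makes the parser easy to exercise without touching the real command line.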

Our next code block handles initializing our list of frame averages and opening a pointer to our video file via the cv2.VideoCapture method:

Now that our variables are initialized, we can loop over the frames and compute our averages:

We use the .read method to grab the next frame from the video file (Line 28) and increment the total number of frames processed (Line 36).

We then apply the --skip command line argument to determine if the current frame should be included in the avgs list or not (Line 40).

Provided the frame should be kept, we compute the RGB average of the frame and update the avgs list (Lines 41 and 42).
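The loop described above can be sketched as follows. This is an illustration rather than the original listing: the function names and the mean_fn parameter are hypothetical, and I am assuming that "skip" means keeping every skip-th frame. Note that OpenCV stores frames in BGR order, so the "RGB average" is really a (blue, green, red) triple:

```python
def frame_mean_bgr(frame):
    """Average each color channel of an OpenCV frame (stored in BGR order)."""
    import cv2  # imported lazily; only needed when real frames are processed

    # cv2.mean returns four values; only the first three apply to a color frame
    return cv2.mean(frame)[:3]


def collect_frame_averages(capture, skip, mean_fn=frame_mean_bgr):
    """Return (averages, total): one average per kept frame, and frames read."""
    if skip < 1:
        raise ValueError("skip must be a positive integer")
    avgs = []
    total = 0
    while True:
        grabbed, frame = capture.read()
        if not grabbed:  # end of the video (or a decoding failure)
            break
        total += 1
        # keep only every `skip`-th frame
        if total % skip == 0:
            avgs.append(tuple(mean_fn(frame)))
    return avgs, total
```

With a real video, capture would be a cv2.VideoCapture object; any object with a compatible read() method works, which keeps the loop easy to test.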

After processing all frames from the video file, we can serialize the RGB averages to disk:
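A minimal sketch of the serialization step, assuming the averages are held as a list of three-value tuples (the helper names are hypothetical):

```python
import json


def save_averages(avgs, path):
    """Write the list of per-frame channel averages to `path` as JSON."""
    with open(path, "w") as f:
        # convert to plain floats so the values are JSON serializable
        json.dump([[float(c) for c in avg] for avg in avgs], f)


def load_averages(path):
    """Read the averages back as a list of three-element lists."""
    with open(path) as f:
        return json.load(f)
```

JSON turns tuples into lists, so the data comes back as lists of floats; that is all the visualization step needs.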

Executing this script on our Jurassic Park trailer, you’ll see the following output:

Figure 1: Generating our movie barcode using computer vision and image processing.

Notice here that 199 frames have been saved to disk using a frame skip of 25.

Going back to the output of count_frames.py, we would expect roughly 4,790 / 25 ≈ 192 frames rather than 199. The gap is due to a discrepancy in the frame count reported by count_frames.py; to learn more about this behavior, please see last week’s blog post.
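The arithmetic can be checked with a couple of lines of Python. This assumes that only complete groups of skip frames contribute a bar; the helper name is hypothetical:

```python
def expected_bars(total_frames, skip):
    """Number of frames kept when only every `skip`-th frame is sampled."""
    return total_frames // skip


# Frame count reported for the trailer and the skip value used above.
# 4790 / 25 = 191.6, so there are 191 complete groups of 25 frames.
print(expected_bars(4790, 25))  # prints 191
```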

You should also now see a file named jurassic_park_trailer.json in your output directory:

Visualizing the movie barcode with OpenCV

Now that we have our RGB averages for the frames in the movie, we can actually visualize them.

Open up the visualize_barcode.py file and insert the following code:

This particular script requires two command line arguments followed by two optional ones:

  • --avgs: This switch is the path to our serialized JSON file that contains the average RGB values for every N-th frame in our video.
  • --barcode: Here we supply the path to our output movie barcode visualization image.
  • --height: This parameter controls the height of the movie barcode visualization. We’ll default this value to a height of 250 pixels.
  • --barcode-width: Each individual bar (i.e., RGB average) in the movie barcode needs to have a width in pixels. We set a default value of 1 pixel per bar, but we can change the width by supplying a different value for this command line argument.
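These four arguments might be declared with argparse along these lines. The defaults of 250 and 1 come from the descriptions above; the function name is hypothetical:

```python
import argparse


def parse_visualize_args(argv=None):
    """Parse the --avgs, --barcode, --height, and --barcode-width arguments."""
    ap = argparse.ArgumentParser(description="Render a movie barcode image")
    ap.add_argument("--avgs", required=True,
                    help="path to the JSON file of frame averages")
    ap.add_argument("--barcode", required=True,
                    help="path to the output barcode image")
    ap.add_argument("--height", type=int, default=250,
                    help="height of the barcode image in pixels")
    ap.add_argument("--barcode-width", type=int, default=1,
                    help="width of each bar in pixels")
    return vars(ap.parse_args(argv))
```

argparse stores --barcode-width under the key barcode_width, since hyphens in option names are converted to underscores.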

We are now ready to visualize the barcode:

Lines 20 and 21 load the serialized RGB means from disk.

Lines 26 and 27 utilize the --height switch along with the number of entries in the avgs list and the --barcode-width to allocate memory for a NumPy array large enough to store the movie barcode.

We loop over each of the RGB averages individually (Line 31) and use the cv2.rectangle function to draw each bar in the movie barcode (Lines 32 and 33).

Finally, Lines 37-39 write the movie barcode to disk and display the visualization to our screen.
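The drawing step can be sketched as below. This is an illustration under assumptions, not the original listing: the function names are hypothetical, and the averages are assumed to be in the same channel order as the frames they were computed from (BGR for OpenCV):

```python
def bar_corners(index, bar_width, height):
    """Top-left and bottom-right pixel (both inclusive) of the bar at `index`."""
    top_left = (index * bar_width, 0)
    bottom_right = ((index + 1) * bar_width - 1, height - 1)
    return top_left, bottom_right


def render_barcode(avgs, height=250, bar_width=1):
    """Draw one filled vertical bar per average and return the image."""
    import cv2  # imported lazily so bar_corners has no dependencies
    import numpy as np

    # black canvas: `height` rows, one `bar_width`-wide column group per average
    barcode = np.zeros((height, len(avgs) * bar_width, 3), dtype="uint8")
    for i, avg in enumerate(avgs):
        top_left, bottom_right = bar_corners(i, bar_width, height)
        color = tuple(float(c) for c in avg)
        # a thickness of -1 asks OpenCV to fill the rectangle
        cv2.rectangle(barcode, top_left, bottom_right, color, -1)
    return barcode
```

The returned array can then be saved with cv2.imwrite(path, barcode) and shown with cv2.imshow("Barcode", barcode) followed by cv2.waitKey(0).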

Movie barcode visualizations

To see the movie barcode visualization for the Jurassic Park trailer, make sure you have downloaded the source code + example videos using the “Downloads” section of this post. From there, execute the following commands:

Figure 2: Generating a movie barcode for the Jurassic Park trailer.

The large green bars at the beginning of the barcode correspond to the green preview screen required by the Motion Picture Association of America, Inc.

The blues in the middle of the movie barcode refer to:

  1. The heavy downpours and the bluish tint cast by the strong thunderstorm when things start to go downhill for our park visitors.
  2. Tim’s blue shirt while in the jeep.
  3. Dennis Nedry in the blue-tinted embryo cold storage room.

Let’s now visualize the movie trailer to The Matrix, another one of my all-time favorite movies:

Below you can see the output from running count_frames.py and generate_barcode.py on the movie clip:

Figure 3: Determining the number of frames in a movie followed by generating the video barcode for The Matrix trailer.

And here follows the actual movie barcode:

Figure 4: Visualizing the movie barcode using OpenCV and Python.

Perhaps unsurprisingly, this movie barcode is heavily dominated by The Matrix-style greens and blacks.

Finally, here is one last example from the movie trailer for The Dark Knight:

Here are the commands I used to generate the barcode:

Along with the final visualization:

Figure 5: Building a movie barcode using computer vision and image processing techniques.

Note: I used the website keepvid.com to download the trailers to the movies mentioned above. I do not own the copyrights to these videos nor do I claim to — this blog post is for educational and example purposes only. Please use wisely.

Summary

In today’s blog post I demonstrated how to generate video barcodes using OpenCV and Python.

Generating video barcodes is normally used for design aesthetics and admittedly doesn’t have a far-reaching computer vision/image processing purpose (other than visualization itself).

That said, you could treat the movie barcodes as a form of “feature vector” and use them to compare other movie barcodes/movie clips for similarity or even determine where in a video a given clip appears.
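One simple way to compare two barcodes as feature vectors is the Euclidean distance between them. The sketch below assumes both barcodes contain the same number of bars and that each bar is a three-value color average; the function name is hypothetical:

```python
import math


def barcode_distance(a, b):
    """Euclidean distance between two barcodes with the same number of bars."""
    if len(a) != len(b):
        raise ValueError("barcodes must contain the same number of bars")
    return math.sqrt(sum((x - y) ** 2
                         for bar_a, bar_b in zip(a, b)
                         for x, y in zip(bar_a, bar_b)))
```

A smaller distance means more similar color progressions, though similar colors do not guarantee similar content.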

I hope you enjoyed this blog post!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!

5 Responses to Generating movie barcodes with OpenCV and Python

  1. Michael February 26, 2017 at 1:07 pm #

    Hi Adrian,

    Is there a reason for not setting CAP_PROP_POS_FRAMES to the frame you want to calculate the mean of, i.e., indexing into the video at the desired frame number instead of reading the entire video frame by frame and deciding whether to capture its mean or not?

    • Adrian Rosebrock February 27, 2017 at 11:11 am #

      There is no real reason other than I find the capture properties of OpenCV to be buggy depending on OpenCV version and versions of the video I/O libraries installed.

  2. Akshay April 1, 2017 at 12:35 pm #

    Hi,

    Like you said, there isn’t much use for the generated barcode as an identifier, since a good hash of the movie name already gives a unique ID for each movie. However, I found a use for this barcode as a feature descriptor of the movie trailer. I wanted to know if this is something one can seriously pursue. The reason I ask is that histograms are usually very good at capturing image frame information, so does the barcode have a chance of competing with that?

    • Adrian Rosebrock April 3, 2017 at 2:08 pm #

      The barcode could serve as a feature vector, that is certainly possible. However, just because two movies have similar color distributions does not guarantee they have similar contents. That said, for simple applications computing the Euclidean distance between two barcodes would serve as a simple comparison method.

      • akshay April 4, 2017 at 3:31 am #

        Seems interesting. I shall give it a try and explore more. Thanks!
