Deep learning on the Raspberry Pi with OpenCV

I’ve received a number of emails from PyImageSearch readers who are interested in performing deep learning on their Raspberry Pi. Most of the questions go something like this:

Hey Adrian, thanks for all the tutorials on deep learning. You’ve really made deep learning accessible and easy to understand. I have a question: Can I do deep learning on the Raspberry Pi? What are the steps?

And almost always, I have the same response:

It really depends on what you mean by “do”. You should never be training a neural network on the Raspberry Pi — it’s far too underpowered. You’re much better off training the network on your laptop, desktop, or even a GPU (if you have one available).

That said, you can deploy efficient, shallow neural networks to the Raspberry Pi and use them to classify input images.

Again, I cannot stress this point enough:

You should not be training neural networks on the Raspberry Pi (unless you’re using the Pi to do the “Hello, World” equivalent of neural networks — but again, I would still argue that your laptop/desktop is a better fit).

With the Raspberry Pi there just isn’t enough RAM.

The processor is too slow.

And in general it’s not the right hardware for heavy computational processes.

Instead, you should first train your network on your laptop, desktop, or deep learning environment.

Once the network is trained, you can then deploy the neural network to your Raspberry Pi.

In the remainder of this blog post I’ll demonstrate how we can use the Raspberry Pi and pre-trained deep learning neural networks to classify input images.

Looking for the source code to this post?
Jump right to the downloads section.

Deep learning on the Raspberry Pi with OpenCV

When using the Raspberry Pi for deep learning we have two major pitfalls working against us:

  1. Restricted memory (only 1GB on the Raspberry Pi 3).
  2. Limited processor speed.

This makes it nearly impossible to use larger, deeper neural networks.

Instead, we need to use more computationally efficient networks with a smaller memory/processing footprint such as MobileNet and SqueezeNet. These networks are more appropriate for the Raspberry Pi; however, you need to set your expectations accordingly — you should not expect blazing fast speed.

In this tutorial we’ll specifically be using SqueezeNet.

What is SqueezeNet?

Figure 1: The “fire” module in SqueezeNet, consisting of a “squeeze” layer followed by an “expand” layer (Iandola et al., 2016).

SqueezeNet was first introduced by Iandola et al. in their 2016 paper, SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size.

The title alone of this paper should pique your interest.

State-of-the-art architectures such as ResNet have model sizes that are >100MB. VGGNet is over 550MB. AlexNet sits in the middle of this size range with a model size of ~250MB.

In fact, one of the smaller Convolutional Neural Networks used for image classification is GoogLeNet at ~25-50MB (depending on which version of the architecture is implemented).

The real question is: Can we go smaller?

As the work of Iandola et al. demonstrates, the answer is: Yes, we can decrease model size by applying a novel usage of 1×1 and 3×3 convolutions, along with no fully-connected layers. The end result is a model weighing in at 4.9MB, which can be further reduced to < 0.5MB by model compression (also called “weight pruning” and “sparsifying a model”).
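To get a feel for why the squeeze/expand design shrinks the model so dramatically, here is a quick back-of-the-envelope calculation comparing a plain 3×3 convolution to a single fire module. The channel sizes below are illustrative values I’ve chosen for the sketch, not figures from the paper:

```python
# Illustrative parameter-count arithmetic for one SqueezeNet "fire"
# module versus a plain 3x3 convolution (bias terms ignored).
# The channel sizes are example values, not taken from the paper.

def conv_params(k, c_in, c_out):
    """Number of weights in a k x k convolution mapping c_in to c_out channels."""
    return k * k * c_in * c_out

c_in = 128     # input channels to the module (example value)
squeeze = 16   # number of 1x1 "squeeze" filters
expand = 64    # number of 1x1 and 3x3 "expand" filters each

# A plain 3x3 convolution producing the same 128 output channels:
plain = conv_params(3, c_in, 2 * expand)

# A fire module: squeeze with 1x1 filters, then expand with 1x1 and
# 3x3 filters in parallel (their outputs are concatenated):
fire = (conv_params(1, c_in, squeeze)
        + conv_params(1, squeeze, expand)
        + conv_params(3, squeeze, expand))

print(plain, fire)  # the fire module uses roughly 12x fewer weights
```

The savings come from the 1×1 squeeze layer: the expensive 3×3 filters only ever see the small squeezed volume, never the full input depth.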

In the remainder of this tutorial I’ll be demonstrating how SqueezeNet can classify images in approximately half the time of GoogLeNet, making it a reasonable choice when applying deep learning on your Raspberry Pi.

Interested in learning more about SqueezeNet?

If you’re interested in learning more about SqueezeNet, I would encourage you to take a look at my new book, Deep Learning for Computer Vision with Python.

Inside the ImageNet Bundle, I:

  1. Explain the inner workings of the SqueezeNet architecture.
  2. Demonstrate how to implement SqueezeNet by hand.
  3. Train SqueezeNet from scratch on the challenging ImageNet dataset and replicate the original results by Iandola et al.

Go ahead and take a look — I think you’ll agree with me when I say that this is the most complete deep learning + computer vision education you can find online.

Running a deep neural network on the Raspberry Pi

The source code from this blog post is heavily based on my previous post, Deep learning with OpenCV.

I’ll still review the code in its entirety here; however, I would like to refer you over to the previous post for a complete and exhaustive review.

To get started, create a new file named pi_deep_learning.py, and insert the following source code:

Lines 2-5 simply import our required packages.

From there, we need to parse our command line arguments:
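A sketch of the argument parser follows. The sample file paths passed to parse_args below are placeholders purely for illustration; the actual script calls ap.parse_args() with no arguments so that sys.argv is parsed:

```python
import argparse

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to input image")
ap.add_argument("-p", "--prototxt", required=True,
    help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
    help="path to Caffe pre-trained model")
ap.add_argument("-l", "--labels", required=True,
    help="path to ImageNet labels (i.e., synsets)")

# the real script uses ap.parse_args(); a sample argument list with
# placeholder paths is supplied here so the sketch is self-contained
args = vars(ap.parse_args([
    "--image", "images/barbershop.png",
    "--prototxt", "models/squeezenet_v1.0.prototxt",
    "--model", "models/squeezenet_v1.0.caffemodel",
    "--labels", "synset_words.txt",
]))
print(args["image"])
```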

As is shown on Lines 9-16, we have four required command line arguments:

  • --image : The path to the input image.
  • --prototxt : The path to a Caffe prototxt file, which is essentially a plaintext configuration file in the protobuf text format. I cover the anatomy of Caffe projects in my PyImageSearch Gurus course.
  • --model : The path to a pre-trained Caffe model. As stated above, you’ll want to train your model on hardware which packs much more punch than the Raspberry Pi — we can, however, leverage a small, pre-existing model on the Pi.
  • --labels : The path to class labels, in this case ImageNet “synsets” labels.

Next, we’ll load the class labels and input image from disk:
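A sketch of the label-parsing logic, using two in-memory sample rows in the synset_words.txt format rather than reading the real file from disk:

```python
# two sample rows in the synset_words.txt format (a synset ID, then
# comma-separated class labels) -- stand-ins for the real file, which
# the script reads with open(args["labels"]).read()
labels_text = ("n02119789 kit fox, Vulpes macrotis\n"
               "n02100735 English setter")

# read the labels row-by-row and keep only the first class label
# after the synset ID on each row
rows = labels_text.strip().split("\n")
classes = [r[r.find(" ") + 1:].split(",")[0] for r in rows]
print(classes)

# in the real script the query image is then loaded from disk:
# image = cv2.imread(args["image"])
```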

Go ahead and open synset_words.txt, found in the “Downloads” section of this post. You’ll see that each line/row contains an ID and the class labels associated with it (separated by commas).

Lines 20 and 21 simply read in the labels file line-by-line (rows) and extract the first relevant class label. The result is a classes list containing our class labels.

Then, we utilize OpenCV to load the image on Line 24.

Now we’ll make use of OpenCV 3.3’s Deep Neural Network (DNN) module to convert the image to a blob as well as to load the model from disk:

Be sure to make note of the comment preceding our call to cv2.dnn.blobFromImage on Line 31 above.

Common choices for width and height image dimensions inputted to Convolutional Neural Networks include 32 × 32, 64 × 64, 224 × 224, 227 × 227, 256 × 256, and 299 × 299. In our case we are pre-processing (normalizing) the image to dimensions of 227 x 227 (which are the image dimensions SqueezeNet was trained on) and performing a scaling technique known as mean subtraction. I discuss the importance of these steps in my book.

Note: You’ll want to use 227 x 227 for the blob size when using SqueezeNet and 224 x 224 for GoogLeNet to be consistent with the prototxt definitions.

We then load the network from disk on Line 35 by utilizing our prototxt  and model  file path references.

In case you missed it above, it is worth noting here that we are loading a pre-trained model. The training step has already been performed on a more powerful machine and is outside the scope of this blog post (but covered in detail in both PyImageSearch Gurus and Deep Learning for Computer Vision with Python).

Now we’re ready to pass the image through the network and look at the predictions:
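A sketch of the forward-pass timing and the top-5 sort. Since actually running the network requires the downloaded model files, the forward pass is wrapped in a helper function and a synthetic prediction row stands in for preds[0] so the sorting logic can be demonstrated:

```python
import time
import numpy as np

def classify(net, blob):
    """Pass a blob through a loaded network, timing the forward pass
    (mirrors Lines 39-43 of the script)."""
    net.setInput(blob)
    start = time.time()
    preds = net.forward()
    end = time.time()
    print("[INFO] classification took {:.5} seconds".format(end - start))
    return preds

# a synthetic prediction row stands in for preds[0] so the sorting
# logic can be shown without the model files
preds0 = np.array([0.02, 0.70, 0.05, 0.10, 0.08, 0.01])

# sort the indexes of the probabilities in descending order (higher
# probability first) and grab the top-5 predictions
idxs = np.argsort(preds0)[::-1][:5]
print(idxs)
```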

To classify the query blob, we pass it forward through the network (Lines 39-42) and print out the amount of time it took to classify the input image (Line 43).

We can then sort the probabilities from highest to lowest (Line 47) while grabbing the top five predictions (Line 48).

The remaining lines (1) draw the highest predicted class label and corresponding probability on the image, (2) print the top five results and probabilities to the terminal, and (3) display the image to the screen:

We draw the top prediction and probability on the top of the image (Lines 53-57) and display the top-5 predictions + probabilities on the terminal (Lines 61 and 62).

Finally, we display the output image on the screen (Lines 65 and 66). If you are using SSH to connect with your Raspberry Pi this will only work if you supply the -X flag for X11 forwarding when SSH’ing into your Pi.

To see the results of applying deep learning on the Raspberry Pi using OpenCV and Python, proceed to the next section.

Raspberry Pi and deep learning results

We’ll be benchmarking our Raspberry Pi for deep learning against two pre-trained deep neural networks:

  • GoogLeNet
  • SqueezeNet

As we’ll see, SqueezeNet is much smaller than GoogLeNet (5MB vs. 25MB, respectively) and will enable us to classify images substantially faster on the Raspberry Pi.

To run pre-trained Convolutional Neural Networks on the Raspberry Pi use the “Downloads” section of this blog post to download the source code + pre-trained neural networks + example images.

From there, let’s first benchmark GoogLeNet against this input image:

Figure 3: A “barbershop” is correctly classified by both GoogLeNet and SqueezeNet using deep learning and OpenCV.

As we can see from the output, GoogLeNet correctly classified the image as “barbershop” in 1.7 seconds:

Let’s give SqueezeNet a try:

SqueezeNet also correctly classified the image as “barbershop”

…but in only 0.9 seconds!

As we can see, SqueezeNet is significantly faster than GoogLeNet — which is extremely important since we are applying deep learning to the resource constrained Raspberry Pi.

Let’s try another example with SqueezeNet:

Figure 4: SqueezeNet correctly classifies an image of a cobra using deep learning and OpenCV on the Raspberry Pi.

However, while SqueezeNet is significantly faster, it’s less accurate than GoogLeNet:

Figure 5: A jellyfish is incorrectly classified by SqueezeNet as a bubble.

Here we see the top prediction by SqueezeNet is “bubble”. While the image may appear to have bubble-like characteristics, the image is actually of a “jellyfish” (which is the #2 prediction from SqueezeNet).

GoogLeNet on the other hand correctly reports “jellyfish” as the #1 prediction (with the sacrifice of processing time):

Summary

Today, we learned how to apply deep learning on the Raspberry Pi using Python and OpenCV.

In general, you should:

  1. Never use your Raspberry Pi to train a neural network.
  2. Only use your Raspberry Pi to deploy a pre-trained deep learning network.

The Raspberry Pi does not have enough memory or CPU power to train these types of deep, complex neural networks from scratch.

In fact, the Raspberry Pi barely has enough processing power to run them — as we’ll find out in next week’s blog post, you’ll struggle to obtain a reasonable frame rate for video processing applications.

If you’re interested in embedded deep learning on low cost hardware, I’d consider looking at optimized devices such as NVIDIA’s Jetson TX1 and TX2. These boards are designed to execute neural networks on the GPU and provide real-time (or as close to real-time as possible) classification speed.

In next week’s blog post, I’ll be discussing how to optimize OpenCV on the Raspberry Pi to obtain performance gains by upwards of 100% for object detection using deep learning.

To be notified when this blog post is published, just enter your email address in the form below!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!


27 Responses to Deep learning on the Raspberry Pi with OpenCV

  1. Andrey October 2, 2017 at 10:55 am #

    Nice detailed post. Thank you.

    • Adrian Rosebrock October 3, 2017 at 11:00 am #

      Thanks Andrey!

  2. Flávio Rodrigues October 2, 2017 at 11:03 am #

    Hi, Dr. Adrian! Should not the size of the input image be 227×227 as reported in the prototxt file? And just to be perfect as always, there is a typo on line “Train SqueezeNet from scratch on the challenging ImageNetd ataset and replicate the original results by Iandola et al.” 😉 Thanks a lot for your posts!

    • Adrian Rosebrock October 3, 2017 at 11:02 am #

      Thank you for reporting the typo, Flávio! I have fixed it now.

      As for SqueezeNet, yes, it should be 227×227 as that is what the prototxt reports. I have updated that typo as well.

  3. Ashutosh Dubey October 2, 2017 at 12:39 pm #

    does your book “deep learning for computer vision” is useful to work with raspberry pi?

    • Adrian Rosebrock October 3, 2017 at 11:00 am #

      You can go through the vast majority of the Starter Bundle of Deep Learning for Computer Vision with Python on the Raspberry Pi, but not the Practitioner Bundle or ImageNet Bundle.

      As I mentioned in the blog post, you should really be using your laptop or desktop if you intend to train neural networks from scratch.

  4. Cristian Benglenok October 2, 2017 at 12:49 pm #

    if I set a cluster with 2 or 4 raspbery pi 3, will increase the processing speed considerably? if we use it to deploy a pre-trained deep learning network.

    • Adrian Rosebrock October 3, 2017 at 10:59 am #

      Not really, because at that point you’ll be dealing with network I/O latency at inference time. You’ll need a faster system if you need quicker classifications.

  5. fariborz October 2, 2017 at 1:16 pm #

    hi Adrian
    tanx alot
    nice post
    i need deep learning with raspberry pi
    please make a tutorial for Drowsiness detection with raspberry pi.
    i do drowsiness project with raspberry but it is lagy and very slow

    • Adrian Rosebrock October 3, 2017 at 10:57 am #

      I will be doing a Raspberry Pi + drowsiness detector later this month (October 2017).

      • fariborz October 8, 2017 at 4:40 pm #

        wow nice tanx a lot

  6. Kiran October 2, 2017 at 2:26 pm #

    That’s so awesome! I can’t believe we have made it this far with all the CV blogs! The Pi community is so big and getting bigger and bigger with DNNs. I wanted to know if there’s a separate webpage on your website where people who follow your blog could benchmark and share all their Pi’s performance ratings for CNNs! Looking forward to your upcoming post on running it on live stream! I pity those Pi’s taking so much stress! ha ha 😛

    • Adrian Rosebrock October 3, 2017 at 10:58 am #

      Hi Kiran — I don’t have a dedicated page where readers can share their benchmarks using deep learning and the Raspberry Pi, but I’ll definitely consider this for the future.

  7. aditya October 2, 2017 at 2:44 pm #

    Hello Sir Please write a blog on how to make a model for object recognization in image steps to train the model and what algorithm using deep learning please please make such tut as you have made a tutorial for object detection but it has pre-trained model but i want to learn how to train manually the images soo make such blog tut…
    As you have told before that this is available in your book but really i don’t have enough money to buy your book but i always read your blog and gain knoweldge. I use collage resources to implement your code and i am really intrested in learning so i work hard but i don’t have enough money to get the paid course but your blog has helped me a lot soo thank you very much…

  8. Aleksandr Rybnikov October 2, 2017 at 4:53 pm #

    Thank you, Adrian! Excellent job! I just want to add that model’s size isn’t an exhausting way to estimate inference time. For example, ENet architectire has ~10MB as all weights, but takes a lot of computations and time to apply these weights. OpenCV’s dnn has functions to get FLOPs for model without inference.
    Also recently I added new face detector to the dnn. It’s ResNet-based SSD and I trained it on some open dataset full of faces. It works faster than real-time on i7-6700k and according to tests it’s better than casscades-based one. Maybe this infirmation will be interesting for you

    • Adrian Rosebrock October 3, 2017 at 10:57 am #

      Hi Aleksandr — you’re absolutely right, examining model size is not an exact way to measure inference time. I was instead referring to the limited RAM available on the Raspberry Pi and that users need to be careful not to exhaust this memory.

      The new ResNet-based SSD face detector sounds very interesting as well! I’ll be sure to take a look at this.

  9. Abrie October 3, 2017 at 7:18 am #

    Hi Adrian. Thanks a lot for these tutorials.

    Will next weeks blog post cover how to detect objects from a live video stream on the Raspberry Pi and Picam module?

    Kind regards

    Abrie

    • Adrian Rosebrock October 3, 2017 at 10:55 am #

      Hi Abrie — next week’s blog post will be on optimizing OpenCV and milking every last bit of performance out of it. The following week will then cover deep learning-based object detection on the Raspberry Pi.

  10. Doug October 3, 2017 at 7:35 pm #

    THANK YOU! I downloaded the .py source for this squeezenet demo.. After building python3.6 and then opencv3.3 with tinydnn needed for dnn in cv (or was it?) all in a docker (and not on a Pi), this demo.tutorial worked for me first time. I am now tempted to take the twenty-one day course.

    • Adrian Rosebrock October 4, 2017 at 12:36 pm #

      Nice job, Doug!

  11. chetan j October 9, 2017 at 8:43 am #

    hi,
    how to detect person using this technique.

    • Adrian Rosebrock October 9, 2017 at 12:12 pm #

      Please see this post where I discuss the SSD object detector that includes a “person” class.

  12. memeka October 9, 2017 at 6:06 pm #

    Can the googlenet and squeezenet models be used for object detection too (not classification), or do I need different models?
    Thanks.

    • Adrian Rosebrock October 13, 2017 at 9:18 am #

      No, you cannot swap out arbitrary networks. The network needs to follow a certain object detection framework such as SSD, R-CNN, YOLO, etc. Networks trained strictly for image classification cannot be directly used for object detection to predict the bounding box around an object in an image. I’ll be covering how to train your own custom object detection networks inside Deep Learning for Computer Vision with Python.

      • memeka October 13, 2017 at 10:27 am #

        Do you know of squeezenet or googlenet (or resnet) ssd pretrained models?
        I wanted to see the speed difference between them on my board 🙂

  13. Rolba October 10, 2017 at 11:43 am #

    In general, you should:

    Never use your Raspberry Pi to train a neural network.
    Only use your Raspberry Pi to deploy a pre-trained deep learning network.

    I only tried once just for fun:P.

    But being serious this is very usefull lesson lerned. Imagine you have to count something in the field or what so ever. Something – somewhere without electricity plugin, no IP connection. Middle of Alaska! R-pi is very good choice to power it from solar panel (~5 wats), plugin camera, load pre-trained dnn and classify. Spend 35 bucks on sale and there you go!

    But if you have more money you can always play with very nice Nvidia Jestson platform ;).

    But it’s always a matter of costs. You can always have your R-pi and connect it to Irirdium if you have to collect a lot of data and make a lot of dnn operations on server.

    Take care.
    Rolba

    • Adrian Rosebrock October 13, 2017 at 9:06 am #

      The Raspberry Pi is an extremely versatile tool, you’re absolutely right 🙂
