Raspberry Pi: Deep learning object detection with OpenCV

A few weeks ago I demonstrated how to perform real-time object detection using deep learning and OpenCV on a standard laptop/desktop.

After the post was published I received a number of emails from PyImageSearch readers who were curious if the Raspberry Pi could also be used for real-time object detection.

The short answer is “kind of”…

…but only if you set your expectations accordingly.

Even with our optimized OpenCV + Raspberry Pi install, the Pi only reaches ~0.9 frames per second when applying deep learning for object detection with Python and OpenCV.

Is that fast enough?

Well, that depends on your application.

If you’re attempting to detect objects that are quickly moving through your field of view, likely not.

But if you’re monitoring a low traffic environment with slower moving objects, the Raspberry Pi could indeed be fast enough.

In the remainder of today’s blog post we’ll be reviewing two methods to perform deep learning-based object detection on the Raspberry Pi.



Today’s blog post is broken down into two parts.

In the first part, we’ll benchmark the Raspberry Pi for real-time object detection using OpenCV and Python. This benchmark will come from the exact code we used for our laptop/desktop deep learning object detector from a few weeks ago.

I’ll then demonstrate how to use multiprocessing to create an alternative approach to object detection on the Raspberry Pi. This method may or may not be useful for your particular application, but at the very least it will give you an idea of the different ways to approach the problem.

Object detection and OpenCV benchmark on the Raspberry Pi

The code we’ll discuss in this section is identical to our previous post on Real-time object detection with deep learning and OpenCV; therefore, I will not be reviewing the code exhaustively.

For a deep dive into the code, please see the original post.

Instead, we’ll simply be using this code to benchmark the Raspberry Pi for deep learning-based object detection.

To get started, open up a new file, name it real_time_object_detection.py, and insert the following code:

We then need to parse our command line arguments:
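As a minimal sketch, assuming the three options shown in the script's usage message (`-p/--prototxt`, `-m/--model`, and an optional `-c/--confidence`), the parsing could look like the following. The help strings and the 0.2 default threshold are assumptions on my part, and `build_parser` is just a name chosen for this illustration.

```python
import argparse


def build_parser():
    # Option names follow the script's usage message; the help text and the
    # 0.2 default for --confidence are assumptions made for this sketch.
    ap = argparse.ArgumentParser()
    ap.add_argument("-p", "--prototxt", required=True,
                    help="path to the Caffe 'deploy' prototxt file")
    ap.add_argument("-m", "--model", required=True,
                    help="path to the Caffe pre-trained model")
    ap.add_argument("-c", "--confidence", type=float, default=0.2,
                    help="minimum probability to filter weak detections")
    return ap


# In the real script you would call parse_args() with no arguments so that
# sys.argv is used; an explicit list is passed here so the sketch runs anywhere.
args = vars(build_parser().parse_args([
    "--prototxt", "MobileNetSSD_deploy.prototxt.txt",
    "--model", "MobileNetSSD_deploy.caffemodel",
]))
print(args)
```

Because `--prototxt` and `--model` are marked as required, running the script without them produces the "argument -p/--prototxt is required" error rather than a crash later on.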

Followed by performing some initializations:

We initialize CLASSES, our class labels, and corresponding COLORS, for on-frame text and bounding boxes (Lines 22-26), followed by loading the serialized neural network model (Line 30).
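A rough sketch of these initializations is shown here. It assumes the network was trained on the 20 PASCAL VOC categories plus a background class, and it uses Python's `random` module for the colors purely to keep the example dependency-free; neither detail is confirmed by the text above.

```python
import random

# Class labels, assuming a detector trained on the 20 PASCAL VOC categories
# plus a background class (an assumption about the pre-trained model).
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
           "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
           "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
           "sofa", "train", "tvmonitor"]

# One random color per class, reused for that class's box and label text.
random.seed(42)  # fixed seed only so the colors are the same on every run
COLORS = [tuple(random.uniform(0, 255) for _ in range(3)) for _ in CLASSES]

# Loading the serialized Caffe model requires OpenCV's dnn module, e.g.:
#   net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])
print(len(CLASSES), "classes;", "person is index", CLASSES.index("person"))
```

The class index returned by the network is simply a position in this list, which is why filtering for a single class only requires comparing against one integer.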

Next, we’ll initialize the video stream object and frames per second counter:

We initialize the video stream and allow the camera to warm up for 2.0 seconds (Lines 35-37).

On Line 35 we initialize our VideoStream using a USB camera. If you are using the Raspberry Pi camera module you’ll want to comment out Line 35 and uncomment Line 36 (which will enable you to access the Raspberry Pi camera module via the VideoStream class).

From there we start our fps counter on Line 38.
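To make it concrete what the fps counter measures, here is a small stand-in written with only the standard library. `SimpleFPS` is a hypothetical class for illustration, not the counter used by the script; it only mirrors the start/update/stop pattern described in this post.

```python
import time


class SimpleFPS:
    """Hypothetical stand-in for the script's frames-per-second counter."""

    def __init__(self):
        self._start = None
        self._end = None
        self._frames = 0

    def start(self):
        # Record the starting timestamp and return self for chaining.
        self._start = time.perf_counter()
        return self

    def update(self):
        # Call once per processed frame.
        self._frames += 1

    def stop(self):
        self._end = time.perf_counter()

    def elapsed(self):
        return self._end - self._start

    def fps(self):
        # Throughput is total frames divided by total elapsed seconds.
        return self._frames / self.elapsed()


fps = SimpleFPS().start()
for _ in range(5):
    time.sleep(0.01)  # stands in for grabbing and processing one frame
    fps.update()
fps.stop()
print("elapsed: {:.2f}s, approx. FPS: {:.2f}".format(fps.elapsed(), fps.fps()))
```

Note that the counter reports how often the loop body runs, which is not necessarily how often the neural network produces new detections.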

We are now ready to loop over frames from our input video stream:

Lines 41-55 simply grab and resize a frame, convert it to a blob, and pass the blob through the neural network, obtaining the detections and bounding box predictions.

From there we need to loop over the detections to see what objects were detected in the frame:

On Lines 58-80, we loop over our detections. For each detection we examine the confidence and ensure the corresponding probability of the detection is above a predefined threshold. If it is, then we extract the class label and compute the (x, y) bounding box coordinates. These coordinates will enable us to draw a bounding box around the object in the image along with the associated class label.
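The filtering and box scaling can be sketched with NumPy alone. This assumes the usual SSD output layout of shape (1, 1, N, 7), where each row holds an image id, a class index, a confidence, and four box coordinates normalized to [0, 1]. The `parse_detections` helper and the synthetic array are hypothetical and exist only for this illustration.

```python
import numpy as np


def parse_detections(detections, w, h, min_confidence, classes):
    """Return (label, confidence, (startX, startY, endX, endY)) tuples."""
    results = []
    for i in range(detections.shape[2]):
        confidence = float(detections[0, 0, i, 2])
        if confidence <= min_confidence:
            continue  # weak detection, skip it
        idx = int(detections[0, 0, i, 1])
        # Box coordinates are normalized, so scale them to the frame size.
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        startX, startY, endX, endY = box.astype("int")
        results.append((classes[idx], confidence,
                        (int(startX), int(startY), int(endX), int(endY))))
    return results


# Synthetic network output: two candidate detections for a 400x300 frame.
fake = np.array([[[[0, 1, 0.90, 0.10, 0.20, 0.50, 0.80],
                   [0, 2, 0.05, 0.00, 0.00, 1.00, 1.00]]]], dtype="float32")
found = parse_detections(fake, 400, 300, 0.2, ["background", "cat", "dog"])
print(found)
```

Only the first candidate survives the 0.2 threshold here. In the real script, the returned box would then be passed to the drawing calls along with the class label.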

From there we’ll finish out the loop and do some cleanup:

Lines 82-91 close out the loop: we show each frame, break if the ‘q’ key is pressed, and update our fps counter.

The final terminal message output and cleanup is handled on Lines 94-100.

Now that our brief explanation of real_time_object_detection.py is finished, let’s examine the results of this approach to obtain a baseline.

Go ahead and use the “Downloads” section of this post to download the source code and pre-trained models.

From there, execute the following command:

As you can see from my results we are obtaining ~0.9 frames per second throughput using this method and the Raspberry Pi.

Compared to the 6-7 frames per second using our laptop/desktop we can see that the Raspberry Pi is substantially slower.

That’s not to say that the Raspberry Pi is unusable when applying deep learning object detection, but you need to set your expectations on what’s realistic (even when applying our OpenCV + Raspberry Pi optimizations).

Note: For what it’s worth, I could only obtain 0.49 FPS when NOT using our optimized OpenCV + Raspberry Pi install. That just goes to show you how much of a difference NEON and VFPv3 can make.
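To put those numbers side by side, here is some quick arithmetic using only the figures quoted in this section. Nothing new is measured; the variable names are mine.

```python
pi_optimized_fps = 0.9      # Pi with the optimized OpenCV install
pi_unoptimized_fps = 0.49   # Pi without the optimizations
laptop_fps = (6.0, 7.0)     # laptop/desktop range

# Time spent on a single frame on the optimized Pi (about 1.11 seconds).
seconds_per_frame = 1.0 / pi_optimized_fps

# Gain from the optimized install (about 1.84x).
optimization_speedup = pi_optimized_fps / pi_unoptimized_fps

# How much faster the laptop/desktop is (roughly 6.7x to 7.8x).
laptop_ratio = tuple(f / pi_optimized_fps for f in laptop_fps)

print("{:.2f} s/frame, {:.2f}x from optimizations, laptop {:.1f}x-{:.1f}x faster"
      .format(seconds_per_frame, optimization_speedup, *laptop_ratio))
```

In other words, the optimizations nearly double throughput, but each frame still costs a little over a second on the Pi.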

A different approach to object detection on the Raspberry Pi

Using the example from the previous section we see that calling net.forward() is a blocking operation: the rest of the code in the while loop is not allowed to complete until net.forward() returns the detections.

So, what if net.forward() was not a blocking operation?

Would we be able to obtain a faster frames per second throughput?

Well, that’s a loaded question.

No matter what, it will take a little over a second for net.forward() to complete using the Raspberry Pi and this particular architecture. That cannot change.

But what we can do is create a separate process that is solely responsible for applying the deep learning object detector, thereby unblocking the main thread of execution and allowing our while loop to continue.

Moving the predictions to a separate process will give the illusion that our Raspberry Pi object detector is running faster than it actually is, when in reality the net.forward() computation is still taking a little over one second.

The only problem here is that our output object detection predictions will lag behind what is currently being displayed on our screen. If you are detecting fast-moving objects you may miss the detection entirely, or at the very least, the object will be out of the frame before you obtain your detections from the neural network.

Therefore, this approach should only be used for slow-moving objects where we can tolerate lag.

To see how this multiprocessing method works, open up a new file, name it pi_object_detection.py, and insert the following code:

For the code walkthrough in this section, I’ll be pointing out and explaining the differences (there are quite a few) compared to our non-multiprocessing method.

Our imports on Lines 2-10 are mostly the same, but notice the imports of Process and Queue from Python’s multiprocessing package.

Next, I’d like to draw your attention to a new function, classify_frame:

Our new classify_frame function is responsible for our multiprocessing. Later on we’ll set it up to run in a child process.

The classify_frame function takes three parameters:

  • net: the neural network object.
  • inputQueue: our FIFO (first in first out) queue of frames for object detection.
  • outputQueue: our FIFO queue of detections which will be processed in the main thread.

This child process will loop continuously until the parent exits and effectively terminates the child.

In the loop, if the inputQueue contains a frame, we grab it, and then pre-process it and create a blob (Lines 16-22), just as we have done in the previous script.

From there, we send the blob through the neural network (Lines 26-27) and place the detections in an outputQueue for processing by the parent.
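The shape of that worker loop can be sketched without OpenCV. Two things in this sketch are additions for illustration and are not part of the three-parameter function described above: the `make_blob` argument (the real function does its resizing and blob creation inline) and the `None` sentinel that lets the loop exit. `EchoNet` is a hypothetical stand-in exposing only the `setInput` and `forward` methods the worker calls.

```python
import queue


def classify_frame(net, inputQueue, outputQueue, make_blob):
    # Keep looping until told to stop (the sentinel is an addition here;
    # the worker described in the post runs until the parent exits).
    while True:
        if inputQueue.empty():
            continue  # nothing to do yet, poll again
        frame = inputQueue.get()
        if frame is None:
            break
        # Pre-process the frame, run the forward pass, publish the result.
        net.setInput(make_blob(frame))
        detections = net.forward()
        outputQueue.put(detections)


class EchoNet:
    """Hypothetical stand-in for the network object."""

    def setInput(self, blob):
        self._blob = blob

    def forward(self):
        return ["detections for " + self._blob]


# Run the worker synchronously on one fake frame followed by the sentinel.
inputQueue, outputQueue = queue.Queue(), queue.Queue()
inputQueue.put("frame-0")
inputQueue.put(None)
classify_frame(EchoNet(), inputQueue, outputQueue, make_blob=str.upper)
result = outputQueue.get()
print(result)
```

The worker only needs `empty()`, `get()`, and `put()` from its queues, so the same function body works with `multiprocessing.Queue` objects when it runs in a child process.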

Now let’s parse our command line arguments:

There is no difference here — we are simply parsing the same command line arguments on Lines 33-40.

Next we initialize some variables just as in our previous script:

This code is the same — we initialize class labels, colors, and load our model.

Here’s where things get different:

On Lines 56-58 we initialize an inputQueue of frames, an outputQueue of detections, and a detections list.

Our inputQueue will be populated by the parent and processed by the child: it is the input to the child process. Our outputQueue will be populated by the child and processed by the parent: it is the output from the child process. Both of these queues trivially have a size of one as our neural network will only be applying object detections to one frame at a time.

Let’s initialize and start the child process:

It is very easy to construct a child process with Python’s multiprocessing module: simply specify the target function and args to the function as we have done on Lines 63 and 64.

Line 65 specifies that p is a daemon process, and Line 66 kicks the process off.
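A minimal sketch of the queue and process setup follows, assuming size-one queues as described above. `make_detector` and `idle_worker` are hypothetical names for this illustration; in the script the target would be classify_frame and `net` would be the loaded model. The process is constructed but deliberately not started here so the sketch has no side effects.

```python
from multiprocessing import Process, Queue


def make_detector(target, net):
    # Size-one queues: the network only works on one frame at a time.
    inputQueue = Queue(maxsize=1)
    outputQueue = Queue(maxsize=1)
    p = Process(target=target, args=(net, inputQueue, outputQueue))
    # A daemon process is terminated automatically when the parent exits.
    p.daemon = True
    return inputQueue, outputQueue, p


def idle_worker(net, inputQueue, outputQueue):
    """Placeholder target; the script would pass classify_frame instead."""


inputQueue, outputQueue, p = make_detector(idle_worker, net=None)
detections = None  # nothing has come back from the child yet

# Calling p.start() launches the child. On platforms that spawn rather than
# fork, do this from under an `if __name__ == "__main__":` guard.
print("daemon:", p.daemon, "alive:", p.is_alive())
```

Marking the child as a daemon is what allows the worker loop to run forever without keeping the program alive after the main loop ends.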

From there we’ll see some more familiar code:

Don’t forget to change your video stream object to use the PiCamera if you desire by switching which line is commented (Lines 71 and 72).

Once our vs object and fps counters are initialized, we can loop over the video frames:

On Lines 80-82, we read a frame, resize it, and extract the width and height.

Next, we’ll work our queues into the flow:

First we check if the inputQueue is empty. If it is, we put a frame in the inputQueue for processing by the child (Lines 86 and 87). Remember, the child process is running in an infinite loop, so it will be processing the inputQueue in the background.

Then we check if the outputQueue is not empty. If something is in it, we grab the detections for processing here in the parent (Lines 90 and 91). When we call get() on the outputQueue, the detections are returned and the outputQueue is now momentarily empty.
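The parent's side of this exchange can be sketched as a small helper. `exchange` is a hypothetical function name, and `queue.Queue` stands in for `multiprocessing.Queue` here since both offer `empty()`, `put()`, and `get()`; the child's work is simulated by hand.

```python
import queue


def exchange(frame, inputQueue, outputQueue, detections):
    # Hand the child a new frame only if it is not already holding one.
    if inputQueue.empty():
        inputQueue.put(frame)
    # Pick up fresh detections if the child has produced any; otherwise
    # keep whatever we had before.
    if not outputQueue.empty():
        detections = outputQueue.get()
    return detections


inputQueue, outputQueue = queue.Queue(maxsize=1), queue.Queue(maxsize=1)
detections = None

detections = exchange("frame-0", inputQueue, outputQueue, detections)
detections = exchange("frame-1", inputQueue, outputQueue, detections)
# frame-1 was skipped because frame-0 is still waiting, and no result exists.
still_waiting = detections is None

# Simulate the child: consume frame-0 and publish its detections.
inputQueue.get()
outputQueue.put("boxes for frame-0")

detections = exchange("frame-2", inputQueue, outputQueue, detections)
print(detections)
```

Frames that arrive while the child is busy are never queued for detection; they are only displayed, with the most recent detections drawn on top.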

If you are unfamiliar with Queues or if you want a refresher, see this documentation.

Let’s process our detections:

If our detections list is populated (it is not None), we loop over the detections as we have done in the previous section’s code.

In the loop, we extract and check the confidence against the threshold (Lines 100-105), extract the class label index (Line 110), and draw a box and label on the frame (Lines 111-122).

From there in the while loop we’ll complete a few remaining steps, followed by printing some statistics to the terminal, and performing cleanup:

In the remainder of the loop, we display the frame to the screen (Line 125), capture a key press, and break out of the loop if it is the quit key (Lines 126-130). We also update our fps counter.

To finish out, we stop the fps counter, print our time/FPS statistics, and finally close windows and stop the video stream (Lines 136-142).

Now that we’re done walking through our new multiprocessing code, let’s compare the method to the single thread approach from the previous section.

Be sure to use the “Downloads” section of this blog post to download the source code + pre-trained MobileNet SSD neural network. From there, execute the following command:

Here you can see that our while loop is capable of processing 27 frames per second. However, this throughput rate is an illusion: the neural network running in the background is still only capable of processing 0.9 frames per second.

Note: I also tested this code on the Raspberry Pi camera module and was able to obtain 60.92 frames per second over 35 elapsed seconds.

The difference here is that we can obtain real-time throughput by displaying each new input frame in real-time and then drawing any previous detections on the current frame.

Once we have a new set of detections we then draw the new ones on the frame.
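A rough back-of-the-envelope estimate, using the 27 FPS display rate and the 0.9 FPS detection rate reported above, shows how stale the drawn boxes can get. This ignores any time a frame spends waiting in the queue, so treat the results as lower bounds.

```python
display_fps = 27.0   # rate of the while loop
detector_fps = 0.9   # rate of the background neural network

# Each set of detections is redrawn on roughly this many displayed frames
# before a newer set arrives (about 30).
frames_per_detection = display_fps / detector_fps

# The boxes on screen describe a frame captured at least this long ago
# (about 1.11 seconds).
min_lag_seconds = 1.0 / detector_fps

print("each detection reused for ~{:.0f} frames, at least {:.2f}s old"
      .format(frames_per_detection, min_lag_seconds))
```

A fast-moving object can easily cross the field of view in that time, which is why the boxes sometimes remain after the object has gone.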

This process repeats until we exit the script. The downside is that we see substantial lag. There are clips in the above video where we can see that all objects have clearly left the field of view…

…however, our script still reports the objects as being present.

Therefore, you should consider only using this approach when:

  1. Objects are slow moving and the previous detections can be used as an approximation to the new location.
  2. Displaying the actual frames themselves in real-time is paramount to user experience.


In today’s blog post we examined using the Raspberry Pi for object detection using deep learning, OpenCV, and Python.

As our results demonstrated we were able to get up to 0.9 frames per second, which is not fast enough to constitute real-time detection. That said, given the limited processing power of the Pi, 0.9 frames per second is still reasonable for some applications.

We then wrapped up this blog post by examining an alternate method to deep learning object detection on the Raspberry Pi by using multiprocessing. Whether or not this second approach is suitable for you is again highly dependent on your application.

If your use case involves low traffic object detection where the objects are slow moving through the frame, then you can certainly consider using the Raspberry Pi for deep learning object detection. However, if you are developing an application that involves many objects that are fast moving, you should instead consider faster hardware.

Thanks for reading and enjoy!

And if you’re interested in studying deep learning in more depth, be sure to take a look at my new book, Deep Learning for Computer Vision with Python. Whether this is the first time you’ve worked with machine learning and neural networks or you’re already a seasoned deep learning practitioner, my new book is engineered from the ground up to help you reach expert status.

Just click here to start your journey to deep learning mastery.




99 Responses to Raspberry Pi: Deep learning object detection with OpenCV

  1. Mr.Odeh October 16, 2017 at 11:16 am #

    Thanks, dr.adrian for this great article!
    It is working fine with me, but I have a small question.
    I want to detect insects in real time using raspberry, do you recommend any pre-trained module that can do the object detection for insects not just persons, dogs, sofa …etc? if there isn’t what should u do in general to achieve my aim?
    thanks again

    • Adrian Rosebrock October 16, 2017 at 12:14 pm #

      I’m not aware of a pre-trained model that specifically detects insects. I would suggest fine-tuning an existing, pre-trained neural network. I discuss fine-tuning inside Deep Learning for Computer Vision with Python.

  2. JBeale October 16, 2017 at 11:55 am #

    Very impressive! My own experiments on RPi was about 3 or 4 seconds per frame. So almost 1 fps is quite an improvement from that. With a moderately wide-angle lens that could already be useful, unless you must have the object almost completely fill the frame. Does this code fully utilize all 4 cores on the RPi 3, or is there potentially some additional parallelization possible?

    • Adrian Rosebrock October 16, 2017 at 12:13 pm #

      There are always more optimizations that can be made, it’s just a matter of if it’s worth it. To fully utilize all cores to their maximum potential we would need OpenCL, which (to my knowledge) the Raspberry Pi does not support.

      • JBeale October 17, 2017 at 1:46 pm #

        Thanks. I have yet to try your code, but if (for example) it only uses 2 cores and it’s CPU-bound rather than memory or I/O bound, you ought to get a speedup by simply instantiating two separate processes, one looking at odd frames and one doing even frames. But maybe it’s not that easy; you might run out of memory.

      • JBeale October 18, 2017 at 12:14 pm #

        Speaking of OpenCL on Raspberry Pi: it is not 100% complete, but:
        09-OCT-2017 : I present to you VC4CL (VideoCore IV OpenCL):

        • Adrian Rosebrock October 18, 2017 at 3:55 pm #

          Awesome, thanks for sharing.

          • Jay October 27, 2017 at 1:53 am #

            Came across this article by accident. Great site. I realise this is a python based site but what are the speed improvements were this to be implemented in C++ for comparison sake?

          • Adrian Rosebrock October 27, 2017 at 11:15 am #

            Hi Jay — I’m glad you enjoyed the blog post. Python is just a wrapper around the original C/C++ code for OpenCV. So the speed will be very similar.

  3. Dayle October 16, 2017 at 12:18 pm #

    Hi Adrian,

    Thanks a ton for remembering us Pi enthusiasts. I first got interested in image analysis after someone stole your beer, but was afraid you would lose interest in the Pi after purchasing the beast.

    Looking forward to diving in to this post and reading the new book.


    • Adrian Rosebrock October 16, 2017 at 12:35 pm #

      Hi Dayle, I certainly have not lost interest in the Raspberry Pi; I’ve just primarily been focusing on deep learning tutorials lately 🙂

  4. Anish October 16, 2017 at 2:25 pm #

    How is this method different from using Squezenet for object detection on a raspberry pi?
    The one you posted a couple of weeks ago?
    Also what are the pros and cons of using squezenet over this method?

    • Adrian Rosebrock October 17, 2017 at 9:37 am #

      SqueezeNet is an image classifier. It takes an entire image and returns a single class label. It does no object detection or localization.

      The SSD and Faster R-CNN frameworks can be used for object detection. It requires that you take an architecture (SqueezeNet, VGGNet, etc.) and then train it using the object detection framework. This will minimize the joint loss between class label prediction AND localization.

      The gist is that vanilla SqueezeNet and SSD are two totally different frameworks.

      If you’re interested in learning more about deep learning (and how these architectures differ), I would definitely suggest working through Deep Learning for Computer Vision with Python where I cover these methods in detail.

  5. Ahmad October 16, 2017 at 6:33 pm #

    i have this error , there is a problem here -> args = vars(ap.parse_args())

    usage: real_time_object_detection.py [-h] -p PROTOTXT -m MODEL [-c CONFIDENCE]
    real_time_object_detection.py: error: argument -p/--prototxt is required

    • Adrian Rosebrock October 17, 2017 at 9:34 am #

      Please read up on command line arguments.

      • aman January 6, 2018 at 4:27 am #

        I am also getting same error as above mentioned.

        usage: pi_object_detection.py [-h] -p PROTOTXT -m MODEL [-c CONFIDENCE]
        pi_object_detection.py: error: argument -p/--prototxt is required

        I read out your suggestion of “command line argument” but didn’t able to understand how to use it.

        So, help me ,how would i remove this error?

        • Adrian Rosebrock January 8, 2018 at 2:54 pm #

          Hey Aman, this blog assumes some basic knowledge of working with the command line. If you’re new to the command line that’s okay, we all start somewhere! Take the time to familiarize yourself with command line arguments before continuing. Otherwise, be sure to Google “command line name of your operating system” and read up on how to use the command line for your OS.

  6. Komoriii October 16, 2017 at 9:38 pm #

    Impressive tutorial.This article helped me a lot,thank you!

    • Adrian Rosebrock October 17, 2017 at 9:34 am #

      Fantastic, I’m glad to hear it Komoriii! 🙂

  7. Sachin October 17, 2017 at 1:36 am #

    Great post as always, Adrian! I have learned a lot about computer vision from the content on your site.

    Regarding doing AI on the Pi, I would personally not do detection and recognition on an edge device. At least not until they ship a Pi with an AI chip and a decent GPU! And maybe not even then, due to the high power (electricity) consumption of AI. I’d much rather use the Pi as a sensor + basic signal processor, WiFi over all the video / sensor signals to a CPU box, and run all the algorithms on that box.
    So I guess I agree with your conclusions.

    • Adrian Rosebrock October 17, 2017 at 9:33 am #

      Hi Sachin — thanks for the comment. I actually discuss the tradeoffs of using the Raspberry Pi for deep learning in this post. In general, I do agree with you that a Raspberry Pi should not be used for deep learning unless under very specific circumstances.

      • Peter November 30, 2017 at 9:53 pm #

        Any specific CPU box that you think is a good (relatively cheap) option for doing the post-processing?

        • Adrian Rosebrock December 2, 2017 at 7:31 am #

          What type of post-processing are you referring to? The type of post-processing you are doing would impact my suggestion.

  8. Abhishek October 17, 2017 at 3:35 am #

    Hi Adrian, i love ur work, Sir can you please tell me how i can compute :the (x, y)-coordinates of the bounding box for the object if i’m using Squeezenet instead of MobileNet SSD caffe Model on my raspberry pi 3…..what i supposed to change in “box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])” so that it will work with detecting object in squeezenet with highest probablity (I’m able to find the index of object with highest probability till now with your previous post on deep learning) any help is appreciated 🙂 [i have raspberrian stretch with opencv3.30 -dev installed(neon optimized)]

    • Adrian Rosebrock October 17, 2017 at 9:31 am #

      Are you using your own custom trained SqueezeNet using the SSD framework? Keep in mind that you cannot swap out networks trained for image classification and use them for object detection.

      • Abhishek October 17, 2017 at 11:16 pm #

        Thanks for reply :), i figured that out eventually (Previously i was using SqueezeNet v1.1 imageclassifier instead of SSD framework) but i found another great SqueezeNet-SSD (based on Squeezenet v1.1)


        I benchmarked it using this script but see merely any difference from MobileNetSSD …..both have same FPS around 1.72FPS(opencv optimized) on normal script and 29.4FPS(opencv optimized) using this Multithreaded script…..Through Squeezenet v1.1 (around 0.4 ms on raspberry pi 3) is way faster than any other image classifier, why this Squeezenet-SSD is slower ? I’m totally confused :\

        • Adrian Rosebrock October 19, 2017 at 4:56 pm #

          SqueezeNet v1.1 is slower because it utilizes ResNet-like modules. These increase accuracy, but slow the network down a bit.

          • cweihang December 29, 2017 at 3:30 am #

            I think that’s not the point. It’s the difference between SqueezeNet-SSD and SqueezeNet that causes the speed difference.

          • Adrian Rosebrock December 31, 2017 at 9:53 am #

            Thanks for the note, I must have misunderstood the original comment when I read it the first time. In general, yes, image classification networks will tend to run faster than object detection networks.

  9. David Killen October 17, 2017 at 4:46 am #

    trivial point, no need to publish, but you say ‘net.forwad()’ vice ‘net.forward()’ at least twice

    • Adrian Rosebrock October 17, 2017 at 9:30 am #

      Thanks for letting me know, David! I have updated the post.

  10. M.Komijani October 17, 2017 at 8:37 am #

    Hello Adrian,

    Thanks for this great article!
    Actually, I’m a starter in deep learning, but I want to use the Raspberry Pi for deep learning object detection.
    One question: Does x-nor net improves the speed results?

    • Adrian Rosebrock October 17, 2017 at 9:27 am #

      I haven’t used XNOR net to benchmark it, but from the paper the argument is that you can use XNOR net to speed up the network. You end up saving (approximately) 32x memory and 58x faster convolutional operations.

  11. Ying October 18, 2017 at 5:09 am #

    hi Adrian,

    Can we only detect people or car (i.e. specific class) by changing the python code?

    • Adrian Rosebrock October 19, 2017 at 4:53 pm #

      Yes. Check the idx of the predicted class and filter out the ones you are uninterested in.

  12. jsmith October 18, 2017 at 11:27 am #

    Hi Adrian,

    I’ve been wanting to do this for months, and it was this that got me to your website, so thank you!

    I have been trying to tweak the code so that I can grab the frame when an object is detected and save that as a .jpg in a folder as:


    following the ‘Accessing the Raspberry Pi Camera with OpenCV and Python’ tutorial, so I can have my own dataset by using the Pi to do all the hard work.

    However I keep getting an mmalError message.

    How would you go about taking a frame from the Pi when it detects an object and saving that frame in a folder with that object’s class so you can have a dataset to work with?


    • Adrian Rosebrock October 19, 2017 at 4:51 pm #

      I would suggest debugging this line-by-line. Try to determine what line is throwing the error by inserting “print” statements. If you can provide that, I can try to point you in the right direction.

      From there, you can use the cv2.imwrite to save your image to disk. You can format your filename using the detected label and associated probability returned by net.forward.

  13. Roald October 18, 2017 at 3:52 pm #

    Hi Adrian,

    I’m having issues that net.forward() seems to return inconsistent results. For example, I have two frames. One frame with my cat, one frame without. If I process frame-without-cat, the cat is not found. If I process frame-with-cat, it is correctly found. However, if I do this:

    detect frame-without-cat
    detect frame-with-cat
    detect frame-without-cat

    I get the following results:
    cat not detected
    cat detected
    cat detected

    Which is inconsistent, as the third and first frame should have the same result. However, if for each detection I reload the model, this issue does not occur. It looks as though the net retains previous detections?

    Do you have any idea what this could be? If need be, I can provide example data and source.

    • Adrian Rosebrock October 19, 2017 at 4:47 pm #

      Hi Roald — this is indeed strange; however, I would double-check your images and 100% verify that you are passing in the correct images as you expect. The network should not be retaining any type of memory from previous detections. Secondly, check the confidence (i.e., probability) of your false-positives and see if you can increase the confidence to filter out these weak detections.

  14. Noble October 20, 2017 at 12:03 am #

    Hi Adrian,

    Downloaded the example for this post and ran it:

    $ $ python3 real_time_object_detection.py
    usage: real_time_object_detection.py [-h] -p PROTOTXT -m MODEL [-c CONFIDENCE]
    real_time_object_detection.py: error: the following arguments are required: -p/--prototxt, -m/--model

    How do I specify the path for the protext file and the model file to ap.add_argument. They are all in the same folder.

    • Adrian Rosebrock October 22, 2017 at 8:46 am #

      Please read up on command line arguments. This will enable you to learn more about command line basics. Furthermore, I also present examples on how to run the Python script via the command line inside this blog post.

    • aman January 8, 2018 at 3:22 am #

      Hi, i also got the same problem while running the python program.
      Have you solved this or still stucked in this problem

  15. Human October 20, 2017 at 5:15 pm #

    i want to track a ball is that code reliable to do the task

    • Adrian Rosebrock October 22, 2017 at 8:37 am #

      I cover ball/object tracking inside this post.

  16. Apramey October 21, 2017 at 7:45 am #

    Hello Adrian Sir,
    I’m great fan of all your articles and I ought learn more from you.
    I’m presently running on ubuntu mate on raspberry pi 3, I even optimized pi, the way you told in previous post. I removed the unnecessary applications of ubuntu mate which I’m not using. The code runs without any error. But the problem is GPU rendering, I get the frame, but I can’t visualize the video it’s recording. it continuously lags after code starts running.

    • Adrian Rosebrock October 22, 2017 at 8:35 am #

      The Raspberry Pi will only be able to process ~1 frame per second using this deep learning-based object detection method so the lag is entirely normal. Is there another type of lag you are referring to?

  17. Reed October 29, 2017 at 1:01 am #

    Hi Adrian
    I tried to run the codes above, but the result was
    $ python pi_object_detection.py --prototxt MobileNetSSD_deploy.prototxt.txt --model MobileNetSSD_deploy.caffemodel
    [INFO] loading model…
    [INFO] starting process…
    [INFO] starting video stream…

    (h, w) = image.shape[:2]
    AttributeError: ‘NoneType’ object has no attribute ‘shape’

    and I also read the post on 26th Dec 2016, still have no clue
    what should I do?

    • Adrian Rosebrock October 30, 2017 at 3:19 pm #

      Hi Reed — this error usually occurs when the webcam didn’t properly grab an image. You could put the following between lines 80 and 81:

      if frame is None:

      • JayEf November 25, 2017 at 3:23 pm #

        now it gives me:
        NameError: name ‘frame’ is not defined

  18. chetan k j November 8, 2017 at 12:40 am #

    will u please tel me how to detect human(person) within a fraction of second.

    i used this coding technique, found idx values and compared with threshold but its having one to two second delay.

    if idx == 15 and threshold>0.2:
    print ‘human detected’

    please suggest how to detect human within fraction of second.

  19. chetan k j November 8, 2017 at 7:48 am #


    please help to find human within fraction of second using raspberry pi-3.

    • Adrian Rosebrock November 9, 2017 at 6:42 am #

      I would suggest taking a look at this blog post to start. Even with an optimized OpenCV install you are not going to be able to detect objects in a fraction of a second on the Raspberry Pi, it’s simply too slow.

      • Chetan K j November 9, 2017 at 1:47 pm #

        Thanks for ur reply,
        Will u please give me suggestions, which board is used to do this operation means with a fraction of second detecting human…. Which board is better to use

        • Adrian Rosebrock November 13, 2017 at 2:21 pm #

          I would suggest trying the Jetson TX1 and TX2.

  20. sachin November 9, 2017 at 1:52 am #

    nice work

    • Adrian Rosebrock November 9, 2017 at 6:14 am #

      Thanks Sachin!

  21. Dharshan November 9, 2017 at 7:47 am #

    Hi Adrian,

    Fantastic Work. Thanks a lot for sharing the code. I used an IP camera that streams h264/Jpeg camera over RTSP and was able to see fps of 0.6 as compared to 0.9 you see. Not Bad I guess. Exploring avenues to increase the fps count.

    • Adrian Rosebrock November 13, 2017 at 2:28 pm #

      Nice job, Dharshan!

  22. hamed November 9, 2017 at 2:10 pm #

    hi does it have any image processing in matlab soft

    • Adrian Rosebrock November 13, 2017 at 2:20 pm #

      Are you referring to the Raspberry Pi? The Raspberry Pi does not include any out-of-the-box image processing software for MATLAB.

  23. Andres November 10, 2017 at 11:37 pm #

    Hi Adrian! I wanted to know if it is needed to have opencv 3.3 to run this programme since everytime i try to intall it my rasp gets frozen. i have installed previous versions of opencv without problem.
    Nice work by the way 🙂

    • Adrian Rosebrock November 13, 2017 at 2:09 pm #

      Hi Andres — yes, OpenCV 3.3 is required for this blog post. If you’re having trouble getting OpenCV installed on your Pi, please see this tutorial.

  24. Paul November 19, 2017 at 7:16 am #

    Thanks for this great blog. I got it working on my robot with a Raspberry Pi 3 and an Arduino. I am streaming the output frames to a web client because the robot has no monitor attached. However, it is very slow given all the other processes that are running in my robot script, so it takes 10 seconds to identify the objects. Is it possible to limit the number of known objects to look for, for example to look only for persons, sofas, potted plants and chairs? Would this speed up the detection process? (I don’t have airplanes or cars in my living room, where the robot is supposed to manoeuvre.)

    • Adrian Rosebrock November 20, 2017 at 4:05 pm #

      The speed of the object detection has nothing to do with the number of classes — it has to do with the depth/complexity of the network (within reason). The deeper/more complex the network, the slower it will run. I have a tutorial on optimizing the Raspberry Pi but in general, if you want to deploy a deep learning object detector, the Raspberry Pi is not the right hardware. I would recommend an NVIDIA Jetson.
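
      If you only care about a few classes, one option is to discard the unwanted detections after the forward pass. This does not make the network run any faster; it only trims the results. Here is a minimal sketch, assuming the 20-class label list used by the MobileNet SSD model and detection rows laid out as [image_id, class_id, confidence, x1, y1, x2, y2]. The helper name filter_detections and the WANTED set are hypothetical:

```python
# Class labels assumed to match the MobileNet SSD model from the post
# (index 0 is the background class).
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
           "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
           "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
           "sofa", "train", "tvmonitor"]

# Hypothetical set of labels we want to keep.
WANTED = {"person", "sofa", "pottedplant", "chair"}


def filter_detections(rows, wanted=WANTED, min_confidence=0.2):
    """Keep rows whose label is in `wanted` and whose confidence is high enough.

    Each row is assumed to be
    [image_id, class_id, confidence, x1, y1, x2, y2].
    Returns a list of (label, confidence, box) tuples.
    """
    kept = []
    for row in rows:
        class_id = int(row[1])
        confidence = float(row[2])
        label = CLASSES[class_id]
        if confidence < min_confidence or label not in wanted:
            continue
        kept.append((label, confidence, tuple(row[3:7])))
    return kept
```

      With the array returned by net.forward(), the rows would typically be detections[0, 0].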

  25. Abuthahir November 20, 2017 at 6:41 am #

    Hey. That was a great tutorial, and it eventually worked for me. I want to make this run when my Raspberry Pi boots. I tried the approach linked here (https://pastebin.com/zjyEq99c) but it failed. Is there any way to do it? I’d appreciate any help.

    • Adrian Rosebrock November 20, 2017 at 3:46 pm #

      I would recommend following this tutorial where I demonstrate how to run the script from boot. Be sure to take a look at the comments as well where we discuss a few other alternative methods.

  26. Sharath November 26, 2017 at 10:59 am #

    Hi Adrian,
    Really Nice. Keep up the good work.
    I am using the Pi and OpenCV for the first time in my life. I simply followed the steps from your installation tutorial and it really worked.
    Then I integrated this Python code on my Stretch OS with an external Logitech webcam (C920-C). It worked the first time and I was really happy to see it working.

    Later, I restarted the Pi to check the behaviour again, and all of a sudden it stopped working and started throwing an error in the command window. Now I am stuck.

    Error message :
    “VIDEOIO ERROR: V4L2: Pixel format of incoming image is unsupported by OpenCV
    Unable to stop the stream: Device or resource busy”.

    Then I started researching on Google and found many blogs saying it’s a common issue, but unfortunately I could not crack it.

    Do I need to change the WITH_LIBV4L setting in “/home/pi/opencv-3.3.0/CMakeLists.txt” from OFF to ON, i.e. OCV_OPTION(WITH_LIBV4L “Use libv4l for Video 4 Linux support” ON, and recompile OpenCV?
    If yes, should I use the same commands from #Step5 to #Step7 of the compile and install instructions? The whole process will take 10 hours again. 🙁

    Do you have any other solution that would solve this more easily?


  27. Lenny December 5, 2017 at 9:26 am #

    Dear Adrian,
    Thank you for this website; it helps me understand how to do new things. However, I am running into this problem and cannot figure it out:

    (cv)pi@raspberry:~/real-time-object-detection $ python real_time_object_detection.py --prototxt MobileNetSSD_deploy.prototxt.txt --model MobileNetSSD_deploy.caffemodel
    [INFO] loading model…
    [INFO] starting video stream…
    Traceback (most recent call last):
    File “real_time_object_detection.py”, line 48, in <module>
    frame = imutils.resize(frame, width=400)

    (h, w) = image.shape[:2]
    AttributeError: ‘NoneType’ object has no attribute ‘shape’

    This file exists: /home/pi/.virtualenvs/cv/local/lib/python2.7/site-packages/imutils/convenience.py
    and I am pretty sure I did not mess up the installation.

    Do you have any advice?

    Best regards,

    • Adrian Rosebrock December 8, 2017 at 5:18 pm #

      It sounds like your Raspberry Pi is unable to read the frame from your camera sensor as vs.read() is returning None. Are you using the Raspberry Pi camera module? Or a USB camera? I would also suggest reading up on NoneType errors as I discuss here.
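
      A defensive way to handle this in the frame loop is to check for None before resizing, and to give up after a few failed reads. Here is a minimal sketch, assuming the stream is any object with a read() method that returns None when no frame is available (as vs.read() is doing here). The helper name read_valid_frame and its retry parameters are hypothetical:

```python
import time


def read_valid_frame(stream, retries=10, delay=0.1):
    """Return the first non-None frame from stream.read().

    Raises RuntimeError if every attempt returns None, which usually
    means the camera is not connected, not enabled, or in use elsewhere.
    """
    for _ in range(retries):
        frame = stream.read()
        if frame is not None:
            return frame
        time.sleep(delay)  # give the camera sensor time to warm up
    raise RuntimeError("camera returned no frames; check the camera connection")
```

      The returned frame can then be passed to imutils.resize as before, and the RuntimeError gives a clearer message than the AttributeError above.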

  28. Benni Joel December 9, 2017 at 3:34 am #

    Thank you, Adrian. But I wonder if I can detect a cube. Are there any packages like Caffe that can detect a cube of a particular dimension? Thank you in advance.

    • Adrian Rosebrock December 9, 2017 at 7:25 am #

      Hey Benni — are you trying to detect just a cube? Or the dimensions associated with a cube?

  29. Gaurav December 12, 2017 at 12:32 am #

    Hi Adrian, thanks for the great tutorial. I have been following your tutorials for a long time. I followed this tutorial and it worked as expected. Thanks a lot.

    Does the detection time depend on image resolution? Can we reduce the detection time if we connect a low-resolution USB camera to the Raspberry Pi?

    • Adrian Rosebrock December 12, 2017 at 9:06 am #

      The input resolution to the neural network is fixed in this case (300x300px). Reducing the resolution from the USB camera to the Pi will speed up the frame acquisition rate, but keep in mind that the bottleneck here isn’t frame I/O; it’s the speed of the SSD performing object detection.

  30. Gaurav Wable December 12, 2017 at 1:42 am #

    How can I track a detected object? E.g. I want to track the movement of the first detected person among all the people present in front of the camera.

    • Adrian Rosebrock December 12, 2017 at 9:03 am #

      I would suggest taking a look at tracking algorithms, in particular “correlation tracking”. The dlib library has an implementation of correlation tracking.

  31. raghav December 21, 2017 at 9:23 am #

    hey, error = ‘module object has no attribute

  32. Dom December 21, 2017 at 7:36 pm #

    Hi Adrian. Do you know how to stream this real-time object detection to a web page on the local network (web browser)?

    • Adrian Rosebrock December 22, 2017 at 6:45 am #

      Hi Dom — I don’t have any tutorials on streaming the output frames to a web browser but I’ll add this to my queue for 2018. Thank you for the suggestion!

  33. Gary December 22, 2017 at 3:08 pm #

    How can I stream object detection to the web? Maybe I can do it with Flask or Django? Thanks.

    • Adrian Rosebrock December 26, 2017 at 4:38 pm #

      Hey Gary — I do not have any tutorials for RTSP or MJPEG streaming but I’ll add it to my queue!

  34. seyed December 28, 2017 at 4:16 pm #

    Hi Adrian. Thanks for the great tutorial. I downloaded the code and ran ‘pi_object_detection.py’ on the Raspberry Pi and got an error: ”no module named ‘imutils’ ”. Then I opened a terminal and typed ‘pip install imutils’
    and received ‘Successfully installed imutils-0.4.5’, but I still get the same error.
    Can you help me, please?

    • Adrian Rosebrock December 31, 2017 at 9:58 am #

      Are you using a Python virtual environment? If so, make sure you use the “workon” command to access your Python virtual environment, install imutils, and then execute the script.
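
      When pip reports success but the import still fails, pip and the script are often using different Python interpreters. Here is a small sketch for checking this, assuming Python 3 and that it is run with the same python command used to launch the detection script. The helper name locate_module is hypothetical:

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file path this interpreter would import `name` from,
    or None if the module is not visible to this interpreter."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # The interpreter path shows whether the virtual environment is active.
    print("interpreter:", sys.executable)
    print("imutils:", locate_module("imutils"))
```

      If the interpreter path does not point inside your virtual environment, or imutils is reported as None, the package was most likely installed for a different Python.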

  35. wally December 30, 2017 at 3:25 pm #


    Have you considered the possibilities of the new Google AIY Vision kit?

    It seems to be some kind of co-processor/microcontroller add-on board (vision bonnet) that is supposed to let a PiZero-W run deep learning object detection and classification models.

    I got one, but I need to return it as the connector for the flat cable broke when I tried to open it, so I didn’t get to first base 🙁

    Presumably it could work with the Pi3 as well.

    Google provides a “compiler” that takes a TensorFlow “frozen model” and produces code to run on the vision bonnet.

    Obviously my hope here would be to get a higher frame rate than Pi3 can provide. The cost of the vision bonnet kit (~$45) and a PiZero-W (~$10) would not be too much above the price of a Pi3 if we can increase the frame rate significantly.

    • Adrian Rosebrock December 31, 2017 at 9:34 am #

      I am interested in playing around with the Google AIY Vision Kit but I haven’t purchased one yet. I’m still a bit concerned with using a Pi Zero. With only a single core responsible for all system + user operations your frame rate will suffer even if you are using threading to decrease I/O latency.

  36. Mark Z January 1, 2018 at 2:07 pm #

    I am using your Raspberry Pi image from your course, and the above code, and get the following error. Googling the error doesn’t give me a clear path to resolving this, can you please advise?

    File “pi_object_detection.py”, line 55, in <module>
    net = cv2.dnn.readNetFromCaffe(args[“prototxt”], args[“model”])
    AttributeError: ‘module’ object has no attribute ‘dnn’

    • Adrian Rosebrock January 3, 2018 at 1:12 pm #

      Hey Mark, you need OpenCV 3.3 or greater to access the “dnn” module that contains the deep learning features. OpenCV versions prior to 3.3 do not include it.
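
      A script can fail early with a clearer message by checking the version string before touching cv2.dnn. Here is a minimal sketch, assuming the version string has the usual major.minor.patch form reported by cv2.__version__. The helper name supports_dnn is hypothetical:

```python
import re


def supports_dnn(version):
    """Return True if an OpenCV version string such as "3.3.0" is 3.3 or newer."""
    # Use only the leading major and minor numbers, ignoring suffixes like "-dev".
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (3, 3)
```

      In the detection script this could be called as supports_dnn(cv2.__version__) right after importing cv2.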

  37. Paul January 2, 2018 at 6:25 am #

    Good morning and Happy New Year,
    I am getting this error message:

    [INFO] loading model…
    Traceback (most recent call last):
    File “pi_object_detection.py”, line 56, in <module>
    net = cv2.dnn.readNetFromCaffe(args[“prototxt”], args[“model”])
    AttributeError: ‘module’ object has no attribute ‘dnn’

    Any idea? Thanks in advance!

    • Adrian Rosebrock January 3, 2018 at 1:03 pm #

      Hi Paul — you need to have OpenCV 3.3 installed on your system. Anything older than OpenCV 3.3 does not have the “dnn” module.

  38. Theethat January 8, 2018 at 6:32 am #

    Can I use readNetFromTensorflow(.pb, .prototxt) instead of readNetFromCaffe? Will it detect the same objects as the Caffe model or not?

    • Adrian Rosebrock January 8, 2018 at 2:36 pm #

      The readNetFromTensorflow function still doesn’t work in all cases. Whether or not it detects the same objects as the Caffe example in this blog post depends on what your TensorFlow network was trained on.

  39. Theethat January 9, 2018 at 1:16 pm #

    Thanks so much, Adrian. I have a weight.pb file from training on my own dataset in TensorFlow, but I don’t know how to train on a dataset with the Caffe SSD_Mobilenet model. Do you plan to post a tutorial on training with your own image dataset in Caffe?

    • Adrian Rosebrock January 10, 2018 at 12:53 pm #

      You can try importing your TensorFlow weights via cv2.dnn.readNetFromTensorflow but it’s very likely that it will fail. The TensorFlow import methods are not (currently) as reliable and robust as the Caffe ones. As for training your own custom object detectors, take a look at Deep Learning for Computer Vision with Python where I cover how to train your own custom deep learning-based object detectors in detail.

  40. hardik shekhawat January 18, 2018 at 2:58 am #

    Hello Adrian,
    I have been following your tutorial and I really like it. I want to detect a street pole with a sign on it. Can you tell me how to edit the above code accordingly? Thank you.

    • Adrian Rosebrock January 18, 2018 at 8:50 am #

      The model provided with this example was trained on the COCO dataset. You would need to train your own custom model to recognize street poles and signs. To start, use this tutorial to gather example images of what you’re trying to detect. I then demonstrate how to train your own custom object detectors inside my book, Deep Learning for Computer Vision with Python — recognizing street signs is even one of the included tutorials!

  41. huiyan January 18, 2018 at 5:19 am #

    Hi Adrian. I’m a Chinese boy learning computer science. I have read a lot of your posts. I don’t know why my Raspberry Pi runs the code in this post so slowly (about 3-4 s per frame). Does the video quality affect the speed? And can I transmit the video frames from the Pi to my PC for computation and send the results back to the Pi?
    Thanks a lot.

    • Adrian Rosebrock January 18, 2018 at 8:43 am #

      As I discuss in this blog post the Raspberry Pi can be very slow for deep learning. If you are using a deeper, larger model the inference time will also be larger.

      As for transmitting a frame from your Pi to a PC you could use SFTP, the Dropbox API, or even dedicated message passing libraries such as ZeroMQ or RabbitMQ.
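
      Whichever transport you pick, the basic pattern is the same: serialize the frame to bytes on the Pi, send it as one message, and rebuild it on the PC. The sketch below illustrates that pattern with plain standard-library sockets and a 4-byte length prefix rather than ZeroMQ or RabbitMQ, which handle message framing for you. The helper names send_message and recv_message are hypothetical, and the payload is assumed to be an already-encoded frame (for example JPEG bytes):

```python
import struct


def send_message(sock, payload):
    """Send one length-prefixed message (4-byte big-endian size, then the bytes)."""
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def _recv_exact(sock, count):
    """Read exactly `count` bytes, since recv() may return fewer than requested."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Receive one message written by send_message()."""
    (size,) = struct.unpack(">I", _recv_exact(sock, 4))
    return _recv_exact(sock, size)
```

      The PC would run the detector on each received frame and could send the results back over the same connection using the same two helpers.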

  42. Rituparna Das January 19, 2018 at 11:39 pm #

    I’m getting an error like “module object has no attribute dnn”
    on the line net=cv2.dnn…

    • Rituparna Das January 20, 2018 at 12:25 am #

      And I have OpenCV version 3.3.1.

      • Adrian Rosebrock January 20, 2018 at 8:07 am #

        That is indeed strange. Can you fire up a Python shell and verify your OpenCV version just to be safe?

        Additionally, are you using a Python virtual environment? Do you have any other OpenCV versions installed on your Raspberry Pi? And how did you install OpenCV on your Raspberry Pi — did you use one of my install tutorials on PyImageSearch?
