OpenCV People Counter

In this tutorial you will learn how to build a “people counter” with OpenCV and Python. Using OpenCV, we’ll count the number of people who are heading “in” or “out” of a department store in real-time.

Building a person counter with OpenCV has been one of the most-requested topics here on the PyImageSearch blog, and I’ve been meaning to do a post on people counting for a year now. I’m incredibly thrilled to finally publish it and share it with you today.

Enjoy the tutorial and let me know what you think in the comments section at the bottom of the post!

To get started building a people counter with OpenCV, just keep reading!

Looking for the source code to this post?
Jump right to the downloads section.

OpenCV People Counter with Python

In the first part of today’s blog post, we’ll be discussing the required Python packages you’ll need to build our people counter.

From there I’ll provide a brief discussion on the difference between object detection and object tracking, along with how we can leverage both to create a more accurate people counter.

Afterwards, we’ll review the directory structure for the project and then implement the entire person counting project.

Finally, we’ll examine the results of applying people counting with OpenCV to actual videos.

Required Python libraries for people counting

In order to build our people counting application, we’ll need a number of different Python libraries, including NumPy, OpenCV, dlib, and imutils.

Additionally, you’ll want to use the “Downloads” section of this blog post to download my source code, which includes:

  1. My special pyimagesearch  module which we’ll implement and use later in this post
  2. The Python driver script used to start the people counter
  3. All example videos used here in the post

I’m going to assume you already have NumPy, OpenCV, and dlib installed on your system.

If you don’t have OpenCV installed, you’ll want to head to my OpenCV install page and follow the relevant tutorial for your particular operating system.

If you need to install dlib, you can use this guide.

Finally, you can install/upgrade your imutils via the following command:
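That command is the usual pip upgrade:

```shell
pip install --upgrade imutils
```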

Understanding object detection vs. object tracking

There is a fundamental difference between object detection and object tracking that you must understand before we proceed with the rest of this tutorial.

When we apply object detection we are determining where in an image/frame an object is. An object detector is also typically more computationally expensive, and therefore slower, than an object tracking algorithm. Examples of object detection algorithms include Haar cascades, HOG + Linear SVM, and deep learning-based object detectors such as Faster R-CNNs, YOLO, and Single Shot Detectors (SSDs).

An object tracker, on the other hand, will accept the input (x, y)-coordinates of where an object is in an image and will:

  1. Assign a unique ID to that particular object
  2. Track the object as it moves around a video stream, predicting the new object location in the next frame based on various attributes of the frame (gradient, optical flow, etc.)

Examples of object tracking algorithms include MedianFlow, MOSSE, GOTURN, kernelized correlation filters, and discriminative correlation filters, to name a few.

If you’re interested in learning more about the object tracking algorithms built into OpenCV, be sure to refer to this blog post.

Combining both object detection and object tracking

Highly accurate object trackers will combine the concept of object detection and object tracking into a single algorithm, typically divided into two phases:

  • Phase 1 — Detecting: During the detection phase we are running our computationally more expensive object detector to (1) detect if new objects have entered our view, and (2) see if we can find objects that were “lost” during the tracking phase. For each detected object we create or update an object tracker with the new bounding box coordinates. Since our object detector is more computationally expensive we only run this phase once every N frames.
  • Phase 2 — Tracking: When we are not in the “detecting” phase we are in the “tracking” phase. For each of our detected objects, we create an object tracker to track the object as it moves around the frame. Our object tracker should be faster and more efficient than the object detector. We’ll continue tracking until we’ve reached the N-th frame and then re-run our object detector. The entire process then repeats.

The benefit of this hybrid approach is that we can apply highly accurate object detection methods without as much of the computational burden. We will be implementing such a tracking system to build our people counter.
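The two-phase schedule can be sketched in a few lines of Python. Here, `detect_objects` and `update_trackers` are hypothetical stand-ins for the SSD detector and dlib correlation trackers we implement later:

```python
# Sketch of the two-phase detect/track schedule described above.
SKIP_FRAMES = 30  # run the expensive detector once every N frames

def process_stream(frames, detect_objects, update_trackers):
    detections_run = 0
    for total_frames, frame in enumerate(frames):
        if total_frames % SKIP_FRAMES == 0:
            # Phase 1 -- Detecting: expensive, runs only on every N-th frame
            detect_objects(frame)
            detections_run += 1
        else:
            # Phase 2 -- Tracking: cheap, runs on all frames in between
            update_trackers(frame)
    return detections_run
```

Over a 90-frame clip, for example, the detector fires only on frames 0, 30, and 60; the trackers handle the other 87 frames.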

Project structure

Let’s review the project structure for today’s blog post. Once you’ve grabbed the code from the “Downloads” section, you can inspect the directory structure with the tree  command:

Zeroing in on the most-important two directories, we have:

  1. pyimagesearch/ : This module contains the centroid tracking algorithm. The centroid tracking algorithm is covered in the “Combining object tracking algorithms” section below, but the code itself is not reviewed here. For a review of the centroid tracking code you should refer to the first post in the series.
  2. mobilenet_ssd/ : Contains the Caffe deep learning model files. We’ll be using a MobileNet Single Shot Detector (SSD) which is covered at the top of this blog post in the section, “Single Shot Detectors for object detection”.

The heart of today’s project is contained within the driver script — that’s where we’ll spend most of our time. We’ll also review the trackable object class script today.

Combining object tracking algorithms

Figure 1: An animation demonstrating the steps in the centroid tracking algorithm.

To implement our people counter we’ll be using both OpenCV and dlib. We’ll use OpenCV for standard computer vision/image processing functions, along with the deep learning object detector for people counting.

We’ll then use dlib for its implementation of correlation filters. We could use OpenCV here as well; however, the dlib object tracking implementation was a bit easier to work with for this project.

I’ll be including a deep dive into dlib’s object tracking algorithm in next week’s post.

Along with dlib’s object tracking implementation, we’ll also be using my implementation of centroid tracking from a few weeks ago. Reviewing the entire centroid tracking algorithm is outside the scope of this blog post, but I’ve included a brief overview below.

At Step #1 we accept a set of bounding boxes and compute their corresponding centroids (i.e., the center of the bounding boxes):

Figure 2: To build a simple object tracking via centroids script with Python, the first step is to accept bounding box coordinates and use them to compute centroids.
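Computing a centroid from a bounding box is a two-line calculation; a quick sketch (assuming boxes in `(startX, startY, endX, endY)` form):

```python
def compute_centroid(box):
    # box is (startX, startY, endX, endY); the centroid is simply
    # the midpoint of the bounding box
    (startX, startY, endX, endY) = box
    cX = int((startX + endX) / 2.0)
    cY = int((startY + endY) / 2.0)
    return (cX, cY)
```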

The bounding boxes themselves can be provided by either:

  1. An object detector (such as HOG + Linear SVM, Faster R- CNN, SSDs, etc.)
  2. Or an object tracker (such as correlation filters)

In the above image you can see that we have two objects to track in this initial iteration of the algorithm.

During Step #2 we compute the Euclidean distance between any new centroids (yellow) and existing centroids (purple):

Figure 3: Three objects are present in this image. We need to compute the Euclidean distance between each pair of original centroids (red) and new centroids (green).

The centroid tracking algorithm makes the assumption that pairs of centroids with minimum Euclidean distance between them must be the same object ID.

In the example image above we have two existing centroids (purple) and three new centroids (yellow), implying that a new object has been detected (since there is one more new centroid vs. old centroid).

The arrows then represent computing the Euclidean distances between all purple centroids and all yellow centroids.
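The distance computation in Step #2 produces an M x N matrix (M existing centroids by N new centroids). An equivalent NumPy sketch:

```python
import numpy as np

def pairwise_distances(existing, new):
    # existing: (M, 2) array of previously tracked centroids
    # new:      (N, 2) array of freshly detected centroids
    # returns an (M, N) matrix of Euclidean distances
    existing = np.asarray(existing, dtype="float")
    new = np.asarray(new, dtype="float")
    diff = existing[:, np.newaxis, :] - new[np.newaxis, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))
```

The tracker then walks this matrix, greedily associating the pairs with the smallest distances first.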

Once we have the Euclidean distances we attempt to associate object IDs in Step #3:

Figure 4: Our simple centroid object tracking method has associated objects with minimized object distances. What do we do about the object in the bottom-left though?

In Figure 4 you can see that our centroid tracker has chosen to associate centroids that minimize their respective Euclidean distances.

But what about the point in the bottom-left?

It didn’t get associated with anything — what do we do?

To answer that question we need to perform Step #4, registering new objects:

Figure 5: In our object tracking example, we have a new object that wasn’t matched with an existing object, so it is registered as object ID #3.

Registering simply means that we are adding the new object to our list of tracked objects by:

  1. Assigning it a new object ID
  2. Storing the centroid of the bounding box coordinates for the new object

In the event that an object has been lost or has left the field of view, we can simply deregister the object (Step #5).

Exactly how you handle when an object is “lost” or is “no longer visible” really depends on your exact application, but for our people counter, we will deregister people IDs when they cannot be matched to any existing person objects for 40 consecutive frames.
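Steps #4 and #5 can be sketched as a minimal registry. This is only an illustration; the real CentroidTracker layers the distance-based matching of Steps #2 and #3 on top of this bookkeeping:

```python
class SimpleRegistry:
    # A minimal sketch of registering/deregistering (Steps #4 and #5).
    MAX_DISAPPEARED = 40  # frames an ID may go unmatched before removal

    def __init__(self):
        self.next_object_id = 0
        self.objects = {}       # objectID -> centroid
        self.disappeared = {}   # objectID -> consecutive unmatched frames

    def register(self, centroid):
        # assign a new object ID and store the centroid
        self.objects[self.next_object_id] = centroid
        self.disappeared[self.next_object_id] = 0
        self.next_object_id += 1

    def deregister(self, object_id):
        # the object is lost or has left the field of view
        del self.objects[object_id]
        del self.disappeared[object_id]

    def mark_unmatched(self, object_id):
        # called each frame the ID cannot be matched to a detection
        self.disappeared[object_id] += 1
        if self.disappeared[object_id] > self.MAX_DISAPPEARED:
            self.deregister(object_id)
```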

Again, this is only a brief overview of the centroid tracking algorithm.

Note: For a more detailed review, including an explanation of the source code used to implement centroid tracking, be sure to refer to this post.

Creating a “trackable object”

In order to track and count an object in a video stream, we need an easy way to store information regarding the object itself, including:

  • Its object ID
  • Its previous centroids (so we can easily compute the direction the object is moving)
  • Whether or not the object has already been counted

To accomplish all of these goals we can define a TrackableObject class — open up the corresponding file and insert the following code:

The TrackableObject  constructor accepts an objectID  + centroid  and stores them. The centroids variable is a list because it will contain an object’s centroid location history.

The constructor also initializes counted  as False , indicating that the object has not been counted yet.
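Based on that description, the class itself is only a few lines; a sketch matching the attributes described above:

```python
class TrackableObject:
    def __init__(self, objectID, centroid):
        # store the object ID, then initialize a list of centroids
        # using the current centroid -- the list will grow into the
        # object's full location history
        self.objectID = objectID
        self.centroids = [centroid]

        # a flag indicating whether the object has been counted yet
        self.counted = False
```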

Implementing our people counter with OpenCV + Python

With all of our supporting Python helper tools and classes in place, we are now ready to build our OpenCV people counter.

Open up the driver script and insert the following code:

We begin by importing our necessary packages:

  • From the pyimagesearch  module, we import our custom CentroidTracker  and TrackableObject  classes.
  • The VideoStream  and FPS  modules from imutils.video  will help us to work with a webcam and to calculate the estimated Frames Per Second (FPS) throughput rate.
  • We need imutils  for its OpenCV convenience functions.
  • The dlib  library will be used for its correlation tracker implementation.
  • OpenCV will be used for deep neural network inference, opening video files, writing video files, and displaying output frames to our screen.

Now that all of the tools are at our fingertips, let’s parse command line arguments:

We have six command line arguments which allow us to pass information to our people counter script from the terminal at runtime:

  • --prototxt : Path to the Caffe “deploy” prototxt file.
  • --model : The path to the Caffe pre-trained CNN model.
  • --input : Optional input video file path. If no path is specified, your webcam will be utilized.
  • --output : Optional output video path. If no path is specified, a video will not be recorded.
  • --confidence : With a default value of 0.4 , this is the minimum probability threshold which helps to filter out weak detections.
  • --skip-frames : The number of frames to skip before running our DNN detector again on the tracked object. Remember, object detection is computationally expensive, but it does help our tracker to reassess objects in the frame. By default we skip 30  frames between detecting objects with the OpenCV DNN module and our CNN single shot detector model.
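Wrapped in a helper for illustration (the real script builds the parser at the top level), the six arguments map to argparse like so:

```python
import argparse

def build_arg_parser():
    # construct the argument parser for the people counter script;
    # the six arguments mirror the bullets described above
    ap = argparse.ArgumentParser()
    ap.add_argument("-p", "--prototxt", required=True,
        help="path to Caffe 'deploy' prototxt file")
    ap.add_argument("-m", "--model", required=True,
        help="path to Caffe pre-trained model")
    ap.add_argument("-i", "--input", type=str,
        help="path to optional input video file")
    ap.add_argument("-o", "--output", type=str,
        help="path to optional output video file")
    ap.add_argument("-c", "--confidence", type=float, default=0.4,
        help="minimum probability to filter weak detections")
    ap.add_argument("-s", "--skip-frames", type=int, default=30,
        help="# of frames to skip between detections")
    return ap
```

Note that argparse turns `--skip-frames` into the key `skip_frames`, which is why the script can read it as `args["skip_frames"]`.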

Now that our script can dynamically handle command line arguments at runtime, let’s prepare our SSD:

First, we’ll initialize CLASSES  — the list of classes that our SSD supports. This list should not be changed if you’re using the model provided in the “Downloads” section. We’re only interested in the “person” class, but you could count other moving objects as well (however, if your “pottedplant”, “sofa”, or “tvmonitor” grows legs and starts moving, you should probably run out of your house screaming rather than worrying about counting them!).

On Line 38 we load our pre-trained MobileNet SSD used to detect objects (but again, we’re just interested in detecting and tracking people, not any other class). To learn more about MobileNet and SSDs, please refer to my previous blog post.
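The class list is the standard 21-class PASCAL VOC set the MobileNet SSD was trained on; a sketch, with the Line 38 model load shown as a comment since it requires the Caffe files from the “Downloads”:

```python
# the list of class labels the MobileNet SSD was trained to detect
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
    "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
    "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
    "sofa", "train", "tvmonitor"]

# the model itself is loaded from disk with OpenCV's dnn module:
# net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])
```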

From there we can initialize our video stream:

First we handle the case where we’re using a webcam video stream (Lines 41-44). Otherwise, we’ll be capturing frames from a video file (Lines 47-49).

We still have a handful of initializations to perform before we begin looping over frames:

The remaining initializations include:

  • writer : Our video writer. We’ll instantiate this object later if we are writing to video.
  • W  and H : Our frame dimensions. We’ll need to plug these into cv2.VideoWriter .
  • ct : Our CentroidTracker . For details on the implementation of CentroidTracker , be sure to refer to my blog post from a few weeks ago.
  • trackers : A list to store the dlib correlation trackers. To learn about dlib correlation tracking stay tuned for next week’s post.
  • trackableObjects : A dictionary which maps an objectID  to a TrackableObject .
  • totalFrames : The total number of frames processed.
  • totalDown  and totalUp : The total number of objects/people that have moved either down or up. These variables measure the actual “people counting” results of the script.
  • fps : Our frames per second estimator for benchmarking.
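These initializations are mostly plain assignments. A sketch follows; the CentroidTracker arguments shown in the comment are assumptions on my part, based on the 40-frame deregistration window described earlier:

```python
# remaining initializations before the frame loop;
# the centroid tracker lives in the pyimagesearch module:
# ct = CentroidTracker(maxDisappeared=40, maxDistance=50)

writer = None            # video writer, instantiated later if needed
W = None                 # frame width  (needed for cv2.VideoWriter)
H = None                 # frame height (needed for cv2.VideoWriter)
trackers = []            # list of dlib correlation trackers
trackableObjects = {}    # maps objectID -> TrackableObject
totalFrames = 0          # total frames processed so far
totalDown = 0            # people counted moving "down"
totalUp = 0              # people counted moving "up"
```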

Note: If you get lost in the while  loop below, you should refer back to this bulleted listing of important variables.

Now that all of our initializations are taken care of, let’s loop over incoming frames:

We begin looping on Line 76. At the top of the loop we grab the next frame  (Lines 79 and 80). In the event that we’ve reached the end of the video, we’ll break  out of the loop (Lines 84 and 85).

Preprocessing the frame  takes place on Lines 90 and 91. This includes resizing and swapping color channels as dlib requires an rgb  image.

We grab the dimensions of the frame  for the video writer  (Lines 94 and 95).

From there we’ll instantiate the video writer  if an output path was provided via command line argument (Lines 99-102). To learn more about writing video to disk, be sure to refer to this post.

Now let’s detect people using the SSD:

We initialize a status  as “Waiting” on Line 107. Possible status  states include:

  • Waiting: In this state, we’re waiting on people to be detected and tracked.
  • Detecting: We’re actively in the process of detecting people using the MobileNet SSD.
  • Tracking: People are being tracked in the frame and we’re counting the totalUp  and totalDown .

Our rects  list will be populated either via detection or tracking. We go ahead and initialize rects  on Line 108.

It’s important to understand that deep learning object detectors are very computationally expensive, especially if you are running them on your CPU.

To avoid running our object detector on every frame, and to speed up our tracking pipeline, we’ll only run the detector once every N frames (set by the command line argument --skip-frames , where 30  is the default). In between, we’ll simply track the moving objects.

Using the modulo operator on Line 112 we ensure that we’ll only execute the code in the if-statement every N frames.

Assuming we’ve landed on a multiple of skip_frames , we’ll update the status  to “Detecting” (Line 114).

Then we initialize our new list of trackers  (Line 115).

Next, we’ll perform inference via object detection. We begin by creating a blob  from the image, followed by passing the blob  through the net to obtain detections  (Lines 119-121).

Now we’ll loop over each of the detections  in hopes of finding objects belonging to the “person” class:

Looping over detections  on Line 124, we proceed to grab the confidence  (Line 127) and filter out weak results + those that don’t belong to the “person” class (Lines 131-138).

Now we can compute a bounding box for each person and begin correlation tracking:

Computing our bounding box  takes place on Lines 142 and 143.

Then we instantiate our dlib correlation tracker  on Line 148, followed by passing in the object’s bounding box coordinates to dlib.rectangle , storing the result as rect  (Line 149).

Subsequently, we start tracking on Line 150 and append the tracker  to the trackers  list on Line 154.

That’s a wrap for all operations we do every N skip-frames!

Let’s take care of the typical operations where tracking is taking place in the else  block:

Most of the time, we aren’t landing on a skip-frame multiple. During this time, we’ll utilize our trackers  to track our object rather than applying detection.

We begin looping over the available trackers  on Line 160.

We proceed to update the status  to “Tracking” (Line 163) and grab the object position (Lines 166 and 167).

From there we extract the position coordinates (Lines 170-173) followed by populating the information in our rects  list.

Now let’s draw a horizontal visualization line (that people must cross in order to be tracked) and use the centroid tracker to update our object centroids:

On Line 181 we draw the horizontal line which we’ll be using to visualize people “crossing” — once people cross this line we’ll increment our respective counters.

Then on Line 185, we utilize our CentroidTracker  instantiation to accept the list of rects , regardless of whether they were generated via object detection or object tracking. Our centroid tracker will associate object IDs with object locations.

In this next block, we’ll review the logic which counts if a person has moved up or down through the frame:

We begin by looping over the updated bounding box coordinates of the object IDs (Line 188).

On Line 191 we attempt to fetch a TrackableObject  for the current objectID .

If the TrackableObject  doesn’t exist for the objectID , we create one (Lines 194 and 195).

Otherwise, there is already an existing TrackableObject , so we need to figure out if the object (person) is moving up or down.

To do so, we grab the y-coordinate value for all previous centroid locations for the given object (Line 204). Then we compute the direction  by taking the difference between the current centroid location and the mean of all previous centroid locations (Line 205).

The reason we take the mean is to ensure our direction tracking is more stable. If we stored just the previous centroid location for the person we leave ourselves open to the possibility of false direction counting. Keep in mind that object detection and object tracking algorithms are not “magic” — sometimes they will predict bounding boxes that may be slightly off what you may expect; therefore, by taking the mean, we can make our people counter more accurate.

If the TrackableObject  has not been counted  (Line 209), we need to determine if it’s ready to be counted yet (Lines 213-222), by:

  1. Checking if the direction  is negative (indicating the object is moving Up) AND the centroid is Above the centerline. In this case we increment totalUp .
  2. Or checking if the direction  is positive (indicating the object is moving Down) AND the centroid is Below the centerline. If this is true, we increment totalDown .

Finally, we store the TrackableObject  in our trackableObjects  dictionary (Line 225) so we can grab and update it when the next frame is captured.
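The up/down logic above can be sketched as a pure-Python helper (`update_count` is a hypothetical name; in the actual script this logic lives inline in the frame loop):

```python
def update_count(to, centroid, H, totalUp, totalDown):
    # direction: the current y-coordinate minus the mean of all
    # previous y-coordinates; negative means the object is moving
    # up the frame, positive means it is moving down
    y = [c[1] for c in to.centroids]
    direction = centroid[1] - sum(y) / len(y)
    to.centroids.append(centroid)

    if not to.counted:
        # moving up AND above the center line -> count as "up"
        if direction < 0 and centroid[1] < H // 2:
            totalUp += 1
            to.counted = True
        # moving down AND below the center line -> count as "down"
        elif direction > 0 and centroid[1] > H // 2:
            totalDown += 1
            to.counted = True
    return totalUp, totalDown
```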

We’re on the home-stretch!

The next three code blocks handle:

  1. Display (drawing and writing text to the frame)
  2. Writing frames to a video file on disk (if the --output  command line argument is present)
  3. Capturing keypresses
  4. Cleanup

First we’ll draw some information on the frame for visualization:

Here we overlay the following data on the frame:

  • ObjectID : Each object’s numerical identifier.
  • centroid  : The center of the object will be represented by a “dot” which is created by filling in a circle.
  • info  : Includes totalUp , totalDown , and status

For a review of drawing operations, be sure to refer to this blog post.

Then we’ll write the frame  to a video file (if necessary) and handle keypresses:

In this block we:

  • Write the  frame , if necessary, to the output video file (Lines 249 and 250)
  • Display the frame  and handle keypresses (Lines 253-258). If “q” is pressed, we break  out of the frame processing loop.
  • Update our fps  counter (Line 263)

We didn’t make too much of a mess, but now it’s time to clean up:

To finish out the script, we display the FPS info to the terminal, release all pointers, and close any open windows.

Just 283 lines of code later, we are now done!

People counting results

To see our OpenCV people counter in action, make sure you use the “Downloads” section of this blog post to download the source code and example videos.

From there, open up a terminal and execute the following command:
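The exact file names ship with the “Downloads”. Assuming the driver script and model files follow the naming used there (`people_counter.py` and the `mobilenet_ssd/` paths below are assumptions on my part), the invocation looks like:

```shell
python people_counter.py \
	--prototxt mobilenet_ssd/MobileNetSSD_deploy.prototxt \
	--model mobilenet_ssd/MobileNetSSD_deploy.caffemodel \
	--input videos/example_01.mp4 \
	--output output/output_01.avi
```

Omit `--input` to use your webcam instead, and omit `--output` if you don’t need the annotated video written to disk.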

Here you can see that our person counter is counting the number of people who:

  1. Are entering the department store (down)
  2. And the number of people who are leaving (up)

At the end of the first video you’ll see there have been 7 people who entered and 3 people who have left.

Furthermore, examining the terminal output you’ll see that our person counter is capable of running in real-time, obtaining 34 FPS throughput. This is despite the fact that we are using a deep learning object detector for more accurate person detections.

Our 34 FPS throughput rate is made possible by our two-phase process of:

  1. Detecting people once every 30 frames
  2. And then applying a faster, more efficient object tracking algorithm in all frames in between.

Another example of people counting with OpenCV can be seen below:

I’ve included a short GIF below to give you an idea of how the algorithm works:

Figure 7: An example of an OpenCV people counter in action.

A full video of the demo can be seen below:

This time there have been 2 people who have entered the department store and 14 people who have left.

You can see how useful this system would be to a store owner interested in foot traffic analytics.

The same type of system for counting foot traffic with OpenCV can be used to count automobile traffic with OpenCV and I hope to cover that topic in a future blog post.

Additionally, a big thank you to David McDuffee for recording the example videos used here today! David works here with me at PyImageSearch and if you’ve ever emailed PyImageSearch before, you have very likely interacted with him. Thank you for making this post possible, David! Also a thank you to BenSound for providing the music for the video demos included in this post.

What are the next steps?

Congratulations on building your person counter with OpenCV!

If you’re interested in learning more about OpenCV, including how to build other real-world applications such as face detection and object recognition, I would suggest reading through my book, Practical Python and OpenCV + Case Studies.

Practical Python and OpenCV is meant to be a gentle introduction to the world of computer vision and image processing. This book is perfect if you:

  • Are new to the world of computer vision and image processing
  • Have some past image processing experience but are new to Python
  • Are looking for some great example projects to get your feet wet


Learn OpenCV fundamentals in a single weekend!

If you’re looking for a more detailed dive into computer vision, I would recommend working through the PyImageSearch Gurus course. The PyImageSearch Gurus course is similar to a college survey course and many students report that they learn more than a typical university class.

Inside you’ll find over 168 lessons, starting with the fundamentals of computer vision, all the way up to more advanced topics, including:

  • Face recognition
  • Automatic license plate recognition
  • Training your own custom object detectors
  • …and much more!

You’ll also find a thriving community of like-minded individuals who are itching to learn about computer vision. Each day in the community forums we discuss:

  • Your burning questions about computer vision
  • New project ideas and resources
  • Kaggle and other competitions
  • Development environment and code issues
  • …among many other topics!

Master computer vision inside PyImageSearch Gurus!


Summary

In today’s blog post we learned how to build a people counter using OpenCV and Python.

Our implementation is:

  • Capable of running in real-time on a standard CPU
  • Utilizes deep learning object detectors for improved person detection accuracy
  • Leverages two separate object tracking algorithms, including both centroid tracking and correlation filters for improved tracking accuracy
  • Applies both a “detection” and “tracking” phase, making it capable of (1) detecting new people and (2) picking up people that may have been “lost” during the tracking phase

I hope you enjoyed today’s post on people counting with OpenCV!

To download the code to this blog post (and apply people counting to your own projects), just enter your email address in the form below!


If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL! Sound good? If so, enter your email address and I’ll send you the code immediately!


496 Responses to OpenCV People Counter

  1. Jay August 13, 2018 at 10:53 am #

    Hi Adrian! The tutorial is really great and it’s very helpful to me. However, I was wondering, can this kind of people counting be implemented on a Raspberry Pi 3?

    • Adrian Rosebrock August 13, 2018 at 10:58 am #

      If you want to use just the Raspberry Pi you need to use a more efficient object detection routine. Possible methods may include:

      1. Background subtraction, such as the method used in this post.
      2. Haar cascades (which are less accurate, but faster than DL-based object detectors)
      3. Leveraging something like the Movidius NCS to help you reach a faster FPS throughput

      Additionally, for your object tracking you may want to look into using MOSSE (covered in this post) which is faster than correlation filters. Another option could be to explore using Kalman filters.

      I hope that helps!

      • 蘇鉉 August 14, 2018 at 1:21 am #

        Thank you so much! Another question: is it possible to combine this people counting algorithm with the method you posted before, which talked about Raspberry Pi: Deep learning object detection with OpenCV?

        • Adrian Rosebrock August 15, 2018 at 8:37 am #

          Yes, you can, but keep in mind that the FPS throughput rate is going to be very, very low since you’re trying to apply deep learning object detection on the Pi.

      • Wang August 21, 2018 at 6:15 pm #

        Adrian, to get better performance with raspberry pi3, do you need to use all of these methods? Or just a few? For example, you can join background subtraction with Haar Cascade?

        Thank you very much!

        • Adrian Rosebrock August 22, 2018 at 9:25 am #

          You can join background subtraction in with a Haar cascade and then only apply the Haar cascade to the ROI regions. But realistically Haar cascades are pretty fast anyway so that may be overkill.

      • Lafleur August 22, 2018 at 3:02 am #

        Thank you so much for your work and for sharing it. It’s great.
        Could you detail a bit more what we are supposed to do to use the software on a Raspberry Pi? I’m not very used to it so I don’t understand everything you wrote.

        • Adrian Rosebrock August 22, 2018 at 9:22 am #

          I’ll likely have to write another blog post on Raspberry Pi people counting — it’s too much to detail in a blog post comment.

          • Lafleur August 22, 2018 at 10:02 am #

            Seems logic…
            Could you give me the URL of a trusted blog where you use to go on which I will be able to find informations ?
            I’ve tried the software “Footfall” but it doesn’t work.
            And many blogs are just outdated concerning this subject.
            Thank you for all 🙂

          • Adrian Rosebrock August 22, 2018 at 10:17 am #

            I don’t know of one, which is why I would have to do one here on PyImageSearch 😉

          • Ben Bartling August 22, 2018 at 5:00 pm #

            Looking forward to the Rasberry Pi people counting!

          • AJ August 30, 2018 at 12:14 am #

            Hi, firstly, thank you for your blog it’s so awesome! Im wondering when that Raspberry Pi counter will be posted? Also can it be made into vehicles? Thank you!

          • Adrian Rosebrock August 30, 2018 at 8:54 am #

            Yes, you can do the same for vehicles, just swap out the “person” class for any other objects you want to detect and track. I’m honestly not sure when the Pi people counter will be posted. I have a number of other blog posts and projects I’m working on though. I hope it will be soon but I cannot guarantee it.

          • Cheeriocheng September 11, 2018 at 3:40 am #

            yes please!

          • Hamza Awais February 18, 2019 at 11:08 am #

            Hey Adrian!

            So did you write anything related to Pi and Counting algorithms?

          • Adrian Rosebrock February 20, 2019 at 12:28 pm #

            I will be covering it in my upcoming Computer Vision and Raspberry Pi book! Make sure you’re on the PyImageSearch email list to be notified when the book goes live in a couple of months.

      • Andres September 12, 2018 at 3:29 pm #

        Hi Adrian. I have a question about Kalman filters. I wanna implement people counter on a Raspberry PI3B and I use background substraction for detection and FindCountours to enclosing in a rectangle the person position and for tracking I need to implement MOSSE o Kalman filter but here is my question. How can I track a person with those algorithms? Because each of those algorithm need to receive the position of the object but I’m detect multiple object so it will be an issue to send the correct coordinate for each object that I need to track

      • clarence chhoa November 11, 2018 at 8:16 pm #

        can this code deals with live streaming?

      • michal February 10, 2019 at 2:15 am #

        Hey Adrian, awesome post. Thank you for sharing and detailing the steps. Is there a raspberry Pi post in the near future? Would love to see your approach. Thanks again, gonna check out your other stuff.

        • Adrian Rosebrock February 14, 2019 at 1:47 pm #

          I’ll actually be covering it in my upcoming Computer Vision + Raspberry Pi book 🙂 Stay tuned, I’ll be announcing it soon.

          • michael h March 21, 2019 at 7:19 am #

            I can hardly wait for the book. Is there a model that will reliably detect people walking in profile (passing by a camera pointed at the sidewalk)? I haven’t found the haar do this well. The Caffee you have does it well but as you mention it won’t run well on a Pi.

            Is there a haar that will detect profile or a low-cost hardware that will run the Caffee?

            Again – looking forward to the book!

          • Adrian Rosebrock March 22, 2019 at 8:41 am #

            I’ll actually be showing you how to use deep learning-based object detectors on the Pi! They will be fast enough to run in real-time and be more accurate than Haar detectors.

  2. issaiass August 13, 2018 at 11:08 am #

    Great! Awesome job as always. I was trying to improve my tracking part. This is a good reference point for my application.

    Thankyou Adrian!

  3. Sukhmeet SIngh August 13, 2018 at 11:19 am #

    Hi Adrian,
    This is by far my Favorite blog post from you.
    I was wondering if you could also do a blog/tutorial on people counting in an image and show the gender of the people. That would make up for a really interesting blog and tutorial.

  4. rvjenya August 13, 2018 at 11:36 am #

    I really liked your blog lesson — thanks so much. I’m going to convert the Caffe model to run on the Movidius NCS and take it to my friend’s store. He is going to count people and recognize age, gender, and maybe emotion. I really like your blog and plan to buy your book. Thanks for the motivation and good practice.

    • Adrian Rosebrock August 13, 2018 at 12:46 pm #

      Thank you for the kind words, I’m happy you liked the post. I wish the best of luck to you and your friend implementing your own person counter for their store!

  5. anirban August 13, 2018 at 12:04 pm #

    Sir ,

    Great Work. Thanks for Sharing.

    • Adrian Rosebrock August 13, 2018 at 12:46 pm #

      Thanks Anirban!

  6. Anand Simmy August 13, 2018 at 12:42 pm #

    Hi Adrian, is there any specific reason to use the dlib correlation tracker instead of OpenCV’s 8 built-in trackers? Will any of those trackers be more precise than the dlib tracker?

    • Adrian Rosebrock August 13, 2018 at 12:46 pm #

      To quote the blog post:

      “We’ll then use dlib for its implementation of correlation filters. We could use OpenCV here as well; however, the dlib object tracking implementation was a bit easier to work with for this project.”

      OpenCV’s CSRT tracker may be more accurate but it will be slower. Similarly, OpenCV’s MOSSE tracker will be faster but potentially less accurate.

  7. Bilal August 13, 2018 at 2:08 pm #

    Loved your post, and with that level of explanation, hats off to you, Sir! I was wondering: what if we have to implement it on multiple cameras, or we have a separate door/camera for the entrance and exit? I would like to have your words on these too. Thanks in advance.

    • Adrian Rosebrock August 13, 2018 at 2:20 pm #

      This tutorial assumes you’re using a single camera. If you’re using multiple cameras it becomes more challenging. If the viewpoint changes then your object tracker won’t be able to associate the objects. Instead, you might want to look into face recognition or even gate recognition, enabling you to associate a person with a unique ID based on more than just appearance alone.

      • Bilal August 13, 2018 at 4:55 pm #

        Yes, the viewpoint does change, as the cameras will be placed in several different locations. We would like to tag each person with a face ID and recognize them across all the cameras using face recognition and that ID. Thank you once again.

        • Adrian Rosebrock August 15, 2018 at 8:45 am #

          Yeah, if the viewpoints are changing you’ll certainly want to explore face recognition and gait recognition instead.

      • Prashant April 5, 2019 at 3:17 am #

        @Adrian Thanks for the post. I would love to see a blog post on gait recognition — I have been doing some research on this recently. I tried a gait recognition paper that uses the CASIA-B dataset, but extracting silhouettes seems to be a difficult task. I am a little off topic, but if you read this, I would love to know your views.

        • Adrian Rosebrock April 12, 2019 at 1:03 pm #

          Thanks for the suggestion. I will consider covering it but I cannot guarantee if/when that may be.

  8. Michael Gerasimov August 13, 2018 at 2:12 pm #

    I liked the article very much. In new shopping centers you could put cameras on all the entrances and collect the information on a computer to confirm that all the people came out and no one is hiding inside.

  9. Dakhoo August 13, 2018 at 2:36 pm #

    Thanks for sharing this tutorial — last week I was trying to do something similar. Do you think you could make a comment/answer on ?

    • Jaca September 4, 2018 at 9:25 am #

      You can try to “place” a blank region on already detected car. Since the tracking method gives you location of the object in every frame, you could just move the blank region accordingly. Then you can use it to prevent Haar cascade from finding a car there. If you’re worried about overlapping cars, I suggest you adjust the size of blank region.

  10. Krishna August 13, 2018 at 2:52 pm #

    Does this algorithm work well with Raspberry Pi based projects? If not, could you suggest an effective algorithm for detecting human presence, sir? I have tried the Haar cascade method, but it was not satisfactory.

    Thank you sir, I am awaiting your reply

    • Adrian Rosebrock August 15, 2018 at 8:45 am #

      Make sure you’re reading the comments. I’ve already answered your question in my reply to Jay — it’s actually the very first comment on this post.

  11. ando August 13, 2018 at 3:21 pm #

    Thanks. Good job. How can I improve the code to detect people very close?

    • Adrian Rosebrock August 15, 2018 at 8:44 am #

      Hey Ando — can you clarify what you mean by “very close”? Do you have any examples?

  12. Jeff August 13, 2018 at 5:56 pm #

    Thank you very much for these tutorials. I am new to this and I seem to be having issues getting webcam video. Can you provide a short test script to open the video stream from the Pi camera using imutils?

    • Adrian Rosebrock August 15, 2018 at 8:43 am #

      Just set:

      vs = VideoStream(usePiCamera=True).start()

      • clarence chhoa hua sheng November 19, 2018 at 1:52 am #

        How do I link your VideoStream class here, and how do I run it?
        Is the VideoStream class created in the same file or in another Python file?

        • Adrian Rosebrock November 19, 2018 at 12:26 pm #

          You first need to install the “imutils” library:

          $ pip install imutils

          From there you can import it into your Python scripts via:

          from imutils.video import VideoStream

  13. Rohit August 14, 2018 at 12:25 am #

    Thanks for the wonderful explanation. It is always a pleasure to read your posts. I ran your people-counting tracker but got some random object IDs during detection. For me, on the 2nd example video there were 20 people going up and 2 people coming down. What do you recommend to remove these ambiguities?

    • Adrian Rosebrock August 15, 2018 at 8:39 am #

      Hey Rohit — that is indeed strange behavior. What version of OpenCV and dlib are you using?

      • kaisar khatak August 18, 2018 at 9:59 pm #

        Running the downloaded scripts with the default parameter values on the same input videos, I was UNABLE to match the sample output videos. I ran into the same issue as Rohit.

        I played around with the confidence values and still could NOT match the results. The code is missing some detections and appears to be overstating (false-positive detections?) others. Any ideas?

        Nvidia TX1
        OpenCV 3.3
        Python 3.5 (virtual environment)

        The videos can be viewed on my google drive:

        Video 1: (My Result = Up 3, Down 8) [Actual (ground truth) Up 3 Down 7]

        Video 2: (My Result = Up 20, Down 2) [Actual (ground truth) Up 14 Down 3]

        • Adrian Rosebrock August 22, 2018 at 10:09 am #

          Upgrade from OpenCV 3.3 to OpenCV 3.4 or better and it will work for you 🙂 (I believe you also found that in the other comments, but I wanted to make sure)

      • kaisar khatak August 19, 2018 at 6:05 pm #

        Comment Updated (4/19): I encountered the same issue using OpenCV 3.3, but after I upgraded to OpenCV 3.4.1, my results now match the video on this blog post. I recommend upgrading to OpenCV 3.4 for anyone encountering similar detection/tracking behavior…

    • kaisar khatak August 19, 2018 at 6:04 pm #

      Rohit – I encountered the same issue using OpenCV 3.3, but after I upgraded to OpenCV 3.4.1, my results now match the video on this blog post. I recommend upgrading to OpenCV 3.4…

  14. Sourabh Mane August 14, 2018 at 1:08 am #

    Hi Adrian,

    Thanks for the great post!!!!. I have few questions..

    1. Will this people counter work in crowded places like airports or railway stations? Will it give an accurate count?

    2. Can we use it for mass (crowd) counting? Does it consider pets and babies?

    • Adrian Rosebrock August 15, 2018 at 8:38 am #

      1. Provided you can detect the full body and there isn’t too much occlusion, yes, the method will work. If you cannot detect the full body you might want to switch to face detection instead.

      2. See my note above. You should also read the code to see how we filter out non-people detection 🙂

      • kaisar khatak August 19, 2018 at 10:08 pm #

        I have come across some app developers using what look to be custom-trained head detection models. Sometimes the back of the head can be seen, other times the frontal view. I think the “head count” approach makes sense, since that is how humans think when taking class attendance, for example. Is head counting a better method for people counting? Is this even possible, and will the method be accurate for the backs of heads?

        Examples: (VION VISION)

        • Adrian Rosebrock August 22, 2018 at 10:04 am #

          I’m reluctant to say which is “better” as that’s entirely dependent on your application and what exactly you’re trying to do. You could argue that in dense areas a “head detector” would be better than a “body detector” since the full body might not be visible. But on the other hand, having a full body detected can reduce false positives as well. Again, it’s dependent on your application.

  15. Anthony The Koala August 14, 2018 at 3:15 am #

    Dear Dr Adrian,
    I need a clarification please on object detection. How does the object detector distinguish between human and non-human objects.
    Thank you,
    Anthony of exciting Sydney

    • Adrian Rosebrock August 15, 2018 at 8:34 am #

      The object detector has been pre-trained to recognize a variety of classes. Line 137 also filters out non-person detections.
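      A minimal sketch of that filtering idea (the class list and detection tuple layout here are illustrative, not the post's exact code): keep only detections that are both confident enough and labeled "person".

```python
# Illustrative class list -- a real SSD model ships with its own label map.
CLASSES = ["background", "bicycle", "car", "cat", "dog", "person"]

def filter_person_detections(detections, min_confidence=0.4):
    """detections: list of (class_index, confidence, box) tuples.
    Returns only the bounding boxes of confident 'person' detections."""
    people = []
    for (idx, confidence, box) in detections:
        # drop weak detections first
        if confidence < min_confidence:
            continue
        # drop anything that is not a person
        if CLASSES[idx] != "person":
            continue
        people.append(box)
    return people
```

The same two-step gate (confidence threshold, then class check) is what distinguishes a human from, say, a passing car in the detector's output.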

  16. qiang92 August 14, 2018 at 4:23 am #

    Thanks for your sharing.

  17. David August 14, 2018 at 6:38 am #

    Hi Adrian,

    For the detection part, I wanted to try another network, so I went for the TensorFlow version of ssd_mobilenet_v2_coco_2018_03_29 (see here: and here: ).

    The problem is I had too many detection boxes, so I used an NMS function to help sort things out, but even after that I had too many results, even with confidence at 0.3 and an NMS threshold of 0.2 — see an example here: (network detection boxes are in red, NMS output boxes are in green).
    Do you know why I got so many results? Is it because I used a TensorFlow model instead of Caffe? Or is it because the network was trained with other parameters? Did something change in SSD MobileNet v2 compared to chuanqi305’s SSD MobileNet?


    • Adrian Rosebrock August 15, 2018 at 8:25 am #

      Hey David — I haven’t tested the TensorFlow model you are referring to so I’m honestly not sure why it would be throwing so many false positives like that. Try to increase your minimum confidence threshold to see if that helps resolve the issue.

  18. Christian August 14, 2018 at 11:22 am #

    Thanks Adrian, great work!!!

    please can you tell us what version of Python and OpenCV you used ????

    Do you think this code can works with a raspberry PI 3 with streaming from an IP camera?

    • Adrian Rosebrock August 15, 2018 at 8:23 am #

      I used OpenCV 3.4 for this example. As for using the Raspberry Pi, make sure you read my reply to Jay.

  19. Roald August 15, 2018 at 5:40 am #

    Hi Adrian,

    You write “we utilize our CentroidTracker instantiation to accept the list of rects , regardless of whether they were generated via object detection or object tracking” however as far as I can see, in the Object Detection fase, you don’t actually seem to populate the rects[] variable? I’ve downloaded the source as well, couldn’t find it there either.
    Am I missing something?

    Very valuable post throughout, looks a lot like what I am trying to achieve for my cat tracker (which you may recall from earlier correspondence).

    • Adrian Rosebrock August 15, 2018 at 8:15 am #

      Hey Roald — we don’t actually have to populate the list during the object detection phase. We simply create the tracker and then allow the tracker to update “rects” during the tracking phase. Perhaps that point was not clear.
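      In miniature, the flow looks like this (tracker objects are stubs here — the post uses dlib correlation trackers, and the real method names differ):

```python
class StubTracker:
    """Stand-in for a real correlation tracker: it just remembers a box."""
    def __init__(self, box):
        self.box = box

    def get_position(self):
        return self.box

def update_rects(trackers):
    """Tracking phase: ask every active tracker for its current box.
    The resulting list of rectangles is what the centroid tracker consumes,
    so `rects` never needs to be filled during the detection phase itself."""
    rects = []
    for tracker in trackers:
        rects.append(tracker.get_position())
    return rects
```

The detection phase only *creates* trackers; every frame thereafter the trackers report positions into `rects`, which is why no explicit population appears in the detection branch.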

  20. kumar August 15, 2018 at 8:26 am #

    Great article. I have a doubt, though — it could potentially be a noob question, so please bear with me.
    Say I use this in my shop for tracking foot count; all the new objects are stored in a dictionary, right? If I leave the code running perpetually, won’t it cause errors with the memory?

    • Adrian Rosebrock August 15, 2018 at 8:31 am #

      If you left it running perpetually, yes, the dictionary could inflate. It’s up to you to add any “business logic” code to update the dictionary. Some people may want to store that information in a proper database as well — it’s not up to me to make those decisions for people. This code is a starting point for tracking foot count.
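      One possible piece of such "business logic" (the function and cap below are hypothetical, not part of the post's code): cap the trackable-objects dictionary so a long-running deployment doesn't grow without bound.

```python
def prune_counted(trackable_objects, max_entries=1000):
    """Keep only the most recent max_entries objects, assuming integer
    object IDs are assigned in increasing order (as a centroid tracker
    typically does). Older, already-counted entries are dropped."""
    if len(trackable_objects) <= max_entries:
        return trackable_objects
    keep = sorted(trackable_objects)[-max_entries:]
    return {oid: trackable_objects[oid] for oid in keep}
```

Calling this periodically (e.g. once per detection cycle) bounds memory while still keeping enough history for the up/down counters to work.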

  21. Abkul August 15, 2018 at 9:53 am #

    Great blog!!! its amazing how you simplify difficult concepts.

    I am working on ways to identify each individual going through the entrance from images captured in real time by a camera (we have their passport-size photos plus other labels, e.g. personal identification number, department, etc.). Kindly advise on how to include these multi-class labels rather than the ID notation you used in the example.

    Will you be covering the storage of the counted individuals to the database for later retrieval?

  22. Juan LP August 15, 2018 at 12:01 pm #

    For those who had the following error when running the script:

    Traceback (most recent call last):
    File “”, line 160, in
    rect = dlib.rectangle(startX, startY, endX, endY)
    Boost.Python.ArgumentError: Python argument types in
    rectangle.__init__(rectangle, numpy.int32, numpy.int32, numpy.int32, numpy.int32)
    did not match C++ signature:
    __init__(struct _object * __ptr64, long left, long top, long right, long bottom)
    __init__(struct _object * __ptr64)

    please update line 160 to

    rect = dlib.rectangle(int(startX), int(startY), int(endX), int(endY))

    • Adrian Rosebrock August 15, 2018 at 1:23 pm #

      Thanks for sharing, Juan! Could you let us know which version of dlib you were using as well just so we have it documented for other readers who may run into the problem?

      • Durian August 23, 2018 at 11:24 pm #

        i have the same problem with him and my version of dlib is 19.6.0

        • lenghonglin September 16, 2018 at 9:52 am #

          my dlib version is 19.8.1

        • Mou October 23, 2018 at 11:32 am #

          I have the same problem and I have tried 19.18.0 and 19.6.0; neither of them works.

    • gunan August 16, 2018 at 2:06 am #


    • Aysenur September 13, 2018 at 4:55 am #

      thanks 🙂

    • lenghonglin September 16, 2018 at 9:48 am #

      Hi, I ran into the same issue — did you solve it?

      • Adrian Rosebrock September 17, 2018 at 2:17 pm #

        As Juan said, you change Line 160 to:

        rect = dlib.rectangle(int(startX), int(startY), int(endX), int(endY))

    • abbhijeet July 8, 2019 at 12:31 pm #

      Thanks a lot! you saved me 🙂

  23. Kibeom Kwon August 15, 2018 at 9:15 pm #


    Your wonderful work is a priceless textbook. Unfortunately, my understanding is still not enough to follow the whole code. I tried to execute the Python files, but got an error.

    Can I know how to solve it? Thank you so much

    python --prototxt mobilenet_ssd/MobileNetSSD_deploy.prototxt \
    usage: [-h] -p PROTOTXT -m MODEL [-i INPUT] [-o OUTPUT]
    [-c CONFIDENCE] [-s SKIP_FRAMES] error: argument -m/--model is required

    • Adrian Rosebrock August 16, 2018 at 5:32 am #

      Your error can be solved by properly providing the command line arguments to the script. If you’re new to command line arguments, that’s fine, but you should read up on them first.

    • m October 9, 2018 at 6:49 am #

      Remove the ‘\’ line continuations between the arguments, remove the newlines, and provide the 3 lines as a single one-line command.

  24. Jan August 16, 2018 at 12:09 pm #

    Hi Adrian,
    thanks for sharing this great article! It really helps me a lot to understand object tracking.

    The CentroidTracker uses two parameters: MaxDisappeared and MaxDistance.
    I understand the reason for MaxDistance, but I cannot find the implementation in the source code.

    I am running this algorithm on vehicle detection in traffic and the same ID is sometimes jumping between different objects.
    How can I implement MaxDistance to avoid that?

    Thanks in advance! I really appreciate your work!!

    • Adrian Rosebrock August 16, 2018 at 3:54 pm #

      Hey Jan — have you used the “Downloads” section of the blog post to download the source code? If so, take a look at the implementation. You will find both variables being used inside the file.

    • Misbah September 18, 2018 at 8:18 am #

      Kindly help me too — have you resolved the error?

  25. Mattia August 16, 2018 at 12:56 pm #

    Hi Adrian,
    do you think it’s worth training a deep learning object detector on only the classes I’m interested in (about 15), instead of filtering classes from a pre-trained model, in order to run it on devices with limited resources (BeagleBoard-X15 or similar SBCs)?

    • Adrian Rosebrock August 16, 2018 at 3:54 pm #

      If you train on just the classes you are interested in you may be able to achieve higher accuracy, but keep in mind it’s not going to necessarily improve your inference time that much.

  26. David August 17, 2018 at 11:49 am #

    Hi Adrian,

    Does this implement the multi-processing you were talking about the week before in ?

    • Adrian Rosebrock August 17, 2018 at 12:41 pm #

      It doesn’t use OpenCV’s implementation of multi-object tracking, but it uses my implementation of how to turn dlib’s object trackers into multi-object trackers.

  27. sau August 19, 2018 at 1:36 am #

    thank you very much dear adrian for best blog post

  28. senay August 20, 2018 at 8:28 am #

    This is really nice thank you….
    I have developed a people counter using the dlib tracker and SSD detector. You skip 30 frames between detections to save computation, but in my case the detector and the tracker run on every frame. When there is no detection (when the detector loses the object), I re-initialize the tracker with the tracker’s previous bounding box (for only two frames). The problem is that when there is no object in the video (the object was not lost by the detector but has simply passed), the tracker’s bounding box gets stuck on the screen, which causes a problem when another object comes into view. Is there any way to delete the tracker when I need to?

    • Adrian Rosebrock August 22, 2018 at 9:54 am #

      I would suggest applying another layer of tracking, this time via centroid tracking like I do in this guide. If the maximum distance between the old object and new object is greater than N pixels, delete the tracker.

  29. Aditya Johar August 21, 2018 at 3:54 am #

    Hi Adrian
    Again, a great tutorial. Can’t praise it enough. I’ve got my current job because of PyImageSearch and that’s what this site means to me.
    I was going through the code, and trying to understand –

    –If you are running the object detector every 30 frames, how are you ensuring that an *already detected* person with an associated objectID, does not get re-detected in the next iteration of the object detector after the 30 frame wait-time? For example, if we have a person walking really slowly, or if two people are having a conversation within the bounds of our input frame, how are they not getting re-detected?–

    Thanks and Regards,

    • Adrian Rosebrock August 22, 2018 at 9:38 am #

      They actually are getting re-detected but our centroid tracker is able to determine if (1) they are the same object or (2) two brand new objects.
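      The association step can be sketched like this (a bare-bones illustration, not the post's actual CentroidTracker): each fresh detection is matched to the nearest existing centroid within a distance budget; anything left unmatched registers as a brand new object.

```python
import math

def associate(existing, detected, max_distance=50.0):
    """existing: {object_id: (x, y)} centroids from the previous frame.
    detected: list of (x, y) centroids from the new detections.
    Returns (updated_ids, new_centroids)."""
    updated, new = {}, []
    unmatched = dict(existing)
    for (x, y) in detected:
        best_id, best_dist = None, max_distance
        for oid, (ex, ey) in unmatched.items():
            d = math.hypot(x - ex, y - ey)
            if d < best_dist:
                best_id, best_dist = oid, d
        if best_id is None:
            new.append((x, y))           # no nearby centroid: brand new object
        else:
            updated[best_id] = (x, y)    # same object, just moved
            del unmatched[best_id]
    return updated, new
```

This is why re-detection every 30 frames is harmless: a slow walker's new box lands near their old centroid and keeps the same ID.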

  30. Stefan August 21, 2018 at 9:08 am #

    Thank you Adrian for another translation of the language of the gods. The combination of graph theory, mathematics, conversion to code and implementation is like ancient Greek and you are the demigod who takes the time to explain it to us mere mortals. Most importantly, you take a stepwise approach. When ‘Rosebrock Media Group’ has more employees, someone in it can even spend more time showing how alternative code snippets behave.

    In terms of performance, I am just starting to figure out if a CUDA implementation would be of benefit. Of course, there is no ‘Adrian for CUDA coding’. Getting this to run smoothly on a small box would be another interesting project but requires broad knowledge of all the hardware options available — a Xilinx FPGA? an Edison board? a mini-ITX pc? a hacked cell phone? (there’s an idea — it’s a camera, a quad core cpu and a gpu in a tidy package, but obviously it would need a mounting solution and a power source too).

    Of course to run on an iPhone I would have to jailbreak the phone and translate the code to Swift. But then perhaps it would be better to go to Android as the hardware selection is broader and the OS is ‘open’. Do you frequent any specific message boards where someone might pick up this project and get it to work on a cell phone? There are a lot of performance optimizations that could make it work.

    • Adrian Rosebrock August 22, 2018 at 9:34 am #

      Thank you for the kind words, Stefan! Your comment really made my day 🙂 To answer your question — yes, running the object detector on the GPU would dramatically improve performance. While my deep learning books cover that the OpenCV bindings themselves do not (yet). I’m also admittedly not much of an embedded device user (outside of the Raspberry Pi) so I wouldn’t be able to comment on the other hardware. Thanks again!

      • Mike Isted October 7, 2018 at 3:35 am #

        Hi Adrian, just spotted this…
        For information I have successfully implemented this post on a Jetson TX2, replacing the SSD with one that is optimised for TensorRT. I would refer your reader to the blog of JK Jung for guidance.

        Performance-wise, I am finding that all 6 cores are maxed out at 100% and the GPU at around 50%, depending on the balance of SSD/trackers used. The trackers in particular are very CPU intensive and, as you say, the pipeline slows a great deal with multiple objects.

        As always, thanks for your huge contribution to the community and congratulations on just getting married!

        Cheers, Mike

        • Adrian Rosebrock October 8, 2018 at 9:38 am #

          Awesome, thanks so much for sharing, Mike!

  31. senay August 22, 2018 at 1:14 pm #

    I found out the cause of my issue — it was because I changed skip_frames to 15.
    So how do I set an appropriate number of frames to skip? A larger skip_frames value will lead to missed objects, and a smaller value will lead to inappropriate assignment of object IDs…

    • Adrian Rosebrock August 24, 2018 at 8:56 am #

      As you noticed, it’s a balance. You need to balance (1) skipping frames to help the pipeline to run faster while (2) ensuring objects are not lost or trackings missed. You’ll need to experiment to find the right value for skip frames.
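      The cadence being tuned here can be reduced to a one-line decision per frame (a hypothetical helper, not the post's code): run the expensive detector every `skip_frames` frames and the cheap trackers everywhere in between.

```python
def phase_for_frame(total_frames, skip_frames=30):
    """Decide whether this frame runs the detector or only the trackers.
    Lower skip_frames: more detections, slower, fewer missed objects.
    Higher skip_frames: faster, but a fast mover can slip through unseen."""
    return "detect" if total_frames % skip_frames == 0 else "track"
```

Experimenting with `skip_frames` then amounts to measuring, on your own footage, how large the value can get before objects start slipping through between detection frames.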

  32. Jaime August 23, 2018 at 6:49 am #

    Hi Adrian,

    I’ve recently found your blog and I really like the way you explain things.

    I’m building a people counter on a Raspberry Pi using background subtraction and centroid tracking.
    The problem I’m facing is that sometimes object IDs switch, as you mentioned in the “Simple object tracking with OpenCV” post. Is there something I can do to minimize these errors?

    If you have any recommendations feel free to share.

    Thanks in advance.

    PS: I’d be really interested if you did a post about people counting on the Raspberry Pi, as you mentioned in the first comment

    • Adrian Rosebrock August 24, 2018 at 8:41 am #

      Hey Jaime — there isn’t a ton you can do about that besides reduce the maximum distance threshold.

  33. Nilesh Garg August 23, 2018 at 11:15 am #

    Thanks Adrian for such a nice tutorial. You released it with perfect timing — I am working on a similar project for tracking the number of people getting in and out of a bus. Somehow I am not getting proper results, but this tutorial is a very good start and helped me understand the logic.
    Thanks again. Keep rocking!!!
    Thanks again. Keep rocking!!!

    • Adrian Rosebrock August 24, 2018 at 8:37 am #

      Best of luck with the project, Nilesh!

  34. Wang August 23, 2018 at 10:02 pm #

    Hi Adrian,

    Approximately how many meters above the floor is the camera mounted?

    Thank you very much!

    • Adrian Rosebrock August 24, 2018 at 8:34 am #

      To be totally honest I’m not sure how many meters above the ground the camera was. I don’t recall.

  35. Nik August 24, 2018 at 12:36 am #

    Thank you Adrian for inspiring me and introducing me to the world of computer vision.

    I started with your 1st edition and followed quite a few of your blog projects, with great success.
    I was excited to read this blog, as people counting is something I have wanted to pursue.

    However,………………..there’s a problem.

    When I execute the script, I get:

    [INFO] loading model…
    [INFO] opening video file…

    the sample video does open up, plays for about 1 second (the lady doesn’t reach the line), and then, boom — my computer crashes and Python quits!
    I have tried increasing --skip-frames — still crashes. I even tried Python 3 (thinking my version 2.7 was old) — no joy!

    Is it time to say goodbye to my 11 year old Macbook Pro? or could this be something else?

    “It’s important to understand that deep learning object detectors are very computationally expensive, especially if you are running them on your CPU.”

    Out of interest is there a ballpark guide to minimum spec machines, when delving into this world of OpenCV?

    Best Regards,

    • Nik August 24, 2018 at 12:55 am #


      Reading your /install-dlib-easy-complete-guide/

      I noticed you say to install XCode.
      I had removed XCode for my homebrew installation as instructed, as it was an old version.

      When I installed dlib, I simply did pip install dlib.

      Could this be related?


      • Adrian Rosebrock August 24, 2018 at 8:33 am #

        Hey Nik — it sounds like you’re using a very old system and if you’ve installed/uninstalled Xcode before then that could very well be an issue. I would advise you to try to use a newer system if at all possible. Otherwise, it would be any number of problems and it’s far too challenging to diagnose at this point.

  36. Safaa Diab August 26, 2018 at 4:06 pm #

    Hello, Dr. Adrian, thank you for your great work. I am a beginner in this field and your webpage is really helping me through it. I have a question: I tried to run this code and an error popped up — “ error: the following arguments are required: -p/--prototxt, -m/--model” — and I really don’t know what to do. I would be grateful if you helped.
    Thanks in advance.

  37. senay August 27, 2018 at 10:10 am #

    Hi Adrian !!
    This is the answer you give me my question !!! thank you for that….
    August 24, 2018 at 8:56 am

    As you noticed, it’s a balance. You need to balance (1) skipping frames to help the pipeline to run faster while (2) ensuring objects are not lost or tracking missed. You’ll need to experiment to find the right value for skip frames.

    but that balancing is only possible for a recorded video, because I have it in hand….
    What do you suggest for a live camera (where I don’t know when an object will appear, so I can’t tune the skip-frame number in advance)?

  38. Anand Simmy August 31, 2018 at 12:14 pm #

    Hi Adrian !!,

    How can we evaluate the counting accuracy of this counter? My mentor asked me for the counting accuracy. Do we need to find some videos to use as a benchmark, or are there libraries for accuracy evaluation?

  39. Andy September 1, 2018 at 2:47 am #

    Another great post! Thanks so much for your contributions to the community.

    One question: I have tried the provided code on a few test videos, and it seems like detected people can be counted as moving up or down without having actually crossed the yellow reference line. In the text you mention that people are only counted once they have crossed the line. Is this a behavior you have seen as well? Is there an approach you would recommend to place a stricter condition that only counts people who have actually crossed from one side of the line to the other? Thanks

    • Adrian Rosebrock September 5, 2018 at 9:16 am #

      Hey Andy — that’s actually a feature, not a bug. For example, say you are tracking someone moving from the bottom to the top of a frame. But, they are not actually detected until they have crossed the actual line. In that instance we still want to track them so we check if they are above the line, and if so, increment the respective counter. If that’s not the behavior you expect/want you will have to update the code.
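      If you do want the stricter behavior, one option (a hypothetical sketch, not the post's code) is to require the centroid's y-coordinate history to actually straddle the counting line before incrementing either counter:

```python
def crossed_line(y_history, line_y):
    """Return 'up', 'down', or None, counting only when the object's
    recorded y-coordinates span both sides of the counting line.
    y_history: list of centroid y-values, oldest first."""
    if not y_history:
        return None
    start, end = y_history[0], y_history[-1]
    if start > line_y and end < line_y:
        return "up"      # first seen below the line, now above it
    if start < line_y and end > line_y:
        return "down"    # first seen above the line, now below it
    return None          # never crossed: do not count
```

The trade-off is exactly the one described above: a person first detected past the line would never satisfy this check and would go uncounted.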

  40. Frank Yan September 3, 2018 at 11:24 am #

    Hello Adrian,

    Thank you for the great post.

    I modified the code for horizontal camera as below:

    I noticed the problems below:
    1. No response to fast-moving objects
    2. Irrelevant centroid noise
    3. Repeated counting of the same person

    And I am trying to solve these problems by introducing face recognition and pose estimation.

    Do you have any suggestions/comments on this?


    • Adrian Rosebrock September 5, 2018 at 8:56 am #

      Face recognition would greatly solve this problem but the issue you may have is being unable to identify faces from side/profile views. Pose estimation and gait recognition are actually more accurate than face recognition — they would be worth looking into.

    • Bruno Bonela June 5, 2019 at 9:55 am #


      Did you solve your problem?

      Bruno Bonela.

  41. Andres Gomez September 4, 2018 at 11:29 am #

    Hi Adrian. First, I want to say thank you for taking the time to explain each detail of your code. Your blog is incredible (the best of the best!).

    I have a doubt about the CentroidTracker: it creates an object ID when a new person appears in the video but never destroys that ID, so would that cause any trouble with memory in the future if I want to implement it on a Raspberry Pi 3? I followed your person counter code with just some modifications to run it on the Pi.

    My best regards

    • Adrian Rosebrock September 5, 2018 at 8:35 am #

      Hey Andres — the CentroidTracker actually does destroy the object ID once it has disappeared from a sufficient number of frames.
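      That deregistration idea in miniature (illustrative names, not the post's implementation): bump a per-object "disappeared" counter on every frame with no match, and drop the object once it exceeds maxDisappeared consecutive frames.

```python
def mark_disappeared(objects, disappeared, max_disappeared=40):
    """Call on a frame where no detections matched any tracked object.
    objects: {object_id: centroid}; disappeared: {object_id: frame count}.
    Objects missing for more than max_disappeared frames are deregistered,
    freeing their ID and state (which is what bounds memory use)."""
    for oid in list(objects):
        disappeared[oid] = disappeared.get(oid, 0) + 1
        if disappeared[oid] > max_disappeared:
            del objects[oid]
            del disappeared[oid]
    return objects, disappeared
```

A matched object would have its counter reset to zero instead, so only genuinely departed people are ever dropped.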

  42. Andres Gomez September 6, 2018 at 8:51 am #

    Thank you very much Adrian. Another question: I have a problem with the centroid tracker update. When a person leaves the frame but another person instantaneously comes in, the algorithm thinks it is the same person — it doesn’t count them and assigns the old centroid to the person who came in (I changed maxDisappeared but without success). So I checked the code again to understand on which line you use the minimum Euclidean distance to assign the new position to the old centroid, but I couldn’t understand the method you used to achieve that. Can you give me advice to solve this problem?

    It doesn’t happen every time, but I want to raise the success rate.

    My best regards

    • Adrian Rosebrock September 11, 2018 at 8:44 am #

      That is an edge case you will need to decide how to handle. If you reduce the “maxDisappeared” value too much you could easily register false positives. Keep in mind that footfall applications are meant to be approximations; they are never 100% accurate, even with a human doing the counting. If it doesn’t happen very often then I wouldn’t worry about it. You will never get 100% accurate footfall counts.

      • Andres September 11, 2018 at 10:08 am #

        I handled it by modifying the CentroidTracker: I added a condition that if the distance from the old centroid to the new one is more than 200 pixels along the y-axis, the match is skipped. Thanks for the answer.
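
        That gate can be written as a small predicate (hypothetical names; the 200-pixel y-axis threshold is the one mentioned above and should be tuned per camera):

```python
# Sketch of a distance gate for centroid matching: before assigning a
# new detection to an existing object ID, reject the match if the two
# centroids are implausibly far apart along the y-axis.
MAX_Y_JUMP = 200  # pixels; the threshold from the comment above

def plausible_match(old_centroid, new_centroid, max_y_jump=MAX_Y_JUMP):
    (_, old_y) = old_centroid
    (_, new_y) = new_centroid
    # a jump larger than the threshold is treated as a different person
    return abs(new_y - old_y) <= max_y_jump
```

        Matches rejected by the gate would then fall through to registering a brand-new object ID instead of inheriting the old one.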

  43. Marc September 6, 2018 at 9:09 am #

    Somehow I can’t run the code…
    I always get the error message:

    Can’t open “mobilenet_ssd/MobilenetSSD_deploy.prototxt” in function ‘ReadProtoFromTextFile’

    Seems like the program is unable to read the prototxt…

    Do you have an idea on how to fix it?

    • Adrian Rosebrock September 11, 2018 at 8:42 am #

      Yes, that does seem to be the problem. Make sure you’ve used the “Downloads” section of the blog post to download the source code + models. From there double-check your paths to the input .prototxt file.

  44. Harsha Jagadish September 10, 2018 at 7:20 am #

    Hi Adrian,

    Thank you for a great tutorial. Would it be possible for you to let me know how I can count people moving from right to left or left to right? I am able to draw the trigger lines but unable to count the objects.

    Harsha J

    • Adrian Rosebrock September 11, 2018 at 8:14 am #

      You’ll need to modify Lines 213 and 220 (the “if” statements) to perform the check based on the width, not the height. You’ll also want to update Line 204 to keep track of the x-coordinates rather than the y-coordinates.
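
      Sketched out, the left/right variant of that logic might look like this. It assumes a vertical counting line at x = W // 2; TrackableObject here is a minimal stand-in for the class in the pyimagesearch module, so this is an illustration rather than a drop-in patch:

```python
# Minimal stand-in for the pyimagesearch TrackableObject class.
class TrackableObject:
    def __init__(self, objectID, centroid):
        self.objectID = objectID
        self.centroids = [centroid]
        self.counted = False

def count_horizontal(to, centroid, W, totalLeft, totalRight):
    # mean of the historical x-coordinates (the original script uses
    # np.mean on the y-coordinates instead)
    xs = [c[0] for c in to.centroids]
    direction = centroid[0] - (sum(xs) / len(xs))  # negative = moving left
    to.centroids.append(centroid)

    if not to.counted:
        # moving left and past the vertical center line
        if direction < 0 and centroid[0] < W // 2:
            totalLeft += 1
            to.counted = True
        # moving right and past the vertical center line
        elif direction > 0 and centroid[0] > W // 2:
            totalRight += 1
            to.counted = True
    return totalLeft, totalRight
```

      The call to cv2.line would change the same way: draw from (W // 2, 0) to (W // 2, H) instead of a horizontal line.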

      • NT December 27, 2018 at 2:37 am #

        Hi Adrian,

        I am a beginner in this field and your webpage is really helping me through.
        Could you please give me the code?
        I’m confused about how to change it.

        Thanks in advance

        • Adrian Rosebrock December 27, 2018 at 10:05 am #

          I am happy to hear you are finding the PyImageSearch blog helpful! However, I do not provide custom modifications to my code. I provide 300+ free tutorials here on PyImageSearch and I do my best to guide you, again, for free, but if you need custom modifications you will need to make them yourself.

  45. Jaime September 11, 2018 at 9:25 am #

    Hi Adrian,

    I’m wondering what the tracker does when an object doesn’t move (i.e. the object stands in the same position for a few frames). I’m not sure if OpenCV’s trackers are able to handle this situation.

    Thanks in advance.

    • Adrian Rosebrock September 11, 2018 at 9:44 am #

      It will keep tracking the object. If the object is lost the object detector will pick it back up.

  46. Toufik September 13, 2018 at 11:22 am #

    Hello Adrian, first I want to say thank you for this amazing project; it helped me understand quite a few things concerning computer vision. I have a question you could help me with: I want to make this project monitor two doors in my store, and I was wondering what changes I might have to make to use two cameras simultaneously.
    PS: I have only worked on simple OpenCV programs since I’m quite the noob, and I tried to use cap0 = cv2.VideoCapture(0) and
    cap1 = cv2.VideoCapture(1); however, it opens only one camera feed even though the camera indexes are correct!
    Thanks for this project again and for taking the time to read my comment.

    • Adrian Rosebrock September 14, 2018 at 9:31 am #

      Follow this guide and you’ll be able to efficiently access both your webcams 🙂

  47. smit September 14, 2018 at 5:35 am #

    Hi @Adrian. How can we improve object detection accuracy, since your method is completely based on how good the detection is? Any other model you would recommend for detection?

    • Adrian Rosebrock September 14, 2018 at 9:20 am #

      That’s a complex question. Exactly how you improve object detection accuracy varies on your dataset, your number of images, and the intended use of the model. I would suggest you read this tutorial on the fundamentals of object detection and then read Deep Learning for Computer Vision with Python to help you get up to speed.

      • smit September 24, 2018 at 7:08 am #

        One of the purposes of object tracking is to track people when object detection may fail, right? But your tracking algorithm’s accuracy, if I understand correctly, is completely based on whether we detect the object in subsequent frames. What if my object gets detected only once; how should I track him then? What modification would be required in your solution?

        • Adrian Rosebrock October 8, 2018 at 12:53 pm #

          No, the objects do not need to be detected in subsequent frames — I only apply the object detector every N frames, the object tracker tracks the object in between detections. You only need to detect the object once to perform the tracking.
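
          The interleaving can be sketched as a schedule (illustrative helper; skip_frames plays the role of the script’s --skip-frames argument):

```python
def schedule(num_frames, skip_frames=30):
    """Illustrates the detect-every-N-frames pattern the people counter
    uses: the expensive object detector runs only on every
    skip_frames-th frame; cheap correlation trackers fill the gaps."""
    phases = []
    for i in range(num_frames):
        if i % skip_frames == 0:
            phases.append("detect")  # run detector, (re)seed the trackers
        else:
            phases.append("track")   # only update the existing trackers
    return phases
```

          A single "detect" is enough to seed a tracker; every subsequent "track" frame carries the object forward until the next detection pass re-confirms it.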

  48. Misbah September 15, 2018 at 11:54 am #

    Hey Adrian, I just downloaded the source code for the “people counter” with OpenCV and Python (the one that counts the number of people heading “in” or “out” of a department store in real time).

    But getting the following error…

    usage: [-h] -p PROTOTXT -m MODEL [-i INPUT] [-o OUTPUT]
    [-c CONFIDENCE] [-s SKIP_FRAMES]
    error: the following arguments are required: -p/--prototxt, -m/--model

  49. Bharath September 18, 2018 at 7:05 am #

    Hello Adrian, I’ve been following your blog for a couple of months now and indeed there is no other blog which serves with this much of content and practices. Thanks a lot man.

    Currently, I’m working on a project with the same application, counting people. I’m using a Raspberry Pi and a Pi camera. Due to some constraints I’ve settled on an overhead view for the camera, and I’m using computationally less expensive practices: a Haar cascade detector (custom trained to detect heads from the overhead view). The detector is doing a good job. I have also integrated the tracking and counting methods which you provided. At first I encountered low FPS, so I looked around a bit and came up with the “imutils” library to speed up my FPS. Now I have achieved a pretty decent FPS throughput, and I have also tested the code with a video feed. It’s all working well.
    When I use the live feed from the Pi camera, there is a bit of lag in detection and the whole system. How do I get this working in real time? Is there a way to do this in real time?
    Or is this just the computational potential of a Raspberry Pi?
    Thanks in advance Adrian!
    Curious and eagerly waiting for your reply!

    • Adrian Rosebrock September 18, 2018 at 7:08 am #

      Hi Bharath — thank you for the kind words, I appreciate it 🙂 And congratulations on getting your Pi People Counter this far along, wonderful job! I’d be curious to know where you found a dataset of overhead views of people? I’d like to play around with such a dataset if you don’t mind sharing.

      As far as the lag goes, could you clarify a bit more? Where exactly is this “lag”? If you can be a bit more specific I can try to help, but my guess is that it’s a limitation of the Pi itself.

      • Bharath September 19, 2018 at 3:35 am #

        Thanks for the reply Adrian!
        The dataset was hand-labeled at my university. Let me know if you need it!

        Hey, and by “lag” I mean…

        With a pre-captured video feed, the Pi was able to achieve ~160 FPS (15s video).

        With the live feed from the Pi camera, it was able to achieve ~50 FPS (while there was no detection); once there is a detection, the FPS drops to around 20 FPS. (This was all possible only after implementing the “imutils” library.)

        When tested without the “imutils” library, the FPS was around 2 to 6 FPS.

        So, the key inference I would draw is that the system performs with pretty good accuracy when the subject (head) travels at a slower speed (slower than the normal pace at which any human walks).

        When the head moves at a normal pace, the system fails to track, even after detection and IDing.

        Hope, I made myself clear Adrian!
        Please let me know your thought about this!

        • Adrian Rosebrock October 8, 2018 at 1:32 pm #

          Hey Bharath, would you mind sending me the dataset? I’d love to play with it!

          Also, thanks for the clarification on your question. I think you should read my reply to Jay at the top of this post. You won’t be able to get true real-time performance out of this method using the Pi, it’s just not fast enough.

    • Prashant Bansod April 19, 2019 at 3:27 am #

      Hi Bharath, would you mind sharing the overhead image dataset? I am also working on something like your application. Thank you.

  50. lenghonglin September 18, 2018 at 9:37 am #

    Hi Adrian,

    Thank you for a great tutorial. I have some questions.
    1. Where is the Caffe model from? How can I train my own model?
    2. Did you test the situation where people hold up an umbrella? In my tests the model can’t detect this situation.
    Do you have any ideas?
    Thanks very much

  51. Jan September 20, 2018 at 3:47 am #

    Hi Adrian

    Can you please make a tutorial with a Kalman filter on top of this 🙂

    dlib is not very good with fast-moving objects.

    Thank you.

  52. Rohit sharma September 21, 2018 at 5:58 am #

    Hey Adrian,
    If I want to capture using the Pi camera, what should I do and what is the command for it?

    • Adrian Rosebrock October 8, 2018 at 1:09 pm #

      I would suggest reading this tutorial to help you get started.

  53. lenghonglin September 22, 2018 at 5:44 am #

    Hi @Adrian. I ran this source code on a Raspberry Pi, but the FPS is 3, which is very slow. Then I changed the Raspberry Pi to an RK3399, but the situation is not much better; the FPS is almost 20.

    Do you have any ideas to improve the FPS?

    Thanks very much.

    • Adrian Rosebrock October 8, 2018 at 1:04 pm #

      Make sure you refer to my reply to Jay at the top of the comments section of this post.

  54. Federico September 30, 2018 at 6:59 pm #

    Hi Adrian, thanks for this great tutorial! I’m using a rpi 3 B+ with raspbian stretch and I am getting very slow frame rates of about 5 fps with the example file. I have tried not writing the output with same results. Playing the example file with omxplayer works fine at 30 fps. I have tried using another sd card to no avail (write speed is about 17 MB/s and read is 22 MB/s, which I think is not that bad). Do you know what could be happening?


  55. Guru Vishnu October 6, 2018 at 11:05 am #

    Hi Adrian,

    Thanks for this post!

    Can you please let me know the process for using this code to count vehicles?


    • Adrian Rosebrock October 8, 2018 at 9:42 am #

      Hi Guru — I will try to cover vehicle counting in a separate blog post 🙂

      • Guru Vishnu October 8, 2018 at 2:52 pm #

        Thanks Adrian.

        Since I am trying to build one, can you please enlighten me: can I use time in seconds/milliseconds instead of a centroid to count the object, as time can be a more crucial factor than position? Please let me know your thoughts.


  56. Guru Vishnu October 8, 2018 at 3:01 pm #

    Also, please let me know if I can use CAP_PROP_POS_MSEC (via imutils) to count vehicles in a live CCTV stream based on time.

  57. Eric N October 10, 2018 at 11:10 pm #

    Hi, I’m trying to swap out the dlib tracker for an OpenCV tracker, since the dlib one is pretty inaccurate. However, when I use an OpenCV tracker, e.g. CSRT, the new detections accumulate into a new tracking item instead of updating and replacing the original ID associated with that object. So in the first cycle I have one bounding box with a tracker, and in the next cycle it detects the person again; however, it just creates a new tracker, and then I have two bounding boxes representing the same person. And it keeps adding more trackers each time for the same person. Any idea what I did wrong? Thanks!

    • Adrian Rosebrock October 12, 2018 at 9:09 am #

      It sounds like there is a logic error in your code. My guess is that you’re creating a new CSRT tracker when you should actually be updating an existing tracker. Keep in mind that we only create the tracker once and from there we only update its position.

  58. Daniele October 11, 2018 at 7:06 am #

    Hi Adrian,
    thank you so much for this post; it was very useful for my research project. What’s the best micro-PC (Raspberry Pi, ASUS Tinker, etc.) to implement a good counter, or a machine learning system in general? Thanks.

    • Adrian Rosebrock October 12, 2018 at 9:04 am #

      That really depends on your end goal. The Pi is a nice piece of hardware and is very cheap but if you want more horsepower to run deep learning models I highly recommend NVIDIA’s Jetson TX series.

  59. Steve October 11, 2018 at 8:18 pm #

    Hi Adrian!

    Thank you for this post. I have a quick question that’s confusing me. In an earlier post you mentioned that the size parameter in blob = cv2.dnn.blobFromImage should match the CNN network dimensions. According to the dims in the prototxt, the size should be 300 x 300, yet the W and H supplied to cv2.dnn.blobFromImage in this example are 373 and 500. Does this affect accuracy?

    Thank you for your help.


    • Adrian Rosebrock October 12, 2018 at 8:54 am #

      Object detection networks are typically fully convolutional, implying that any size image dimensions can be used. However, for the face detector specifically I’ve found that 300×300 input blobs tend to obtain the best results, at least for the applications I’ve built here on PyImageSearch.

  60. Atul October 15, 2018 at 8:02 am #

    Hi Adrian, this is an awesome starting point for people like me who are new to algorithms and implementing machine learning!!! Just curious to ask a few silly questions:

    1. Is it possible to track objects left to right and vice versa?
    2. Is it possible to implement it for live streaming (it appears you have given the option, but I would like to know more)?

    Just thinking of implementing this at one of the maker fairs in Mumbai, if possible, just to give students an idea about open-source technologies and uses of OpenCV.

    • Adrian Rosebrock October 16, 2018 at 8:30 am #

      1. Yes, see my note in the blog post. You’ll want to update the call to “cv2.line” which actually draws the line but more importantly you’ll want to update the logic in the code that handles direction tracking (Lines 199-222).

      2. Yes, use the VideoStream class.

  61. Hj October 15, 2018 at 1:36 pm #

    Hi Adrian,

    Would it be possible to see if a person exists within a defined space in the video frame? Similar to a rectangle of trigger lines. If yes, do let me know how I can go about it.

    Harsha j

    • Adrian Rosebrock October 16, 2018 at 8:27 am #

      Yes, you would want to define the (x, y)-coordinates of your rectangle. If a person is detected within that rectangle you can take whatever action necessary.

      • HJ October 19, 2018 at 12:25 pm #

        Hi Adrian,

        I am able to get the rectangle, but recognition runs on the entire frame. Could you please let me know how to restrict detection to only within the rectangle?

        Harsha J

        • Adrian Rosebrock October 20, 2018 at 7:28 am #

          Please take a look at my other replies to your comments, Harsha. I’ve already addressed your question. You can either (1) monitor the (x, y)-coordinates of a given person/objects bounding box and see if they fall into the range of the rectangle or (2) you can simply crop out the ROI of the rectangle and only perform person detection there.
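
          Option (1) reduces to a point-in-rectangle test on each tracked centroid (illustrative names):

```python
def inside_roi(centroid, roi):
    """Return True if an (x, y) centroid falls inside a rectangle
    given as (startX, startY, endX, endY), edges inclusive."""
    (x, y) = centroid
    (startX, startY, endX, endY) = roi
    return startX <= x <= endX and startY <= y <= endY
```

          Option (2) is just array slicing before running the detector, e.g. roi = frame[startY:endY, startX:endX] on the NumPy frame.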

          • HJ October 21, 2018 at 1:00 pm #

            Thanks Adrian,

            Will try it out.

            Harsha Jagadish

  62. Rakesh October 20, 2018 at 7:39 am #

    Hi, the code is tracking people. How do I make it only count specific kinds of objects, like a boat in water or a car on a road, etc.?

    • Adrian Rosebrock October 20, 2018 at 8:10 am #

      You’ll want to take a look at my guide on deep learning-based object detection. Give the guide a read, it will give you a better idea on how to detect and track custom objects.

  63. jorge nunez October 22, 2018 at 9:24 pm #

    I’m working on a project where, rather than tracking movement, I need to know the local coordinates of each person. The images come from multiple cameras, and each camera (the algorithm running on the PC, actually) should be able to compute the coordinates of each person detected in the corresponding image. My first thought was to train a neural network, but my intuition tells me that would be overkill, and killing a fly with a bazooka sounds disastrous in any context where you have limited resources, which is my case.

    • Adrian Rosebrock October 29, 2018 at 2:15 pm #

      By “local coordinates” do you mean just the bounding box coordinates? I’m not sure what you mean here.

  64. Dheeraj October 23, 2018 at 5:51 am #

    I really appreciate the great work on your end.

    I am facing an issue while importing dlib. The code runs without importing dlib but is then unable to track and count people. How do I install and import dlib?

    ImportError: No module named dlib

    Please help me figure out this issue.

  65. JP October 24, 2018 at 4:47 am #

    Hi, I’ve tried to run the code; however, this error pops up: ImportError: No module named ‘pyimagesearch’. How do I solve this?

    • Adrian Rosebrock October 29, 2018 at 2:03 pm #

      Make sure you use the “Downloads” section of the blog post to download the source code, videos, and the “pyimagesearch” module. From there you’ll be able to execute the code.

  66. Dheeraj October 24, 2018 at 7:55 am #


    Thank you for a great tutorial.

    The code is only tracking people, and it is not counting people when they are moving at greater speed, hence the counting is inaccurate. Even if people come into the region of interest and move back without crossing the line, the counter increments. How can I avoid false detections, and at what minimum height from the ground should the webcam be installed?

    Any solution on how to fix this out?

    • Adrian Rosebrock October 29, 2018 at 2:02 pm #

      You’re getting those results on the videos supplied with the blog post? Or with your own videos?

      • Dheeraj October 31, 2018 at 5:27 am #

        I am getting those results on my own videos: fast movement is not detected and counted. I am using a normal C270 webcam, and the count is not accurate.

  67. Rahma October 26, 2018 at 12:00 pm #

    Hello, thank you for the tutorial. If you could help me: it’s not working for me from the beginning, and this is the error message:
    Traceback (most recent call last):

    ModuleNotFoundError: No module named ‘scipy’

    • Adrian Rosebrock October 29, 2018 at 1:39 pm #

      You need to install SciPy on your system:

      $ pip install scipy

  68. Gordon October 29, 2018 at 2:04 am #

    Hello Adrian, I have tried this with a video of passengers entering a bus and the results are not really good. Is there any way I can improve the accuracy? Should I fine-tune the model with my own data? If so, is there any tutorial or material I can refer to for fine-tuning the model? Thanks a lot.

    • Adrian Rosebrock October 29, 2018 at 1:16 pm #

      Hey Gordon — what specifically do you mean by the result not being good? Are people not being detected? Are the trackers themselves failing?

      • Dheeraj October 31, 2018 at 5:32 am #

        How do I increase the accuracy of the people count and filter out false detections?

        • Adrian Rosebrock November 2, 2018 at 7:39 am #

          I would suggest training your own object detector on overhead views of people. The object detector we are using here was not trained on such images.

          • Prashant Bansod April 11, 2019 at 6:35 am #

            @Adrian Thank you very much for your nice tutorial. I have found a GitHub repository where the object detector is trained on overhead views. But the thing I am not sure about is how to give the input to the function net = cv2.dnn.readNetFromDarknet(configPath, weightsPath) when the weights are in .weights format. I would appreciate any advice. Thank you. The repository I am referring to is:

          • Prashant Bansod April 11, 2019 at 6:38 am #

            @Adrian Sorry, I made a mistake in the previous comment. The thing I am not sure about is how to give the input to the function net = cv2.dnn.readNetFromDarknet(configPath, weightsPath) when the weights are in .h5 format.

          • Adrian Rosebrock April 12, 2019 at 11:29 am #

            Thank you for sharing the model, Prashant. I don’t have any experience with that particular model. I’ll check it out and potentially write a blog post about it.

      • Gordon November 1, 2018 at 10:40 pm #

        Hello Adrian, the people are not being detected.

  69. Sohib October 30, 2018 at 5:59 am #

    Above all, I would like to thank you for these efforts you have been doing ever since you first started sharing these awesome posts.

    Now, to the technical part.

    You used MobileNetSSD_deploy.prototxt and MobileNetSSD_deploy.caffemodel as your deep learning CNN model and weights, respectively, right?

    And you only considered the “person” class among the classes available.
    Would it be possible to fine-tune the model you used for, say, objects like glasses (that people wear)?
    It looks like it was trained in Caffe, so could you share your insights on how to train this model for our custom objects? In that case we would be able to exclude non-trackable objects while fine-tuning it. Thanks again!

    • Adrian Rosebrock November 2, 2018 at 8:26 am #

      Yes, you could absolutely fine-tune the model to predict classes it was never trained on. I actually discuss how to fine-tune and train your own custom object detectors inside Deep Learning for Computer Vision with Python.

  70. git-scientist November 9, 2018 at 4:24 am #

    I have tried this awesome code with my own video. The movement is very similar to the one you had. In my video, however, I encountered some small errors. Let me post them below, and if you, Adrian, or others could help modify the relevant parts of this code, I’d appreciate it.

    1. The up-counter is incorrectly increased (UP is incremented by 1) as soon as an object is detected in the upper half of the horizontal visualization line, and then, if that object moves down, the down-counter remains the same (though it should increment by 1).

    2. The tracker ID is sometimes lost (on the upper edge of the frame, when the trackable object moves a bit horizontally in the upper half of the line) even though the object is still in the frame. This causes one object to be identified twice.

    Thank you in advance to all who try 😃

    • Adrian Rosebrock November 10, 2018 at 10:01 am #

      1. That is a feature, not a bug. In the event that a person is missed when they are walking up they are allowed to be counted as walking up after (1) they are detected and (2) they demonstrate they are moving up in two consecutive frames. You can modify that behavior, of course.

      2. You may want to try a different object tracker to help increase accuracy. I cover various object trackers in this post.

  71. Henry November 12, 2018 at 1:52 pm #

    First of all, thanks a lot for this post. Very helpful. I can run your code for no problem.

    I am trying to connect the people detected at the detection step with the ID assigned at the tracking step. Any idea how to do that? A brute-force approach I can think of is to match the centroid of each person found in the detection step with the centroids of the IDs in the tracking step. Any better solution?



  72. Niko Gamulin November 13, 2018 at 11:50 am #

    Thanks for a great post, Adrian!

    I have tried to use the people counter on video from an actual store. First, I tried to input the frames as they are, without rotation, and the model performed really poorly. Then I tried to rotate the image and it performed a little better.
    Has the model for detecting people been fine-tuned with a dataset that contains images from that perspective, or did you use a pretrained model without additional fine-tuning? I am asking because I can’t intuitively find an explanation for such a difference in accuracy when the frames are rotated.
    Also, if you have fine-tuned the model (this or any other), it would be helpful if you could provide any info about the size of the fine-tuning dataset. I am planning to fine-tune the model in order to detect people occluded behind the exhibited objects, as they obviously affect the prediction accuracy of the model out of the box.

    • Adrian Rosebrock November 13, 2018 at 4:13 pm #

      The model was not fine-tuned at all; it’s an off-the-shelf pretrained model. Better accuracy would come from training a model on a top-down view of people. As far as tips, suggestions, and best practices when it comes to training/fine-tuning your own models, I would suggest taking a look at Deep Learning for Computer Vision with Python, where I’ve included my suggestions to help you successfully train your models.

  73. interntss November 14, 2018 at 8:11 pm #

    Hi everyone, can anyone explain how I can determine the minimum system requirements for this particular project? For example, I built an .exe file out of the .py file and I am able to run it on any PC now, but I would like to know what the minimum system requirements are.

    FYI, the .exe file is ~500MB, and when I use it, task manager shows:
    40 % CPU;
    800MB Memory;
    4.1 Mbps disk usage.

    (where my CPU is Intel i7-8700 @ 3.20GHz, RAM is 32 GB ).

    Anyway, how could one possibly calculate the system requirements for a project to run seamlessly?
    Thank you!

  74. Marcelo November 16, 2018 at 7:08 am #

    Hi Adrian! thanks for this tutorial!

    One noob question: why do you use np.arange to loop over the detections (Line 124) instead of range?

    • Adrian Rosebrock November 19, 2018 at 12:51 pm #

      No real reason — I just felt like using it.

  75. shen November 18, 2018 at 7:06 am #

    Hi Adrian, first of all thank you so much.
    Do you mind providing extra code to turn on an LED when the numbers of people going up and down are not equal, and turn it off when they are equal?
    Thanks in advance

    • Adrian Rosebrock November 19, 2018 at 12:33 pm #

      I’m happy to provide the current source code for free so you can learn from it and build your own applications; however, I do not make modifications to it. I request that if I put out the information for free that others build with it, hack with, enjoy it, and engineer their own applications. The code is here for you to use and learn from — now it’s your turn 🙂

  76. bill November 18, 2018 at 9:17 am #

    Hey Adrian, thanks a lot for the guidance.
    If I want to combine your QR code tutorial with this counter tutorial and run them as one, do you have any idea how to make this happen? Thanks.

    • Adrian Rosebrock November 19, 2018 at 12:33 pm #

      You want to detect people and QR codes in the same application? Could you explain a bit more of what exactly you’re trying to accomplish?

      • bill November 19, 2018 at 8:35 pm #

        Hey Adrian, thanks for the reply. I am trying to make an entrance system which requires people to scan a QR code (which stores the data in a MySQL database whenever they enter a room) while at the same time using a counter to monitor the number of people going in and out of the room.

        • Adrian Rosebrock November 20, 2018 at 9:15 am #

          That’s absolutely doable. How much experience do you have with programming and computer vision? Based on your previous comment I think you may just be getting started learning computer vision. If so, make sure you read through Practical Python and OpenCV to help you learn the basics. You’ll need basic programming experience, though, so if you don’t have any, make sure you take the time to learn Python first.

  77. clarence chhoa hua sheng November 19, 2018 at 1:46 am #

    Hey Adrian! How do I link this with my Pi camera for live streaming? How do I use the VideoStream class you made?

    • Adrian Rosebrock November 19, 2018 at 12:27 pm #

      If you’re new to working with the VideoStream class make sure you read this tutorial.

  78. BAE November 20, 2018 at 3:32 am #

    Hi Adrian thank you for this tutorial!
    How can I get ‘pyimagesearch’ module??

    • Adrian Rosebrock November 20, 2018 at 6:05 am #

      Use the “Downloads” section of this post to download the source code (including the “pyimagesearch” module), machine learning models, and example videos.

  79. Gagandeep November 22, 2018 at 3:00 am #

    Dear Adrian

    It works fine if the person’s speed is slow, but if the person moves fast it is not able to detect them. Here are my questions:

    1. What changes need to be made so it works more accurately?

    2. You set the maxDistance variable to 50; what is the use of this variable?

    And thanks for sharing your wonderful knowledge; it’s really helpful for us.

    • Adrian Rosebrock November 25, 2018 at 9:34 am #

      You would want to run the actual object detector at a faster rate. For speed we only run it occasionally but for faster moving updates you’ll want to run it more often.

  80. NIlton November 25, 2018 at 8:31 am #

    Hi, how do I use this with a network cam?

    • Adrian Rosebrock November 25, 2018 at 8:47 am #

      You can use my VideoStream class which will allow you to access your webcam. I unfortunately do not have any tutorials for using a network cam though.

  81. Zoom November 25, 2018 at 9:02 am #

    Thank you Adrian! is it practice to detect people in real time video?

    • Adrian Rosebrock November 25, 2018 at 9:45 am #

      I’m not sure what you mean by “is it practice to detect people”, could you please clarify?

  82. clarence chhoa November 25, 2018 at 11:17 am #

    Does this work with the Pi? Is there a tutorial from you on people counting with the Pi?

    • Adrian Rosebrock November 26, 2018 at 2:34 pm #

      You can use this tutorial with the Pi but I would recommend swapping in a faster object detector. Deep learning object detectors are too slow on the Pi. HOG + Linear SVM or Haar cascades would be your best bet.

      • Mathews P Jacob November 29, 2018 at 3:54 am #

        Actually, how can I use a Haar cascade model and classifier with this program?

        • Adrian Rosebrock November 30, 2018 at 8:59 am #

          You can use this pre-trained Haar person detector, but I honestly don’t think it will work well. You would want to train your own Haar cascade on top-down views of people.

  83. Mathews P Jacob November 28, 2018 at 5:47 am #

    How can I count objects (e.g. balls) in a video using this method? Can you please help?

    • Adrian Rosebrock November 30, 2018 at 9:15 am #

      What type of ball? Sports ball? Any arbitrary ball? The more details you can share on the project the more likely it will be that I can point you in the right direction.

  84. Jeraldina November 29, 2018 at 3:04 pm #

    Hello! A question: which OpenCV version did you use, and on which operating system?

    • Adrian Rosebrock November 30, 2018 at 8:52 am #

      I used OpenCV 3.4.2 for this tutorial. I use both macOS and Ubuntu.

  85. TwelveW November 30, 2018 at 12:18 pm #

    Hello, I'm getting an error when running: "ImportError: No module named scipy.spatial"

    • Adrian Rosebrock December 4, 2018 at 10:19 am #

      You need to install the SciPy library:

      $ pip install scipy

      • TwelveW December 6, 2018 at 7:59 am #

        Hi, I'm installing it on a Raspberry Pi and the install gets stuck halfway. Is there any way I can install SciPy?

        • Adrian Rosebrock December 6, 2018 at 9:26 am #

          First make sure you’re using my OpenCV install guide to install OpenCV. From there I also suggest compiling with only a single core to ensure your Pi doesn’t get locked up.

  86. Pedro Perez November 30, 2018 at 11:25 pm #

    Hello Adrian, thank you for the tutorial. Could you help me? It has not been working for me from the beginning, and this is the error message:

    ImportError: No module named scipy.spatial

    • Adrian Rosebrock December 4, 2018 at 10:18 am #

      You need to install the SciPy library:

      $ pip install scipy

  87. PPC December 7, 2018 at 4:41 am #

    Hi Adrian,
    Love your work. One quick question: the model you used is trained to detect multiple classes. How can I make your code detect multiple classes? Like detecting "car" and "bus" movements.

    I tried changing the line: if CLASSES[idx] != "person":
    But it didn't work when I put multiple classes.

    • Adrian Rosebrock December 11, 2018 at 1:04 pm #

      There are a number of ways you can programmatically achieve the desired change. Try the following:

      if CLASSES[idx] not in ("person", "bus"):
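
      For context, here is a hypothetical standalone sketch of the same class-whitelist idea. The (class_index, confidence) tuple format below is a simplification of the SSD output array used in the post:

```python
# Hypothetical helper: keep only detections whose class label is in a
# whitelist AND whose confidence passes a minimum threshold.
def filter_detections(detections, classes, allowed, min_conf=0.4):
    """`detections` is an iterable of (class_index, confidence) tuples,
    a simplified stand-in for the SSD output array."""
    kept = []
    for idx, conf in detections:
        if conf < min_conf:
            continue  # detection too weak
        label = classes[idx]
        if label not in allowed:
            continue  # not a class we care about
        kept.append((label, conf))
    return kept
```

      The `not in` membership test does the same filtering as the one-liner above; everything else mirrors the confidence check already in the driver script.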

  88. sset December 11, 2018 at 4:08 am #

    Thanks Adrian for good article.
    I need to count number of vehicles crossing a particular boundary post. What will be best technique?

    • Adrian Rosebrock December 11, 2018 at 12:35 pm #

      I would start by considering how you are detecting the boundary post. Is that something you can pre-label and know the coordinates of before you start the script? Or must the boundary post be detected automatically?

  89. sset December 12, 2018 at 5:16 am #

    Hi – Boundary post can be check post. As you mentioned
    (a) Needs to be detected automatically – this is correct
    (b) Preset co-ordinates – unlikely
    Please suggest any suitable approach.

    • Adrian Rosebrock December 13, 2018 at 8:54 am #

      I would suggest training a dedicated object detector such as HOG + Linear SVM or Faster R-CNN, SSD, or YOLO to automatically detect your boundary post. If you’re interested in learning how to train your own custom object detectors you’ll want to refer to my book, Deep Learning for Computer Vision with Python.

  90. sset December 18, 2018 at 6:48 am #

    Thanks for articles.

    (a) For frame numbers satisfying the skip-frames condition, we invoke the object detection model (for instance R-CNN or YOLO), the dlib correlation tracker is initialized, and `trackers` is populated with rectangles.

    (b) For frame numbers not satisfying the skip-frames condition, we do a tracker update to get the new positions of the objects detected in (a). But what happens if new objects appear in the frames covered by condition (b)? They will not be detected by the object detection model and will not be tracked?

    • Adrian Rosebrock December 18, 2018 at 8:45 am #

      Correct, if a new object appears during the frame skip you will not be able to detect or track them. You need to achieve a balance between the two for your own application.

  91. inirah December 21, 2018 at 12:29 am #

    How would you represent, using a heatmap, that people have been using one path more and another path less? Is it possible?

  92. David December 21, 2018 at 1:13 am #

    Dear Adrian

    First of all, I have to say thank you very much for this wonderful tutorial.
    I am new to the Raspberry Pi, OpenCV, people counting, etc. I had no idea where to start at first,
    but your blog helped me a lot. I really appreciate it.

    I wish you could give me some advice on the people counting program.
    I successfully ran your code on my Windows OS (which has high specs) and a Raspberry Pi 3 Model B.
    However, when running on the Raspberry Pi, the video almost freezes; it takes a lot of time to process.
    I have read your comments on this page, and it seems people counting
    is very difficult to run with good performance on the Raspberry Pi.
    I just want to know: is it possible to use the Raspberry Pi (with a Pi camera) to do real-time people counting? I am purchasing the latest model, the Pi 3 Model B+, and will try to run the test.
    (I know that image processing requires high specs, and actually, as long as the people-count accuracy can reach around 80% on the Raspberry Pi, I am OK with that.)
    Any advice would be appreciated.

    • Adrian Rosebrock December 27, 2018 at 11:03 am #

      I don’t have any tutorials dedicated to building a people counter on a Raspberry Pi but I will be covering it in my upcoming Raspberry Pi + Computer Vision book. Stay tuned!

    • nach March 11, 2019 at 5:01 pm #

      Hey David, can you explain how you ran it on Windows, please?

  93. Aditya December 21, 2018 at 2:11 am #

    Hi Adrian,

    I want to write/record and save the output video displayed with imshow() (with the two lines, bounding boxes, and in/out count) locally. How do I do that?

  94. sset December 21, 2018 at 11:36 pm #

    If we reduce the number of frames to skip, say to one fourth of the frame rate, then it is highly likely that we reduce the number of objects that go undetected (of course with reduced performance, since object detection is invoked more often).
    Even if an object goes undetected during the frame skip, it is likely that the object will be detected in subsequent frames (since the counted flag is still not set). I have observed this because some objects are detected after crossing the ROI line (at least a few frames later); please confirm. Thanks.

  95. sset December 21, 2018 at 11:39 pm #

    What happens if object has suddenly stopped moving – I guess once it is counted – flag will be set and even if it is stationary in subsequent frames will not be counted – correct?

    • Adrian Rosebrock December 27, 2018 at 10:54 am #

      You are correct.

  96. khassans December 24, 2018 at 2:16 am #

    Thanks for the tutorial & code.
    but there is an issue:

    net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])
    AttributeError: module ‘cv2’ has no attribute ‘dnn’

    how can I fix this?

    can I use live ip camera to capture & process real time video?

    It’s an urgent issue.

    • Adrian Rosebrock December 27, 2018 at 10:43 am #

      It sounds like you are using a very old version of OpenCV. Make sure you follow my OpenCV install guides to obtain a newer OpenCV version.

  97. Mr. JAMIL December 24, 2018 at 5:34 am #


    I need to know if this code can run well on a Pi Zero with a camera.


    • Adrian Rosebrock December 27, 2018 at 10:39 am #

      No, the Raspberry Pi Zero is far too underpowered to run the code in this tutorial.

  98. Khai December 25, 2018 at 8:31 am #

    Hi Adrian, I'm new to image processing and OpenCV. I want to create a project that counts the number of cars crossing a line. I changed "person" to "car" and it worked perfectly with video containing actual cars. But when recreating the prototype I want to use a toy car; however, it can't detect the toy as a car. Do I need to find another detector?

    • Adrian Rosebrock December 27, 2018 at 10:27 am #

      The model used in this post was not trained to detect toy cars. You will need to train your own custom object detector. I discuss how to train your own custom object detectors (including code) inside Deep Learning for Computer Vision with Python.

      • Khai January 7, 2019 at 7:41 am #

        Hello, I already solved the "car" problem. Now I want to edit the code to count cars crossing from left to right instead of up/down. I already edited line 204 to use the x axis, but it still counts up/down. Did I miss anything?

        • LK January 14, 2019 at 3:39 am #

          Hi Adrian,

          I also faced the same problem described above; is there any solution? I have edited all the related variables; is there anything I have missed?

          • LK January 14, 2019 at 3:48 am #

            Just to add on to the question: I rotated the example_02.mp4 video in the output directory to detect left-to-right instead of up-down. It is able to detect only one person in total moving right to left in that video. I'm looking for help with this issue; thanks in advance.

  99. Saeed Koochaki Kelardeh December 28, 2018 at 4:43 pm #

    You are awesome, Adrian. You are definitely a great person with a high degree of proficiency. Very good luck.

    • Adrian Rosebrock January 2, 2019 at 9:37 am #

      Thank you for the kind words, Saeed.

  100. sset January 3, 2019 at 6:10 am #

    Hi Adrian
    "maxDistance": is it the Euclidean distance between the centroids of two objects? And is this distance measured in pixels?

    • Adrian Rosebrock January 5, 2019 at 8:50 am #

      Correct, that distance is measured in pixels.

  101. sset January 3, 2019 at 6:33 am #

    Hi Adrain,
    Can the concept of 'up'/'down' be used for vehicle lane crossing?

    • Adrian Rosebrock January 5, 2019 at 8:49 am #

      Yes, just change Line 137 to check for a "vehicle" class rather than "person".

  102. Mohammed Jaddoa January 7, 2019 at 7:59 pm #

    Thank you Adrian, it's a really amazing article; thank you very much for your effort and for sharing the code with us.
    If you don't mind, I have two questions.
    First, what if a detected person shows again in view? For example, a person with ID=1 goes back in the same direction or the other direction; does the ID change or keep the same number (ID=1)?
    Second, can I apply it in real time, and is it slow or fast?

    • Adrian Rosebrock January 8, 2019 at 6:43 am #

      1. I’m not sure what you mean here, could you elaborate?
      2. Yes, this algorithm can run in real-time on a laptop/desktop.

  103. Phung Gia Khiem January 10, 2019 at 3:56 am #

    Hello Mr.Adrian Rosebrock
    Could you modify this example to be compatible with the Movidius NCS?
    Many thanks.

    • Adrian Rosebrock January 11, 2019 at 9:39 am #

      I will be doing exactly that in my upcoming Raspberry Pi + Computer Vision book 🙂

      • Phung Gia Khiem January 14, 2019 at 1:24 am #

        Great news! When will you release that book?

        • Adrian Rosebrock January 16, 2019 at 10:00 am #

          Right now I am targeting Q3 of 2019.

  104. Saeid January 10, 2019 at 7:24 am #

    Brilliant work.
    I've got a question:
    how can I set my video source to an IP camera, say with an RTSP URL?
    I tried cv2.VideoCapture; it shows the video frame, but nothing else works.

    • Adrian Rosebrock January 11, 2019 at 9:38 am #

      Sorry, I don’t have any tutorials on RTSP URL streaming but I will be covering it in my upcoming book. Stay tuned!

      • Saeid January 13, 2019 at 3:51 am #

        Another thing:
        I have an issue while playing the 1-minute example video; the whole video plays in 35 seconds, so it looks like the playback speed is faster than usual. Do you have anything in mind why this might be happening?
        Thank you very much.

        • Adrian Rosebrock January 16, 2019 at 10:09 am #

          The goal of OpenCV is real-time image and video processing. OpenCV doesn't care what FPS the video was recorded at; it simply wants to process the video frames as fast as possible. OpenCV + the people counting algorithm is fast enough that it processes the video faster than the original FPS. That's actually a good thing.

    • Arfan February 28, 2019 at 5:12 am #

      Just pass your IP address with the username and password, like below:

      vs = VideoStream(src="rtsp://username:password@ip_address")

  105. kumar January 10, 2019 at 8:05 am #

    AttributeError: module ‘cv2’ has no attribute ‘dnn’

    • Adrian Rosebrock January 11, 2019 at 9:37 am #

      Your OpenCV version is too old. You need OpenCV 3.4 or greater for this tutorial. See this page for my list of OpenCV install guides.

  106. P Tinsley January 10, 2019 at 7:23 pm #

    Hey Adrian — thanks for all the tutorials; they’re immensely helpful!
    I’m wondering if you have a blog on face detection and tracking using the OpenCV trackers (as opposed to the centroid technique). Thanks!

  107. Anthony January 14, 2019 at 12:32 am #

    Can I use YOLOv3?

  108. Benny W January 14, 2019 at 10:57 am #

    Hi Adrian! Excellent post, as always!
    One question: is it possible to not increment the counter when the same person first enters, then leaves, then enters back again in the department store? Something like a “unique visitor counter” kind of algorithm. Thanks!

    • Adrian Rosebrock January 16, 2019 at 9:49 am #

      Yes, you could do that but that would require quantifying the face of the person. See this tutorial on face recognition for more info.

      • Benny W January 17, 2019 at 6:34 am #

        For scenarios that imply privacy concerns, quantifying the face of the person is prohibited. Is there a way to quantify the body of the person instead: basically everything except the face (clothes, accessories, shoes, etc.)?

        • Adrian Rosebrock January 22, 2019 at 9:50 am #

          If you cannot quantify the face you should look into “gait recognition” (i.e., how someone walks). Gait recognition is more accurate for person identification than even face recognition!

  109. Saeid January 16, 2019 at 5:39 am #

    Hi There
    I have been wondering how I can optimize your code's performance, and I think using OpenCV for loading videos and grabbing frames might be quite resource-consuming. Do you know any other libraries I can use for this part? FFMPEG is quite awesome for this kind of thing, but there are too many wrappers written for Python, and on the other hand we can use FFMPEG commands directly in Python. I am a bit confused; do you have anything in mind about this?
    I really appreciate it

  110. Yashwant ram January 21, 2019 at 2:02 am #

    Hey, I would like to use Faster R-CNN or YOLO v2 to increase the accuracy. Can you provide the weights and the .pb files and guide me, please?

  111. Minh Tri January 22, 2019 at 5:02 am #

    Hi Adrian!
    Thank you for your good tutorial
    I tested your code and it seems to run well. But when I adjust skip-frames to a smaller value (5, for example), it seems the counter no longer works well.

    Second, in your code the "counted" flag should be moved up, after the else, because once an object is marked as counted we don't need to calculate its direction.

  112. niek January 22, 2019 at 3:21 pm #

    Hi Adrian,

    After waiting for an hour to install scipy, it ended up with all kinds of errors.

    Is there any other way to install scipy? I have followed the tutorial many times over and over.

    I'm using a Raspberry Pi 3 B (2015 model),
    Python for OpenCV development,
    OpenCV 4.0 for computer vision.

    • Adrian Rosebrock January 25, 2019 at 7:35 am #

      Try installing SciPy via:

      $ pip install scipy --no-cache-dir

      It will take a while to install but it should work.

  113. Gordon January 24, 2019 at 12:47 pm #

    Hi Adrian.

    Thanks for the tutorial.
    I tested the code with a caffemodel trained on my own dataset (around 1.5k images, labeled with bus passengers' heads walking through the entrance door). But with this model the tracking works very poorly, which I assume could be due to the model's poor detections (approximately 30% average confidence, and it sometimes fails to detect in consecutive frames). I would like to ask: is there any way to improve the accuracy of the model? Should I train with more data, or is there anything else I can do?

    Million thanks in advance.

    • Adrian Rosebrock January 25, 2019 at 6:54 am #

      It sounds like a problem with your model itself. With 30% average confidence and many failed detections it’s likely that the detector is the problem. I would suggest working on building a better object detector. If you need help training an object detector I would suggest you refer to Deep Learning for Computer Vision with Python where I discuss how to train accurate object detectors in detail.

  114. Summa January 29, 2019 at 1:08 am #

    Hi Adrian,

    First of all thanks a lot for this much needed tutorial on object tracking.

    However, I have some queries I thought of discussing with you:
    – Is it possible to do object tracking using TensorFlow instead of the Caffe framework?
    – I am facing a lot of issues while setting up the Caffe framework on my CPU-based Ubuntu system, and it is necessary to first set up the framework in order to later train your own objects for your specific requirements.
    – Thus I was wondering if object tracking can be done with TensorFlow models (as I have TensorFlow installed on my system) and, if yes, what changes are needed in the codebase you shared in your tutorial.

    Your advice would be very helpful for my situation.

    Thanks in advance,

    • Adrian Rosebrock January 29, 2019 at 6:31 am #

      1. Yes, just load your TensorFlow model instead of the Caffe one. It will require updates to the code since the TensorFlow model needs to be loaded differently and a forward pass likely performed differently (depending on how you exported the model) but that’s something you can resolve.

      2 and 3. Caffe can be a bit of a pain. Inside Deep Learning for Computer Vision with Python I teach you how to train your own custom Faster R-CNN, SSD, and RetinaNet object detectors using TensorFlow and Keras. It also includes my code base for object detection in images and video. I would suggest starting there.

  115. Wanderson January 31, 2019 at 4:23 pm #

    Hi Adrian
    How many training and test samples were used for the caffemodel?

    • Adrian Rosebrock February 1, 2019 at 6:42 am #

      The model was pre-trained on the COCO dataset, you can refer to the COCO dataset for more information on training data.

  116. Sajjad February 6, 2019 at 2:47 pm #

    Hello Adrian,

    I have already read some tutorials on your website, but they are only for long distances (when there is a long distance between the camera and people and the camera captures a vast space).
    I am looking for code suitable for situations where there is a short distance between the camera and people, as the links below do not work well under circumstances where the camera cannot capture a large space.
    I am looking forward to your answer.


  117. Vasilis February 22, 2019 at 6:58 am #


    I have a problem with dlib. PyCharm doesn't let me install it, and when I try to download it, it always clashes with my Python version or something else. Any ideas?

  118. Viraj February 22, 2019 at 11:50 pm #

    This was amazing, Adrian!!!
    Can you suggest a camera for this project

    • Adrian Rosebrock February 27, 2019 at 6:24 am #

      The camera you choose should be based on your environment. Do you need auto-focus? Is the camera supposed to work in both day and night? Consider your environment first, then choose a camera.

  119. Yadnyesh February 24, 2019 at 11:48 pm #

    Hey Adrian,
    I have a question
    I want to divide the frame into 4 quadrants and locate the detected object, I want to know in which quadrant does it lie
    Please guide me in the right direction
    Thank you in advance

    • Adrian Rosebrock February 27, 2019 at 6:01 am #

      To start, simply detect the people in the image and compute their bounding box coordinates. Then compute the center of the bounding box. Since you know (1) the center of their bounding box and (2) the dimensions of the frame you can then determine which quadrant they are in.
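
      That last step can be sketched in plain Python (the function and variable names here are made up for illustration):

```python
def quadrant(cx, cy, frame_w, frame_h):
    """Return which quadrant (1-4) contains the point (cx, cy).

    Numbering: 1 = top-left, 2 = top-right,
               3 = bottom-left, 4 = bottom-right.
    """
    col = 0 if cx < frame_w / 2 else 1
    row = 0 if cy < frame_h / 2 else 1
    return 1 + col + 2 * row

# Center of a bounding box (startX, startY, endX, endY):
startX, startY, endX, endY = 400, 300, 500, 420
cX = (startX + endX) // 2
cY = (startY + endY) // 2
# quadrant(cX, cY, 640, 480) -> 4 (bottom-right of a 640x480 frame)
```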

  120. Di February 26, 2019 at 3:18 am #

    I encountered an error while running the program, “argument -p/–prototxt is required”. I think it was not set to the path of Caffe “deploy” prototxt file. I do not know how to solve the problem to run the program successfully.I just came into contact with it, so maybe my question is naive.

    • Adrian Rosebrock February 27, 2019 at 5:43 am #

      The problem is you are not setting your command line arguments properly. Read this tutorial first and then you’ll understand how to use them.

  121. Aaron February 26, 2019 at 11:46 pm #

    Hey Adrian,
    Great post
    I have a doubt though
    What if I have to draw the line vertically and then count whether the detected object is on the right-hand side or the left?
    What changes do I have to make to the code?

    • Adrian Rosebrock February 27, 2019 at 5:31 am #

      See the comment thread with Harsha. I’ll also be covering that question in detail inside my upcoming Computer Vision + Raspberry Pi book.

      • Aaron February 27, 2019 at 9:24 pm #

        Thank you for the reply
        I made the changes and it works great
        I have another doubt
        What if I have to count the total objects detected in the frame,
        irrespective of them crossing the line?

        • Adrian Rosebrock February 28, 2019 at 1:42 pm #

          You would loop over the detected objects and increment an integer each time an object passes the minimum confidence/probability threshold.
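
          A minimal sketch of that loop, with a plain list of confidences standing in for the detections array:

```python
def count_confident(confidences, min_conf=0.4):
    """Count detections whose confidence exceeds the threshold."""
    total = 0
    for conf in confidences:
        if conf > min_conf:
            total += 1
    return total
```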

  122. Rheza Aditya February 27, 2019 at 10:37 am #

    Hello Adrian

    This is a really great post. I have one question.
    Suppose I want to create a program to measure the capacity of a road in real time. How can I utilize the dlib package, or should I use something else?

    Thank you in advance.

    • Rheza Aditya February 27, 2019 at 10:41 am #

      Road capacity here is a percentage, so that I can see at what percentage of its capacity the road is.

      • Adrian Rosebrock February 28, 2019 at 1:51 pm #

        You would use semantic segmentation which will return a pixel-wise mask, assigning each pixel of an input image to a class. You can then count the number of pixels that belong to a certain class and derive your percentage.
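
        As a toy sketch of the pixel-counting step, assuming the segmentation network has already produced a 2D class-ID mask (represented here as a plain list of rows, with hypothetical class IDs):

```python
def class_percentage(label_mask, class_id):
    """Percentage of pixels in a 2D class-ID mask assigned to class_id."""
    total = 0
    hits = 0
    for row in label_mask:
        for pixel in row:
            total += 1
            if pixel == class_id:
                hits += 1
    return 100.0 * hits / total if total else 0.0

# e.g. 0 = background, 1 = "road" (hypothetical IDs)
mask = [[0, 1],
        [1, 1]]
# class_percentage(mask, 1) -> 75.0
```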

  123. hanumant March 1, 2019 at 12:48 am #

    Hello Adrian
    I would like help with counting people standing in a queue at a shop billing counter.

    • Adrian Rosebrock March 1, 2019 at 5:25 am #

      You can simply apply the object detector covered here. Loop over the detections, check to see if the minimum probability/confidence is reached, and then increment a counter.

  124. Lyron March 9, 2019 at 8:58 pm #

    Dear Adrian

    Once again, fantastic post. My journey with opencv and deep learning has started with you and I am enjoying every moment of it. I would really love to learn to count automobile traffic with OpenCV and I’m looking forward to your next blog post.

    Thank you

    • Adrian Rosebrock March 13, 2019 at 3:45 pm #

      Thanks Lyron, I really appreciate the kind words. I’ll absolutely be covering traffic counting in my upcoming Computer Vision + Raspberry Pi book. Stay tuned!

  125. nach March 11, 2019 at 4:58 pm #

    Is there a way to run this on Windows?

    • Adrian Rosebrock March 11, 2019 at 5:51 pm #

      You can certainly run it on Windows, just make sure you have the proper prerequisites installed. You should have OpenCV and dlib configured and installed. I would suggest starting there.

  126. Karl March 12, 2019 at 11:04 pm #

    This is extremely helpful. Is there a way to send an mjpeg stream as the video source instead of a file or the default webcam?

    • Adrian Rosebrock March 13, 2019 at 3:13 pm #

      Yes, but you need to specify the IP address of the camera (and potentially other parameters) to the “src” of the VideoStream.

      • Karl March 13, 2019 at 7:31 pm #

        I changed the script to read:
        vs = VideoStream(src=””).start()

        and it returned with an error of:
        OpenCV(3.4.1) Error: Assertion failed (scn == 3 || scn == 4) in cvtColor

        It sounded like that can sometimes mean the libjpeg module wasn't installed at compile time; however, I did "sudo apt-get install libjpeg-dev" before compiling.

        Maybe there’s an mjpeg module I’m missing too? Curious if you had any insight.

        • Adrian Rosebrock March 19, 2019 at 10:30 am #

          No, the error is because the object returned from the stream is “NoneType”, implying that a frame could not be successfully read from the stream.

  127. Elena March 17, 2019 at 2:08 pm #

    Hi Adrian,
    Could you please tell me how can modify the detection part to put bounding boxes and confidence level as long as Id numbers(counting them)?
    like this post but with counter

    • Adrian Rosebrock March 19, 2019 at 10:06 am #

      You can use the "cv2.rectangle" and "cv2.putText" functions to draw bounding boxes and integer IDs. If you are new to computer vision and image processing I would suggest reading Practical Python and OpenCV so you can learn the basics.

      • Elena March 23, 2019 at 6:37 pm #

        Thank you,
        Yes, I am new to OpenCV, just started reading your books.
        Could you please also guide me on how I can change the detection part to YOLOv3? I want the confidence level, the name of the object, as well as the bounding boxes.

        • Elena March 23, 2019 at 6:42 pm #

          Sorry, I forgot to mention: I want it to be real time. The DNN module doesn't have access to the GPU. How can I count in real time, though?

          • Adrian Rosebrock March 27, 2019 at 9:15 am #

            While I’m happy to provide these tutorials for free I cannot modify the code for you. The blog posts explain the code rigorously, the code itself is well documented as well. I would suggest you download the code and start playing with it. Building your own projects is the best way to learn. I believe in you Elena, good luck! And if you need more help you can find me inside the PyImageSearch Gurus course.

  128. Petar March 19, 2019 at 2:26 pm #

    Hey, thanks for the tutorial. I have a question which is a bit off topic. I am working on a project where I detect animals in a video, and I want to be able to identify them so I can track whether they leave a certain area and for how much time they stay out of it. Individual animals are not that important, and it is fine for me if IDs are switched. Centroid tracking was fine up to the point where I started checking whether the animals are outside of a certain area by testing whether their centroid is in the area. Unfortunately, this is not good enough for me, and I would also like to keep track of the startX, startY, endX, endY coordinates of their bounding boxes. Maybe it is obvious, but I cannot figure out how to do this with your implementation of centroid tracking. Can you give me some advice on how to do it, if it is possible? Additionally, is there another way I can use to identify objects? I can't seem to find much on the topic of identification on the web.

    • Adrian Rosebrock March 22, 2019 at 9:52 am #

      How is the “area” defined? Is the area within view of the camera? Or is it out of view? It would likely be easier to provide a suggestion if you had an example video of what you’re working with.

      • Petar March 23, 2019 at 5:26 am #

        I don't know how to send you an image over here, but I will try to explain a frame of the video as well as possible. The camera that shot the video is mounted on top of a microscope, and it films bacteria in a petri dish, so the area I am talking about is circular. I think I managed to save the coordinates of the bounding boxes by registering objects in the centroid tracker with their bounding-box coordinates as well as their centroid coordinates; then I slice inputCentroids and objectCentroids to get the centroid coordinates and compare the distance between them.

        • Adrian Rosebrock March 27, 2019 at 9:20 am #

          I think I understand the problem now. Your goal is to maintain a history of all their bounding box coordinates, correct? If so, all you need is Python’s “deque” data structure. This tutorial will show you how to use it. It uses centroids rather than bounding boxes but you can update it to use the bounding boxes.
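
          A minimal sketch of that idea, using a hypothetical variant of the tutorial's TrackableObject that stores bounding boxes in a deque:

```python
from collections import deque

class TrackedObject:
    """Hypothetical tracked object keeping the last `maxlen`
    bounding boxes (startX, startY, endX, endY) in a deque."""

    def __init__(self, object_id, box, maxlen=64):
        self.object_id = object_id
        self.boxes = deque(maxlen=maxlen)  # oldest boxes fall off the left
        self.boxes.append(box)

    def update(self, box):
        self.boxes.append(box)

    def latest(self):
        return self.boxes[-1]
```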

  129. clarence chhoa March 21, 2019 at 10:07 pm #

    Hello, I want to process this code on my desktop, but my camera on the Pi is set up another way; how can I just send the frames to my desktop? Any suggestion?

    • Adrian Rosebrock March 22, 2019 at 8:28 am #

      I’ll be covering that very question in my upcoming Computer Vision + Raspberry Pi book, stay tuned!

  130. amare mahtsentu March 24, 2019 at 8:19 am #

    I have a plan to get a copy of your book, like I got the Deep Learning Starter Bundle. I expect it all in one.

    • Adrian Rosebrock March 27, 2019 at 9:06 am #

      Thanks Amare 🙂

  131. amare mahtsentu March 24, 2019 at 8:32 am #

    Why do we need self.maxDisappeared to be 50? Isn't that too much? It is equivalent to 50 frames. Since the centroid of a lost object stays the same after it is lost, it has little chance of matching the current centroids, or the centroid may match the wrong object, which would be a problem (assigning the wrong object ID). Normally it has no effect on the counting; I checked.

  132. WITTOS March 26, 2019 at 2:31 am #

    Hello Adrian, I tried to use your code to count people in a custom video, but I noticed that the video is processed very fast. Can I reduce the processing speed?

    • Adrian Rosebrock March 27, 2019 at 8:44 am #

      OpenCV’s job is to process the frames as quickly as possible. You can “slow down” the processing by calling “time.sleep” in the body of the “while” loop.
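
      One way to sketch that throttling, assuming you know (or estimate) the video's original FPS; the helper name is made up:

```python
import time

def frame_delay(fps, processing_time):
    """Seconds to sleep after one frame so that the total per-frame
    time approximates 1/fps; 0 if processing already took longer."""
    return max(0.0, 1.0 / fps - processing_time)

# Inside the while loop, after processing a frame that took `elapsed`
# seconds, you could call:
#     time.sleep(frame_delay(25.0, elapsed))
```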

  133. Wilson Ting March 26, 2019 at 2:44 am #

    Hi, if I am live streaming video from the Pi at a remote location to my PC, what would I need to change in the source code to allow me to listen in to the IP of the remote camera that is doing its live stream?

    How can i configure this? “vs = VideoStream(src=0).start()” to accept the live stream directly and process it remotely?

    • Adrian Rosebrock March 27, 2019 at 8:43 am #

      I’ll be covering remote streaming in my upcoming Computer Vision + Raspberry Pi book, stay tuned!

  134. Bobby March 28, 2019 at 12:04 pm #

    How can you incorporate timestamps with each person ( the entry time into the frame)?

    • Adrian Rosebrock April 2, 2019 at 6:29 am #

      I would suggest modifying the "TrackableObject" class to include an "entry_time" instance variable. That variable should be set to the current timestamp when the object is first detected.
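
      As a sketch (the entry_time field is the addition; the rest mirrors the class described in the post):

```python
import time

class TrackableObject:
    """Sketch of the post's TrackableObject plus an entry timestamp."""

    def __init__(self, object_id, centroid):
        self.object_id = object_id
        self.centroids = [centroid]    # history of centroid positions
        self.counted = False           # has this object been counted?
        self.entry_time = time.time()  # set once, on first detection
```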

  135. Karl March 28, 2019 at 5:51 pm #

    I was able to get it to count people in real time using the example video, but it chokes when the video has more people. Like, for instance, this video … it's about 6 times slower than real time on a PC.

    Is it possible to speed things up even with video of a lot of people?

    • Adrian Rosebrock April 2, 2019 at 6:26 am #

      Try using this post where I show how to distribute the object trackers over multiple processes.

  136. Lukcy Blue April 1, 2019 at 9:54 am #

    Thanks for your sharing.
    I’m working with some other videos using your source code, but I’m not sure how I can make it better than now. Do I need to share my videos?

  137. Appus April 3, 2019 at 2:07 am #

    Hello Adrian,

    Thanks for the wonderful code. So far everything was going fine, but while running the code, when the lady just touches the yellow line the video closes. Is this a problem with the model or something else?

    • Adrian Rosebrock April 4, 2019 at 1:25 pm #

      That is quite strange. Is there an error printed to your terminal? Double-check your terminal output.

  138. DeathCoder April 3, 2019 at 3:42 pm #

    The current code uses the centroid position to check whether it’s above or below the bar. How do I get the rectangle (top, left, right, bottom) of the "to" object being compared instead? I’d prefer to check if the rectangle intersects the line, as the centroid may be above it. Thx

    • Adrian Rosebrock April 4, 2019 at 1:13 pm #

      I wouldn’t recommend trying to check if the centroid perfectly intersects the line. Due to quality of the video stream, artifacts, or simply a fast moving person and a slower processing pipeline, you cannot always guarantee the centroid will perfectly intersect the line.

      • DeathCoder April 5, 2019 at 9:30 am #

        I might not have been clear in my question. I am actually looking to use centroids to calculate direction, but use the “intersection” (I use shapely) of the object’s bounding box with the line that marks in vs. out. I solved the problem by storing the bounding boxes in the tracked objects as well during creation/update. Thanks!
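
        For an axis-aligned bounding box and a horizontal counting line, the intersection test described here doesn’t strictly require shapely; it reduces to a simple range check (a sketch, with the function name mine and the box in the (left, top, right, bottom) order mentioned above):

```python
def box_crosses_line(bbox, line_y):
    """Return True if a bounding box given as (left, top, right, bottom)
    intersects the horizontal counting line at y = line_y."""
    left, top, right, bottom = bbox
    # the box intersects the line exactly when line_y falls
    # within the box's vertical extent
    return top <= line_y <= bottom

# a box spanning y = 180..260 intersects a line at y = 200
print(box_crosses_line((100, 180, 150, 260), line_y=200))  # True
# a box entirely above the line does not
print(box_crosses_line((100, 50, 150, 120), line_y=200))   # False
```

        shapely becomes worthwhile once the counting line is not axis-aligned or you need partial-overlap areas; for a horizontal line this check is equivalent and dependency-free.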

  139. Eagle April 4, 2019 at 6:06 am #

    Hi, I might have overlooked it, but it would be great if you could help me with drawing rectangles around each person.

    I saw a post about this using cv2.rectangle and cv2.putText, but I can’t figure it out…

    Kind Regards

    • Adrian Rosebrock April 4, 2019 at 1:05 pm #

      It’s okay if you are new to the world of computer vision and OpenCV but if you need help learning the basics, such as using drawing functions, you’ll definitely want to read through Practical Python and OpenCV where I teach you the fundamentals. Definitely give it a look, I’m absolutely confident it will help you.

  140. Rakshith April 8, 2019 at 9:33 pm #

    Can this be made into a web page, so that if you click a button on the page, the output video is shown on that web page?

    • Adrian Rosebrock April 12, 2019 at 12:28 pm #

      You mean like stream the output of the script to a web page?

  141. Rakshith April 10, 2019 at 9:48 am #

    Thank you very much for creating this tutorial; I found it immensely helpful. It would also be very helpful to know how to change the writer so that I can create a web application where the output video is shown on a web page.

    • Adrian Rosebrock April 12, 2019 at 11:33 am #

      I am covering that very topic in my Raspberry Pi for Computer Vision book, stay tuned!

  142. Zaheer April 10, 2019 at 12:14 pm #

    Thank you for the post! I’m working with the same model as yours, but I’m getting false detections of people even though the code is the same as yours. Can you help me with the problem?

    I only changed the dlib.rectangle() call to cast its arguments to ‘int’, because without that I get a dlib type error.

  143. Alaa April 12, 2019 at 5:01 am #

    Hi Adrian,

    first of all, thanks for this nice implementation of the people counter.
    I would like to know why you used two object tracking algorithms, and whether we can use only one of them instead?

    thanks again

    • Adrian Rosebrock April 12, 2019 at 11:17 am #

      Sorry, I’m not sure what you mean by using “two object tracking algorithms”?

      • Bahi April 23, 2019 at 6:43 am #

        Dlib’s tracking algorithm and the centroid tracker are doing different things.

        1. Dlib’s tracking algorithm gives us bounding boxes (around detected objects).

        2. The centroid tracking algorithm’s role is to identify which bounding boxes belong to the same objects.

        • Adrian Rosebrock April 25, 2019 at 8:50 am #

          Thanks for jumping in there and clarifying, Bahi!

  144. prateti April 17, 2019 at 2:29 am #

    Hi Adrian,

    can i use a pi camera for this tutorial?
    ( slow fps is fine for my application)

    • Adrian Rosebrock April 18, 2019 at 6:43 am #

      Change Line 43 to:

      vs = VideoStream(usePiCamera=True).start()

  145. Nigel April 19, 2019 at 12:15 am #

    Hello, how do I hook this up to my Pi camera and use it instead of just loading the example videos?

    • Adrian Rosebrock April 25, 2019 at 9:31 am #

      Please refer to the post as I discuss that question already. Change:

      vs = VideoStream(src=0).start()

      to:

      vs = VideoStream(usePiCamera=True).start()

  146. Shivaranjini April 20, 2019 at 5:16 am #

    Hi Adrian,

    Thank you very much for the great article.

    I am facing the below problems in accurately detecting the people count:

    1. I am just trying this on my camera, and what I am observing is that if the detection of a person fails at the 30th frame for some reason, then the count is not accurate. It is missing a few people.

    2. When more people are moving together in a group, I also see a wrong count, as the detection framework fails to detect all the people in the group.

    3. When a person is idle for some time (standing still or sitting) and detection fails intermittently (identifying them sometimes and sometimes not), then the person ID keeps incrementing, resulting in a wrong count.

    All of the above problems are related to detection. Do you have any suggestions for them?

  147. Stefan Griffiths April 22, 2019 at 1:51 pm #

    Hi, I was wondering if there is a way I can show the video output on a website instead of in the window produced above.

  148. Martin S April 24, 2019 at 3:00 am #

    Hi Adrian,
    Thank you so much for this project. I am currently trying to use this program; however, I got an ArgumentError from dlib.rectangle. Can you please help me with this?

    • Adrian Rosebrock April 25, 2019 at 8:41 am #

      Sorry, without knowing the exact error I cannot provide a suggestion.

  149. Rafael April 25, 2019 at 8:57 am #

    Hi, from what I can see, when detection occurs, all trackers are discarded (trackers = []) and we start from scratch. Suppose a person is halfway through the scene when detection starts; how could I persist that person’s ID?

    Basically, is it possible to match detections with already tracked objects?


    • Rafael April 25, 2019 at 9:28 am #

      Forget it, I did not read the part about assigning IDs based on centroid distance 😀

      my apologies

      • Adrian Rosebrock April 25, 2019 at 9:39 am #

        No problem, Rafael!

  150. Ezequiel Arceo April 27, 2019 at 10:11 am #

    Hi Adrian

    What a great post!

    I would like to know the setup for the recording of the example videos used here, specifically the height and depression angle of the camera.


    • Adrian Rosebrock May 1, 2019 at 11:56 am #

      Sorry, I don’t recall the exact angles.

  151. Ricky April 29, 2019 at 11:04 am #

    Hello Adrian ,
    Thank you for a great tutorial.
    I tried to change your code to detect left-to-right movement instead of up/down.
    How do I change line 204 to keep track of the x-coordinates?

    • Adrian Rosebrock May 1, 2019 at 11:42 am #

      Hey Ricky — I’m covering left-to-right tracking (along with vehicle tracking) inside Raspberry Pi for Computer Vision. Definitely take a look.
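
      For anyone experimenting before then: the post’s up/down check keys off each centroid’s y-coordinate and the frame height; a left/right variant swaps in the x-coordinate and a vertical line at the frame’s horizontal center. A rough sketch (the function name and parameters are mine, not from the post’s code, which does this inline):

```python
import numpy as np

def horizontal_direction(centroid_history, centroid, frame_width):
    """Mirror of the post's up/down counting logic for left/right
    movement. Returns "left", "right", or None (no crossing)."""
    # history of x-coordinates for this tracked object
    x = [c[0] for c in centroid_history]

    # negative = moving left, positive = moving right,
    # relative to the object's mean x-position so far
    direction = centroid[0] - np.mean(x)

    if direction < 0 and centroid[0] < frame_width // 2:
        return "left"    # crossed the vertical center line moving left
    elif direction > 0 and centroid[0] > frame_width // 2:
        return "right"   # crossed it moving right
    return None
```

      This is the same direction-then-line-crossing pattern as the vertical version; wiring it in means replacing the `centroid[1]` / `H // 2` comparisons with their x-axis counterparts and incrementing left/right counters instead of up/down ones.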

  152. James May 3, 2019 at 5:05 pm #

    This is great! I was wondering if there was a way to track them also going from left to right along with up and down?

    • James May 3, 2019 at 5:43 pm #

      I wanted it to work alongside the up/down simultaneously! Would that be possible?

      Having the horizontal and vertical line at the same time and tracking up, down, left, and right at the same time

      • Adrian Rosebrock May 8, 2019 at 1:31 pm #

        Hey James — I’m covering both left/right and up/down tracking inside Raspberry Pi for Computer Vision. The implementation there will help you.

  153. Martin S May 19, 2019 at 9:44 am #

    Hi Adrian, can I ask what the minimum CPU requirement is for this people counter script? I have run it on an Odroid XU4 mini PC and it is quite laggy.
    Thank you


    • Adrian Rosebrock May 23, 2019 at 9:50 am #

      The Odroid XU4 CPU is far too underpowered for that script. You would need a laptop/desktop.

      I’m actually covering how to optimize the people counter script for embedded devices, like the Odroid Xu4 and RPi, inside Raspberry Pi for Computer Vision.

  154. Alan Marion May 19, 2019 at 9:59 pm #

    Hi Adrian,

    Your tutorial and code are truly amazing and have helped me get a jump start on this technology. I have run the code successfully on Ubuntu 16.04 on 2 computers — one quite old and another more modern. I can definitely see a difference in performance, and the newer laptop works fairly well. To experiment with performance, I have installed the Intel Distribution for Python described in Intel’s “Making Python Fly: Accelerate Performance Without Re-coding”.

    I am encountering some errors that seem to be related to “imutils”.
    I’m still checking the possible error sources, but would be interested in the experience of others and any practical tips.


  155. Alan Marion May 19, 2019 at 10:04 pm #

    Hi again Adrian,

    Somehow this quote got omitted from my just-submitted reply…

    “Intel® Distribution for Python*, which is absolutely free by the way, uses tried-and-true libraries like the Intel® Math Kernel Library (Intel® MKL) and the Intel® Data Analytics Acceleration Library (Intel® DAAL) to make Python code scream right out of the box – no recoding required. Here’s some of the benefits dev teams can expect:

    First, number crunching packages like NumPy, SciPy, and scikit-learn are all executed natively, rather than being interpreted. That’s one huge speed boost.”

  156. prash May 22, 2019 at 3:42 am #

    Hi Adrian, Thanks for the nice tutorial. Are you going to cover this in your book Raspberry Pi for Computer Vision in detail? Would it be explaining the conditions where this code does not work? The code works perfectly fine in the video you have shared but does not work on my own videos in a similar setup. I have an overhead camera for counting people coming in and out of the room. It gives false counting.