Face detection with OpenCV and deep learning

Today I’m going to share a little known secret with you regarding the OpenCV library:

You can perform fast, accurate face detection with OpenCV using a pre-trained deep learning face detector model shipped with the library.

You may already know that OpenCV ships out-of-the-box with pre-trained Haar cascades that can be used for face detection…

…but I’m willing to bet that you don’t know about the “hidden” deep learning-based face detector that has been part of OpenCV since OpenCV 3.3.

In the remainder of today’s blog post I’ll discuss:

  • Where this “hidden” deep learning face detector lives in the OpenCV library
  • How you can perform face detection in images using OpenCV and deep learning
  • How you can perform face detection in video using OpenCV and deep learning

As we’ll see, it’s easy to swap out Haar cascades for their more accurate deep learning face detector counterparts.

To learn more about face detection with OpenCV and deep learning, just keep reading!


Face detection with OpenCV and deep learning

Today’s blog post is broken down into three parts.

In the first part we’ll discuss the origin of the more accurate OpenCV face detectors and where they live inside the OpenCV library.

From there I’ll demonstrate how you can perform face detection in images using OpenCV and deep learning.

I’ll then wrap up the blog post discussing how you can apply face detection to video streams using OpenCV and deep learning as well.

Where do these “better” face detectors live in OpenCV and where did they come from?

Back in August 2017, OpenCV 3.3 was officially released, bringing with it a highly improved “deep neural networks” ( dnn ) module.

This module supports a number of deep learning frameworks, including Caffe, TensorFlow, and Torch/PyTorch.

The primary contributor to the dnn  module, Aleksandr Rybnikov, has put a huge amount of work into making this module possible (and we owe him a big round of thanks and applause).

And since the release of OpenCV 3.3, I’ve been sharing a number of deep learning OpenCV tutorials.

However, what most OpenCV users do not know is that Rybnikov has included a more accurate, deep learning-based face detector in the official release of OpenCV (although it can be a bit hard to find if you don’t know where to look).

The Caffe-based face detector can be found in the face_detector  sub-directory of the dnn samples:

Figure 1: The OpenCV repository on GitHub has an example of deep learning face detection.

When using OpenCV’s deep neural network module with Caffe models, you’ll need two sets of files:

  • The .prototxt file(s) which define the model architecture (i.e., the layers themselves)
  • The .caffemodel file which contains the weights for the actual layers

Both files are required when using models trained with Caffe for deep learning.

However, you’ll only find the prototxt files here in the GitHub repo.

The weight files are not included in the OpenCV samples  directory and it requires a bit more digging to find them…

Where can I get the more accurate OpenCV face detectors?

For your convenience, I have included both the:

  1. Caffe prototxt files
  2. and Caffe model weight files

…inside the “Downloads” section of this blog post.

To skip to the downloads section, just click here.

How does the OpenCV deep learning face detector work?

Figure 2: Deep Learning with OpenCV’s DNN module.

OpenCV’s deep learning face detector is based on the Single Shot Detector (SSD) framework with a ResNet base network (unlike other OpenCV SSDs that you may have seen which typically use MobileNet as the base network).

A full review of SSDs and ResNet is outside the scope of this blog post, so if you’re interested in learning more about Single Shot Detectors (including how to train your own custom deep learning object detectors), start with this article here on the PyImageSearch blog and then take a look at my book, Deep Learning for Computer Vision with Python, which includes in-depth discussions and code enabling you to train your own object detectors.

Face detection in images with OpenCV and deep learning

In this first example we’ll learn how to apply face detection with OpenCV to single input images.

In the next section we’ll learn how to modify this code and apply face detection with OpenCV to videos, video streams, and webcams.

Open up a new file, name it detect_faces.py , and insert the following code:
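The code block itself did not survive in this copy of the post, so here is a sketch of how the imports and argument parsing likely look. The flag names follow the descriptions that come next; the prototxt file name is a placeholder, and an explicit argument list is parsed so the snippet runs standalone (the real script would call vars(ap.parse_args()) on the actual command line):

```python
import argparse

# the real script also imports numpy and cv2 (Lines 2-4 of the download);
# only the argument parsing is sketched here
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to input image")
ap.add_argument("-p", "--prototxt", required=True,
	help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
	help="path to Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
	help="minimum probability to filter weak detections")

# parse an explicit (placeholder) argument list so the sketch runs on its own
args = vars(ap.parse_args([
	"--image", "example.jpg",
	"--prototxt", "deploy.prototxt.txt",
	"--model", "res10_300x300_ssd_iter_140000.caffemodel",
]))
```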

Here we are importing our required packages (Lines 2-4) and parsing command line arguments (Lines 7-16).

We have three required arguments:

  • --image : The path to the input image.
  • --prototxt : The path to the Caffe prototxt file.
  • --model : The path to the pretrained Caffe model.

An optional argument, --confidence , can overwrite the default threshold of 0.5 if you wish.

From there, let’s load our model and create a blob from our image:

First, we load our model using our --prototxt  and --model  file paths. We store the model as net  (Line 20).

Then we load the image  (Line 24), extract the dimensions (Line 25), and create a blob  (Lines 26 and 27).

The dnn.blobFromImage function takes care of pre-processing, which includes setting the blob dimensions and normalization. If you’re interested in learning more about the dnn.blobFromImage function, I review it in detail in this blog post.

Next, we’ll apply face detection:

To detect faces, we pass the blob  through the net  on Lines 32 and 33.

And from there we’ll loop over the detections  and draw boxes around the detected faces:

We begin looping over the detections on Line 36.

From there, we extract the confidence  (Line 39) and compare it to the confidence threshold (Line 43). We perform this check to filter out weak detections.

If the confidence meets the minimum threshold, we proceed to draw a rectangle along with the probability of the detection on Lines 46-56.

To accomplish this, we first calculate the (x, y)-coordinates of the bounding box (Lines 46 and 47).

We then build our confidence text  string (Line 51) which contains the probability of the detection.

In case our text would go off-image (such as when the face detection occurs at the very top of an image), we shift it down by 10 pixels (Line 52).

Our face rectangle and confidence text are drawn on the image on Lines 53-56.

From there we loop back and process any additional detections. Once no detections remain, we’re ready to show our output image on the screen (Lines 59 and 60).
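The detection-parsing arithmetic above can be illustrated without OpenCV at all. Below is a pure-Python sketch using made-up detection values; in the real script, detections is a NumPy array of shape (1, 1, N, 7) returned by net.forward(), where each row holds [batch_id, class_id, confidence, startX, startY, endX, endY] with box coordinates normalized to [0, 1]. The drawing calls (cv2.rectangle, cv2.putText) are omitted so the sketch stays dependency-free:

```python
(h, w) = (480, 640)      # image dimensions (placeholder values)
conf_threshold = 0.5     # the --confidence argument

# two fake detections: one strong, one weak
detections = [
	[0, 1, 0.93, 0.25, 0.25, 0.50, 0.75],
	[0, 1, 0.12, 0.60, 0.10, 0.80, 0.40],
]

boxes = []
for det in detections:
	confidence = det[2]
	# filter out weak detections
	if confidence < conf_threshold:
		continue
	# scale the normalized box back to pixel coordinates
	(startX, startY, endX, endY) = (int(det[3] * w), int(det[4] * h),
	                                int(det[5] * w), int(det[6] * h))
	# shift the label down if it would be drawn off-image
	y = startY - 10 if startY - 10 > 10 else startY + 10
	text = "{:.2f}%".format(confidence * 100)
	boxes.append((startX, startY, endX, endY, y, text))

print(boxes)
```

Only the strong detection survives the threshold check; the weak one (12%) is filtered out, exactly as in the real loop.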

Face detection in images with OpenCV results

Let’s try out the OpenCV deep learning face detector.

Make sure you use the “Downloads” section of this blog post to download:

  • The source code used in this blog post
  • The Caffe prototxt files for deep learning face detection
  • The Caffe weight files used for deep learning face detection
  • The example images used in this post

From there, open up a terminal and execute the following command:
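The command block did not survive extraction; a representative invocation looks like the following, assuming the example image is saved as rooster.jpg and using the model file name mentioned later in the comments (the prototxt name is a placeholder):

```shell
python detect_faces.py --image rooster.jpg \
	--prototxt deploy.prototxt.txt \
	--model res10_300x300_ssd_iter_140000.caffemodel
```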

Figure 3: My face is detected in this image with 74% confidence using the OpenCV deep learning face detector.

The above photo is of me during my first trip to Ybor City in Florida, where chickens are allowed to roam free throughout the city. There are even laws protecting the chickens which I thought was very cool. Even though I grew up in rural farmland, I was still totally surprised to see a rooster crossing the road — which of course spawned many “Why did the chicken cross the road?” jokes.

Here you can see my face is detected with 74.30% confidence, even though my face is at an angle. OpenCV’s Haar cascades are notorious for missing faces that are not at a “straight on” angle, but by using OpenCV’s deep learning face detectors, we are able to detect my face.

And now we’ll see how another example works, this time with three faces:

Figure 4: The OpenCV DNN face detector finds all three faces without any trouble.

This photo was taken in Gainesville, FL after one of my favorite bands finished up a show at Loosey’s, a popular bar and music venue in the area. Here you can see my fiance (left), me (middle), and Jason (right), a member of the band.

I’m incredibly impressed that OpenCV can detect Trisha’s face, despite the lighting conditions and shadows cast on her face in the dark venue (and with 86.81% probability!)

Again, this just goes to show how much better (in terms of accuracy) the deep learning OpenCV face detectors are over their standard Haar cascade counterparts shipped with the library.

Face detection in video and webcam with OpenCV and deep learning

Now that we have learned how to apply face detection with OpenCV to single images, let’s also apply face detection to videos, video streams, and webcams.

Luckily for us, most of our code in the previous section on face detection with OpenCV in single images can be reused here!

Open up a new file, name it detect_faces_video.py , and insert the following code:

Compared to above, we will need to import three additional packages: VideoStream , imutils , and time .

If you don’t have imutils in your virtual environment, you can install it via pip: pip install imutils

Our command line arguments are mostly the same, except we do not have an --image  path argument this time. We’ll be using our webcam’s video feed instead.

From there we’ll load our model and initialize the video stream:

Loading the model is the same as above.

We initialize a VideoStream object, specifying the camera with index zero as the source (in general this will be your laptop’s built-in camera or the first camera detected on your desktop).

A few quick notes here:

  • Raspberry Pi + picamera users can replace Line 25 with vs = VideoStream(usePiCamera=True).start() if you wish to use the Raspberry Pi camera module.
  • If you want to parse a video file (rather than a video stream), swap out the VideoStream class for FileVideoStream. You can learn more about the FileVideoStream class in this blog post.

We then allow the camera sensor to warm up for 2 seconds (Line 26).

From there we loop over the frames and compute face detections with OpenCV:

This block should look mostly familiar to the static image version in the previous section.

In this block, we’re reading a frame  from the video stream (Line 32), creating a blob  (Lines 37 and 38), and passing the blob  through the deep neural net  to obtain face detections (Lines 42 and 43).

We can now loop over the detections, compare to the confidence threshold, and draw face boxes + confidence values on the screen:

For a detailed review of this code block, please review the previous section where we perform face detection on still, static images. The code here is nearly identical.

Now that our OpenCV face detections have been drawn, let’s display the frame on the screen and wait for a keypress:

We display the frame on the screen until the “q” key is pressed, at which point we break out of the loop and perform cleanup.

Face detection in video and webcam with OpenCV results

To try out the OpenCV deep learning face detector make sure you use the “Downloads” section of this blog post to grab:

  • The source code used in this blog post
  • The Caffe prototxt files for deep learning face detection
  • The Caffe weight files used for deep learning face detection

Once you have downloaded the files, running the deep learning OpenCV face detector with a webcam feed is easy with this simple command:
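As with the still-image example, the command itself is missing from this copy; a representative invocation (model file name as mentioned in the comments, prototxt name a placeholder) is:

```shell
python detect_faces_video.py --prototxt deploy.prototxt.txt \
	--model res10_300x300_ssd_iter_140000.caffemodel
```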

Figure 5: Face detection in video with OpenCV’s DNN module.

You can see a full video demonstration, including my commentary, in the following video:

Summary

In today’s blog post you discovered a little known secret about the OpenCV library — OpenCV ships out-of-the-box with a more accurate face detector (as compared to OpenCV’s Haar cascades).

The more accurate OpenCV face detector is deep learning based, and in particular, utilizes the Single Shot Detector (SSD) framework with ResNet as the base network.

Thanks to the hard work of Aleksandr Rybnikov and the other contributors to OpenCV’s dnn  module, we can enjoy these more accurate OpenCV face detectors in our own applications.

The deep learning face detectors can be hard to find in the OpenCV library, so
for your convenience, I have gathered the Caffe prototxt and weight files for you; just use the “Downloads” form below to download the (more accurate) deep learning-based OpenCV face detector.

See you next week with another great computer vision + deep learning tutorial!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL! Sound good? If so, enter your email address and I’ll send you the code immediately!


202 Responses to Face detection with OpenCV and deep learning

  1. Chris Combs February 26, 2018 at 10:50 am #

    How well does the new OpenCV model recognize faces of various skin tones? Do we know how it was trained?

    • Adrian Rosebrock February 26, 2018 at 1:40 pm #

      I have not performed an exhaustive evaluation across various skin tones, so I’m not sure. That would make for an interesting article; if I cannot write one, I would love to see one written by a PyImageSearch reader. The GitHub repo has more information on the training process.

  2. wildan February 26, 2018 at 10:53 am #

    What’s the difference between this method and Haar cascade face detection?

    • Adrian Rosebrock February 26, 2018 at 1:39 pm #

      This method uses deep learning, in particular a Single Shot Detector (SSD) with ResNet base network architecture. That is the difference.

      • Sampreet Sarkar May 27, 2018 at 1:54 pm #

        Hi Adrian, big fan of your blog. I was amazed when I tried out the results myself! Saved me a lot of hassle. However, I was wondering how I could add a face recognition feature on top of this. Would help a lot if you could explain. Cheers!

        • Adrian Rosebrock May 28, 2018 at 9:38 am #

          Hey Sampreet, I’ll likely be covering facial recognition soon but I would also suggest taking a look at the PyImageSearch Gurus course where I cover facial recognition in depth.

  3. Ratman February 26, 2018 at 10:58 am #

    Sounds nice, but how is the performance? There are a lot of face detection frameworks but they are not even near real time.

    • Adrian Rosebrock February 26, 2018 at 1:39 pm #

      Please see the video where I provide a live demonstration. This face detector can be used in real-time.

      • ratman February 28, 2018 at 3:41 pm #

        Thanks for the reply, Adrian! I think it is still a bit slow for our family album of 40k pictures, but it is worth a trial then 🙂

  4. KK February 26, 2018 at 10:59 am #

    Hi Adrian, nice to know ..is this detector faster than dlib’s detector? thanks much!

    • Adrian Rosebrock February 26, 2018 at 1:38 pm #

      I haven’t tested them side-by-side, but it should be comparably fast to dlib’s HOG + Linear SVM detector.

  5. Curious Observer February 26, 2018 at 12:02 pm #

    Hey Adrian,

    First off, I want to thank you for the great work you’ve done so far. Your blog is the basis for the computer vision startup we’re founding.

    Coming back to this specific blog post, I haven’t tested it yet, but how do you think the speed of this DNN will compare to Haar cascades on a Raspberry Pi? On my computer, I’m seeing about a 25% slowdown.

    Which would you choose for detecting faces on a Pi where both speed and accuracy were equally important?

    • Adrian Rosebrock February 26, 2018 at 1:38 pm #

      Haar cascades will be the fastest here, but the deep learning face detector will give you the most accuracy. As for which one to choose, that really depends on your project. If you’re looking to detect faces that will naturally have more variability in viewing angle, use the deep learning detector. If the faces will almost always be “straight on” then the Haar cascades will likely be sufficient. Again, it really depends on the project.

    • TetsFR March 10, 2018 at 12:42 pm #

      Not sure about Haar cascades but this deepnet runs on my rpi3 at about 1fps, maybe a bit slower. So that is still usable for some projects although difficult for realtime applications. I am trying to control servos of a pan tilt camera mount and there is so much delay in the feedback loop that it is tricky to manage (if you guys have a suggestion of an algo that is robust to such delay I would take it).
      Thanks Adrian for the great tutorial, as always.

  6. Matt February 26, 2018 at 12:45 pm #

    This is awesome to hear OpenCV now ships with the dnn module.

    If you wanted to take this a step further to start recognizing particular faces, would you have to go back to deep learning to actually teach the faces? Would there be a way to leverage this to assist in the collection of labeled samples?

    Thanks!

    • Adrian Rosebrock February 26, 2018 at 1:36 pm #

      Facial recognition is a two stage process. The first stage is detecting the presence of a face in an image but not knowing “who” the actual face is. The second stage is taking each detected face and recognizing it. For this, you would need a dedicated facial recognition algorithm. I actually discuss how to create a Python script to assist in collecting labeled faces (as you suggested) inside the PyImageSearch Gurus course. From there you’ll be able to build your own facial recognizer.

  7. swapnil February 26, 2018 at 1:13 pm #

    One of the post I was eagerly looking for. Thanks adrian for this post. You are the best when it comes to computer vision.

    • Adrian Rosebrock February 26, 2018 at 1:34 pm #

      Thank you swapnil! 🙂

  8. Steve Cox February 26, 2018 at 9:04 pm #

    Very nice !!!! I look forward to playing with this example.

    Now how do we train this deep model to recognize “Our” faces 😉

    I think this is in the right direction and away from eigenfaces, which I noticed don’t seem to be accurate. (Not an exhaustive test on my part.) I can still see using a Haar cascade in front of this deep learning SSD. Haar is so fast I think the two algos stacked together make sense.

    Thanks again !!!!

    • Adrian Rosebrock February 27, 2018 at 11:33 am #

      There are a bunch of ways to perform face recognition using deep learning. One of my favorites is to use deep learning embeddings of the faces. I’ll cover this as well in the future.

  9. phillity February 26, 2018 at 11:58 pm #

    Hi Adrian. Thanks for the great tutorial!

    • Adrian Rosebrock February 27, 2018 at 11:29 am #

      Thanks so much Phillity, I’m happy you enjoyed it! 🙂

  10. ray February 27, 2018 at 10:38 am #

    great tutorial!!!!

    • Adrian Rosebrock February 27, 2018 at 11:20 am #

      Thank you Ray, I’m glad you enjoyed it! 🙂

  11. Nam Phan February 27, 2018 at 12:53 pm #

    first off, great tut as usual ! excellent !
    I wonder if there are helper functions accompanying the package that I can use to extract extra information about face parts, like the positions of the eyes, nose, ears, and forehead.
    If not, is there any package to do such extraction after detecting with this DNN?
    thanks

    • Adrian Rosebrock February 28, 2018 at 1:49 pm #

      It sounds like you’re referring to facial landmarks. See this post for more details, including code.

  12. han February 28, 2018 at 2:45 am #

    Thank you for your efforts and sharing.
    I really like to read your posts!!

    • Adrian Rosebrock February 28, 2018 at 1:50 pm #

      Thanks Han, I’m glad you’re enjoying them!

  13. GabriellaK February 28, 2018 at 5:16 am #

    Great, Is there something new for full body detection?

    • Adrian Rosebrock February 28, 2018 at 1:50 pm #

      What specifically regarding full body detection? Detecting the presence of a body in an image? Localizing each of the arms, legs, joints, etc.?

  14. Mark February 28, 2018 at 7:17 am #

    Thanks Adrian,

    Brilliant stuff as usual, I managed to replicate that in C++ but still very slow on my RPi.
    Any idea if Movidius stick can be used to boost the recognition part here?

    Regards,
    Mark

    • Adrian Rosebrock February 28, 2018 at 1:52 pm #

      I haven’t tried this code on the Movidius, but in the previous post I used Caffe model weights + architecture for a MobileNet + SSD. It stands to reason that a ResNet + SSD would work as well. I would try loading the face detector via the Movidius, but I get the feeling that you might have to work with it a bit.

  15. Chris Burns February 28, 2018 at 11:38 am #

    Adrian, thank you for this post and some excellent insight into OpenCV. Can you elaborate on why you chose to use the VideoStream feature of imutils? I have a non-traditional set up (Rpi3 with custom ARM64 (aarch64) kernel. I’ve compiled OpenCV and everything looks good there but the imutils – vs.read() call is returning null. I was thinking about going to OpenCV.videoCapture but thought I would ask the above question before I started. Thanks!

    • Adrian Rosebrock February 28, 2018 at 1:53 pm #

      The VideoStream is my implementation of a faster, threaded frame polling class. It is compatible with both USB/built-in webcams along with the Raspberry Pi camera module. You could use OpenCV’s cv2.VideoCapture function but you’ll get an extra performance boost from VideoStream which is a must when working with the Raspberry Pi. You can read more about the VideoStream class here.

  16. Anish Varsha March 1, 2018 at 12:07 pm #

    Hey Adrian,

    I find the tutorial very useful; the differences between SSD and HOG detection are night and day. Can you suggest where I can find Face Recognition Using Deep Learning in OpenCV? Thanks!

  17. Dominic Pritham March 3, 2018 at 3:41 pm #

    This is super cool Adrian. It is very reliable. In traditional face detection, I have had issues when the face leaves the frame and re-enters. This is so exciting. Thank you so very much for writing this blog.

    • Adrian Rosebrock March 7, 2018 at 9:42 am #

      Thanks Dominic, I’m glad you enjoyed the post! 🙂

  18. Peshmerge March 6, 2018 at 10:36 am #

    Hi Adrian,

    Thanks for your great article. It’s really helpful! I have a couple of notes based on my observations while testing your code.
    I am writing now my thesis at Amsterdam University of Applied Sciences, and it’s about Facial detection and recognition on children. My target group is children aged between 0 and 12 years old.
    I am still busy with researching, but I tried your code just to build a fast proof-of-concept and it didn’t work well in the beginning. I adjusted 3 parameters and then it worked well.
    Those parameters were:
    1) --confidence, which is now 0.40
    2) the x and y passed to cv2.dnn.blobFromImage(). In your original code it’s 300*300; in my code I changed it to the height of the input image.

    Here is the result of running your code without changing anything
    https://imgur.com/a/BdcPl

    here the result after changing the parameters (the confidence doesn’t matter 0.4 or 0.5)
    https://imgur.com/a/QQFrN

    Do you have any explanation?

    Kind regards, Peshmerge

    • Adrian Rosebrock March 7, 2018 at 9:10 am #

      The confidence is the minimum probability. It is used to filter out weak(er) detections. You can set the confidence as you see best fit for your project/application.

      • Peshmerge March 12, 2018 at 7:18 am #

        Thanks for your reply! To be honest, adjusting the x and y just made the difference for me! I wonder why you chose to give it 300*300?

        • Adrian Rosebrock March 14, 2018 at 1:07 pm #

          300×300 is the typical input image size for SSDs and Faster R-CNNs.

  19. Amal March 6, 2018 at 11:02 pm #

    Hey Adrian,
    I applied the same code on my Raspberry Pi 3, but it runs very slowly and reboots after a few seconds each time I run the code.

    • Adrian Rosebrock March 7, 2018 at 9:08 am #

      Hey Amal — the OpenCV deep learning face detector will certainly run slow on the Pi. For fast, but less accurate face detection you should use Haar cascades. As for the Pi automatically rebooting, I’m not sure what the problem is. It sounds like a hardware problem or your Pi is overheating.

  20. Ed n. March 7, 2018 at 3:45 pm #

    Hi Adrian,

    Is there a good way to convert Caffe-based code to Keras? Using Caffe in production is kind of a hassle.

    thanks,
    Ed.

    • Adrian Rosebrock March 9, 2018 at 9:26 am #

      I would start by going through this resource which includes a number of deep learning model/weight converters.

  21. saverio March 8, 2018 at 9:24 am #

    I tried to run the compiled graph on the Movidius, but I’m a little bit confused about the returned value from graph.GetResult(): an array of shape (1407,)!

    For sure I made some mistake…

  22. Adanalı March 8, 2018 at 6:03 pm #

    Hi Adrian
    This is a bit off topic, but I was wondering if you would be so kind as to write an article on building a “people counter” with OpenCV — that is, a program that counts people going in and out of a building via a live webcam feed. There are no great resources available online for this, so if you would write one I’m sure it would drive plenty of traffic to your site. It’s a win for both of us!
    Thanks

    • Adrian Rosebrock March 9, 2018 at 8:54 am #

      Thank you for the suggestion. I will certainly consider it for the future.

      • kaisar khatak July 5, 2018 at 1:09 am #

        Some folks have trained a head detector for people counting to get around occlusion issues…

  23. gopalakrishna March 9, 2018 at 9:26 pm #

    hi
    I am new to OpenCV; it would be a great help if you could explain how to add the path in the argument parser (line 8 in the code).

    • Adrian Rosebrock March 14, 2018 at 1:28 pm #

      Take a look at this post to get started with command line arguments.

  24. lii March 15, 2018 at 9:25 am #

    hi, can someone help me with this error:

    [INFO] loading model…
    [INFO] starting video stream…
    Traceback (most recent call last):

    (h, w) = image.shape[:2]
    AttributeError: ‘NoneType’ object has no attribute ‘shape’

    • Adrian Rosebrock March 19, 2018 at 5:53 pm #

      OpenCV cannot access your webcam. See this post for more details.

  25. Kapil Goyal March 20, 2018 at 4:27 pm #

    This code is not working for a group photo with 50 people.

    • Adrian Rosebrock March 22, 2018 at 10:09 am #

      That’s not too surprising. If there are 50 people in the photo the faces are likely very small. Try increasing the resolution of the image before passing it into the network for detection.

  26. Kapil Goyal March 22, 2018 at 10:14 am #

    I want to create my own training set. How do I train my own neural network in Python for my college project?

    • Adrian Rosebrock March 22, 2018 at 10:18 am #

      I demonstrate how to train your first Convolutional Neural Network in this post. I would suggest starting there. If you’re interested in a deeper dive and understanding of how to train your own custom networks I would suggest Deep Learning for Computer Vision with Python where I discuss deep learning + computer vision in detail (including how to train your own custom networks). I hope that helps point you in the right direction!

      • Kapil Goyal March 22, 2018 at 12:31 pm #

        Does this mean that these files are not open source, that we can’t generate them on our own, and that using them would create copyright issues?

        • Adrian Rosebrock March 27, 2018 at 6:50 am #

          You would need to check the license associated with each model you wanted to use. Some are allowed for open source projects, others are just academic, and some do not allow commercial use. Typically if you wanted to use a model without restrictions you would need to train it yourself.

  27. Luan March 22, 2018 at 1:18 pm #

    Congratulations Adrian, great post.
    I am Brazilian and would like to know if there is a way to decrease the quality of the image, or the frames per second; it was running very slowly on the Raspberry Pi.

    • Adrian Rosebrock March 27, 2018 at 6:48 am #

      This method really isn’t meant to be run on the Raspberry Pi. You can decrease the resolution of the input image but it will still be too slow. See this post for more details. For the Raspberry Pi you should consider using Haar cascades if you need speed.

  28. Martina Rathna March 26, 2018 at 12:34 pm #

    Can you please tell me what went wrong?
    [INFO] loading model…
    OpenCV(3.4.1) Error: Unspecified error (FAILED: fs.is_open(). Can’t open “res10_300x300_ssd_iter_140000.caffemode”) in ReadProtoFromBinaryFile, file /home/pi/opencv-3.4.1/modules/dnn/src/caffe/caffe_io.cpp, line 1126
    Traceback (most recent call last):
    File “detect_faces.py”, line 23, in
    net = cv2.dnn.readNetFromCaffe(args[“prototxt”], args[“model”])
    (-2) FAILED: fs.is_open(). Can’t open “res10_300x300_ssd_iter_140000.caffemode” in function ReadProtoFromBinaryFile

    • Adrian Rosebrock March 27, 2018 at 6:18 am #

      It looks like your path to the input .caffemodel file is incorrect. This is most likely due to not correctly specifying the command line arguments path. If you’re new to command line arguments, that’s okay, but you should read up on them first.

  29. Abhilash March 26, 2018 at 3:20 pm #

    please let us know how to add PROTOTXT and MODEL path

  30. chopin March 27, 2018 at 2:36 am #

    Hi Adrian, thank you very much for your generosity. I am very fortunate to have found you here.

    I have carefully read your blogs about object detection and the Pi project many times. It is undeniable that they have helped me very much and given me a lot of inspiration. I admire your knowledge and ability; I have almost become your fan.
    I’m a sophomore and I am really interested in computer vision. I wanted to train my own object detection model for a specific scene after reading your blog three weeks ago (Real-time object detection with deep learning and OpenCV); it’s great and very fun. So I spent these days searching object detection papers, and I know YOLO and SSD are great. I successfully configured the caffe-ssd environment (git clone https://github.com/weiliu89/caffe.git) on Ubuntu 16.04. It can run ssd_pascal_webcam.py and ssd_pascal_video.py, but when I run examples/ssd/ssd_pascal.py to train on the Pascal VOC data, I get an error. I spent three days trying to fix it; I can’t remember how many times I recompiled, but the problem persists. I asked on GitHub but didn’t get a reply. (The error issue: https://github.com/weiliu89/caffe/issues/875)

    I remember I received your patient reply. It feels warm. I would like you to take a look at this error and give me some advice to work it if you have time and like to do.Thanks again.

    • Adrian Rosebrock March 27, 2018 at 6:09 am #

      Hey Chopin — thank you for the kind words, I really appreciate that. Your comment really made my day 🙂

      While I’ve used Caffe quite a bit to train image classification networks I must admit that I have not used it to build an LMDB database and train it for object detection via an SSD so I’m not sure what the exact error is. Most of the work I’ve done with object detection involve either Keras, mxnet, or the TensorFlow Object Detection API (TFOD API). I would recommend starting with the TFOD API to get your feet wet.

  31. Cedric March 28, 2018 at 11:30 am #

    Hi Adrian,
    I tried this code using the Movidius and the Raspberry Pi. I interpreted the output similar to this post of yours:
    https://www.pyimagesearch.com/2018/02/19/real-time-object-detection-on-the-raspberry-pi-with-the-movidius-ncs/

    Unfortunately the face/faces are not detected in the right positions. The maximum confidence of all bounding boxes is around 40 %.

    Any advice on how I can update the model for usage on the Movidius?

    Thanks a lot!

    • Adrian Rosebrock March 30, 2018 at 7:18 am #

      Hey Cedric — are you confident that it’s the model itself? If the bounding boxes are in an incorrect position there might be a bug in decoding the (x, y)-coordinates from the model.
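
      For reference, the decoding step in the tutorial scales the network’s normalized box coordinates back to pixel space; here is a minimal sketch with a dummy detections array standing in for the real network output (the frame size and values are made up):

```python
import numpy as np

# dummy output shaped like OpenCV's SSD face detector output:
# (1, 1, num_detections, 7); index 2 is confidence, indices 3:7 the box
detections = np.zeros((1, 1, 1, 7), dtype="float32")
detections[0, 0, 0, 2] = 0.95
detections[0, 0, 0, 3:7] = [0.25, 0.25, 0.75, 0.75]

(h, w) = (480, 640)  # frame height and width
i = 0

# scale the normalized coordinates back to pixel space
box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
(startX, startY, endX, endY) = box.astype("int")
print(startX, startY, endX, endY)  # 160 120 480 360
```

      If the boxes land in the wrong place, a mixed-up (w, h) ordering in that scaling array is a common culprit.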

      • Chao Wu June 25, 2018 at 1:59 am #

        Hi Adrian,

        I tried the same thing and got a (1407,) array; the result structure is strange. Do you have any example for this Caffe model? I think it may be a Movidius issue.

        • Adrian Rosebrock June 25, 2018 at 1:46 pm #

          This definitely sounds like a Movidius issue; however, I must admit that I’m not sure what the error is.

  32. Martin Faucheux March 29, 2018 at 11:27 am #

    Hey Adrian! Every time I’m looking for some help on a computer vision project, I come back to one of your tutorials. They are just excellent, clear and complete! Thanks a lot, you’re really helping me!

    • Adrian Rosebrock March 30, 2018 at 6:58 am #

      Thanks so much Martin, I really appreciate that! 🙂

  33. ryan March 29, 2018 at 3:36 pm #

    I tried the script and overall it works well. However, I noticed that when I moved my face very close to my camera, it started drawing a second rectangle adjacent to the correctly identified one. This issue appeared more easily (meaning that the distance between the face and the camera is shorter) when I increased the image size of the frame input to cv2.dnn.blobFromImage (e.g. from width=300 to a larger width). Any advice on why it’s happening and how to fix it would be much appreciated!

    • Adrian Rosebrock March 30, 2018 at 6:51 am #

      Object detectors are not perfect, so you are bound to see some false positives. The SSD algorithm works (at a very simplistic level) by dividing your image into boxes and classifying each of them. Since your face occupies most of the frame when it is close to the camera, there are likely a large number of boxes that contain face-like regions. That would explain why you may see a detection adjacent to the real one.

  34. Martin Faucheux March 30, 2018 at 5:47 am #

    Hey Adrian! Thanks again for this post, it is great!
    I need to recognize smaller faces in my video stream. Is it possible to adjust some parameters here to fit my problem without training my own model? Maybe something in the blobFromImage function? I lack time and compute power.

    • Adrian Rosebrock March 30, 2018 at 6:41 am #

      Yes, you’ll want to modify the cv2.dnn.blobFromImage line and change it to:

      blob = cv2.dnn.blobFromImage(cv2.resize(image, (NEW_WIDTH, NEW_HEIGHT)), 1.0,
      (300, 300), (104.0, 177.0, 123.0))

      Using larger values for NEW_WIDTH and NEW_HEIGHT.

      • Martin Faucheux March 30, 2018 at 11:02 am #

        Cool, thanks! I also read the blob tutorial but I didn’t really get why you need to resize the image. Also, what is the other size parameter (the provided (300, 300))?
        Should this size match the resized image?

  35. Peshmerge April 3, 2018 at 5:26 am #

    Hi Adrian,

    I have a question! Can I give feedback back to OpenCV to edit the model? Let me explain what I mean. For example, I run the program on an image to detect faces, but one of the detected objects isn’t a face; the program just identifies it as one. Is there a way to return a value/parameter back to the program so that it edits its model, learns that the detected object isn’t a face, and corrects itself?

    I hope my question is clear!

    Kind regards,
    Peshmerge

    • Adrian Rosebrock April 4, 2018 at 12:16 pm #

      There are fewer parameters to tune with the CNN-based detectors, as opposed to HOG + Linear SVM or Haar cascades, which is both a good thing and a bad thing. I would suggest trying different image sizes, both smaller and larger, to see if that has an impact on the quality of your detections.

      • Peshmerge April 6, 2018 at 5:49 am #

        Thanks Adrian!

  36. Ajithkumar April 4, 2018 at 2:33 am #

    Drawing multiple boxes for a single face.

  37. A.N. O'Nyme April 11, 2018 at 5:15 am #

    Hi,

    I think that the mean value for the colors should be 104, 117, 123 instead of 104, 177, 123 (it is the mean value used in the training prototxt)

  38. Yunui April 12, 2018 at 12:51 pm #

    Hi Adrian,
    Thanks again for the great post. I just have a simple question.

    Which training dataset is used for this res10_300x300_ssd_iter_140000 model?

    I have searched a lot online but the only thing I have found is this link : https://github.com/opencv/opencv/blob/master/samples/dnn/face_detector/how_to_train_face_detector.txt

    The link says “The model was trained in Caffe framework on some huge and available online dataset.”

    Do you know the dataset in which the model is trained?

    Kind Regards

    Yunui

    • Adrian Rosebrock April 13, 2018 at 6:45 am #

      I do not know off the top of my head. You would need to reach out to Aleksandr Rybnikov, the creator of the model and “dnn” module in OpenCV.

  39. Abdulkadir April 18, 2018 at 3:02 am #

    Face recognition is in the process of registering faces. If more than one person passes in front of the camera, the faces get confused. How can I separate them? Can you help me?

    • Adrian Rosebrock April 18, 2018 at 2:57 pm #

      You would need to detect both faces in the frame and identify each of them individually. Whatever model you are using for detection should localize each. If a face is too obfuscated you will not be able to recognize it.

  40. Mat April 19, 2018 at 2:43 pm #

    Hi Adrian!

    Is it possible to count the number of people on the screen at the same time with this code? (Using a webcam)

    Thx!

    Mat

    • Adrian Rosebrock April 20, 2018 at 9:58 am #

      Yep. You can create a “counter” variable that counts the total number of faces detected in each frame; that would work well.

      If you’re new to working with OpenCV and computer vision for these types of applications I would suggest reading through Practical Python and OpenCV. Inside you’ll learn the fundamentals of computer vision and image processing — I also include chapters on face counting as well which would resolve your exact question.
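
      For example, here is a minimal counting sketch (the detections array is a dummy shaped like the tutorial’s network output, and the 0.5 threshold is just an example):

```python
import numpy as np

# dummy detections: shape (1, 1, N, 7), confidence at index 2
detections = np.zeros((1, 1, 3, 7), dtype="float32")
detections[0, 0, :, 2] = [0.92, 0.30, 0.81]  # three candidate boxes

conf_threshold = 0.5
# count only the detections that pass the confidence test
total_faces = int(np.sum(detections[0, 0, :, 2] > conf_threshold))
print(total_faces)  # 2
```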

      • Vaibhav Gupta August 20, 2018 at 4:58 pm #

        Now I want to display a serial number for each face, like “Face1”, or the total number of faces detected at a given instant, at the top of the video. How can I do that?

        • Adrian Rosebrock August 22, 2018 at 9:48 am #

          You’ll want to use the cv2.putText function.

  41. Ali Hormati May 10, 2018 at 12:45 pm #

    Hi Adrian

    Thanks for this great post.

    If one wants to combine multiple detectors to get a higher accuracy, what approach do you suggest? Will it be useful?

    Thanks

    • Adrian Rosebrock May 14, 2018 at 12:14 pm #

      I’m not sure what you mean by “combine” in this context. Are you referring to combining Haar cascades, HOG + Linear SVM, and deep learning-based detectors into a sort of “meta” detector?

      • Ayush September 5, 2018 at 1:33 am #

        Yes, how do we do that?

  42. Raunak May 13, 2018 at 11:24 pm #

    Hi. I’m running the code on a google colab python notebook, with the required files uploaded to my drive. I’m getting the following output on running the code:

    [INFO] loading model…
    [INFO] computing object detections…
    [ INFO:0] Initialize OpenCL runtime…
    : cannot connect to X server

    Please advise.
    Great article, though.

    • Adrian Rosebrock May 14, 2018 at 11:55 am #

      I’m not familiar with the Google cloud setup here, but I assume the Google cloud notebook does not have an X server installed. You won’t be able to access your webcam or display a video stream using the notebook. I would suggest executing the code on your local system.

  43. Pierre May 27, 2018 at 9:28 pm #

    Hello, I need information on facial recognition and not just facial detection.
    Can you help me?

    • Adrian Rosebrock May 28, 2018 at 9:32 am #

      Hey Pierre, thanks for the comment. I cover facial recognition inside the PyImageSearch Gurus course. Be sure to give it a look!

  44. Mukul Sharma May 31, 2018 at 3:09 pm #

    As usual, a very good post. I have a question: if I use MobileNet for the SSD, will it be faster than the one provided by OpenCV, and what are the accuracy tradeoffs of using MobileNet? Since we are pushing towards embedded systems, what, according to you, is the best system to run on a Raspberry Pi (with good accuracy)?

    • Adrian Rosebrock June 5, 2018 at 8:34 am #

      I would suggest you read the MobileNet and SSD papers to understand speed/accuracy tradeoffs. The gist is that using MobileNet as a base network to an SSD is typically faster but less accurate. Again, you’ll want to read the papers for more details.

  45. David June 5, 2018 at 2:01 pm #

    Hey Adrian,

    thanks so much for the useful tutorials and code! They helped me a lot. I have a somewhat weird question: is there any way to implement the 5-point landmark detection (from your later post: https://www.pyimagesearch.com/2018/04/02/faster-facial-landmark-detector-with-dlib/) with the deep learning face detection? The DL face detection works better with profile views of the face, and this would be something really useful for my research. Thanks for your help! Cheers

    • Adrian Rosebrock June 7, 2018 at 3:19 pm #

      Yes. Once you have the bounding box coordinates of the face you can convert them to a dlib “rectangle” object and then apply the facial landmark detector. This post shows you how to do it but you’ll want to swap out the 68-point detector for the 5-point detector.
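
      Here is a sketch of that conversion (the pixel coordinates are made up, and the dlib calls are shown as comments since they assume dlib’s rectangle and shape-predictor API):

```python
import numpy as np

# hypothetical face box already scaled to pixel coordinates
box = np.array([120.0, 80.0, 360.0, 340.0])
(startX, startY, endX, endY) = box.astype("int")

# with dlib installed, wrap the coordinates and run the landmark predictor:
# import dlib
# rect = dlib.rectangle(left=int(startX), top=int(startY),
#                       right=int(endX), bottom=int(endY))
# shape = predictor(gray, rect)  # 5-point or 68-point predictor
print(startX, startY, endX, endY)  # 120 80 360 340
```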

  46. dan June 6, 2018 at 5:04 pm #

    Is there some way to specify the camera to be used with the code? I have multiple cameras and want to specify the one for face detection. I am using a udev rule that creates a symlink for each camera, so that there are unique names for them, such as “/dev/vidFaceDetector”

    I noticed that if I put the Ubuntu assigned name, such as “/dev/video1” into line 25, it works:
    vs = VideoStream('/dev/video1').start()

    but putting in the symlink name does not:
    vs = VideoStream('/dev/vidFaceDetector').start()

    but the Ubuntu assigned name changes, so it is no more useful than just the index number.

    • dan June 6, 2018 at 5:09 pm #

      here is the error that occurs when using the symlink:

      [INFO] starting video stream…
      Unable to stop the stream: Inappropriate ioctl for device

  47. dan June 6, 2018 at 6:35 pm #

    Sorry for the string of replies, there does not seem to be a way to edit the original one.

    I tried to see if I could get the ubuntu assigned name from the symlink using

    for camera in glob.glob("/dev/vid*"):
        print(camera, os.readlink(camera))

    but the output is the bus address:

    /dev/vidFaceDetector bus/usb/001/006

    Is there a way to map the bus address to the corresponding “/dev/videoX” device?

    • Adrian Rosebrock June 7, 2018 at 3:05 pm #

      This is a great question and I remember another reader asking the same question on another blog post. To be totally candid, I do not know the solution to this problem as I’ve never encountered it, but it apparently does happen. I would suggest either (1) posting on the official OpenCV forums or (2) opening an issue on OpenCV’s GitHub.

      If you do find out the solution can you come back and post it on this thread so others can learn from it as well?

      Thanks so much!

  48. murat June 8, 2018 at 7:49 am #

    Hi,

    I can make your code work by adjusting (300, 300) to the size of the images I have, and it normally works perfectly. However, now I have to use 12 MP (4056×3040) images. I adjusted the size argument in the blobFromImage function in the same way I used to, but somehow it is not working anymore. I also tried to adjust the input_shape part in the deploy.prototxt.txt file but I couldn’t get any result.

    Do you have any advice for this problem ?

    Thanks so much

    • Adrian Rosebrock June 13, 2018 at 6:15 am #

      Correct, this network is fully convolutional so you can update the size of the input images and it should still work. As far as your 12MP images go, I’m not sure what the problem is there. What were the previous image sizes you were using that the network was still working?

  49. Aditya Mishra June 13, 2018 at 3:34 am #

    Hi Adrian,

    Thanks for the awesome article! However, I was unable to understand why the detect_faces.py script would only detect faces. There are only 2 things that seem to do the trick:
    1. deploy.prototxt file
    2. res10_300x300_ssd_iter_140000.caffemodel

    The prototxt file, shows the configuration of the model, so I assume the “res10_300x300_ssd_iter_140000.caffemodel” is responsible for the face detection.

    So, I wanted to know whether is it possible to replace the above model weights with any other model weights used for detecting other objects (say car, tree, street light, etc) as well as prototxt file & follow the rest of the tutorial as it is and expect it to work just fine?

    Could you point me to some other example for detecting other object that you know of following the similar approach?

    • Adrian Rosebrock June 13, 2018 at 5:24 am #

      Correct, the .prototxt file contains the model definition and the .caffemodel contains the actual weights for the model. Together, they are used to detect objects in images — in this case faces. You can replace these files with other models trained on various objects and recognize them as well.

  50. Huzefa June 15, 2018 at 12:30 am #

    Hey, great post! I had one question: is this algorithm cloud-based, or can it also run on the edge?

    • Adrian Rosebrock June 15, 2018 at 12:03 pm #

      This algorithm is not cloud-based. It will run locally.

  51. huzefa June 15, 2018 at 1:53 am #

    Hey Adrian! What is the use of OpenCL in this algorithm?

    • Adrian Rosebrock June 15, 2018 at 12:02 pm #

      Sorry, are you asking how to use OpenCL with this example? Or what OpenCL is?

  52. Eric Nguyen June 18, 2018 at 3:17 pm #

    Hi Adrian,

    Should this work OK with dlib’s 68-point facial landmark predictor? I tried taking the bounding box from this tutorial and passing it to dlib’s keypoint predictor, but it’s really unstable, i.e., when moving my head side to side the points are predicted incorrectly and move everywhere. I even cropped the bounding box to make it square before passing it to dlib, and it still didn’t work well. It seemed like dlib’s HOG face detector worked better. Any idea why? Thanks so much!

    • Adrian Rosebrock June 19, 2018 at 8:37 am #

      OpenCV and dlib order bounding box coordinates differently so I think that might be your issue. Take a look at this blog post where I take the bounding box coordinates from a Haar cascade and construct a dlib rectangle object from it. You should do the same with the deep learning face detection coordinates.

  53. ranindrastia June 25, 2018 at 5:17 am #

    Hi Adrian,
    Really great post you’re having.
    But I got some error while running the command

    net = cv2.dnn.readNetFromCafee(args["prototxt"], args["model"])
    AttributeError: ‘module’ object has no attribute ‘dnn’
    I tried to search online for the solution, also read this post comment.
    And I still can’t point out which way to solve this error. :'(

    • Adrian Rosebrock June 25, 2018 at 1:42 pm #

      You need at least OpenCV 3.3 (or greater) to access the “dnn” module — previous versions of OpenCV do not have it. You’ll want to re-compile and re-install OpenCV.

      • ranindrastia June 25, 2018 at 10:09 pm #

        I run:
        $ pkg-config --modversion opencv
        And it giving me:
        3.4.1
        Is there any way to access the dnn module, or at least check for it?

      • ranindrastia June 25, 2018 at 11:41 pm #

        I redid the whole OpenCV installation process, following exactly what is written at: https://www.pyimagesearch.com/2018/05/28/ubuntu-18-04-how-to-install-opencv/
        But then I realized something weird.
        Everything is fine until this step:
        $ ls /usr/local/lib/python3.6/site-packages/
        There is no “site-packages”... I don’t understand why....

        • Adrian Rosebrock June 28, 2018 at 8:26 am #

          If there isn’t a “site-packages” directory there should definitely be a “dist-packages” directory. Can you check there?

      • Frances September 13, 2018 at 4:11 am #

        Hi Adrian,

        I had the same error but I m using the Ubuntu Virtual Machine from your book package – how do I reinstall OpenCV onto this?

        • Adrian Rosebrock September 14, 2018 at 9:39 am #

          If you would like to install a new version of OpenCV on the VM (or any machine) you would follow my OpenCV install tutorials.

  54. Mutlucan Tokat June 25, 2018 at 11:42 am #

    Hi Adrian,

    Couldn’t we deploy it via Flask + Apache on a Raspberry Pi? When I try it I get a 504 Timeout Error. Have you ever tried this model as a web application?

    • Adrian Rosebrock June 25, 2018 at 1:38 pm #

      It’s totally possible to use this code as a web application. Have you tried using this blog post as a starting point?

  55. Srinivasan June 26, 2018 at 3:32 pm #

    Does this require installing OpenCV with CUDA support?

    • Adrian Rosebrock June 28, 2018 at 8:14 am #

      No, this method will work on your CPU. In fact, I gathered the results and demos for this post using my CPU.

  56. Eric June 26, 2018 at 4:46 pm #

    Does this algorithm do non-max suppression as well? I suspect it doesn’t, because when I feed its output to the dlib landmark predictor, the predictor goes crazy. Maybe that’s because it has multiple boxes around the face in one frame?

    • Adrian Rosebrock June 28, 2018 at 8:15 am #

      Yes, the algorithm is internally doing NMS. I’m not sure what you mean by the dlib landmark predictor going crazy though. Keep in mind that OpenCV orders coordinates differently than dlib. You’ll need to rearrange the coordinates as I do in this post.

  57. Rohan June 26, 2018 at 8:48 pm #

    Hi Adrian,

    Great post! Very helpful. I have two questions:

    1) I have replicated this solution using C++, but for some reason the framerate is not as good as in the Python version, despite the input blob being created and passed through the model in exactly the same way (i.e. resized to 300×300). Relevant code:

    =================

    // Downsample frame for input to model
    cv::Mat blobFrame;
    cv::resize(rawFrame, blobFrame, cv::Size(300, 300), 0, 0, cv::INTER_AREA);

    // Prepare blob and pass through the network
    cv::Mat blob = cv::dnn::blobFromImage(blobFrame, 1.0, cv::Size(300, 300), cv::Scalar(104.0, 177.0, 123.0));
    net.setInput(blob);
    cv::Mat detections = net.forward();

    =================

    If I comment out net.forward(), everything runs at a smooth framerate. I’m using OpenCV 3.4.1 on Ubuntu. I don’t expect you to magically know what my problem is from the limited information I have provided, but I’m just hoping you might have encountered this before. Others seem to have encountered it as well: http://www.died.tw/2017/11/opencv-dnn-speed-compare-in-python-c-c.html

    2) Can the model be used for commercial applications? I think I found the original source (https://github.com/opencv/opencv_3rdparty/tree/dnn_samples_face_detector_20170830), but it doesn’t have any info on usage.

    • Adrian Rosebrock June 28, 2018 at 8:13 am #

      Hey Rohan, I have not encountered this before, actually. I’m not sure why the C++ version would be running slower than the Python version. My only guess here is that you’re not threading your video stream like I do in this blog post which in turn reduces your FPS throughput rate. Take a look at this blog post for more information on threading.

      As for your second question, I did not train the model so I do not want to speak on behalf of anyone. You should post the question on the OpenCV GitHub page.

      • Rohan Liston July 4, 2018 at 11:03 pm #

        Thanks Adrian. I implemented a threaded version in C++ and gained a noticeable improvement, though it’s still not quite as smooth as the Python version (but still usable).

        Will post a question to the author as suggested.

  58. Shohruh July 6, 2018 at 7:50 am #

    Hey there! What is the angle tolerance of the face detector?

  59. Razmik Karabed July 11, 2018 at 2:41 pm #

    Dear Adrian, would the following 2 changes make the fps faster for Raspberry pi?
    1) Change 300×300 to 150×150 in lines 37 & 38 of detect_faces_video.py?
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (150, 150)), 1.0,
    (150, 150), (104.0, 177.0, 123.0))

    2) Use gray image instead of RGB?
    Thanks,
    Razmik Karabed

    • Adrian Rosebrock July 13, 2018 at 5:12 am #

      Hey Razmik — I suggest you try and see! PyImageSearch is an educational blog. I always do my best to help readers but I also suggest readers play with the code. Try new things! Experiment! Update the code, get an error, resolve it, and keep going. Learn by doing — that’s how I learned.

      The only hint I’ll give you is that the network was trained on RGB images so you won’t be able to use a single channel grayscale image.

      • Razmik Karabed July 17, 2018 at 12:46 am #

        Hi Adrian:

        I praise you for being a best teacher.

        After playing with the code, I learned:

        1) Changing 300×300 to 100×100 increases the fps for Raspberry pi significantly, but
        2) using gray images fails as you kindly hinted.

        Would any of the classes you teach or your recent book on deep learning cover face detection of networks trained with grayscale instead of RGB?

        • Adrian Rosebrock July 17, 2018 at 7:11 am #

          Hey Razmik — it’s totally possible to perform face recognition using grayscale images, it’s just a matter of training your network on them. I demonstrate how to train networks for smile detection and facial expression/emotion recognition inside Deep Learning for Computer Vision with Python. That would be my ideal course/book suggestion for you.

        • Domenick Poster July 18, 2018 at 12:02 pm #

          When you say using grayscale images fails, do you mean the performance is bad, or do you actually get an error?

          The error probably has to do with a discrepancy between the expected and given image shapes (300×300×3 vs 300×300). The easy work-around for using grayscale images with networks trained on RGB is to copy the pixel data into the extra channels, so that you have 3 channels all with the same values. np.stack() will probably work (https://stackoverflow.com/questions/40119743/convert-a-grayscale-image-to-a-3-channel-image).
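
          That work-around can be sketched with NumPy alone (using a random dummy image):

```python
import numpy as np

# single-channel grayscale image, shape (300, 300)
gray = np.random.randint(0, 256, (300, 300), dtype=np.uint8)

# replicate the channel three times -> shape (300, 300, 3)
rgb = np.stack([gray] * 3, axis=-1)
print(rgb.shape)  # (300, 300, 3)
```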

  60. Jing July 17, 2018 at 10:39 am #

    Hi, thank you for the great post!
    I am using your code and I am wondering what the range of the detection box values is. I thought it should be [0, 1], but the range I got for detections[0,0,:,:] is [-0.16, 4.98]. And the boxes with values all within [0, 1] have much smaller confidence, while the maximum confidence is around 0.7.
    Is it because I am working with small images (usually 100×50 in size)? If so, how can I modify the code accordingly?

    Thank you!

    • Jing July 17, 2018 at 11:32 am #

      Alright, I changed the line
      box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
      to
      box = detections[0, 0, i, 3:7] * np.array([w, h, w, h]) * np.array([w, h, w, h]) / 300

      to multiply the box by the ratio of the image size to 300×300.

      And it works now.

  61. M July 19, 2018 at 6:41 am #

    I am just in LOVE with this blog and the work you put in every tutorial.

    thanks loads! 🙂

    • Adrian Rosebrock July 20, 2018 at 6:36 am #

      Thank you, I appreciate that 🙂

  62. Shashank July 24, 2018 at 5:12 am #

    Can we use it with dlib? I mean face detection using this, and recognition using dlib’s 128-D face embeddings.

  63. Sandeep Singh July 26, 2018 at 8:36 am #

    Hi All,

    I am able to detect human faces, but I am unable to recognize a face if it is an animal.

    Is there any API for recognizing the faces of animals?

    • Adrian Rosebrock July 31, 2018 at 12:09 pm #

      Not that I know of. You would likely need to train a network to detect animal faces.

  64. maduabuchi July 28, 2018 at 6:06 pm #

    Hi,
    maybe there is something wrong with my machine, but even with your code I get this error

    usage: detect_faces.py [-h] -p PROTOTXT -m MODEL [-c CONFIDENCE]
    detect_faces.py: error: the following arguments are required: -m/--model

    any workaround?

    • Adrian Rosebrock July 31, 2018 at 9:57 am #

      If you’re new to command line arguments, that’s okay, but you need to read up on them first. Read the tutorial I just linked you to and you’ll gain a new understanding and be able to execute the script 🙂

  65. Ali Nawaz August 2, 2018 at 12:01 pm #

    After running this code as it is. I get these exceptions:

    ipykernel_launcher.py: error: the following arguments are required: -i/--image, -p/--prototxt, -m/--model

    • Adrian Rosebrock August 7, 2018 at 7:47 am #

      You’re using Jupyter Notebooks to run the code, which is fine, but you’ll need to read this post to understand command line arguments first. Once you understand how to use them you can update the code.

  66. Diego August 4, 2018 at 3:31 am #

    Is it possible to run this example on OpenCV 2?
    Good tutorial!

    • Adrian Rosebrock August 7, 2018 at 6:53 am #

      Just to clarify, do you mean OpenCV 2.4? If so, no, you need OpenCV 3.3+ for this example.

  67. prashant August 8, 2018 at 3:23 am #

    Hi Adrian,
    Thank you for the tutorial,
    I tried it and it worked well

    I faced a problem while executing the video face detection: once the exit condition is met, the video display window closes, but the Python kernel remains busy for an indefinite amount of time until I manually close it from the task manager.

    I have tried changing the waitKey value, but to no avail.
    Have you encountered a similar issue before?

    Thanks

    • Adrian Rosebrock August 9, 2018 at 2:57 pm #

      Hey Prashant, I have not encountered that error before. Are you using the example images/videos from this post or your own? Additionally, what OS are you using?

      • prashant August 20, 2018 at 2:28 am #

        The error happens when I use the webcam-stream face detection; otherwise the code runs fine. I was using this code on a Windows system in a Jupyter Notebook setup installed using Anaconda.

  68. Vamshi August 13, 2018 at 5:16 am #

    Hi Adrian, thanks for this implementation. On my Raspberry Pi 3B+ it is slow... Can I record video or take a picture while recognizing a person using a USB webcam?

    • Adrian Rosebrock August 15, 2018 at 8:51 am #

      Yes. You can use the cv2.imwrite function to write images to disk and cv2.VideoWriter to write videos to disk.

  69. Muthu Mariappan H August 16, 2018 at 5:57 am #

    Hi Adrian,
    Thank you very much for this code, it’s working great. In my case I want to track both hands and the face. How do I extract the skin tones from the face, so that I can track the hands? Thanks in advance.

    • Adrian Rosebrock August 17, 2018 at 7:23 am #

      You can detect skin tones but a more robust method would be to use a hand detector, similar to how we detected the face in the image. You could also detect the entire body and then “fit” a skeleton to the body. I’ll be covering that in a future blog post so stay tuned!

      • Muthu Mariappan H August 18, 2018 at 4:56 am #

        Thanks Adrian, awaiting the future blog post.

  70. pragati August 17, 2018 at 1:48 am #

    Hey Adrian,
    while executing the source code for single-image face detection I get the following error; please help me:

    Can’t open “deploy.prototxt.txt” in function ‘ReadProtoFromTextFile’

  71. Yan August 18, 2018 at 12:46 am #

    How to detect face using c++ and opencv & dnn model?

    • Adrian Rosebrock August 22, 2018 at 10:14 am #

      Sorry, I only provide Python code here on the PyImageSearch blog. Best of luck with the project!

  72. pratap August 18, 2018 at 6:14 am #

    Hi Adrian, thanks for a great post. One question though: how did you arrive at the mean values for scaling, (104.0, 177.0, 123.0)?

    • Adrian Rosebrock August 22, 2018 at 10:12 am #

      They are the mean RGB values across all pixels in the training set. We use them to perform mean subtraction.
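
      The subtraction itself is just per-channel arithmetic; here is a NumPy sketch (the mean values come from the tutorial, while the constant-valued image is a dummy):

```python
import numpy as np

image = np.full((300, 300, 3), 150.0, dtype="float32")   # dummy BGR image
mean = np.array([104.0, 177.0, 123.0], dtype="float32")

# subtract the training-set mean from every pixel, channel by channel
normalized = image - mean
print(normalized[0, 0])  # [ 46. -27.  27.]
```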

  73. pratap August 18, 2018 at 6:17 am #

    And one more query: blobFromImage already does resizing, so why should we also pass a resized image via cv2.resize(image, (300, 300))?

    blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 1.0,(300, 300), (104.0, 177.0, 123.0))

  74. Xerebro August 22, 2018 at 5:13 pm #

    Hi Adrian:

    I’m a big fan of your work, thank you for sharing all this knowledge!!
    I have a problem with this example: I don’t get any detection with a confidence value greater than 0.2. I’m sure the dnn is fed a good-quality picture with a big face. If I take the first detection, it always detects “airplane” with 0.13 confidence... but the rectangle is wrong... what do you think could cause this bad detection?

    • Adrian Rosebrock August 24, 2018 at 8:49 am #

      The code in this blog post covers face detection. It doesn’t have an “airplane” class so I’m a bit confused what you are referring to.

  75. Jasen August 23, 2018 at 6:36 am #

    Hi Adrian,
    Superb work, mate. I am really happy with this work of yours. I was really irritated with the accuracy of the Haar cascade classifier; this was superb. I really enjoyed it. Now I wish to follow your guidance to build a face recognizer program which recognizes our faces. Once again, thanks a lot Adrian. Superb post!

    • Adrian Rosebrock August 23, 2018 at 6:50 am #

      Thanks Jasen! I’m so happy you found the post helpful 🙂 If you’re interested in performing face recognition make sure you read this post to get started.

  76. Sherry August 24, 2018 at 11:50 pm #

    Hi Adrian, thanks for your sharing.
    I just have one question: can I use it with TensorFlow rather than Caffe?

    • Adrian Rosebrock August 30, 2018 at 9:37 am #

      The code used here assumes a Caffe model. OpenCV supports some TensorFlow models but the bindings can be a bit buggy at times. Typically I recommend loading your model and then predicting via strict TensorFlow rather than trying to use pure OpenCV.

  77. Eero August 29, 2018 at 6:22 am #

    Wow

    Amazing tutorial and example, thanks.

    Greetings from Finland

    -e-

    • Adrian Rosebrock August 30, 2018 at 9:00 am #

      Thanks Eero, I’m glad you found it helpful! 🙂

  78. ishan vaid September 2, 2018 at 12:53 am #


    AttributeError: module ‘cv2.dnn’ has no attribute ‘blobFromImage’

    Getting this error unable to resolve this

    • Adrian Rosebrock September 5, 2018 at 9:05 am #

      Make sure you are running OpenCV 3.3 or newer. You need OpenCV 3.3 or better to access the “dnn” module.

  79. Shubhayu September 6, 2018 at 7:25 am #

    Hey Adrian, I like how you summarize information about the different open-sourced NNs available. May I suggest that it would be wonderful if you could analyze the intuition behind the architecture of the networks (perhaps as a link or a separate post).

    A few comments and questions about this tutorial:
    The accuracy is not good. It can’t handle skin color variation, and it gives close to 50-60 percent confidence to images that are not close to a face.

    My questions are:
    Should I expect such accuracy from any network this small?
    What if I use GoogLeNet?
    How do these CNNs measure test accuracy then? Isn’t their test set biased itself?

    On a related note:
    Can I read a TensorFlow model I created myself (saved with tf.train.Saver) using cv2.dnn.readNet? What format should my network be in? Is there a particular way I should save the model in order to read it in OpenCV?

    • Adrian Rosebrock September 11, 2018 at 8:47 am #

      Hey Shubhayu — I actually discuss the theory, intuitions, and implementations behind various architectures inside Deep Learning for Computer Vision with Python. I would suggest starting there.

      There could be underlying bias in the dataset itself that was used to train the face detector. Keep in mind that I did not train this face detector. I would suggest gathering your own face detection dataset and/or training your own model on images that your system is likely to encounter in the real-world. That is by far the best way to ensure the highest accuracy rather than relying on off-the-shelf solutions.

  80. Nishant September 7, 2018 at 3:42 am #

    Hey! Can you please tell me how I can detect walls and ceilings in a room using OpenCV?

  81. Rahil Sarvaiya September 11, 2018 at 1:12 pm #

    Hey Adrian, I have a similar project. Could you tell me how to detect different objects, but only plastics? How can I train my model to detect only plastic objects in an image?

    • Adrian Rosebrock September 12, 2018 at 2:11 pm #

      Hey Rahil — do you have an example image of what you’re working with? Are you trying to detect plastic objects or simply recognize texture?

      • Rahil Sarvaiya September 13, 2018 at 4:02 am #

        I need to detect plastic objects in an image of a garbage dump. If I take a picture of a garbage dump, it should highlight the objects which are plastic.

        • Adrian Rosebrock September 14, 2018 at 9:41 am #

          There are two ways to approach such a problem. The first one is via “object detection” but I think you’ll have better luck with semantic segmentation. You’ll likely need to train or fine-tune your own model on plastic images though. I can’t think of a “plastics” dataset off the top of my head so you may need to create one.

  82. Raksha September 12, 2018 at 6:26 am #

    Hi Adrian

    Thank you very much for the tips 🙂 It is really, really useful. I was using Haar cascades and they had some false positives. But this is way better than that. I am using this in an emotion detection system; for that I need to get face coordinates as output, like from the Haar cascades. Any tips or references on how to achieve it?

    • Adrian Rosebrock September 12, 2018 at 1:55 pm #

      OpenCV’s deep learning face detector tends to be more accurate than Haar cascades. I would give that a shot.

      • Raksha September 12, 2018 at 11:49 pm #

        What I meant is that I previously used Haar, but now I am using the model that you used in this post. (I am not a native English speaker; I think that is why you misunderstood me.) However, for the emotion detection system, I need to take face coordinates as input (like what we get from Haar). I can’t find a way to do it using this model. That is my problem 🙁

        • Adrian Rosebrock September 14, 2018 at 9:43 am #

          Line 47 will give you the bounding box coordinates of the face. You can use those coordinates in place of the Haar cascade coordinates.
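
          For reference, here is a sketch of how those coordinates come out of the detector. The loop below assumes each detection has been flattened into a row of [image_id, class_id, confidence, startX, startY, endX, endY] with the box values normalized to [0, 1] (in the post itself the rows live in a NumPy array of shape (1, 1, N, 7)); the extract_boxes helper name is my own:

          ```python
          def extract_boxes(detections, w, h, conf_threshold=0.5):
              """Scale normalized detection rows to pixel bounding boxes."""
              boxes = []
              for det in detections:
                  # det: [image_id, class_id, confidence, startX, startY, endX, endY]
                  if det[2] > conf_threshold:
                      start_x, start_y = int(det[3] * w), int(det[4] * h)
                      end_x, end_y = int(det[5] * w), int(det[6] * h)
                      boxes.append((start_x, start_y, end_x, end_y))
              return boxes

          # toy detection: one face at 90% confidence covering the image center
          rows = [[0, 1, 0.9, 0.25, 0.25, 0.75, 0.75]]
          print(extract_boxes(rows, 300, 300))  # prints [(75, 75, 225, 225)]
          ```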

          • Raksha September 17, 2018 at 4:55 am #

            Thank you very much for everything 🙂 I fully implemented my emotion detection system. <3

      • Raksha September 14, 2018 at 5:15 am #

        Did it 🙂 By the way, this model has a limitation: it does not identify relatively distant faces 🙁 It does not cope well even with photos showing the full body 🙁

  83. Jim September 12, 2018 at 8:14 am #

    I get this error, please help:

    ModuleNotFoundError: No module named ‘cv2’

  84. Rencita September 14, 2018 at 1:54 am #

    Hi Adrian,
    Awesome tutorial. I love to read your posts.
    I tried it for both images as well as the webcam, and it works well.

    I encountered a problem while executing video face detection on an mp4 file: at the last frame of the video, the display window freezes until I manually close it.

    Have you encountered similar issue before? What might be the reason?

    Thanks

    • Adrian Rosebrock September 14, 2018 at 9:23 am #

      Hm, that is strange. Unfortunately I have not encountered that problem before so I’m not sure what the root cause is.

    • nilay September 16, 2018 at 12:30 pm #

      use the condition (note the capital K in waitKey):

      if cv2.waitKey(0) == ord('q'):
          cv2.destroyAllWindows()

  85. nilay September 16, 2018 at 12:27 pm #

    Can’t open “deploy.prototxt.txt\” in function ‘cv::dnn::ReadProtoFromTextFile’

    please help with it.
    thanks

    • Adrian Rosebrock September 17, 2018 at 2:16 pm #

      Double-check your path to the input prototxt file. Your current path is incorrect, so OpenCV throws an error when trying to load a nonexistent file.

Trackbacks/Pingbacks

  1. Python, argparse, and command line arguments - PyImageSearch - March 12, 2018

    […] Adrian, I just downloaded the source code to your deep learning face detection blog post, but when I execute it I get the following […]

Quick Note on Comments

Please note that all comments on the PyImageSearch blog are hand-moderated by me. By moderating each comment on the blog I can ensure (1) I interact with and respond to as many readers as possible and (2) the PyImageSearch blog is kept free of spam.

Typically, I only moderate comments every 48-72 hours; however, I just got married and am currently on my honeymoon with my wife until early October. Please feel free to submit comments of course! Just keep in mind that I will be unavailable to respond until then. For faster interaction and response times, you should join the PyImageSearch Gurus course which includes private community forums.

I appreciate your patience and thank you being a PyImageSearch reader! I will see you when I get back.
