Face recognition with OpenCV, Python, and deep learning

In today’s blog post you are going to learn how to perform face recognition in both images and video streams using:

  • OpenCV
  • Python
  • Deep learning

As we’ll see, the deep learning-based facial embeddings we’ll be using here today are both (1) highly accurate and (2) capable of being executed in real-time.

To learn more about face recognition with OpenCV, Python, and deep learning, just keep reading!


Face recognition with OpenCV, Python, and deep learning

Inside this tutorial, you will learn how to perform facial recognition using OpenCV, Python, and deep learning.

We’ll start with a brief discussion of how deep learning-based facial recognition works, including the concept of “deep metric learning”.

From there, I will help you install the libraries you need to actually perform face recognition.

Finally, we’ll implement face recognition for both still images and video streams.

As we’ll discover, our face recognition implementation will be capable of running in real-time.

Understanding deep learning face recognition embeddings

So, how does deep learning + face recognition work?

The secret is a technique called deep metric learning.

If you have any prior experience with deep learning you know that we typically train a network to:

  • Accept a single input image
  • And output a classification/label for that image

However, deep metric learning is different.

Instead of trying to output a single label (or even the coordinates/bounding box of objects in an image), we output a real-valued feature vector.

For the dlib facial recognition network, the output feature vector is 128-d (i.e., a list of 128 real-valued numbers) that is used to quantify the face. Training the network is done using triplets:

Figure 1: Facial recognition via deep metric learning involves a “triplet training step.” The triplet consists of 3 unique face images, 2 of which are the same person. The NN generates a 128-d vector for each of the 3 face images. For the 2 face images of the same person, we tweak the neural network weights to bring their vectors closer together under the distance metric. Image credit: Adam Geitgey’s “Machine Learning is Fun” blog

Here we provide three images to the network:

  • Two of these images are example faces of the same person.
  • The third image is a random face from our dataset and is not the same person as the other two images.

As an example, let’s again consider Figure 1 above where we provided three images: one of Chad Smith and two of Will Ferrell.

Our network quantifies the faces, constructing the 128-d embedding (quantification) for each.

From there, the general idea is that we’ll tweak the weights of our neural network so that the 128-d measurements of the two Will Ferrell images will be closer to each other and farther from the measurements for Chad Smith.
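To make the triplet idea concrete, here is a minimal sketch in plain Python. It is not dlib's training code, and dlib's actual loss may differ in its details. The `margin` value, the function names, and the toy 3-d vectors are all illustrative assumptions (real embeddings are 128-d).

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss (margin is a hypothetical value).

    The loss is zero once the anchor is closer to the positive
    (same person) than to the negative (different person) by at
    least `margin`. A non-zero loss is what would drive a weight update.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)

# Toy 3-d stand-ins for the 128-d embeddings.
will_a = [0.10, 0.20, 0.30]
will_b = [0.12, 0.18, 0.31]
chad = [0.90, 0.80, 0.10]

# Same-person pair is already close, so there is nothing to correct.
well_separated = triplet_loss(will_a, will_b, chad)
# Swapping the roles simulates a network that gets it wrong.
badly_separated = triplet_loss(will_a, chad, will_b)
```

In this toy case `well_separated` is zero, while `badly_separated` is positive, which is the signal training would use to pull the two Will Ferrell vectors together and push Chad Smith's away.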

Our network architecture for face recognition is based on ResNet-34 from the Deep Residual Learning for Image Recognition paper by He et al., but with fewer layers and the number of filters reduced by half.

The network itself was trained by Davis King on a dataset of ~3 million images. On the Labeled Faces in the Wild (LFW) dataset the network is comparable to other state-of-the-art methods, reaching 99.38% accuracy.

Both Davis King (the creator of dlib) and Adam Geitgey (the author of the face_recognition module we’ll be using shortly) have written detailed articles on how deep learning-based facial recognition works.

I would highly encourage you to read those articles for more details on how deep learning facial embeddings work.

Install your face recognition libraries

In order to perform face recognition with Python and OpenCV we need to install two additional libraries:

The dlib library, maintained by Davis King, contains the implementation of “deep metric learning” that constructs the face embeddings used for the actual recognition process.

The face_recognition  library, created by Adam Geitgey, wraps around dlib’s facial recognition functionality, making it easier to work with.


I assume that you have OpenCV installed on your system. If not, no worries — just visit my OpenCV install tutorials page and follow the guide appropriate for your system.

From there, let’s install dlib  and the face_recognition  packages.

Note: For the following installs, ensure you are in a Python virtual environment if you’re using one. I highly recommend virtual environments for isolating your projects — it is a Python best practice. If you’ve followed my OpenCV install guides (and installed virtualenv  + virtualenvwrapper ) then you can use the workon  command prior to installing dlib  and face_recognition .

Installing dlib without GPU support

If you do not have a GPU you can install dlib  using pip by following this guide:

Or you can compile from source:

Installing dlib with GPU support (optional)

If you do have a CUDA compatible GPU you can install dlib  with GPU support, making facial recognition faster and more efficient.

For this, I recommend installing dlib  from source as you’ll have more control over the build:

Install the face_recognition package

The face_recognition module is installable via a simple pip command:

Install imutils

You’ll also need my package of convenience functions, imutils. You may install it in your Python virtual environment via pip:
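Once the installs above are done, a quick way to confirm that everything is visible to your Python environment is to look up each module without importing it. This is only a convenience sketch; the helper name `check_modules` is hypothetical, and the list uses the import names of the four packages this tutorial relies on.

```python
import importlib.util

# Import names of the packages used in this tutorial.
REQUIRED_MODULES = ["cv2", "dlib", "face_recognition", "imutils"]

def check_modules(names=REQUIRED_MODULES):
    """Return {module_name: True/False} without importing the modules."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Run it from inside the same virtual environment you plan to use for the rest of the post; a MISSING entry means that package still needs to be installed there.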

Our face recognition dataset

Figure 2: An example face recognition dataset was created programmatically with Python and the Bing Image Search API. Shown are six of the characters from the Jurassic Park movie series.

Since Jurassic Park (1993) is my favorite movie of all time, and in honor of Jurassic World: Fallen Kingdom (2018) being released this Friday in the U.S., we are going to apply face recognition to a sample of the characters in the films:

This dataset was constructed in < 30 minutes using the method discussed in my How to (quickly) build a deep learning image dataset tutorial. Given this dataset of images we’ll:

  • Create the 128-d embeddings for each face in the dataset
  • Use these embeddings to recognize the faces of the characters in both images and video streams

Face recognition project structure

Our project structure can be seen by examining the output from the tree  command:

Our project has 4 top-level directories:

  • dataset/ : Contains face images for six characters organized into subdirectories based on their respective names.
  • examples/ : Has three face images for testing that are not in the dataset.
  • output/ : This is where you can store your processed face recognition videos. I’m leaving one of mine in the folder — the classic “lunch scene” from the original Jurassic Park movie.
  • videos/ : Input videos should be stored in this folder. This folder also contains the “lunch scene” video but it hasn’t undergone our face recognition system yet.

We also have 6 files in the root directory:

  • search_bing_api.py : Step 1 is to build a dataset (I’ve already done this for you). To learn how to use the Bing API to build a dataset with my script, just see this blog post.
  • encode_faces.py : Encodings (128-d vectors) for faces are built with this script.
  • recognize_faces_image.py : Recognize faces in a single image (based on encodings from your dataset).
  • recognize_faces_video.py : Recognize faces in a live video stream from your webcam and output a video.
  • recognize_faces_video_file.py : Recognize faces in a video file residing on disk and output the processed video to disk. I won’t be discussing this file today as the bones are from the same skeleton as the video stream file.
  • encodings.pickle : Facial recognition encodings are generated from your dataset via encode_faces.py and then serialized to disk.

After a dataset of images is created (with search_bing_api.py ), we’ll run encode_faces.py  to build the embeddings.

From there, we’ll run the recognize scripts to actually recognize the faces.

Encoding the faces using OpenCV and deep learning

Figure 3: Facial recognition via deep learning and Python using the face_recognition module generates a 128-d real-valued feature vector per face.

Before we can recognize faces in images and videos, we first need to quantify the faces in our training set. Keep in mind that we are not actually training a network here — the network has already been trained to create 128-d embeddings on a dataset of ~3 million images.

We certainly could train a network from scratch or even fine-tune the weights of an existing model but that is more than likely overkill for many projects. Furthermore, you would need a lot of images to train the network from scratch.

Instead, it’s easier to use the pre-trained network and then use it to construct 128-d embeddings for each of the 218 faces in our dataset.

Then, during classification, we can use a simple k-NN model + votes to make the final face classification. Other traditional machine learning models can be used here as well.

To construct our face embeddings open up encode_faces.py  from the “Downloads” associated with this blog post:

First, we need to import required packages. Again, take note that this script requires imutils, face_recognition, and OpenCV installed. Scroll up to the “Install your face recognition libraries” section to make sure you have the libraries ready to go on your system.

Let’s handle our command line arguments that are processed at runtime with argparse :

If you’re new to PyImageSearch, let me direct your attention to the above code block which will become familiar to you as you read more of my blog posts. We’re using argparse  to parse command line arguments. When you run a Python program in your command line, you can provide additional information to the script without leaving your terminal. Lines 10-17 do not need to be modified as they parse input coming from the terminal. Check out my blog post about command line arguments if these lines look unfamiliar.

Let’s list out the argument flags and discuss them:

  • --dataset : The path to our dataset (we created a dataset with search_bing_api.py  described in method #2 of last week’s blog post).
  • --encodings : Our face encodings are written to the file that this argument points to.
  • --detection-method : Before we can encode faces in images we first need to detect them. Our two face detection methods are hog and cnn. Those two values are the only ones that will work for --detection-method.
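The three flags above can be sketched with argparse as follows. This is a reconstruction from the description rather than the script from the "Downloads", so the line numbers quoted in the text will not line up with it. The help strings, the `choices` restriction, and the helper names `build_parser` and `parse_args` are my own assumptions; the `cnn` default follows from the post's statement that the script falls back to the deep learning detector.

```python
import argparse

def build_parser():
    """Build a parser for the three flags discussed above."""
    ap = argparse.ArgumentParser(description="Encode the faces in a dataset")
    ap.add_argument("--dataset", required=True,
                    help="path to input directory of face images")
    ap.add_argument("--encodings", required=True,
                    help="path to the serialized face encodings file")
    # Assumption: reject anything other than the two supported detectors.
    ap.add_argument("--detection-method", default="cnn", choices=["hog", "cnn"],
                    help="face detection model to use")
    return ap

def parse_args(argv=None):
    """Return the arguments as a dict (argv=None reads sys.argv)."""
    # argparse converts the dash in --detection-method to an underscore,
    # which is why the key is "detection_method".
    return vars(build_parser().parse_args(argv))

example = parse_args(["--dataset", "dataset", "--encodings", "encodings.pickle"])
```

Here `example["detection_method"]` falls back to `"cnn"` because the flag was not supplied.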

Now that we’ve defined our arguments, let’s grab the paths to the files in our dataset (as well as perform two initializations):

Line 21 uses the path to our input dataset directory to build a list of all imagePaths  contained therein.

We also need to initialize two lists before our loop,  knownEncodings  and knownNames , respectively. These two lists will contain the face encodings and corresponding names for each person in the dataset (Lines 24 and 25).

It’s time to begin looping over our Jurassic Park character faces!

This loop will cycle 218 times corresponding to our 218 face images in the dataset. We’re looping over the paths to each of the images on Line 28.

From there, we’ll extract the name  of the person from the imagePath  (as our subdirectory is named appropriately) on Line 32.

Then let’s load the image  while passing the imagePath  to cv2.imread  (Line 36).

OpenCV orders color channels in BGR, but dlib expects RGB. The face_recognition module uses dlib, so before we proceed, let’s swap color spaces on Line 37, naming the new image rgb.
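A rough sketch of the two per-image steps just described: pulling the person's name out of the path and loading the image in RGB order. The helper names and the example file name are hypothetical; `cv2.imread` and `cv2.cvtColor` with `cv2.COLOR_BGR2RGB` are the standard OpenCV calls for loading and channel swapping.

```python
import os

def name_from_path(image_path):
    """The person's name is the directory that directly contains the image."""
    return image_path.split(os.path.sep)[-2]

def load_rgb(image_path):
    """Load an image with OpenCV (BGR) and return it in RGB order."""
    import cv2  # imported lazily so name_from_path works without OpenCV
    image = cv2.imread(image_path)
    if image is None:
        # cv2.imread returns None instead of raising on a bad path.
        raise FileNotFoundError(f"could not read image: {image_path}")
    return cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Hypothetical file inside one of the per-character subdirectories.
example_path = os.path.join("dataset", "alan_grant", "00000000.jpg")
example_name = name_from_path(example_path)
```

The name extraction only works because each character's images live in a subdirectory named after that character, as described in the project structure section.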

Next, let’s localize the face and compute encodings:

This is the fun part of the script!

For each iteration of the loop, we’re going to detect a face (or possibly multiple faces and assume that it is the same person in multiple locations of the image — this assumption may or may not hold true in your own images so be careful here).

For example, let’s say that rgb  contains a picture (or pictures) of Ellie Sattler’s face.

Lines 41 and 42 find/localize her faces, resulting in a list of face boxes. We pass two parameters to the face_recognition.face_locations method:

  • rgb : Our RGB image.
  • model : Either cnn  or hog (this value is contained within our command line arguments dictionary associated with the "detection_method"  key). The CNN method is more accurate but slower. HOG is faster but less accurate.

Then, we’re going to turn the bounding boxes  of Ellie Sattler’s face into a list of 128 numbers on Line 45. This is known as encoding the face into a vector and the face_recognition.face_encodings  method handles it for us.

From there we just need to append the Ellie Sattler encoding  and name  to the appropriate list ( knownEncodings  and knownNames ).

We’ll continue to do this for all 218 images in the dataset.
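The detect, encode, and append steps might be sketched like this. `face_recognition.face_locations` and `face_recognition.face_encodings` are the library calls named above; the wrapper functions and the stand-in vectors are assumptions for illustration only.

```python
def encode_faces_in_image(rgb, detection_method="cnn"):
    """Return one 128-d encoding per face detected in an RGB image."""
    import face_recognition  # lazy import: requires dlib to be installed
    boxes = face_recognition.face_locations(rgb, model=detection_method)
    return face_recognition.face_encodings(rgb, boxes)

def add_encodings(name, encodings, known_encodings, known_names):
    """Append every encoding found in one image under the same name."""
    for encoding in encodings:
        known_encodings.append(encoding)
        known_names.append(name)

known_encodings, known_names = [], []
# Stand-in vectors (3-d instead of 128-d) for two faces found in one image.
add_encodings("ellie_sattler", [[0.1, 0.2, 0.3], [0.1, 0.2, 0.4]],
              known_encodings, known_names)
```

Note how both stand-in faces are stored under the same name. That is the assumption flagged earlier: every face found in an image is treated as the person the image is filed under.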

What would be the point of encoding the images unless we could use the encodings  in another script which handles the recognition?

Let’s take care of that now:

Line 56 constructs a dictionary with two keys — "encodings"  and "names" .

From there Lines 57-59 dump the names and encodings to disk for future recall.
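A minimal sketch of the serialization step, using a temporary directory so it can be run anywhere. The function names are hypothetical; the two dictionary keys are the ones named above.

```python
import os
import pickle
import tempfile

def save_encodings(path, known_encodings, known_names):
    """Write the encodings and names to disk as a single pickled dict."""
    data = {"encodings": known_encodings, "names": known_names}
    with open(path, "wb") as f:
        pickle.dump(data, f)

def load_encodings(path):
    """Read the dict back, as the recognition scripts need to do."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Round trip with stand-in data (real encodings are 128-d vectors).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "encodings.pickle")
    save_encodings(demo_path, [[0.1, 0.2]], ["alan_grant"])
    restored = load_encodings(demo_path)
```

Because the two lists are stored in parallel, index i of "names" always labels index i of "encodings", which is what the voting step later relies on.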

How should I run the encode_faces.py  script in the terminal?

To create our facial embeddings open up a terminal and execute the following command:

As you can see from our output, we now have a file named encodings.pickle  — this file contains the 128-d face embeddings for each face in our dataset.

On my Titan X GPU, processing the entire dataset took a little over a minute, but if you’re using a CPU, be prepared to wait a while for this script to complete!

On my MacBook Pro (no GPU), encoding 218 images required 21min 20sec.

You should expect much faster speeds if you have a GPU and compiled dlib with GPU support.

Recognizing faces in images

Figure 4: John Hammond’s face is recognized using Adam Geitgey’s deep learning face_recognition Python module.

Now that we have created our 128-d face embeddings for each image in our dataset, we are ready to recognize faces in images using OpenCV, Python, and deep learning.

Open up recognize_faces_image.py  and insert the following code (or better yet, grab the files and image data associated with this blog post from the “Downloads” section found at the bottom of this post, and follow along):

This script requires just four imports on Lines 2-5. The face_recognition  module will do the heavy lifting and OpenCV will help us to load, convert, and display the processed image.

We’ll parse three command line arguments on Lines 8-15:

  • --encodings : The path to the pickle file containing our face encodings.
  • --image : This is the image that is undergoing facial recognition.
  • --detection-method : You should be familiar with this one by now — we’re either going to use a hog  or cnn  method depending on the capability of your system. For speed, choose hog  and for accuracy, choose cnn .

IMPORTANT! If you are:

  1. Running the face recognition code on a CPU
  2. Or using a Raspberry Pi

…you’ll want to set --detection-method to hog, as the CNN face detector is (1) slow without a GPU and (2) too memory-hungry for the Raspberry Pi.

From there, let’s load the pre-computed encodings + face names and then construct the 128-d face encoding for the input image:

Line 19 loads our pickled encodings and face names from disk. We’ll need this data later during the actual face recognition step.

Then, on Lines 22 and 23 we load and convert the input image  to rgb  color channel ordering (just as we did in the encode_faces.py  script).

We then proceed to detect all faces in the input image and compute their 128-d encodings  on Lines 29-31 (these lines should also look familiar).

Now is a good time to initialize a list of names  for each face that is detected — this list will be populated in the next step.

Next, let’s loop over the facial encodings :

On Line 37 we begin to loop over the face encodings computed from our input image.

Then the facial recognition magic happens!

We attempt to match each face in the input image ( encoding ) to our known encodings dataset (held in data["encodings"] ) using face_recognition.compare_faces  (Lines 40 and 41).

This function returns a list of True / False  values, one for each image in our dataset. For our Jurassic Park example, there are 218 images in the dataset and therefore the returned list will have 218 boolean values.

Internally, the compare_faces  function is computing the Euclidean distance between the candidate embedding and all faces in our dataset:

  • If the distance is below some tolerance (the smaller the tolerance, the more strict our facial recognition system will be) then we return True , indicating the faces match.
  • Otherwise, if the distance is above the tolerance threshold we return False  as the faces do not match.

Essentially, we are utilizing a “more fancy” k-NN model for classification. Be sure to refer to the compare_faces implementation for more details.
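The distance test described above can be sketched in plain Python. This is an illustration of the idea, not the library's source; the `tolerance=0.6` default mirrors what I understand to be the default of `face_recognition.compare_faces`, and the 2-d vectors are toy stand-ins for 128-d encodings.

```python
import math

def face_distance(known_encodings, candidate):
    """Euclidean distance from the candidate to every known encoding."""
    return [math.sqrt(sum((k - c) ** 2 for k, c in zip(known, candidate)))
            for known in known_encodings]

def compare_faces_sketch(known_encodings, candidate, tolerance=0.6):
    """One True/False per known encoding: True when within tolerance."""
    return [d <= tolerance for d in face_distance(known_encodings, candidate)]

# Distances from the origin are roughly 0.0, 0.5, and 5.0.
known = [[0.0, 0.0], [0.3, 0.4], [3.0, 4.0]]
matches = compare_faces_sketch(known, [0.0, 0.0])
strict = compare_faces_sketch(known, [0.0, 0.0], tolerance=0.4)
```

With the default tolerance the first two entries match; tightening the tolerance to 0.4 drops the second one, which shows how a smaller tolerance makes the system stricter.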

The name  variable will eventually hold the name string of the person — for now, we leave it as "Unknown"  in case there are no “votes” (Line 42).

Given our matches  list we can compute the number of “votes” for each name (number of True  values associated with each name), tally up the votes, and select the person’s name with the most corresponding votes:

If there are any True  votes in matches  (Line 45) we need to determine the indexes of where these True  values are in matches . We do just that on Line 49 where we construct a simple list of matchedIdxs  which might look like this for example_01.png :

We then initialize a dictionary called counts which will hold the character name as the key and the number of votes as the value (Line 50).

From there, let’s loop over the matchedIdxs  and set the value associated with each name while incrementing it as necessary in counts .  The counts  dictionary might look like this for a high vote score for Ian Malcolm:

Recall that we only have 41 pictures of Ian in the dataset, so a score of 40 with no votes for anybody else is extremely high.

Line 61 extracts the name with the most votes from counts , in this case it would be 'ian_malcolm' .
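The voting logic described above can be condensed into one function. This is a sketch under the assumption that `matches` and `names` are parallel lists; the function name `vote` and the toy data are hypothetical.

```python
def vote(matches, names):
    """Return the name with the most True matches, or "Unknown" if none."""
    name = "Unknown"
    if True in matches:
        # Indexes of the dataset images that matched the candidate face.
        matched_idxs = [i for i, matched in enumerate(matches) if matched]
        counts = {}
        for i in matched_idxs:
            counts[names[i]] = counts.get(names[i], 0) + 1
        # Name with the largest vote count.
        name = max(counts, key=counts.get)
    return name

names = ["ian_malcolm", "ian_malcolm", "alan_grant", "ian_malcolm"]
winner = vote([True, True, False, True], names)
nobody = vote([False, False, False, False], names)
```

One design consequence worth knowing: on a tie, `max` returns whichever tied name was added to `counts` first, so the result then depends on dataset order.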

The second iteration of the main facial encodings loop (there are two faces in our example image) yields the following for counts:

That is definitely a smaller vote score, but still, there is only one name in the dictionary so we likely have found Alan Grant.

Note: The PDB Python Debugger was used to verify values of the counts  dictionary. PDB usage is outside the scope of this blog post; however, you can discover how to use it on the Python docs page.

As shown in Figure 5 below, both Ian Malcolm and Alan Grant have been correctly recognized, so this part of the script is working well.

Let’s move on and loop over the bounding boxes and labeled names for each person and draw them on our output image for visualization purposes:

On Line 67 we begin looping over the detected face bounding  boxes  and predicted  names . To create an iterable object so we can easily loop through the values, we call zip(boxes, names)  resulting in tuples that we can extract the box coordinates and name from.

We use the box coordinates to draw a green rectangle on Line 69.

We also use the coordinates to calculate where we should draw the text for the person’s name (Line 70) followed by actually placing the name text on the image (Lines 71 and 72). If the face bounding box is at the very top of the image, we need to move the text below the top of the box (handled on Line 70), otherwise the text would be cut off.
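The drawing step might look roughly like the sketch below. It assumes boxes arrive in the (top, right, bottom, left) order that face_recognition returns; the 15-pixel offset, the font settings, and the helper names are assumptions, not values confirmed by the text.

```python
def label_y(top, offset=15):
    """Put the label above the box, or just inside it near the image top."""
    return top - offset if top - offset > offset else top + offset

def draw_labels(image, boxes, names):
    """Draw a green rectangle and the predicted name for every face."""
    import cv2  # lazy import so label_y can be used without OpenCV
    for (top, right, bottom, left), name in zip(boxes, names):
        cv2.rectangle(image, (left, top), (right, bottom), (0, 255, 0), 2)
        cv2.putText(image, name, (left, label_y(top)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.75, (0, 255, 0), 2)
    return image
```

For a face whose box starts at row 100 the label lands at row 85, above the box; for a box starting at row 10 it moves down to row 25 so the text is not cut off by the image border.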

We then proceed to display the image until a key is pressed (Lines 75 and 76).

How should you run the facial recognition Python script?

Using your terminal, first ensure you’re in the correct Python virtual environment with the workon command (if you are using a virtual environment, of course).

Then run the script while providing the two command line arguments at a minimum. If you choose to use the HOG method, be sure to pass --detection-method hog as well (otherwise it will default to the deep learning detector).

Let’s go for it!

To recognize a face using OpenCV and Python open up your terminal and execute our script:

Figure 5: Alan Grant and Ian Malcolm’s faces are recognized using our Python + OpenCV + deep learning method.

A second face recognition example follows:

Figure 6: Face recognition with OpenCV and Python.

Recognizing faces in video

Figure 7: Facial recognition in video via Python, OpenCV, and deep learning.

Now that we have applied face recognition to images, let’s apply it to videos (in real-time) as well.

Important Performance Note: The CNN face detector should only be used in real-time if you are working with a GPU (you can use it with a CPU, but expect less than 0.5 FPS, which makes for a choppy video). Alternatively, if you are using a CPU, you should use the HOG method (or even OpenCV Haar cascades, covered in a future blog post) and expect adequate speeds.

The following script shares many parallels with the previous recognize_faces_image.py script. Therefore I’ll be breezing past what we’ve already covered and just review the video components so that you understand what is going on.

Once you’ve grabbed the “Downloads”, open up recognize_faces_video.py  and follow along:

We import packages on Lines 2-8 and then proceed to parse our command line arguments on Lines 11-20.

We have four command line arguments, two of which you should recognize from above ( --encodings  and --detection-method ). The other two arguments are:

  • --output : The path to the output video.
  • --display : A flag which instructs the script to display the frame to the screen. A value of 1  displays and a value of  0  will not display the output frames to our screen.

From there we’ll load our encodings and start our VideoStream :

To access our camera we’re using the VideoStream class from imutils. Line 29 starts the stream. If you have multiple cameras on your system (such as a built-in webcam and an external USB cam), you can change the src=0 to src=1 and so forth.

We’ll be optionally writing processed video frames to disk later, so we initialize writer  to None (Line 30). Sleeping for 2 complete seconds allows our camera to warm up (Line 31).

From there we’ll start a while  loop and begin to grab and process frames:

Our loop begins on Line 34 and the first step we take is to grab a frame  from the video stream (Line 36).

The remaining Lines 40-50 in the above code block are nearly identical to the lines in the previous script with the exception being that this is a video frame and not a static image. Essentially we read the frame , preprocess, and then detect face bounding boxes  + calculate encodings  for each bounding box.

Next, let’s loop over the facial encodings  associated with the faces we have just found:

In this code block, we loop over each of the encodings  and attempt to match the face. If there are matches found, we count the votes for each name in the dataset. We then extract the highest vote count and that is the name associated with the face. These lines are identical to the previous script we reviewed, so let’s move on.

In this next block, we loop over the recognized faces and proceed to draw a box around the face and the display name of the person above the face:

Those lines are identical too, so let’s focus on the video related code.

Optionally, we’re going to write the frame to disk, so let’s see how writing video to disk with OpenCV works:

Assuming we have an output file path provided in the command line arguments and we haven’t already initialized a video writer  (Line 99), let’s go ahead and initialize it.

On Line 100, we initialize the VideoWriter_fourcc . FourCC is a 4-character code and in our case we’re going to use the “MJPG” 4-character code.

From there, we’ll pass that object into the VideoWriter  along with our output file path, frames per second target, and frame dimensions (Lines 101 and 102).

Finally, if the writer  exists, we can go ahead and write a frame to disk (Lines 106-107).
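A sketch of the lazy writer initialization described above. `cv2.VideoWriter_fourcc` and `cv2.VideoWriter` are the OpenCV calls named in the text; the 20 FPS target and both helper names are assumptions. The small `fourcc_code` helper is there only to show what a FourCC is: to my understanding OpenCV packs the four characters into one integer, lowest byte first.

```python
def fourcc_code(code):
    """Pack a 4-character code into a single integer, lowest byte first."""
    if len(code) != 4:
        raise ValueError("a FourCC needs exactly four characters")
    return sum(ord(ch) << (8 * i) for i, ch in enumerate(code))

def ensure_writer(writer, output_path, frame, fps=20):
    """Create the VideoWriter on first use, then write the frame."""
    import cv2  # lazy import so fourcc_code can be used without OpenCV
    if writer is None and output_path is not None:
        fourcc = cv2.VideoWriter_fourcc(*"MJPG")
        height, width = frame.shape[:2]
        # VideoWriter expects (width, height), the reverse of frame.shape.
        writer = cv2.VideoWriter(output_path, fourcc, fps, (width, height), True)
    if writer is not None:
        writer.write(frame)
    return writer

mjpg = fourcc_code("MJPG")
```

Calling `writer = ensure_writer(writer, args["output"], frame)` once per loop iteration keeps the writer as None when no output path was given, so nothing is written in that case.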

Let’s handle whether or not we should display the face recognition video frames on the screen:

If our display command line argument is set, we go ahead and display the frame (Line 112) and check if the quit key ( "q" ) has been pressed (Lines 113-116), at which point we’d break  out of the loop (Line 117).

Lastly, let’s perform our housekeeping duties:

In Lines 120-125, we clean up and release the display, video stream, and video writer.

Are you ready to see the script in action?

To demonstrate real-time face recognition with OpenCV and Python in action, open up a terminal and execute the following command:

Below you can find an output example video that I recorded demonstrating the face recognition system in action:

Face recognition in video files

As I mentioned in our “Face recognition project structure” section, there’s an additional script included in the “Downloads” for this blog post — recognize_faces_video_file.py .

This file is essentially the same as the one we just reviewed for the webcam except it will take an input video file and generate an output video file if you’d like.

I applied our face recognition code to the popular “lunch scene” from the original Jurassic Park movie where the cast is sitting around a table sharing their concerns with the park:

Here’s the result:

Note: Recall that our model was trained on four members of the original cast: Alan Grant, Ellie Sattler, Ian Malcolm, and John Hammond. The model was not trained on Donald Gennaro (the lawyer) which is why his face is labeled as “Unknown”. This behavior was by design (not an accident) to show that our face recognition system can recognize faces it was trained on while leaving faces it cannot recognize as “Unknown”.

And in the following video I have put together a “highlight reel” of Jurassic Park and Jurassic World clips, mainly from the trailers:

As we can see, our face recognition and OpenCV code works quite well!

Can I use this face recognizer code on the Raspberry Pi?

Kinda, sorta. There are a few limitations though:

  1. The Raspberry Pi does not have enough memory to utilize the more accurate CNN-based face detector…
  2. …so we are limited to HOG instead
  3. Except that HOG is far too slow on the Pi for real-time face detection…
  4. …so we need to utilize OpenCV’s Haar cascades

And once you get it running you can expect only 1-2 FPS, and even reaching that level of FPS takes a few tricks.

The good news is that I’ll be back next week to discuss how to run our face recognizer on the Raspberry Pi, so stay tuned!


In this tutorial, you learned how to perform face recognition with OpenCV, Python, and deep learning.

Additionally, we made use of Davis King’s dlib library and Adam Geitgey’s face_recognition module which wraps around dlib’s deep metric learning, making facial recognition easier to accomplish.

As we found out, our face recognition implementation is both:

  • Accurate
  • Capable of being executed in real-time with a GPU

I hope you enjoyed today’s blog post on face recognition!

To download the source code to this post, and be notified when future tutorials are published here on PyImageSearch, just enter your email address in the form below!




205 Responses to Face recognition with OpenCV, Python, and deep learning

  1. Gagandeep Singh June 18, 2018 at 10:58 am #

    Hi Adrian,
    Can we achieve the same in the TensorFlow framework? Is it possible to use dlib landmark and feature extractor models in TensorFlow? I tried loading the model in a tf session but it of course failed during parsing.

    • Adrian Rosebrock June 18, 2018 at 11:06 am #

      You’re asking a few different questions here, so let me try to take apart the question and answer:

      1. We are using dlib’s pre-trained deep metric facial embedding network to extract the 128-d feature vectors used to quantify each face.
      2. Facial landmarks aren’t actually covered in this post. Maybe you’re referring to this one?
      3. These models are not directly compatible with TensorFlow. You need dlib to use them out of the box.

      • Diomedes Domínguez June 18, 2018 at 11:44 am #

        So, basically, we can’t export this work to be used with the Intel Movidius stick, right?

        • Adrian Rosebrock June 18, 2018 at 12:30 pm #

          No, not directly. You could use the same algorithms and techniques but you would need to train a Caffe or TensorFlow model that is compatible with the Movidius. That said, this method can run on the Pi (I’ll be sharing a blog post on it next week).

          • Gagandeep Singh June 19, 2018 at 7:10 am #

            Thanks Adrian, will be waiting for your blog post.

  2. Sushant Tyagi June 18, 2018 at 11:10 am #

    That’s a really good job there Adrian.
    I always love seeing new posts here and there’s always something new in deep learning I learn from you.

    Thanks for your time and effort towards the community.

    • Adrian Rosebrock June 18, 2018 at 11:18 am #

      Thank you Sushant, I really appreciate that 🙂

      • Barna June 18, 2018 at 12:42 pm #

        Adrian, you should think about offering free courses on Coursera or edX (if you are not already doing it)

        Keep up the good work

        • Adrian Rosebrock June 18, 2018 at 1:25 pm #

          I’ve considered it but I really don’t think it’s going to happen. I prefer self-publishing my own content and having a better relationship with readers/students. I also really do not like how Udacity and the like treat their content creators. I don’t think it’s a very good relationship, and yes, while I could theoretically reach a larger audience, I don’t think it’s worth losing the more personal relationships here on the blog. Relationships and being able to communicate with others on a meaningful level is not something I would ever sacrifice.

  3. Ritika Agarwal June 18, 2018 at 11:30 am #

    Thank you for the wonderful post! Always wait for your post to learn new things

    • Adrian Rosebrock June 18, 2018 at 12:31 pm #

      Thanks Ritika! 🙂

  4. Anirban Ghosh June 18, 2018 at 11:44 am #


    Thanks a lot for this great tutorial. I have a question, in the encode_faces.py code we have a detection method flag which is by default cnn. I could not understand the statement ” Instead, it’s easier to use the pre-trained network and then use it to construct 128-d embeddings for each of the 218 faces in our dataset”. Where is the pretrained network or its weights?

    I checked the GitHub source of face_recognition , I could only find the author telling that the network was trained on dlib using deep learning but could not find the Deep learning network used to train the network in the code repository. It seems that by calling the flag cnn I am actually getting access to the face recognition algorithm’s weight but could not understand how. In fact , I am not able to connect this blog post with the dl4cv practitioner bundle lesson three or five(I have reached only till this.).

    This is way ahead of my knowledge and so sorry for asking a foolish question.
    Anirban Ghosh

    • Adrian Rosebrock June 18, 2018 at 12:29 pm #

      There are two networks if you use the “CNN” for face detection:

      1. One CNN is used to detect faces in an image. You could use HOG + Linear SVM or a Haar cascade here instead.
      2. The other CNN is the deep metric CNN. It takes an input face and computes the 128-d quantification of the face.

      • Anirban Ghosh June 18, 2018 at 12:42 pm #

        Thanks, Sir, your tutorials are just so great. It is my lack of knowledge that I have difficulty in understanding.
        Anirban Ghosh

  5. Dads June 18, 2018 at 11:46 am #

    Thanks for this tutorial.
    I want to simplify all of these steps for the user so that they can easily add different people for the system to identify. For example, I am using a Raspberry Pi…
    For example, the user could easily send different images to the system (from another computer to the Raspberry Pi) and have it identify the individual.

    • Adrian Rosebrock June 18, 2018 at 12:28 pm #

      Can you clarify what “simplify” means in this context? Are you talking about building a specific user interface or set of tools to facilitate face recognition on the Pi? Are you talking about reducing the computational complexity of this method such that it can run on the Pi?

  6. Dads June 18, 2018 at 12:21 pm #

    Please answer me, please.
    I’m your advocate.

  7. ratman June 18, 2018 at 12:39 pm #

    Nice tutorial! I implemented basically the same pipeline with hypersphere embedding (https://arxiv.org/abs/1704.08063), but despite the promises, it works poorly on real data. The lighting conditions and the shadows seem to split my clusters even for intuitively easy cases. What are your experiences with dlib?

    • Adrian Rosebrock June 18, 2018 at 1:21 pm #

      I’ve had wonderful experiences with dlib. The library is easy to use and well documented. Davis King, the creator, has done an incredible job with the library.

      • ratman June 18, 2018 at 4:00 pm #

        Dlib uses the facenet architecture, inspired by the openface implementation, as far as I know. In my eyes this is an embedding vs. embedding competition; I don’t care about the library.

        • Adrian Rosebrock June 18, 2018 at 4:44 pm #

          Compared to OpenFace I’ve found dlib to be substantially easier to use and just as accurate. Between the two I would opt for dlib but that’s just my opinion.

  8. Anirban Ghosh June 18, 2018 at 12:39 pm #

    I kept on looking and finally found that the dlib library is actually trained on a DL network. It is written in C++, and since I am also studying C++ on the side, this is something I understand (at least superficially). Now it is clear why we can just pass in the image for detecting the face in encoding_image.py: the dlib library is trained on 3 million plus images, and here we just used its weights to find the faces in the images and appended the name to the appropriate 128-d vector embedding that was created. I need to learn more before I can have a complete grasp of this subject. Anyway, great explanation. I had seen this GitHub repo of Adam Geitgey before but could not understand then how to use it. Today I at least understand it a bit, and it looks like the concept is pretty similar to word embeddings used in NLP. Thanks anyway for the nice tutorials.


    Anirban Ghosh.

    • Adrian Rosebrock June 18, 2018 at 1:23 pm #

      Your understanding is correct, Anirban. We must first detect the face in the image (using a deep learning based detector, HOG + Linear SVM detector, Haar cascade detector, etc.) and then pass the detected face region into the deep learning embedding network provided by dlib. This network then produces the 128-d embedding of the face.

      • Nitin Khurana June 29, 2018 at 2:48 am #

        Hi Adrian,

        Thanks for explaining the Face recognition Technology with different combinations of algo.

        Would it be possible for you to provide some more details on “128-d embedding of the face”.

        I understand that the network calculates this, but I am curious to know which features of the face it is calculated from.

        Is it calculated from things like the distance from nose to eyes, nose to lips, eye to eye, etc.?

        I want to understand exactly which points/features are used and how this becomes a vector of only 128 values…

        • Adrian Rosebrock June 29, 2018 at 5:35 am #

          Take a look at the original articles by Davis King and Adam Geitgey that I linked to in the “Understanding deep learning face recognition embeddings” section. Adam’s article in particular goes more into the details on the embedding.

          • Nitin Khurana July 5, 2018 at 5:31 am #

            Hi Adrian,

            Thanks for guiding.
            But the above article also didn’t go into detail about what features these embeddings signify.

            Is there any other tutorial or link you can guide me to,
            or can you give information on the features these embeddings signify?

          • Adrian Rosebrock July 5, 2018 at 6:10 am #

            I would suggest you read the FaceNet paper as well as the DeepFace paper.

  9. Dads June 18, 2018 at 12:40 pm #

    Yes, I mean a user interface application for interacting with the Raspberry Pi server.

    • Adrian Rosebrock June 18, 2018 at 1:21 pm #

      There are a few ways to accomplish this, but if you’re talking strictly about building a user interface, that doesn’t really have a whole lot to do with computer vision or deep learning (I’m also certainly not a GUI expert). If you’re interested in building user interfaces take a look at libraries such as Tkinter, Qt, and Kivy. From there you’ll want to take a look at different mechanisms to pass data back and forth between the client and server. A message passing library like ZeroMQ or RabbitMQ would be a good option.

  10. Manivannan Murugavel June 18, 2018 at 12:41 pm #

    Please try my pyfacy Python package for user-friendly face recognition.
    Refer link:

  11. Mansoor Nasir June 18, 2018 at 12:46 pm #

    Great tutorial Adrian.

    I am anxiously waiting for your post every Monday.

    It would be very nice of you, though, if you could show us how to train a face recognition system from scratch using a standard detection model (YOLO, MobileNet, SqueezeNet, etc.) built specifically for low-power single board computers, i.e., the Raspberry Pi, using Keras and TensorFlow or DIGITS + Caffe, etc.

    I really appreciate your effort and time that you put into organizing these tutorials.

    Thanks again!

    • Adrian Rosebrock June 18, 2018 at 1:26 pm #

      I’ll be doing a face recognition + Raspberry Pi tutorial next week 😉

  12. nikhil June 18, 2018 at 1:21 pm #

    Nice post Adrian. I’m getting an error: $ python encode_faces.py --dataset dataset --encodings encodings.pickle
    Illegal instruction (core dumped)
    How can I fix this?

    • Adrian Rosebrock June 18, 2018 at 1:28 pm #

      It sounds like one of the libraries is causing a segfault of some sort. Try inserting “print” statements or using “pdb” to find the line that is causing the issue.

      • nikhil June 19, 2018 at 12:32 am #

        It is a problem with the dlib library. I tried building with AVX_INSTRUCTIONS=0 and was able to run the program, but the process is too slow and laggy and my computer froze. I think it’s not running on the GPU.

        • Adrian Rosebrock June 19, 2018 at 8:30 am #

          Your computer likely isn’t frozen, it’s just taking a while to detect the faces. Try using the HOG + Linear SVM face detector rather than the CNN detector.

          • Nikhil R June 20, 2018 at 12:07 pm #

            What changes should I make if I use the HOG + SVM method? I have never used those before.

          • Adrian Rosebrock June 20, 2018 at 12:49 pm #

            Please refer to the post as I discuss the difference between the two methods. You change your --detection-method from cnn to hog. Again, refer to the post.
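
As a rough sketch of what that switch does, assuming the script parses its flags with argparse and forwards the value to the face detector: the `build_parser` helper name and the `choices` restriction below are illustrative assumptions, not taken from the post.

```python
import argparse


def build_parser():
    """Build a parser with the two flags discussed in this thread (hypothetical helper)."""
    ap = argparse.ArgumentParser()
    ap.add_argument("-e", "--encodings", default="encodings.pickle",
                    help="path to the serialized face encodings")
    ap.add_argument("-d", "--detection-method", default="cnn",
                    choices=["hog", "cnn"],
                    help="face detector to use: 'hog' (faster on CPU) or 'cnn'")
    return ap


# Equivalent to running: python encode_faces.py --detection-method hog
args = vars(build_parser().parse_args(["--detection-method", "hog"]))
print(args["detection_method"])

# The value would then be handed to the detector, for example:
#   boxes = face_recognition.face_locations(rgb, model=args["detection_method"])
```

Nothing else in the script needs to change when you switch detectors; only the string passed on the command line differs.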

  13. Devarshi June 18, 2018 at 1:32 pm #

    Hey Adrian,

    Thanks for the detailed explanations. I ran the above code on my laptop and it appears very slow; the webcam stream is almost frozen. The compare method will compare each detected face with all the encodings, and I think that will take a lot of time for each frame. Please let me know your thoughts on this.


    • Adrian Rosebrock June 18, 2018 at 2:05 pm #

      It’s actually not the embedding and comparison that is taking most of the time, it’s detecting the face itself via CNN face detector. Try using HOG + Linear SVM by switching the --detection-method command line argument to hog. Additionally, you could use Haar cascades as well.

  14. Bruce Lunde June 18, 2018 at 1:58 pm #

    I can’t wait to get home to “play” with this information from your post. Your blog, and the Practical Python and OpenCV system I purchased are really helping me become educated in this field!

    • Adrian Rosebrock June 18, 2018 at 2:05 pm #

      Thanks Bruce, I really appreciate that! Enjoy hacking with the code and always feel free to reach out if you have any questions.

  15. Devarshi June 18, 2018 at 2:15 pm #

    The encode_face.py took its time. The issue comes up when I run recognise_face_video.py: the frames are frozen. I am using an i5 with 8GB of RAM. What should the hardware specifications be for a decent real-time face recognition system?

    • Adrian Rosebrock June 18, 2018 at 2:29 pm #

      Hey there Devarshi, make sure you read my reply to your original question. I have already answered it for you. The frame is not “frozen”, it’s that the CNN face detector is taking a long time to run on your CPU. Change the --detection-method to hog and it will run on your CPU.

      If you would like to use the more accurate CNN face detector rather than HOG + Linear SVM or Haar cascades you should have a GPU.

      • Devarshi June 18, 2018 at 2:39 pm #

        Got it! I misunderstood the previous comment. Thank you for the correction. You are the best!

  16. Naser June 18, 2018 at 6:13 pm #

    Thank you for this great tutorial.
    Which face recognition architecture did you use in this tutorial? FaceNet, DeepFace, or another one?

  17. Steve June 18, 2018 at 8:57 pm #

    Thanks Adrian for all your time and effort. Very cool face recognition example!

    • Adrian Rosebrock June 19, 2018 at 8:32 am #

      Thanks Steve!

  18. afsane June 18, 2018 at 11:16 pm #

    Hello dear adrian…
    You are wonderful
    Grade 1👌

    • Adrian Rosebrock June 19, 2018 at 8:32 am #

      Thank you for the kind words 🙂

  19. ritika agarwal June 19, 2018 at 12:11 am #

    I am getting MemoryError: bad allocation when I use detection-method=’cnn’. My laptop configuration is an i7 processor with 8GB of RAM and a 4GB graphics card.
    It works fine with hog. Can you please suggest how I can use the CNN face detector with such a configuration?

    • Adrian Rosebrock June 19, 2018 at 8:31 am #

      It looks like your machine is running out of memory. Can you confirm whether the error is for your RAM or for your GPU memory?

    • Ritika Agarwal June 19, 2018 at 12:23 pm #

      Error is for my RAM

      • Adrian Rosebrock June 20, 2018 at 10:17 am #

        That’s odd that you would be running out of physical RAM. 8GB should be sufficient. Are you using the exact same code + dataset I am using in the blog post?

        • Matt Barker June 26, 2018 at 10:18 pm #

          I have the same error however I was able to solve it, for images only, by using the full path to the image and reducing the image size to a 448×320 75kb jpg.

          I noticed that cnn takes a long time and appears to be using only one thread. I also do not notice any activity on the GPU (Win10 Task Manager GPU).

          I tried face_detection_cli.py (from the face_recognition site-packages) to test multithreaded CPU with CNN on the original sized example_01.png, and it worked with no memory error and appeared to be using multiple CPU threads.

          I am unsure whether the MemoryError: bad allocation is simply because my GPU only has 4GB of dedicated memory and/or whether the integrated Intel HD Graphics 4600 is causing a problem with the Nvidia GTX 860M in my laptop.

          Win 10 x64, 16GB, Intel HD 4600 + GTX 860M

    • mattyb June 25, 2018 at 10:43 pm #

      I received this error also – I just needed to put the full path name in for the input image file and it worked fine.

      I am still getting the MemoryError: bad allocation when running recognize_faces_video_file.py however and using full path name is not fixing that…

      i7, 16gb, Win 10 x64, Geforce 860M 4gb
      Dlib Use Cuda and Use AVX selected

      • Adrian Rosebrock June 27, 2018 at 2:44 pm #

        Your GPU itself is likely running out of memory as it cannot hold both (1) the face detector and (2) the encoding model. I would suggest using the CPU + HOG method for face detection.

        • Matt Baker June 30, 2018 at 2:29 am #

          So I bit the bullet and managed to successfully follow your wonderful guide ‘Setting up Ubuntu 16.04 + CUDA + GPU for deep learning with Python’

          On the very same laptop I can now run everything using ‘cnn’, and nvidia-smi (on average) when running recognize_faces_video_file.py (with --display 1) shows:

          GPU Util = 87%
          GPU Mem usage Python = 697MiB
          GPU Mem usage ‘other’ total = 400MiB

          So, in conclusion, for me Windows kind of worked, but Ubuntu is the way forward!

          None of this would have been possible for me without your tutorials Adrian – thank you

          NB Took me about 5 attempts to get Ubuntu up and running (and I can still dual boot to Windows 10 if needed)

          • Adrian Rosebrock July 3, 2018 at 8:46 am #

            Nice job Matt! 🙂

  20. Simon Hyde June 19, 2018 at 1:40 am #

    Great tutorial, clearly explained and easy to follow.

    • Adrian Rosebrock June 19, 2018 at 8:26 am #

      Thanks so much Simon! 😀

  21. Nishant Nishiket June 19, 2018 at 5:48 am #

    Does this algorithm work better than inception v3 model for image classification?

    • Adrian Rosebrock June 19, 2018 at 8:24 am #

      They are two totally different techniques used for two totally different applications. The Inception network is used for (typically) image classification. The network here is used to compute a 128-d quantification of a face. You can’t compare them or say which one is better, it would be like comparing apples and oranges.

  22. YoCo June 19, 2018 at 6:16 am #

    I got an error on 2nd image

    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    (-215) scn == 3 || scn == 4 in function cvtColor

    • Adrian Rosebrock June 19, 2018 at 8:22 am #

      Double-check your command line arguments. I believe the path you supplied to --image is incorrect. If the path does not exist, the cv2.imread function will return “None”.
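
A minimal sketch of a guard that surfaces this problem early, before cv2.cvtColor fails with the assertion above. The `load_image` helper and its `imread` parameter are hypothetical names introduced for illustration; they are not part of the post's code.

```python
def load_image(path, imread=None):
    """Load an image, raising a clear error if it cannot be read.

    `imread` is a hypothetical hook that defaults to cv2.imread.
    """
    if imread is None:
        import cv2  # imported lazily so the check itself has no hard dependency
        imread = cv2.imread

    image = imread(path)
    if image is None:
        # cv2.imread signals failure by returning None instead of raising
        raise FileNotFoundError("could not read image: {!r}".format(path))
    return image
```

Calling `load_image(args["image"])` in place of a bare `cv2.imread(args["image"])` would turn the cryptic cvtColor assertion into an error message that names the bad path.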

    • Jerry July 3, 2018 at 8:20 pm #

      Did you solve this error?
      Could you tell me how to solve it?

  23. kubet June 19, 2018 at 7:55 am #

    Hi Adrian
    I tested this with dlib without a GPU and got low FPS, so I installed dlib with GPU support and again got low FPS. How do I use hog or another method for better FPS? Sorry for asking, I am a beginner.

    • Adrian Rosebrock June 19, 2018 at 8:20 am #

      Hey Kubet — what type of GPU are you using? Can you confirm that dlib is actually accessing your GPU?

  24. Sayan June 19, 2018 at 8:31 am #

    I am getting this error when I use cnn –

    MemoryError: std::bad_alloc

    Any idea why this is so? Everything works fine with hog.

    • Adrian Rosebrock June 19, 2018 at 8:45 am #

      Your machine is running out of memory and it cannot load the CNN. You would need to use the HOG method.

  25. Murali June 19, 2018 at 8:39 am #

    Hi, awesome post.

    Is there any way to get this to work in a Windows/Anaconda env?

    Many thanks

    • Adrian Rosebrock June 19, 2018 at 8:45 am #

      Sure. You would need to install OpenCV and dlib into your Anaconda environment. OpenCV should be available from the standard Anaconda repo. Both dlib and imutils are pip-installable as well.

  26. Murali June 19, 2018 at 8:53 am #

    I’ve managed to install OpenCV 3, dlib, and imutils, but I am having issues with face_recognition which doesn’t seem to be supported either via pip install or conda install

    • Adrian Rosebrock June 19, 2018 at 9:06 am #

      The face_recognition module is certainly supported via a pip install. Double-check your output and ensure there are no errors. If you are running into issues installing it on your Windows system I would suggest posting on the face_recognition GitHub page.

      • Murali June 19, 2018 at 9:16 am #

        Thank you, I will try the installation steps; otherwise I am eagerly awaiting the RPi blog post!

  27. YoCo June 19, 2018 at 10:20 am #

    it worked for me finally! nice program man! thank you!

    • Adrian Rosebrock June 19, 2018 at 11:03 am #

      Congrats on getting the face recognition code up and running!

  28. YoCo June 19, 2018 at 10:31 am #

    Why is my Mac running so slowly when I run the last script, on the video file? It is like time stands still. Is this due to my Mac’s memory? Thank you Adrian!

    • Adrian Rosebrock June 19, 2018 at 11:04 am #

      Please read the other comments. I’ve mentioned that if you are using a CPU you should be using the HOG + Linear SVM detection method instead of the CNN face detector to speed up the face detection process.

  29. Bruce Lunde June 19, 2018 at 12:02 pm #

    Adrian, I am wondering if you have experience with cloning the virtual environment? I just built this ENV per your instructions, and am wondering if I can clone it before I use it, so that I have a “boilerplate” version to copy from for similar projects. Having a baseline image of OpenCV, dlib and face_recognition to spin off new projects from would be great. I looked at pip freeze and requirements.txt, but when I ran that, it did not show OpenCV or dlib. If I am off track on this just let me know. I know I am thinking ahead to other projects without having completed this one.

    • Adrian Rosebrock June 20, 2018 at 10:16 am #

      I don’t really like trying to “clone” a virtual environment. It’s too likely to introduce some sort of errors. Instead, I recommend using a pip freeze and requirements.txt as you noted. However, keep in mind that libraries that are hand compiled and hand installed WILL NOT appear in your pip freeze. You would need to either (1) recompile or reinstall or (2) my preferred method, sym-link the libraries into the site-packages directory of the new virtual environment.

  30. Fed June 19, 2018 at 12:45 pm #

    Hello, the articles you publish are very useful even for beginners like me. I would like to know if you can use the facial recognition implemented in the code “recognize_faces_video.py” inside a main, that is: I need my raspberry to recognize ONLY my face and in case of recognition perform other operations.
    Thanks in advance

    • Adrian Rosebrock June 20, 2018 at 10:15 am #

      Next week I’ll be discussing how to run this script on your Raspberry Pi. Be sure to keep an eye out for the post!

  31. Bart June 19, 2018 at 12:57 pm #

    Guess you made my day! I had spent a day and a half compiling dlib without result when I saw your post. I think it works now.

    • Adrian Rosebrock June 20, 2018 at 10:14 am #

      Congrats on getting dlib installed, Bart!

  32. Oliver R June 19, 2018 at 1:01 pm #

    Jesus this is a tutorial with a lot of depth. Thank you for that! I can’t find a use for it at my current job, but in my private life, I’ll try using this!

    • Adrian Rosebrock June 20, 2018 at 10:14 am #

      Thanks Oliver, I’m glad you liked it! 🙂

  33. Sasha June 19, 2018 at 3:41 pm #

    Will it work in real time on the Raspberry Pi 3 without freezing?

    • Adrian Rosebrock June 20, 2018 at 10:14 am #

      There are a bunch of considerations you’ll want to handle with the Raspberry Pi — I’ll be covering them in next week’s blog post.

  34. Kirankumar June 20, 2018 at 3:00 am #

    I have used the HOG detection method to speed up face detection and it is working fine. But if I want to use the “cnn” face detector, what configuration is needed for a smooth and fast face recognition process?

    • Adrian Rosebrock June 20, 2018 at 10:13 am #

      You would want to (1) have a GPU and (2) install and configure dlib with GPU support (this post demonstrates how, refer to the install instructions). I personally recommend NVIDIA/CUDA compatible GPUs.

  35. Konrad June 20, 2018 at 4:20 am #

    Hey, you got a typo in a comment.

    # load the input image and convert it from RGB (OpenCV ordering)
    # to dlib ordering (RGB)
    image = cv2.imread(imagePath)

    It should be “convert it from BGR…”
    Good read, thanks.

    • Adrian Rosebrock June 20, 2018 at 10:13 am #

      Fixed, thanks for catching this!

  36. tuấn June 20, 2018 at 6:36 am #

    Thanks for the great tutorial, my bro!
    I installed dlib with GPU support. But how do I know whether dlib uses the GPU or not when running encoding_face.py?
    Thanks so much

    • Adrian Rosebrock June 20, 2018 at 10:12 am #

      If you are using an NVIDIA GPU you can run nvidia-smi to check GPU utilization. If you are using another GPU you should refer to the docs for it.

      • tuấn June 20, 2018 at 9:47 pm #

        I’m using an Nvidia GeForce GT 705 2GB. When running encoding_face.py with 51 input images while running nvidia-smi, I think my dlib did not use the GPU. I didn’t see anything with nvidia-smi.
        My syntax:
        – nvidia-smi.exe
        (correct me if I am wrong)

        • Adrian Rosebrock June 21, 2018 at 5:36 am #

          Are you using Windows? Sorry, I haven’t used Windows in 10+ years. I’m not sure how you would check GPU utilization on Windows. Also keep in mind that the face_recognition module only “unofficially” supports Windows so that might be part of the issue as well.

          • tuấn June 22, 2018 at 12:21 am #

            I ran pip install face_recognition successfully on Windows.
            Is it OK if I build OpenCV without GPU support but build dlib with GPU support?

          • Adrian Rosebrock June 24, 2018 at 6:18 am #

            That should not be a problem.

  37. Bhargav Ravat June 20, 2018 at 6:36 am #

    Hey Adrian!

    It’s really wonderful and helped me a lot :)

    I am presently running into one issue.
    When I inserted new dataset images with sizes between 10-30KB it worked very well.

    However, when I provided images of 1.3MB, the “Encode Face” python code did not encode all the images, and partway through it slowed down.

    I am not using any GPU system presently.
    Do I need to use HOG + Linear SVM for this as well or is there any other issue ?

    Cheers ,

    • Adrian Rosebrock June 20, 2018 at 10:11 am #

      Hey Bhargav — what do you mean by “not encoding all the images”? Are you referring to the encode_faces.py script? Is it taking a while to encode the faces in the training set? Does it exit with an error?

      • Bhargav Ravat June 21, 2018 at 1:41 am #

        Hi !

        It does not exit with an error; however, it does not make progress when I run the encode_faces.py script.

        I do not have a GPU in my system. Is that the reason for this? If not, can you suggest what the possible reasons could be?

        Note: When I run the same code with smaller images [3 to 10 KB], it works fine. However, the desired accuracy is not achieved with these images, so I want to feed high quality images into my dataset.

        • Adrian Rosebrock June 21, 2018 at 5:33 am #

          Be sure to check the activity monitor on your system, in particular the CPU usage. I have a feeling that the script is indeed running, it’s just taking longer to process the larger images. These systems are not magic. If you provide more data to the system it will take longer to process.

  38. Robert June 20, 2018 at 7:48 am #

    Given the hassles around trying to get some of this setup on Windows, is there an OpenCV/Python Docker image that you would recommend for trying out these tutorials?

    • Adrian Rosebrock June 20, 2018 at 10:11 am #

      Inside Practical Python and OpenCV I offer a VM that comes pre-configured with OpenCV/Python. I would suggest starting there (plus you can learn the fundamentals of OpenCV/Python as well).

  39. Kartik June 21, 2018 at 1:56 am #

    Hi Adrian,

    I am having problems installing dlib and the face_recognition module on Windows!

    Can you help?


    • Adrian Rosebrock June 21, 2018 at 5:32 am #

      Hey Kartik, I don’t support Windows here on the PyImageSearch blog. I would highly recommend you use a Unix-based system such as Linux (ideally Ubuntu) or macOS. If you take a look at the face_recognition GitHub you’ll find install instructions but the library does not officially support Windows. A link is provided if you want to give it a try, but again, you should consider using Linux here.

    • PythonImager July 3, 2018 at 5:12 am #

      Hi Kartik,

      Install Anaconda, and try to install dlib from there.
      After that, pass the --no-dependencies setting to pip when installing the face_recognition module…

      I was able to get it working on my Windows machine… later you would face other problems like GPU support, etc.

  40. Abkul June 21, 2018 at 8:43 am #

    Like always thanks for your great tutorial.

    I am an avid reader of your blog. I have images of 6000 individuals who appear in the cctv images one time or the other. I want to experiment as explained in this post.

    My question is: should I have each individual’s name in the file, as indicated in the “dataset” section of “Face recognition project structure”, where you listed the file limit as 10? Kindly advise.

    • Adrian Rosebrock June 21, 2018 at 9:12 am #

      I think you might be misreading the --filelimit switch I passed to the tree command. I used tree to show the project directory structure. I am not limiting each person to only 10 example images. I’m allowing all images for each individual to be used. You should do the same.

  41. Carlos June 21, 2018 at 11:30 am #

    Hey Adrian,

    I’ve set up dlib with CUDA support but it seems my GPU might not be up to the task. I have a GeForce GTX 950M with 2GB. After encoding about the fourth image I get a runtime error from CUDA: “Error while calling cudaMalloc(&data, n), reason: out of memory”. This makes sense since by the third image I’ve used over 50% of my graphics card’s memory. I don’t see how this could possibly scale up to 218 images, even for high-end graphics cards. I have to be doing something wrong here. Any input would be greatly appreciated.

    • Adrian Rosebrock June 21, 2018 at 2:20 pm #

      A 2GB card is likely not enough for this task, especially if you are trying to use the CNN face detector.

      I also think you have a misunderstanding of how the face embeddings are extracted. All 218 images are not passed in at once; they are passed in as batches. In this case the batch size is trivially one, meaning only one image at a time is passed through the network on the GPU.

      • Kirankumar June 22, 2018 at 3:10 am #

        Hey Adrian,
        I would be really glad to hear from you what the minimum system requirements are, in terms of GPU, CPU, and RAM, for this entire code structure to run smoothly.
        I look forward to hearing from you.

        • Adrian Rosebrock June 24, 2018 at 6:21 am #

          On my system I was running a 3 GHz processor and 16GB of RAM for the HOG detector. When using the GPU, I had a 3.4 GHz processor, 64GB of RAM (you wouldn’t need that much), and a Titan X GPU. A GPU with at least 6GB of memory is preferable for deep learning tasks.

  42. Tim Richards June 21, 2018 at 1:49 pm #

    Cool stuff. I would like to see this with people who are not white. Would you still receive the same results? Just a thought…

    • Adrian Rosebrock June 21, 2018 at 2:17 pm #

      The network was trained on millions of images, both white and non-white, and obtains over 99% accuracy on the LFW (mentioned in the post) which includes many non-white examples as well. That said, there is an unconscious bias in some datasets that can lead to models performing not as well on non-whites. Great question!

  43. Junior Montilla June 21, 2018 at 3:53 pm #

    cnn does not work (it gives me a MemoryError: bad allocation), and I tried to use HOG but that does not work either. When I run the python file recognize_faces_video.py, nothing happens. I have 8 GB of RAM, of which 3.49 GB is usable, on 32-bit Windows 10.

    • Adrian Rosebrock June 24, 2018 at 6:18 am #

      Could you be a bit more specific regarding “nothing happens”? Does a window open up on your screen but the frame does not process? Does the script automatically exit? Do you receive an error?

  44. Ard June 22, 2018 at 1:32 pm #

    Adrian, Thanks a lot for your great blog post.
    How many pics per actor do you need to make your streaming movie detection work?
    I saw that you had like 22 per actor. Is one per actor even enough?

    • Adrian Rosebrock June 24, 2018 at 6:19 am #

      You can get away with one image per person for highly simplistic projects but you should aim to be much higher, ideally in the 20-100 range.

  45. Walid June 22, 2018 at 2:25 pm #

    Thanks a lot
    You illustrated a detailed topic in the clearest way.
    With your talent, I would understand the theory of relativity if you posted an article about it 🙂

    • Adrian Rosebrock June 24, 2018 at 6:19 am #

      Thanks Walid 🙂

  46. Jeremiah June 22, 2018 at 4:12 pm #

    Good post. Love seeing people using deep learning+machine learning techniques in clever ways.

    • Adrian Rosebrock June 24, 2018 at 6:20 am #

      Thanks so much Jeremiah!

  47. YoCo June 23, 2018 at 3:08 am #

    Hi Adrian,

    I am trying to use your code for face recognition.
    When using an image of my baby, it seems that your code doesn’t run the conversion into BGR in order to draw the boxes on the face and save them into the pickle.

    If I use your own dataset, with actors from that movie, the code runs. My question to you is: did you edit the photos before running the code?
    If yes, what did you do in order to run your face recognition code?

    Thank you,

    • Adrian Rosebrock June 24, 2018 at 6:23 am #

      1. Regarding the photos of your baby (congrats!), are you saying that no faces are detected? It sounds like that is likely the problem. Did you try both the HOG and CNN methods?

      2. No image editing was performed at all on the images. They were downloaded to their respective directories and then I went in and manually removed irrelevant ones.

  48. YoCo June 23, 2018 at 3:22 am #

    I have changed the model from cnn to hog and it is working, but there are some errors: Invalid SOS parameters for sequential JPEG. Should I edit the pics before running the code?

    • Adrian Rosebrock June 24, 2018 at 6:23 am #

      Those are not OpenCV/dlib errors. They are actually warnings from the libraries used to load the images. It should not be an issue.

  49. CTVR June 23, 2018 at 11:00 am #

    After practicing this tutorial, i have a question. Why does each person have multiple 128-d measurements? Why do we summarize into 1?
    If the distance between those 128-d measurements is > 0.6 or similar, is the result wrong when inspecting a new input image?

    • Adrian Rosebrock June 24, 2018 at 6:25 am #

      Each face in an image is quantified as a 128-d feature vector. These feature vectors cannot be combined into one, as again, each face input to the system is represented by a 128-d vector. As for your second question, if the distance between an input face and a set of faces in a dataset is too large, the face is marked as “Unknown”.
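
To make the matching step concrete, here is a minimal pure-Python sketch, assuming Euclidean distance and a 0.6 cutoff (the figure raised in the question above). The `label_face` helper, the names, and the toy 3-d vectors are all hypothetical stand-ins for the real 128-d encodings.

```python
import math
from collections import Counter


def label_face(encoding, known_encodings, known_names, tolerance=0.6):
    """Vote among the known encodings that lie within `tolerance` (hypothetical helper)."""
    votes = Counter()
    for known, name in zip(known_encodings, known_names):
        # Euclidean distance between the two embedding vectors
        if math.dist(encoding, known) <= tolerance:
            votes[name] += 1
    if not votes:
        # Nothing in the dataset is close enough
        return "Unknown"
    return votes.most_common(1)[0][0]


# Toy 3-d vectors stand in for real 128-d embeddings
known_encodings = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (1.0, 1.0, 1.0)]
known_names = ["alice", "alice", "bob"]

print(label_face((0.05, 0.0, 0.0), known_encodings, known_names))
print(label_face((5.0, 5.0, 5.0), known_encodings, known_names))
```

Keeping every encoding separately, rather than averaging them into one vector per person, lets each example image cast its own vote, which is why more images per person tends to help.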

      • CTVR June 24, 2018 at 9:03 am #

        If distance between 2 encoding input >0,6?

        • Adrian Rosebrock June 25, 2018 at 1:48 pm #

          I’m not sure what you are asking. Could you clarify?

  50. Gaurav June 23, 2018 at 9:15 pm #

    Hi Adrian

    Does the model need more training data like 30 or 40 images for each person?

    I added my own images (10 in total) to the existing dataset and tested the model. The model was not accurate and was not able to recognize my images correctly.


    • Adrian Rosebrock June 24, 2018 at 6:26 am #

      I typically recommend 20-100 images per person. The more you can get, the better. Ideally these images should be representative of where the system will be deployed (i.e., lighting conditions, viewing angle, etc.)

  51. dauy June 24, 2018 at 1:50 am #

    Hi Adrian
    I would like to add new images to, or delete images from, the database. When I do, the encodings of the images already in the database should stay in encodings.pickle, and encode_faces.py should run only on the new images.
    I want to reduce the time it takes to save the encodings to encodings.pickle.
    Otherwise, a lot of time is spent even when adding a single new image.

    • Adrian Rosebrock June 24, 2018 at 6:27 am #

      I would suggest you:

      1. Use encode_faces.py as you normally would but each time you run it, create a new pickle file.
      2. When you’re ready to recognize faces, loop over all pickle files, load them, and create a “data” variable that stores all information from all pickle files.
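      A minimal sketch of step 2. It assumes each pickle file stores a dictionary with "encodings" and "names" lists; that layout, the `load_all_encodings` helper name, and the file pattern are assumptions for illustration, so adjust them to match what your encode_faces.py run actually writes:

```python
import glob
import pickle


def load_all_encodings(pattern):
    """Merge every pickle file matching `pattern` into one "data" dict.

    Assumes each file holds {"encodings": [...], "names": [...]}.
    """
    data = {"encodings": [], "names": []}
    # Sort so the merge order is the same on every run.
    for path in sorted(glob.glob(pattern)):
        with open(path, "rb") as f:
            part = pickle.load(f)
        data["encodings"].extend(part["encodings"])
        data["names"].extend(part["names"])
    return data
```

      For example, `data = load_all_encodings("encodings_*.pickle")` would gather every batch into a single "data" variable before recognition starts.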

  52. Gözde June 25, 2018 at 4:09 am #

    Hi Adrian
    Thank you so much for your post
    I have a problem. I installed dlib easily, but while I was installing face_recognition I got a cmake error:
    CMake must be installed to build the following extensions: dlib
    On the other hand, I already have cmake, so I cannot understand why I get this error
    I am using Python 2.7.* and Windows 10

    • Adrian Rosebrock June 25, 2018 at 1:44 pm #

      Hey Gözde, I’m happy to help the best I can; however, I do not support Windows here on the PyImageSearch blog. I would strongly recommend that you use a Unix-based system such as Linux or macOS for developing computer vision/deep learning applications. Secondly, the face_recognition module does not officially support Windows either. You should post any errors related to the face_recognition module on the official GitHub page.

  53. sneha June 26, 2018 at 7:46 am #

    Hey Adrian,

    It’s always a motivation whenever I see your blogs

    Can you tell me: can this system be deployed where we have to do facial recognition for lakhs of persons?

    • Adrian Rosebrock June 27, 2018 at 2:44 pm #

      I’m not sure what you mean by “lakhs” — could you clarify?

      • sunil July 10, 2018 at 2:03 pm #

        lakhs is a ‘Hindi’ language word, 1 million = 10 lakhs

  54. Andrew Baker June 26, 2018 at 9:42 pm #

    I can concur about the running out of memory error when using CNN. I can get through 26 images. When I switched to the HOG all 218 images were processed in 90 seconds.

    MacBook Pro (Mid 2014)
    Processor: 2.5 GHz Intel Core i7
    Memory: 16 GB
    Graphics: NVIDIA GeForce GT 750M 2048 MB

    Python: 3.6.5_1
    cuDNN: v7.1
    CUDA: v9.2

    CUDA Driver: 396.64
    GPU Driver: 387.

    • Adrian Rosebrock June 27, 2018 at 2:44 pm #

      Interesting. I wonder if there is some sort of memory leak issue going on. I would suggest posting the problem on dlib’s GitHub Issue page just to confirm this.

  55. vibha June 27, 2018 at 3:41 am #

    Hi Adrian,

    Very wonderful tutorial. Thanks a lot!!!

    I want to customize the code so that it will tag the address and phone number along with the name of the recognized face.
    How could I achieve this?

    Please guide me.

    Thanks in advance.

    • Adrian Rosebrock June 27, 2018 at 2:47 pm #

      What does “tag” mean in this context? Once you recognize the face you can perform any other operations you wish. Keep in mind I focus mainly on computer vision and deep learning on this blog. Any database updates or modifications you want to make are 100% possible but you would need to code that up yourself.

      • vibha June 29, 2018 at 2:56 am #

        Yeah i got it…Thank you for your advice.

  56. Devaraj Pandian V June 27, 2018 at 3:56 am #

    Good day,
    I tried to recognize my face using OpenCV and deep learning. So, I took 10 photos of myself and put them in the dataset. The photos are in JPG format.
    When I ran encode_faces.py, I got the error message ‘invalid sos parameters for sequential jpeg’. Could you tell me how to solve this problem?

    • Adrian Rosebrock June 27, 2018 at 2:47 pm #

      Hmm, I haven’t encountered that particular error before. How did you capture your JPEG images?

  57. Omsai_K June 28, 2018 at 12:48 am #

    Your tutorial is so good. It’s really helpful for my internship. By the way, I get an error when I run encode_faces.py.

    I’m using the same names for all your folders and also the same dataset. Even the name “dataset” remained the same.

    I got the following error. Could you please help me out?

    usage: encode_faces.py [-h] -i DATASET -e ENCODINGS [-d DETECTION_METHOD]
    encode_faces.py: error: argument -i/--dataset is required

  58. Parth June 28, 2018 at 5:17 am #

    Hey, I’m getting the following error while training with the cnn model.
    return cnn_face_detector(img, number_of_times_to_upsample)
    MemoryError: std::bad_alloc

    can you please help me? I have even tried GCP

    • Adrian Rosebrock June 28, 2018 at 5:35 am #

      Please see the other comments:

      1. If you are using a GPU, your GPU does not have enough memory for the CNN face detector
      2. If you are using a CPU, your system does not have enough RAM for the CNN face detector

      Switch to the HOG detector and you’ll be able to execute the code.

      • Parth Patel June 30, 2018 at 6:18 am #

        Can you tell me what amount of GPU memory and RAM is required?
        One more thing: recognizing the faces seems to work slowly in real time. Can you tell me how to speed that up?

        • Adrian Rosebrock July 3, 2018 at 8:46 am #

          Parth — I’ve addressed your questions in the post and in other comments. I kindly ask you to read them.

  59. Peter June 28, 2018 at 5:43 am #

    Very interesting.

    I was wondering: If a new character pops up in the new movie, how you would add pictures and encodings? Do you need to retrain on all images or is there a way to just append to the encodings file?

    • Adrian Rosebrock June 28, 2018 at 7:56 am #

      You wouldn’t need to retrain. There isn’t actually any “training” going on. We’re effectively just performing k-NN classification. Just extract the 128-d face embeddings for the new faces and update the pickle files.
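      A rough sketch of the "update the pickle files" step, assuming the encodings file holds a dictionary with "encodings" and "names" lists (an assumption, as is the `append_encodings` helper name). The 128-d embeddings for the new faces are taken as already extracted:

```python
import os
import pickle


def append_encodings(pickle_path, new_encodings, new_names):
    """Add new (encoding, name) pairs to an encodings file, creating it if needed."""
    if len(new_encodings) != len(new_names):
        raise ValueError("each encoding needs exactly one name")
    data = {"encodings": [], "names": []}
    if os.path.exists(pickle_path):
        with open(pickle_path, "rb") as f:
            data = pickle.load(f)
    # Existing entries are kept untouched; only the new faces are added.
    data["encodings"].extend(new_encodings)
    data["names"].extend(new_names)
    with open(pickle_path, "wb") as f:
        pickle.dump(data, f)
    return data
```

      Because nothing is retrained, the cost of adding a character is just the time to encode that character's images.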

  60. Reed Guo July 2, 2018 at 11:43 pm #

    Hi, Adrian

    Can we get the ‘confidence’ of the recognition?

    As we know, LBPH can output confidence.

    • Adrian Rosebrock July 3, 2018 at 7:15 am #

      The confidence is the distance between the faces. You’ll want to refer to the face_recognition docs to obtain it.

  61. quantum_ai July 3, 2018 at 12:56 am #

    Hello, I just want to share that my experience with this code was a challenge. I tried implementing this from scratch on Ubuntu Beaver but ran into multiple issues when installing OpenCV. At the point where I would “make” it would fail. Also several required packages would not be found when installing with pip.

    After several days of trying, I ended up installing Ubuntu 16.04.4 LTS, followed the steps to install OpenCV for that version, and even though it took several hours to install, I was finally able to get this model working. In case someone else runs into similar issues, this is how I resolved mine.

    Thanks Adrian for your great content always

    • Adrian Rosebrock July 3, 2018 at 7:12 am #

      Congratulations on getting OpenCV installed! It can be a pain to compile OpenCV from scratch if this is your first time, but once you do it a few times, it gets significantly easier.

  62. Jerry July 3, 2018 at 1:48 pm #

    Hi, Adrian
    OpenCV(3.4.1) Error: Assertion failed (scn == 3 || scn == 4) in cv::cvtColor
    I have this problem.
    How can I solve this?
    Thank u!

    • Adrian Rosebrock July 5, 2018 at 6:46 am #

      Which face recognition method are you using? The one for images? Or the one for video streams? My guess is that your image/frame is “None” meaning that the path to the input image is invalid or OpenCV cannot access your webcam. Double-check your paths.
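      One way to surface this earlier is to check the frame before it reaches cv2.cvtColor. The helper below is a hypothetical sketch (the name `require_frame` is made up for illustration); it relies only on the fact that a failed read hands back None:

```python
def require_frame(frame, source):
    """Return `frame` unchanged, or fail with a readable message if it is None."""
    if frame is None:
        raise ValueError(
            "could not read an image from %r; check the path or that the "
            "webcam is accessible" % (source,)
        )
    return frame
```

      In the image script this could wrap the load, e.g. `image = require_frame(cv2.imread(path), path)`, so a bad path is reported directly instead of as an assertion failure inside cvtColor.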

  63. Waseem Shariff July 4, 2018 at 3:33 am #

    Hi adrian

    Can we get the accuracy count of those recognized images?

    • Adrian Rosebrock July 5, 2018 at 6:33 am #

      The face_recognition library doesn’t return the “accuracy” here as the accuracy is just the distance between the feature vectors. To obtain the distance you would want to extract the embeddings manually and then apply the k-NN distance calculation manually (the face_recognition library is doing all that for you under the hood).
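      The manual distance calculation can be sketched in plain Python. This is only an illustration: `euclidean` and `closest_match` are hypothetical helpers, not part of face_recognition, and the embeddings are assumed to be plain lists of floats:

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_match(known_encodings, known_names, query):
    """Return (name, distance) for the known encoding nearest to `query`."""
    if not known_encodings:
        raise ValueError("no known encodings to compare against")
    distances = [euclidean(enc, query) for enc in known_encodings]
    best = min(range(len(distances)), key=distances.__getitem__)
    return known_names[best], distances[best]
```

      A smaller distance means a closer match, so the returned distance can be read as an inverse confidence rather than a probability.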

  64. Thang July 4, 2018 at 11:19 pm #

    Hello, I see your demo really runs in real time. But when I run it on my computer it is very slow, maybe 0.7 FPS.
    Can you tell me why?

    • Adrian Rosebrock July 5, 2018 at 6:17 am #

      Hey Thang, make sure you give the comments and blog post another read as I’ve already discussed this issue many times. If you are using the CNN face detector you will need a GPU for real-time performance. If you want to use your CPU make sure you use the HOG detector. You can supply a value of “hog” when you execute the script via command line arguments.

      • Thang July 10, 2018 at 12:03 pm #

        thank you very much

  65. Vaisakh July 7, 2018 at 10:42 am #

    hi adrian,

    First of all, excellent post. I tried to run this project on my Pi. After executing the commands to encode the dataset I got this error message.

    How can I solve this? I am a beginner :(

    • Adrian Rosebrock July 10, 2018 at 8:43 am #

      It is okay if you are a beginner but I would ask you to read the other comments to the post. I’ve addressed this question both in the post and in the comments section. The gist is that you need to use the HOG face detector rather than the CNN face detector. Read the post and comments for more details.
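      As a sketch of how the detector choice reaches the script: the parser below is reconstructed from the usage line quoted earlier in the comments (`-i DATASET -e ENCODINGS [-d DETECTION_METHOD]`). The long option names other than `--dataset` and the "cnn" default are assumptions, not a copy of the actual encode_faces.py:

```python
import argparse


def build_parser():
    """Argument parser mirroring the usage line of encode_faces.py."""
    parser = argparse.ArgumentParser(prog="encode_faces.py")
    parser.add_argument("-i", "--dataset", required=True,
                        help="path to the directory of face images")
    parser.add_argument("-e", "--encodings", required=True,
                        help="path to the output encodings pickle file")
    # Assumed default; pass "hog" on machines without a capable GPU.
    parser.add_argument("-d", "--detection-method", default="cnn",
                        help="face detector to use: hog or cnn")
    return parser
```

      With a parser like this, a run on a Pi or a CPU-only laptop would look like `python encode_faces.py --dataset dataset --encodings encodings.pickle --detection-method hog`. Leaving out `--dataset` produces the "argument -i/--dataset is required" error seen above.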

  66. Mushif Ali Nawaz July 8, 2018 at 6:24 am #

    Hi Adrian,

    Thank you so much for the detailed explanation. I am having a problem with recognizing faces. I am using the webcam embedded in my laptop to collect a dataset of images (using your other code) and then using this code to recognize people. It is not showing “Unknown” for people who don’t have images in the dataset, and it displays incorrect names from the dataset randomly. What could be the possible reason?

    • Adrian Rosebrock July 10, 2018 at 8:38 am #

      Try setting the tolerance parameter to a lower value, such as 0.4. That should help. Take a look at the documentation for face_recognition.compare_faces as well.
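      To see why a lower tolerance helps, here is a pure-Python sketch of threshold-plus-vote matching. It is an illustration of the idea, not the face_recognition implementation; `recognize` is a hypothetical helper and the 0.6 default is assumed to mirror the library's default tolerance:

```python
import math
from collections import Counter


def recognize(known_encodings, known_names, query, tolerance=0.6):
    """Label `query` by majority vote among encodings within `tolerance`."""
    votes = Counter()
    for enc, name in zip(known_encodings, known_names):
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(enc, query)))
        if dist <= tolerance:
            votes[name] += 1
    if not votes:
        # Nothing in the dataset is close enough to the query face.
        return "Unknown"
    return votes.most_common(1)[0][0]
```

      With a loose tolerance, several mediocre matches for the wrong person can outvote one very close match for the right person. Tightening the tolerance removes those weak votes and lets faces that match nobody fall through to “Unknown”.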

      • Mushif Ali Nawaz July 10, 2018 at 10:30 am #

        Hi Adrian,

        Thank you for the response. Basically, I am working on a college project named “Home Surveillance System” that includes a Face recognition and physical features analysis to identify the intruders (unknown people).

        First I planned to use a Raspberry Pi for that purpose, but after reading your blog post about the Raspberry Pi (which seems quite slow), I am planning to use only my laptop. Now I have to purchase a camera for this project. Can you suggest a good camera that would satisfy my needs and be compatible with Python and OpenCV?

        • Adrian Rosebrock July 10, 2018 at 11:06 am #

          You might actually want to start by taking a look at the PyImageSearch Gurus course where I build a project that is nearly identical to what you are describing. In the course I demonstrate how to recognize “known” people vs. “intruders” and in the case of an intruder, send a txt message alert to your phone.

          As far as cameras go, I really like the Logitech C920. It’s affordable, high quality, and plug and play compatible with the Pi.

  67. lordcenzin July 9, 2018 at 11:53 am #

    Hello Adrian,
    thanks a lot for your effort in clarifying all those interesting topics.
    Some questions related to the face_recognition module you are using here.
    In the beginning you talk about the neural network needed to create the embeddings. To my understanding the face_recognition library does this:
    – produces the embedding (calling dlib)
    – provides the distance function between two images.

    Moreover, together with the face_recognition the system downloads also the “model” which I suppose is the one that obtains 99+% accuracy on LFW.

    In my case I have a dataset of about 2000 people with around 10 images each.
    I created the encodings with the jitter param in face_recognition set to 10 (setting it to 100 makes the system too slow).
    The point is that I am getting a lot of false positives even when setting quite a low distance (e.g. 0.4).
    Do you think there is an “intermediate” training step I can do to improve the results?

    Thanks again 🙂

    • Adrian Rosebrock July 10, 2018 at 8:21 am #

      Your understanding is 100% spot on, nice job grasping it! As far as improving the accuracy of the system keep in mind that you are using just the produced face embeddings on images the network was not trained on. To fully improve the method you should train your own network using dlib on the faces you would like to recognize.

  68. Leena July 10, 2018 at 12:32 am #

    It’s a great tutorial. I have followed your tutorials from the basics and am now very comfortable with your code style. I would love to try this on my own and will give you feedback soon.


    • Adrian Rosebrock July 10, 2018 at 8:13 am #

      Thanks Leena, I’m glad you enjoyed the post.

  69. Hanz July 10, 2018 at 3:56 am #

    Hi Adrian,
    Thank you for wonderful tutorial!
    I have some problems. I run it on a MacBook Pro (Core i5, Intel Iris Graphics 540 1536 MB) but it is very slow. Therefore, in the file encode_faces.py I replaced “cnn” with “hog”, and in the file recognize_faces_video.py I resize the image to width=250. But when I run it in real time, it is not correct in all cases.
    Do you have any solutions for this problem?

    • Hanz July 10, 2018 at 4:06 am #

      And how can I print the accuracy of this model?
      Thank you

      • Adrian Rosebrock July 10, 2018 at 8:11 am #

        You would need a “testing set” of faces you would like to recognize. You should know the faces in the images. Your algorithm can make predictions on these testing images. And from there you can derive how many were correct (i.e., the accuracy).
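        As a sketch of that last step, assuming you already have one predicted name per test face and the matching list of true names (the `accuracy` helper is hypothetical):

```python
def accuracy(predicted_names, true_names):
    """Fraction of test faces whose predicted name equals the true name."""
    if len(predicted_names) != len(true_names):
        raise ValueError("need one prediction per labeled test face")
    if not true_names:
        raise ValueError("the testing set is empty")
    correct = sum(p == t for p, t in zip(predicted_names, true_names))
    return correct / len(true_names)
```

        For instance, three correct predictions out of four test faces gives an accuracy of 0.75. A face wrongly labeled “Unknown” counts as an error here.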

    • Adrian Rosebrock July 10, 2018 at 8:11 am #

      The image size may be too small at only 250px. Try increasing it to see if you can find a nice balance between speed and accuracy.

  70. Nnamdi Affia July 10, 2018 at 8:25 am #

    Hi Adrian,

    I just have a quick question. What is the difference between the HOG and CNN detection methods? They seem to have significantly different training and testing times, with HOG being the faster of the two.

    Thanks for the assistance!

    • Adrian Rosebrock July 10, 2018 at 9:10 am #

      1. HOG is faster but less accurate. HOG can also run on CPU in real-time.
      2. CNN is slower but more accurate. CNN is slow on CPU for real-time. For real-time a GPU should be used.

      • Nnamdi Affia July 17, 2018 at 3:14 pm #

        Thanks for the clarification Adrian, super helpful as always.

        • Nnamdi Affia July 17, 2018 at 3:25 pm #

          In order to enable GPU support, where do I find the CMAKE_PREFIX_PATH to include my cudnn folder. This has been eluding me for hours.

          Thanks again!

          • Nnamdi Affia July 17, 2018 at 3:55 pm #

            I have managed to solve it by organising the cuDNN files in the appropriate cuda directories and downgrading apple clang version to 8.1. it works now!

            Thanks for the confidence!

          • Adrian Rosebrock July 20, 2018 at 6:51 am #

            Congrats on resolving the issue!

  71. Ben July 12, 2018 at 9:20 am #

    Hi Adrian,

    Is there a certain threshold you would use for knowing frame rate is too slow for good results? Like above 10 FPS = good, but below 5 FPS = try something different…

    Is there a certain frame rate where the memory of the computer just cannot keep up and ultimately you will not get good results? Or is video ‘choppiness’ just a good gauge/rule of thumb?? I am referring to video file analysis of a movie (.avi) like you are doing with Jurassic park in real time.


    • Adrian Rosebrock July 13, 2018 at 5:05 am #

      Hey Ben — I think you need to clarify what you mean by “good results” here as I’m not sure whether you are referring to:

      1. The accuracy of the face recognition algorithm
      2. Or the “quality” of the output video file

      • Ben Bartling July 13, 2018 at 3:53 pm #

        Hi Adrian,

        I was attempting to learn a little bit more about why you mention in your blog post resorting to HOG & SVM because the computer memory cannot keep up without GPU support.

        If I have a trained algorithm that detects accurately in real time, is there a certain frame rate below which the algorithm will not detect very well because the video is choppy and the computer appears bogged down?

        I am not really referring to the algorithm accuracy itself, just the computer memory issues. Can my results be poor because of poor frame rates even though the overall accuracy of the algorithm is good?

        If so, what frame rates are considered poor?


        • Adrian Rosebrock July 17, 2018 at 8:08 am #

          The frame rate itself will not affect the accuracy of your system. The face recognition code will still see the same frames and the accuracy will be the same, but the throughput itself will be slower.

  72. Kalyanramu July 12, 2018 at 1:44 pm #

    Hi Adrian,

    Great Tutorial. I am trying to understand advantage of deep metric learning network here. Why not take the output of the face detection box and feed directly through a common classification network to label them?

    • Adrian Rosebrock July 13, 2018 at 4:59 am #

      Try it and see! And then compare your accuracy. You’ll find that your face recognition pipeline is much less accurate. By using triplet learning we can obtain embeddings that perform well.

      Secondly, by using embeddings we can more easily apply transfer learning (like we did in this blog post) to recognize faces the network wasn’t trained on. If we used standard feature extraction using the network instead of the embeddings we again would not obtain as high accuracy.

  73. Kalyanramu July 13, 2018 at 4:38 pm #

    Thanks Adrian for your reply. I will definitely give it a try, it is the best suggestion to learn. I am always looking for more ways to improve accuracy.
    Your answer solidified the thoughts!

  74. Johnhome July 14, 2018 at 8:24 am #

    Thanks a lot. One question: can it tell a copied image from a real face? I need to distinguish a live face from a copied image. Thanks.

    • Adrian Rosebrock July 17, 2018 at 7:35 am #

      I think what you are referring to is “liveness detection” which is an entirely different facial application. I don’t have any tutorials on that topic at the moment but I will certainly consider it for the future.

  75. Yuliya July 16, 2018 at 1:28 pm #

    Dear Adrian,

    Thank you for this tutorial. I would like to ask several questions regarding it.
    1) Do you have an idea which methods for feature extraction and which kind of CNN / classifier are used in the face_recognition implementation?
    2) Did you have a chance to test this framework on face images with different illumination, i.e. where the angle of light direction is different in the training and recognition sets? If so, could you please share the results?
    3) Which pretrained face recognition model would you recommend for TensorFlow?

    Thank you in advance.

    • Adrian Rosebrock July 17, 2018 at 7:16 am #

      1. Take a look at FaceNet and DeepFace.
      2. Yes. Take a look at this post.
      3. Refer to #1

      • Yuliya July 18, 2018 at 11:21 am #

        Hi! Thanks for the response.
        1. I am already acquainted with these publications. However, it is not possible to run an experiment on my own dataset, as both prototypes provide pre-trained models and, at least in the publications, there is no information about reconfiguring the model. Moreover, they use the Inception model which, as you said in your previous comment, is used more for object recognition. In fact, is Inception used in face_recognition (it is not clear from the code)?
        2. Great! Looks promising!

        Thank you!

        • Adrian Rosebrock July 20, 2018 at 6:44 am #

          Just to clarify: the Inception architecture can be used for a variety of image recognition tasks. The Inception network was originally trained on the ImageNet dataset for classification. However, it can be used as a base model for object detection and other tasks. Typically we’ll remove layers from the Inception network and use the rich set of filters it has learned to accomplish whatever tasks the practitioner is using it for. We call this “transfer learning” as we’re utilizing what the model learned from one task and applying it to another.

  76. Asha July 17, 2018 at 2:25 am #

    Hi Adrian.
    I have been experimenting with FaceNet for generating face embeddings. I saw your post on dlib and face_recognition and read that it was built using the architecture of Deep Residual Learning for Image Recognition. I checked it out, but I still need to check against a bigger corpus of data to see how well they do. How do you think they compare, considering both papers came out within a short span of a couple of months?

    • Adrian Rosebrock July 17, 2018 at 7:10 am #

      How are you quantifying “compare” in this context? A standard is the LFW dataset and all of those methods reportedly perform well on it.

  77. Anuj July 19, 2018 at 1:25 pm #

    Hi Adrian,

    I love your blogs and have been following them for a few months. I was trying to implement this particular code but I am getting an error with the face_recognition module. I have installed it successfully using pip install face_recognition but when I try to import it, I get this error: ImportError: DLL load failed: The specified module could not be found. I have installed dlib successfully. I am running this on Windows 10 using Anaconda and Python 3.6. Please let me know how to fix this.

    • Adrian Rosebrock July 20, 2018 at 6:31 am #

      Windows is not officially supported by the face_recognition module. If you’re having trouble trying to install it be sure to post on their GitHub.
