Facial landmarks with dlib, OpenCV, and Python

Last week we learned how to install and configure dlib on our system with Python bindings.

Today we are going to use dlib and OpenCV to detect facial landmarks in an image.

Facial landmarks are used to localize and represent salient regions of the face, such as:

  • Eyes
  • Eyebrows
  • Nose
  • Mouth
  • Jawline

Facial landmarks have been successfully applied to face alignment, head pose estimation, face swapping, blink detection, and much more.

In today’s blog post we’ll be focusing on the basics of facial landmarks, including:

  1. Exactly what facial landmarks are and how they work.
  2. How to detect and extract facial landmarks from an image using dlib, OpenCV, and Python.

In the next blog post in this series we’ll take a deeper dive into facial landmarks and learn how to extract specific facial regions based on these facial landmarks.

To learn more about facial landmarks, just keep reading.

Looking for the source code to this post?
Jump right to the downloads section.


The first part of this blog post will discuss facial landmarks and why they are used in computer vision applications.

From there, I’ll demonstrate how to detect and extract facial landmarks using dlib, OpenCV, and Python.

Finally, we’ll look at some results of applying facial landmark detection to images.

What are facial landmarks?

Figure 1: Facial landmarks are used to label and identify key facial attributes in an image (source).

Detecting facial landmarks is a subset of the shape prediction problem. Given an input image (and normally an ROI that specifies the object of interest), a shape predictor attempts to localize key points of interest along the shape.

In the context of facial landmarks, our goal is to detect important facial structures on the face using shape prediction methods.

Detecting facial landmarks is therefore a two-step process:

  • Step #1: Localize the face in the image.
  • Step #2: Detect the key facial structures on the face ROI.

Face detection (Step #1) can be achieved in a number of ways.

We could use OpenCV’s built-in Haar cascades.

We might apply a pre-trained HOG + Linear SVM object detector specifically for the task of face detection.

Or we might even use deep learning-based algorithms for face localization.

In any case, the actual algorithm used to detect the face in the image doesn’t matter. Instead, what’s important is that through some method we obtain the face bounding box (i.e., the (x, y)-coordinates of the face in the image).

Given the face region we can then apply Step #2: detecting key facial structures in the face region.

There are a variety of facial landmark detectors, but all methods essentially try to localize and label the following facial regions:

  • Mouth
  • Right eyebrow
  • Left eyebrow
  • Right eye
  • Left eye
  • Nose
  • Jaw

The facial landmark detector included in the dlib library is an implementation of the One Millisecond Face Alignment with an Ensemble of Regression Trees paper by Kazemi and Sullivan (2014).

This method starts by using:

  1. A training set of labeled facial landmarks on an image. These images are manually labeled, specifying specific (x, y)-coordinates of regions surrounding each facial structure.
  2. Priors, or more specifically, the probability of the distance between pairs of input pixels.

Given this training data, an ensemble of regression trees is trained to estimate the facial landmark positions directly from the pixel intensities themselves (i.e., no “feature extraction” is taking place).

The end result is a facial landmark detector that can be used to detect facial landmarks in real-time with high quality predictions.

For more information and details on this specific technique, be sure to read the paper by Kazemi and Sullivan linked to above, along with the official dlib announcement.

Understanding dlib’s facial landmark detector

The pre-trained facial landmark detector inside the dlib library is used to estimate the location of 68 (x, y)-coordinates that map to facial structures on the face.

The indexes of the 68 coordinates can be visualized on the image below:

Figure 2: Visualizing the 68 facial landmark coordinates from the iBUG 300-W dataset (higher resolution).

These annotations are part of the 68-point iBUG 300-W dataset on which the dlib facial landmark predictor was trained.

It’s important to note that other flavors of facial landmark detectors exist, including the 194-point model that can be trained on the HELEN dataset.
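For reference, the 68 points are conventionally grouped into named facial regions by index range. The mapping below follows the convention used by the imutils face_utils module; treat the exact ranges as that library’s convention rather than something defined in this post:

```python
from collections import OrderedDict

# (start, end) index ranges into the 68-point iBUG 300-W layout,
# following the convention used by imutils.face_utils
FACIAL_LANDMARKS_IDXS = OrderedDict([
    ("mouth", (48, 68)),
    ("right_eyebrow", (17, 22)),
    ("left_eyebrow", (22, 27)),
    ("right_eye", (36, 42)),
    ("left_eye", (42, 48)),
    ("nose", (27, 36)),
    ("jaw", (0, 17)),
])
```

Note that the seven ranges together cover all 68 points exactly once.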

Regardless of which dataset is used, the same dlib framework can be leveraged to train a shape predictor on the input training data — this is useful if you would like to train facial landmark detectors or custom shape predictors of your own.

In the remainder of this blog post I’ll demonstrate how to detect these facial landmarks in images.

Future blog posts in this series will use these facial landmarks to extract specific regions of the face, apply face alignment, and even build a blink detection system.

Detecting facial landmarks with dlib, OpenCV, and Python

In order to prepare for this series of blog posts on facial landmarks, I’ve added a few convenience functions to my imutils library, specifically inside face_utils.py.

We’ll be reviewing two of these functions inside face_utils.py now and the remaining ones next week.

The first utility function is rect_to_bb, short for “rectangle to bounding box”:

This function accepts a single argument, rect, which is assumed to be a bounding box rectangle produced by a dlib detector (i.e., the face detector).

The rect object includes the (x, y)-coordinates of the detection.

However, in OpenCV, we normally think of a bounding box in terms of “(x, y, width, height)”, so as a matter of convenience the rect_to_bb function takes this rect object and transforms it into a 4-tuple of coordinates.

Again, this is simply a matter of convenience and taste.
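The listing itself ships with the “Downloads” for this post, but as a rough sketch, rect_to_bb boils down to something like the following (assuming a dlib rectangle object exposing left(), top(), right(), and bottom() accessors):

```python
def rect_to_bb(rect):
    # take a bounding box produced by a dlib detector and convert it
    # to OpenCV's (x, y, w, h) convention
    x = rect.left()
    y = rect.top()
    w = rect.right() - x
    h = rect.bottom() - y

    # return a 4-tuple of (x, y, width, height)
    return (x, y, w, h)
```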

Secondly, we have the shape_to_np function:

The dlib face landmark detector will return a shape object containing the 68 (x, y)-coordinates of the facial landmark regions.

Using the shape_to_np function, we can convert this object to a NumPy array, allowing it to “play nicer” with our Python code.
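As with rect_to_bb, the exact code is in the downloads; a sketch of what shape_to_np does looks like this (assuming a dlib shape object whose part(i) method returns a point with .x and .y attributes):

```python
import numpy as np

def shape_to_np(shape, dtype="int"):
    # initialize a (68, 2) array of zeros for the (x, y)-coordinates
    coords = np.zeros((68, 2), dtype=dtype)

    # copy each of the 68 dlib landmark parts into the array
    for i in range(68):
        coords[i] = (shape.part(i).x, shape.part(i).y)

    # return the array of (x, y)-coordinates
    return coords
```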

Given these two helper functions, we are now ready to detect facial landmarks in images.

Open up a new file, name it facial_landmarks.py, and insert the following code:
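The exact, line-numbered listing is in the “Downloads” for this post. As a standalone sketch of the imports and argument parsing described below (sample argument values are hard-coded here so the snippet runs on its own; a real run parses the actual command line with ap.parse_args()):

```python
import argparse

# the full facial_landmarks.py also imports cv2, dlib, imutils,
# numpy, and the face_utils submodule of imutils
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--shape-predictor", required=True,
    help="path to facial landmark predictor")
ap.add_argument("-i", "--image", required=True,
    help="path to input image")

# sample values stand in for a real command line in this sketch
args = vars(ap.parse_args(
    ["-p", "shape_predictor_68_face_landmarks.dat",
     "-i", "example_01.jpg"]))
```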

Lines 2-7 import our required Python packages.

We’ll be using the face_utils submodule of imutils to access our helper functions detailed above.

We’ll then import dlib. If you don’t already have dlib installed on your system, please follow the instructions in my previous blog post to get your system properly configured.

Lines 10-15 parse our command line arguments:

  • --shape-predictor: This is the path to dlib’s pre-trained facial landmark detector. You can download the detector model here or you can use the “Downloads” section of this post to grab the code + example images + pre-trained detector as well.
  • --image: The path to the input image that we want to detect facial landmarks on.

Now that our imports and command line arguments are taken care of, let’s initialize dlib’s face detector and facial landmark predictor:

Line 19 initializes dlib’s pre-trained face detector based on a modification to the standard Histogram of Oriented Gradients + Linear SVM method for object detection.

Line 20 then loads the facial landmark predictor using the path supplied via --shape-predictor.

But before we can actually detect facial landmarks, we first need to detect the face in our input image:

Line 23 loads our input image from disk via OpenCV, then pre-processes the image by resizing to have a width of 500 pixels and converting it to grayscale (Lines 24 and 25).

Line 28 handles detecting the bounding box of faces in our image.

The first parameter to the detector is our grayscale image (although this method can work with color images as well).

The second parameter is the number of image pyramid layers to apply when upscaling the image prior to applying the detector (this is the equivalent of computing cv2.pyrUp N times on the image).

The benefit of increasing the resolution of the input image prior to face detection is that it may allow us to detect more faces in the image — the downside is that the larger the input image, the more computationally expensive the detection process is.

Given the (x, y)-coordinates of the faces in the image, we can now apply facial landmark detection to each of the face regions:

We start looping over each of the face detections on Line 31.

For each of the face detections, we apply facial landmark detection on Line 35, giving us the 68 (x, y)-coordinates that map to the specific facial features in the image.

Line 36 then converts the dlib shape object to a NumPy array with shape (68, 2).

Lines 40 and 41 draw the bounding box surrounding the detected face on the image while Lines 44 and 45 draw the index of the face.

Finally, Lines 49 and 50 loop over the detected facial landmarks and draw each of them individually.

Lines 53 and 54 simply display the output image to our screen.
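Putting the walkthrough together, the detection-and-drawing logic can be sketched end-to-end as a single function. This is a sketch assuming the dlib, OpenCV, and imutils APIs described above; the authoritative, line-numbered listing ships with the “Downloads”:

```python
def detect_landmarks(image_path, predictor_path):
    """Sketch of the facial landmark pipeline described above."""
    import cv2
    import dlib
    import imutils
    from imutils import face_utils

    # initialize dlib's HOG + Linear SVM face detector and load the
    # pre-trained 68-point facial landmark predictor
    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor(predictor_path)

    # load the input image, resize it to a width of 500 pixels, and
    # convert it to grayscale
    image = cv2.imread(image_path)
    image = imutils.resize(image, width=500)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # detect faces (second argument = number of pyramid upscales)
    rects = detector(gray, 1)

    for (i, rect) in enumerate(rects):
        # predict the landmarks and convert them to a (68, 2) array
        shape = face_utils.shape_to_np(predictor(gray, rect))

        # draw the face bounding box and the face index
        (x, y, w, h) = face_utils.rect_to_bb(rect)
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(image, "Face #{}".format(i + 1), (x - 10, y - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

        # draw each of the 68 landmarks as a small red circle
        for (px, py) in shape:
            cv2.circle(image, (px, py), 1, (0, 0, 255), -1)

    return image
```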

Facial landmark visualizations

Before we test our facial landmark detector, make sure you have upgraded to the latest version of imutils, which includes the face_utils.py file:

Note: If you are using Python virtual environments, make sure you upgrade imutils inside the virtual environment.
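The upgrade command (also quoted in the comments below):

```shell
$ pip install --upgrade imutils
```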

From there, use the “Downloads” section of this guide to download the source code, example images, and pre-trained dlib facial landmark detector.

Once you’ve downloaded the .zip archive, unzip it, change directory to facial-landmarks, and execute the following command:
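Based on the invocation quoted in the comments below, the command looks like this (assuming the file names shipped with the downloads):

```shell
$ python facial_landmarks.py --shape-predictor shape_predictor_68_face_landmarks.dat \
	--image images/example_01.jpg
```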

Figure 3: Applying facial landmark detection using dlib, OpenCV, and Python.

Notice how the bounding box of my face is drawn in green while each of the individual facial landmarks is drawn in red.

The same is true for this second example image:

Figure 4: Facial landmarks with dlib.

Here we can clearly see that the red circles map to specific facial features, including my jawline, mouth, nose, eyes, and eyebrows.

Let’s take a look at one final example, this time with multiple people in the image:

Figure 5: Detecting facial landmarks for multiple people in an image.

For both people in the image (myself and Trisha, my fiancée), our faces are not only detected but also annotated via facial landmarks as well.

Summary

In today’s blog post we learned what facial landmarks are and how to detect them using dlib, OpenCV, and Python.

Detecting facial landmarks in an image is a two-step process:

  1. First we must localize the face(s) in an image. This can be accomplished using a number of different techniques, but it normally involves either Haar cascades or HOG + Linear SVM detectors (although any approach that produces a bounding box around the face will suffice).
  2. Apply the shape predictor, specifically a facial landmark detector, to obtain the (x, y)-coordinates of the face regions in the face ROI.

Given these facial landmarks we can apply a number of computer vision techniques, including:

  • Face part extraction (i.e., nose, eyes, mouth, jawline, etc.)
  • Facial alignment
  • Head pose estimation
  • Face swapping
  • Blink detection
  • …and much more!

In next week’s blog post I’ll be demonstrating how to access each of the face parts individually and extract the eyes, eyebrows, nose, mouth, and jawline features simply by using a bit of NumPy array slicing magic.

To be notified when this next blog post goes live, be sure to enter your email address in the form below!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!


35 Responses to Facial landmarks with dlib, OpenCV, and Python

  1. Anto April 3, 2017 at 11:24 am #

    Superrrrrbbbbbbbbbb blog .searched continuosly about facial landmarks.wellll explained.looking forward for face recongnition using facial landmark measurements……please go ahead soon..!!!!
    awesome blog….!!!!!!

    • Adrian Rosebrock April 3, 2017 at 1:50 pm #

      Thanks Anto!

  2. Danny April 3, 2017 at 3:51 pm #

    Thank you so much Adrian!!

  3. Mário April 3, 2017 at 8:37 pm #

    Very good job Adrian.
    All of your explanation are very useful.
    This one, in special, is very important for my research.
    Thank you a lot!!!

    • Adrian Rosebrock April 5, 2017 at 12:03 pm #

      Thank you Mário! 🙂 I wish you the best of luck with your research.

  4. Abkul Orto April 4, 2017 at 4:47 am #

    This is a clear, clean, and EXCELLENT demystification of the concept.

    Any plan to include this concept and the Deep learning version of training and implementation in your up coming Deep learning book?

    • Adrian Rosebrock April 5, 2017 at 12:01 pm #

      Thanks Abkul, I’m glad you enjoyed the tutorial!

      I don’t plan on covering how to train custom facial landmark detectors via deep learning inside Deep Learning for Computer Vision with Python, but I will consider it for a future tutorial.

  5. Dimitri April 4, 2017 at 6:33 am #

    This blog is a goldmine. Thank you so much for writing this.

    • Adrian Rosebrock April 5, 2017 at 12:00 pm #

      I’m glad you’re enjoying the blog Dimitri, I’m happy to share 🙂

  6. Thimira Amaratunga April 4, 2017 at 12:23 pm #

    Hi Adrian,

    This is a great article. Cant wait for next week’s article on how to access face features individually.

    After some experimenting (and with your hint on array slicing), I managed to extract the features. Here is the method I used,
    http://www.codesofinterest.com/2017/04/extracting-individual-facial-features-dlib.html

    Undoubtedly, this method I used could use more improvements.
    So, waiting for your article 🙂

    • Adrian Rosebrock April 5, 2017 at 11:57 am #

      Nice job Thimira. The method I will demonstrate in next week’s blog post is similar, but uses the face_utils sub-module of imutils for convenience. I’ll also be demonstrating how to draw semi-transparent overlays for each region of the face.

  7. Neeraj Kumar April 4, 2017 at 2:58 pm #

    Hey Adrian,

    I have already configured dlib with your previous week blog and now when i am trying to run “python facial_landmarks.py –shape-predictor shape_predictor_68_face_landmarks.dat \
    –image images/example_01.jpg” command my ubuntu terminal is showing error
    “python: can’t open file ‘facial_landmarks.py’ : [Errno 2] no such file or directory “.

    PS : I have already downloaded your code and files and i am running my code inside that ‘facial-landmarks’ folder. All the files are present as well.

  8. Neeraj Kumar April 4, 2017 at 3:13 pm #

    Dear Adrian,

    Fixed the previous issue by providing the full path of py file. Thanks for this great blog.

    Thanks and Regards
    Neeraj Kumar

  9. Neeraj Kumar April 4, 2017 at 3:25 pm #

    Dear Adrian,

    I tried working for side faces but its not working, can you please guide what can be the possibilities for side face landmark detection and yes i was also trying working on your example_02.jpg there imutils.resize() method was giving some error.
    Attribute error : ‘NoneType’ object has no attribute ‘shape’.

    Thanks and Regards
    Neeraj Kumar

    • Adrian Rosebrock April 5, 2017 at 11:56 am #

      If you’re getting a “NoneType” error it’s because you supplied an invalid path to the input image. You can read more about NoneType errors in this blog post.

      • Neeraj Kumar April 7, 2017 at 6:02 am #

        Fixed Buddy. Thanks a ton.
        can you please help me out with – how can i detect landmarks in video and compare with existing dataset of images.

        • Adrian Rosebrock April 8, 2017 at 12:51 pm #

          I will be discussing landmark detection in video streams in next weeks blog post.

  10. Manh Nguyen April 5, 2017 at 2:01 am #

    I hope next post you can use infrared camera

    • Adrian Rosebrock April 5, 2017 at 11:49 am #

      I don’t have any plans right now to cover IR cameras, but I’ll add it to my potential list of topics to cover.

  11. Sachin April 5, 2017 at 3:57 am #

    Nice article Adrian! Btw shouldn’t the shape points in Figure 2 be 0 based?

    • Adrian Rosebrock April 5, 2017 at 11:49 am #

      Figure 2 was provided by the creators of the iBUG dataset. They likely used MATLAB which is 1-indexed rather than 0-indexed.

    • Oli April 6, 2017 at 4:06 am #

      I also came across this. I have created an image and printed the index numbers as they are with dlib and python here: http://cvdrone.de/dlib-facial-landmark-detection.html

  12. Parag Jain April 5, 2017 at 10:47 am #

    Isn’t Independent Component Analysis used to find local features of a face? How is that approach different from this? Advantages? Drawbacks?

  13. Mansoor April 5, 2017 at 11:30 am #

    Adrian, i’m a huge fan! i don’t know how to thank you for this.

    I don’t know but i am having trouble running this code. It says that imutils package does not contain face_utils. I think it is not upgrading properly.

    • Adrian Rosebrock April 5, 2017 at 11:46 am #

      Make sure you run:

      $ pip install --upgrade imutils

      To make sure you have the latest version of imutils installed. You can check which version is installed via pip freeze

  14. addouch April 6, 2017 at 3:23 pm #

    amazing adrian

    I hope next time to show us how to recognize emotions on image

  15. tony April 6, 2017 at 3:53 pm #

    Hi Adrian ,thanks for this great post

    how dlib eye landmarks can be used to detect eye blinks ?

    • Adrian Rosebrock April 8, 2017 at 12:53 pm #

      Hi Tony — I’ll be covering how to perform blink detection with facial landmarks and dlib in the next two weeks. Stay tuned!

  16. bumbu April 10, 2017 at 8:59 am #

    May we have a tutorial about apply deep learning(CNN) using keras and tensorflow to classify some dataset, thanks sir, you are super!!!

  17. Rijul Paul April 24, 2017 at 4:38 am #

    Hey Adrian,thanks for this blog post.IS there a way so that we can create our own custom shape predictor?

    • Adrian Rosebrock April 24, 2017 at 9:32 am #

      Yes, but you will have to use the dlib library + custom C++ code to train the custom shape predictor.

Trackbacks/Pingbacks

  1. Real-time facial landmark detection with OpenCV, Python, and dlib - PyImageSearch - April 17, 2017

    […] We’ve started off by learning how to detect facial landmarks in an image. […]
