Detect eyes, nose, lips, and jaw with dlib, OpenCV, and Python

Today’s blog post is part three in our current series on facial landmark detection and its applications to computer vision and image processing.

Two weeks ago I demonstrated how to install the dlib library which we are using for facial landmark detection.

Then, last week I discussed how to use dlib to actually detect facial landmarks in images.

Today we are going to take the next step and use our detected facial landmarks to help us label and extract face regions, including:

  • Mouth
  • Right eyebrow
  • Left eyebrow
  • Right eye
  • Left eye
  • Nose
  • Jaw

To learn how to extract these face regions individually using dlib, OpenCV, and Python, just keep reading.

Looking for the source code to this post?
Jump right to the downloads section.

Detect eyes, nose, lips, and jaw with dlib, OpenCV, and Python

Today’s blog post will start with a discussion on the (x, y)-coordinates associated with facial landmarks and how these facial landmarks can be mapped to specific regions of the face.

We’ll then write a bit of code that can be used to extract each of the facial regions.

We’ll wrap up the blog post by demonstrating the results of our method on a few example images.

By the end of this blog post, you’ll have a strong understanding of how face regions are (automatically) extracted via facial landmarks and will be able to apply this knowledge to your own applications.

Facial landmark indexes for face regions

The facial landmark detector implemented inside dlib produces 68 (x, y)-coordinates that map to specific facial structures. These 68 point mappings were obtained by training a shape predictor on the labeled iBUG 300-W dataset.

Below we can visualize what each of these 68 coordinates map to:

Figure 1: Visualizing each of the 68 facial coordinate points from the iBUG 300-W dataset (higher resolution).

Examining the image, we can see that facial regions can be accessed via simple Python indexing (assuming zero-indexing with Python since the image above is one-indexed):

  • The mouth can be accessed through points [48, 68].
  • The right eyebrow through points [17, 22].
  • The left eyebrow through points [22, 27].
  • The right eye using [36, 42].
  • The left eye with [42, 48].
  • The nose using [27, 35].
  • And the jaw via [0, 17].

These mappings are encoded inside the FACIAL_LANDMARKS_IDXS  dictionary inside face_utils of the imutils library:
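The listing itself is not reproduced here, but the dictionary is essentially an OrderedDict mapping a region name to a (start, end) pair of indexes into the 68-point array. A minimal sketch, mirroring the index ranges listed above, looks like this:

from collections import OrderedDict

# each facial region maps to a (start, end) slice into the 68 landmark points
FACIAL_LANDMARKS_IDXS = OrderedDict([
    ("mouth", (48, 68)),
    ("right_eyebrow", (17, 22)),
    ("left_eyebrow", (22, 27)),
    ("right_eye", (36, 42)),
    ("left_eye", (42, 48)),
    ("nose", (27, 35)),  # as noted in the comments, later imutils releases use (27, 36) so all nine nose points are included
    ("jaw", (0, 17))
])

# example usage: grab the start/end indexes for the mouth region
(j, k) = FACIAL_LANDMARKS_IDXS["mouth"]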

Using this dictionary we can easily extract the indexes into the facial landmarks array and extract various facial features simply by supplying a string as a key.

Visualizing facial landmarks with OpenCV and Python

A slightly harder task is to visualize each of these facial landmarks and overlay the results on an input image.

To accomplish this, we’ll need the visualize_facial_landmarks  function, already included in the imutils library:
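The function ships with imutils, so you do not need to write it yourself. The sketch below approximates the first half of the implementation, and the two snippets that follow continue it; the line numbers referenced in the walkthrough refer to the original face_utils.py listing, not to this sketch:

# approximate sketch of imutils.face_utils.visualize_facial_landmarks
import cv2

def visualize_facial_landmarks(image, shape, colors=None, alpha=0.75):
    # create two copies of the input image -- one for the
    # overlay and one for the final output image
    overlay = image.copy()
    output = image.copy()

    # if the colors list is None, initialize it with a unique
    # BGR color for each facial landmark region (the exact preset
    # values inside imutils may differ from the ones shown here)
    if colors is None:
        colors = [(19, 199, 109), (79, 76, 240), (230, 159, 23),
                  (168, 100, 168), (158, 163, 32),
                  (163, 38, 32), (180, 42, 220)]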

Our visualize_facial_landmarks  function requires two arguments, followed by two optional ones, each detailed below:

  • image : The image that we are going to draw our facial landmark visualizations on.
  • shape : The NumPy array that contains the 68 facial landmark coordinates that map to various facial parts.
  • colors : A list of BGR tuples used to color-code each of the facial landmark regions.
  • alpha : A parameter used to control the opacity of the overlay on the original image.

Lines 45 and 46 create two copies of our input image — we’ll need these copies so that we can draw a semi-transparent overlay on the output image.

Line 50 makes a check to see if the colors  list is None , and if so, initializes it with a preset list of BGR tuples (remember, OpenCV stores colors/pixel intensities in BGR order rather than RGB).

We are now ready to visualize each of the individual facial regions via facial landmarks:
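Continuing the sketch of visualize_facial_landmarks from above, the loop over the regions looks roughly like this:

    # loop over the facial landmark regions individually
    for (i, name) in enumerate(FACIAL_LANDMARKS_IDXS.keys()):
        # grab the (x, y)-coordinates associated with the face landmark
        (j, k) = FACIAL_LANDMARKS_IDXS[name]
        pts = shape[j:k]

        # since the jawline is a non-enclosed facial region,
        # just draw lines between the consecutive (x, y)-coordinates
        if name == "jaw":
            for l in range(1, len(pts)):
                ptA = tuple(pts[l - 1])
                ptB = tuple(pts[l])
                cv2.line(overlay, ptA, ptB, colors[i], 2)

        # otherwise, compute the convex hull of the facial landmark
        # coordinates and draw it on the overlay as a filled contour
        else:
            hull = cv2.convexHull(pts)
            cv2.drawContours(overlay, [hull], -1, colors[i], -1)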

On Line 56 we loop over each entry in the FACIAL_LANDMARKS_IDXS  dictionary.

For each of these regions, we extract the indexes of the given facial part and grab the (x, y)-coordinates from the shape  NumPy array.

Lines 63-69 make a check to see if we are drawing the jaw, and if so, we simply loop over the individual points, drawing a line connecting the jaw points together.

Otherwise, Lines 73-75 handle computing the convex hull of the points and drawing the hull on the overlay.

The last step is to create a transparent overlay via the cv2.addWeighted  function:
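Roughly, the sketch ends by blending the overlay onto the output image and returning the result:

    # apply the transparent overlay and hand back the blended result
    cv2.addWeighted(overlay, alpha, output, 1 - alpha, 0, output)

    # return the output image
    return output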

After applying visualize_facial_landmarks  to an image and associated facial landmarks, the output would look similar to the image below:

Figure 2: A visualization of each facial landmark region overlaid on the original image.

To learn how to glue all the pieces together (and extract each of these facial regions), let’s move on to the next section.

Extracting parts of the face using dlib, OpenCV, and Python

Before you continue with this tutorial, make sure you have:

  1. Installed dlib according to my instructions in this blog post.
  2. Installed/upgraded imutils to the latest version, ensuring you have access to the face_utils  submodule:  pip install --upgrade imutils

From there, open up a new file, name it detect_face_parts.py , and insert the following code:
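A sketch of the opening block follows; the command line flag names (--shape-predictor and --image) and the resize width are assumptions based on the rest of this post, and the line numbers cited in the walkthrough refer to the original listing:

# import the necessary packages
from imutils import face_utils
import numpy as np
import argparse
import imutils
import dlib
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--shape-predictor", required=True,
    help="path to facial landmark predictor")
ap.add_argument("-i", "--image", required=True,
    help="path to input image")
args = vars(ap.parse_args())

# initialize dlib's HOG-based face detector, then load the
# facial landmark predictor from disk
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(args["shape_predictor"])

# load the input image, resize it, and convert it to grayscale
image = cv2.imread(args["image"])
image = imutils.resize(image, width=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# detect faces in the grayscale image
rects = detector(gray, 1)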

The first code block in this example is identical to the one in our previous tutorial.

We are simply:

  • Importing our required Python packages (Lines 2-7).
  • Parsing our command line arguments (Lines 10-15).
  • Instantiating dlib’s HOG-based face detector and loading the facial landmark predictor (Lines 19 and 20).
  • Loading and pre-processing our input image (Lines 23-25).
  • Detecting faces in our input image (Line 28).

Again, for a more thorough, detailed overview of this code block, please see last week’s blog post on facial landmark detection with dlib, OpenCV, and Python.

Now that we have detected faces in the image, we can loop over each of the face ROIs individually:
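A sketch of that loop follows (again, an approximation of the original listing rather than a verbatim copy):

# loop over the face detections
for (i, rect) in enumerate(rects):
    # determine the facial landmarks for the face region, then
    # convert the landmarks to a NumPy array of (x, y)-coordinates
    shape = predictor(gray, rect)
    shape = face_utils.shape_to_np(shape)

    # loop over the face parts individually
    for (name, (j, k)) in face_utils.FACIAL_LANDMARKS_IDXS.items():
        # clone the original image so we can draw on it, then
        # display the name of the face part on the image
        clone = image.copy()
        cv2.putText(clone, name, (10, 30), cv2.FONT_HERSHEY_SIMPLEX,
            0.7, (0, 0, 255), 2)

        # loop over the subset of facial landmarks, drawing each one
        for (x, y) in shape[j:k]:
            cv2.circle(clone, (x, y), 1, (0, 0, 255), -1)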

For each face region, we determine the facial landmarks of the ROI and convert the 68 points into a NumPy array (Lines 34 and 35).

Then, on Line 38, we loop over each of the face parts individually.

We draw the name/label of the face region on Lines 42 and 43, then draw each of the individual facial landmarks as circles on Lines 47 and 48.

To actually extract each of the facial regions we simply need to compute the bounding box of the (x, y)-coordinates associated with the specific region and use NumPy array slicing to extract it:
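Still inside the inner loop over the face parts, the extraction, display, and final overlay steps look roughly like this:

        # extract the ROI of the face region as a separate image
        (x, y, w, h) = cv2.boundingRect(np.array([shape[j:k]]))
        roi = image[y:y + h, x:x + w]
        roi = imutils.resize(roi, width=250, inter=cv2.INTER_CUBIC)

        # show the particular face part along with the annotated image
        cv2.imshow("ROI", roi)
        cv2.imshow("Image", clone)
        cv2.waitKey(0)

    # visualize all facial landmarks for this face with a transparent overlay
    output = face_utils.visualize_facial_landmarks(image, shape)
    cv2.imshow("Image", output)
    cv2.waitKey(0)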

Computing the bounding box of the region is handled on Line 51 via cv2.boundingRect .

Using NumPy array slicing we can extract the ROI on Line 52.

This ROI is then resized to have a width of 250 pixels so we can better visualize it (Line 53).

Lines 56-58 display the individual face region to our screen.

Lines 61-63 then apply the visualize_facial_landmarks  function to create a transparent overlay for each facial part.

Face part labeling results

Now that our example has been coded up, let’s take a look at some results.

Be sure to use the “Downloads” section of this guide to download the source code + example images + dlib facial landmark predictor model.

From there, you can use the following command to visualize the results:
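Assuming the example image included in the download is named example_01.jpg (the exact filename may differ), the command looks like:

$ python detect_face_parts.py --shape-predictor shape_predictor_68_face_landmarks.dat \
    --image images/example_01.jpg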

Notice how my mouth is detected first:

Figure 3: Extracting the mouth region via facial landmarks.

Followed by my right eyebrow:

Figure 4: Determining the right eyebrow of an image using facial landmarks and dlib.

Then the left eyebrow:

Figure 5: The dlib library can extract facial regions from an image.

Next comes the right eye:

Figure 6: Extracting the right eye of a face using facial landmarks, dlib, OpenCV, and Python.

Along with the left eye:

Figure 7: Extracting the left eye of a face using facial landmarks, dlib, OpenCV, and Python.

And finally the jawline:

Figure 8: Automatically determining the jawline of a face with facial landmarks.

As you can see, the bounding box of the jawline is my entire face.

The last visualization for this image is our transparent overlay with each facial landmark region highlighted in a different color:

Figure 9: A transparent overlay that displays the individual facial regions extracted from the image via facial landmarks.

Let’s try another example:

This time I have created a GIF animation of the output:

Figure 10: Extracting facial landmark regions with computer vision.

The same goes for our final example:

Figure 11: Automatically labeling eyes, eyebrows, nose, mouth, and jaw using facial landmarks.

Summary

In this blog post I demonstrated how to detect various facial structures in an image using facial landmark detection.

Specifically, we learned how to detect and extract the:

  • Mouth
  • Right eyebrow
  • Left eyebrow
  • Right eye
  • Left eye
  • Nose
  • Jawline

This was accomplished using dlib’s pre-trained facial landmark detector along with a bit of OpenCV and Python magic.

At this point you’re probably quite impressed with the accuracy of facial landmarks — and there are clear advantages of using facial landmarks, especially for face alignment, face swapping, and extracting various facial structures.

…but the big question is:

“Can facial landmark detection run in real-time?”

To find out, you’ll need to stay tuned for next week’s blog post.

To be notified when next week’s blog post on real-time facial landmark detection is published, be sure to enter your email address in the form below!

See you then.

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!


57 Responses to Detect eyes, nose, lips, and jaw with dlib, OpenCV, and Python

  1. Neeraj Kumar April 10, 2017 at 1:53 pm #

    Dear Adrian,

    You have seriously got a fan man, amazing explanations. It’s something like now i do wait for your new upcoming blog posts just as i do wait for GOT new season.
    Actually you are the one who make this computer vision concept very simple, which is otherwise not.
    Respect.

    Thanks and Regards
    Neeraj Kumar

    • Adrian Rosebrock April 12, 2017 at 1:14 pm #

      Thank you Neeraj, I really appreciate that. Comments like these really make my day 🙂

      • Neeraj Kumar April 12, 2017 at 3:16 pm #

        Keep smiling and keep posting awesome blog posts.

        • Adrian Rosebrock April 16, 2017 at 9:09 am #

          Thanks Neeraj!

  2. Anthony The Koala April 10, 2017 at 2:31 pm #

    Dear Dr Adrian,
    There is a lot to learn in your blogs and I thank you for these blogs. I hope I am not off-topic. Recently in the news there was a smart phone that could detect whether the image of a face was from a real person or a photograph. If the image was that of a photograph the smart phone would not allow the user to use the phone’s facility.
    To apply my question to today’s blog in detecting eyes, nose and jaw, is there a way to tell whether the elements of the face can be from a real face or a photo of a face?
    Thank you
    Anthony of Sydney NSW

    • Adrian Rosebrock April 12, 2017 at 1:13 pm #

      There are many methods to accomplish this, but the most reliable is to use stereo/depth cameras so you can determine the depth of the face versus a flat 2D space. As for the actual article you’re referring to, I haven’t read it so it would be great if you could link to it.

      • Anthony The Koala April 13, 2017 at 3:23 am #

        Dear Dr Adrian,
        I apologise for forgetting to include the word ‘not’. It should read “there was a smart phone that could not detect whether the image of a face was from a real person or a photograph.”

        Similar article, with a statement from Samsung saying that facial recognition currently “..cannot be used to authenticate access to Samsung Pay or Secure Folder….”

        The solution may well be that your authentication system needs two cameras for 3D, or more ‘clever’ 2D techniques, so that the authentication system cannot be ‘tricked’.

        Regards
        Anthony of Sydney NSW

  3. ulzii April 10, 2017 at 9:50 pm #

    you can detect beautiful lady as well 🙂

    • Adrian Rosebrock April 12, 2017 at 1:12 pm #

      That beautiful lady is my fiancée.

  4. Leena April 10, 2017 at 11:54 pm #

    Excellent. Simple and detailed code instructions. Can you give more details on how to define the facial parts? The definition of eye/nose/jaw…

    Thanks

    • Adrian Rosebrock April 12, 2017 at 1:12 pm #

      Hi Leena — what do you mean by “define the facial part”?

  5. Rencita April 11, 2017 at 2:49 am #

    Hello Adrian,
    I tried your post for detecting facial features, but it gives me an error saying:
    RuntimeError: Unable to open shape_predictor_68_face_landmarks.dat

    Where could I have gone wrong?

    Thanks in advance

    • Adrian Rosebrock April 12, 2017 at 1:10 pm #

      Make sure you use the “Downloads” section of this blog post to download the shape_predictor_68_face_landmarks.dat and source code. From there the example will work.

    • Salma August 11, 2017 at 7:45 am #

      Hello, so basically if you downloaded “shape_predictor_68_face_landmarks.dat.bz2” with the wget method, you need to unzip it:

      bzip2 -d filename.bz2 or bzip2 -dk filename.bz2 if you want to keep the original archive

  6. aravind April 12, 2017 at 11:57 am #

    Hi adrian,

    how can this be used for video instead of the image as argument.

    • Adrian Rosebrock April 12, 2017 at 12:54 pm #

      I will be demonstrating how to apply facial landmarks to video streams in next week’s blog post.

  7. Wim Valcke April 17, 2017 at 1:21 pm #

    First of all, nice blog post. Keep up doing this, i learned a lot about computer vision in a limited amount of time. I will buy definitely your new book about deep learning.

    There is a small error in the face_utils.py of the imutils library
    In the definition of FACIAL_LANDMARKS_IDXS

    (“nose”, (27, 35)),

    This should be (“nose”, (27, 36)),

    The nose should contain 9 points; in the existing implementation there are only 8.
    This can be seen in the example images too.

    • Adrian Rosebrock April 19, 2017 at 12:56 pm #

      Good catch — thanks Wim! I’ll make sure this change is made in the latest version of imutils and I’ll also get the blog post updated as well.

  8. Taimur Bilal April 17, 2017 at 2:16 pm #

    This is amazing. I just wanted to ask one question.

    If you are running a detector on these images coming from polling a video stream, can we say you are tracking the facial features? Is it true that tracking can be implemented by running an “ultra fast” detector on every frame of a video stream?

    Again, thanks a lot for these amazing tutorials.

    • Adrian Rosebrock April 19, 2017 at 12:55 pm #

      “Tracking” algorithms by definition try to incorporate some other extra temporal information into the algorithm so running an ultra fast face detector on a video stream isn’t technically a true tracking algorithm, but it would give you the same result, provided that no two faces overlap in the video stream.

  9. Dimas April 18, 2017 at 1:23 am #

    Daamn.. ENGINEER WILL ALWAYS WIN!
    http://www.pyimagesearch.com/wp-content/uploads/2017/03/detect_face_parts_example_03.gif

  10. Anthony The Koala April 24, 2017 at 1:11 pm #

    Dear Dr Adrian,
    In figure 10, “Extracting facial landmark regions with computer vision,” how is it that the program could differentiate between the face of a human and that of a non-human? This is in contrast to figure 11, where the algorithm detects two human faces.

    Thank you,
    Anthony of Sydney NSW

    • Adrian Rosebrock April 28, 2017 at 10:00 am #

      The best way to handle determining if a face is a “real” face or simply a photo of a face is to apply face detection + use a depth camera so you can compute the depth of the image. If the face is “flat” then you know it’s a photograph.

      • Anthony The Koala April 28, 2017 at 9:49 pm #

        Dear Dr Adrian,
        I think I should have been clearer. The question was not about distinguishing a fake from a real which you addressed earlier.

        I should have been more direct. In figure 10, there is a picture of you with your dog.
        How does the algorithm make the distinction between a human face (yours) and a non-human face (the dog’s)?

        Or put it another way, how did the algorithm make the distinction between a human face and a dog. In other words how did the algorithm detect that the dog’s face was not human.

        This is in contrast to figure 11, there is a picture of you with your special lady. The algorithm could detect that there were two human faces present in contrast to figure 10 with one human face and one non-human face.

        I hope I was clearer

        I thank you for your tutorial/blogs,

        Regards

        Anthony, Sydney NSW

        • Adrian Rosebrock May 1, 2017 at 1:46 pm #

          To start, I think it would be beneficial to understand how object detectors work. The dlib library ships with an object detector that is pre-trained to detect human faces. That is why we can detect human faces in the image but not dog faces.

          • Anthony The Koala May 1, 2017 at 8:27 pm #

            Dear Dr Adrian,
            Thank you for the link http://www.pyimagesearch.com/2014/11/10/histogram-oriented-gradients-object-detection/. The article provides a quick review of the various object detection techniques and shows how the early detection methods, for example using Haar wavelets, produce false positives, as demonstrated by the soccer player’s face and part of the soccer field’s side advertising being detected as a face.
            The article then goes on to give a brief step-by-step guide to object detection based on papers by Felzenszwalb et al. and Tomasz.
            A lot to learn
            Anthony of Sydney NSW Australia

  11. Hailay Berihu April 25, 2017 at 5:44 am #

    Thank you very much Dr. Adrian! All your blogs are amazing and timely. However, this one is so special to me!! I am enjoying all your blogs! Keep it up!

    • Adrian Rosebrock April 25, 2017 at 11:46 am #

      Thanks Hailay!

  12. Helen April 27, 2017 at 6:06 am #

    Hi, Adrian. Your work is amazing and very useful to me. I’m an undergraduate student and I’m learning about OpenCV and computer vision. For my graduation project, I want to write a program to apply simple virtual makeup. I’m a lazy girl; I want to know what I would look like if I put on makeup. My idea is to draw different colors on different parts of the face, like red on the lips or pink on the cheeks or something like that. Right now, I can detect 65 points of a face using a real-time camera. I’m writing to ask you: using your approach, can I build my virtual makeup program? And I want to know if you have any good ideas about it. Your advice will be welcome and appreciated.

    best wishes!
    Helen

    • Adrian Rosebrock April 28, 2017 at 9:30 am #

      Yes, this is absolutely possible. Once you have detected each facial structure you can apply alpha blending and perhaps even a bit of seamless cloning to accomplish this. It’s not an easy project and will require much research on your part, but again, it’s totally possible.

      • Helen April 28, 2017 at 9:33 pm #

        Thank you Adrian. I’ll try my best to finish it.

  13. Rishabh Gupta April 29, 2017 at 11:43 am #

    I’ve two questions

    #1. While extracting the ROI of the face region as a separate image, on line 52 why have you used
    roi = image[y:y+h, x:x+w] . Shouldn’t it be the reverse ? i.e. roi = image[x:x+w, y:y+h] ??

    #2. What does INTER_CUBIC mean? I’ve checked the documentation. It says INTER_CUBIC is slow. So why use it in the first place if you have a better alternative (INTER_LINEAR) available?

    Thanks in advance.

    • Adrian Rosebrock May 1, 2017 at 1:35 pm #

      1. No, images are matrices. We access an individual element of a matrix by supplying the row value first (the y-coordinate) followed by the column number (the x-coordinate). Therefore, roi = image[y:y+h, x:x+w] is correct, although it may feel awkward to write.

      2. This implies that we are doing cubic interpolation, which is indeed slower than linear interpolation, but is better at upsampling images.

      If you’re just getting started with computer vision, image processing, and OpenCV I would definitely suggest reading through Practical Python and OpenCV as this will help you learn the fundamentals quickly. Be sure to take a look!

  14. Anthony The Koala May 11, 2017 at 3:05 am #

    Dear Dr Adrian,
    Suppose a camera was fitted with a fisheye lens. Recall that a fisheye lens produces a wide angled image. As a result the image will be distorted.
    Question:
    If the image is distorted, is there a way of ‘processing’/’correcting’ the distorted image to a normal image then apply face detection.
    Alternatively, if the camera has a fisheye lens, can a detection algorithm such as yours handle face detection.
    Alternatively, is there a correction algorithm for a fisheye lens.
    Thank you,
    Anthony of Sydney Australia

    • Adrian Rosebrock May 11, 2017 at 8:43 am #

      If you’re using a fisheye lens you can actually “correct” for the fisheye distortion. I would suggest starting here for more details.

  15. Memoona May 13, 2017 at 8:34 am #

    Hi adrian, thanks a lot for this blog and all others too i have learnt a lot from you. I have a few questions please.

    1. If I want to detect a smile on a face by measuring the distance between landmarks 49 and 65 by applying a simple distance formula, the unit of distance will be pixels. So my question is: how can I know the x and y coordinates for particular landmarks, so I can apply the mathematics and compare with a database image?
    2. I want to do both face recognition and emotion detection, so is there any way I can make it faster? At least near real time?

    Stay blessed
    Memoona

    • Adrian Rosebrock May 15, 2017 at 8:49 am #

      1. It’s certainly possible to build a smile detector using facial landmarks, but it would be very error prone. Instead, it would be better to train a machine learning classifier on smiling vs. not smiling faces. I cover how to do this inside Deep Learning for Computer Vision with Python.

      2. I also cover real-time facial expression/emotion detection inside Deep Learning for Computer Vision with Python as well.

  16. Arun VIJAY June 15, 2017 at 6:40 am #

    Hi Adrian,

    Is it possible to do face verification with your face recognition code? For example, the input is two images: one is my face on a company ID card and the other is a selfie. I need to compare them and find out whether both show the same person.

  17. bharath grandhi June 19, 2017 at 7:01 am #

    File “detect_face_parts.py”, line 5, in
    from imutils import face_utils
    ImportError: cannot import name face_utils
    Sir, can I know the solution for this error?

    • Adrian Rosebrock June 20, 2017 at 10:59 am #

      Make sure you update your version of “imutils” to the latest version:

      $ pip install --upgrade imutils

  18. Valeriano July 4, 2017 at 3:09 pm #

    Dear Adrian. Thanks so much for your clear explanation. This blog is very useful; I’ve learnt about computer vision in a couple of days. I have a problem when I’m trying to execute this program in the Ubuntu terminal. The following error appears:

    Illegal instruction (core dumped)

    I’ve read about it and it’s probably a Boost.Python problem. Can you give me some help to solve it?

    • Adrian Rosebrock July 5, 2017 at 5:56 am #

      I would insert a bunch of “print” statements to determine which line is throwing the error. My guess is that the error is coming from “import dlib”, in which case you are importing the library into a different Python version than the one it was compiled against.

  19. Abhishek Mane July 6, 2017 at 12:45 pm #

    Hey Mr. Adrian, nice tutorial. I wanted to ask: can I do this for Android, since it requires so many libraries? I’m trying to create an augmented reality program for Android using the Unity game engine, so can you tell me how this would work relative to Unity?

    • Adrian Rosebrock July 7, 2017 at 9:53 am #

      I don’t have any experience developing Android applications. There are Java + OpenCV bindings, but you’ll need to look for a Java implementation of facial landmarks.

  20. Arick Chen July 18, 2017 at 10:37 am #

    Dear Adrian,
    Thanks a lot for all your work on this blog and the code. It is really amazing that it can detect eyes more accurately than many other face detection APIs.

    There is a question I want to ask.
    Recently, I have been doing research which needs really accurate eye landmarks, and this tutorial almost meets my needs.
    However, I also need the landmarks of the pupils. Have you ever done that before? Or how can I get an accurate pupil landmark for the eye?

    • Adrian Rosebrock July 21, 2017 at 9:06 am #

      I personally haven’t done any work in pupil detection, but I’ve heard that others have had good luck with this tutorial.

  21. CHIARA ANDREA LANTAJO August 4, 2017 at 7:02 am #

    Thank you so much for this tutorial! It worked perfectly. I have a question though: can this method of detecting facial features work with images that do not contain whole faces? For example, the picture that I’m about to process only contains the nose, eyes, and eyebrows (basically zoomed-in images). Or does it only work on images with all the facial features specified above?

    I’m actually trying to detect the center of the nose using zoomed-in images that only contain the eyes, the nose, and a little bit of the eyebrows. If this method of detection will not work, can you please suggest any other method that I can use. Thank you so much. 🙂

    • Adrian Rosebrock August 4, 2017 at 7:13 am #

      If you can detect the face via a Haar cascade or HOG + Linear SVM detector provided by dlib then you can fit the facial landmarks to the face. The problem is that if you do not have the entire face in view, then the landmarks may not be entirely accurate.

      If you are working with zoomed-in images, I would suggest training your own custom nose and eye detectors. OpenCV also provides a number of pre-trained Haar cascades for this.

  • Shahnawaz Shaikh August 5, 2017 at 10:17 am #

    Hi Adrian, fantastic post. After using HOG I am able to track the landmarks of the face in the video. But is it possible to track the face just the way you did in the green ball example, so as to track a person’s attention? Like if he moves his face up, down, or sideways there should be a prompt like “subject is distracted”. Help much appreciated.

    • Adrian Rosebrock August 10, 2017 at 9:09 am #

      You could monitor the (x, y)-coordinates of the facial landmarks. If they change in direction you can use this to determine if the person is changing the viewing angle of their face.

  • Abhranil August 7, 2017 at 2:57 pm #

    How will I detect the nose, eyes, and other features in the face? I am a beginner.
    Thanks in advance.

    • Adrian Rosebrock August 10, 2017 at 9:00 am #

      Hi Abhranil — I’m not sure what you mean. This blog post explains how to extract the nose, eyes, etc.

  • Hansani August 14, 2017 at 5:02 am #

    Hello Adrian,
    I need to detect the face when the eyes are covered with a hand, using a 2D video. I couldn’t do this because both the eyes and the hands are skin colored. Could you please help me?
    Thank You

    • Adrian Rosebrock August 14, 2017 at 1:07 pm #

      I would suggest using either pre-trained OpenCV Haar cascades for nose/lip detection or training your own classifier here. This will help if the eyes are obstructed.

  • preethi August 22, 2017 at 8:36 am #

    hi sir!!
    Thanks a lot for your wonderful blog posts. I built the face recognition API and eye detection from your blog. Now I am trying to build an eye recognition API; I detected the pupil and iris but I don’t know how to recognize them. Can you please help me!

  • Trackbacks/Pingbacks

    1. Drowsiness detection with OpenCV - PyImageSearch - May 8, 2017

      […] The facial landmarks produced by dlib are an indexable list, as I describe here: […]
