Detect eyes, nose, lips, and jaw with dlib, OpenCV, and Python

Today’s blog post is part three in our current series on facial landmarks and their applications to computer vision and image processing.

Two weeks ago I demonstrated how to install the dlib library which we are using for facial landmark detection.

Then, last week I discussed how to use dlib to actually detect facial landmarks in images.

Today we are going to take the next step and use our detected facial landmarks to help us label and extract face regions, including:

  • Mouth
  • Right eyebrow
  • Left eyebrow
  • Right eye
  • Left eye
  • Nose
  • Jaw

To learn how to extract these face regions individually using dlib, OpenCV, and Python, just keep reading.

Looking for the source code to this post?
Jump right to the downloads section.

Detect eyes, nose, lips, and jaw with dlib, OpenCV, and Python

Today’s blog post will start with a discussion on the (x, y)-coordinates associated with facial landmarks and how these facial landmarks can be mapped to specific regions of the face.

We’ll then write a bit of code that can be used to extract each of the facial regions.

We’ll wrap up the blog post by demonstrating the results of our method on a few example images.

By the end of this blog post, you’ll have a strong understanding of how face regions are (automatically) extracted via facial landmarks and will be able to apply this knowledge to your own applications.

Facial landmark indexes for face regions

The facial landmark detector implemented inside dlib produces 68 (x, y)-coordinates that map to specific facial structures. These 68 point mappings were obtained by training a shape predictor on the labeled iBUG 300-W dataset.

Below we can visualize what each of these 68 coordinates map to:

Figure 1: Visualizing each of the 68 facial coordinate points from the iBUG 300-W dataset (higher resolution).

Examining the image, we can see that facial regions can be accessed via simple Python indexing (assuming zero-indexing with Python, since the image above is one-indexed; each range below includes its start index but excludes its end index, mirroring Python slice semantics):

  • The mouth can be accessed through points [48, 68].
  • The right eyebrow through points [17, 22].
  • The left eyebrow through points [22, 27].
  • The right eye using [36, 42].
  • The left eye with [42, 48].
  • The nose using [27, 36].
  • And the jaw via [0, 17].

These mappings are encoded inside the FACIAL_LANDMARKS_IDXS dictionary inside face_utils of the imutils library:
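Below is a sketch of that dictionary, consistent with the index ranges above (the exact source lives in the face_utils module of imutils; note the nose end index of 36 so that all nine nose points are covered):

    # an OrderedDict keeps the regions in a fixed order so that
    # each region can later be paired with a consistent overlay color
    from collections import OrderedDict

    FACIAL_LANDMARKS_IDXS = OrderedDict([
        ("mouth", (48, 68)),
        ("right_eyebrow", (17, 22)),
        ("left_eyebrow", (22, 27)),
        ("right_eye", (36, 42)),
        ("left_eye", (42, 48)),
        ("nose", (27, 36)),
        ("jaw", (0, 17))
    ])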

Using this dictionary we can easily extract the indexes into the facial landmarks array and extract various facial features simply by supplying a string as a key.
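For example, slicing out the mouth looks like this (a minimal sketch, assuming shape is the (68, 2) NumPy array of landmark coordinates we'll build later in this post):

    # assuming: from imutils import face_utils, and that `shape`
    # holds the 68 landmark coordinates as a (68, 2) NumPy array
    (j, k) = face_utils.FACIAL_LANDMARKS_IDXS["mouth"]
    mouthPts = shape[j:k]  # the 20 (x, y)-coordinates of the mouth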

Visualizing facial landmarks with OpenCV and Python

A slightly harder task is to visualize each of these facial landmarks and overlay the results on an input image.

To accomplish this, we’ll need the visualize_facial_landmarks function, already included in the imutils library:
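The function ships with imutils, so you don’t need to write it yourself — the sketch below simply mirrors its structure so we can walk through it (the Line numbers cited in the prose refer to the full face_utils.py source, not this condensed sketch):

    def visualize_facial_landmarks(image, shape, colors=None, alpha=0.75):
        # create two copies of the input image -- one for the
        # overlay and one for the final output image
        overlay = image.copy()
        output = image.copy()

        # if the colors list is None, initialize it with a unique
        # BGR color per facial landmark region (values here are
        # illustrative)
        if colors is None:
            colors = [(19, 199, 109), (79, 76, 240), (230, 159, 23),
                (168, 100, 168), (158, 163, 32),
                (163, 38, 32), (180, 42, 220)]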

Our visualize_facial_landmarks function requires two arguments, followed by two optional ones, each detailed below:

  • image : The image that we are going to draw our facial landmark visualizations on.
  • shape : The NumPy array that contains the 68 facial landmark coordinates that map to various facial parts.
  • colors : A list of BGR tuples used to color-code each of the facial landmark regions.
  • alpha : A parameter used to control the opacity of the overlay on the original image.

Lines 45 and 46 create two copies of our input image — we’ll need these copies so that we can draw a semi-transparent overlay on the output image.

Line 50 makes a check to see if the colors list is None, and if so, initializes it with a preset list of BGR tuples (remember, OpenCV stores colors/pixel intensities in BGR order rather than RGB).

We are now ready to visualize each of the individual facial regions via facial landmarks:
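Continuing the sketch, still inside the function body:

        # loop over the facial landmark regions individually
        for (i, name) in enumerate(FACIAL_LANDMARKS_IDXS.keys()):
            # grab the (x, y)-coordinates associated with the
            # current face region
            (j, k) = FACIAL_LANDMARKS_IDXS[name]
            pts = shape[j:k]

            # since the jawline is a non-enclosed facial region,
            # just draw lines connecting the (x, y)-coordinates
            if name == "jaw":
                for l in range(1, len(pts)):
                    ptA = tuple(pts[l - 1])
                    ptB = tuple(pts[l])
                    cv2.line(overlay, ptA, ptB, colors[i], 2)

            # otherwise, compute the convex hull of the region's
            # points and draw it on the overlay as a filled contour
            else:
                hull = cv2.convexHull(pts)
                cv2.drawContours(overlay, [hull], -1, colors[i], -1)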

On Line 56 we loop over each entry in the FACIAL_LANDMARKS_IDXS dictionary.

For each of these regions, we extract the indexes of the given facial part and grab the (x, y)-coordinates from the shape  NumPy array.

Lines 63-69 make a check to see if we are drawing the jaw, and if so, we simply loop over the individual points, drawing a line connecting the jaw points together.

Otherwise, Lines 73-75 handle computing the convex hull of the points and drawing the hull on the overlay.

The last step is to create a transparent overlay via the cv2.addWeighted function:
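The sketch of the function ends by blending the overlay with the original image:

        # apply the transparent overlay
        cv2.addWeighted(overlay, alpha, output, 1 - alpha, 0, output)

        # return the blended output image
        return output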

After applying visualize_facial_landmarks to an image and associated facial landmarks, the output would look similar to the image below:

Figure 2: A visualization of each facial landmark region overlaid on the original image.

To learn how to glue all the pieces together (and extract each of these facial regions), let’s move on to the next section.

Extracting parts of the face using dlib, OpenCV, and Python

Before you continue with this tutorial, make sure you have:

  1. Installed dlib according to my instructions in this blog post.
  2. Installed/upgraded imutils to the latest version, ensuring you have access to the face_utils submodule: pip install --upgrade imutils

From there, open up a new file, name it detect_face_parts.py, and insert the following code:
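The complete script is available via the “Downloads” section; a minimal sketch of this first block, assuming the same two command line switches as last week’s post, looks like the following (again, the cited line numbers in the prose refer to the full script rather than this sketch):

    # import the necessary packages
    from imutils import face_utils
    import numpy as np
    import argparse
    import imutils
    import dlib
    import cv2

    # construct the argument parser and parse the arguments
    ap = argparse.ArgumentParser()
    ap.add_argument("-p", "--shape-predictor", required=True,
        help="path to facial landmark predictor")
    ap.add_argument("-i", "--image", required=True,
        help="path to input image")
    args = vars(ap.parse_args())

    # initialize dlib's face detector (HOG-based) and then create
    # the facial landmark predictor
    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor(args["shape_predictor"])

    # load the input image, resize it, and convert it to grayscale
    image = cv2.imread(args["image"])
    image = imutils.resize(image, width=500)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # detect faces in the grayscale image
    rects = detector(gray, 1)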

The first code block in this example is identical to the one in our previous tutorial.

We are simply:

  • Importing our required Python packages (Lines 2-7).
  • Parsing our command line arguments (Lines 10-15).
  • Instantiating dlib’s HOG-based face detector and loading the facial landmark predictor (Lines 19 and 20).
  • Loading and pre-processing our input image (Lines 23-25).
  • Detecting faces in our input image (Line 28).

Again, for a more thorough, detailed overview of this code block, please see last week’s blog post on facial landmark detection with dlib, OpenCV, and Python.

Now that we have detected faces in the image, we can loop over each of the face ROIs individually:
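A sketch of this block, continuing from the one above:

    # loop over the face detections
    for (i, rect) in enumerate(rects):
        # determine the facial landmarks for the face region, then
        # convert the landmark (x, y)-coordinates to a NumPy array
        shape = predictor(gray, rect)
        shape = face_utils.shape_to_np(shape)

        # loop over the face parts individually
        for (name, (i, j)) in face_utils.FACIAL_LANDMARKS_IDXS.items():
            # clone the original image so we can draw on it, then
            # display the name of the face part on the image
            clone = image.copy()
            cv2.putText(clone, name, (10, 30), cv2.FONT_HERSHEY_SIMPLEX,
                0.7, (0, 0, 255), 2)

            # loop over the subset of facial landmarks, drawing each
            # one as a small filled circle
            for (x, y) in shape[i:j]:
                cv2.circle(clone, (x, y), 1, (0, 0, 255), -1)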

For each face region, we determine the facial landmarks of the ROI and convert the 68 points into a NumPy array (Lines 34 and 35).

Then, on Line 38, we loop over each of the face parts individually.

We draw the name/label of the face region on Lines 42 and 43, then draw each of the individual facial landmarks as circles on Lines 47 and 48.

To actually extract each of the facial regions we simply need to compute the bounding box of the (x, y)-coordinates associated with the specific region and use NumPy array slicing to extract it:
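Continuing the sketch, still inside the inner face-part loop:

            # extract the ROI of the face region as a separate image
            # by computing the bounding box of the region's points
            (x, y, w, h) = cv2.boundingRect(np.array([shape[i:j]]))
            roi = image[y:y + h, x:x + w]

            # resize the ROI to a width of 250 pixels so we can
            # better visualize it
            roi = imutils.resize(roi, width=250, inter=cv2.INTER_CUBIC)

            # show the particular face part
            cv2.imshow("ROI", roi)
            cv2.imshow("Image", clone)
            cv2.waitKey(0)

        # visualize all facial landmarks with a transparent overlay
        output = face_utils.visualize_facial_landmarks(image, shape)
        cv2.imshow("Image", output)
        cv2.waitKey(0)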

Computing the bounding box of the region is handled on Line 51 via cv2.boundingRect.

Using NumPy array slicing we can extract the ROI on Line 52.

This ROI is then resized to have a width of 250 pixels so we can better visualize it (Line 53).

Lines 56-58 display the individual face region to our screen.

Lines 61-63 then apply the visualize_facial_landmarks function to create a transparent overlay for each facial part.

Face part labeling results

Now that our example has been coded up, let’s take a look at some results.

Be sure to use the “Downloads” section of this guide to download the source code + example images + dlib facial landmark predictor model.

From there, you can use the following command to visualize the results:
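For example (example_01.jpg is an illustrative filename here; use any of the example images included in the downloads):

    $ python detect_face_parts.py --shape-predictor shape_predictor_68_face_landmarks.dat \
        --image images/example_01.jpg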

Notice how my mouth is detected first:

Figure 3: Extracting the mouth region via facial landmarks.

Followed by my right eyebrow:

Figure 4: Determining the right eyebrow of an image using facial landmarks and dlib.

Then the left eyebrow:

Figure 5: The dlib library can extract facial regions from an image.

Next comes the right eye:

Figure 6: Extracting the right eye of a face using facial landmarks, dlib, OpenCV, and Python.

Along with the left eye:

Figure 7: Extracting the left eye of a face using facial landmarks, dlib, OpenCV, and Python.

And finally the jawline:

Figure 8: Automatically determining the jawline of a face with facial landmarks.

As you can see, the bounding box of the jawline spans nearly my entire face.

The last visualization for this image is our transparent overlay, with each facial landmark region highlighted in a different color:

Figure 9: A transparent overlay that displays the individual facial regions extracted from the image via facial landmarks.

Let’s try another example:

This time I have created a GIF animation of the output:

Figure 10: Extracting facial landmark regions with computer vision.

The same goes for our final example:

Figure 11: Automatically labeling eyes, eyebrows, nose, mouth, and jaw using facial landmarks.

Summary

In this blog post I demonstrated how to detect various facial structures in an image using facial landmark detection.

Specifically, we learned how to detect and extract the:

  • Mouth
  • Right eyebrow
  • Left eyebrow
  • Right eye
  • Left eye
  • Nose
  • Jawline

This was accomplished using dlib’s pre-trained facial landmark detector along with a bit of OpenCV and Python magic.

At this point you’re probably quite impressed with the accuracy of facial landmarks — and there are clear advantages of using facial landmarks, especially for face alignment, face swapping, and extracting various facial structures.

…but the big question is:

“Can facial landmark detection run in real-time?”

To find out, you’ll need to stay tuned for next week’s blog post.

To be notified when next week’s blog post on real-time facial landmark detection is published, be sure to enter your email address in the form below!

See you then.

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!


22 Responses to Detect eyes, nose, lips, and jaw with dlib, OpenCV, and Python

  1. Neeraj Kumar April 10, 2017 at 1:53 pm #

    Dear Adrian,

    You have seriously got a fan man, amazing explanations. It’s gotten to the point where I wait for your new upcoming blog posts just as I wait for the new season of GOT.
    Actually, you are the one who makes these computer vision concepts very simple, which they otherwise are not.
    Respect.

    Thanks and Regards
    Neeraj Kumar

    • Adrian Rosebrock April 12, 2017 at 1:14 pm #

      Thank you Neeraj, I really appreciate that. Comments like these really make my day 🙂

      • Neeraj Kumar April 12, 2017 at 3:16 pm #

        Keep smiling and keep posting awesome blog posts.

        • Adrian Rosebrock April 16, 2017 at 9:09 am #

          Thanks Neeraj!

  2. Anthony The Koala April 10, 2017 at 2:31 pm #

    Dear Dr Adrian,
    There is a lot to learn in your blogs and I thank you for these blogs. I hope I am not off-topic. Recently in the news there was a smart phone that could detect whether the image of a face was from a real person or a photograph. If the image was that of a photograph the smart phone would not allow the user to use the phone’s facility.
    To apply my question to today’s blog in detecting eyes, nose and jaw, is there a way to tell whether the elements of the face can be from a real face or a photo of a face?
    Thank you
    Anthony of Sydney NSW

    • Adrian Rosebrock April 12, 2017 at 1:13 pm #

      There are many methods to accomplish this, but the most reliable is to use stereo/depth cameras so you can determine the depth of the face versus a flat 2D space. As for the actual article you’re referring to, I haven’t read it so it would be great if you could link to it.

      • Anthony The Koala April 13, 2017 at 3:23 am #

        Dear Dr Adrian,
        I apologise for forgetting to put the word ‘not’ in. It should read “there was a smart phone that could not detect whether the image of a face was from a real person or a photograph.”

        Similar article, with a statement from Samsung saying that facial recognition currently “..cannot be used to authenticate access to Samsung Pay or Secure Folder….”

        The solution may well be that your authentication system needs two cameras for 3D, or more ‘clever’ 2D techniques, such that the authentication system cannot be ‘tricked’.

        Regards
        Anthony of Sydney NSW

  3. ulzii April 10, 2017 at 9:50 pm #

    You can detect a beautiful lady as well 🙂

    • Adrian Rosebrock April 12, 2017 at 1:12 pm #

      That beautiful lady is my fiancée.

  4. Leena April 10, 2017 at 11:54 pm #

    Excellent, simple, and detailed code instructions. Can you give more details on how to define the facial part? Definition of eye/nose/jaw…

    Thanks

    • Adrian Rosebrock April 12, 2017 at 1:12 pm #

      Hi Leena — what do you mean by “define the facial part”?

  5. Rencita April 11, 2017 at 2:49 am #

    Hello Adrian,
    I tried your post for detecting facial features, but it gives me an error saying:
    RuntimeError: Unable to open shape_predictor_68_face_landmarks.dat

    Where could I have gone wrong?

    Thanks in advance

    • Adrian Rosebrock April 12, 2017 at 1:10 pm #

      Make sure you use the “Downloads” section of this blog post to download the shape_predictor_68_face_landmarks.dat and source code. From there the example will work.

  6. aravind April 12, 2017 at 11:57 am #

    Hi adrian,

    How can this be used for video instead of an image as the argument?

    • Adrian Rosebrock April 12, 2017 at 12:54 pm #

      I will be demonstrating how to apply facial landmarks to video streams in next week’s blog post.

  7. Wim Valcke April 17, 2017 at 1:21 pm #

    First of all, nice blog post. Keep doing this; I learned a lot about computer vision in a limited amount of time. I will definitely buy your new book about deep learning.

    There is a small error in the face_utils.py of the imutils library
    In the definition of FACIAL_LANDMARKS_IDXS

    (“nose”, (27, 35)),

    Thus should be (“nose”, (27, 36)),

    The nose should contain 9 points, but in the existing implementation it only has 8 points.
    This can be seen in the example images too.

    • Adrian Rosebrock April 19, 2017 at 12:56 pm #

      Good catch — thanks Wim! I’ll make sure this change is made in the latest version of imutils and I’ll also get the blog post updated as well.

  8. Taimur Bilal April 17, 2017 at 2:16 pm #

    This is amazing. I just wanted to ask one question.

    If you are running a detector on these images coming from polling a video stream, can we say you are tracking the facial features? Is it true that tracking can be implemented by running an “ultra fast” detector on every frame of a video stream?

    Again, thanks a lot for these amazing tutorials.

    • Adrian Rosebrock April 19, 2017 at 12:55 pm #

      “Tracking” algorithms by definition try to incorporate some other extra temporal information into the algorithm so running an ultra fast face detector on a video stream isn’t technically a true tracking algorithm, but it would give you the same result, provided that no two faces overlap in the video stream.

  9. Dimas April 18, 2017 at 1:23 am #

    Daamn.. ENGINEER WILL ALWAYS WIN!
    http://www.pyimagesearch.com/wp-content/uploads/2017/03/detect_face_parts_example_03.gif

  10. Hailay Berihu April 25, 2017 at 5:44 am #

    Thank you very much Dr. Adrian! All your blogs are amazing and timely. However, this one is so special to me!! I am enjoying all your blogs! Keep it up!

    • Adrian Rosebrock April 25, 2017 at 11:46 am #

      Thanks Hailay!
