Optimizing dlib shape predictor accuracy with find_min_global

In this tutorial you will learn how to use dlib’s find_min_global function to optimize the options and hyperparameters to dlib’s shape predictor, yielding a more accurate model.

A few weeks ago I published a two-part series on using dlib to train custom shape predictors:

  1. Part one: Training a custom dlib shape predictor
  2. Part two: Tuning dlib shape predictor hyperparameters to balance speed, accuracy, and model size

When I announced the first post on social media, Davis King, the creator of dlib, chimed in and suggested that I demonstrate how to use dlib’s find_min_global function to optimize the shape predictor hyperparameters:

Figure 1: Dlib’s creator and maintainer, Davis King, suggested that I write content on optimizing dlib shape predictor accuracy with find_min_global.

I loved the idea and immediately began writing code and gathering results.

Today I’m pleased to share the bonus guide on training dlib shape predictors and optimizing their hyperparameters.

I hope you enjoy it!

To learn how to use dlib’s find_min_global function to optimize shape predictor hyperparameters, just keep reading!


Optimizing dlib shape predictor accuracy with find_min_global

In the first part of this tutorial, we’ll discuss dlib’s find_min_global function and how it can be used to optimize the options/hyperparameters to a shape predictor.

We’ll also compare and contrast find_min_global to a standard grid search.

Next, we’ll discuss the dataset we’ll be using for this tutorial, including reviewing our directory structure for the project.

We’ll then open up our code editor and get our hands dirty by implementing three Python scripts including:

  1. A configuration file.
  2. A script used to optimize hyperparameters via find_min_global.
  3. A script used to take the best hyperparameters found via find_min_global and then train an optimal shape predictor using these values.

We’ll wrap up the post with a short discussion on when you should be using find_min_global versus performing a standard grid hyperparameter search.

Let’s get started!

What does dlib’s find_min_global function do? And how can we use it to tune shape predictor options?

Video Source: A Global Optimization Algorithm Worth Using by Davis King

A few weeks ago you learned how to tune dlib’s shape predictor options using a systematic grid search.

That method worked well enough, but the problem is a grid search isn’t a true optimizer!

Instead, we hardcode the hyperparameter values we want to explore; the grid search then computes all possible combinations of these values and evaluates them one by one.

Grid searches are computationally wasteful as the algorithm spends precious time and CPU cycles exploring hyperparameter combinations that will never yield the best possible results.

Wouldn’t it be more advantageous if we could instead iteratively tune our options, ensuring that with each iteration we are incrementally improving our model?

In fact, that’s exactly what dlib’s find_min_global function does!

Davis King, the creator of the dlib library, documented his struggle with hyperparameter tuning algorithms, including:

  • Guess and check: An expert uses their gut instinct and previous experience to manually set hyperparameters, run the algorithm, inspect the results, and then use the results to make an educated guess as to what the next set of hyperparameters to explore will be.
  • Grid search: Hardcode all possible hyperparameter values you want to test, compute all possible combinations of these hyperparameters, and then let the computer test them all, one-by-one.
  • Random search: Hardcode upper and lower limits/ranges on the hyperparameters you want to explore and then allow the computer to randomly sample the hyperparameter values within those ranges.
  • Bayesian optimization: A global optimization strategy for black-box algorithms. This method often has more hyperparameters to tune than the original algorithm itself. Comparatively, you are better off using a “guess and check” strategy or throwing hardware at the problem via grid searching or random searching.
  • Local optimization with a good initial guess: This method is good, but it is limited to finding local optima, with no guarantee that it will find the global optimum.
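To make the difference between the grid search and random search strategies above concrete, here is a minimal sketch of how each one generates candidates. The search space, the helper names (grid_candidates, random_candidates), and the ranges are all hypothetical and chosen only for illustration; neither helper comes from dlib or from this tutorial’s download.

```python
import itertools
import random

# Hypothetical hardcoded values for a grid search (illustration only).
grid = {
    "tree_depth": [2, 3, 4],
    "nu": [0.01, 0.1],
    "cascade_depth": [6, 10],
}

# Hypothetical (low, high, is_integer) ranges for a random search.
ranges = {
    "tree_depth": (2, 5, True),
    "nu": (0.001, 0.2, False),
}


def grid_candidates(grid):
    """Return every combination of the hardcoded values as a list of dicts."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


def random_candidates(ranges, count, seed=0):
    """Sample `count` candidates uniformly from the given ranges."""
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        sample = {}
        for name, (low, high, is_int) in ranges.items():
            if is_int:
                sample[name] = rng.randint(low, high)
            else:
                sample[name] = rng.uniform(low, high)
        samples.append(sample)
    return samples


grid_points = grid_candidates(grid)
random_points = random_candidates(ranges, count=5)
print(len(grid_points))  # 3 * 2 * 2 = 12 combinations
```

Neither strategy uses the result of one trial to decide what to try next, which is the gap an optimizer such as find_min_global is meant to fill.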

Eventually, Davis came across Malherbe and Vayatis’s 2017 paper, Global optimization of Lipschitz functions, which he then implemented into the dlib library via the find_min_global function.

Unlike Bayesian methods, which are near impossible to tune, and local optimization methods, which place no guarantees on a globally optimal solution, the Malherbe and Vayatis method is parameter-free and provably correct for finding a set of values that maximizes/minimizes a particular function.

Davis has written extensively on the optimization method in the following blog post — I suggest you give it a read if you are interested in the mathematics behind the optimization method.
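Before applying the function to shape predictors, it may help to see the call on a toy problem. The sketch below minimizes the Holder table function, a common two-variable test function for global optimizers. The wrapper name minimize_holder_table and the budget of 80 function calls are assumptions for illustration; dlib is imported inside the wrapper so that the objective itself can be inspected without dlib installed.

```python
from math import sin, cos, exp, sqrt, pi


def holder_table(x0, x1):
    """Toy objective with many local minima; the global minimum is about -19.2085."""
    return -abs(sin(x0) * cos(x1) * exp(abs(1 - sqrt(x0 * x0 + x1 * x1) / pi)))


def minimize_holder_table(num_function_calls=80):
    """Run dlib's find_min_global on the toy objective (requires dlib)."""
    import dlib  # imported lazily; only needed when the search is actually run

    best_x, best_y = dlib.find_min_global(
        holder_table,
        [-10, -10],          # lower bound for x0 and x1
        [10, 10],            # upper bound for x0 and x1
        num_function_calls,  # how many times the objective may be evaluated
    )
    return best_x, best_y
```

Calling minimize_holder_table() returns the best inputs found and the corresponding objective value. The same pattern, an objective plus lower and upper bounds plus a call budget, is what the tuning script in this tutorial builds on.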

The iBUG-300W dataset

Figure 2: The iBUG 300-W face landmark dataset is used to train a custom dlib shape predictor. Using dlib’s find_min_global optimization method, we will optimize an eyes-only shape predictor.

To find the optimal dlib shape predictor hyperparameters, we’ll be using the iBUG 300-W dataset, the same dataset we used for our previous two-part series on shape predictors.

The iBUG 300-W dataset is perfect for training facial landmark predictors to localize the individual structures of the face, including:

  • Eyebrows
  • Eyes
  • Nose
  • Mouth
  • Jawline

Shape predictor data files can become quite large. To combat this, we’ll be training our shape predictor to localize only the eyes rather than all face landmarks. You could just as easily train a shape predictor to localize only the mouth, etc.

Configuring your dlib development environment

To follow along with today’s tutorial, you will need a virtual environment with the following packages installed:

  • dlib
  • OpenCV
  • imutils
  • scikit-learn

Luckily, each of these packages is pip-installable. That said, there are a handful of prerequisites (including Python virtual environments). Be sure to follow these two guides for additional information in configuring your development environment:

The pip install commands include:

The workon command becomes available once you install virtualenv and virtualenvwrapper per either my dlib or OpenCV installation guides.

Downloading the iBUG-300W dataset

To follow along with this tutorial, you will need to download the iBUG 300-W dataset (~1.7GB):

http://dlib.net/files/data/ibug_300W_large_face_landmark_dataset.tar.gz

While the dataset is downloading, you should also use the “Downloads” section of this tutorial to download the source code.

You can either (1) use the hyperlink above, or (2) use wget to download the dataset. Let’s cover both methods so that your project is organized just like my own.

Option 1: Use the hyperlink above to download the dataset and then place the iBUG 300-W dataset into the folder associated with the download of this tutorial like this:

Option 2: Rather than clicking the hyperlink above, use wget in your terminal to download the dataset directly:

You’re now ready to follow along with the rest of the tutorial.

Project structure

Be sure to follow the previous section to both (1) download today’s .zip from the “Downloads” section, and (2) download the iBUG 300-W dataset into today’s project.

From there, go ahead and execute the tree command to see our project structure:

As you can see, our dataset has been extracted into the ibug_300W_large_face_landmark_dataset/ directory following the instructions in the previous section.

Our configuration is housed in the pyimagesearch module.

Our Python scripts consist of:

  • parse_xml.py: First, you need to prepare and extract eyes-only landmarks from the iBUG 300-W dataset, resulting in smaller XML files. We’ll review how to use the script in the next section, but we won’t review the script itself as it was covered in a previous tutorial.
  • shape_predictor_tuner.py: This script takes advantage of dlib’s find_min_global method to find the best shape predictor. We will review this script in detail today. This script will take significant time to execute (multiple days).
  • train_best_predictor.py: After the shape predictor is tuned, we’ll update our shape predictor options and start the training process.
  • predict_eyes.py: Loads the serialized model, finds landmarks, and annotates them on a real-time video stream. We won’t cover this script today as we have covered it previously.

Let’s get started!

Preparing the iBUG-300W dataset

Figure 3: In this tutorial, we will optimize a custom dlib shape predictor’s accuracy with find_min_global.

As previously mentioned in the “The iBUG-300W dataset” section above, we will be training our dlib shape predictor on solely the eyes (i.e., not the eyebrows, nose, mouth or jawline).

In order to do so, we’ll first parse out any facial structures we are not interested in from the iBUG 300-W training/testing XML files.

At this point, ensure that you have:

  1. Used the “Downloads” section of this tutorial to download the source code.
  2. Used the “Downloading the iBUG-300W dataset” section above to download the iBUG-300W dataset.
  3. Reviewed the “Project structure” section so that you are familiar with the files and folders.

Inside your directory structure there is a script named parse_xml.py — this script handles parsing out just the eye locations from the XML files.

We reviewed this file in detail in my previous Training a Custom dlib Shape Predictor tutorial. We will not review the file again, so be sure to review it in the first tutorial of this series.
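For readers who want a feel for what “parsing out just the eye locations” involves, here is a rough sketch. It is not the parse_xml.py script from the earlier tutorial; it is an independent illustration that assumes the dlib-style XML layout (image, box, and part elements, with each part’s name attribute holding the landmark index) and relies on the fact that indexes 36 through 47 are the eye points in the 68-point annotation scheme. The function name keep_only_eyes and the tiny sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# In the 68-point annotation scheme, indexes 36-47 cover both eyes.
EYE_LANDMARKS = set(range(36, 48))


def keep_only_eyes(xml_text):
    """Return the XML with every non-eye <part> removed from each <box>."""
    root = ET.fromstring(xml_text)
    for box in root.iter("box"):
        # Copy the list first so removing children does not skip elements.
        for part in list(box.findall("part")):
            if int(part.get("name")) not in EYE_LANDMARKS:
                box.remove(part)
    return ET.tostring(root, encoding="unicode")


# A tiny, made-up document in the same shape as the dataset's label files.
sample = """<dataset><images>
<image file='a.jpg'>
  <box top='1' left='2' width='3' height='4'>
    <part name='00' x='5' y='6'/>
    <part name='36' x='7' y='8'/>
    <part name='47' x='9' y='10'/>
    <part name='48' x='11' y='12'/>
  </box>
</image>
</images></dataset>"""

filtered = keep_only_eyes(sample)
print(filtered)
```

Only the parts named 36 and 47 survive in the sample, which is why the eyes-only files end up much smaller than the originals.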

Before you continue on with the rest of this tutorial you’ll need to execute the following command to prepare our “eyes only” training and testing XML files:

Now let’s verify that the training/testing files have been created. You should check your iBUG-300W root dataset directory for the labels_ibug_300W_train_eyes.xml and labels_ibug_300W_test_eyes.xml files as shown:

Notice that our *_eyes.xml files are highlighted. These files are significantly smaller in file size than their original, non-parsed counterparts.

Our configuration file

Before we can use find_min_global to tune our hyperparameters, we first need to create a configuration file that will store all our important variables, ensuring we can use them and access them across multiple Python scripts.

Open up the config.py file in your pyimagesearch module (following the project structure above) and insert the following code:

The os module (Line 2) allows our configuration script to join filepaths.

Lines 5-8 join our training and testing XML landmark files.

Let’s define our training parameters:

Here you will find:

  • The path to the temporary model file (Line 11).
  • The number of threads/cores to use when training (Line 15). A value of -1 indicates that all processor cores on your machine will be utilized.
  • The maximum number of function calls that find_min_global will use when attempting to optimize our hyperparameters (Line 19). Smaller values will enable our tuning script to complete faster, but could lead to hyperparameters that are “less optimal”. Larger values will take the tuning script significantly longer to run, but could lead to hyperparameters that are “more optimal”.
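Putting the pieces described above together, a configuration file along these lines would be enough for the rest of the tutorial. This is a minimal sketch rather than the downloadable listing: the dataset directory and XML filenames come from this post, while the variable names, the temporary filename temp.dat, and the exact layout are assumptions. The line numbers quoted in the text refer to the original listing, not to this sketch.

```python
# config.py (sketch): shared settings for the tuning and training scripts
import os

# Paths to the eyes-only training and testing XML files produced earlier.
TRAIN_PATH = os.path.join("ibug_300W_large_face_landmark_dataset",
                          "labels_ibug_300W_train_eyes.xml")
TEST_PATH = os.path.join("ibug_300W_large_face_landmark_dataset",
                         "labels_ibug_300W_test_eyes.xml")

# Temporary model file that each trial overwrites (assumed filename).
TEMP_MODEL_PATH = "temp.dat"

# Number of threads/cores used for training; -1 means "use all cores".
PROCS = -1

# Budget of objective evaluations handed to find_min_global.
MAX_FUNC_CALLS = 100
```

Because every trial trains a full shape predictor, MAX_FUNC_CALLS is the main lever on total runtime, so it is worth starting with a small value to confirm the pipeline works end to end.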

Implementing the dlib shape predictor and find_min_global training script

Now that we’ve reviewed our configuration file, we can move on to tuning our shape predictor hyperparameters using find_min_global.

Open up the shape_predictor_tuner.py file in your project structure and insert the following code:

Lines 2-7 import our necessary packages, namely our config and dlib. We will use the multiprocessing module to grab the number of CPUs/cores our system has (Lines 10 and 11). An OrderedDict will contain all of our dlib shape predictor options.
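The core-count logic described above can be sketched as a small helper. The function name resolve_num_procs is hypothetical; the behavior it shows, treating a non-positive setting such as -1 as “use every available core”, follows the description of the configuration value in this post.

```python
import multiprocessing


def resolve_num_procs(requested):
    """Return `requested` if it is positive, otherwise the machine's core count."""
    available = multiprocessing.cpu_count()
    return requested if requested > 0 else available


# With a configured value of -1, fall back to all cores on this machine.
procs = resolve_num_procs(-1)
print("training with {} threads".format(procs))
```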

Now let’s define a function responsible for the heart of shape predictor tuning with dlib:

The test_shape_predictor_params function:

  1. Accepts an input set of hyperparameters.
  2. Trains a dlib shape predictor using those hyperparameters.
  3. Computes the predictor loss/error on our testing set.
  4. Returns the error to the find_min_global function.
  5. The find_min_global function will then take the returned error and use it to adjust the optimal hyperparameters found thus far in an iterative fashion.

As you can see, the test_shape_predictor_params function accepts nine parameters, each of which is a dlib shape predictor hyperparameter that we’ll be optimizing.

Lines 19-28 set the hyperparameter values from the parameters (casting to integers when appropriate).

Lines 32 and 33 instruct dlib to be verbose with output and to utilize the supplied number of threads/processes for training.
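As a rough illustration of this first half of the function, the sketch below copies nine hyperparameters onto an options object, casting the integer-valued ones because the optimizer passes every value as a float. The helper name apply_hyperparameters and its argument names are assumptions; the attribute names are those of dlib’s shape_predictor_training_options. The line numbers quoted in the text refer to the original listing, not to this sketch.

```python
def apply_hyperparameters(options, treeDepth, nu, cascadeDepth,
                          featurePoolSize, numTestSplits, oversamplingAmount,
                          oversamplingTransJitter, padding, lambdaParam,
                          procs=1):
    """Copy one trial's hyperparameters onto a training-options object."""
    # Integer-valued options must be cast, since the optimizer supplies floats.
    options.tree_depth = int(treeDepth)
    options.nu = nu
    options.cascade_depth = int(cascadeDepth)
    options.feature_pool_size = int(featurePoolSize)
    options.num_test_splits = int(numTestSplits)
    options.oversampling_amount = int(oversamplingAmount)
    options.oversampling_translation_jitter = oversamplingTransJitter
    options.feature_pool_region_padding = padding
    options.lambda_param = lambdaParam

    # Print progress while training and use the requested number of threads.
    options.be_verbose = True
    options.num_threads = procs
    return options
```

With dlib installed, the options argument would be created with dlib.shape_predictor_training_options(); any object that accepts attribute assignment works for experimenting with the helper.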

Let’s finish coding the test_shape_predictor_params function:

Lines 41 and 42 train the dlib shape predictor using the current set of hyperparameters.

From there, Lines 46-49 evaluate the newly trained shape predictor on the training and testing sets.

Lines 52-54 print training and testing errors for the current trial before Line 57 returns the testingError to the calling function.
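The train, evaluate, and return steps can be sketched as follows. The helper name train_and_score is hypothetical, and the backend argument is an assumption added purely so the flow can be exercised without dlib installed; when it is left as None the function uses dlib’s train_shape_predictor and test_shape_predictor.

```python
def train_and_score(options, trainPath, testPath, modelPath, backend=None):
    """Train one shape predictor and return its error on the testing set."""
    if backend is None:
        import dlib as backend  # imported lazily; only needed for real runs

    # Train with the current options and write the model to a temporary file.
    backend.train_shape_predictor(trainPath, modelPath, options)

    # Mean error on the data the model was fit to and on held-out data.
    trainingError = backend.test_shape_predictor(trainPath, modelPath)
    testingError = backend.test_shape_predictor(testPath, modelPath)

    print("[INFO] train error: {}".format(trainingError))
    print("[INFO] test error: {}".format(testingError))

    # The optimizer only sees this single number, so it is what gets minimized.
    return testingError
```

Returning the testing error rather than the training error is the important design choice here: it steers the search toward hyperparameters that generalize rather than ones that merely fit the training images.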

Let’s define our set of shape predictor hyperparameters:

Each value in the OrderedDict is a 3-tuple consisting of:

  1. The lower bound on the hyperparameter value.
  2. The upper bound on the hyperparameter value.
  3. A boolean indicating whether the hyperparameter is an integer or not.

For a full review of the hyperparameters, be sure to refer to my previous post.

From here, we’ll extract our upper and lower bounds as well as whether a hyperparameter is an integer:

Lines 79-81 extract the lower bounds, upper bounds, and isint booleans from our params dictionary.
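Here is a minimal sketch of such a search space and of the three lists derived from it. The hyperparameter names match dlib’s shape predictor options, but every bound below is an illustrative assumption rather than a value confirmed by this post, so adjust the ranges to your own needs.

```python
from collections import OrderedDict

# (lower bound, upper bound, is_integer) for each hyperparameter.
# All ranges here are assumed for illustration.
params = OrderedDict([
    ("tree_depth", (2, 5, True)),
    ("nu", (0.001, 0.2, False)),
    ("cascade_depth", (4, 25, True)),
    ("feature_pool_size", (100, 1000, True)),
    ("num_test_splits", (20, 300, True)),
    ("oversampling_amount", (1, 40, True)),
    ("oversampling_translation_jitter", (0.0, 0.3, False)),
    ("feature_pool_region_padding", (-0.2, 0.2, False)),
    ("lambda_param", (0.01, 0.99, False)),
])

# Split the 3-tuples into the parallel lists the optimizer expects.
lower = [v[0] for v in params.values()]
upper = [v[1] for v in params.values()]
isint = [v[2] for v in params.values()]
```

An OrderedDict keeps the entries in a fixed order, which matters because the optimizer passes values to the objective function positionally, in the same order as the bounds.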

Now that we have the setup taken care of, let’s optimize our shape predictor hyperparameters using dlib’s find_min_global method:

Lines 84-89 start the optimization process.

Lines 93 and 94 display the optimal parameters before Line 97 deletes the temporary model file.
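The final stage of the tuning script might look like the sketch below. The wrapper name run_search is hypothetical, and the backend argument is an assumption included only so the flow can be tried without dlib; by default the function calls dlib.find_min_global with the objective, the lower and upper bounds, the integer flags, and the call budget, in that order.

```python
import os


def run_search(objective, lower, upper, isint, maxCalls, tempModelPath,
               backend=None):
    """Minimize `objective` within the bounds, then remove the temporary model."""
    if backend is None:
        import dlib as backend  # imported lazily; only needed for real runs

    # The optimizer returns the best inputs it found and the matching loss.
    bestParams, bestLoss = backend.find_min_global(
        objective, lower, upper, isint, maxCalls)

    print("[INFO] optimal parameters: {}".format(list(bestParams)))
    print("[INFO] optimal error: {}".format(bestLoss))

    # Each trial overwrote this file, so it is no longer useful once we finish.
    if os.path.exists(tempModelPath):
        os.remove(tempModelPath)

    return bestParams, bestLoss
```

Note that integer-valued hyperparameters come back as floats in the returned list, so they need to be cast again before being used to train the final model.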

Tuning shape predictor options with find_min_global

To use find_min_global to tune the hyperparameters to our dlib shape predictor, make sure you have:

  1. Used the “Downloads” section of this tutorial to download the source code.
  2. Downloaded the iBUG-300W dataset using the “Downloading the iBUG-300W dataset” section above.
  3. Executed parse_xml.py for both the training and testing XML files in the “Preparing the iBUG-300W dataset” section.

Provided you have accomplished each of these three steps, you can now execute the shape_predictor_tuner.py script:

On my iMac Pro with a 3 GHz Intel Xeon W processor with 20 cores, running a total of 100 MAX_TRIALS took ~8047m24s, or ~5.6 days. If you don’t have a powerful computer, I would recommend running this procedure on a powerful cloud instance.

Looking at the output you can see that the find_min_global function found the following optimal shape predictor hyperparameters:

  • tree_depth: 4
  • nu: 0.1033
  • cascade_depth: 20
  • feature_pool_size: 677
  • num_test_splits: 295
  • oversampling_amount: 29
  • oversampling_translation_jitter: 0
  • feature_pool_region_padding: 0.0975
  • lambda_param: 0.0251

In the next section we’ll take these values and update our train_best_predictor.py script to include them.

Updating our shape predictor options using the results from find_min_global

At this point we know the best shape predictor hyperparameter values that find_min_global found, but we still need to train our final shape predictor using these values.

To do so, open up the train_best_predictor.py file and insert the following code:

Lines 2-5 import our config, multiprocessing, argparse, and dlib.

From there, we set the shape predictor options (Lines 14-39) using the optimal values we found from the previous section.

And finally, Line 47 trains and exports the model.

For a more detailed review of this script, be sure to refer to my previous tutorial.
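A condensed sketch of such a training script is shown below. The hyperparameter values are the ones reported in this post; the names BEST_PARAMS, build_parser, and train_best, as well as the --model flag, are assumptions for illustration. dlib is imported inside train_best so the rest of the sketch can be inspected without it, and the line numbers quoted in the text refer to the original listing rather than to this sketch.

```python
import argparse
import multiprocessing

# Best values reported by the tuning run in this post.
BEST_PARAMS = {
    "tree_depth": 4,
    "nu": 0.1033,
    "cascade_depth": 20,
    "feature_pool_size": 677,
    "num_test_splits": 295,
    "oversampling_amount": 29,
    "oversampling_translation_jitter": 0.0,
    "feature_pool_region_padding": 0.0975,
    "lambda_param": 0.0251,
}


def build_parser():
    """Command line interface; the --model flag name is an assumption."""
    ap = argparse.ArgumentParser()
    ap.add_argument("-m", "--model", required=True,
                    help="path to write the serialized shape predictor")
    return ap


def train_best(trainPath, modelPath, procs=None):
    """Train and export a shape predictor using BEST_PARAMS (requires dlib)."""
    import dlib  # imported lazily; only needed when training is actually run

    options = dlib.shape_predictor_training_options()
    for name, value in BEST_PARAMS.items():
        setattr(options, name, value)

    options.be_verbose = True
    options.num_threads = procs or multiprocessing.cpu_count()

    dlib.train_shape_predictor(trainPath, modelPath, options)
```

A typical invocation would parse the arguments with build_parser().parse_args() and then call train_best with the eyes-only training XML path and the parsed model path.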

Training the final shape predictor

The final step is to execute our train_best_predictor.py file which will train a dlib shape predictor using our best hyperparameter values found via find_min_global:

After the command finishes executing you should have a file named best_predictor.dat in your local directory structure:

You can then take this predictor and use it to localize eyes in real-time video using the predict_eyes.py script:

When should I use dlib’s find_min_global function?

Figure 4: Using the find_min_global method to optimize a custom dlib shape predictor can take significant processing time. Be sure to review this section for general rules of thumb including guidance on when to use a Grid Search method to find a shape predictor model.

Unlike a standard grid search for tuning hyperparameters, which blindly explores sets of hyperparameters, the find_min_global function is a true optimizer, enabling it to iteratively explore the hyperparameter space, choosing options that maximize our accuracy and minimize our loss/error.

However, one of the downsides of find_min_global is that it cannot be made parallel in an easy fashion.

A standard grid search, on the other hand, can be made parallel by:

  1. Dividing all combinations of hyperparameters into N size chunks
  2. And then distributing each of the chunks across M systems

Doing so would lead to faster hyperparameter space exploration than using find_min_global.
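The two steps above can be sketched in a few lines. The helper names chunk_combinations and assign_to_machines and the small grid are hypothetical, and the round-robin assignment is just one simple way to spread chunks across machines.

```python
import itertools


def chunk_combinations(grid, chunkSize):
    """Enumerate every combination in the grid and split them into chunks."""
    names = list(grid)
    combos = [dict(zip(names, values))
              for values in itertools.product(*(grid[n] for n in names))]
    return [combos[i:i + chunkSize] for i in range(0, len(combos), chunkSize)]


def assign_to_machines(chunks, numMachines):
    """Hand chunks out round-robin so each machine gets a similar workload."""
    assignments = [[] for _ in range(numMachines)]
    for index, chunk in enumerate(chunks):
        assignments[index % numMachines].append(chunk)
    return assignments


# Hypothetical grid: 3 * 2 * 2 = 12 combinations.
grid = {
    "tree_depth": [2, 3, 4],
    "nu": [0.01, 0.1],
    "cascade_depth": [6, 10],
}

chunks = chunk_combinations(grid, chunkSize=5)
assignments = assign_to_machines(chunks, numMachines=2)
```

Because each combination is independent of the others, the machines never need to communicate, which is exactly what makes a grid search easy to distribute.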

The downside is that you may not have the “true” best choices of hyperparameters since a grid search can only explore values that you have hardcoded.

Therefore, I recommend the following rule of thumb:

If you have multiple machines, use a standard grid search and distribute the work across the machines. After the grid search completes, take the best values found and then use them as inputs to dlib’s find_min_global to find your best hyperparameters.

If you have a single machine, use dlib’s find_min_global, making sure to trim down the ranges of hyperparameters you want to explore. For instance, if you know you want a small, fast model, you should cap the upper range limit of tree_depth, preventing your ERTs from becoming too deep (and therefore slower).
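Trimming a range before the search can be as simple as the sketch below. The helper name cap_upper_bound and the example bounds are assumptions for illustration; the search space uses the same (lower, upper, is_integer) tuple layout described earlier in this post.

```python
from collections import OrderedDict


def cap_upper_bound(params, name, newUpper):
    """Return a copy of the search space with one upper bound lowered."""
    low, high, isInt = params[name]
    if newUpper < low:
        raise ValueError("new upper bound is below the lower bound")

    capped = OrderedDict(params)  # copy, so the original space is untouched
    capped[name] = (low, min(high, newUpper), isInt)
    return capped


# Hypothetical search space; cap tree_depth to favor small, fast models.
params = OrderedDict([
    ("tree_depth", (2, 5, True)),
    ("nu", (0.001, 0.2, False)),
])
small_model_params = cap_upper_bound(params, "tree_depth", 3)
```

A tighter search space means the fixed budget of function calls is spent in a region you actually care about, which matters when every call trains a full model.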

While dlib’s find_min_global function is quite powerful, it can also be slow, so make sure you take care to think ahead and plan out which hyperparameters are truly important for your application.

You should also read my previous tutorial on training a custom dlib shape predictor for a detailed review of what each of the hyperparameters controls and how they can be used to balance speed, accuracy, and model size.

Use these recommendations and you’ll be able to successfully tune and optimize your dlib shape predictors.


Summary

In this tutorial you learned how to use dlib’s find_min_global function to optimize options/hyperparameters when training a custom shape predictor.

The function is incredibly easy to use and makes it dead simple to tune the hyperparameters to your dlib shape predictor.

I would also recommend you use my previous tutorial on tuning dlib shape predictor options via a grid search — combining a grid search (using multiple machines) with find_min_global can lead to a superior shape predictor.

I hope you enjoyed this blog post!

To download the source code to this post (and be notified when future tutorials are published here on PyImageSearch), just enter your email address in the form below!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL! Sound good? If so, enter your email address and I’ll send you the code immediately!


2 Responses to Optimizing dlib shape predictor accuracy with find_min_global

  1. mohaned abid January 22, 2020 at 6:14 pm #

    mr.adrian cool tutorial as always !!
    I just want to say that is this possible to use on any model or is it specific to only shape predictors

    • Adrian Rosebrock January 23, 2020 at 9:16 am #

      Thanks, I’m glad you enjoyed the tutorial.

      As for find_min_global, that function will work with ANY model that returns a score that should be minimized (i.e., it’s not limited to just shape predictors).
