Keras – Save and Load Your Deep Learning Models

In this tutorial, you will learn how to save and load your Keras deep learning models.

This blog post was inspired by PyImageSearch reader, Mason, who emailed in last week and asked:

Adrian, I’ve been going through your blog and reading your deep learning tutorials. Thanks for them.

I have a question though:

After training, how do you save your Keras model?

And once you have it saved, how do you load it again so you can classify new images?

I know this is a basic question but I don’t know how to save and load my Keras models.

Mason asks an excellent question, and it’s actually not as “basic” a concept as he (and maybe even you) may think.

On the surface, saving and loading your Keras models is as simple as calling the model.save and load_model functions. But there’s more to consider than just those two calls!

What’s even more important, and sometimes overlooked by new deep learning practitioners, is the preprocessing stage: the preprocessing steps you apply when loading your model and classifying new images must be identical to the ones you used for training and validation.

In the remainder of today’s tutorial we’ll be exploring:

  1. How to properly save and load your Keras deep learning models.
  2. The proper steps to preprocess your images after loading your model.

To learn how to save and load your deep learning models with Keras, just keep reading!

Keras – Save and Load Your Deep Learning Models

In the first part of this tutorial, we’ll briefly review both (1) our example dataset we’ll be training a Keras model on, along with (2) our project directory structure. From there I will show you how to:

  1. Train a deep learning model with Keras
  2. Serialize and save your Keras model to disk
  3. Load your saved Keras model from disk
  4. Make predictions on new image data using your saved Keras model

Let’s go ahead and get started!

Our example dataset

Figure 1: A subset of the Malaria Dataset provided by the National Institutes of Health (NIH). We will use this dataset to develop a deep learning medical imaging classification model saved to disk with Python, OpenCV, and Keras.

The dataset we’ll be utilizing for today’s tutorial is a subset of the malaria detection and classification dataset we covered in last week’s Deep learning and Medical Image Analysis with Keras blog post.

The original dataset consists of 27,588 images belonging to two classes:

  1. Parasitized: Implying that the image contains malaria
  2. Uninfected: Meaning there is no evidence of malaria in the image

Since the goal of this tutorial is not medical image analysis, but rather how to save and load your Keras models, I have sampled the dataset down to 100 images.

I have reduced the dataset size mainly because:

  1. You should be able to run this example on your CPU (if you do not own/have access to a GPU).
  2. Our goal here is to teach the basic concept of saving and loading Keras models, not train a state-of-the-art malaria detector.
  3. Because of that, it’s better to work with a smaller example dataset.

If you would like to read my full blog post on how to build a (near) state-of-the-art malaria classifier with the full dataset, please be sure to refer to this blog post.

Project structure

Be sure to grab today’s “Downloads” consisting of the reduced dataset, ResNet model, and Python scripts.

Once you’ve unzipped the files you’ll be presented with this directory structure:

Our project consists of two folders in the root directory:

  • malaria/ : Our reduced Malaria dataset. It is organized into training, validation, and testing sets via the “build dataset” script from last week.
  • pyimagesearch/ : A package included with the downloads which contains our ResNet model class.

Today, we’ll review two Python scripts as well:

  • save_model.py : A demo script which will save our Keras model to disk after it has been trained.
  • load_model.py : Our script that loads the saved model from disk and classifies a small selection of testing images.

By reviewing these files, you’ll quickly see how easy Keras makes saving and loading deep learning model files.

Saving a model with Keras

Figure 2: The steps for training and saving a Keras deep learning model to disk.

Before we can load a Keras model from disk we first need to:

  1. Train the Keras model
  2. Save the Keras model

The save_model.py script we’re about to review will cover both of these concepts.

Go ahead and open up your save_model.py file and let’s get started:

We begin on Lines 2-14 by importing required packages.

On Line 3 the "Agg" matplotlib backend is specified as we’ll be saving our plot to disk (in addition to our model).

Our ResNet CNN is imported on Line 8. In order to use this CNN, be sure to grab the “Downloads” for today’s blog post.

Using the argparse import, let’s parse our command line arguments:

Our script accepts three command line arguments, supplied with the command string in your terminal:

  • --dataset : The path to our dataset. We’re using a subset of the Malaria dataset that we built last week.
  • --model : The path to the trained output model (i.e., where the Keras model is going to be saved). This is key for what we are covering today.
  • --plot : The path to the training plot. By default, the figure will be named plot.png, so you only need to supply this argument if you want a different name.

No modifications are needed for these lines of code. Again, you will need to type the values for the arguments in the terminal and let argparse do the rest. If you are unfamiliar with the concept of command line arguments, see this post.
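The argument parsing described above can be sketched roughly as follows. This is a minimal sketch rather than the exact listing from the downloads: the short flags (-d, -m, -p) and the help strings are my assumptions, and the explicit list passed to parse_args is only there so the snippet runs outside a terminal.

```python
import argparse

# Declare the three arguments described above. The short flags and help
# text are assumptions made for this sketch.
ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", type=str, required=True,
                help="path to the dataset directory")
ap.add_argument("-m", "--model", type=str, required=True,
                help="path where the trained Keras model will be saved")
ap.add_argument("-p", "--plot", type=str, default="plot.png",
                help="path to the output training plot")

# In a real script you would call ap.parse_args() with no arguments so the
# values come from the terminal. The explicit list keeps this runnable here.
args = vars(ap.parse_args(["--dataset", "malaria",
                           "--model", "saved_model.model"]))
print(args)
```

Because --plot has a default, leaving it off the command line still gives you a usable args["plot"] value.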

Let’s initialize our training variables and paths:

We’ll be training for 25 epochs with a batch size of 32.

Last week, we split the NIH Malaria Dataset into three sets, creating a corresponding directory for each:

  • Training
  • Validation
  • Testing

Be sure to review the build_dataset.py script in the tutorial if you’re curious how the data split process works. For today, I’ve taken the resulting dataset that has been split (as well as made it significantly smaller for the purposes of this blog post).

The image paths are built on Lines 32-34, and the number of images in each split is grabbed on Lines 38-40.
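As a rough sketch of this step, the snippet below sets the training constants and counts the images in each split using only the standard library. The constant names, the helper names, the list of image extensions, and the "training" and "validation" directory names are my assumptions (only malaria/testing is named explicitly in this post), and os.walk stands in for whatever path-listing helper the downloadable script uses.

```python
import os

# Training constants from the text: 25 epochs, batch size of 32.
NUM_EPOCHS = 25
BS = 32

# File extensions treated as images (an assumption for this sketch).
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff")


def count_images(directory):
    """Recursively count image files below `directory`."""
    total = 0
    for _root, _dirs, files in os.walk(directory):
        total += sum(1 for name in files
                     if name.lower().endswith(IMAGE_EXTENSIONS))
    return total


def count_split_images(dataset_dir,
                       splits=("training", "validation", "testing")):
    """Return a {split name: number of images} dictionary."""
    return {split: count_images(os.path.join(dataset_dir, split))
            for split in splits}
```

Calling count_split_images("malaria") would then give you the three totals the generators need in order to know how many batches make up an epoch.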

Let’s initialize our data augmentation objects:

Data augmentation is the process of generating new images from a dataset by applying random modifications. It typically helps a deep learning model generalize better and I almost always recommend it (it is especially important for small datasets).

Data augmentation is briefly covered in my Keras Tutorial blog post. For a full dive into data augmentation be sure to read my deep learning book, Deep Learning for Computer Vision with Python.

Note: The valAug object simply performs scaling; no augmentation is actually performed. We’ll be using this object twice: once for validation rescaling and once for testing rescaling.

Now that the training and validation augmentation objects are created, let’s initialize the generators:

The three generators above actually produce images on demand during training/validation/testing per our augmentation objects and the parameters given here.
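The augmentation objects and generators described above could look something like the sketch below. Treat it as an illustration under assumptions, not the post’s exact code: the specific augmentation ranges, the function names, and class_mode="categorical" are my choices, while the [0, 1] rescaling and the 64×64 target size come from the preprocessing described in this post. TensorFlow is imported inside the function so the helpers can be defined even where it is not installed.

```python
def build_augmenters():
    """Return (train_aug, val_aug) ImageDataGenerator objects."""
    # Deferred import: TensorFlow is only needed when this is called.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Training: rescale to [0, 1] plus random modifications. The ranges
    # below are illustrative values, not taken from the post.
    train_aug = ImageDataGenerator(
        rescale=1 / 255.0,
        rotation_range=20,
        zoom_range=0.05,
        width_shift_range=0.05,
        height_shift_range=0.05,
        shear_range=0.05,
        horizontal_flip=True,
        fill_mode="nearest",
    )

    # Validation/testing: rescaling only, no augmentation.
    val_aug = ImageDataGenerator(rescale=1 / 255.0)
    return train_aug, val_aug


def build_generator(aug, directory, batch_size=32, shuffle=False):
    """Yield batches of 64x64 RGB images from class subdirectories."""
    return aug.flow_from_directory(
        directory,
        class_mode="categorical",  # assumption: one-hot labels
        target_size=(64, 64),
        color_mode="rgb",
        shuffle=shuffle,
        batch_size=batch_size,
    )
```

You would typically shuffle only the training generator and leave shuffle=False for validation and testing so predictions line up with the ground-truth labels.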

Now we’re going to build, compile, and train our model. We’ll also evaluate our model and print a classification report:

In the code block above, we:

  • Initialize our implementation of ResNet on Lines 84-88 (from Deep Learning for Computer Vision with Python). Notice how we’ve specified "binary_crossentropy" because our model has two classes. If you are working with > 2 classes, you should change it to "categorical_crossentropy" and also update the number of classes passed to ResNet.build.
  • Train the ResNet model on the augmented Malaria dataset (Lines 91-96).
  • Make predictions on the test set (Lines 102 and 103) and extract the highest probability class index for each prediction (Line 107).
  • Display a classification_report in our terminal (Lines 110-111).
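To make the “extract the highest probability class index” step concrete, here is a small NumPy-only sketch. The probabilities and labels are made-up values standing in for the model’s output on the test set, and the plain accuracy calculation is a simplified stand-in for the full classification report.

```python
import numpy as np

# Stand-in for the model's predictions: one row per test image, one
# column per class. These numbers are invented for illustration.
pred_probs = np.array([
    [0.92, 0.08],
    [0.15, 0.85],
    [0.60, 0.40],
    [0.05, 0.95],
])

# Made-up ground-truth class indices for the same four images.
true_idxs = np.array([0, 1, 1, 1])

# For each row, take the index of the column with the largest probability.
pred_idxs = np.argmax(pred_probs, axis=1)

# Fraction of predictions that match the ground truth.
accuracy = float(np.mean(pred_idxs == true_idxs))

print(pred_idxs)  # [0 1 0 1]
print(accuracy)   # 0.75
```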

Now that our model is trained let’s save our Keras model to disk:

To save our Keras model to disk, we simply call .save on the model (Line 115).

Simple right?

Yes, it is a simple function call, but the hard work before it made the process possible.

In our next script, we’ll be able to load the model from disk and make predictions.

Let’s plot the training results and save the training plot as well:
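A plotting step along these lines could be sketched as below. I’m assuming the training history is available as a plain dictionary of per-epoch values (the kind of mapping Keras exposes as the history attribute of the object returned by fit); the function name, title, and axis labels are mine. The "Agg" backend is selected before pyplot is imported so the figure is written to disk without needing a display.

```python
def save_training_plot(history, plot_path):
    """Plot every series in `history` (name -> list of values) to a file."""
    # Select the non-interactive backend before importing pyplot.
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt

    # Assumes a "loss" series exists and all series have the same length.
    epochs = range(1, len(history["loss"]) + 1)

    plt.figure()
    for name, values in history.items():
        plt.plot(epochs, values, label=name)
    plt.title("Training Loss and Accuracy")
    plt.xlabel("Epoch #")
    plt.ylabel("Loss/Accuracy")
    plt.legend()
    plt.savefig(plot_path)
    plt.close()
```

Keep in mind that the exact key names in the history dictionary (for example "acc" versus "accuracy") depend on your Keras version, which is why the sketch plots whatever series it is given.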

At this point our script is complete. Let’s go ahead and train our Keras model!


To train your Keras model on our example dataset, make sure you use the “Downloads” section of the blog post to download the source code and images themselves.

From there, open up a terminal and execute the following command:

    $ python save_model.py --dataset malaria --model saved_model.model

Notice the command line arguments. I’ve specified the path to the Malaria dataset directory ( --dataset malaria ) and the path to our destination model ( --model saved_model.model ). These command line arguments are key to the operation of this script. You can name your model whatever you’d like without changing a line of code!

Here you can see that our model is obtaining ~99% accuracy on the test set.

Each epoch is taking ~7 seconds on my CPU. On my GPU each epoch takes ~1 second. Keep in mind that training is faster than last week because the reduced dataset pushes far less data through the network each epoch.

After training you can list the contents of your directory and see the saved Keras model:

Figure 3: Our Keras model is now residing on disk. Saving Keras models is quite easy via the Keras API.

The saved_model.model file is your actual saved Keras model.

You will learn how to load your saved Keras model from disk in the next section.

Loading a model with Keras

Figure 4: The process of loading a Keras model from disk and putting it to use to make predictions. Don’t forget to preprocess your data in the same manner as during training!

Now that we’ve learned how to save a Keras model to disk, the next step is to load the Keras model so we can use it for making classifications. Open up your load_model.py script and let’s get started:

We import our required packages on Lines 2-10. Most notably we need load_model in order to load our model from disk and put it to use.

Our two command line arguments are parsed on Lines 12-17:

  • --images : The path to the images we’d like to make predictions with.
  • --model : The path to the model we saved previously.

Again, these lines don’t need to change. When you enter the command in your terminal you’ll provide values for both --images and --model.

The next step is to load our Keras model from disk:

On Line 21, we load our Keras model by calling load_model and providing the path to the model itself (contained within our parsed args dictionary).
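If you want to convince yourself that saving and loading really round-trips, a sketch like the following can help. It is not the script from this post: the tiny model is a hypothetical stand-in for ResNet, the function names are mine, and the imports are deferred so the helpers can be defined without TensorFlow installed. I use an .h5 filename here because, as far as I know, recent Keras releases reject filenames that do not end in .keras or .h5, whereas the saved_model.model name worked with the older Keras setup used for this post.

```python
def build_tiny_model():
    """A deliberately tiny two-class stand-in for the ResNet model."""
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Input(shape=(64, 64, 3)),
        layers.Flatten(),
        layers.Dense(2, activation="softmax"),
    ])
    model.compile(loss="binary_crossentropy", optimizer="sgd",
                  metrics=["accuracy"])
    return model


def save_and_reload(model, path):
    """Serialize `model` to `path`, then load it back from disk."""
    from tensorflow.keras.models import load_model

    model.save(path)         # architecture + weights + optimizer state
    return load_model(path)  # a separate model object rebuilt from the file
```

Feeding the same batch through the original and the reloaded model should give (numerically) identical predictions, which is a quick sanity check worth running on your own models.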

Given the model, we can now make predictions with it. But first we’ll need some images to work with and a place to put our results:

On Lines 24-26, we grab a random selection of testing image paths.

Line 29 initializes an empty list to hold the results.
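Grabbing a random selection of testing images can be sketched with the standard library alone. The helper name, the optional seed, and the example file names below are hypothetical; the count of 16 matches the 4×4 montage used later in this post.

```python
import random


def sample_image_paths(image_paths, k=16, seed=None):
    """Return `k` randomly chosen paths without modifying the input."""
    rng = random.Random(seed)  # pass a seed for a repeatable selection
    shuffled = list(image_paths)
    rng.shuffle(shuffled)
    return shuffled[:k]


# Hypothetical file names standing in for the real testing images.
all_paths = ["malaria/testing/img_{:03d}.png".format(i) for i in range(40)]
image_paths = sample_image_paths(all_paths, k=16, seed=42)

# Empty list that will hold the annotated output images.
results = []
print(len(image_paths))  # 16
```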

Let’s loop over each of our imagePaths:

On Line 32 we begin looping over our imagePaths.

We begin the loop by loading our image from disk (Line 34) and preprocessing it (Lines 40-42). These preprocessing steps should be identical to those taken in our training script. As you can see, we’ve converted the images from BGR to RGB channel ordering, resized to 64×64 pixels, and scaled to the range [0, 1].

A common mistake I see new deep learning practitioners make is failing to preprocess new images in the same manner as their training images.
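The three preprocessing steps listed above can be wrapped in one small function so that you reuse exactly the same code every time you classify an image. This is a minimal sketch assuming the input is an image as loaded by OpenCV (BGR channel order, 8-bit values); the function name is mine, and OpenCV is imported inside the function so the helper can be defined even where cv2 is not installed.

```python
def preprocess(image_bgr, size=(64, 64)):
    """Convert a BGR uint8 image into the format the network was trained on."""
    import cv2  # deferred import; requires opencv-python when called

    # 1. Swap channel ordering from OpenCV's BGR to RGB.
    image = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)

    # 2. Resize to the spatial dimensions used during training.
    image = cv2.resize(image, size)

    # 3. Scale pixel intensities from [0, 255] to [0, 1].
    return image.astype("float32") / 255.0
```

If you ever change the training-time preprocessing (a different input size, a different scaling), this is the one place you would need to update for prediction.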

Moving on, let’s make a prediction on an image during each iteration of the loop:

In this block we:

  • Handle channel ordering (Line 47). The TensorFlow backend default is "channels_last", but don’t forget that Keras supports alternative backends as well.
  • Create a batch to send through the network by adding a dimension to the volume (Line 48). We’re just sending one image through the network at a time, but the additional dimension is critical.
  • Pass the image through the ResNet model (Line 51), obtaining a prediction. We take the index of the max prediction (either "Parasitized" or "Uninfected") on Line 52.
  • Then we create a colored label and draw it on the original image (Lines 56-63).
  • Finally, we append the annotated orig image to results.
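The batch-dimension and argmax steps in the list above can be sketched with NumPy alone. The probabilities here are invented to stand in for the model’s output, the helper names are mine, and the index-to-label mapping assumes the classes are indexed alphabetically (Parasitized = 0, Uninfected = 1), which you should double-check against your own generators.

```python
import numpy as np

# Assumed index order: alphabetical by class directory name.
CLASS_NAMES = ("Parasitized", "Uninfected")


def to_batch(image):
    """Add a leading batch dimension: (H, W, C) -> (1, H, W, C)."""
    return np.expand_dims(image, axis=0)


def decode_prediction(probs):
    """Return (class index, class name) for a batch of one prediction."""
    idx = int(np.argmax(probs, axis=1)[0])
    return idx, CLASS_NAMES[idx]


# A blank preprocessed image standing in for a real cell image.
image = np.zeros((64, 64, 3), dtype="float32")
batch = to_batch(image)
print(batch.shape)  # (1, 64, 64, 3)

# Invented output standing in for the model's prediction on the batch.
fake_probs = np.array([[0.2, 0.8]])
print(decode_prediction(fake_probs))  # (1, 'Uninfected')
```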

To visualize our results let’s create a montage and display it on the screen:

A montage of results is built on Line 69. Our montage is a 4×4 grid of images to accommodate the 16 random testing images we grabbed earlier on. Learn how this function works in my blog post, Montages with OpenCV.

The montage will be displayed until any key is pressed (Lines 72 and 73).
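To get a feel for what a montage is under the hood, here is a NumPy-only stand-in that tiles equally sized images into a grid. It is not the montage helper used in this post, just a sketch of the same idea; the function name and the (columns, rows) grid argument are my own choices.

```python
import numpy as np


def build_montage(images, grid=(4, 4)):
    """Tile same-sized (H, W, C) images into one (rows*H, cols*W, C) image."""
    cols, rows = grid
    if len(images) != rows * cols:
        raise ValueError("expected exactly rows * cols images")

    stacked = np.stack(images)  # shape: (N, H, W, C)
    _n, h, w, c = stacked.shape

    # Arrange as (rows, cols, H, W, C), move each image's height axis next
    # to the row axis, then merge the row/height and col/width pairs.
    return (stacked.reshape(rows, cols, h, w, c)
                   .transpose(0, 2, 1, 3, 4)
                   .reshape(rows * h, cols * w, c))
```

Images are placed left to right, top to bottom, so the first four images form the top row of a 4×4 grid. The resulting array can be shown or written to disk like any other image.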


To see our script in action make sure you use the “Downloads” section of the tutorial to download the source code and dataset of images.

From there, open up a terminal and execute the following command:

    $ python load_model.py --images malaria/testing --model saved_model.model

Figure 5: A montage of cells either “Parasitized” or “Uninfected” with Malaria. In today’s blog post we saved a ResNet deep learning model to disk and then loaded it with a separate script to make these predictions.

Here you can see that we have:

  1. Provided the path to our testing images ( --images malaria/testing ) as well as the model already residing on disk ( --model saved_model.model ) via command line arguments
  2. Loaded our Keras model from disk
  3. Preprocessed our input images
  4. Classified each of the example images
  5. Constructed an output visualization of our classifications (Figure 5)

This process was made possible by the fact that we were able to save our Keras model to disk in the training script and then load the Keras model from disk in a separate script.

Summary

In today’s tutorial you learned:

  1. How to train a Keras model on a dataset
  2. How to serialize and save your Keras model to disk
  3. How to load your saved Keras model from a separate Python script
  4. How to classify new input images using your loaded Keras model

You can use the Python scripts covered in today’s tutorial as templates when training, saving, and loading your own Keras models.

I hope you enjoyed today’s blog post!

14 Responses to Keras – Save and Load Your Deep Learning Models

  1. Nikhil Panigrahi December 10, 2018 at 10:44 am #

    Hello Sir, a great fan of your coding and you as well!! Can you please upload a blog on self driving car on the Udacity simulator please!!

  2. sophia December 10, 2018 at 12:35 pm #

    Another informative and practical tutorial. An extension of this topic is “deploying deep learning models on the edge”, so as to be able to do real-time inference on live camera feed. AWS/GCP/Azure all have IoT options, but I haven’t found a good end-to-end tutorial on deploying our own TensorFlow/PyTorch models. I think NVIDIA too has edge GPUs (Jetson). A tutorial on this topic would be very helpful. Thanks, as always!

    • Adrian Rosebrock December 11, 2018 at 12:42 pm #

      Thank you for the suggestion, Sophia.

  3. Adam Ge December 10, 2018 at 8:26 pm #

    Hi, Adrian, Thank you for sharing this blog.

    I have one question: when I use load_img (from keras.preprocessing.image import load_img) instead of OpenCV to load the image, the predicted probability is different.

    The OpenCV and PIL image loading functions differ slightly. The training stage uses the load_img function to load the images, so I think using load_img at prediction time may be better than OpenCV. Is that right?

    • Adrian Rosebrock December 11, 2018 at 12:40 pm #

      The load_img function will load images in RGB ordering while OpenCV uses BGR ordering. You may have your channel ordering incorrect at prediction.

      • Adam Ge December 12, 2018 at 12:24 am #

        Thank you for your reply.
        I also change the image to RGB ordering, but the result was the same.
        Do you have any advice?

        • Adrian Rosebrock December 13, 2018 at 9:08 am #

          Hm, I’m not sure what the exact issue could be then without having physical access to your machine. I used Keras 2.2 and TensorFlow v1.12 for this post. Perhaps try with those versions?

  4. David Bonn December 10, 2018 at 8:49 pm #

    Hi Adrian, Great blog post! I had one question. Rather than save the binarized labels you effectively hard-coded the label indices. So does Keras index the labels in alphabetical order?

    • Adrian Rosebrock December 11, 2018 at 12:39 pm #

      I believe it does when using the “.flow_from_directory” function but that is always something you should double-check.

  5. Ravi December 11, 2018 at 6:29 am #

    Hi Adrian, Great post. Used last weeks trained model and got perfect results. Thanks for teaching us, with such wonderful in-depth Computer Vision concepts.

    // You should change it to “categorical_crossentropy” if you are working with > 2 classes.//

    Appreciate your forethought. Great job. I did worked around with multiple classes. I would like to point out here that i need to do another change to make multiple classes work.

    //model = ResNet.build(64, 64, 3, 2, (2, 2, 3),
    (32, 64, 128, 256), reg=0.0005)//

    I need to change the fourth argument from 2 to 10 number of classes.

    • Adrian Rosebrock December 11, 2018 at 12:32 pm #

      You are absolutely right, thanks Ravi.

  6. Nicole Finnie December 11, 2018 at 8:07 am #

    Hey Adrian, Do you know if there’s a good way to compress the saved model? When a neural network is big, the h5 file can get quite large, and not suitable for git unless you split the file. Thanks!

    • Adrian Rosebrock December 11, 2018 at 12:30 pm #

      What you are referring to is called “model quantization”. Exactly how you do that depends on your deep learning library so you should spend some time researching model quantization for your specific library.

  7. Khaled Metwaly December 16, 2018 at 4:56 pm #

    Hi Adrian, i enjoy your way in explaining the tutorials, i wish to see a tutorial how can computer vision deals with images in 360 panorama “equirectangular , i was playing around for object detection in 360 video but the main challenge was the computational time. Thanks.
