Keras Tutorial: How to get started with Keras, Deep Learning, and Python


Inside this Keras tutorial, you will discover how easy it is to get started with deep learning and Python. You will use the Keras deep learning library to train your first neural network on a custom image dataset, and from there, you’ll implement your first Convolutional Neural Network (CNN) as well.

The inspiration for this guide came from PyImageSearch reader, Igor, who emailed me a few weeks ago and asked:

Hey Adrian, thanks for the PyImageSearch blog. I’ve noticed that nearly every “getting started” guide I come across for Keras and image classification uses either the MNIST or CIFAR-10 datasets which are built into Keras. I just call one of those functions and the data is automatically loaded for me.

But how do I go about using my own image dataset with Keras?

What steps do I have to take?

Igor has a great point — most Keras tutorials you come across will try to teach you the basics of the library using an image classification dataset such as MNIST (handwriting recognition) or CIFAR-10 (basic object recognition).

These image datasets are standard benchmarks in the computer vision and deep learning literature, and sure, they will absolutely get you started using Keras…

…but they aren’t necessarily practical in the sense that they don’t teach you how to work with your own set of images residing on disk. Instead, you’re just calling helper functions to load pre-compiled datasets.

I’m going with a different take on an introductory Keras tutorial.

Instead of teaching you how to utilize one of these pre-compiled datasets, I’m going to teach you how to train your first neural network and Convolutional Neural Network using a custom dataset — because let’s face it, your goal is to apply deep learning to your own dataset, not one built into Keras, am I right?

To learn how to get started with Keras, Deep Learning, and Python, just keep reading!



Today’s Keras tutorial is designed with the practitioner in mind — it is meant to be a practitioner’s approach to applied deep learning.

That means that we’ll learn by doing.

We’ll be getting our hands dirty.

Writing some Keras code.

And then training our networks on our custom datasets.

This tutorial is not meant to be a deep dive into the theory surrounding deep learning.

If you’re interested in studying deep learning in depth, including both (1) hands-on implementations and (2) a discussion of theory, I would suggest you check out my book, Deep Learning for Computer Vision with Python.

Overview of what’s going to be covered

Training your first simple neural network with Keras doesn’t require a lot of code, but we’re going to start slow, taking it step-by-step, ensuring you understand the process of how to train a network on your own custom dataset.

The steps we’ll cover today include:

  1. Installing Keras and other dependencies on your system
  2. Loading your data from disk
  3. Creating your training and testing splits
  4. Defining your Keras model architecture
  5. Compiling your Keras model
  6. Training your model on your training data
  7. Evaluating your model on your test data
  8. Making predictions using your trained Keras model

I’ve also included an additional section on training your first Convolutional Neural Network.

This may seem like a lot of steps, but I promise you, once we start working through the examples you’ll see that they are linear, make intuitive sense, and will help you understand the fundamentals of training a neural network with Keras.

Our example dataset

Figure 1: In this Keras tutorial, we won’t be using CIFAR-10 or MNIST for our dataset. Instead, I’ll show you how you can organize your own dataset of images and train a neural network using deep learning with Keras.

Most Keras tutorials you come across for image classification will utilize MNIST or CIFAR-10 — I’m not going to do that here.

To start, MNIST and CIFAR-10 aren’t very exciting examples.

These tutorials don’t actually cover how to work with your own custom image datasets. Instead, they simply call built-in Keras utilities that magically return the MNIST and CIFAR-10 datasets as NumPy arrays. In fact, your training and testing splits have already been pre-split for you!

Secondly, if you want to use your own custom datasets you really don’t know where to start. You’ll find yourself scratching your head and asking questions such as:

  • Where are those helper functions loading the data from?
  • What format should my dataset on disk be?
  • How can I load my dataset into memory?
  • What preprocessing steps do I need to perform?

Let’s be honest — your goal in studying Keras and deep learning isn’t to work with these pre-baked datasets.

Instead, you want to work with your own custom datasets.

And those introductory Keras tutorials you’ve come across only take you so far.

That’s why, inside this Keras tutorial, we’ll be working with a custom dataset called the “Animals dataset” I created for my book, Deep Learning for Computer Vision with Python:

Figure 2: In this Keras tutorial we’ll use an example animals dataset straight from my deep learning book. The dataset consists of dogs, cats, and pandas.

The purpose of this dataset is to correctly classify an image as containing either:

  • Cats
  • Dogs
  • Pandas

Containing only 3,000 images, the Animals dataset is meant to be an introductory dataset that we can quickly train a deep learning model on using either our CPU or GPU (and still obtain reasonable accuracy).

Furthermore, using this custom dataset enables you to understand:

  1. How you should organize your dataset on disk
  2. How to load your images and class labels from disk
  3. How to partition your data into training and testing splits
  4. How to train your first Keras neural network on the training data
  5. How to evaluate your model on the testing data
  6. How you can reuse your trained model on data that is brand new and outside your training and testing splits

By following the steps in this Keras tutorial you’ll be able to swap out my Animals dataset for any dataset of your choice, provided you utilize the project/directory structure detailed below.

Need data? If you need to scrape images from the internet to create a dataset, check out how to do it the easy way with Bing Image Search, or the slightly more involved way with Google Images.

Project structure

There are a number of files associated with this project. Grab the zip from the “Downloads” section and then use the tree  command to show the project structure in your terminal (I’ve provided two command line argument flags to tree  to make the output nice and clean):

As previously discussed, today we’ll be working with the Animals dataset. Notice how animals  is organized in the project tree. Inside of animals/ , there are three class directories: cats/ , dogs/ , panda/ . Within each of those directories are 1,000 images pertaining to the respective class.

If you work with your own dataset, just organize it the same way! Ideally you’ll gather 1,000 images per class at a minimum. This isn’t always possible, but you should at least have class balance. Significantly more images in one class folder could cause model bias.

Next is the images/  directory. This directory contains three images for testing purposes which we’ll use to demonstrate how to (1) load a trained model from disk and then (2) classify an input image that is not part of our original dataset.

The output/  folder contains three types of files which are generated by training:

  • .model : A serialized Keras model file is generated after training and can be used in future inference scripts.
  • .pickle : A serialized label binarizer file. This file contains an object which contains class names. It accompanies a model file.
  • .png : I always place my training/validation plot images in the output folder as it is an output of the training process.

The pyimagesearch/  directory is a module. Contrary to the many questions I receive, pyimagesearch  is not a pip-installable package. Instead it resides in the project folder and classes contained within can be imported into your scripts. It is provided in the “Downloads” section of this Keras tutorial.

Today we’ll be reviewing four .py files:

  • In the first half of the blog post, we’ll train a simple model. The training script is train_simple_nn.py .
  • We’ll advance to training SmallVGGNet  using the train_vgg.py  script.
  • The smallvggnet.py  file contains our SmallVGGNet  class, a Convolutional Neural Network.
  • What good is a serialized model unless we can deploy it? In predict.py , I’ve provided sample code for you to load a serialized model + label file and make an inference on an image. The prediction script is only useful after we have successfully trained a model with reasonable accuracy. It is always useful to run this script to test with images that are not contained within the dataset.

1. Install Keras on your system

Figure 3: We’ll use Keras with the TensorFlow backend in this introduction to Keras for deep learning blog post.

For today’s tutorial, you will need to have Keras, TensorFlow, and OpenCV installed.

If you don’t have this software on your system yet, don’t run for the hills! I’ve written a number of easy-to-follow installation guides. I also update them on a regular basis. Here is what you need:

  • OpenCV Installation Guides — This launchpad links to tutorials that will help you install OpenCV on Ubuntu, macOS, or Raspberry Pi.
  • Install Keras with TensorFlow — You’ll be up and running with Keras and TensorFlow in less than two minutes, thanks to pip. You can install these packages on a Raspberry Pi; however, I advise against training with your Pi. Pre-trained and reasonably sized models (such as both that we’re covering today) can easily run on a Pi, but make sure you train them first!
  • Install imutils, scikit-learn, and matplotlib — Be sure to install these packages as well (ideally into your virtual environment). It is easy to install each with pip:

2. Load your data from disk

Figure 4: Step #2 of our Keras tutorial involves loading images from disk into memory.

Now that Keras is installed on our system we can start implementing our first simple neural network training script using Keras. We’ll later implement a full-blown Convolutional Neural Network, but let’s start easy and work our way up.

Open up train_simple_nn.py  and insert the following code:

Lines 2-19 import our required packages. As you can see there are quite a few tools this script is taking advantage of. Let’s review the important ones:

  • matplotlib : This is the go-to plotting package for Python. That said, it does have its nuances, and if you’re having trouble with it, refer to this blog post. On Line 3, we instruct matplotlib  to use the "Agg"  backend enabling us to save plots to disk — that’s your first nuance!
  • sklearn : The scikit-learn  library will help us with binarizing our labels, splitting data for training/testing, and generating a training report in our terminal.
  • keras : You’re reading this tutorial to learn about Keras — it is our high level frontend into TensorFlow and other deep learning backends.
  • imutils : My package of convenience functions. We’ll use the paths  module to generate a list of image file paths for training.
  • numpy : NumPy is for numerical processing with Python. It is another go-to package. If you have OpenCV for Python and scikit-learn installed, then you’ll have NumPy as it is a dependency.
  • cv2 : This is OpenCV. At this point, it is both tradition and a requirement to tack on the 2 even though you’re likely using OpenCV 3 or higher.
  • …the remaining imports are built into your installation of Python!

Wheww! That was a lot, but having a good idea of what each import is used for will aid your understanding as we walk through these scripts.

Let’s parse our command line arguments with argparse:

Our script will dynamically handle additional information provided via the command line when we execute our script. The additional information is in the form of command line arguments. The argparse  module is built into Python and will handle parsing the information you provide in your command string. For additional explanation, refer to this blog post.

We have four command line arguments to parse:

  • --dataset : The path to our dataset of images on disk.
  • --model : Our model will be serialized and output to disk. This argument contains the path to the output model file.
  • --label-bin : Dataset labels are serialized to disk for easy recall in other scripts. This is the path to the output label binarizer file.
  • --plot : The path to the output training plot image file. We’ll review this plot to check for over/underfitting of our data.
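Here is a minimal sketch of that argument parsing. The sample values handed to parse_args  are purely for illustration (the file names are hypothetical); the real script calls ap.parse_args()  with no arguments so it reads your actual command line:

```python
# Sketch of the argument parsing; the list passed to parse_args() stands in
# for real command line input and uses made-up sample file names.
import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True, help="path to input dataset of images")
ap.add_argument("-m", "--model", required=True, help="path to output trained model")
ap.add_argument("-l", "--label-bin", required=True, help="path to output label binarizer")
ap.add_argument("-p", "--plot", required=True, help="path to output accuracy/loss plot")
args = vars(ap.parse_args([
    "--dataset", "animals", "--model", "output/simple_nn.model",
    "--label-bin", "output/simple_nn_lb.pickle", "--plot", "output/simple_nn_plot.png"]))
print(args["dataset"])
```

Note that argparse converts the --label-bin  flag into the dictionary key "label_bin" .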

With the dataset information in hand, let’s load our images and class labels:

Here we:

  • Initialize lists for our data  and labels  (Lines 35 and 36). These will later become NumPy arrays.
  • Grab imagePaths  and randomly shuffle them (Lines 39-41). The paths.list_images  function conveniently will find all the paths to all input images in our --dataset  directory before we sort and shuffle  them. I set a seed  so that the random reordering is reproducible.
  • Begin looping over all imagePaths  in our dataset (Line 44).

For each imagePath , we proceed to:

  • Load the image  into memory (Line 48).
  • Resize the image to 32x32  pixels (ignoring aspect ratio) as well as flatten  the image (Line 49). It is critical to resize  our images properly because this neural network requires these dimensions. Each neural network will require different dimensions, so just be aware of this. Flattening the data allows us to pass the raw pixel intensities to the input layer neurons easily. You’ll see later that for VGGNet we pass the volume to the network since it is convolutional. Keep in mind that this example is just a simple non-convolutional network — we’ll be looking at a more advanced example later in the post.
  • Append the resized image to data  (Line 50).
  • Extract the class label  of the image from the path (Line 54) and add it to the labels  list (Line 55). The labels  list contains the classes that correspond to each image in the data list.

Now in one fell swoop, we can apply array operations to the data and labels:

On Line 58 we scale pixel intensities from the range [0, 255] to [0, 1] (a common preprocessing step).

We also convert the labels  list to a NumPy array (Line 59).
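The whole load-and-scale flow can be sketched with plain NumPy, using zero arrays in place of cv2.imread  and cv2.resize  so the example runs without OpenCV (the image file names are made up; the path-to-label extraction is the part to note):

```python
# Stand-in sketch of the loading loop: numpy zeros replace cv2.imread/cv2.resize.
import os
import numpy as np

imagePaths = [
    "animals/cats/cats_00001.jpg",
    "animals/dogs/dogs_00001.jpg",
    "animals/panda/panda_00001.jpg",
]
data, labels = [], []
for imagePath in imagePaths:
    # real script: image = cv2.resize(cv2.imread(imagePath), (32, 32)).flatten()
    image = np.zeros((32, 32, 3), dtype="uint8").flatten()  # 32*32*3 = 3072 values
    data.append(image)
    # the class label is simply the name of the image's parent directory
    label = imagePath.split(os.path.sep)[-2]
    labels.append(label)

# one fell swoop: scale [0, 255] -> [0, 1] and convert labels to a NumPy array
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels)
```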

3. Construct your training and testing splits

Figure 5: Before fitting a deep learning or machine learning model you must split your data into training and testing sets. Scikit-learn is employed in this blog post to split our data.

Now that we have loaded our image data from disk, next we need to construct our training and testing splits:

It is typical to allocate a percentage of your data for training and a smaller percentage of your data for testing. scikit-learn provides a handy train_test_split  function which will split the data for us.

Both trainX  and testX  make up the image data itself, while trainY  and testY  make up the labels.
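A quick sketch of the split with random stand-in data (12 fake flattened images, so the 75/25 split is easy to count):

```python
# Hold out 25% of the data for testing with scikit-learn's train_test_split.
import numpy as np
from sklearn.model_selection import train_test_split

data = np.random.rand(12, 3072)                   # 12 fake flattened images
labels = np.array(["cats", "dogs", "panda"] * 4)  # 12 matching labels

(trainX, testX, trainY, testY) = train_test_split(
    data, labels, test_size=0.25, random_state=42)
```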

Our class labels are currently represented as strings; however, Keras will assume that both:

  1. Labels are encoded as integers
  2. And furthermore, one-hot encoding is performed on these labels making each label represented as a vector rather than an integer

To accomplish this encoding, we can use the LabelBinarizer  class from scikit-learn:

On Line 70, we initialize the LabelBinarizer  object.

A call to fit_transform  finds all unique class labels in trainY  and then transforms them into one-hot encoded labels.

A call to just .transform  on testY  performs just the one-hot encoding step — the unique set of possible class labels was already determined by the call to .fit_transform .

Here’s an example:

Notice how only one of the array elements is “hot” which is why we call this “one-hot” encoding.
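A minimal sketch with scikit-learn (the four sample labels are arbitrary) shows both calls and the resulting one-hot vectors:

```python
# LabelBinarizer: fit_transform learns the classes AND encodes; transform only encodes.
from sklearn.preprocessing import LabelBinarizer

lb = LabelBinarizer()
trainY = lb.fit_transform(["cats", "dogs", "panda", "cats"])
testY = lb.transform(["panda"])

print(lb.classes_)  # ['cats' 'dogs' 'panda']
print(trainY[0])    # [1 0 0] -> only the "cats" element is hot
```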

4. Define your Keras model architecture

Figure 6: Our simple neural network is created using Keras in this deep learning tutorial.

The next step is to define our neural network architecture using Keras. Here we will be using a network with one input layer, two hidden layers, and one output layer:

Since our model is really simple, we go ahead and define it in this script (typically I like to make a separate class in a separate file for the model architecture).

The input layer and first hidden layer are defined on Line 76. The input layer will have an input_shape  of 3072  as there are 32x32x3=3072  pixels in a flattened input image. The first hidden layer will have 1024  nodes.

The second hidden layer will have 512  nodes (Line 77).

Finally, the number of nodes in the final output layer (Line 78) will be the number of possible class labels — in this case, the output layer will have three nodes, one for each of our class labels (“cats”, “dogs”, and “panda”, respectively).
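A sketch of such an architecture looks like this (the sigmoid activations on the hidden layers are one reasonable choice for a simple network like this; check the downloaded script for the exact configuration):

```python
# A simple fully-connected network: input + two hidden layers + softmax output.
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(1024, input_shape=(3072,), activation="sigmoid"))  # input + hidden #1
model.add(Dense(512, activation="sigmoid"))                        # hidden #2
model.add(Dense(3, activation="softmax"))                          # one node per class
```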

5. Compile your Keras model

Figure 7: Step #5 of our Keras tutorial requires that we compile our model with an optimizer and loss function.

Once we have defined our neural network architecture, the next step is to “compile” it:

First, we initialize our learning rate and total number of epochs to train for  (Lines 81 and 82).

Then we compile  our model using the Stochastic Gradient Descent ( SGD ) optimizer with "categorical_crossentropy"  as the loss  function.

Categorical cross-entropy is used as the loss for nearly all networks trained to perform classification. The only exception is for 2-class classification where there are only two possible class labels. In that event you would want to swap out "categorical_crossentropy"  for "binary_crossentropy" .
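The compile step can be sketched as follows, assuming a small stand-in model. Note that recent Keras releases spell the optimizer argument learning_rate , while older versions used lr :

```python
# Compile a (stand-in) model with SGD and categorical cross-entropy.
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

INIT_LR = 0.01  # hypothetical initial learning rate

model = Sequential([Dense(8, input_shape=(3072,), activation="sigmoid"),
                    Dense(3, activation="softmax")])
model.compile(loss="categorical_crossentropy",
              optimizer=SGD(learning_rate=INIT_LR),
              metrics=["accuracy"])
```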

6. Fit your Keras model to the data

Figure 8: In Step #6 of this Keras tutorial, we train a deep learning model using our training data and compiled model.

Now that our Keras model is compiled, we can “fit” (i.e., train) it on our training data:

We’ve discussed all the inputs except batch_size . The batch_size  controls the size of each group of data to pass through the network. Larger GPUs would be able to accommodate larger batch sizes. I recommend starting with 32  or 64  and going up from there.
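Here's a runnable sketch with a tiny random dataset. The real script also supplies the testing split as validation data, which is how the validation curves shown later are produced:

```python
# Fit a (stand-in) compiled model on a tiny random dataset.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

trainX = np.random.rand(12, 3072).astype("float32")
trainY = np.eye(3)[np.random.randint(0, 3, 12)]   # random one-hot labels

model = Sequential([Dense(8, input_shape=(3072,), activation="sigmoid"),
                    Dense(3, activation="softmax")])
model.compile(loss="categorical_crossentropy",
              optimizer=SGD(learning_rate=0.01), metrics=["accuracy"])

# batch_size controls how many samples pass through the network at once
H = model.fit(trainX, trainY, epochs=2, batch_size=4, verbose=0)
```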

7. Evaluate your Keras model

Figure 9: After we fit our model, we can use our testing data to make predictions and generate a classification report.

We’ve trained our actual model but now we need to evaluate it on our testing data.

It’s important that we evaluate on our testing data so we can obtain an unbiased (or as close to unbiased as possible) representation of how well our model is performing with data it has never been trained on.

To evaluate our Keras model we can use a combination of the .predict  method of the model along with the classification_report  from scikit-learn:
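You can see the shape of that evaluation with stand-in predictions, no trained model required (the probability rows below are arbitrary example values):

```python
# classification_report compares argmax'd predictions against argmax'd truth.
import numpy as np
from sklearn.metrics import classification_report

# stand-in for model.predict(testX): one row of class probabilities per image
predictions = np.array([[0.90, 0.05, 0.05],
                        [0.10, 0.80, 0.10],
                        [0.20, 0.20, 0.60],
                        [0.70, 0.20, 0.10]])
testY = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]])  # one-hot truth

report = classification_report(testY.argmax(axis=1),
                               predictions.argmax(axis=1),
                               target_names=["cats", "dogs", "panda"])
print(report)
```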

When running this script you’ll notice that our Keras neural network will start to train, and once training is complete, we’ll evaluate the network on our testing set:

This network is small, and when combined with a small dataset, takes only 2 seconds per epoch on my CPU.

Here you can see that our network is obtaining 61% accuracy.

Since we would have a 1/3 chance of randomly picking the correct label for a given image we know that our network has actually learned patterns that can be used to discriminate between the three classes.

We also save a plot of our:

  • Training loss
  • Validation loss
  • Training accuracy
  • Validation accuracy

…ensuring that we can easily spot overfitting or underfitting in our results.

Figure 10: Our simple neural network training script (created with Keras) generates an accuracy/loss plot to help us spot under/overfitting.

Looking at our plot we see a small amount of overfitting start to occur past epoch ~45 where our training and validation losses start to diverge and a pronounced gap appears.

Finally, we can save our model to disk so we can reuse it later without having to retrain it:
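A sketch of the serialization step, shown here for the label binarizer (the model itself is saved with model.save  and read back with load_model , which we'll use in predict.py ):

```python
# Serialize the label binarizer with pickle, then read it back.
import os
import pickle
import tempfile
from sklearn.preprocessing import LabelBinarizer

lb = LabelBinarizer().fit(["cats", "dogs", "panda"])
outdir = tempfile.mkdtemp()

# real script also does: model.save(args["model"])
lb_path = os.path.join(outdir, "simple_nn_lb.pickle")
with open(lb_path, "wb") as f:
    f.write(pickle.dumps(lb))

with open(lb_path, "rb") as f:
    lb_loaded = pickle.loads(f.read())
```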

8. Make predictions on new data using your Keras model

At this point our model is trained — but what if we wanted to make predictions on images after our network has already been trained?

What would we do then?

How would we load the model from disk?

How can we load an image and then preprocess it for classification?

Inside the predict.py  script, I’ll show you how, so open it and insert the following code:

First, we’ll import our required packages and modules.

You’ll need to explicitly import load_model  from keras.models  whenever you write a script to load a Keras model from disk. OpenCV will be used for annotation and display. The pickle  module will be used to load our label binarizer.

Next, let’s parse our command line arguments:

  • --image : The path to our input image.
  • --model : Our trained and serialized Keras model path.
  • --label-bin : Path to the serialized label binarizer.
  • --width : The width of the input shape for our CNN. Remember — you can’t just specify anything here. You need to specify the width that the model is designed for.
  • --height : The height of the image input to the CNN. The height specified must also match the network’s input shape.
  • --flatten : Whether or not we should flatten the image. By default, we won’t flatten the image. If you need to flatten the image, you should pass a 1  for this argument.

Next, let’s load the image and resize it based on the command line arguments:

And then we’ll flatten  the image if necessary:

Flattening the image for standard fully-connected networks is straightforward (Lines 30-32).

In the case of a CNN, we also add the batch dimension, but we do not flatten the image (Lines 36-38). An example CNN is covered in the next section.
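The two preprocessing paths can be sketched with plain NumPy. The 64x64  size here matches the SmallVGGNet input discussed later; for the simple network you'd use 32x32 :

```python
# Two ways to shape an image for inference: flatten for a dense network,
# or keep spatial dims (and just add a batch dimension) for a CNN.
import numpy as np

image = np.random.rand(64, 64, 3)  # stand-in for a loaded + resized image

# fully-connected network: flatten, then add the batch dimension
flat = image.flatten().reshape((1, -1))    # -> shape (1, 12288)

# CNN: keep the spatial dimensions, only add the batch dimension
batch = image.reshape((1,) + image.shape)  # -> shape (1, 64, 64, 3)
```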

From there, let’s load the model + label binarizer into memory and make a prediction:

Our model and label binarizer are loaded via Lines 42 and 43.

We can make predictions on the input image  by calling  model.predict  (Line 46).

What does the  preds  array look like?

The 2D array contains (1) the index of the image in the batch (here there is only one index as there was only one image passed into the NN for classification) and (2) percentages corresponding to each class label, as shown by querying the variable in my Python debugger:

  • cats: 54.6%
  • dogs: 45.4%
  • panda: ~0%

In other words, our network “thinks” that it sees “cats” and it sure as hell “knows” that it doesn’t see a “panda”.

Line 50 finds the index of the max value (the 0-th “cats” index).

And Line 51 extracts the “cats” string label from the label binarizer.

Easy right?

Now let’s display the results:

We format our text  string on Line 54. This includes the label  and the prediction value in percentage format.

Then we place the text  on the output  image (Lines 55 and 56).

Finally, we show the output image on the screen and wait until the user presses any key on Lines 59 and 60 (watch Homer Simpson try to locate the “any” key).
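Skipping the OpenCV drawing and display calls, the argmax-plus-formatting logic can be sketched like this, using the sample probabilities from above:

```python
# Find the most likely class, look up its label, and format the text overlay.
import numpy as np

preds = np.array([[0.546, 0.454, 0.0]])        # example model.predict output
classes = np.array(["cats", "dogs", "panda"])  # stands in for lb.classes_

i = preds.argmax(axis=1)[0]                    # index of the max probability
label = classes[i]
text = "{}: {:.2f}%".format(label, preds[0][i] * 100)
print(text)  # cats: 54.60%
```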

Our prediction script was rather straightforward.

Once you’ve used the “Downloads” section of this tutorial to download the code, you can open up a terminal and try running our trained network on custom images:

Be sure that you copy/pasted or typed the entire command (including command line arguments) from within the folder relative to the script. If you’re having trouble with the command line arguments, give this blog post a read.

Figure 11: A cat is correctly classified with a simple neural network in our Keras tutorial.

Here you can see that our simple Keras neural network has classified the input image as “cats” with 55.87% probability, despite the cat’s face being partially obscured by a piece of bread.

9. BONUS: Training your first Convolutional Neural Network with Keras

Admittedly, using a standard feedforward neural network to classify images is not a wise choice.

Instead, we should leverage Convolutional Neural Networks (CNNs) which are designed to operate over the raw pixel intensities of images and learn discriminating filters that can be used to classify images with high accuracy.

The model we’ll be discussing here today is a smaller variant of VGGNet which I have named “SmallVGGNet”.

VGGNet-like models share two common characteristics:

  1. Only 3×3 convolutions are used
  2. Convolution layers are stacked on top of each other deeper in the network architecture prior to applying a destructive pooling operation

Let’s go ahead and implement SmallVGGNet now.

Open up the smallvggnet.py  file and insert the following code:

As you can see from the imports on Lines 2-10, everything needed for the SmallVGGNet  comes from keras . I encourage you to familiarize yourself with each in the Keras documentation and in my deep learning book.

We then begin to define our SmallVGGNet  class and the build  method:

Our class is defined on Line 12 and the sole build  method is defined on Line 14.

Four parameters are required for build : the width of the input images, the height of the input images, the depth , and the number of classes .

The depth  can also be thought of as the number of channels. Our images are in the RGB color space, so we’ll pass a depth  of 3  when we call the build  method.

First, we initialize a Sequential  model (Line 17).

Then, we determine channel ordering. Keras supports "channels_last"  (i.e. TensorFlow) and "channels_first"  (i.e. Theano) ordering. Lines 18-25 allow our model to support either type of backend.
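The ordering check can be sketched as follows, using the 64x64x3  dimensions we'll pass to build :

```python
# Support both backend channel orderings by checking the configured format.
from keras import backend as K

width, height, depth = 64, 64, 3

# default: "channels_last" ordering (TensorFlow)
inputShape = (height, width, depth)
chanDim = -1

# switch if the backend is configured for "channels_first" (e.g. Theano)
if K.image_data_format() == "channels_first":
    inputShape = (depth, height, width)
    chanDim = 1
```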

Now, let’s add some layers to the network:

Our first CONV => RELU => POOL  layers are added by this block.

Our first CONV  layer has 32  filters of size 3x3 .

It is very important that we specify the inputShape  for the first layer as all subsequent layer dimensions will be calculated using a trickle-down approach.

We’ll use the ReLU (Rectified Linear Unit) activation function in this network architecture. There are a number of activation methods and I encourage you to familiarize yourself with the popular ones inside Deep Learning for Computer Vision with Python where pros/cons and tradeoffs are discussed.

Batch Normalization, MaxPooling, and Dropout are also applied.

Batch Normalization is used to normalize the activations of a given input volume before passing it to the next layer in the network. It has been proven to be very effective at reducing the number of epochs required to train a CNN as well as stabilizing training itself.

POOL layers have a primary function of progressively reducing the spatial size (i.e. width and height) of the input volume to a layer. It is common to insert POOL layers between consecutive CONV layers in a CNN architecture.

Dropout is an interesting concept not to be overlooked. In an effort to force the network to be more robust we can apply dropout, the process of disconnecting random neurons between layers. This process is proven to reduce overfitting, increase accuracy, and allow our network to generalize better for unfamiliar images. As denoted by the parameter, 25% of the node connections are randomly disconnected (dropped out) between layers during each training iteration.
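Putting the pieces above together, the first block can be sketched like this (filter count, pool size, and dropout rate are as described above; assume "channels_last"  ordering for the input shape):

```python
# First CONV => RELU => POOL block of a SmallVGGNet-style network.
from keras.models import Sequential
from keras.layers import Conv2D, Activation, BatchNormalization, MaxPooling2D, Dropout

model = Sequential()
model.add(Conv2D(32, (3, 3), padding="same", input_shape=(64, 64, 3)))  # 32 3x3 filters
model.add(Activation("relu"))
model.add(BatchNormalization(axis=-1))     # normalize activations along the channel axis
model.add(MaxPooling2D(pool_size=(2, 2)))  # 64x64 -> 32x32 spatial dimensions
model.add(Dropout(0.25))                   # drop 25% of connections during training
```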

Note: If you’re new to deep learning, this may all sound like a different language to you. Just like learning a new spoken language, it takes time, study, and practice. If you’re yearning to learn the language of deep learning, why not grab my highly rated book, Deep Learning for Computer Vision with Python? I promise that I break down these concepts in the book and reinforce them via practical examples.

Moving on, we reach our next block of (CONV => RELU) * 2 => POOL  layers:

Notice that our filter dimensions remain the same ( 3x3 , which is common for VGG-like networks); however, we’re increasing the total number of filters learned from 32 to 64.

This is followed by a (CONV => RELU => POOL) * 3  layer set:

Again, notice how all CONV layers learn 3x3  filters but the total number of filters learned by the CONV layers has doubled from 64 to 128. Increasing the total number of filters learned the deeper you go into a CNN (and as your input volume size becomes smaller and smaller) is common practice.

And finally we have a set of FC => RELU  layers:

Fully connected layers are denoted by Dense  in Keras. The final layer is fully connected with three outputs (since we have three classes  in our dataset). The softmax  layer returns the class probabilities for each label.

Now that SmallVGGNet  is implemented, let’s write the driver script that will be used to train it on our Animals dataset.

Much of the code here will be similar to the previous example, but I’ll:

  1. Review the entire script as a matter of completeness
  2. And call out any differences along the way

Open up the train_vgg.py  script and let’s get started:

The imports are the same as our previous training script with two exceptions:

  1. Instead of from keras.models import Sequential ,  this time we import SmallVGGNet via
    from pyimagesearch.smallvggnet import SmallVGGNet . Scroll up slightly to see the SmallVGGNet implementation.
  2. We will be augmenting our data with ImageDataGenerator . Data augmentation is almost always recommended and leads to models that generalize better. Data augmentation involves applying random rotations, shifts, shears, and scaling to existing training data. You won’t see a bunch of new .png and .jpg files — it is done on the fly as the script executes.

You should recognize the other imports at this point. If not, just refer to the bulleted list above.

Let’s parse our command line arguments:

We have four command line arguments to parse:

  • --dataset : The path to our dataset of images on disk. This can be the path to animals/  or another dataset organized the same way.
  • --model : Our model will be serialized and output to disk. This argument contains the path to the output model file. Be sure to name your model accordingly so you don’t overwrite any previously trained models (such as the simple neural network one).
  • --label-bin : Dataset labels are serialized to disk for easy recall in other scripts. This is the path to the output label binarizer file.
  • --plot : The path to the output training plot image file. We’ll review this plot to check for over/underfitting of our data. Each time you train your model with changes to parameters, you should specify a different plot filename in the command line so that you’ll have a history of plots corresponding to training notes in your notebook or notes file. This tutorial makes deep learning seem easy, but keep in mind that I went through several iterations of training before I settled on all parameters to share with you in this script.

Let’s load and preprocess our data:

Exactly as in the simple neural network script, here we:

  • Initialize lists for our data  and labels  (Lines 35 and 36).
  • Grab imagePaths  and randomly shuffle  them (Lines 39-41). The paths.list_images  function conveniently will find all images in our input dataset directory before we sort and shuffle  them.
  • Begin looping over all imagePaths  in our dataset (Line 44).

As we loop over each imagePath , we proceed to:

  • Load the image  into memory (Line 48).
  • Resize the image to 64x64 , the required input spatial dimensions of SmallVGGNet  (Line 49). One key difference is that we are not flattening our data for this neural network, because it is convolutional.
  • Append the resized image  to data  (Line 50).
  • Extract the class label  of the image from the imagePath  and add it to the labels  list (Lines 54 and 55).

On Line 58 we convert data  to a NumPy array and scale pixel intensities from the range [0, 255] to [0, 1].

We also convert the labels  list to a NumPy array format (Line 59).

Then we’ll split our data and binarize our labels:

We perform a 75/25 training and testing split on the data (Lines 63 and 64). An experiment I would encourage you to try is to change the training split to 80/20 and see if the results change significantly.

Label binarizing takes place on Lines 70-72. This allows for one-hot encoding as well as serializing our label binarizer to a pickle file later in the script.
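A minimal sketch of the split-and-binarize step with scikit-learn, using toy stand-in arrays in place of the real data and labels:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer

# toy stand-ins for the real data/labels arrays built earlier
data = np.random.rand(20, 64, 64, 3).astype("float32")
labels = np.array(["cats", "dogs", "panda", "cats"] * 5)

# 75/25 training/testing split
(trainX, testX, trainY, testY) = train_test_split(
    data, labels, test_size=0.25, random_state=42)

# one-hot encode the string labels; the fitted binarizer is what the
# script later pickles to disk for reuse at prediction time
lb = LabelBinarizer()
trainY = lb.fit_transform(trainY)
testY = lb.transform(testY)
```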

Now comes the data augmentation:

On Lines 75-77, we initialize our image data generator to perform image augmentation.

Image augmentation allows us to construct “additional” training data from our existing training data by randomly rotating, shifting, shearing, zooming, and flipping.

Data augmentation is often a critical step for:

  1. Avoiding overfitting
  2. Ensuring your model generalizes well

I recommend that you always perform data augmentation unless you have an explicit reason not to.
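A sketch of an image data generator along these lines follows; the import path assumes TensorFlow's bundled Keras, and the parameter values are illustrative rather than necessarily the exact ones used in the script:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# randomly rotate, shift, shear, zoom, and flip training images on the fly
aug = ImageDataGenerator(rotation_range=30, width_shift_range=0.1,
                         height_shift_range=0.1, shear_range=0.2,
                         zoom_range=0.2, horizontal_flip=True,
                         fill_mode="nearest")
```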

To build our SmallVGGNet , we simply call SmallVGGNet.build  while passing the necessary parameters (Lines 80 and 81).

Let’s compile and train our model:

First, we establish our learning rate, number of epochs, and batch size (Lines 85-87).

Then we initialize our Stochastic Gradient Descent (SGD) optimizer (Line 92).

We’re now ready to compile and train our model (Lines 93-99). Since we’re performing data augmentation, we call model.fit_generator (instead of model.fit), passing the generator with our training data as the first parameter. The generator will produce batches of augmented training data according to the settings we made earlier. (Note: in TensorFlow 2.x, model.fit accepts generators directly and fit_generator is deprecated.)
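To keep a runnable end-to-end sketch, the snippet below substitutes a tiny Sequential model for SmallVGGNet and synthetic arrays for the real training data; the flow is the same (build, compile with SGD and categorical cross-entropy, train on augmented batches). It uses model.fit with a generator, the TensorFlow 2.x replacement for fit_generator:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# hyperparameters (the post uses EPOCHS=75 and a larger batch size;
# they are shrunk here so this sketch runs in seconds)
INIT_LR, EPOCHS, BS = 0.01, 1, 8

# tiny stand-in for SmallVGGNet.build(width=64, height=64, depth=3, classes=3)
model = Sequential([
    Conv2D(8, (3, 3), activation="relu", input_shape=(64, 64, 3)),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(3, activation="softmax"),
])

model.compile(loss="categorical_crossentropy",
              optimizer=SGD(learning_rate=INIT_LR),
              metrics=["accuracy"])

# synthetic stand-ins for the real trainX/trainY/testX/testY arrays
trainX = np.random.rand(16, 64, 64, 3).astype("float32")
trainY = np.eye(3)[np.random.randint(0, 3, 16)]
testX, testY = trainX[:4], trainY[:4]

# the generator yields batches of augmented training data
aug = ImageDataGenerator(rotation_range=30, horizontal_flip=True)
H = model.fit(aug.flow(trainX, trainY, batch_size=BS),
              validation_data=(testX, testY),
              steps_per_epoch=len(trainX) // BS,
              epochs=EPOCHS, verbose=0)
```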

Finally, we’ll evaluate our model, plot the loss/accuracy curves, and save the model:

We make predictions on the testing set, and then scikit-learn is employed to calculate and print our classification_report  (Lines 103-105).

Matplotlib is utilized for plotting the loss/accuracy curves — Lines 108-118 demonstrate my typical plot setup. Line 119 saves the figure to disk.

Finally, we save our model and label binarizer to disk (Lines 123-126).
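A sketch of the evaluation step with scikit-learn, using hypothetical one-hot ground truth and prediction arrays; the argmax call recovers integer class indices before computing the report:

```python
import numpy as np
from sklearn.metrics import classification_report

# hypothetical stand-ins: one-hot ground truth and model probability outputs
testY = np.eye(3)[[0, 1, 2, 2]]
predictions = np.array([[0.8, 0.1, 0.1],
                        [0.1, 0.7, 0.2],
                        [0.2, 0.2, 0.6],
                        [0.1, 0.1, 0.8]])
class_names = ["cats", "dogs", "panda"]

# argmax turns one-hot / probability vectors back into class indices
report = classification_report(testY.argmax(axis=1),
                               predictions.argmax(axis=1),
                               target_names=class_names)
print(report)

# the real script then persists everything for later use, e.g.:
#   model.save(args["model"])
#   pickle.dump(lb, open(args["label_bin"], "wb"))
```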

Let’s go ahead and train our model.

Make sure you’ve used the “Downloads” section of this blog post to download the source code and the example dataset.

From there, open up a terminal and execute the following command:

When you paste the command, ensure that you have all the command line arguments to avoid a “usage” error. If you are new to command line arguments, make sure you read about them before continuing.

Training on a CPU will take some time — each of the 75 epochs requires over one minute. Training will take well over an hour.

A GPU will finish the process in a matter of minutes, as each epoch takes only about two seconds.

Let’s take a look at the resulting training plot that is in the output/  directory:

Figure 12: Our deep learning with Keras accuracy/loss plot demonstrates that we have obtained 78% accuracy on our animals data with our SmallVGGNet model.

As our results demonstrate, we achieve 78% accuracy on our Animals dataset using a Convolutional Neural Network, significantly higher than the previous accuracy of 61% with a standard fully-connected network.

We can also apply our newly trained Keras CNN to example images:

Figure 13: Our deep learning with Keras tutorial has demonstrated how we can confidently recognize pandas in images.

Our CNN is very confident that this is a “panda”. I am too, but I just wish he would stop staring at me!

Let’s try a cute little beagle:

Figure 14: A beagle is recognized as a dog using Keras, TensorFlow, and Python. Our Keras tutorial has introduced the basics for deep learning, but has just scratched the surface of the field.

A couple of beagles have been part of my family and childhood. I’m glad that this beagle picture I found online is recognized as a dog!

I could use a similar CNN to find photos of my beagles on my computer.

In fact, in Google Photos, if you type “dog” in the search box, pictures of dogs in your photo library will be returned — I’m pretty sure a CNN has been used for that image search engine feature. Image search engines aren’t the only use case for CNNs — I bet your mind is starting to come up with all sorts of ideas upon which to apply deep learning.

Frustrated with your progress in deep learning?

You can develop your first neural network in minutes…with just a few lines of Python as I demonstrated today.

But designing more advanced networks and tuning training parameters takes studying, time, and practice. Many people find tutorials online that work, but when they try to train their own models, they are left struggling.

Discover and learn deep learning the right way in my book: Deep Learning for Computer Vision with Python.

Inside the book, you’ll find self-study tutorials and end-to-end projects on topics like:

  • Convolutional Neural Networks
  • Object Detection via Faster R-CNNs and SSDs
  • Generative Adversarial Networks (GANs)
  • Emotion/Facial Expression Recognition
  • Best practices, tips, and rules of thumb
  • …and much more!

Using this book you’ll finally be able to bring deep learning to your own projects.

Skip the academics and get to the results.

Click here to learn more.

Summary

In today’s tutorial, you learned how to get started with Keras, Deep Learning, and Python.

Specifically, you learned the seven key steps to working with Keras and your own custom datasets:

  1. How to load your data from disk
  2. How to create your training and testing splits
  3. How to define your Keras model architecture
  4. How to compile and prepare your Keras model
  5. How to train your model on your training data
  6. How to evaluate your model on testing data
  7. How to make predictions using your trained Keras model

From there you also learned how to implement a Convolutional Neural Network, enabling you to obtain higher accuracy than a standard fully-connected network.

If you have any questions regarding Keras be sure to leave a comment — I’ll do my best to answer.

And to be notified when future Keras and deep learning posts are published here on PyImageSearch, be sure to enter your email address in the form below!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL! Sound good? If so, enter your email address and I’ll send you the code immediately!


29 Responses to Keras Tutorial: How to get started with Keras, Deep Learning, and Python

  1. mohamed September 10, 2018 at 1:54 pm #

    Great: Adrian!
    always forward
    Thank you and thank you Igor

    I have a suggestion as to how to apply some basic concepts of deep learning.
    About how to write those equations in Python.
    Many people know the concepts but there is a barrier between them and the application.

    • Adrian Rosebrock September 10, 2018 at 5:24 pm #

      Hey Mohamed — is there a particular algorithm/equation that you’re struggling with? Or are you speaking in more general terms? If you’re speaking more generically, then Deep Learning for Computer Vision with Python covers the basic concepts of both machine learning and deep learning, including some basic equations and theory before moving into actual applications and code.

      • mohamed September 10, 2018 at 6:00 pm #

        Thanks Adrian Yes, there are certain things that are facing me. But anyway I meant in general

        The book is really wonderful I will work on getting the rest of the versions of it.

  2. Newman September 10, 2018 at 1:58 pm #

    Not Working, here, different numbers on training, and a lot of wrong detection.

    first:
    precision recall f1-score support

    cats 0.46 0.66 0.54 244
    dogs 0.49 0.22 0.30 242
    panda 0.69 0.78 0.73 264

    avg / total 0.55 0.56 0.53 750

    second method:
    precision recall f1-score support

    cats 0.66 0.77 0.71 244
    dogs 0.76 0.55 0.63 242
    panda 0.85 0.95 0.90 264

    avg / total 0.76 0.76 0.75 750

    everything seems fine but not the results.

    • Adrian Rosebrock September 10, 2018 at 5:23 pm #

      Could you share which version of Keras and TensorFlow (assuming a TF backend) you are running? Secondly, keep in mind that NNs are stochastic algorithms — there will naturally be variations in results and you should not expect your results to match mine 100%. The effects of random weight initialization are even more pronounced due to the fact that we’re working with such a small dataset.

  3. Enrique September 11, 2018 at 3:44 am #

    Hi Adrian:
    I tested an image like this https://www.petdarling.com/articulos/wp-content/uploads/2014/06/como-quitarle-las-pulgas-a-mi-perro.jpg, however the result shown is “Panda 100%”. Why does this happen?

    • Adrian Rosebrock September 11, 2018 at 8:04 am #

      Pandas are largely white and black and the dog itself is dark brown and white. The network could have overfit to the panda class. The example used here is just that — an example. It’s not meant to be a model that can correctly classify each image class 100% of the time. For that, you will need more data and more advanced techniques. I would encourage you to take a look at Deep Learning for Computer Vision with Python for more information.

  4. Aiden Ralph September 11, 2018 at 3:46 am #

    Brilliant post Adrian!

    • Adrian Rosebrock September 11, 2018 at 8:02 am #

      Thanks Aiden, I’m glad you liked it!

  5. Aline September 11, 2018 at 4:29 pm #

    Amazing tutorial! So clear!

    • Adrian Rosebrock September 12, 2018 at 2:10 pm #

      Thanks so much, Aline! I’m glad you found it helpful 🙂

  6. Reed Guo September 12, 2018 at 12:40 am #

    Hi, Adrian

    Excellent post.

    Can we improve its accuracy?

    • Adrian Rosebrock September 12, 2018 at 2:03 pm #

      You could improve the accuracy by:

      1. Using more images
      2. Applying transfer learning

      Models trained on ImageNet already have a panda, dog, and cat class as well so you could even use an out-of-the-box classifier.

  7. Viktor September 12, 2018 at 3:49 am #

    Hello! How can I download the Animals dataset?

    • Adrian Rosebrock September 12, 2018 at 1:58 pm #

      Just use the “Downloads” section of the blog post and you will be able to download the code and “Animals” dataset.

  8. Hamid September 12, 2018 at 5:59 am #

    Nice one Adrian ! I really appreciate it .

    Such a wonderful post with elegant and simple explanation .

    I wonder if increasing the no. of hidden layers and setting dropout to 0.5 would further increase the accuracy from 78%.

    • Adrian Rosebrock September 12, 2018 at 1:55 pm #

      It may as those are hyperparameters. Give it a try and see!

  9. Marcelo Mota September 12, 2018 at 2:06 pm #

    My friend, this is the best tutorial so far I have ever seen!! thank you so much.

    I am struggling just in one point: I have a binary problem and have to use the to_categorical function from keras. As I am seeing, I can just use it with integers categories, not strings. Is this true?

    And how do I use the pickle file to write this integer binary categories (from to_categorical) and also how do I use it in the classification_report (the code uses the “lb”)?

    Thank you again and congratulations for such good and complete explanations!

    • Adrian Rosebrock September 12, 2018 at 2:14 pm #

      Thanks Marcelo, I’m glad you found the tutorial helpful!

      For a binary problem you should use the LabelEncoder class instead of LabelBinarizer. The LabelBinarizer will integer-encode the labels which you can then pass into to_categorical to obtain the vector representation of the labels.

      The LabelEncoder can be serialized to disk and convert labels just like the LabelBinarizer does.

  10. Mutlucan Tokat September 12, 2018 at 2:29 pm #

    Hi Adrian,
    The range of the pixels is the same: every pixel takes values between 0 and 255. Why do we need to scale it to between 0 and 1?

    • Adrian Rosebrock September 14, 2018 at 9:53 am #

      Most (but not all) neural networks expect inputs to be in the range [0, 1]. You may see other scaling and normalization techniques, such as mean subtraction, but those are more advanced methods. Your goal here is to bring every input to a common range. Exactly which range you use may depend on:

      1. Weight initialization technique
      2. Activation technique
      3. Whether or not you are performing mean subtraction

      In general, scaling to [0, 1] gives your variables less “freedom” and makes gradient or overflow errors less likely than with larger value ranges (such as [0, 255]). [0, 1] scaling is typically the first scaling technique you’ll see.
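      A minimal NumPy example of this [0, 1] scaling:

```python
import numpy as np

pixels = np.array([[0, 128, 255]], dtype="uint8")   # raw 8-bit intensities
scaled = pixels.astype("float32") / 255.0           # now in [0, 1]
```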

  11. Bob de Graaf September 13, 2018 at 9:28 am #

    Hi Adrian,

    Great tutorial as always! I’m wondering though, isn’t this almost the same tutorial as the Pokemon one? The one where you classify 5 different Pokemon in images? The code seems mostly the same 🙂

    I do see some small differences though; for example, in the Pokemon one you use the Adam optimizer instead of SGD, and the initial learning rate is 0.001 instead of 0.01.

    Are these changes things you’ve learned these past months to achieve better results? Or were these randomly chosen?

    • Adrian Rosebrock September 14, 2018 at 9:33 am #

      The code is similar but not the same. This tutorial is meant to be significantly more introductory and is intended for readers with little to no experience with Keras and deep learning.

      The parameters were also not randomly chosen. They were determined via experiments to find the optimal values for this particular post.

      • Bob de Graaf September 15, 2018 at 8:42 am #

        Ah ok, good to know, thanks! I wasn’t trying to be offensive or anything, just curious. Apologies if I came across that way!

        • Adrian Rosebrock September 17, 2018 at 2:31 pm #

          You certainly were not being offensive, Bob. I just wanted to clarify, that’s all 🙂 Have a great day, friend!

  12. Mutlu September 15, 2018 at 7:39 am #

    Hi Adrian,

    What is chanDim = -1 and chanDim = 1 in beginning of the smallVGGNET?

    Great tutorial BTW.

    • Adrian Rosebrock September 17, 2018 at 2:51 pm #

      It’s the dimension of the channel. For channels-first ordering (e.g., Theano) the channels come first; with channels-last ordering (like TensorFlow) the channels come last — a “-1” value when indexing in Python means the “last dimension/value”.

  13. Roshan September 15, 2018 at 4:27 pm #

    Hi Adrain,
    Thank you for the excellent tutorial.
    I have a basic question:
    During validation, we are considering a train/test split of 75% and 25% respectively.
    So while testing, the network randomly picks 25% of the images.
    But if I want to find out which images are used for testing, how can I find out?
    I want to know the names of the images used for testing.
    Please help me

    • Adrian Rosebrock September 17, 2018 at 2:25 pm #

      The names of the images won’t be returned by scikit-learn. Instead, if you want the exact image names I would suggest you split your image paths rather than the raw images/labels. That will enable you to still perform the split and have the image paths.

Quick Note on Comments

Please note that all comments on the PyImageSearch blog are hand-moderated by me. By moderating each comment on the blog I can ensure (1) I interact with and respond to as many readers as possible and (2) the PyImageSearch blog is kept free of spam.

Typically, I only moderate comments every 48-72 hours; however, I just got married and am currently on my honeymoon with my wife until early October. Please feel free to submit comments of course! Just keep in mind that I will be unavailable to respond until then. For faster interaction and response times, you should join the PyImageSearch Gurus course which includes private community forums.

I appreciate your patience and thank you for being a PyImageSearch reader! I will see you when I get back.

Leave a Reply