Image classification with Keras and deep learning

The Christmas season holds a special place in my heart.

Not because I’m particularly religious or spiritual. Not because I enjoy cold weather. And certainly not because I relish the taste of eggnog (the consistency alone makes my stomach turn).

Instead, Christmas means a lot to me because of my dad.

As I mentioned in a post a few weeks ago, I had a particularly rough childhood. There was a lot of mental illness in my family. I had to grow up fast in that environment and there were times where I missed out on the innocence of being a kid and living in the moment.

But somehow, through all that struggle, my dad made Christmas a glowing beacon of happiness.

Perhaps one of my favorite memories as a kid was when I was in kindergarten (5-6 years old). I had just gotten off the bus, book bag in hand.

I was walking down our long, curvy driveway where at the bottom of the hill I saw my dad laying out Christmas lights which would later decorate our house, bushes, and trees, transforming our home into a Christmas wonderland.

I took off like a rocket, carelessly running down the driveway (as only a child can), unzipped winter coat billowing behind me as I ran, shouting:

“Wait for me, dad!”

I didn’t want to miss out on the decorating festivities.

For the next few hours, my dad patiently helped me untangle the knotted ball of Christmas lights, lay them out, and then watched as I haphazardly threw the lights over the bushes and trees (that were many times my size), ruining any methodical, well-planned decorating blueprint he had so tirelessly designed.

Once I was finished he smiled proudly. He didn’t need any words. His smile confessed that my decorating was the best he had ever seen.

This is just one example of the many, many times my dad made Christmas special for me (despite what else may have been going on in the family).

He probably didn’t even know he was crafting a lifelong memory in my mind — he just wanted to make me happy.

Each year, when Christmas rolls around, I try to slow down, reduce stress, and enjoy the time of year.

Without my dad, I wouldn’t be where I am today — and I certainly wouldn’t have made it through my childhood.

In honor of the Christmas season, I’d like to dedicate this blog post to my dad.

Even if you’re busy, don’t have the time, or simply don’t care about deep learning (the subject matter of today’s tutorial), slow down and give this blog post a read, if for nothing else than for my dad.

I hope you enjoy it.


Image classification with Keras and deep learning

This blog post is part two in our three-part series of building a Not Santa deep learning classifier (i.e., a deep learning model that can recognize if Santa Claus is in an image or not):

  1. Part 1: Deep learning + Google Images for training data
  2. Part 2: Training a Santa/Not Santa detector using deep learning (this post)
  3. Part 3: Deploying a Santa/Not Santa deep learning detector to the Raspberry Pi (next week’s post)

In the first part of this tutorial, we’ll examine our “Santa” and “Not Santa” datasets.

Together, these images will enable us to train a Convolutional Neural Network using Python and Keras to detect if Santa is in an image.

Once we’ve explored our training images, we’ll move on to training the seminal LeNet architecture. We’ll use a smaller network architecture to ensure readers without expensive GPUs can still follow along with this tutorial. This will also ensure beginners can understand the fundamentals of deep learning with Convolutional Neural Networks with Keras and Python.

Finally, we’ll evaluate our Not Santa model on a series of images, then discuss a few limitations to our approach (and how to further extend it).

Our “Santa” and “Not Santa” dataset

Figure 1: A subset of the dataset for the Not Santa detector. Included are (left) Santa images extracted from Google Images and (right) images sampled from the UKBench dataset.

In order to train our Not Santa deep learning model, we require two sets of images:

  • Images containing Santa (“Santa”).
  • Images that do not contain Santa (“Not Santa”).

Last week we used our Google Images hack to quickly grab training images for deep learning networks.

In this case, we can see a sample of the 461 images containing Santa gathered using this technique (Figure 1, left).

I then randomly sampled 461 images that do not contain Santa (Figure 1, right) from the UKBench dataset, a collection of ~10,000 images used for building and evaluating Content-based Image Retrieval (CBIR) systems (i.e., image search engines).

Used together, these two image sets will enable us to train our Not Santa deep learning model.

Your first image classifier with Convolutional Neural Networks and Keras

Figure 2: The LeNet architecture consists of two sets of convolutional, activation, and pooling layers, followed by a fully-connected layer, activation, another fully-connected, and finally a softmax classifier. We’ll be implementing this network architecture using Keras and Python (image source).

The LeNet architecture is an excellent “first image classifier” for Convolutional Neural Networks. Originally designed for classifying handwritten digits, it can easily be extended to other types of images as well.

This tutorial is meant to be an introduction to image classification using deep learning, Keras, and Python so I will not be discussing the inner-workings of each layer. If you are interested in taking a deep dive into deep learning, please take a look at my book, Deep Learning for Computer Vision with Python, where I discuss deep learning in detail (and with lots of code + practical, hands-on implementations as well).

Let’s go ahead and define the network architecture. Open up a new file, name it lenet.py, and insert the following code:

Note: You’ll want to use the “Downloads” section of this post to download the source code + example images before running the code. I’ve included the code below as a matter of completeness, but you’ll want to ensure your directory structure matches mine.

Lines 2-8 handle importing our required Python packages. The Conv2D class is responsible for performing convolution. We can use the MaxPooling2D class for max-pooling operations. As the name suggests, the Activation class applies a particular activation function. When we are ready to Flatten our network topology into fully-connected, Dense layer(s) we can use the respective class names.

The LeNet class is defined on Line 10 followed by the build method on Line 12. Whenever I define a new Convolutional Neural Network architecture I like to:

  • Place it in its own class (for namespace and organizational purposes)
  • Create a static build function that builds the architecture itself

The build method takes a number of parameters, each of which I discuss below:

  • width: The width of our input images
  • height: The height of our input images
  • depth: The number of channels in our input images (1 for grayscale single-channel images, 3 for standard RGB images, which we’ll be using in this tutorial)
  • classes: The total number of classes we want to recognize (in this case, two)

We define our model on Line 14. We use the Sequential class since we will be sequentially adding layers to the model.

Line 15 initializes our inputShape using channels last ordering (the default for TensorFlow). If you are using Theano (or any other backend to Keras that assumes channels first ordering), Lines 18 and 19 properly update the inputShape.

Now that we have initialized our model, we can start adding layers to it:

Lines 21-25 create our first set of CONV => RELU => POOL layers.

The CONV layer will learn 20 convolution filters, each of which is 5×5.

We then apply a ReLU activation function followed by 2×2 max-pooling in both the x and y direction with a stride of two. To visualize this operation, consider a sliding window that “slides” across the activation volume, taking the max operation over each region, while taking a step of two pixels in both the horizontal and vertical direction.
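To make the sliding-window description concrete, here is a minimal NumPy sketch of 2×2 max-pooling with a stride of two. It is not the Keras implementation; the function name max_pool_2x2 and the small activations array are made up for illustration, and the sketch only handles a single 2D channel.

```python
import numpy as np


def max_pool_2x2(volume):
    """Apply 2x2 max-pooling with a stride of two to a (height, width) array."""
    h, w = volume.shape
    # Drop a trailing row/column if the size is odd so the windows tile evenly.
    h, w = h - h % 2, w - w % 2
    trimmed = volume[:h, :w]
    # Group pixels into non-overlapping 2x2 windows, then take each window's max.
    windows = trimmed.reshape(h // 2, 2, w // 2, 2)
    return windows.max(axis=(1, 3))


# Hypothetical 4x4 activation map (illustrative values only).
activations = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 4],
])

pooled = max_pool_2x2(activations)
print(pooled)  # each output value is the max of one 2x2 window
```

Because the window moves two pixels at a time, the output is half the input size along each spatial dimension, so a 28×28 map becomes 14×14.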

Let’s define our second set of CONV => RELU => POOL layers:

This time we are learning 50 convolutional filters rather than the 20 convolutional filters in the previous layer set. It’s common to see the number of CONV filters learned increase the deeper we go in the network architecture.

Our final code block handles flattening out the volume into a set of fully-connected layers:

On Line 33 we take the output of the preceding MaxPooling2D layer and flatten it into a single vector. This operation allows us to apply our dense/fully-connected layers.

Our fully-connected layer contains 500 nodes (Line 34), which we then pass through another nonlinear ReLU activation.

Line 38 defines another fully-connected layer, but this one is special: the number of nodes is equal to the number of classes (i.e., the classes we want to recognize).

This Dense layer is then fed into our softmax classifier, which will yield the probability for each class.

Finally, Line 42 returns our fully constructed deep learning + Keras image classifier to the calling function.
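Putting the walkthrough together, the architecture described above could be sketched roughly as follows. Treat this as an approximation rather than the code from the download: it assumes the tensorflow.keras import path, padding="same" on both CONV layers, and a 5×5 kernel for the second CONV layer, none of which are stated in the text. The line numbers quoted in the post refer to the downloadable file, not to this sketch.

```python
from tensorflow.keras import backend as K
from tensorflow.keras.layers import (Activation, Conv2D, Dense, Flatten,
                                     Input, MaxPooling2D)
from tensorflow.keras.models import Sequential


class LeNet:
    @staticmethod
    def build(width, height, depth, classes):
        # Default to channels-last ordering: (height, width, depth).
        inputShape = (height, width, depth)
        # Switch to channels-first ordering if the backend expects it.
        if K.image_data_format() == "channels_first":
            inputShape = (depth, height, width)

        model = Sequential()
        model.add(Input(shape=inputShape))

        # First CONV => RELU => POOL set: 20 filters, each 5x5.
        # padding="same" is an assumption; the text does not state it.
        model.add(Conv2D(20, (5, 5), padding="same"))
        model.add(Activation("relu"))
        model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

        # Second CONV => RELU => POOL set: 50 filters.
        # The 5x5 kernel size here is also an assumption.
        model.add(Conv2D(50, (5, 5), padding="same"))
        model.add(Activation("relu"))
        model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

        # Flatten, then a 500-node fully-connected layer with ReLU.
        model.add(Flatten())
        model.add(Dense(500))
        model.add(Activation("relu"))

        # One node per class, followed by the softmax classifier.
        model.add(Dense(classes))
        model.add(Activation("softmax"))

        return model


# Build the network for 28x28 RGB inputs and two classes.
model = LeNet.build(width=28, height=28, depth=3, classes=2)
model.summary()
```

Keeping the architecture in a static build method means a training script only needs to call LeNet.build with the image dimensions and class count.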

Training our Convolutional Neural Network image classifier with Keras

Let’s go ahead and get started training our image classifier using deep learning, Keras, and Python.

Note: Be sure to scroll down to the “Downloads” section to grab the code + training images. This will enable you to follow along with the post and then train your image classifier using the dataset we have put together for you.

Open up a new file, name it train_network.py, and insert the following code (or simply follow along with the code download):

On Lines 2-18 we import required packages. These packages enable us to:

  1. Load our image dataset from disk
  2. Pre-process the images
  3. Instantiate our Convolutional Neural Network
  4. Train our image classifier

Notice that on Line 3 we set the matplotlib backend to "Agg" so that we can save the plot to disk in the background. This is important if you are using a headless server to train your network (such as an Azure, AWS, or other cloud instance).

From there, we parse command line arguments:

Here we have two required command line arguments, --dataset and --model, as well as an optional path to our accuracy/loss chart, --plot.

The --dataset switch should point to the directory containing the images we will be training our image classifier on (i.e., the “Santa” and “Not Santa” images), while the --model switch controls where we will save our serialized image classifier after it has been trained. If --plot is left unspecified, it will default to plot.png in the current directory.
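The three switches described above can be expressed with Python’s argparse module along these lines. This is a sketch, not the exact code from the download: the help strings and the build_parser helper are made up for illustration, and an explicit argument list is passed to parse_args so the snippet runs without a terminal.

```python
import argparse


def build_parser():
    # Two required switches and one optional switch with a default value.
    ap = argparse.ArgumentParser()
    ap.add_argument("-d", "--dataset", required=True,
                    help="path to the input dataset directory")
    ap.add_argument("-m", "--model", required=True,
                    help="path to the output (serialized) model")
    ap.add_argument("-p", "--plot", type=str, default="plot.png",
                    help="path to the output accuracy/loss plot")
    return ap


# In a real script you would call parse_args() with no arguments so that
# the values come from the command line; a list is passed here for demo purposes.
args = vars(build_parser().parse_args(
    ["--dataset", "images", "--model", "santa_not_santa.model"]))
print(args)
```

Note that the switches use two ASCII hyphens (--dataset). If they are typed or pasted as a single long dash, argparse will report the required arguments as missing.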

Next, we’ll set some training variables, initialize lists, and gather paths to images:

On Lines 32-34 we define the number of training epochs, initial learning rate, and batch size.

Then we initialize data and label lists (Lines 38 and 39). These lists will be responsible for storing the images we load from disk along with their respective class labels.

From there we grab the paths to our input images followed by shuffling them (Lines 42-44).

Now let’s pre-process the images:

This loop loads each image, resizes it to a fixed 28×28 pixels (the spatial dimensions required for LeNet), and appends the image array to the data list (Lines 49-52). We then extract the class label from the imagePath on Lines 56-58.

We are able to perform this class label extraction since our dataset directory structure is organized in the following fashion:

Therefore, an example imagePath would be:

After extracting the label from the imagePath, the result is:

I prefer organizing deep learning image datasets in this manner as it allows us to efficiently organize our dataset and parse out class labels without having to use a separate index/lookup file.
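As a rough illustration of parsing a label out of a path, the sketch below takes the name of the directory that directly contains the image. The path images/santa/example.jpg and the helper label_from_path are hypothetical, chosen only to show the idea; your own directory and file names may differ.

```python
import os


def label_from_path(image_path):
    # The class label is the name of the directory that directly
    # contains the image, i.e. the second-to-last path component.
    return image_path.split(os.path.sep)[-2]


# Hypothetical path: <dataset directory>/<class name>/<file name>
example_path = os.path.join("images", "santa", "example.jpg")
label = label_from_path(example_path)
print(label)
```

With this layout, adding a new class is just a matter of adding a new subdirectory of images.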

Next, we’ll scale images and create the training and testing splits:

On Line 61 we further pre-process our input data by scaling the data points from [0, 255] (the minimum and maximum RGB values of the image) to the range [0, 1].

We then perform a training/testing split on the data using 75% of the images for training and 25% for testing (Lines 66 and 67). This is a typical split for this amount of data.

We also convert labels to vectors using one-hot encoding — this is handled on Lines 70 and 71.
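The three steps above (scaling, splitting, and one-hot encoding) can be sketched with plain NumPy. This swaps in NumPy for whatever helper functions the downloadable script uses, and the random stand-in images, the label values, and the seed are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in data: 8 random 28x28 RGB "images" with pixel values in [0, 255],
# plus integer class labels (0 = Not Santa, 1 = Santa in this sketch).
data = rng.integers(0, 256, size=(8, 28, 28, 3)).astype("float")
labels = np.array([0, 1, 0, 1, 1, 0, 1, 0])

# Scale the raw pixel intensities from [0, 255] to [0, 1].
data = data / 255.0

# Shuffle the indices, then use 75% for training and 25% for testing.
indices = rng.permutation(len(data))
split = int(len(data) * 0.75)
train_idx, test_idx = indices[:split], indices[split:]
trainX, testX = data[train_idx], data[test_idx]

# One-hot encode: label 0 -> [1, 0], label 1 -> [0, 1].
one_hot = np.eye(2)[labels]
trainY, testY = one_hot[train_idx], one_hot[test_idx]

print(trainX.shape, testX.shape, trainY.shape, testY.shape)
```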

Subsequently, we’ll perform some data augmentation, enabling us to generate “additional” training data by randomly transforming the input images using the parameters below:

Data augmentation is covered in depth in the Practitioner Bundle of my new book, Deep Learning for Computer Vision with Python.

Essentially, Lines 74-76 create an image generator object which performs random rotations, shifts, flips, crops, and shears on our image dataset. This allows us to use a smaller dataset and still achieve high accuracy.

Let’s move on to training our image classifier using deep learning and Keras.

We’ve elected to use LeNet for this project for two reasons:

  1. LeNet is a small Convolutional Neural Network that is easy for beginners to understand
  2. We can easily train LeNet on our Santa/Not Santa dataset without having to use a GPU

If you want to study deep learning in more depth (including ResNet, GoogLeNet, SqueezeNet, and others) please take a look at my book, Deep Learning for Computer Vision with Python.

We build our LeNet model along with the Adam optimizer on Lines 80-83. Since this is a two-class classification problem we’ll want to use binary cross-entropy as our loss function. If you are performing classification with more than two classes, be sure to swap out the loss for categorical_crossentropy.
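To get a feel for what binary cross-entropy measures, here is a small NumPy sketch of the formula applied to a single one-hot label. It is meant as an illustration of the loss, not a drop-in replacement for the Keras implementation; the function name, the eps clipping constant, and the example probabilities are assumptions.

```python
import numpy as np


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip predictions away from 0 and 1 so the logarithms stay finite.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    # Average the per-output cross-entropy terms.
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(losses))


# One-hot label for a "Santa" image: [Not Santa, Santa].
y_true = np.array([0.0, 1.0])

confident = np.array([0.02, 0.98])  # hypothetical confident, correct prediction
unsure = np.array([0.5, 0.5])       # hypothetical coin-flip prediction

confident_loss = binary_crossentropy(y_true, confident)
unsure_loss = binary_crossentropy(y_true, unsure)
print(confident_loss, unsure_loss)
```

The confident, correct prediction yields a loss close to zero, while the coin-flip prediction is penalized much more heavily, which is the behavior the optimizer exploits during training.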

Training our network is initiated on Lines 87-89, where we call model.fit_generator, supplying our data augmentation object, training/testing data, and the number of epochs we wish to train for.

Line 93 handles serializing the model to disk so we can later use our image classifier without having to retrain it.

Finally, let’s plot the results and see how our deep learning image classifier performed:

Using matplotlib, we build our plot and save the plot to disk using the --plot command line argument, which contains the path + filename.

To train the Not Santa network (after using the “Downloads” section of this blog post to download the code + images), open up a terminal and execute the following command:

As you can see, the network trained for 25 epochs and we achieved high accuracy (97.40% testing accuracy) and low loss that follows the training loss, as is apparent from the plot below:

Figure 3: Training our image classifier using deep learning, Keras, and Python.

Evaluating our Convolutional Neural Network image classifier

The next step is to evaluate our Not Santa model on example images not part of the training/testing splits.

Open up a new file, name it test_network.py, and let’s get started:

On Lines 2-7 we import our required packages. Take special notice of load_model: this function will enable us to load our serialized Convolutional Neural Network (i.e., the one we just trained in the previous section) from disk.

Next, we’ll parse our command line arguments:

We require two command line arguments: our --model and an input --image (i.e., the image we are going to classify).

From there, we’ll load the image and pre-process it:

We load the image and make a copy of it on Lines 18 and 19. The copy allows us to later recall the original image and put our label on it.

Lines 22-25 handle scaling our image to the range [0, 1], converting it to an array, and adding an extra dimension.

As I explain in my book, Deep Learning for Computer Vision with Python, we train/classify images in batches with CNNs. Adding an extra dimension to the array via np.expand_dims allows our image to have the shape (1, height, width, 3), assuming channels last ordering.

If we forget to add the dimension, it will result in an error when we call model.predict down the line.
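The effect of the extra dimension is easy to see in isolation. In this sketch a zero-filled array stands in for a real pre-processed image; only np.expand_dims is the call discussed in the text.

```python
import numpy as np

# Stand-in for a pre-processed 28x28 RGB image (channels last).
image = np.zeros((28, 28, 3), dtype="float")
print(image.shape)  # a single image, no batch dimension yet

# Add a leading "batch" axis so the array looks like a batch of one image.
batch = np.expand_dims(image, axis=0)
print(batch.shape)
```

The network expects a batch of images, so a single image must be wrapped as a batch of size one before prediction.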

From there we’ll load the Not Santa image classifier model and make a prediction:

This block is pretty self-explanatory, but since this is where the heavy lifting of this script is performed, let’s take a second and understand what’s going on under the hood.

We load the Not Santa model on Line 29 followed by making a prediction on Line 32.

And finally, we’ll use our prediction to draw on the orig image copy and display it to the screen:

We build the label (either “Santa” or “Not Santa”) on Line 35 and then choose the corresponding probability value on Line 36.

The label and proba are used on Line 37 to build the label text to show on the image, as you’ll see in the top left corner of the output images below.
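The label-building logic can be sketched without a trained model by starting from the two class probabilities. The helper name build_label, the order of the two probabilities, and the exact text format are assumptions for illustration; the example values are made up.

```python
def build_label(not_santa, santa):
    # Pick whichever class has the larger probability...
    label = "Santa" if santa > not_santa else "Not Santa"
    # ...and report that class's probability alongside it.
    proba = santa if santa > not_santa else not_santa
    return "{}: {:.2f}%".format(label, proba * 100)


# Hypothetical softmax outputs for two different images.
print(build_label(not_santa=0.02, santa=0.98))
print(build_label(not_santa=0.90, santa=0.10))
```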

We resize the image to a standard width to ensure it will fit on our screen, and then put the label text on the image (Lines 40-42).

Finally, on Line 45, we display the output image until a key has been pressed (Line 46).

Let’s give our Not Santa deep learning network a try:

Figure 4: Santa has been detected with 98% confidence using our Keras image classifier.

By golly! Our software thinks it is good ole’ St. Nick, so it really must be him!

Let’s try another image:

 

Figure 5: Using a Convolutional Neural Network, Keras, and Python to perform image classification.

Santa is correctly detected by the Not Santa detector and it looks like he’s happy to be delivering some toys!

Now, let’s perform image classification on an image that does not contain Santa:

Figure 6: Image classification with deep learning.

It looks like it’s too bright out for Santa to be flying through the sky and delivering presents in this part of the world yet (New York City) — he must still be in Europe at this time where night has fallen.

Speaking of the night and Christmas Eve, here is an image of a cold night sky:

Figure 7: Santa isn’t present in this part of the Christmas Eve sky, but he’s out there somewhere.

But it must be too early for St. Nicholas. He’s not in the above image either.

But don’t worry!

As I’ll show next week, we’ll be able to detect him sneaking down the chimney and delivering presents with a Raspberry Pi.

Limitations of our deep learning image classification model

There are a number of limitations to our image classifier.

The first one is that the 28×28 pixel images are quite small (the LeNet architecture was originally designed to recognize handwritten digits, not objects in photos).

For some example images (where Santa is already small), resizing the input image down to 28×28 pixels effectively reduces Santa down to a tiny red/white blob that is only 2-3 pixels in size.
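A quick back-of-the-envelope calculation shows how this happens. The numbers below are hypothetical (a Santa roughly 50 pixels wide in a 500-pixel-wide photo), and the helper size_after_resize is made up for illustration.

```python
def size_after_resize(object_px, image_px, target_px=28):
    # An object keeps the same fraction of the image after resizing,
    # so its new size is that fraction of the target dimension.
    return object_px * target_px / image_px


# Hypothetical example: Santa spans 50 of 500 pixels before resizing.
santa_px = size_after_resize(object_px=50, image_px=500)
print(santa_px)  # fewer than three pixels wide at 28x28
```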

In these types of situations it’s likely that our LeNet model is just predicting when there is a significant amount of red and white localized together in our input image (and likely green as well, as red, green, and white are Christmas colors).

State-of-the-art Convolutional Neural Networks normally accept images that are 200-300 pixels along their maximum dimension — these larger images would help us build a more robust Not Santa classifier. However, using larger resolution images would also require us to utilize a deeper network architecture, which in turn would mean that we need to gather additional training data and utilize a more computationally expensive training process.

This is certainly a possibility, but is also outside the scope of this blog post.

Therefore, if you want to improve our Not Santa app I would suggest you:

  1. Gather additional training data (ideally, 5,000+ example “Santa” images).
  2. Utilize higher resolution images during training. I imagine 64×64 pixels would produce higher accuracy. 128×128 pixels would likely be ideal (although I have not tried this).
  3. Use a deeper network architecture during training.
  4. Read through my book, Deep Learning for Computer Vision with Python, where I discuss training Convolutional Neural Networks on your own custom datasets in more detail.

Despite these limitations, I was incredibly surprised with how well the Not Santa app performed (as I’ll discuss next week). I was expecting a decent number of false-positives but the network was surprisingly robust given how small it is.

Summary

In today’s blog post you learned how to train the seminal LeNet architecture on a series of images containing “Santa” and “Not Santa”, with our end goal being to build an app similar to HBO’s Silicon Valley Not Hotdog application.

We were able to gather our “Santa” dataset (~460 images) by following our previous post on gathering deep learning images via Google Images.

The “Not Santa” dataset was created by sampling the UKBench dataset (where no images contain Santa).

We then evaluated our network on a series of testing images — in each case our Not Santa model correctly classified the input image.

In our next blog post, we’ll deploy our trained Convolutional Neural Network to the Raspberry Pi to finish building our Not Santa app.

What now?

Now that you’ve learned how to train your first Convolutional Neural Network, I’m willing to bet that you’re interested in:

  • Mastering the fundamentals of machine learning and neural networks
  • Studying deep learning in more detail
  • Training your own Convolutional Neural Networks from scratch

If so, you’ll want to take a look at my new book, Deep Learning for Computer Vision with Python.

Inside the book you’ll find:

  • Super-practical walkthroughs
  • Hands-on tutorials (with lots of code)
  • Detailed, thorough guides to help you replicate state-of-the-art results from seminal deep learning publications.

To learn more about my new book (and start your journey to deep learning mastery), just click here.

Otherwise, be sure to enter your email address in the form below to be notified when new deep learning posts are published here on PyImageSearch.

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!

71 Responses to Image classification with Keras and deep learning

  1. John Beale December 11, 2017 at 1:07 pm #

    Very clearly presented, as always! Looking forward to the next installment on deploying on R-Pi. The question I’m most interested in is what the tradeoff looks like between image resolution and processing time. For a given network architecture, is there some equation that can tell you, for a [x,y] pixel input, that it will take N FLOPs (or on given hardware, T seconds) to do the forward propagation through the network? I understand that there is a separate question of mAP scores versus input resolution.

    • Adrian Rosebrock December 11, 2017 at 4:33 pm #

      It’s not an exact computation because there are many parameters to consider. There is the speed of the physical hardware itself. Then you have the libraries used to accelerate learning. On top of that is the actual network architecture. Is the network fully convolutional? Or do you have FC layers in there? All of these choices have an impact and really the best way to benchmark is with system timings. Quite an exhaustive one can be found here.

  2. Ayesha shakeel December 11, 2017 at 3:40 pm #

    Hy Adrian, hope you’re having a great time. Can you please give me a Christmas gift by helping me resolve this issue? i would be very grateful
    The issue is: I am following your tutorial to install open CV 3 and python 2.7 on my raspberry pi3. here’s the link to the tutorial https://www.pyimagesearch.com/2016/04/18/install-guide-raspberry-pi-3-raspbian-jessie-opencv-3/
    I have followed all the steps and i get the outputs at each step described by you but when i come to the step of compiling cv, using make -j4(i have tried simple make also), i get this error “fatal error : stdlib.h >> no such file or directory found”.
    i have checked and i have std library in the path /usr/include/ c++/6/stdlib.h, still why does it give this error. please please help me resolve it, my project’s deadline is approaching and i need to have open CV installed for that. Thank you!
    regards
    Ayesha

    • Adrian Rosebrock December 11, 2017 at 4:35 pm #

      This sounds like an issue with the precompiled headers. Delete your “build” directory, re-create it, and re-run CMake, but this time include the following flag:

      -D ENABLE_PRECOMPILED_HEADERS=OFF

      The compile might take slightly longer but it should resolve the issue.

  3. sam December 11, 2017 at 11:02 pm #

    what would be the change for it to do image classification on 4 classes?

    • Adrian Rosebrock December 12, 2017 at 9:07 am #

      You would use categorical cross-entropy as your loss function and you would change classes=4 in the LeNet instantiation. If you’re just getting started with deep learning, please take a look at my new book, Deep Learning for Computer Vision with Python. This book will help you learn the fundamentals of deep learning for computer vision applications. Be sure to take a look!

  4. RomRoc December 12, 2017 at 4:22 am #

    Hello Adrian, as always an incredibly useful post. You should know that I started learning opencv + computer vision + deep learning from 2 months, and your blog was the starting point of my study.

    It could be nice if next year you will make a post for object detection using deep learning.

    Thanks for your work, and have a great Christmas holiday!

    • Adrian Rosebrock December 12, 2017 at 8:59 am #

      It’s great you have been enjoying the PyImageSearch blog! Congratulations on your computer vision + deep learning journey.

      I actually have a blog post on deep learning object detection. I’m covering how to train your own custom deep learning object detectors inside Deep Learning for Computer Vision with Python.

      • RomRoc December 15, 2017 at 7:27 am #

        Amazing post, this is the only one post I found in Internet that describes properly opencv functionality for deep learning object detection. I have to say even opencv official documentation is not very clear as your post.

        So, semantic segmentation using deep learning and opencv could be a nice post in your blog for next year 🙂

        Bye

        • Adrian Rosebrock December 15, 2017 at 8:14 am #

          I’m glad you found the blog post useful! I really appreciate the kind words as well, thank you. I will consider doing semantic segmentation as a blog post in the future as well.

  5. QV December 12, 2017 at 1:35 pm #

    HI Adrian, I come across your post to find some info that relate to my current project, but the most impression I am left with is your emotional Christmas story. Like you, I also had lot of struggle growing up in my family. But Christmas is always a wonderful time. And it is very compelling that you find a way to utilize technology to express your feeling and your story. Thank you for sharing with us!

    • Adrian Rosebrock December 13, 2017 at 1:39 pm #

      Thank you for the comment! I don’t normally share personal information on the PyImageSearch blog but the past month it’s felt appropriate. Best wishes to you, your family, and have a Merry Christmas and Happy Holidays.

  6. Jeff December 12, 2017 at 3:04 pm #

    Hello Adrian,

    I am trying to use your code above but unfortunately I keep getting error.
    Where do I have to write the path to the images folder where the Santa images are located. And where do I write the path for the NOT Santa images?

    • John December 21, 2017 at 8:40 pm #

      Hi Adrian,

      Unfortunately, I’m having the same issue as well. You say that “The --dataset switch should point to the directory containing the images we will be training our image classifier on (i.e., the “Santa” and “Not Santa” images)…” But where do I specify that?

      I’ve tried specifying it (replacing “images” with the path to santa images) in the following line, but it doesnt seem to work.

      $ python train_network.py –dataset images –model santa_not_santa.model

      Could you please help?

      Thanks and Merry Christmas

      I’ve tried

      • John December 21, 2017 at 9:00 pm #

        Specifically, I’m wondering about lines 9-15 of train_network.py and how I specify the path to the dataset on any of those lines. I’ve tried a few things, but i keep getting these errors.

        Using TensorFlow backend.

        usage: train_network.py [-h] -d DATASET -m MODEL [-p PLOT]
        train_network.py: error: the following arguments are required: -d/--dataset, -m/--model

        Could you please provide an example code with pathways? Any help would be appreciated. Thanks

        • Adrian Rosebrock December 22, 2017 at 6:42 am #

          Hi John — first, please make sure you use the “Downloads” section of this blog post to download the source code. From there unzip the archive and use your terminal to navigate to where the unzipped code is. You do not need to modify any code. Use your terminal to execute the code as is done in the blog post. If you’re new to command line arguments please give this tutorial a read.

          • John December 22, 2017 at 10:15 am #

            Hi Adrian,
            Thanks for your response. I’ve downloaded the data. but I keep getting errors when I try to run the following line in the terminal:

            python train_network.py –dataset images –model santa_not_santa.model

            File “train_network.py”, line 9, in
            from keras.preprocessing.image import ImageDataGenerator
            ModuleNotFoundError: No module named ‘keras’

            It’s strange, because thus far, I don’t have any issues importing keras and running python scripts with it. More generally, I’m wondering how to create the santa_not_santa.model as well (I might have missed it, but it doesn’t appear to be in the blog post).

            If you could clarify the issue for me, that would be fantastic!

            Thanks again,

          • Adrian Rosebrock December 26, 2017 at 4:41 pm #

            Running the following command will generate the santa_not_santa.model file:

            $ python train_network.py --dataset images --model santa_not_santa.model

            Since that is the script causing the error your model is not trained and saved to disk.

            As for the Keras import issue, that is very strange. You mentioned being able to import and use Keras. Were you using a Python virtual environment? Unfortunately without physical access to your machine I’m not sure what the particular error is.

    • Cassandra December 21, 2017 at 10:19 pm #

      I’m having the same issue as well. not sure where to specify the file path for the images. Any help would be appreciated

      • Adrian Rosebrock December 22, 2017 at 8:33 am #

        Hi Cassandra — Be sure to use the “Downloads” section of the blog post to download the model and data. You’ll need to use the same commands I do in the blog post. For a review of command line arguments, please see this tutorial.

  7. Jeff December 12, 2017 at 3:07 pm #

    Sorry Adrian,

    I forgot to mention train_network.py returns..

    ModuleNotFoundError: No module named ‘pyimagesearch’

  8. Yuri December 12, 2017 at 4:03 pm #

    This is an excellent post and systematically submitted information. In the framework of this network, is it possible to obtain information about the coordinates of the object, so that it is possible to construct a rectangle that allocates it?

    • Adrian Rosebrock December 13, 2017 at 1:38 pm #

      With LeNet, no, you cannot obtain the (x, y)-coordinates of the object. You would need to train either a YOLO, SSD, or R-CNN network. Pre-trained ones can be found here. If you want to train them from scratch please take a look at my book, Deep Learning for Computer Vision with Python where I discuss the process in detail.

  9. Bharath Kumar December 12, 2017 at 10:33 pm #

    Hey, yours are the go-to tutorials for computer vision. Why don’t you teach on YouTube? Just curious!

    • Adrian Rosebrock December 13, 2017 at 1:37 pm #

      I’ve considered it! Maybe in the future I will 🙂

  10. Alice December 13, 2017 at 1:53 am #

    I find Computer Vision, Deep Learning, and Python very difficult, but I have not given up because your posts and instructions make me feel that one day I can get a little program to run. I haven’t had any success yet after many attempts, but as I said, I won’t give up. I wish you a Merry Christmas and a Happy New Year in the next few weeks.

    • Adrian Rosebrock December 13, 2017 at 1:36 pm #

      Thank you so much for the comment Alice, comments like these really put a smile on my face 🙂 Keep trying and keep working at it. What particular problem are you having trying to get your script to run?

      • Alice December 13, 2017 at 10:35 pm #

        I ran into a very common problem that I saw many people report on Stack Overflow:

        “Error: ‘::hypot’ has not been declared” in cmath

        • Adrian Rosebrock December 15, 2017 at 8:28 am #

          Unfortunately I have not encountered that error before. What library is throwing that error?

          • Alice December 17, 2017 at 9:48 pm #

            Well, I solved it and now the program is running well. I am thinking of turning it into an Android app where the input is taken from the phone’s camera and the output shows santa or not-santa in real time, like your object-recognition demo. Please suggest some tutorials I should follow. Thanks

  11. Jeff December 13, 2017 at 2:09 am #

    Hello Adrian,

    How do I get the following library:

    from pyimagesearch.lenet import LeNet

    • Adrian Rosebrock December 13, 2017 at 1:36 pm #

      Hi Jeff, please make sure you use the “Downloads” section of this blog post. Once you download the code you’ll find the necessary Python files and project structure.

  12. Peter December 13, 2017 at 6:12 am #

    Hi Adrian, good stuff. I don’t seem to have imutils, as in

    from imutils import paths

    Is this from an earlier post or do I have to conda it?

  13. Peter December 13, 2017 at 8:47 am #

    No worries Sheila, I found it.

    • Adrian Rosebrock December 13, 2017 at 1:35 pm #

      Hi Peter, congrats on figuring it out. I just want to make sure that if other readers have the same question, they know they can find imutils on GitHub and/or install it via pip:

      $ pip install imutils

  14. AsafOron December 13, 2017 at 10:56 am #

    Very well presented and easy to follow, wonderful !

    Can one use this same model for object detection? That is, you have a big image, say 500×500, with multiple Santas in it, and you need to identify each Santa, put a bounding box around it, and provide a Santa count. I believe it can be done by sliding a 28×28 window over the big image and running each window through the model, but that seems very inefficient, not to mention that the Santas in the images may vary in size. Is there a better way?

    • Adrian Rosebrock December 13, 2017 at 1:40 pm #

      Please see my reply to Yuri above where I discuss this question. You can use sliding windows, image pyramids, and non-maxima suppression but you would be better off training a network using an object detection framework.
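
For readers curious what the sliding window, image pyramid, and non-maxima suppression combination looks like, here is a rough sketch in plain Python. It works only on coordinates: the function names, the scale factor, the step size, and the 0.3 overlap threshold are all illustrative assumptions, and the step that would score each window with the trained classifier is deliberately left out.

```python
def pyramid_sizes(width, height, scale=1.5, min_size=28):
    """Yield shrinking (width, height) pairs until a side drops below min_size."""
    while width >= min_size and height >= min_size:
        yield (width, height)
        width, height = int(width / scale), int(height / scale)


def sliding_windows(width, height, window=28, step=14):
    """Yield (x1, y1, x2, y2) boxes that tile an image of the given size."""
    for y in range(0, height - window + 1, step):
        for x in range(0, width - window + 1, step):
            yield (x, y, x + window, y + window)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def non_max_suppression(detections, iou_threshold=0.3):
    """Keep the highest-scoring boxes, dropping any that overlap a kept box.

    detections is a list of (score, box) pairs, where the score would come
    from running the classifier on that window (not shown here).
    """
    kept = []
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        if all(iou(box, kept_box) <= iou_threshold for _, kept_box in kept):
            kept.append((score, box))
    return kept
```

Every pyramid level multiplies the number of windows the classifier has to score, which is the inefficiency raised in the question and one reason a dedicated object detection framework is usually the better choice.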

  15. Subash Gandyer December 14, 2017 at 7:06 am #

    model.add(Dense(500))

    Why is it 500 and not 5000 or any other number? How did you arrive at this?

    • Adrian Rosebrock December 15, 2017 at 8:23 am #

      We are following the exact implementation of the LeNet architecture. Please see the post for more details.

  16. menokey December 18, 2017 at 10:03 pm #

    Hello Adrian,
    Why are we appending all images into one array, as in
    data.append(image)

  17. menokey December 18, 2017 at 10:59 pm #

    For the directory structure of pyimagesearch, what is the networks folder and why do we need another lenet.py inside it?

    • Adrian Rosebrock December 19, 2017 at 4:16 pm #

      Please use the “Downloads” section of this blog post to download the code + directory structure so you can compare yours to mine. This will help you understand the proper directory structure.

  18. stoiclemon December 21, 2017 at 6:55 pm #

    Do I have to install sklearn.model separately? I can’t seem to find it anywhere in the Downloads folder.

    • Adrian Rosebrock December 22, 2017 at 6:45 am #

      Yes, you need to install scikit-learn via:

      $ pip install scikit-learn

  19. Chandra December 23, 2017 at 3:34 am #

    Hi,

    Thank you for providing this tutorial. I have a simple question.

    You said that in lines 22-25 you scale the images by dividing by 255. I believe that is because you expect the input images to have many colors. But what if the input is a black-and-white photo or a roentgen (X-ray) image? Does it need to be scaled? How would you scale it?

    Please advise

    • Adrian Rosebrock December 26, 2017 at 4:34 pm #

      The scaling is done to bring the pixel values from [0, 255] down to [0, 1]. This gives the neural network less “freedom” in values and enables the network to learn faster.
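
To make the arithmetic concrete: an 8-bit grayscale image uses the same [0, 255] range per pixel as each channel of a color image, so the same division applies. The sketch below is plain Python rather than the post's own code; the scale_pixels helper and its bit_depth parameter are hypothetical, added only to show how the divisor would change for an image stored at a different bit depth (for example, 16 bits).

```python
def scale_pixels(pixels, bit_depth=8):
    """Scale integer pixel intensities into the range [0, 1].

    pixels is a flat list of intensities; bit_depth is the number of bits
    used to store each one (8 for ordinary photos, color or grayscale).
    """
    # Largest value representable at this bit depth: 255 for 8 bits.
    max_value = (1 << bit_depth) - 1
    return [p / max_value for p in pixels]


# Hypothetical sample intensities, not taken from the dataset.
print(scale_pixels([0, 51, 255]))
print(scale_pixels([0, 65535], bit_depth=16))
```

With NumPy arrays the same operation is a single elementwise division of the array by the maximum value.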

  20. kaisar khatak December 26, 2017 at 1:46 am #

    Cool post. I think you already identified the issue with the size of the images and network. The LeNet model is just predicting when there is a significant amount of red and white localized together in the input image. If you feed the program any images/frames with a lot of red and/or white, the program will generate false positives.

    You have identified some solutions as well:
    Gather additional training data
    Utilize higher resolution images during training. I imagine 64×64 pixels would produce higher accuracy. 128×128 pixels would likely be ideal (although I have not tried this).
    Use a deeper network architecture during training.

    Maybe, try using YOLO/SSD for object localization???

    BTW, I used the SNOW app (ios/android) and Santa Claus face filter for testing….

    video:
    https://drive.google.com/file/d/14AjetH-vRosXSoymbz7wnv-iOcTXyuYe/view?usp=sharing

    image:
    https://drive.google.com/file/d/1PXdtA-a1utL12Uy265-qsiOTR8b1phhL/view?usp=sharing

    Happy Holidays!

    • Adrian Rosebrock December 26, 2017 at 3:52 pm #

      Thanks for sharing, Kaisar! Yes, you’re absolutely right — the model will report false positives when there is a significant amount of red/white. YOLO or SSD could be used for localization/detection, but that would again require much larger input resolution and ideally more training data.

  21. Abder-Rahman Ali December 28, 2017 at 8:31 am #

    Thanks so much for this nice post. The issue is that the program is classifying all the images in the “examples” directory as “not santa” with 100%.

    The plot also looks like this (which is weird): https://www.dropbox.com/s/24q26wvf0ljihdd/fig.png?dl=1

    This is the command I used to train the network:

    $ python train_network.py --dataset /full-path/image-classification-keras/images/santa --dataset /full-path/image-classification-keras/images/not_santa --model /full-path/image-classification-keras/santa.model --plot /full-path/image-classification-keras/

    Any ideas where I might be having some mistakes?

    Thanks.

    • Adrian Rosebrock December 28, 2017 at 2:08 pm #

      Please take a look at the example command used to execute the script:

      $ python train_network.py --dataset images --model santa_not_santa.model

      The “images” directory should contain two sub-directories: “santa” and “not_santa”. Your command does not reflect this. Use the “Downloads” section of the blog post to compare your directory structure to mine.
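
The directory layout matters because the class label is typically derived from the name of the folder each image sits in. The following is a small sketch of that idea under the assumption that “santa” maps to 1 and everything else to 0; the helper names and the list of file extensions are illustrative, not the post's actual code.

```python
from pathlib import Path

# Assumed set of image file extensions; adjust for your own dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def label_from_path(image_path):
    """Return 1 if the image's parent folder is named "santa", else 0."""
    return 1 if Path(image_path).parent.name == "santa" else 0


def collect_labeled_images(dataset_dir):
    """Walk dataset_dir and return a sorted list of (path, label) pairs."""
    pairs = []
    for path in sorted(Path(dataset_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            pairs.append((str(path), label_from_path(path)))
    return pairs
```

Pointing --dataset at a single class folder means every image found shares one parent folder name and therefore one label, which would be consistent with a model that predicts the same class for every input.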

      • Abder-Rahman Ali December 28, 2017 at 2:48 pm #

        Thanks so much Adrian. It works now 🙂 I just get the following warning:

        libpng warning: iCCP: known incorrect sRGB profile

        I downloaded the code from the link you sent through email, and I’m not sure how the “examples” folder came in.

  22. Abder-Rahman Ali December 30, 2017 at 10:21 pm #

    Hello Adrian, when I downloaded the code, I noticed that the “examples” directory is within the “images” directory. Shouldn’t it be separate? Thanks.

    • Adrian Rosebrock December 31, 2017 at 10:01 am #

      Great catch! I added the “examples” directory after I had trained the model. The “examples” directory should be moved up a level. I’ll get the blog post + code updated.

      • Adrian Rosebrock January 7, 2018 at 9:03 am #

        Just a quick update: I have updated the code + downloads for this post to move the “examples” up one directory.

  23. Mohammed January 3, 2018 at 3:32 am #

    I am new to this area and I want to ask about feature extraction. How do I decide the best number of epochs at which to stop training and get a feature vector for every image in the dataset?

    • Adrian Rosebrock January 3, 2018 at 12:54 pm #

      Hey Mohammed — typically we only perform feature extraction on a pre-trained network. We wouldn’t typically train a network and use it only for feature extraction. Could you elaborate a bit more on your project and what your end goal is?

  24. judson antu January 5, 2018 at 8:50 am #

    hey Adrian,
    How well would this method work for telling rotten apples from good ones, or for any other fruit? Would CPU-only training be enough to reach that level of accuracy?

    And what about resizing the images to more than 28×28 pixels, for example 56×56?

    • Adrian Rosebrock January 5, 2018 at 1:24 pm #

      It’s hard to say without seeing a dataset first. Your first step should be to collect the dataset and then decide on a model and input spatial dimensions.

      • Judson antu January 5, 2018 at 9:42 pm #

        Okay, so in my case the classification would be done in a controlled environment: the fruit would be passing by on a conveyor belt. In that case, would we still need diversity in the images?

        • Adrian Rosebrock January 8, 2018 at 2:57 pm #

          If it’s a controlled environment you can reduce the number of images you need, but I would still suggest 500 images (ideally 1,000) per object class that you want to recognize. If your conveyor belt is already up and running, put a camera on it and start gathering images. You can label them later. This would be the fastest way to get up and running.

  25. Andy January 8, 2018 at 12:18 pm #

    Adrian,

    Thank you for a great tutorial.

    Question – what does the “not santa” dataset really need to represent for this to work effectively for other types of problems?

    For example, if our “not santa” dataset does not contain many images of things like strawberries, watermelons, etc – could it mistakenly classify those as santa (red, green, white, etc.)?

    • Adrian Rosebrock January 8, 2018 at 2:32 pm #

      The architecture used in this example is pretty limited at only 28×28 pixels. State-of-the-art neural networks accept images that are typically 200-300 pixels along their largest dimension. Ensuring your images are that large and using a network such as ResNet, VGGNet, SqueezeNet, or another current architecture is a good starting point.

      From there you need to gather data, ideally 1,000 images per object class that you want to recognize.

  26. Jim Walker January 16, 2018 at 3:43 pm #

    Adrian:
    Thanks for the project. A problem I am having is this error: If on CPU, do you have a BLAS library installed Theano can link against? On the CPU we do not support float16.

    I looked up BLAS libraries but didn’t get very far…What does it mean and how can I correct it?
    Thanks for your help.

    • Adrian Rosebrock January 17, 2018 at 10:19 am #

      BLAS is a linear algebra optimization library. Are you using Keras with a TensorFlow or Theano backend? And which operating system?

      • Jim Walker January 17, 2018 at 1:04 pm #

        Theano backend with Windows 10

        • Adrian Rosebrock January 18, 2018 at 8:58 am #

          It sounds like you need to install BLAS on your Windows system then reinstall Theano and Keras. I haven’t used Windows in a good many years and I do not support Windows here on the PyImageSearch blog. In general I recommend using a Unix-based operating system such as Ubuntu or macOS for deep learning. Using Windows will likely lead to trouble. Additionally, consider using the TensorFlow backend as Theano is no longer being developed.

  27. Akbar H January 16, 2018 at 7:23 pm #

    (notSanta, santa) = model.predict(image)[0]

    Are the labels notSanta and santa the same as 0 and 1?

    thanks.

    • Adrian Rosebrock January 17, 2018 at 10:17 am #

      notSanta is the probability of the “not santa” label while santa is the probability of the “santa” label. The probability can be in the range [0, 1.0].
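
Put differently, the two values are probabilities, and their positions in the output line up with the class indices. A minimal sketch of turning them into a readable label follows; it assumes “not santa” was encoded as class 0 and “santa” as class 1 during training, and the describe_prediction helper and the sample numbers are hypothetical.

```python
# Assumed class order: index 0 is "not santa", index 1 is "santa".
CLASS_NAMES = ("Not Santa", "Santa")


def describe_prediction(probabilities):
    """Return (label, probability) for the most likely class."""
    index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return CLASS_NAMES[index], probabilities[index]


# Hypothetical values standing in for model.predict(image)[0].
not_santa, santa = 0.02, 0.98
label, probability = describe_prediction((not_santa, santa))
print(f"{label}: {probability * 100:.2f}%")
```

So 0 and 1 are the class indices used during training, while notSanta and santa hold the probabilities the network assigns to those two classes.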

  28. isra60 January 17, 2018 at 2:34 am #

    Hi Adrian.

    I’m really interested in this tutorial and I want to learn from it for my own purposes.

    Have you ever tried to train with thermal or infrared images? Any hints on how to do this?
    Maybe it is not possible because these models and detectors rely on color, or maybe we can train them some other way.

    For visible images we have PASCAL VOC 2012 to benchmark our models. Do you know of a benchmark for thermal images?

    Thank you

    • Adrian Rosebrock January 17, 2018 at 10:10 am #

      I have not tried to train a network on thermal or infrared images, but the process would be the same. Ensure Keras and/or OpenCV can load them, apply labels to them, and train. That would at least give you a reasonable benchmark to improve upon.

