k-NN classifier for image classification


Now that we’ve had a taste of Deep Learning and Convolutional Neural Networks in last week’s blog post on LeNet, we’re going to take a step back and start to study machine learning in the context of image classification in more depth.

To start, we’ll be reviewing the k-Nearest Neighbor (k-NN) classifier, arguably the simplest, easiest-to-understand machine learning algorithm. In fact, k-NN is so simple that it doesn’t perform any “learning” at all!

In the remainder of this blog post, I’ll detail how the k-NN classifier works. We’ll then apply k-NN to the Kaggle Dogs vs. Cats dataset, a subset of the Asirra dataset from Microsoft.

The goal of the Dogs vs. Cats dataset, as the name suggests, is to classify whether a given image contains a dog or a cat. We’ll be using this dataset a lot in future blog posts (for reasons I’ll explain later in this tutorial), so make sure you take the time now to read through this post and familiarize yourself with the dataset.

All that said, let’s get started implementing k-NN for image classification to recognize dogs vs. cats in images!

k-NN classifier for image classification

After getting your first taste of Convolutional Neural Networks last week, you’re probably feeling like we’re taking a big step backward by discussing k-NN today.

What gives?

Well, here’s the deal.

I once wrote a (controversial) blog post on getting off the deep learning bandwagon and getting some perspective. Despite the antagonizing title, the overall theme of this post centered around various trends in machine learning history, such as Neural Networks (and how research in NNs almost died in the 70-80’s), Support Vector Machines, and Ensemble methods.

When each of these methods was introduced, researchers and practitioners were equipped with new, powerful techniques. In essence, they were given a hammer and every problem looked like a nail, when in reality, all they needed was a few simple turns of a Phillips head to solve a particular problem.

I have news for you: Deep learning is no different.

Go to the vast majority of popular machine learning and computer vision conferences and look at the recent list of publications. What is the overarching theme?

Deep learning.

Then, hop on large LinkedIn groups related to computer vision and machine learning. What are many people asking about?

How to apply deep learning to their datasets.

After that, go over to popular computer science sub-reddits such as /r/machinelearning. What tutorials are the most consistently upvoted?

You guessed it: deep learning.

Here’s the bottom line:

Yes, I will teach you about Deep Learning and Convolutional Neural Networks on this blog — but you’re damn-well going to understand that Deep Learning is just a TOOL, and like any other tool, there is a right and wrong time to use it.

Because of this, it’s important for us to understand the basics of machine learning before we progress too far. Over the next few weeks, I’ll be discussing the basics of machine learning and Neural Networks, eventually building up to Deep Learning (where you’ll be able to better appreciate the inner workings of these algorithms).

Kaggle Dogs vs. Cats dataset

The Dogs vs. Cats dataset was actually part of a Kaggle challenge a few years back. The challenge itself was simple: given an image, predict whether it contained a dog or a cat:

Figure 1: Examples from the Kaggle Dogs vs. Cats dataset.

Simple enough. But if you know anything about image classification, you’ll understand that the following factors make the problem significantly harder than it might appear on the surface:

  • Viewpoint variation
  • Scale variation
  • Deformation
  • Occlusion
  • Background clutter
  • Intra-class variation

By simply randomly guessing, you should be able to reach 50% accuracy (since there are only two class labels). A machine learning algorithm will need to obtain > 50% accuracy in order to demonstrate that it has in fact “learned” something (or found an underlying pattern in the data).

Personally, I love the Dogs vs. Cats challenge, especially for teaching Deep Learning.


The dataset is simple enough to wrap your head around — there are only two classes: “dog” or “cat”.

However, the dataset is nicely sized, containing 25,000 images in the training data. This means that you have enough data to train data-hungry Convolutional Neural Networks from scratch.

We’ll be using this dataset a lot in future blog posts. I’ve actually included it in the “Downloads” section of this blog post for your convenience, so scroll down to grab the code + data before you follow along.

Project structure

Once you’ve downloaded the archive for this blog post, unzip it to someplace convenient. From there let’s take a look at the project directory structure:

The Kaggle dataset is included in the kaggle_dogs_vs_cats/train  directory (it comes from train.zip  available on the Kaggle webpage).

We’ll be reviewing one Python script today: knn_classifier.py. This file will load the dataset, set up and run the k-NN classifier, and print out the evaluation metrics.

How does the k-NN classifier work?

The k-Nearest Neighbor classifier is by far the simplest machine learning/image classification algorithm. In fact, it’s so simple that it doesn’t actually “learn” anything.

Internally, this algorithm simply relies on the distance between feature vectors, much like building an image search engine. Only this time, we have the labels associated with each image, so we can predict and return an actual category for the image.

Simply put, the k-NN algorithm classifies unknown data points by finding the most common class among the k-closest examples. Each data point in the k closest examples casts a vote and the category with the most votes wins!

Or, in plain English: “Tell me who your neighbors are, and I’ll tell you who you are.”
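
The voting idea can be sketched in a few lines of plain Python. This is a minimal illustration rather than the implementation used later in this post; the function name knn_predict and the toy (fluffiness, coat lightness) values are made up for the example.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=1):
    """Return the majority label among the k training points closest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance only and keep the k nearest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Each neighbor casts one vote; the most common label wins.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data: (fluffiness, coat lightness) pairs, invented for illustration.
points = [(0.2, 0.1), (0.3, 0.2), (0.1, 0.3), (0.8, 0.9), (0.9, 0.8), (0.7, 0.7)]
labels = ["dog", "dog", "dog", "cat", "cat", "cat"]

print(knn_predict(points, labels, (0.75, 0.85), k=3))  # cat
```

Note that nothing is fitted ahead of time: all of the work happens at prediction time, which is why k-NN is said not to “learn” anything.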

To visualize this, take a look at the following toy example where I have plotted the “fluffiness” of animals along the x-axis and the lightness of their coat on the y-axis:

Figure 2: Plotting the fluffiness of animals along the x-axis and the lightness of their coat on the y-axis.

Here we can see there are two categories of images and that each of the data points within each respective category are grouped relatively close together in an n-dimensional space. Our dogs tend to have dark coats which are not very fluffy while our cats have very light coats that are extremely fluffy.

This implies that the distance between two data points in the red circle is much smaller than the distance between a data point in the red circle and a data point in the blue circle.

In order to apply k-Nearest Neighbor classification, we need to define a distance metric or similarity function. Common choices include the Euclidean distance:

Figure 3: The Euclidean distance.
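
For two N-dimensional feature vectors p and q (the symbol names are chosen here for illustration), the standard definition of the Euclidean distance is:

```latex
d(\mathbf{p}, \mathbf{q}) = \sqrt{\sum_{i=1}^{N} (q_i - p_i)^2}
```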

And the Manhattan/city block distance:

Figure 4: The Manhattan/city block distance.
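
Using the same illustrative symbols (two N-dimensional vectors p and q), the standard definition of the Manhattan distance sums the absolute differences along each dimension:

```latex
d(\mathbf{p}, \mathbf{q}) = \sum_{i=1}^{N} \lvert q_i - p_i \rvert
```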

Other distance metrics/similarity functions can be used depending on your type of data (the chi-squared distance is often used for distributions [i.e., histograms]). In today’s blog post, for the sake of simplicity, we’ll be using the Euclidean distance to compare images for similarity.
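
To make the two distances concrete, here is a small pure-Python sketch. The helper names euclidean and manhattan are illustrative only and are not functions from OpenCV or scikit-learn.

```python
import math


def euclidean(p, q):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """City block distance: sum of the absolute differences per dimension."""
    return sum(abs(b - a) for a, b in zip(p, q))


p, q = (0, 0), (3, 4)
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7
```

The same pair of points is 5 units apart “as the crow flies” but 7 units apart when you can only move along the axes, so the choice of metric can change which neighbor counts as the closest.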

Implementing k-NN for image classification with Python

Now that we’ve discussed what the k-NN algorithm is, along with what dataset we’re going to apply it to, let’s write some code to actually perform image classification using k-NN.

Open up a new file, name it knn_classifier.py , and let’s get coding:

We start off on Lines 2-9 by importing our required Python packages. If you haven’t already installed the scikit-learn library, then you’ll want to follow these instructions and install it now.

Note: This blog post has been updated to be compatible with the future scikit-learn==0.20  where sklearn.cross_validation  has been replaced by sklearn.model_selection .

Secondly, we’ll be using the imutils library, a package that I have created to store common computer vision processing functions. If you do not have imutils installed, you’ll want to do that now (for example, with pip install imutils).

Next, we are going to define two methods to take an input image and convert it to a feature vector, or a list of numbers that quantify the contents of an image. The first method can be seen below:

The image_to_feature_vector  method is an extremely naive function that simply takes an input image  and resizes it to a fixed width and height ( size ), and then flattens the RGB pixel intensities into a single list of numbers.

This means that our input image  will be shrunk to 32 x 32 pixels, and given three channels for each Red, Green, and Blue component respectively, our output “feature vector” will be a list of 32 x 32 x 3 = 3,072 numbers.
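
The resize-and-flatten idea can be sketched with NumPy alone. This is an approximation for illustration, not the code from the download: the post resizes with OpenCV, whereas the sketch below assumes a simple nearest-neighbor sampling as a stand-in so that it has no OpenCV dependency.

```python
import numpy as np


def image_to_feature_vector(image, size=(32, 32)):
    """Shrink an H x W x 3 image to `size` and flatten it into a 1-D vector."""
    width, height = size
    # Nearest-neighbor sampling: pick evenly spaced source rows and columns.
    # (Assumed stand-in for an OpenCV resize so the sketch needs only NumPy.)
    rows = np.arange(height) * image.shape[0] // height
    cols = np.arange(width) * image.shape[1] // width
    resized = image[rows][:, cols]
    # Flatten the 32 x 32 x 3 array into a single list of numbers.
    return resized.flatten()


# A random 240 x 320 image with three 8-bit channels stands in for a photo.
fake_image = np.random.randint(0, 256, size=(240, 320, 3), dtype=np.uint8)
vector = image_to_feature_vector(fake_image)
print(vector.shape)  # (3072,)
```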

Strictly speaking, the output of image_to_feature_vector  is not a true “feature vector” since we tend to think of “features” and “descriptors” as abstract quantifications of the image contents.

Furthermore, utilizing raw pixel intensities as inputs to machine learning algorithms tends to yield poor results as even small changes in rotation, translation, viewpoint, scale, etc., can dramatically influence the image itself (and thus your output feature representation).

Note: As we’ll find out in later tutorials, Convolutional Neural Networks obtain fantastic results using raw pixel intensities as inputs — but this is because they learn a robust set of discriminating filters during the training process.

We then define our second method, this one called extract_color_histogram :

As the name suggests, this function accepts an input image  and constructs a color histogram to characterize the color distribution of the image.

First, we convert our image to the HSV color space on Line 19. We then apply the cv2.calcHist function to compute a 3D color histogram for the image (Lines 20 and 21). You can read more about computing color histograms in this post. You might also be interested in applying color histograms to image search engines and, in general, how to compare color histograms for similarity.

Given our computed hist , we then normalize it, taking care to use the appropriate cv2.normalize  function signature based on our OpenCV version (Lines 24-30).

Given 8 bins for each of the Hue, Saturation, and Value channels respectively, our final feature vector is of size 8 x 8 x 8 = 512, thus our image is characterized by a 512-d feature vector.
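
To see where the 512 comes from, here is a NumPy-only sketch of a 3D color histogram. It is an analogue of the approach described above, not the post’s code: it uses np.histogramdd in place of cv2.calcHist, assumes OpenCV-style 8-bit HSV ranges, and normalizes so the bins sum to 1 (an assumption; the download may normalize differently).

```python
import numpy as np


def color_histogram(hsv_image, bins=(8, 8, 8)):
    """Build a normalized, flattened 3-D histogram from an H x W x 3 HSV image."""
    # One row per pixel, one column per channel.
    pixels = hsv_image.reshape(-1, 3).astype(np.float64)
    # Assumed 8-bit HSV ranges: hue in [0, 180), saturation and value in [0, 256).
    ranges = [(0, 180), (0, 256), (0, 256)]
    hist, _ = np.histogramdd(pixels, bins=bins, range=ranges)
    # Normalize so images of different sizes produce comparable histograms.
    hist = hist / hist.sum()
    return hist.flatten()


# Random values stand in for a real image already converted to HSV.
hue = np.random.randint(0, 180, size=(64, 64))
sat = np.random.randint(0, 256, size=(64, 64))
val = np.random.randint(0, 256, size=(64, 64))
fake_hsv = np.stack([hue, sat, val], axis=-1)

features = color_histogram(fake_hsv)
print(features.shape)  # (512,)
```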

Next, let’s parse our command line arguments:

We require only one command line argument, followed by two optional ones, each of which are detailed below:

  • --dataset : This is the path to our input Kaggle Dogs vs. Cats dataset directory.
  • --neighbors : Here we can supply the number of nearest neighbors that are taken into account when classifying a given data point. We’ll default this value to one, meaning that an image will be classified by finding its closest neighbor in an n-dimensional space and then taking the label of the closest image. In next week’s post, I’ll demonstrate how to automatically tune k for optimal accuracy.
  • --jobs : Finding the nearest neighbor for a given image requires us to compute the distance from our input image to every other image in our dataset. This is clearly an O(N) operation that scales linearly. For larger datasets, this can become prohibitively slow. In order to speed up the process, we can distribute the computation of nearest neighbors across multiple processors/cores of our machine. Setting --jobs to -1 ensures that all processors/cores are used to help speed up the classification process.
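
The three arguments above could be declared with argparse roughly as follows. This is a sketch under a few assumptions: the short flags (-d, -k, -j), the help strings, and the build_parser helper are illustrative and may differ from the downloadable script.

```python
import argparse


def build_parser():
    """Create a parser with one required and two optional arguments."""
    parser = argparse.ArgumentParser(description="k-NN image classifier options")
    parser.add_argument("-d", "--dataset", required=True,
                        help="path to input dataset")
    parser.add_argument("-k", "--neighbors", type=int, default=1,
                        help="number of nearest neighbors for classification")
    parser.add_argument("-j", "--jobs", type=int, default=-1,
                        help="number of jobs for the distance search (-1 uses all cores)")
    return parser


# Parse an explicit argument list so the sketch also runs outside a terminal.
args = vars(build_parser().parse_args(["--dataset", "kaggle_dogs_vs_cats/train"]))
print(args)
```

Only --dataset has to be supplied; the other two fall back to k=1 and all available cores.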

Note: We can also speed up the k-NN classifier by utilizing specialized data structures such as kd-trees or Approximate Nearest Neighbor algorithms such as FLANN or Annoy. In practice, these algorithms can reduce nearest neighbor search to approximately O(log N); however, for the sake of simplicity in this post, we’ll perform an exhaustive nearest neighbor search.

We are now ready to prepare our images for feature extraction:

Line 47 grabs the paths to all 25,000 training images from disk.

We then initialize three lists (Lines 51-53): one to store the raw image pixel intensities (the 3072-d feature vector), another to store the histogram features (the 512-d feature vector), and a third for the class labels themselves (either “dog” or “cat”).

Let’s move on to extracting features from our dataset:

We start looping over our input images on Line 56. Each image is loaded from disk and the class label is extracted from the imagePath  (Lines 59 and 60).
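
Extracting the label from the path can be sketched like this, assuming the training files are named in the form cat.0.jpg and dog.1234.jpg (class label, then an image number). The helper name label_from_path is made up for the example.

```python
import os


def label_from_path(image_path):
    """Return the class label encoded at the start of the file name."""
    # Drop the directories, then keep everything before the first dot.
    filename = os.path.basename(image_path)
    return filename.split(".")[0]


print(label_from_path(os.path.join("kaggle_dogs_vs_cats", "train", "cat.0.jpg")))     # cat
print(label_from_path(os.path.join("kaggle_dogs_vs_cats", "train", "dog.1234.jpg")))  # dog
```

If your own dataset uses a different naming scheme (for example, one directory per class), this is the piece of logic you would need to adapt.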

We apply the image_to_feature_vector  and extract_color_histogram  functions on Lines 65 and 66 — these functions are used to extract our feature vectors from the input image .

Given our features and labels, we then update the respective rawImages, features, and labels lists on Lines 70-72.

Finally, we display an update to our terminal to inform us on feature extraction progress every 1,000 images (Lines 75 and 76).

You might be curious how much memory our rawImages  and features  matrices take up — the following code block will tell us when executed:

We start by converting our lists to NumPy arrays. We then use the .nbytes attribute of the NumPy array to display the number of megabytes of memory the representations utilize: about 75MB for the raw pixel intensities and 50MB for the color histograms. This implies that we can easily store our features in main memory.
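
A quick back-of-the-envelope check of those sizes is sketched below. It assumes 8-bit unsigned integers for the raw pixels and 32-bit floats for the histogram bins; both dtypes are assumptions about how the features are stored.

```python
import numpy as np

num_images = 25000

# A single row of each feature type; only the shape and dtype matter here.
raw_row = np.zeros(32 * 32 * 3, dtype=np.uint8)   # assumed 8-bit pixel values
hist_row = np.zeros(8 * 8 * 8, dtype=np.float32)  # assumed 32-bit float bins

# .nbytes gives the size of one row; multiply by the number of images.
raw_bytes = raw_row.nbytes * num_images
hist_bytes = hist_row.nbytes * num_images

print("pixels matrix: {:.1f} MB".format(raw_bytes / 1e6))     # 76.8 MB
print("features matrix: {:.1f} MB".format(hist_bytes / 1e6))  # 51.2 MB
```

These land in the same ballpark as the figures quoted above; the exact number depends on whether a megabyte is taken as 10^6 or 2^20 bytes.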

Next, we need to partition our data into two splits, one for training and another for testing:

Here we’ll be using 75% of our data for training and the remaining 25% for testing the k-NN algorithm.
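
With scikit-learn, a 75/25 split can be sketched using train_test_split from sklearn.model_selection. The random features, the alternating labels, and the random_state value below are placeholders for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Random data stands in for the real 512-d histogram features.
rng = np.random.default_rng(0)
features = rng.random((100, 512))
labels = np.array(["cat", "dog"] * 50)

# test_size=0.25 holds out 25% of the samples for testing.
train_x, test_x, train_y, test_y = train_test_split(
    features, labels, test_size=0.25, random_state=42)

print(train_x.shape, test_x.shape)  # (75, 512) (25, 512)
```

Fixing random_state makes the split reproducible, which is handy when comparing the raw pixel and histogram features on the same partition.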

Let’s apply the k-NN classifier to the raw pixel intensities:

Here we instantiate the KNeighborsClassifier  object from the scikit-learn library using the supplied number of --neighbors  and --jobs .

We then “train” our model by making a call to .fit  on Line 99, followed by evaluation on the testing data on Line 100.
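
The fit-then-score pattern looks roughly like the sketch below. Two well-separated synthetic clusters stand in for real image features, so the accuracy here says nothing about the Dogs vs. Cats dataset; the n_neighbors and n_jobs values mirror the --neighbors and --jobs defaults.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Two well-separated synthetic clusters stand in for real image features.
cats = rng.normal(loc=0.0, scale=0.5, size=(100, 16))
dogs = rng.normal(loc=5.0, scale=0.5, size=(100, 16))
features = np.vstack([cats, dogs])
labels = np.array(["cat"] * 100 + ["dog"] * 100)

train_x, test_x, train_y, test_y = train_test_split(
    features, labels, test_size=0.25, random_state=42)

# k=1 and all cores, mirroring the command line defaults.
model = KNeighborsClassifier(n_neighbors=1, n_jobs=-1)
model.fit(train_x, train_y)  # "training" simply stores the data

# .score returns the fraction of test samples classified correctly.
accuracy = model.score(test_x, test_y)
print("accuracy: {:.2f}%".format(accuracy * 100))
```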

In a similar fashion, we can also train and evaluate a k-NN classifier on our histogram representations:

k-NN image classification results

To test our k-NN image classifier, make sure you have downloaded the source code to this blog post using the “Downloads” form found at the bottom of this tutorial.

The Kaggle Dogs vs. Cats dataset is included with the download.

From there, just execute the following command:

At first, you’ll see that our images are being described and quantified via the image_to_feature_vector  and extract_color_histogram  functions:

Figure 5: Quantifying and extracting features from the Dogs vs. Cats dataset.

This process shouldn’t take longer than 1-3 minutes depending on the speed of your machine.

After the feature extraction process is complete, we can see some information on the size (in MB) of our feature representations:

Figure 6: Measuring the size of our feature matrices.

The raw pixel features take up 75MB while the color histograms require only 50MB of RAM.

Finally, the k-NN algorithm is trained and evaluated for both the raw pixel intensities and color histograms:

Figure 7: Evaluating our k-NN algorithm for image classification.

As the figure above demonstrates, by utilizing raw pixel intensities we were able to reach 54.42% accuracy. On the other hand, applying k-NN to color histograms achieved a slightly better 57.58% accuracy.

In both cases, we were able to obtain > 50% accuracy, demonstrating there is an underlying pattern to the images for both raw pixel intensities and color histograms.

However, that 57% accuracy leaves much to be desired.

And as you might imagine, color histograms aren’t the best way to distinguish between a dog and a cat:

  • There are brown dogs. And there are brown cats.
  • There are black dogs. And there are black cats.
  • And certainly a dog and cat could appear in the same environment (such as a house, park, beach, etc.) where the background color distributions are similar.

Because of this, utilizing strictly color is not a great choice for characterizing the difference between dogs and cats — but that’s okay. The purpose of this blog post was simply to introduce the concept of image classification using the k-NN algorithm.

We can easily apply methods to obtain higher accuracy. And as we’ll see, utilizing Convolutional Neural Networks we can achieve > 95% accuracy without much effort — but I’ll save that for a future discussion once we better understand image classification.

Can we do better?

You might be wondering, can we do better than the 57% classification accuracy?

You’ll notice that I’ve used only k=1 in this example, implying that only one nearest neighbor is considered when classifying each image. How would the results change if I used k=3 or k=5?

And how about the choice in distance metric — would the Manhattan/City block distance be a better choice?

How about changing both the value of k and distance metric at the same time?

Would classification accuracy improve? Get worse? Stay the same?

The fact is that nearly all machine learning algorithms require a bit of tuning to obtain optimal results. In order to determine the optimal set of values for these model variables, we apply a process called hyperparameter tuning, which is exactly the topic of next week’s blog post.


In this blog post, we reviewed the basics of image classification using the k-NN algorithm. We then applied the k-NN classifier to the Kaggle Dogs vs. Cats dataset to identify whether a given image contained a dog or a cat.

Utilizing only the raw pixel intensities of the input images, we obtained 54.42% accuracy. And by using color histograms, we achieved a slightly better 57.58% accuracy. Since both of these results are > 50% (we should expect to get 50% accuracy simply by random guessing), we can ascertain that there is an underlying pattern in the raw pixels/color histograms that can be used to discriminate dogs vs. cats (although 57% accuracy is quite poor).

That raises the question: “Is it possible to obtain > 57% classification accuracy using k-NN? If so, how?”

The answer is hyperparameter tuning — which is exactly what we’ll be covering in next week’s blog post.


If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL! Sound good? If so, enter your email address and I’ll send you the code immediately!

85 Responses to k-NN classifier for image classification

  1. Navdeep August 8, 2016 at 2:03 pm #

    Great Post, Thanks for such a useful knowledge 😀

    • Adrian Rosebrock August 8, 2016 at 6:37 pm #

      Thanks Navdeep! 🙂

  2. Pete August 8, 2016 at 7:14 pm #

    Is there any way to save the classifier so rather than having to run the test each time and wait each time, I can just load it and run it?

    • Adrian Rosebrock August 10, 2016 at 9:35 am #

      Certainly, just use cPickle to write your model to file:

      And then load it again:
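
      A minimal sketch of that round trip, using the Python 3 pickle module rather than cPickle. The file name knn_model.pickle and the placeholder dict are hypothetical; in practice you would pass your trained classifier instead.

```python
import os
import pickle
import tempfile

# A plain dict stands in for a trained classifier; any picklable object works.
model = {"n_neighbors": 1, "note": "placeholder for a trained model"}

# Hypothetical file name, written to a temporary directory for this sketch.
path = os.path.join(tempfile.gettempdir(), "knn_model.pickle")

# Write the model to file...
with open(path, "wb") as f:
    pickle.dump(model, f)

# ...and then load it again.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == model)  # True
```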

      • MOHAMED AWNI HAMED March 26, 2018 at 8:43 pm #

        could you give me a good link explain cpickle.dumps?

        • Adrian Rosebrock March 27, 2018 at 6:12 am #

          The cPickle.dumps function serializes a Python object to a byte string, which you can write to disk and later read back in a separate script (likely to perform prediction). If you’re using Python 3 you should instead use “pickle.dumps” (notice no “c” in the module name).

          If you encounter a Python function you have not used before I would first recommend reading the docs.

  3. g10dras August 8, 2016 at 10:15 pm #

    Awesome tutorial !!! 🙂

  4. Midhun August 9, 2016 at 12:09 am #

    I think this post would be a kickstarter for a newbie
    thanks adrian

  5. Ashish August 9, 2016 at 12:57 am #

    Amazing article and well thought pace of content. Thanks for sharing. Looking forward for more.

    • Adrian Rosebrock August 10, 2016 at 9:32 am #

      Thanks for the feedback Ashish, I’m glad you liked it!

  6. Manik August 12, 2016 at 12:22 am #

    Adrian – I must say, you are helping the community in a great way !! Kudos for your effort and time. I just needed an advice on how to install SciPy on Windows Python. I have Python 2.7 on my Machine. I downloaded scipy-0.18.0.zip for windows from one of the websites but don’t know where to go thereafter..Any suggestion would be appreciated.

    • Adrian Rosebrock August 12, 2016 at 10:48 am #

      Hey Manik — in general, I don’t recommend using Windows for computer vision development. I personally haven’t used Windows in many years, so I’m not sure regarding your exact situation. I’ll leave this comment here for another reader to answer.

    • Sifan August 18, 2016 at 8:59 am #

      Download from this site for your python version:
      Then from your windows command line go to location of the downloads i.e use cd
      Finally:pip install full name of the module(scipy)
      It needs numpy+mkl follow the same procedure to install,I think it will help!

    • K October 1, 2018 at 12:28 am #

      I suggest you to download WinPython or anaconda or simply try pip install scipy

  7. Geoffrey Anderson August 25, 2016 at 1:34 pm #

    Who can elaborate on what is really the right hand side of line 51, the raw images array initializer? I see a square there. What is the intended python code? Thanks!

    2nd thing: Line 11 and other parts of this program are EXACTLY the kind of code I was looking for, for other reasons. Particularly, this code shows me how to load a machine learning dataset from genuine image files! I am so happy now. Thanks for writing this code to get me bootstrapped with machine learning on images, real ones like jpegs! Not prepackaged images contained in some weird monolithic singleton file!

    Now if I could just find a rosetta stone (or nice person) to explain the full python meaning of the square at line 51!

    • Adrian Rosebrock August 30, 2016 at 12:49 pm #

      Hey George — what exactly is your question regarding Line 51? Line 51 simply initializes a list. We append to list another set of lists: the raw pixel intensities.

    • Severiano February 21, 2017 at 10:51 am #

      I thought the same as you for a moment but the problem is how the code is displayed, that is not a square but two square parenthesis [ ]…

  8. Ricardo Brasil August 27, 2016 at 10:09 am #

    Thanks for this great tutorial, was awesome! I am learning alot of things.
    Well, I would like to ask you a few questions, can you please help?

    I was doing your tutorial and I saw a point:

    imagePaths = list(paths.list_images(path))

    The folder needs to be train1 folder or test1 folder? When I downloaded catsvsdogs from Kaggle I am getting this two folders. I don’t know what I use. Can you help me?

    How I use Manhattan distance? can you give me a example?
    Thanks for the help

    • Adrian Rosebrock August 29, 2016 at 2:02 pm #

      You can use the Manhattan distance by switching the metric in KNeighborsClassifier to be “cityblock”.

      As for your images, you should have a separate directory for each class. One directory for “dog” and another directory for “cat”.

  9. Matt September 1, 2016 at 8:32 am #

    Hey Adrian,

    Just wondering where the best place is to start machine learning and deep learning as a complete newb like me.

    Love your work, Matt

    • Adrian Rosebrock September 1, 2016 at 10:56 am #

      If you’re starting from scratch, the absolute best place to learn computer vision, machine learning, and deep learning is inside the PyImageSearch Gurus course. The course covers advanced algorithms, but also takes the time to explain the basics.

  10. Johnny September 2, 2016 at 2:21 am #

    Great post for beginners, it covers all the workflow from preprocessing and analyzing. Thanks!

    • Adrian Rosebrock September 2, 2016 at 6:57 am #

      Thanks Johnny, I’m glad the post helped! 🙂

  11. Rohit November 5, 2016 at 10:50 am #

    Sorry for this silly question but how can I now classify a single image ?? 😛

    • Adrian Rosebrock November 7, 2016 at 2:53 pm #

      You would use the .predict method of the model:

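      prediction = model.predict(hist.reshape(1, -1))[0]

      Where hist is the histogram of your input image, reshaped into a single-row 2D array because scikit-learn expects one row per sample.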

      For more details on how to use OpenCV + Python to classify the contents of an image please refer to my book Practical Python and OpenCV and the PyImageSearch Gurus course.

  12. bhaarat January 29, 2017 at 5:46 pm #

    Hi Adrian, the dataset comes pre-split from kaggle. In train and test1 zips. Both contain 25k and 12.5k images respectively. Does the code in this blog post take the pre-split into account or should I combine train and test from kaggle into one folder and let your code do the split?

    • Adrian Rosebrock January 30, 2017 at 4:25 pm #

      The code from this blog post only uses the training data from Kaggle and then partitions into two respective training and testing splits. If you wanted to train a model and submit the results from Kaggle you would use the pre-split zip files from the official Kaggle challenge.

      It’s been awhile since I looked at the “test1” directory, but I wouldn’t recommend combining them. They likely use different image path structures, and if so, that will break the code that extracts the label from the image path.

  13. Megha Soni March 8, 2017 at 2:47 am #

    Hi Adrian,
    Thankyou for this tutorial, it is really helpful.

    I do have doubts in lines 37, 39 & 41. I added the path for the “train” dataset. But I’m not sure what to add for “–neighbors” and “–jobs”

    • Adrian Rosebrock March 8, 2017 at 1:03 pm #

      The --neighbors is the number of nearest neighbors in the k-NN algorithm. Please read this post again to understand this important variable. As for --jobs, I would leave this as -1 which uses all available processors on your system.

  14. jan April 14, 2017 at 10:51 pm #

    how to classify more than one class of images ?

    • Adrian Rosebrock April 16, 2017 at 8:54 am #

      The k-NN algorithm handles classifying more than one class of image. Perhaps I’m misunderstanding your question?

      • jan April 16, 2017 at 11:27 am #

        To classify more than one class of images which is better SVM or KNN?

        • Adrian Rosebrock April 19, 2017 at 1:08 pm #

          It’s entirely dependent on your dataset, what features you’re extracting, etc. You normally would apply cross-validation to both models and determine which one is better. There is no “one size fits all” solution to computer vision and machine learning.

  15. sunil May 16, 2017 at 3:21 am #

    can I test with other dataset… how and where to give the data set path…error:–dataset

    • Adrian Rosebrock May 17, 2017 at 10:04 am #

      This specific example assumes your directory structure and image file paths follow a particular pattern:


      Provided that your dataset follows this pattern, you can use the exact same code.

      In either case, I would suggest you work through either Practical Python and OpenCV or the PyImageSearch Gurus course so you can learn how to apply machine learning to computer vision datasets.

      • abc April 1, 2018 at 10:24 am #

        In which terminal have you run? I am using Windows…

        • Adrian Rosebrock April 4, 2018 at 12:32 pm #

          Use the terminal/command line utility inside Windows.

  16. Aditi Bhiwaniwala May 31, 2017 at 3:11 am #

    Hello Adrian,
    I have used the knn classifier for car logo recognition in one of your tutorials, but when i am using sliding window technique for recognising the ‘make’ of a given car from its image , the output depends on size of sliding window and a wrong result is obtained if i change the size of sliding window . Can you help me generalise the size of sliding window.

    • Adrian Rosebrock May 31, 2017 at 1:02 pm #

      Do you have any example images of what you are working with? That will better help me point you in the right direction.

      For what it’s worth, I’ll be covering how to recognize the make + model of a vehicle inside my book, Deep Learning for Computer Vision with Python.

      Be sure to take a look!

  17. Umair July 9, 2017 at 2:18 am #

    A very useful tutorial i like it so much

  18. Dhvani shah August 1, 2017 at 4:55 am #

    Ap = argparse.ArgumentParser() I am not understanding what to put in the parameters of –dataset and help=”path to dataset” options. And should I merge the testing1 and training folder into one folder and name it as kaggle-dogs-vs-cats and then the code will split it into training and testing dataset. Thanking you.

    • Adrian Rosebrock August 1, 2017 at 9:35 am #

      If you are having trouble understanding the argument parsing code, please refer to this tutorial. It will help you understand how to parse command line arguments with Python.

      As for your second question, no, do not merge the testing1 and training folder. Use only the training folder (the code will automatically perform a training/testing split for you).

  19. Niha Beig October 3, 2017 at 12:35 pm #

    How do we know what features were picked up by the classifier to build the model??

    • Adrian Rosebrock October 4, 2017 at 12:38 pm #

      The k-NN algorithm doesn’t “pick” any features. It simply computes the distance between the feature vectors.

  20. sourav October 15, 2017 at 11:24 am #

    Hello Adrian
    I tried implementing your code on my dataset however i keep getting an error
    “ValueError: setting an array element with a sequence”
    kindly suggest what to do

    • Adrian Rosebrock October 16, 2017 at 12:24 pm #

      Which line threw the error. Please be as descriptive as possible. If you do not provide both the line number and error it’s hard for myself or other readers to provide you with any suggestions.

  21. NGUYEN November 8, 2017 at 9:09 am #

    I have an issue to solve. In some case for example we have two class (eg. Cat and Dog) but some time an user can put a Bird into program to test and the application always gives an answe (of course it is wrong ) and sometime with a very high confidence (eg. 0.9). I want to know how to prevent this case in the real work? Is there any technique for that ? Thank !

    • Adrian Rosebrock November 9, 2017 at 6:30 am #

      Keep in mind that machine learning algorithms aren’t magic. They apply a very specific set of rules. If the bird image is closest to a non-bird image in a Euclidean space, then the k-NN algorithm is going to return the closest label. The k-NN algorithm doesn’t care whether the label is right or wrong; it’s just applying a very specific set of rules. I’m not sure what you mean by “prevent” this scenario, so perhaps it would be best to limit the control of what the user can upload, but that’s certainly not ideal.

  22. NGUYEN November 9, 2017 at 10:40 am #

    Thanks @Adrian for your reply. In fact, I have a problem classifying several types of documents. There are a lot of types (about 50 types and millions of documents), but my client wants to pull out only some types automatically. I used deep learning; it is not difficult to classify an image whose type is in the set I trained on, but I can’t classify correctly when a document’s type is not in that set, so the app cannot automatically pick out the documents we want. I think this issue comes up in many real-world apps.

  23. wedgess121 November 29, 2017 at 1:21 pm #

    Can you help with plotting the results out with matplotlib

  24. Nine January 11, 2018 at 10:31 pm #

    I want to merge Local Binary Patterns descriptor with KNN as its classifier. Is that possible with the algorithm above ?
    After all, This is a good and interesting tutorial. I learned a lot from your posts. Thank you !

    • Adrian Rosebrock January 12, 2018 at 5:31 am #

      Yes, although I would recommend using this Local Binary Patterns tutorial where we use a Linear SVM. You can easily swap in a k-NN classifier for the Linear SVM with only a few lines.

      Additionally, if you’re interested in learning more about image classification using machine learning algorithms, take a look at the PyImageSearch Gurus course where I discuss them both in detail.

  25. Red February 1, 2018 at 11:06 am #

    Dear Adrian
    I needed to link (classify) field-sampled forest biomass and Landsat satellite image using knn. I have no idea of python. I have access to commercial GIS remote sensing software like IDRISI/TerrSet, ENVI, ERDAS Imagine and ArcGIS. Could you show me the steps I should follow to do that in any of these software? I needed it very very urgently please. I would highly appreciate your kind effort.

    • Adrian Rosebrock February 3, 2018 at 10:50 am #

      Hey Red — I do not have any experience with the software you mentioned. This blog uses the Python programming language to write software to understand the contents of an image. I’m sorry I could not be of help here.

  26. michael February 17, 2018 at 4:33 am #

    hi adrian thank you i implement knn by my self

  27. Mohamed Awni March 24, 2018 at 2:13 pm #

    Thanks Adrian for your great effort. I am so confused about using k-NN to classify images. What I understand is that in the training phase we store all training images and their corresponding labels. In the testing phase we apply the Euclidean distance measure between the test image and every training image, and then we have a score for each. Is this right? What is the role of K = 1 in determining the class of the test image?

    • Adrian Rosebrock March 27, 2018 at 6:30 am #

      The role of the “K” value is the number of nearest neighbors to consider when making the classification. Setting K=1 will use the label of the closest vector (according to the distance metric) to label the test image. If we set K=3 then the 3 closest neighbors (again, according to the distance metric) will be used to “vote” on which label they believe the test image is.
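
      As a rough illustration of that voting step, here is a small standard-library sketch. The helper `knn_predict`, the 2D points, and their labels are all hypothetical and chosen so that K=1 and K=3 disagree; this is not the scikit-learn implementation:

```python
import math
from collections import Counter

def knn_predict(train_vectors, train_labels, query, k=1):
    # Compute the Euclidean distance from the query to every training vector.
    distances = [
        (math.sqrt(sum((x - y) ** 2 for x, y in zip(vec, query))), label)
        for vec, label in zip(train_vectors, train_labels)
    ]
    # Keep the k closest neighbors and let them vote on the label.
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Made-up training data: two "cat" points, three "dog" points.
train_vectors = [(1.0, 1.0), (1.2, 0.9), (3.0, 3.0), (3.1, 2.9), (1.9, 1.9)]
train_labels = ["cat", "cat", "dog", "dog", "dog"]
query = (1.6, 1.6)

label_k1 = knn_predict(train_vectors, train_labels, query, k=1)
label_k3 = knn_predict(train_vectors, train_labels, query, k=3)
```

      With K=1 the single closest point is a "dog", but with K=3 the two nearby "cat" points outvote it, so the choice of K can change the prediction.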

  28. Pritam March 31, 2018 at 2:11 am #

    Hello Adrian
    I’m doing a project on bird species identification from images using the “Caltech-UCSD Birds-200-2011” (CUB200-2011) dataset. It contains about 200 bird species. I’m not sure how or where to start, and I’m getting confused by the 200 classes. I have implemented this cat vs. dog project, but I don’t see how to use this code as a reference for my project. Could you please suggest something I can use to get started? I want to classify the input image using the k-NN algorithm.

    I am extremely thankful to you for your blog.

    • Adrian Rosebrock April 4, 2018 at 12:48 pm #

      The first step is to modify Line 60 to extract the class label from the image path. I don’t remember the exact directory structure of the UCSD Birds dataset so you might need to do some debugging.
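
      As a sketch of what that change might look like, the two helpers below parse a label from a path in two common layouts. Both layouts are assumptions: the first supposes file names of the form `cat.0.jpg`, the second supposes one subdirectory per class. The function names and example paths are hypothetical, so check them against your actual dataset:

```python
from pathlib import PurePosixPath

def label_from_filename(image_path):
    # Assumed layout: "<label>.<index>.jpg", e.g. "train/cat.0.jpg" -> "cat".
    return PurePosixPath(image_path).name.split(".")[0]

def label_from_parent_dir(image_path):
    # Assumed layout: one folder per class, e.g. "birds/sparrow/img_001.jpg" -> "sparrow".
    return PurePosixPath(image_path).parent.name

flat_label = label_from_filename("train/cat.0.jpg")
nested_label = label_from_parent_dir("birds/sparrow/img_001.jpg")
```

      For a dataset with many classes, the one-folder-per-class version is usually the easier one to adapt, since the label comes straight from the directory name.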

      For what it’s worth, I demonstrate how to work with image datasets, parse class labels, and build your own machine learning + deep learning models inside my book, Deep Learning for Computer Vision with Python. I believe this would be an excellent starting point for you on the project.

  29. Shashi Raj April 5, 2018 at 10:14 am #

    I am getting an error when I am testing a single image. Can anyone help me ASAP.

    • Adrian Rosebrock April 5, 2018 at 10:32 am #

      What is the exact error you are getting? If your error includes a lot of terminal output create a GitHub Gist and link to it from your comment.

  30. Koki April 8, 2018 at 8:53 pm #

    Hello Adrian,
    what changes to the code should I make, if I want to run this script in spyder instead of terminal?

    • Adrian Rosebrock April 10, 2018 at 12:19 pm #

      I’m not familiar with the Spyder IDE but if it’s anything like PyCharm you need to change the “Project Interpreter” if you are using a Python virtual environment. You should also replace any command line arguments with hardcoded values. If you’re new to command line arguments refer to this tutorial.
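
      Here is a minimal sketch of two ways to do that. The flag names mirror the script's usage message (`-d/--dataset`, `-k/--neighbors`, `-j/--jobs`), but the default values and the `path/to/train` placeholder are assumptions, so substitute your own:

```python
import argparse

# Parser mirroring the script's flags; the defaults here are assumed.
ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True, help="path to input dataset")
ap.add_argument("-k", "--neighbors", type=int, default=1, help="number of nearest neighbors")
ap.add_argument("-j", "--jobs", type=int, default=-1, help="number of parallel jobs")

# Option 1: keep the parser, but pass the arguments as an explicit list
# instead of reading them from the terminal.
args = vars(ap.parse_args(["--dataset", "path/to/train"]))

# Option 2: skip the parser entirely and build the same dictionary by hand.
hardcoded_args = {"dataset": "path/to/train", "neighbors": 1, "jobs": -1}
```

      Either way, the rest of the script can keep reading `args["dataset"]` and `args["neighbors"]` unchanged.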

  31. Chadwick Sanders May 2, 2018 at 11:20 am #

    Hi Adrian, I am grateful for your tutorial. Can I ask you one question, with this technique of features extraction is it suitable to use another algorithm such as Support Vector Machine or Random Forest instead of kNN ( using scikit-learn). Thanks for your help.

    • Adrian Rosebrock May 3, 2018 at 9:33 am #


  32. Dung Huynh May 3, 2018 at 10:48 am #

    Hi Adrian, can I use this technique for other classification purposes, such as car and non-car classification in a parking lot? Thank you very much for your help.

    • Adrian Rosebrock May 9, 2018 at 10:42 am #

      That really depends on your image dataset. I would actually recommend trying to train a HOG + Linear SVM detector to detect car/non-car parking spots.

  33. DouBi July 10, 2018 at 6:15 am #

    Hi Adrian, I would like to recognize (or classify) several different types of object in the same image by using sift (or cnn)

    • Adrian Rosebrock July 10, 2018 at 8:09 am #

      I actually cover both SIFT recognition and deep learning recognition inside the PyImageSearch Gurus course. I would suggest starting there.

  34. Prudence September 2, 2018 at 1:39 pm #

    I managed to get through all the steps although when I run the program I get this error

    usage: knn_classifier.py [-h] -d DATASET [-k NEIGHBORS] [-j JOBS]
    knn_classifier.py: error: argument -d/--dataset is required

    I don’t seem to know how to fix it

  35. Sarah November 21, 2018 at 12:52 pm #

    Hi Adrian, I’m relatively new to python. If I wanted to edit the code to classify using hist and pixels (or some other feature) to improve accuracy how would I go about that?

    • Adrian Rosebrock November 25, 2018 at 9:42 am #

      Exactly which extracted feature you use is heavily dependent on your dataset and exactly what you are trying to quantify with the image. If you’re new to Python and the world of computer vision and OpenCV I would suggest you work through Practical Python and OpenCV. The book is a gentle guide to learning the fundamentals of computer vision and I have no doubt it will help you as well.

  36. Sundar December 25, 2018 at 11:54 am #

    Hey Adrian, I have 2 doubts:
    1. Can you explain why we use argument parsing? Can I run the same code without argument parsing? If yes, how to do so?
    2. How can I run the same data set through an SVM classifier?

  37. Ahtsham Asghar January 31, 2019 at 5:56 am #

    First of all thanks Adrian Sir for awesome tutorial.
    I want to know how i can apply predictions on other images by using this trained model ?

    • Ahtsham Asghar January 31, 2019 at 6:21 am #

      And 2nd thing i want to train model for 3 different classes .
      Is it possible using this code ?

      • Adrian Rosebrock February 1, 2019 at 6:51 am #

        1. You can use the model.predict function to make predictions on images outside the original dataset. Just extract color histograms for the new images and pass them through the “predict” method.

        2. Yes, but you would need to organize your images better. The default dogs vs. cats dataset doesn’t include subdirectories for better organization. Take a look at this post for an example of how I like to organize my datasets.

        • Ahtsham Asghar February 1, 2019 at 12:38 pm #

          Thanks Sir…

  38. Rockson Agyeman February 9, 2019 at 9:34 am #

    Thank you for your great post. I actually bought the ImageNet Bundle of your book and I have just started with the k-NN example.

    I have one question regarding the k-NN representation of data. I know that to plot a point on a graph, it must have both x- and y-axis coordinates.

    If the pixels are flattened, they are now vectors and have only 1 dimension. How is this plotted, Sir?

    I may be misunderstanding this concept, but I would be glad to receive your input on this, please. Thank you.

    • Adrian Rosebrock February 14, 2019 at 1:54 pm #

      Thank you for picking up a copy of the ImageNet Bundle, Rockson! I hope you are enjoying it!

      As far as your question, we don’t plot k-NN data. The example I included here was just a visualization for you to see how data points can be represented in a Euclidean space (and how we can compute the distance between them).

  39. B.charishma March 20, 2019 at 6:50 am #

    sir,for all the input images i am getting the same histogram accuracy . i have tried for different test case images but the histogram and raw pixel intensity accuracy is same (60%,50%). sir,please rectify the error.

  40. Yamini May 6, 2019 at 11:58 pm #

    I need full coverage of deep learning, how it came about, and the difference between deep learning and convolutional neural networks. Please help, it’s my seminar topic.


  1. How to tune hyperparameters with Python and scikit-learn - PyImageSearch - August 15, 2016

    […] last week’s post, I introduced the k-NN machine learning algorithm which we then applied to the task of image […]

Before you leave a comment...

Hey, Adrian here, author of the PyImageSearch blog. I'd love to hear from you, but before you submit a comment, please follow these guidelines:

  1. If you have a question, read the comments first. You should also search this page (i.e., ctrl + f) for keywords related to your question. It's likely that I have already addressed your question in the comments.
  2. If you are copying and pasting code/terminal output, please don't. Reviewing another programmer’s code is a very time-consuming and tedious task, and due to the volume of emails and contact requests I receive, I simply cannot do it.
  3. Be respectful of the space. I put a lot of my own personal time into creating these free weekly tutorials. On average, each tutorial takes me 15-20 hours to put together. I love offering these guides to you and I take pride in the content I create. Therefore, I will not approve comments that include large code blocks/terminal output as it destroys the formatting of the page. Kindly be respectful of this space.
  4. Be patient. I receive 200+ comments and emails per day. Due to spam, and my desire to personally answer as many questions as I can, I hand moderate all new comments (typically once per week). I try to answer as many questions as I can, but I'm only one person. Please don't be offended if I cannot get to your question.
  5. Do you need priority support? Consider purchasing one of my books and courses. I place customer questions and emails in a separate, special priority queue and answer them first. If you are a customer of mine you will receive a guaranteed response from me. If there's any time left over, I focus on the community at large and attempt to answer as many of those questions as I possibly can.

Thank you for keeping these guidelines in mind before submitting your comment.
