Building an Image Search Engine: Searching and Ranking (Step 4 of 4)

We are now at the final step of building an image search engine — accepting a query image and performing an actual search.

Let’s take a second to review how we got here:

  • Step 1: Defining Your Image Descriptor. Before we even consider building an image search engine, we need to consider how we are going to represent and quantify our image using only a list of numbers (i.e. a feature vector). We explored three aspects of an image that can easily be described: color, texture, and shape. We can use one of these aspects, or many of them.
  • Step 2: Indexing Your Dataset. Now that we have selected a descriptor, we can apply the descriptor to extract features from each and every image in our dataset. The process of extracting features from an image dataset is called “indexing”. These features are then written to disk for later use. Indexing is also a task that is easily made parallel by utilizing multiple cores/processors on our machine.
  • Step 3: Defining Your Similarity Metric. In Step 1, we defined a method to extract features from an image. Now, we need to define a method to compare our feature vectors. A distance function should accept two feature vectors and then return a value indicating how “similar” they are. Common choices for similarity functions include (but are certainly not limited to) the Euclidean, Manhattan, Cosine, and Chi-Squared distances.
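To make Step 3 concrete, here is a minimal sketch of the four distance functions named above, written in plain Python so it runs without extra libraries. The function names and the small `eps` constant are illustrative choices, not code from the earlier posts. The cosine version assumes non-zero vectors, and the chi-squared version assumes non-negative inputs such as histograms.

```python
import math


def euclidean(a, b):
    # Straight-line distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute differences (city block distance).
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    # 1 - cosine similarity; assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def chi_squared(a, b, eps=1e-10):
    # Chi-squared distance, commonly used to compare histograms.
    # eps avoids division by zero when both bins are empty.
    return 0.5 * sum((x - y) ** 2 / (x + y + eps) for x, y in zip(a, b))
```

All four return 0 (or very nearly 0) for identical vectors and larger values as the vectors diverge, so smaller means "more similar".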

Finally, we are now ready to perform our last step in building an image search engine: Searching and Ranking.

The Query

Before we can perform a search, we need a query.

The last time you went to Google, you typed in some keywords into the search box, right? The text you entered into the input form was your “query”.

Google then took your query, analyzed it, and compared it to their gigantic index of webpages, ranked them, and returned the most relevant webpages back to you.

Similarly, when we are building an image search engine, we need a query image.

Query images come in two flavors: an internal query image and an external query image.

As the name suggests, an internal query image already belongs to our index. We have already analyzed it, extracted features from it, and stored its feature vector.

The second type of query image is an external query image. This is the equivalent of typing our text keywords into Google. We have never seen this query image before and we can’t make any assumptions about it. We simply apply our image descriptor, extract features, rank the images in our index based on similarity to the query, and return the most relevant results.

You may remember that when I wrote the How-To Guide on Building Your First Image Search Engine, I included support for both internal and external queries.

Why did I do that?

Let’s think back to our similarity metrics for a second and assume that we are using the Euclidean distance. The Euclidean distance has a nice property called the Coincidence Axiom, implying that the function returns a value of 0 (indicating perfect similarity) if and only if the two feature vectors are identical.

If I were to search for an image already in my index, then the Euclidean distance between the two feature vectors would be zero, implying perfect similarity. This image would then be placed at the top of my search results since it is the most relevant. This makes sense and is the intended behavior.

How strange it would be if I searched for an image already in my index and did not find it in the #1 result position. That would likely imply that there was a bug in my code somewhere or I’ve made some very poor choices in image descriptors and similarity metrics.

Overall, using an internal query image serves as a sanity check. It allows you to make sure that your image search engine is functioning as expected.
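The sanity check can be sketched in a few lines. This is a toy example under stated assumptions: the `describe` function is a stand-in grayscale histogram rather than a real descriptor, the "images" are short lists of pixel values, and every name here is hypothetical. The point is only the check itself: re-describe an image that is already indexed, search, and confirm it comes back first with a distance of zero.

```python
import math


def describe(pixels, bins=4):
    # Toy descriptor (an assumption for illustration): a normalized
    # grayscale histogram over pixel values in the range 0-255.
    hist = [0] * bins
    for p in pixels:
        hist[p * bins // 256] += 1
    return [count / len(pixels) for count in hist]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(index, query_features):
    # Rank every indexed image by distance to the query, closest first.
    return sorted(
        (euclidean(features, query_features), image_id)
        for image_id, features in index.items()
    )


# Hypothetical dataset: image id -> grayscale pixel values.
images = {
    "dark.png": [10, 20, 30, 40],
    "mixed.png": [10, 100, 150, 250],
    "bright.png": [200, 220, 240, 250],
}
index = {image_id: describe(pixels) for image_id, pixels in images.items()}


def passes_sanity_check(image_id):
    # An internal query should return itself first, at distance zero.
    top_distance, top_id = search(index, describe(images[image_id]))[0]
    return top_distance == 0.0 and top_id == image_id


print(all(passes_sanity_check(image_id) for image_id in images))
```

If this check fails for an image you know is in the index, the usual suspects are a descriptor that was configured differently at query time than at indexing time, or a bug in the ranking code.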

Once you can confirm that your image search engine is working properly, you can then accept external query images that are not already part of your index.

The Search

So what’s the process of actually performing a search? Check out the outline below:

1. Accept a query image from the user

A user could be uploading an image from their desktop or from their mobile device. As image search engines become more prevalent, I suspect that most queries will come from devices such as iPhones and Droids. It’s simple and intuitive to snap a photo of a place, object, or something that interests you using your cellphone, and then have it automatically analyzed and relevant results returned.

2. Describe the query image

Now that you have a query image, you need to describe it using the exact same image descriptor(s) as you did in the indexing phase. For example, if I used an RGB color histogram with 32 bins per channel when I indexed the images in my dataset, I am going to use the same 32-bin-per-channel histogram when describing my query image. This ensures that I have a consistent representation of my images. After applying my image descriptor, I now have a feature vector for the query image.
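As a rough illustration, the 32-bins-per-channel example might look like the sketch below. It makes several assumptions that are not from the original posts: the image is a plain list of `(r, g, b)` tuples with values from 0 to 255, the three channel histograms are concatenated into one 96-dimensional vector (a 3D histogram is another common choice), and `rgb_histogram` and `BINS_PER_CHANNEL` are hypothetical names.

```python
# Shared by indexing and querying so both use the same settings.
BINS_PER_CHANNEL = 32


def rgb_histogram(pixels, bins=BINS_PER_CHANNEL):
    # Concatenated per-channel histogram: bins for R, then G, then B.
    # Each channel's section is normalized to sum to 1.
    hist = [0.0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hist[channel * bins + value * bins // 256] += 1
    return [count / len(pixels) for count in hist]


# Hypothetical two-pixel query image: one black pixel, one white pixel.
query_pixels = [(0, 0, 0), (255, 255, 255)]
query_features = rgb_histogram(query_pixels)
print(len(query_features))
```

Keeping the bin count in a single shared constant is one simple way to make sure the query is described with the same settings as the index.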

3. Perform the Search

To perform the most basic method of searching, you need to loop over all the feature vectors in your index. For each one, you use your similarity metric to compare it to the feature vector from your query. Your similarity metric will tell you how “similar” the two feature vectors are. Finally, sort your results by similarity, most similar first.
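The loop described above can be sketched as follows. This is a minimal version under assumptions: the index is an in-memory dictionary mapping an image id to its feature vector, the metric is a chi-squared distance (so a smaller value means a closer match), and the `search` signature and `limit` parameter are illustrative rather than taken from the How-To Guide.

```python
def chi_squared(a, b, eps=1e-10):
    # Chi-squared distance between two histograms; eps avoids
    # division by zero when both bins are empty.
    return 0.5 * sum((x - y) ** 2 / (x + y + eps) for x, y in zip(a, b))


def search(index, query_features, distance=chi_squared, limit=10):
    # Compare the query to every feature vector in the index.
    results = []
    for image_id, features in index.items():
        results.append((distance(features, query_features), image_id))
    # Smallest distance first, i.e. most similar first.
    results.sort()
    return results[:limit]


# Hypothetical index of three tiny feature vectors.
index = {"a.png": [1.0, 0.0], "b.png": [0.5, 0.5], "c.png": [0.0, 1.0]}
for dist, image_id in search(index, [1.0, 0.0], limit=2):
    print(image_id, round(dist, 3))
```

If your metric returns a similarity score instead of a distance (cosine similarity, for example), sort in descending order instead.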

If you would like to see the “Performing the Search” step in action, head on over to my How-To Guide on Building Your First Image Search Engine post. On Step #3, I give you Python code that can be used to perform a search.

Looping over your entire index may be feasible for small datasets. But if you have a large image dataset, like Google or TinEye, this simply isn’t practical. Computing the distance between your query features and billions of stored feature vectors on every search would take far too long.

For readers who have experience in information retrieval (traditionally focused on building text search engines), we can also use tf-idf indexing and an inverted index to speed up the process. However, in order to use this method, we need to ensure our features can fit into the vector space model and are sufficiently sparse. Building an image search engine that utilizes this method is outside the scope of this post; however, I will certainly be revisiting it in the future when we start to build more complex search engines.
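A full implementation is beyond this post, but the core idea of an inverted index fits in a short sketch. It assumes each image has already been reduced to a sparse set of discrete "visual word" ids (how to get those is not covered here), and it leaves out tf-idf weighting entirely. The function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(image_words):
    # Map each visual word id to the set of images that contain it.
    inverted = defaultdict(set)
    for image_id, words in image_words.items():
        for word in words:
            inverted[word].add(image_id)
    return inverted


def candidates(inverted, query_words):
    # Collect only the images that share at least one word with the query.
    found = set()
    for word in query_words:
        found |= inverted.get(word, set())
    return found


# Hypothetical data: image id -> visual word ids present in that image.
image_words = {"a.png": {1, 2}, "b.png": {2, 3}, "c.png": {4}}
inverted = build_inverted_index(image_words)
print(sorted(candidates(inverted, {2})))
```

The speedup comes from scoring only the candidate images rather than the whole index, which is why the features need to be sparse for this to pay off.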

4. Display Your Results to the User

Now that we have a ranked list of relevant images, we need to display them to the user. This can be done using a simple web interface if the user is on a desktop, or we can display the images using some sort of app if they are on a mobile device. This step is pretty trivial in the overall context of building an image search engine, but you should still give thought to the user interface and how the user will interact with your image search engine.

Summary

So there you have it, the four steps of building an image search engine, from front to back:

  1. Define your image descriptor.
  2. Index your dataset.
  3. Define your similarity metric.
  4. Perform a search, rank the images in your index in terms of relevancy to the user, and display the results to the user.

So what did you think of this series of posts? Was it informative? Did you learn anything? Or do you prefer posts that have more code examples, like Hobbits and Histograms?

Please leave a comment below. I would love to hear your thoughts.



16 Responses to Building an Image Search Engine: Searching and Ranking (Step 4 of 4)

  1. Dinesh Vadhia February 27, 2014 at 12:06 pm #

    My 2 cents. The Hobbits & Histograms post covered the basics and so this series of posts should go into more depth about building an image search engine wrt using an inverted index as that is the norm.

    • Adrian Rosebrock February 27, 2014 at 3:09 pm #

      Thanks for the reply!

      We’ll definitely get to the inverted index. But there are some important topics to cover first, such as keypoint detection, local invariant descriptors, and codebook construction. Finally, we’ll be able to discuss how to use an inverted index to speed up the actual search process. Again, thanks for the feedback!

  2. Bingbing April 3, 2017 at 10:21 am #

    This series of post is really cool, which reminds me of many things I have learned before. And I like this kind of process, I mean, first use example and code to tell us what you are doing and what kind of results we can get, then dive into each step. Thank you for your post.

    • Adrian Rosebrock April 3, 2017 at 1:50 pm #

      I’m glad you enjoyed the post, thank you for the kind words.

  3. Rohit May 5, 2017 at 4:48 am #

    Thanks Adrian for this wonderful series of posts. Your blog is making my understanding of image handling very clear

    • Adrian Rosebrock May 8, 2017 at 12:38 pm #

      Thank you Rohit, I appreciate the kind words 🙂

  4. Pooja Gundewar November 22, 2017 at 7:04 am #

    Hello Adrian,
    I want to measure the distance between moving obstacle and camera, which descriptor should be used for feature extraction of frame?

  5. Pooja Gundewar November 25, 2017 at 2:48 am #

    Thank you for providing the reference.
    I want to measure the distance using feature extraction concept. Let us assume that the object is moving towards the camera. For each frame , features will be extracted and those features will be used to estimate the distance between camera and object. Is it feasible?

    • Adrian Rosebrock November 25, 2017 at 12:13 pm #

      Are you referring to detecting the object via features? Typically under this approach we would calibrate our camera and compute the intrinsic/extrinsic parameters.

  6. Pooja Gundewar November 28, 2017 at 3:26 am #

    I want to detect the moving object via features, is it ok if only texture based descriptor is selected?

    • Adrian Rosebrock November 28, 2017 at 2:03 pm #

      Hi Pooja — is there a particular reason you want to detect moving objects via features instead of using background subtraction?

  7. Kaustav Mukherjee August 22, 2018 at 2:36 pm #

    Did you get a chance to write the detailed article on using inverted index to search the matching image quickly instead of looping over all images?

    • Adrian Rosebrock August 24, 2018 at 8:55 am #

      Yes, I cover it inside the PyImageSearch Gurus course. Inside the Gurus course you’ll find 40+ lessons dedicated to extracting features from images, building an inverted index, and scaling an image search engine to millions of images.

    • Adrian Rosebrock January 29, 2019 at 6:52 am #

      I discuss (and include code) how to build an image search engine that scale to millions of images inside the PyImageSearch Gurus course. I would suggest starting there.

