Keras: Feature extraction on large datasets with Deep Learning

In this tutorial, you will learn how to use Keras for feature extraction on image datasets too big to fit into memory. You’ll utilize ResNet-50 (pre-trained on ImageNet) to extract features from a large image dataset, and then use incremental learning to train a classifier on top of the extracted features.

Today is part two in our three-part series on transfer learning with Keras:

Last week we discussed how to perform transfer learning using Keras — inside that tutorial we focused primarily on transfer learning via feature extraction.

Using this method we were able to utilize a CNN to recognize classes it was never trained on!

The problem with that method is that it assumes all of our extracted features can fit into memory, and that may not always be the case!

For example, suppose we have a dataset of 50,000 images and want to utilize the ResNet-50 network for feature extraction via the final layer prior to the FC layers; that output volume would be of size 7 x 7 x 2048 = 100,352-dim.

If we had 50,000 such 100,352-dim feature vectors (assuming 32-bit floats), we would need a total of roughly 20.07GB of RAM to store the entire set of feature vectors in memory!
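As a quick sanity check, here is that arithmetic in Python (a back-of-the-envelope calculation, assuming 4 bytes per 32-bit float and decimal gigabytes):

    >>> dims = 7 * 7 * 2048           # 100,352 values per image
    >>> bytes_per_vec = dims * 4      # 32-bit floats are 4 bytes each
    >>> total_bytes = bytes_per_vec * 50000
    >>> total_bytes / (1000 ** 3)     # convert to decimal gigabytes
    20.0704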

Most people don’t have 20GB+ of RAM in their machines, so in those situations, we need to be able to perform incremental learning and train our model on incremental subsets of the data.

The rest of today’s tutorial will show you how to do exactly that.

To learn how to utilize Keras for feature extraction on large datasets, just keep reading!

Looking for the source code to this post?
Jump right to the downloads section.

In the first part of this tutorial, we’ll briefly discuss the concept of treating networks as feature extractors (which was covered in more detail in last week’s tutorial).

From there we’ll investigate the scenario in which your extracted feature dataset is too large to fit into memory — in those situations, we’ll need to apply incremental learning to our dataset.

Next, we’ll implement Python source code that can be used for:

  1. Keras feature extraction
  2. Followed by incremental learning on the extracted features

Let’s get started!

Networks as feature extractors

Figure 1: Left: The original VGG16 network architecture that outputs probabilities for each of the 1,000 ImageNet class labels. Right: Removing the FC layers from VGG16 and instead returning the final POOL layer. This output will serve as our extracted features.

When performing deep learning feature extraction, we treat the pre-trained network as an arbitrary feature extractor, allowing the input image to propagate forward, stopping at a pre-specified layer, and taking the outputs of that layer as our features.

Doing so, we can still utilize the robust, discriminative features learned by the CNN. We can also use them to recognize classes the CNN was never trained on!

An example of feature extraction via deep learning can be seen in Figure 1 at the top of this section.

Here we take the VGG16 network, allow an image to forward propagate to the final max-pooling layer (prior to the fully-connected layers), and extract the activations at that layer.

The output of the max-pooling layer has a volume shape of 7 x 7 x 512, which we flatten into a feature vector of 7 x 7 x 512 = 25,088-dim.

Given a dataset of N images, we can repeat the process of feature extraction for all images in the dataset, leaving us with a total of N feature vectors, each 25,088-dim.

Given these features, we can train a “standard” machine learning model (such as Logistic Regression or Linear SVM) on these features.

Note: Feature extraction via deep learning was covered in much more detail in last week’s post — refer to it if you have any questions on how feature extraction works.

What if your extracted features are too large to fit into memory?

Feature extraction via deep learning is all fine and good…

…but what happens when your extracted features are too large to fit into memory?

Keep in mind that most implementations of Logistic Regression and SVMs (including scikit-learn’s) require your entire dataset to be accessible all at once for training (i.e., the entire dataset must fit into RAM).

That’s fine for smaller datasets, but if you have 50GB, 100GB, or even 1TB of extracted features, what are you going to do?

Most people don’t have access to machines with so much memory.

So, what do you do then?

Solution: Incremental learning (i.e., “online learning”)

Figure 2: The process of incremental learning plays a role in deep learning feature extraction on large datasets.

When your entire dataset does not fit into memory you need to perform incremental learning (sometimes called “online learning”).

Incremental learning enables you to train your model on small subsets of the data called batches.

Using incremental learning the training process becomes:

  1. Load a small batch of data from the dataset
  2. Train the model on the batch
  3. Repeat looping through the dataset in batches, training as we go, until we reach convergence
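In Keras terms, that loop maps naturally onto the train_on_batch method. Here is a minimal sketch, where load_batches is a hypothetical helper standing in for whatever yields your next chunk of data:

    # a minimal sketch of incremental learning -- `load_batches` is a
    # hypothetical generator that yields (data, labels) chunks from disk
    for epoch in range(num_epochs):
        for (batchX, batchY) in load_batches(datasetPath, batch_size=32):
            # update the model weights using only this batch of data
            model.train_on_batch(batchX, batchY)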

But wait — doesn’t that process sound familiar?

It should.

It’s exactly how we train neural networks.

Neural networks are excellent examples of incremental learners.

And in fact, if you check out the scikit-learn documentation, you’ll find that the classification models for incremental learning are either NNs themselves or directly related to NNs (i.e., Perceptron and SGDClassifier).

Instead of using scikit-learn’s incremental learning models, we are going to implement our own neural network using Keras.

This NN will be trained on top of our extracted features from the CNN.

Our training process now becomes:

  1. Extract all features from our image dataset using a CNN.
  2. Train a simple, feedforward neural network on top of the extracted features.

The Food-5K dataset

Figure 3: The Food-5K dataset will be used for this example of deep learning feature extraction with Keras.

The dataset we’ll be using here today is the Food-5K dataset, curated by the Multimedia Signal Processing Group (MSPG) of the Swiss Federal Institute of Technology.

This dataset consists of 5,000 images, each belonging to one of two classes:

  1. Food
  2. Non-food

Our goal today is to:

  1. Utilize Keras feature extraction to extract features from the Food-5K dataset using ResNet-50 pre-trained on ImageNet.
  2. Train a simple neural network on top of these features to recognize classes the CNN was never trained to recognize.

It’s worth noting that the entire Food-5K dataset, after feature extraction, will only occupy ~2GB of RAM if loaded all at once — that’s not the point.

The point of today’s post is to show you how to use incremental learning to train a model on the extracted features.

That way, regardless of whether you are working with 1GB of data or 100GB of data, you will know the exact steps to train a model on top of features extracted via deep learning.

Downloading the Food-5K dataset

To start, make sure you grab the source code for today’s tutorial using the “Downloads” section of the blog post.

Once you’ve downloaded the source code, change directory into transfer-learning-keras:
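    $ cd transfer-learning-keras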

In my experience, I’ve found downloading the Food-5K dataset to be a bit unreliable.

Therefore I’m presenting two options to download the dataset:

Option 1: Use wget  in your terminal

The wget application comes pre-installed on Ubuntu and many other Linux distros. On macOS, you must install it:
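For example, with Homebrew:

    $ brew install wget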

To download the Food-5K dataset, let’s use wget  in our terminal:
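Using the FTP credentials listed under Option 2 below, the command looks roughly like the following (the exact remote path and archive name are assumptions on my part; check the official dataset page if the download fails):

    $ wget --user FoodImage@grebvm2.epfl.ch --password Cahc1moo \
        ftp://tremplin.epfl.ch/Food-5K.zip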

Note: At least on macOS, I’ve found that if the wget  command fails once, just run it again and then the download will start.

Option 2: Use FileZilla

FileZilla is a GUI application for FTP and SCP connections. You may download it for your OS here.

Once you’ve installed and launched the application, enter the credentials:

  • Host: tremplin.epfl.ch
  • Username: FoodImage@grebvm2.epfl.ch
  • Password: Cahc1moo

You can then connect and download the file into the appropriate destination.

Figure 4: Downloading the Food-5K dataset using FileZilla.

The username and password combination was obtained from the official Food-5K dataset website. If the username/password combination stops working for you, check to see if the dataset curators changed the login credentials.

Once downloaded, we can go ahead and unzip the dataset (making sure you are inside the Food-5K/ directory we previously moved into with the cd command):
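Assuming the archive is named Food-5K.zip:

    $ unzip Food-5K.zip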

Project structure

Go ahead and navigate back to the root directory:
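    $ cd ..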

From there, we can inspect our project structure with the tree command:

The config.py  file contains our configuration settings in Python form. Our other Python scripts will take advantage of the config.

Using our build_dataset.py  script, we’ll organize and output the contents of the Food-5K/  directory to the dataset folder.

From there, the extract_features.py  script will use transfer learning via feature extraction to compute feature vectors for each image. These features will be output to a CSV file.

Both build_dataset.py  and extract_features.py  were reviewed in detail last week; however, we’ll briefly walk through them again today.

Finally, we’ll review train.py. In this Python script, we will use incremental learning to train a simple neural network on the extracted features. This script differs from last week’s tutorial, and it is where we will focus our energy.

Our configuration file

Let’s get started by reviewing our config.py  file where we’ll store our configurations, namely the paths to our input dataset of images along with our output paths of extracted features.

Open up the config.py file and insert the following code:
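The exact file is included in the “Downloads”; the sketch below captures the settings the rest of our scripts rely on (treat the variable names as my shorthand rather than a verbatim copy of the downloaded file):

    # config.py -- a sketch of our configuration settings
    import os

    # path to the original, unorganized dataset and to our organized copy
    ORIG_INPUT_DATASET = "Food-5K"
    BASE_PATH = "dataset"

    # names of the training, testing, and validation splits
    TRAIN = "training"
    TEST = "evaluation"
    VAL = "validation"

    # class labels and the batch size used during feature extraction
    # and training
    CLASSES = ["non_food", "food"]
    BATCH_SIZE = 32

    # output paths: the serialized label encoder and the directory that
    # will hold one CSV file of extracted features per split
    LE_PATH = os.path.sep.join(["output", "le.cpickle"])
    BASE_CSV_PATH = "output"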

Take the time to read through the config.py script, paying attention to the comments.

Most of the settings are related to directory and file paths which are used in the rest of our scripts.

For a full review of the configuration, be sure to refer to last week’s post.

Building the image dataset

Whenever I’m performing machine learning on a dataset (and especially Keras/deep learning), I prefer to have my dataset in the format of:

dataset_name/class_label/example_of_class_label.jpg

Maintaining this directory structure not only keeps our dataset organized on disk but also enables us to utilize Keras’ flow_from_directory function when we get to fine-tuning later in this series of tutorials.

Since the Food-5K dataset provides pre-supplied data splits, our final directory structure will have the form:

dataset_name/split_name/class_label/example_of_class_label.jpg

Again, this step isn’t always necessary, but it is a best practice (in my opinion), and one that I suggest you do as well.

At the very least it will give you experience writing Python code to organize images on disk.

Let’s use the build_dataset.py  file to build our directory structure now:
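Below is a condensed sketch of the script (the line numbers cited in the walkthrough refer to the full version from the “Downloads”):

    # build_dataset.py -- a condensed sketch
    from imutils import paths
    import shutil
    import config
    import os

    # loop over the data splits
    for split in (config.TRAIN, config.TEST, config.VAL):
        # grab all image paths in the current split
        print("[INFO] processing '{}' split...".format(split))
        p = os.path.sep.join([config.ORIG_INPUT_DATASET, split])
        imagePaths = list(paths.list_images(p))

        # loop over the image paths
        for imagePath in imagePaths:
            # Food-5K filenames begin with the class label
            # (0 = non-food, 1 = food)
            filename = imagePath.split(os.path.sep)[-1]
            label = config.CLASSES[int(filename.split("_")[0])]

            # construct the split/label output directory, creating it
            # if necessary, then copy the image into it
            dirPath = os.path.sep.join([config.BASE_PATH, split, label])
            if not os.path.exists(dirPath):
                os.makedirs(dirPath)
            shutil.copy2(imagePath, os.path.sep.join([dirPath, filename]))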

After importing our packages on Lines 2-5, we proceed to loop over the training, testing, and validation splits (Line 8).

We create our split + class label directory structure (detailed above) and then populate the directories with the Food-5K images. The result is organized data which we can use for extracting features.

Let’s execute the script and review our directory structure once more.

You can use the “Downloads” section of this tutorial to download the source code — from there, open up a terminal and execute the following command:
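    $ python build_dataset.py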

After doing so, you will encounter the following directory structure:

Notice that our dataset/ directory is now populated. Each subdirectory then has the following format:

split_name/class_label

With our data organized, we’re ready to move on to feature extraction.

Using Keras for deep learning feature extraction

Now that we’ve built our dataset directory structure for the project, we can:

  1. Use Keras to extract features via deep learning from each image in the dataset.
  2. Write the class labels + extracted features to disk in CSV format.

To accomplish these tasks we’ll need to implement the extract_features.py  file.

This file was covered in detail in last week’s post so we’ll only briefly review the script here as a matter of completeness:

On Line 16, ResNet is loaded while excluding the head. Pre-trained ImageNet weights are loaded into the network as well. Feature extraction via transfer learning is now possible using this pre-trained, headless network.

From there, we proceed to loop over the data splits on Line 20.

Inside, we grab all imagePaths  for the particular split  and fit our label encoder (Lines 23-39).

A CSV file is opened for writing (Lines 37-39) so that we can write our class labels and extracted features to disk.

Now that our initializations are all set, we can start looping over images in batches:
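A sketch of the batching loop (still inside the loop over the splits):

    # loop over the images in batches
    for (b, i) in enumerate(range(0, len(imagePaths), config.BATCH_SIZE)):
        # slice out the batch of image paths, encode the labels, and
        # initialize the list of preprocessed images
        batchPaths = imagePaths[i:i + config.BATCH_SIZE]
        batchLabels = le.transform(labels[i:i + config.BATCH_SIZE])
        batchImages = []

        # loop over the image paths in the batch
        for imagePath in batchPaths:
            # load the image, resizing it to 224x224 pixels (the input
            # dimensions ResNet-50 expects)
            image = load_img(imagePath, target_size=(224, 224))
            image = img_to_array(image)

            # add a batch dimension and preprocess the image using the
            # ImageNet statistics, then add it to the batch
            image = np.expand_dims(image, axis=0)
            image = preprocess_input(image)
            batchImages.append(image)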

Each image  in the batch is loaded and preprocessed. From there it is appended to batchImages .

We’ll now send the batch through ResNet to extract features:
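Continuing the sketch (still inside the batch loop):

    # stack the batch and pass it through ResNet; each image yields a
    # 7 x 7 x 2048 volume, which we flatten to a 100,352-dim vector
    batchImages = np.vstack(batchImages)
    features = model.predict(batchImages, batch_size=config.BATCH_SIZE)
    features = features.reshape((features.shape[0], 7 * 7 * 2048))

    # write each class label + feature vector as one row of the CSV
    for (label, vec) in zip(batchLabels, features):
        vec = ",".join([str(v) for v in vec])
        csv.write("{},{}\n".format(label, vec))

Once a split is finished we close its CSV file, and after all three splits are processed we serialize the label encoder:

    # close the CSV file (at the end of each split), then dump the
    # fitted label encoder to disk (after the split loop completes)
    csv.close()
    f = open(config.LE_PATH, "wb")
    f.write(pickle.dumps(le))
    f.close()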

Feature extraction for the batch takes place on Line 72. Using ResNet, our output layer has a volume size of 7 x 7 x 2048. Treating the output as a feature vector, we simply flatten it into a 7 x 7 x 2048 = 100,352-dim vector (Line 73).

The batch of feature vectors is then output to a CSV file with the first entry of each row being the class label  and the rest of the values making up the feature vec .

We’ll repeat this process for all batches inside each split until we finish. Finally, our label encoder is dumped to disk.

For a more detailed, line-by-line review, refer to last week’s tutorial.


To extract features from our dataset, make sure you use the “Downloads” section of the guide to download the source code to this post.

From there, open up a terminal and execute the following command:
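    $ python extract_features.py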

On an NVIDIA K80 GPU the entire feature extraction process took 5m11s.

You can also run extract_features.py on a CPU but it will take much longer.

After feature extraction is complete, you should have three CSV files in your output directory, one for each of our data splits, respectively:

Implementing the incremental learning training script

Finally, we are now ready to utilize incremental learning to apply transfer learning via feature extraction on large datasets.

The Python script we’re implementing in this section will be responsible for:

  1. Constructing the simple feedforward NN architecture.
  2. Implementing a CSV data generator used to yield batches of labels + feature vectors to the NN.
  3. Training the simple NN using the data generator.
  4. Evaluating the feature extractor.

Open up the train.py script and let’s get started:
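The sketches in this section approximate the full train.py included in the “Downloads”; the line numbers cited in the walkthrough refer to that full script. The imports look something like this:

    # train.py (imports) -- a sketch
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.optimizers import SGD
    from keras.utils import to_categorical
    from sklearn.metrics import classification_report
    import numpy as np
    import pickle
    import config
    import os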

Lines 2-10 import our required packages. Our most notable import is Keras’ Sequential API, which we will use to build a simple feedforward neural network.

Several months ago I wrote a tutorial on implementing custom Keras data generators, and more specifically, yielding data from a CSV file to train a neural network with Keras.

At the time, I found that readers were a bit confused about practical applications where you would use such a generator; today is a great example of such a practical application.

Again, keep in mind that we’re assuming that the entire CSV file of extracted features will not fit into memory. Therefore, we need a custom Keras generator to yield batches of labels + data to the network so it can be trained.

Let’s implement the generator now:
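Here is a sketch of the generator, mirroring the logic described in the walkthrough below:

    def csv_feature_generator(inputPath, bs, numClasses, mode="train"):
        # open the CSV file for reading
        f = open(inputPath, "r")

        # loop indefinitely
        while True:
            # initialize our batch of data and labels
            data = []
            labels = []

            # keep looping until we reach our batch size
            while len(data) < bs:
                # attempt to read the next row of the CSV file
                row = f.readline()

                # an empty row means we have reached the end of the
                # file, so reset the file pointer and re-read
                if row == "":
                    f.seek(0)
                    row = f.readline()

                    # if we are evaluating, break so we don't fill the
                    # batch with samples from the start of the file
                    if mode == "eval":
                        break

                # extract the class label and features from the row
                row = row.strip().split(",")
                label = to_categorical(int(row[0]), num_classes=numClasses)
                features = np.array(row[1:], dtype="float")

                # update the data and label lists
                data.append(features)
                labels.append(label)

            # yield the batch to the calling function
            yield (np.array(data), np.array(labels))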

Our csv_feature_generator  accepts four parameters:

  • inputPath : The path to our input CSV file containing the extracted features.
  • bs : The batch size (or length) of each chunk of data.
  • numClasses : An integer value representing the number of classes in our data.
  • mode : Whether we are training or evaluating/testing.

On Line 14, we open our CSV file for reading.

Beginning on Line 17, we loop indefinitely, starting by initializing our data and labels (Lines 19 and 20).

From there, starting on Line 23, we loop until the length of data equals the batch size.

We proceed by reading a line from the CSV (Line 25). Once we have the line we’ll go ahead and process it:

If the row  is empty, we will restart at the beginning of the file (Lines 29-32). And if we are in evaluation mode, we will break  from our loop, ensuring that we don’t fill the batch from the start of the file (Lines 38 and 39).

Assuming we are continuing on, the label  and features  are extracted from the row  (Lines 42-45).

We then append the feature vector ( features ) and label  to the data  and labels  lists, respectively, until the lists reach the specified batch size (Lines 48 and 49).

When the batch is ready, Line 52 yields the data  and labels  as a tuple. Python’s yield  keyword is critical to making our function operate as a generator.

Let’s continue — we have a few more steps before we will train the model:
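A sketch of these steps:

    # load the label encoder from disk
    le = pickle.loads(open(config.LE_PATH, "rb").read())

    # derive the paths to the training, validation, and testing CSVs
    trainPath = os.path.sep.join([config.BASE_CSV_PATH,
        "{}.csv".format(config.TRAIN)])
    valPath = os.path.sep.join([config.BASE_CSV_PATH,
        "{}.csv".format(config.VAL)])
    testPath = os.path.sep.join([config.BASE_CSV_PATH,
        "{}.csv".format(config.TEST)])

    # count the rows (i.e., images) in each CSV so we can tell Keras
    # how many steps make up an epoch (and an evaluation pass)
    totalTrain = sum([1 for row in open(trainPath)])
    totalVal = sum([1 for row in open(valPath)])
    totalTest = sum([1 for row in open(testPath)])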

Our label encoder is loaded from disk on Line 54. We then derive the paths to the training, validation, and testing CSV files (Lines 58-63).

Lines 67 and 68 handle counting the number of images that are in the training and validation sets. With this information, we will be able to tell the .fit_generator function how many batches of size batch_size make up each epoch.

Let’s construct a generator for each data split:
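Something along these lines:

    # construct the training, validation, and testing generators; only
    # the training generator runs in "train" mode
    trainGen = csv_feature_generator(trainPath, config.BATCH_SIZE,
        len(config.CLASSES), mode="train")
    valGen = csv_feature_generator(valPath, config.BATCH_SIZE,
        len(config.CLASSES), mode="eval")
    testGen = csv_feature_generator(testPath, config.BATCH_SIZE,
        len(config.CLASSES), mode="eval")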

Lines 76-81 initialize our CSV feature generators.

We’re now ready to build a simple neural network:
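A sketch of the architecture, matching the 100352-256-16-2 definition discussed below:

    # define our simple feedforward network: two small fully-connected
    # hidden layers on top of the 100,352-dim extracted features
    model = Sequential()
    model.add(Dense(256, input_shape=(7 * 7 * 2048,), activation="relu"))
    model.add(Dense(16, activation="relu"))
    model.add(Dense(len(config.CLASSES), activation="softmax"))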

Contrary to last week’s tutorial where we used a Logistic Regression machine learning model, today we will build a simple neural network for classification.

Lines 84-87 define a simple 100352-256-16-2  feedforward neural network architecture using Keras.

How did I come up with the values of 256  and 16  for the two hidden layers?

A good rule of thumb is to take the square root of the number of nodes in the previous layer and then find the closest power of 2.

In this case, the square root of 100,352 is roughly 316.8, and the closest power of 2 is 256. Repeating the process, the square root of 256 is 16 (itself already a power of 2), thus giving us our architecture definition.

Let’s go ahead and compile  our model :
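A sketch of the compile step (the momentum value here is an assumption on my part; the learning rate and decay schedule follow the description below):

    # compile the model using SGD; the decay term anneals the 1e-3
    # learning rate over our 25 epochs
    opt = SGD(lr=1e-3, momentum=0.9, decay=1e-3 / 25)
    model.compile(loss="binary_crossentropy", optimizer=opt,
        metrics=["accuracy"])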

We compile  our model  using stochastic gradient descent ( SGD ) with an initial learning rate of 1e-3  (which will decay over 25  epochs).

We’re using "binary_crossentropy" for our loss function here as we only have two classes. If you have more than two classes then you should use "categorical_crossentropy".

With our model  compiled, now we are ready to train and evaluate:
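A sketch of the training and evaluation steps:

    # train the network using our training and validation generators
    model.fit_generator(
        trainGen,
        steps_per_epoch=totalTrain // config.BATCH_SIZE,
        validation_data=valGen,
        validation_steps=totalVal // config.BATCH_SIZE,
        epochs=25)

    # the ground-truth labels come straight from the first column of
    # the testing CSV
    testLabels = [int(row.split(",")[0]) for row in open(testPath)]

    # make predictions on the testing data in batches, then show a
    # nicely formatted classification report
    predIdxs = model.predict_generator(testGen,
        steps=(totalTest // config.BATCH_SIZE) + 1)
    predIdxs = np.argmax(predIdxs, axis=1)
    print(classification_report(testLabels, predIdxs,
        target_names=le.classes_))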

Lines 96-101 fit our model  using our training and validation generators ( trainGen  and valGen ). Using generators with our model  allows for incremental learning.

Using incremental learning we are no longer required to have all of our data loaded into memory at one time. Instead, batches of data flow through our network making it easy to work with massive datasets.

Of course, CSV data isn’t exactly an efficient use of space, nor is it fast. Inside Deep Learning for Computer Vision with Python, I teach you how to use HDF5 for more efficient storage.

Evaluation of the model takes place on Lines 107-109, where testGen  generates our feature vectors in batches. A classification report is then printed in the terminal (Lines 110 and 111).

Keras feature extraction results

Finally, we are ready to train our simple NN on the extracted features from ResNet!

Make sure you use the “Downloads” section of this tutorial to download the source code.

From there, open up a terminal and execute the following command:
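    $ python train.py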

Training on an NVIDIA K80 took approximately 30 minutes. You could train on a CPU as well, but it will take considerably longer.

And as our output shows, we are able to obtain ~98-99% accuracy on the Food-5K dataset, even though ResNet-50 was never trained on food/non-food classes!

As you can see, transfer learning is a very powerful technique, enabling you to take the features extracted from CNNs and recognize classes they were not trained on.

Later in this series of tutorials on transfer learning with Keras and deep learning, I’ll be showing you how to perform fine-tuning, another transfer learning method.

What’s next — where do I learn more about transfer learning and feature extraction?

In this tutorial, you learned how to utilize a CNN to recognize class labels it was never trained on.

You also learned how to use incremental learning to accomplish this task.

Incremental learning is critical when your dataset is too large to fit into memory.

But I know as soon as this post is published I’m going to get emails and questions in the comments regarding:

  • “How do I classify images outside my training/testing set?”
  • “How do I load an image from disk, extract features from it using a CNN, and then classify it using the neural network?”
  • “How do I correctly preprocess my input image before classification?”

Today’s tutorial is long enough as it is. I can’t, therefore, include those sections of Deep Learning for Computer Vision with Python inside this post.

If you’d like to learn more about transfer learning, including:

  1. More details on the concept of transfer learning
  2. How to perform feature extraction
  3. How to fine-tune networks
  4. How to classify images outside your training/testing set using both feature extraction and fine-tuning

…then you’ll definitely want to refer to my book, Deep Learning for Computer Vision with Python.

Besides chapters on transfer learning, you’ll also find:

  • Super practical walkthroughs that present solutions to actual, real-world image classification, object detection, and instance segmentation problems.
  • Hands-on tutorials (with lots of code) that not only show you the algorithms behind deep learning for computer vision, but their implementations as well.
  • A no-nonsense teaching style that is guaranteed to help you master deep learning for image understanding and visual recognition.

To learn more about the book, and grab the table of contents + free sample chapters, just click here!

Summary

In this tutorial you learned how to:

  1. Utilize Keras for deep learning feature extraction.
  2. Perform incremental learning on the extracted features.

Utilizing incremental learning enables us to train models on datasets too large to fit into memory.

Neural networks are a great example of incremental learners as we can load data via batches, ensuring the entire dataset does not have to fit into RAM at once. Using incremental learning we were able to obtain ~98% accuracy.

I would suggest using this code as a template for whenever you need to use Keras for feature extraction on large datasets.

I hope you enjoyed the tutorial!

To download the source code to this post (and be notified when future tutorials are published here on PyImageSearch), just enter your email address in the form below!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL! Sound good? If so, enter your email address and I’ll send you the code immediately!


36 Responses to Keras: Feature extraction on large datasets with Deep Learning

  1. Sanchit May 27, 2019 at 10:36 am #

    Hi Adrian,
    What about making a series (or, a blog) of using Python Dask, Keras, Sklearn, OpenCV etc together? It could be an interesting use case of big data and multi-core computing blog by using Dask. Thanks.

    • Adrian Rosebrock May 28, 2019 at 6:37 am #

      I haven’t used Dask before. I’ll take a look. Thanks!

  2. JBeale May 27, 2019 at 11:52 am #

I’m curious what your take is on this ICLR 2019 paper by Geirhos et al. https://openreview.net/forum?id=Bygh9j09KX, which appears to demonstrate that “object recognition” across most DL frameworks is really “texture recognition”. However, you can in fact force true shape recognition and get better generalization as well, but only if you randomize the textures in your training set. General-audience article based on the paper: https://blog.usejournal.com/why-deep-learning-works-differently-than-we-thought-ec28823bdbc

    • Adrian Rosebrock May 28, 2019 at 6:36 am #

      Sorry, I haven’t read the paper you’re referring to. I get requested to read papers and provide my feedback on them nearly every day — it’s honestly not something I can do. If I think a paper is interesting enough I typically write a blog post on it. I may do that for the paper you suggested but I honestly cannot guarantee that.

  3. Jorge May 27, 2019 at 2:03 pm #

    Hi Adrian. Thank you very much for your excellent work.
I would like to ask a basic question. How could we save the final model trained for “food / non-food” for later use as a pre-trained network to recognize food / non-food?
    Thank you
    Jorge, From Argentina

    • Adrian Rosebrock May 28, 2019 at 6:33 am #

      Hi Jorge — I address your exact question in the “What’s next — where do I learn more about transfer learning and feature extraction?” section of the post (kindly give it a read).

      The short answer is that this post is long enough/detailed enough as it is. If you’d like to learn how to save the model and then apply it to your own custom images, you’ll need to refer to Deep Learning for Computer Vision with Python.

  4. David Bonn May 27, 2019 at 2:28 pm #

    Hi Adrian,

    Great post.

François Chollet described a similar approach (using a small Keras model to classify extracted features) on the Keras blog a number of years ago. His examples weren’t as clean and clear as yours, but the idea was similar, and the results for his particular example were impressive. The blog title was “Building powerful image classification models using very little data.”

    I’ve found that in practice it is almost always best to store your training dataset in an HDF5 database or something similar. Even small-ish datasets can be very unwieldy when stored as directory trees.

    Again, thanks for the great and very instructive post.

    • Adrian Rosebrock May 28, 2019 at 6:32 am #

      Yep, I recall the exact article that you’re referring to David! It was a really nice intro to using Keras for transfer learning.

I also prefer to store my dataset in HDF5. Using the generators in Deep Learning for Computer Vision with Python you can obtain faster throughput and reduce training times. Plus, there’s the benefit of better data organization.

  5. Niraj May 28, 2019 at 3:31 am #

Much needed post. Thanks a lot!

    • Adrian Rosebrock May 28, 2019 at 6:30 am #

      Thanks Niraj!

  6. Slim Frikha May 28, 2019 at 12:14 pm #

    Hi Adrian,
Just noticed something that seems wrong to me. Your generator is not implementing mini-batch SGD. It is not stochastic, as the generator is looping over the same batches again and again. This can have a negative effect on the optimization of the model (as the order in which data appears now matters).
Instead, for every mini-batch, the generator should randomly pick m samples from the entire dataset (where m is the mini-batch size).
As a result, you need to vectorize all images but dump them individually. The generator then takes the list of all vector paths and, for every mini-batch, randomly picks samples, reads their vectors, concatenates, and yields.

    • Adrian Rosebrock May 30, 2019 at 9:10 am #

For true mini-batch SGD, yes, you would randomly select an index into the dataset and start looping. However, for large datasets, you may not do this. Random sampling is not a strict requirement of mini-batch SGD, and in some cases, especially for small datasets, it can work. For large datasets it’s not a requirement to perform such an operation, and worse, it’s not I/O efficient.

  7. McHacker May 29, 2019 at 8:46 am #

Hello Adrian, your 17-day course has been a major help to me, thank you very much. But I hope you talk about human action recognition some day. Thank you.

    • Adrian Rosebrock May 30, 2019 at 9:04 am #

      Thanks, I’m glad you liked the course 🙂

  8. Balvinder singh May 29, 2019 at 10:11 pm #

Hi Adrian, all your posts are very impressive and clear. I’m a PhD scholar who has just started coursework. Can you please suggest a simple, novice-friendly problem statement for my research? I’m not a good programmer, so please help me and suggest a simple problem to work on effectively. Thank you.

    • Adrian Rosebrock May 30, 2019 at 9:01 am #

That’s great that you are working on your PhD, but I would suggest speaking with your PhD advisor first: what does your advisor think is a good topic? Try to align your topic with your advisor’s interests so they can provide more help to you.

  9. Talat Shah May 31, 2019 at 1:53 pm #

How do we make our own dataset using Keras?

  10. Denis Brion June 1, 2019 at 9:35 am #

“Training on an NVIDIA K80 took approximately 30 minutes. You could train on a CPU as well, but it will take considerably longer.”
With a Raspberry Pi 3B (4 cores), it takes 6 hours….

    • Adrian Rosebrock June 6, 2019 at 8:35 am #

      Thanks for sharing Denis, although I would NOT recommend using an RPi to actually train a model. You should train the model on a laptop/desktop/GPU machine and then transfer the model to the RPi for inference.

  11. KerasIsFun June 3, 2019 at 11:18 am #

    Hey Adrian. Very useful, informative blog posts!

    Are you available for remote tutoring on an hourly paid basis?

    Thanks!

  12. Abkul June 3, 2019 at 11:20 am #

    Hi Adrian,

Your blogs are super clear, demystifying, and inspiring. Your books are extraordinary.

    Perhaps you could give an example in medical field next time.

    • Adrian Rosebrock June 6, 2019 at 8:00 am #

      Thanks Abkul. I’ve actually done a few medical-related posts. You can find them in the medical category.

  13. Brian Troup June 6, 2019 at 11:19 pm #

If you did part 1, then you do not need to download the Food-5K dataset again or re-build the dataset directory.
Simply create symlinks for Food-5K and dataset using the directories created in part 1.

  14. Andrei June 20, 2019 at 9:58 am #

What about Keras’ speed?
If I need to extract descriptors for N thousand images, I will be waiting a few hours.
On my machine, with a good 14-core CPU + multiprocessing, Keras needs about 0.2 seconds to extract a descriptor. Are there any ways to speed up this extraction (without a GPU, of course)?

    • Adrian Rosebrock June 26, 2019 at 1:54 pm #

      You have three options:

      1. Use a GPU
      2. Parallelize across the system bus and CPU
      3. Parallelize across multiple systems

  15. CH Ho June 25, 2019 at 10:43 pm #

    Hi Adrian,

    Thanks for the tutorial! Great reading and learning as usual!

    I have a question on the calculation:

“For example, suppose we have a dataset of 50,000 images and want to utilize the ResNet-50 network for feature extraction via the final layer prior to the FC layers; that output volume would be of size 7 x 7 x 2048 = 100,352-dim.

If we had 50,000 such 100,352-dim feature vectors (assuming 32-bit floats), we would need a total of roughly 20.07GB of RAM to store the entire set of feature vectors in memory!”

Can you please explain how you calculated the last part? How did you come up with the ~20.07GB of RAM needed to store the feature vectors?

    Many thanks!

    CH

    • Gertjan Brouwer August 15, 2019 at 9:24 am #

Each vector holds 100,352 values at 32 bits each, so 100,352 * 32 = 3,211,264 bits per vector.
3,211,264 / 8 = 401,408 bytes per vector (8 bits per byte when converting from bits to bytes).
401,408 bytes * 50,000 images = 20,070,400,000 bytes.
20,070,400,000 / 1000 = 20,070,400 Kbytes.
20,070,400 / 1000 = 20,070.4 Mbytes.
20,070.4 / 1000 = 20.07 Gbytes.

  16. Der July 24, 2019 at 3:18 am #

Hi Adrian, you’re a genius and winning hearts. By the way, I’ve got a task wherein I’d be dealing with extraction of the primary sound source using a deep neural network. Can you tell me if a neural network can produce an extracted feature as an output? If yes, how, and what would the code for it be?

  17. Gertjan Brouwer August 15, 2019 at 9:19 am #

Is it also possible to use 49 descriptors of 2048 dimensions each? If you flatten the 7*7 to 49, you end up with 49 vectors of 2048 dimensions, which is more similar to SIFT, which has about 1000-2000 vectors of 128 dimensions per image. What are your thoughts on this?

    • Adrian Rosebrock August 16, 2019 at 5:30 am #

      DoG and SIFT are keypoint detector and local invariant descriptor algorithms. That is NOT the same as using a DL model to quantify an image. I think you may be confusing your algorithms/how SIFT is used with a BOVW model. I would suggest going through the PyImageSearch Gurus course where I cover them in detail.

      • Gertjan Brouwer August 16, 2019 at 2:36 pm #

So I assume it is necessary to flatten the whole output layer, which becomes a single 100,352-dim descriptor. But I get worse results with that than with the 49*2048 descriptors, and I don’t know why. I am using a dataset of 4k images with mostly bags and suitcases. I have also tried SIFT, which at the moment gives better results than a neural net descriptor. What are some things I might change to get better descriptors with ResNet/VGG16?

        • Adrian Rosebrock September 5, 2019 at 9:53 am #

          I’d be happy to discuss this project in more detail but I would first suggest you read through either the PyImageSearch Gurus course (which I already linked you to) or Deep Learning for Computer Vision with Python. Both will teach you how to solve this project, and furthermore, I’ll then be able to provide additional help to you.

  18. Dheeraj Khanna December 4, 2019 at 5:59 am #

    Thanks Adrian for the amazing tutorial. Every time I come here I learn something new.

    • Adrian Rosebrock December 5, 2019 at 10:28 am #

      Thanks Dheeraj 🙂
