Multi-label classification with Keras

Today’s blog post on multi-label classification with Keras was inspired by an email I received last week from PyImageSearch reader Switaj.

Switaj writes:

Hi Adrian, thanks for the PyImageSearch blog and sharing your knowledge each week.

I’m building an image fashion search engine and need help.

Using my app a user will upload a photo of clothing they like (ex. shirt, dress, pants, shoes) and my system will return similar items and include links for them to purchase the clothes online.

The problem is that I need to train a classifier to categorize the items into various classes:

Clothing type: Shirts, dresses, pants, shoes, etc.

Color: Red, blue, green, black, etc.

Texture/appearance: Cotton, wool, silk, tweed, etc.

I’ve trained three separate CNNs for each of the three categories and they work really well.

Is there a way to combine the three CNNs into a single network? Or at least train a single network to complete all three classification tasks?

I don’t want to have to apply them individually in a cascade of if/else code that uses a different network depending on the output of a previous classification.

Thanks for your help.

Switaj poses an excellent question:

Is it possible for a Keras deep neural network to return multiple predictions?

And if so, how is it done?

To learn how to perform multi-label classification with Keras, just keep reading.


Multi-label classification with Keras

2020-06-12 Update: This blog post is now TensorFlow 2+ compatible!

Today’s blog post on multi-label classification is broken into four parts.

In the first part, I’ll discuss our multi-label classification dataset (and how you can build your own quickly).

From there we’ll briefly discuss SmallerVGGNet, the Keras neural network architecture we’ll be implementing and using for multi-label classification.

We’ll then take our implementation of SmallerVGGNet and train it using our multi-label classification dataset.

Finally, we’ll wrap up today’s blog post by testing our network on example images and discuss when multi-label classification is appropriate, including a few caveats you need to look out for.

Our multi-label classification dataset

**Figure 1:** A montage of a multi-label deep learning dataset. We’ll be using Keras to train a multi-label classifier to predict both the *color* and the *type* of clothing.

The dataset we’ll be using in today’s Keras multi-label classification tutorial is meant to mimic Switaj’s question at the top of this post (although slightly simplified for the sake of the blog post).

Our dataset consists of 2,167 images across six categories, including:

Black jeans (344 images)
Blue dress (386 images)
Blue jeans (356 images)
Blue shirt (369 images)
Red dress (380 images)
Red shirt (332 images)

The goal of our Convolutional Neural Network will be to predict both color and clothing type.

I created this dataset by following my previous tutorial on How to (quickly) build a deep learning image dataset.

The entire process of downloading the images and manually removing irrelevant images for each of the six classes took approximately 30 minutes.

When trying to build your own deep learning image datasets, make sure you follow the tutorial linked above — it will give you a huge jumpstart on building your own datasets.

Configuring your development environment

To configure your system for this tutorial, I recommend following either of these tutorials:

Either tutorial will help you configure your system with all the necessary software for this blog post in a convenient Python virtual environment.

Please note that PyImageSearch does not recommend or support Windows for CV/DL projects.

Multi-label classification project structure

Go ahead and visit the “Downloads” section of this blog post to grab the code + files. Once you’ve extracted the zip file, you’ll be presented with the following directory structure:

├── classify.py
├── dataset
│   ├── black_jeans [344 entries]
│   ├── blue_dress [386 entries]
│   ├── blue_jeans [356 entries]
│   ├── blue_shirt [369 entries]
│   ├── red_dress [380 entries]
│   └── red_shirt [332 entries]
├── examples
│   ├── example_01.jpg
│   ├── example_02.jpg
│   ├── example_03.jpg
│   ├── example_04.jpg
│   ├── example_05.jpg
│   ├── example_06.jpg
│   └── example_07.jpg
├── fashion.model
├── mlb.pickle
├── plot.png
├── pyimagesearch
│   ├── __init__.py
│   └── smallervggnet.py
├── search_bing_api.py
└── train.py

In the root of the zip, you’re presented with 6 files and 3 directories.

The important files we’re working with (in approximate order of appearance in this article) include:

search_bing_api.py : This script enables us to quickly build our deep learning image dataset. You do not need to run this script as the dataset of images has been included in the zip archive. I’m simply including this script as a matter of completeness.
train.py : Once we’ve acquired the data, we’ll use the train.py script to train our classifier.
fashion.model : Our train.py script will serialize our Keras model to disk. We will use this model later in the classify.py script.
mlb.pickle : A scikit-learn MultiLabelBinarizer pickle file created by train.py — this file holds our class names in a convenient serialized data structure.
plot.png : The training script will generate a plot.png image file. If you’re training on your own dataset, you’ll want to check this file for accuracy/loss and overfitting.
classify.py : In order to test our classifier, I’ve written classify.py. You should always test your classifier locally before deploying the model elsewhere (such as to an iPhone deep learning app or to a Raspberry Pi deep learning project).

The three directories in today’s project are:

dataset : This directory holds our dataset of images. Each class has its own respective subdirectory. We do this to (1) keep our dataset organized and (2) make it easy to extract the class label name from a given image path.
pyimagesearch : This is our module containing our Keras neural network. Because this is a module, it contains a properly formatted __init__.py. The other file, smallervggnet.py, contains the code to assemble the neural network itself.
examples : Seven example images are present in this directory. We’ll use classify.py to perform multi-label classification with Keras on each of the example images.

If this seems like a lot, don’t worry! We’ll be reviewing the files in the approximate order in which I’ve presented them.

Our Keras network architecture for multi-label classification

**Figure 2:** A VGGNet-like network that I’ve dubbed “SmallerVGGNet” will be used for training a multi-label deep learning classifier with Keras.

The CNN architecture we are using for this tutorial is SmallerVGGNet, a simplified version of its big brother, VGGNet. The VGGNet model was first introduced by Simonyan and Zisserman in their 2014 paper, Very Deep Convolutional Networks for Large-Scale Image Recognition.

As a matter of completeness we are going to implement SmallerVGGNet in this guide; however, I’m going to defer any lengthy explanation of the architecture/code to my previous post — please refer to it if you have any questions on the architecture or are simply looking for more detail. If you’re looking to design your own models, you’ll want to pick up a copy of my book, Deep Learning for Computer Vision with Python.

Ensure you’ve used the “Downloads” section at the bottom of this blog post to grab the source code + example images. From there, open up the smallervggnet.py file in the pyimagesearch module to follow along:

# import the necessary packages
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Dense
from tensorflow.keras import backend as K

On Lines 2-10, we import the relevant Keras modules and from there, we create our SmallerVGGNet class:

class SmallerVGGNet:
	@staticmethod
	def build(width, height, depth, classes, finalAct="softmax"):
		# initialize the model along with the input shape to be
		# "channels last" and the channels dimension itself
		model = Sequential()
		inputShape = (height, width, depth)
		chanDim = -1

		# if we are using "channels first", update the input shape
		# and channels dimension
		if K.image_data_format() == "channels_first":
			inputShape = (depth, height, width)
			chanDim = 1

Our class is defined on Line 12. We then define the build function on Line 14, responsible for assembling the convolutional neural network.

The build method requires four parameters: width, height, depth, and classes. The depth specifies the number of channels in an input image, and classes is the number (integer) of categories/classes (not the class labels themselves). We’ll use these parameters in our training script to instantiate the model with a 96 x 96 x 3 input volume.

The optional argument, finalAct (with a default value of "softmax"), will be utilized at the end of the network architecture. Changing this value from softmax to sigmoid will enable us to perform multi-label classification with Keras.

Keep in mind that this behavior is different from our original implementation of SmallerVGGNet in our previous post. We are adding it here so we can control whether we are performing simple (single-label) classification or multi-label classification.
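To make the softmax versus sigmoid distinction concrete, here is a minimal NumPy sketch. The logits are hypothetical values I made up for illustration (they are not outputs of the trained model), and the two helper functions are hand-written stand-ins for the Keras activations.

```python
import numpy as np

def softmax(z):
    # subtract the max for numerical stability before exponentiating
    e = np.exp(z - np.max(z))
    return e / e.sum()

def sigmoid(z):
    # squashes every score independently into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# hypothetical raw scores (logits) for the six labels, in the order
# black, blue, dress, jeans, red, shirt
logits = np.array([-4.0, -2.0, 3.0, -5.0, 4.0, -1.0])

soft = softmax(logits)
sig = sigmoid(logits)

# softmax scores compete with each other and always sum to 1
print("softmax:", np.round(soft, 3), "sum =", round(float(soft.sum()), 3))
# sigmoid scores are independent, so several labels can be near 1 at once
print("sigmoid:", np.round(sig, 3), "sum =", round(float(sig.sum()), 3))
```

With softmax, "dress" and "red" have to share a total probability of 1, so they cannot both score highly. With sigmoid, both can approach 1 at the same time, which is what a multi-label classifier needs.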

From there, we enter the body of build, initializing the model (Line 17) and defaulting to "channels_last" architecture on Lines 18 and 19 (with a convenient switch for backends that support "channels_first" architecture on Lines 23-25).

Let’s build the first CONV => RELU => POOL block:

		# CONV => RELU => POOL
		model.add(Conv2D(32, (3, 3), padding="same",
			input_shape=inputShape))
		model.add(Activation("relu"))
		model.add(BatchNormalization(axis=chanDim))
		model.add(MaxPooling2D(pool_size=(3, 3)))
		model.add(Dropout(0.25))

Our CONV layer has 32 filters with a 3 x 3 kernel and RELU activation (Rectified Linear Unit). We apply batch normalization, max pooling, and 25% dropout.

Dropout is the process of randomly disconnecting nodes from the current layer to the next layer. This process of random disconnects naturally helps the network to reduce overfitting as no one single node in the layer will be responsible for predicting a certain class, object, edge, or corner.
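As a rough illustration of the idea, the sketch below applies "inverted" dropout to a small array with NumPy. The dropout helper is hypothetical and written only for this example. It assumes the common convention of rescaling the surviving units by 1 / (1 - rate) during training, which is my understanding of how the Keras Dropout layer behaves.

```python
import numpy as np

def dropout(activations, rate, rng):
    # keep each unit with probability (1 - rate)
    keep_mask = rng.random(activations.shape) >= rate
    # rescale the survivors so the expected activation is unchanged
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(42)
activations = np.ones((4, 8))

# roughly a quarter of the entries become 0, the rest become 1 / 0.75
dropped = dropout(activations, rate=0.25, rng=rng)
print(dropped)
```

At inference time no units are dropped, so this function would only be applied while training.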

From there we have two sets of (CONV => RELU) * 2 => POOL blocks:

		# (CONV => RELU) * 2 => POOL
		model.add(Conv2D(64, (3, 3), padding="same"))
		model.add(Activation("relu"))
		model.add(BatchNormalization(axis=chanDim))
		model.add(Conv2D(64, (3, 3), padding="same"))
		model.add(Activation("relu"))
		model.add(BatchNormalization(axis=chanDim))
		model.add(MaxPooling2D(pool_size=(2, 2)))
		model.add(Dropout(0.25))

		# (CONV => RELU) * 2 => POOL
		model.add(Conv2D(128, (3, 3), padding="same"))
		model.add(Activation("relu"))
		model.add(BatchNormalization(axis=chanDim))
		model.add(Conv2D(128, (3, 3), padding="same"))
		model.add(Activation("relu"))
		model.add(BatchNormalization(axis=chanDim))
		model.add(MaxPooling2D(pool_size=(2, 2)))
		model.add(Dropout(0.25))

Notice the numbers of filters, kernels, and pool sizes in this code block which work together to progressively reduce the spatial size but increase depth.
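The arithmetic behind that statement can be traced by hand. The sketch below assumes the Keras defaults for MaxPooling2D (strides equal to the pool size and "valid" padding) and a 96 x 96 input. pool_output_size is a helper I wrote for this example, not part of Keras.

```python
def pool_output_size(size, pool):
    # assumes the MaxPooling2D defaults: strides == pool_size, padding="valid"
    return (size - pool) // pool + 1

size, shapes = 96, []
# (number of filters, pool size) for each of the three blocks above
for filters, pool in [(32, 3), (64, 2), (128, 2)]:
    # the "same"-padded 3x3 convolutions leave the spatial size unchanged,
    # so only the pooling layer shrinks the volume
    size = pool_output_size(size, pool)
    shapes.append((size, size, filters))

print(shapes)     # [(32, 32, 32), (16, 16, 64), (8, 8, 128)]

# number of values handed to the fully connected layer after Flatten
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print(flattened)  # 8192
```

The spatial size falls from 96 to 8 while the depth grows from 3 to 128.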

These blocks are followed by our only set of FC => RELU layers:

		# first (and only) set of FC => RELU layers
		model.add(Flatten())
		model.add(Dense(1024))
		model.add(Activation("relu"))
		model.add(BatchNormalization())
		model.add(Dropout(0.5))

		# classifier: softmax or sigmoid, depending on finalAct
		model.add(Dense(classes))
		model.add(Activation(finalAct))

		# return the constructed network architecture
		return model

Fully connected layers are placed at the end of the network (specified by Dense on Lines 57 and 63).

Line 64 is important for our multi-label classification: finalAct dictates whether we’ll use "softmax" activation for single-label classification or "sigmoid" activation in the case of today’s multi-label classification. Refer to Line 14 of this script, smallervggnet.py, and Line 95 of train.py.

Implementing our Keras model for multi-label classification

Now that we have implemented SmallerVGGNet, let’s create train.py, the script we will use to train our Keras network for multi-label classification.

I urge you to review the previous post upon which today’s train.py script is based. In fact, you may want to view them on your screen side-by-side to see the difference and read full explanations. Today’s review will be succinct in comparison.

Open up train.py and insert the following code:

# set the matplotlib backend so figures can be saved in the background
import matplotlib
matplotlib.use("Agg")

# import the necessary packages
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import img_to_array
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.model_selection import train_test_split
from pyimagesearch.smallervggnet import SmallerVGGNet
import matplotlib.pyplot as plt
from imutils import paths
import tensorflow as tf
import numpy as np
import argparse
import random
import pickle
import cv2
import os

On Lines 2-20 we import the packages and modules required for this script. Line 3 specifies a matplotlib backend so that we can save our plot figure in the background.

I’ll be making the assumption that you have Keras, scikit-learn, matplotlib, imutils and OpenCV installed at this point. Be sure to refer to the “Configuring your development environment” section above.

Now that (a) your environment is ready, and (b) you’ve imported packages, let’s parse command line arguments:

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True,
	help="path to input dataset (i.e., directory of images)")
ap.add_argument("-m", "--model", required=True,
	help="path to output model")
ap.add_argument("-l", "--labelbin", required=True,
	help="path to output label binarizer")
ap.add_argument("-p", "--plot", type=str, default="plot.png",
	help="path to output accuracy/loss plot")
args = vars(ap.parse_args())

Command line arguments to a script are like parameters to a function — if you don’t understand this analogy then you need to read up on command line arguments.

We’re working with four command line arguments (Lines 23-32) today:

--dataset : The path to our dataset.
--model : The path to our output serialized Keras model.
--labelbin : The path to our output multi-label binarizer object.
--plot : The path to our output plot of training loss and accuracy.

Be sure to refer to the previous post as needed for explanations of these arguments.

Let’s move on to initializing some important variables that play critical roles in our training process:

# initialize the number of epochs to train for, initial learning rate,
# batch size, and image dimensions
EPOCHS = 30
INIT_LR = 1e-3
BS = 32
IMAGE_DIMS = (96, 96, 3)

# disable eager execution
tf.compat.v1.disable_eager_execution()

These variables on Lines 36-39 define that:

Our network will train for 30 EPOCHS in order to learn patterns by incremental improvements via backpropagation.
We’re establishing an initial learning rate of 1e-3 (the default value for the Adam optimizer).
The batch size is 32. You should adjust this value depending on your GPU capability if you’re using a GPU, but I found a batch size of 32 works well for this project.
As stated above, our images are 96 x 96 and contain 3 channels.

Additional information on hyperparameters is provided in the previous post.

Line 42 disables TensorFlow’s Eager Execution mode. We found that this was necessary during our 2020-06-12 Update to achieve the same accuracy as on the original publication date of this article.

From there, the next two code blocks handle loading and preprocessing our training data:

# grab the image paths and randomly shuffle them
print("[INFO] loading images...")
imagePaths = sorted(list(paths.list_images(args["dataset"])))
random.seed(42)
random.shuffle(imagePaths)

# initialize the data and labels
data = []
labels = []

Here we are grabbing the imagePaths and shuffling them randomly, followed by initializing data and labels lists.

Next, we’re going to loop over the imagePaths, preprocess the image data, and extract the multiple class labels for each image.

# loop over the input images
for imagePath in imagePaths:
	# load the image, pre-process it, and store it in the data list
	image = cv2.imread(imagePath)
	image = cv2.resize(image, (IMAGE_DIMS[1], IMAGE_DIMS[0]))
	image = img_to_array(image)
	data.append(image)

	# extract set of class labels from the image path and update the
	# labels list
	l = label = imagePath.split(os.path.sep)[-2].split("_")
	labels.append(l)

First, we load each image into memory (Line 57). Then, we perform preprocessing (an important step of the deep learning pipeline) on Lines 58 and 59. We append the image to data (Line 60).

Lines 64 and 65 handle splitting the image path into multiple labels for our multi-label classification task. After Line 64 is executed, a 2-element list is created and is then appended to the labels list on Line 65. Here’s an example broken down in the terminal so you can see what’s going on during the multi-label parsing:

$ python
>>> import os
>>> labels = []
>>> imagePath = "dataset/red_dress/long_dress_from_macys_red.png"
>>> l = label = imagePath.split(os.path.sep)[-2].split("_")
>>> l
['red', 'dress']
>>> labels.append(l)
>>>
>>> imagePath = "dataset/blue_jeans/stylish_blue_jeans_from_your_favorite_store.png"
>>> l = label = imagePath.split(os.path.sep)[-2].split("_")
>>> labels.append(l)
>>>
>>> imagePath = "dataset/red_shirt/red_shirt_from_target.png"
>>> l = label = imagePath.split(os.path.sep)[-2].split("_")
>>> labels.append(l)
>>>
>>> labels
[['red', 'dress'], ['blue', 'jeans'], ['red', 'shirt']]

As you can see, the labels list is a “list of lists”: each element of labels is a 2-element list. The two labels in each list are constructed based on the file path of the input image.

We’re not quite done with preprocessing:

# scale the raw pixel intensities to the range [0, 1]
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels)
print("[INFO] data matrix: {} images ({:.2f}MB)".format(
	len(imagePaths), data.nbytes / (1024 * 1000.0)))

Our data list contains images stored as NumPy arrays. In a single line of code, we convert the list to a NumPy array and scale the pixel intensities to the range [0, 1].

We also convert labels to a NumPy array.

From there, let’s binarize the labels. The block below is critical for this week’s multi-label classification concept:

# binarize the labels using scikit-learn's special multi-label
# binarizer implementation
print("[INFO] class labels:")
mlb = MultiLabelBinarizer()
labels = mlb.fit_transform(labels)

# loop over each of the possible class labels and show them
for (i, label) in enumerate(mlb.classes_):
	print("{}. {}".format(i + 1, label))

In order to binarize our labels for multi-label classification, we need to utilize the scikit-learn library’s MultiLabelBinarizer class. You cannot use the standard LabelBinarizer class for multi-label classification. Lines 76 and 77 fit and transform our human-readable labels into a vector that encodes which class(es) are present in the image.

Here’s an example showing how MultiLabelBinarizer transforms a tuple of ("red", "dress") to a vector with six total categories:

$ python
>>> from sklearn.preprocessing import MultiLabelBinarizer
>>> labels = [
...     ("blue", "jeans"),
...     ("blue", "dress"),
...     ("red", "dress"),
...     ("red", "shirt"),
...     ("blue", "shirt"),
...     ("black", "jeans")
... ]
>>> mlb = MultiLabelBinarizer()
>>> mlb.fit(labels)
MultiLabelBinarizer(classes=None, sparse_output=False)
>>> mlb.classes_
array(['black', 'blue', 'dress', 'jeans', 'red', 'shirt'], dtype=object)
>>> mlb.transform([("red", "dress")])
array([[0, 0, 1, 0, 1, 0]])

One-hot encoding transforms categorical labels from a single integer to a vector. The same concept applies to Lines 16 and 17 except this is a case of two-hot encoding.

Notice how on Line 17 of the Python shell (not to be confused with the code blocks for train.py) two categorical labels are “hot” (represented by a “1” in the array), indicating the presence of each label. In this case “dress” and “red” are hot in the array (Lines 14-17). All other labels have a value of “0”.
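If it helps to see what the binarizer is doing, the following pure-Python sketch reproduces the same encoding by hand. multi_hot is a hypothetical helper written for this illustration and is not how scikit-learn implements MultiLabelBinarizer internally. It only assumes that the classes are sorted alphabetically, which matches the mlb.classes_ output shown above.

```python
def multi_hot(label_sets):
    # build the sorted vocabulary of every label seen in the data
    classes = sorted({label for labels in label_sets for label in labels})
    index = {label: i for i, label in enumerate(classes)}

    vectors = []
    for labels in label_sets:
        # start with all zeros, then switch on each label that is present
        row = [0] * len(classes)
        for label in labels:
            row[index[label]] = 1
        vectors.append(row)
    return classes, vectors

classes, vectors = multi_hot([
    ("blue", "jeans"), ("blue", "dress"), ("red", "dress"),
    ("red", "shirt"), ("blue", "shirt"), ("black", "jeans"),
])

print(classes)     # ['black', 'blue', 'dress', 'jeans', 'red', 'shirt']
print(vectors[2])  # ("red", "dress") -> [0, 0, 1, 0, 1, 0]
```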

Let’s construct the training and testing splits as well as initialize the data augmenter:

# partition the data into training and testing splits using 80% of
# the data for training and the remaining 20% for testing
(trainX, testX, trainY, testY) = train_test_split(data,
	labels, test_size=0.2, random_state=42)

# construct the image generator for data augmentation
aug = ImageDataGenerator(rotation_range=25, width_shift_range=0.1,
	height_shift_range=0.1, shear_range=0.2, zoom_range=0.2,
	horizontal_flip=True, fill_mode="nearest")

Splitting the data for training and testing is common in machine learning practice — I’ve allocated 80% of the images for training data and 20% for testing data. This is handled by scikit-learn on Lines 85 and 86.

Our data augmenter object is initialized on Lines 89-91. Data augmentation is a best practice and most likely a “must” if you are working with fewer than 1,000 images per class.

Next, let’s build the model and initialize the Adam optimizer:

# initialize the model using a sigmoid activation as the final layer
# in the network so we can perform multi-label classification
print("[INFO] compiling model...")
model = SmallerVGGNet.build(
	width=IMAGE_DIMS[1], height=IMAGE_DIMS[0],
	depth=IMAGE_DIMS[2], classes=len(mlb.classes_),
	finalAct="sigmoid")

# initialize the Adam optimizer with learning rate decay
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)

On Lines 96-99 we build our SmallerVGGNet model, noting the finalAct="sigmoid" parameter indicating that we’ll be performing multi-label classification.
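The optimizer line also passes decay=INIT_LR / EPOCHS. As a sketch of what that implies, the snippet below assumes the time-based schedule used by the legacy Keras optimizers, lr = lr0 / (1 + decay * iterations), where iterations counts batch updates. Treat the formula as an assumption about the optimizer version in use. The steps_per_epoch value of 54 is taken from the "54/54" progress bars in the training log of this post.

```python
INIT_LR, EPOCHS = 1e-3, 30
steps_per_epoch = 54  # from the "54/54" progress bars in the training log
decay = INIT_LR / EPOCHS

def decayed_lr(iteration):
    # assumption: time-based decay, lr = lr0 / (1 + decay * iterations)
    return INIT_LR / (1.0 + decay * iteration)

# learning rate at the end of the first, middle, and last epoch
for epoch in (1, 15, 30):
    print(epoch, decayed_lr(epoch * steps_per_epoch))
```

Under this assumption the learning rate shrinks by only about 5% over the whole run, so the schedule is very gentle.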

From there, we’ll compile the model and kick off training (this could take a while depending on your hardware):

# compile the model using binary cross-entropy rather than
# categorical cross-entropy -- this may seem counterintuitive for
# multi-label classification, but keep in mind that the goal here
# is to treat each output label as an independent Bernoulli
# distribution
model.compile(loss="binary_crossentropy", optimizer=opt,
	metrics=["accuracy"])

# train the network
print("[INFO] training network...")
H = model.fit(
	x=aug.flow(trainX, trainY, batch_size=BS),
	validation_data=(testX, testY),
	steps_per_epoch=len(trainX) // BS,
	epochs=EPOCHS, verbose=1)

2020-06-12 Update: Formerly, TensorFlow/Keras required use of a method called .fit_generator in order to accomplish data augmentation. Now, the .fit method can handle data augmentation as well, making for more-consistent code. This also applies to the migration from .predict_generator to .predict. Be sure to check out my articles about fit and fit_generator as well as data augmentation.

On Lines 109 and 110 we compile the model using binary cross-entropy rather than categorical cross-entropy.

This may seem counterintuitive for multi-label classification; however, the goal is to treat each output label as an independent Bernoulli distribution and we want to penalize each output node independently.
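Here is a small NumPy sketch of that per-node penalty. The predicted probabilities are hypothetical values chosen for illustration, and binary_crossentropy is a hand-written version of the standard formula rather than the Keras implementation.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # clip to avoid log(0), then score every output node on its own
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# target for ("red", "dress") over black, blue, dress, jeans, red, shirt
y_true = np.array([0, 0, 1, 0, 1, 0], dtype="float")

# hypothetical sigmoid outputs: confident about "red", unsure about
# "dress", and wrongly leaning toward "shirt"
y_pred = np.array([0.01, 0.02, 0.55, 0.01, 0.99, 0.60])

per_label = binary_crossentropy(y_true, y_pred)
print(np.round(per_label, 3))
print("mean loss:", round(float(per_label.mean()), 3))
```

Each of the six outputs contributes its own term, so the wrong "shirt" score is penalized regardless of how well "red" was predicted. The final loss is the mean over the outputs.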

From there we launch the training process with our data augmentation generator (Lines 114-118).
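As a quick sanity check on steps_per_epoch, the arithmetic below uses the 2,165 images reported by the training log in this post. It assumes the 20% test split is rounded up, which is my understanding of how scikit-learn's train_test_split handles a fractional test_size.

```python
n_images = 2165   # image count reported by the training log in this post
batch_size = 32   # BS in train.py

# assumption: the 20% test split is rounded up (integer ceiling)
n_test = (n_images * 20 + 99) // 100
n_train = n_images - n_test

# same expression as steps_per_epoch=len(trainX) // BS
steps_per_epoch = n_train // batch_size

print(n_train, n_test, steps_per_epoch)  # 1732 433 54
```

The result of 54 steps agrees with the "54/54" shown next to each epoch in the training output.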

After training is complete we can save our model and label binarizer to disk:

# save the model to disk
print("[INFO] serializing network...")
model.save(args["model"], save_format="h5")

# save the multi-label binarizer to disk
print("[INFO] serializing label binarizer...")
f = open(args["labelbin"], "wb")
f.write(pickle.dumps(mlb))
f.close()

2020-06-12 Update: Note that for TensorFlow 2.0+ we recommend explicitly setting the save_format="h5" (HDF5 format).

From there, we plot accuracy and loss:

# plot the training loss and accuracy
plt.style.use("ggplot")
plt.figure()
N = EPOCHS
plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, N), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, N), H.history["accuracy"], label="train_acc")
plt.plot(np.arange(0, N), H.history["val_accuracy"], label="val_acc")
plt.title("Training Loss and Accuracy")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="upper left")
plt.savefig(args["plot"])

2020-06-12 Update: In order for this plotting snippet to be TensorFlow 2+ compatible the H.history dictionary keys are updated to fully spell out “accuracy” sans “acc” (i.e., H.history["val_accuracy"] and H.history["accuracy"]). It is semi-confusing that “val” is not spelled out as “validation”; we have to learn to love and live with the API and always remember that it is a work in progress that many developers around the world contribute to.

Accuracy + loss for training and validation is plotted on Lines 131-141. The plot is saved as an image file on Line 142.

In my opinion, the training plot is just as important as the model itself. I typically go through a few iterations of training and viewing the plot before I’m satisfied to share with you on the blog.

I like to save plots to disk during this iterative process for a couple of reasons: (1) I’m on a headless server and don’t want to rely on X-forwarding, and (2) I don’t want to forget to save the plot (even if I am using X-forwarding or if I’m on a machine with a graphical desktop).

Recall that we changed the matplotlib backend on Line 3 of the script up above to facilitate saving to disk.

Training a Keras network for multi-label classification

Don’t forget to use the “Downloads” section of this post to download the code, dataset, and pre-trained model (just in case you don’t want to train the model yourself).

If you want to train the model yourself, open a terminal. From there, navigate to the project directory, and execute the following command:

$ python train.py --dataset dataset --model fashion.model \
	--labelbin mlb.pickle
Using TensorFlow backend.
[INFO] loading images...
[INFO] data matrix: 2165 images (467.64MB)
[INFO] class labels:
1. black
2. blue
3. dress
4. jeans
5. red
6. shirt
[INFO] compiling model...
[INFO] training network...
Epoch 1/30
54/54 [==============================] - 2s 35ms/step - loss: 0.3184 - accuracy: 0.8774 - val_loss: 1.1824 - val_accuracy: 0.6251
Epoch 2/30
54/54 [==============================] - 2s 37ms/step - loss: 0.1881 - accuracy: 0.9427 - val_loss: 1.4268 - val_accuracy: 0.6255
Epoch 3/30
54/54 [==============================] - 2s 38ms/step - loss: 0.1551 - accuracy: 0.9471 - val_loss: 1.0533 - val_accuracy: 0.6305
...
Epoch 28/30
54/54 [==============================] - 2s 41ms/step - loss: 0.0656 - accuracy: 0.9763 - val_loss: 0.1137 - val_accuracy: 0.9773
Epoch 29/30
54/54 [==============================] - 2s 40ms/step - loss: 0.0801 - accuracy: 0.9751 - val_loss: 0.0916 - val_accuracy: 0.9715
Epoch 30/30
54/54 [==============================] - 2s 37ms/step - loss: 0.0636 - accuracy: 0.9770 - val_loss: 0.0500 - val_accuracy: 0.9823
[INFO] serializing network...
[INFO] serializing label binarizer...

As you can see, we trained the network for 30 epochs, achieving:

97.70% multi-label classification accuracy on the training set
98.23% multi-label classification accuracy on the testing set
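One caveat when reading these numbers: as far as I understand, when the loss is binary_crossentropy Keras resolves metrics=["accuracy"] to a per-label (binary) accuracy, not to the fraction of images whose labels are all correct. The sketch below contrasts the two on a few hypothetical predictions that I made up for illustration.

```python
import numpy as np

# hypothetical targets and thresholded predictions for three images,
# columns ordered as black, blue, dress, jeans, red, shirt
y_true = np.array([[0, 0, 1, 0, 1, 0],
                   [0, 1, 0, 1, 0, 0],
                   [0, 0, 0, 0, 1, 1]])
y_pred = np.array([[0, 0, 1, 0, 1, 0],
                   [0, 1, 0, 1, 0, 0],
                   [0, 0, 1, 0, 1, 0]])  # "dress" predicted instead of "shirt"

# fraction of individual label decisions that are correct
per_label_acc = (y_true == y_pred).mean()

# fraction of images for which every label is correct
exact_match_acc = (y_true == y_pred).all(axis=1).mean()

print(per_label_acc, exact_match_acc)
```

Here one wrong image out of three gives roughly 0.89 per-label accuracy but only about 0.67 exact-match accuracy, so the per-label figure can look more flattering than the stricter measure.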

The training plot is shown in Figure 3:

**Figure 3:** Our Keras deep learning multi-label classification accuracy/loss graph on the training and validation data.

Applying Keras multi-label classification to new images

Now that our multi-label classification Keras model is trained, let’s apply it to images outside of our testing set.

This script is quite similar to the classify.py script in my previous post — be sure to look out for the multi-label differences.

When you’re ready, create a new file in the project directory named classify.py and insert the following code (or follow along with the file included with the “Downloads”):

# import the necessary packages
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.models import load_model
import numpy as np
import argparse
import imutils
import pickle
import cv2
import os

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-m", "--model", required=True,
	help="path to trained model")
ap.add_argument("-l", "--labelbin", required=True,
	help="path to label binarizer")
ap.add_argument("-i", "--image", required=True,
	help="path to input image")
args = vars(ap.parse_args())

On Lines 2-9 we import the necessary packages for this script. Notably, we’ll be using Keras and OpenCV in this script.

Then we proceed to parse our three required command line arguments on Lines 12-19.

From there, we load and preprocess the input image:

# load the image
image = cv2.imread(args["image"])
output = imutils.resize(image, width=400)
 
# pre-process the image for classification
image = cv2.resize(image, (96, 96))
image = image.astype("float") / 255.0
image = img_to_array(image)
image = np.expand_dims(image, axis=0)

We take care to preprocess the image in the same manner as we preprocessed our training data.
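To see what the network actually receives, the sketch below repeats the scaling and batching steps on a random stand-in array instead of a real photo. The random image is an assumption made so the example runs without OpenCV or an image file.

```python
import numpy as np

# stand-in for a 96x96 BGR image as returned by cv2.resize
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(96, 96, 3), dtype=np.uint8)

# same scaling as the training data: pixel values mapped into [0, 1]
image = image.astype("float") / 255.0

# add a leading batch dimension, since the model expects a batch of images
batch = np.expand_dims(image, axis=0)

print(batch.shape, batch.min() >= 0.0, batch.max() <= 1.0)
```

The result is a batch of one image with shape (1, 96, 96, 3), matching the input volume the model was built with.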

Next, let’s load the model + multi-label binarizer and classify the image:

# load the trained convolutional neural network and the multi-label
# binarizer
print("[INFO] loading network...")
model = load_model(args["model"])
mlb = pickle.loads(open(args["labelbin"], "rb").read())

# classify the input image then find the indexes of the two class
# labels with the *largest* probability
print("[INFO] classifying image...")
proba = model.predict(image)[0]
idxs = np.argsort(proba)[::-1][:2]

We load the model and multi-label binarizer from disk into memory on Lines 34 and 35.

From there we classify the (preprocessed) input image (Line 40) and extract the top two class labels indices (Line 41) by:

Sorting the array indexes by their associated probability in descending order
Grabbing the first two class label indices which are thus the top-2 predictions from our network

You can modify this code to return more class labels if you wish. I would also suggest thresholding the probabilities and only returning labels with > N% confidence.
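A possible way to do that thresholding is sketched below, next to the top-2 approach used in classify.py. The probabilities are the ones this post reports for examples/example_01.jpg. above_threshold and its default cutoff of 0.5 are hypothetical choices for illustration, not part of the original script.

```python
import numpy as np

classes = np.array(["black", "blue", "dress", "jeans", "red", "shirt"])
# probabilities this post reports for examples/example_01.jpg
proba = np.array([0.0000, 0.0358, 0.9514, 0.0000, 1.0000, 0.6402])

def top_k(proba, k=2):
    # same approach as classify.py: sort descending, keep the first k
    return np.argsort(proba)[::-1][:k]

def above_threshold(proba, threshold=0.5):
    # hypothetical alternative: keep every label above the cutoff
    return np.where(proba > threshold)[0]

print(list(classes[top_k(proba)]))                           # red, dress
print(list(classes[above_threshold(proba)]))                 # dress, red, shirt
print(list(classes[above_threshold(proba, threshold=0.9)]))  # dress, red
```

With a cutoff of 0.5 the 64.02% "shirt" score slips through alongside "dress" and "red", while 0.9 keeps only the two correct labels. The right threshold depends on your data and is worth tuning on a validation set.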

From there, we’ll prepare the class labels + associated confidence values for overlay on the output image:

# loop over the indexes of the high confidence class labels
for (i, j) in enumerate(idxs):
	# build the label and draw the label on the image
	label = "{}: {:.2f}%".format(mlb.classes_[j], proba[j] * 100)
	cv2.putText(output, label, (10, (i * 30) + 25), 
		cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)

# show the probabilities for each of the individual labels
for (label, p) in zip(mlb.classes_, proba):
	print("{}: {:.2f}%".format(label, p * 100))

# show the output image
cv2.imshow("Output", output)
cv2.waitKey(0)

The loop on Lines 44-48 draws the top two multi-label predictions and corresponding confidence values on the output image.

Similarly, the loop on Lines 51 and 52 prints all the predictions in the terminal. This is useful for debugging purposes.

Finally, we show the output image on the screen (Lines 55 and 56).

Keras multi-label classification results

Let’s put classify.py to work using command line arguments. You do not need to modify the code discussed above in order to pass new images through the CNN. Simply use the command line arguments in your terminal as is shown below.

Let’s try an image of a red dress — notice the three command line arguments that are processed at runtime:

$ python classify.py --model fashion.model --labelbin mlb.pickle \
	--image examples/example_01.jpg
Using TensorFlow backend.
[INFO] loading network...
[INFO] classifying image...
black: 0.00%
blue: 3.58%
dress: 95.14%
jeans: 0.00%
red: 100.00%
shirt: 64.02%

**Figure 4:** The image of a red dress has correctly been classified as *“red”* and *“dress”* by our Keras multi-label classification deep learning script.

Success! Notice how the two classes (“red” and “dress”) are marked with high confidence.

Now let’s try a blue dress:

$ python classify.py --model fashion.model --labelbin mlb.pickle \
	--image examples/example_02.jpg
Using TensorFlow backend.
[INFO] loading network...
[INFO] classifying image...
black: 0.03%
blue: 99.98%
dress: 98.50%
jeans: 0.23%
red: 0.00%
shirt: 0.74%

**Figure 5:** The *“blue”* and *“dress”* class labels are correctly applied in our second test of our Keras multi-label image classification project.

A blue dress was no contest for our classifier. We’re off to a good start, so let’s try an image of a red shirt:

$ python classify.py --model fashion.model --labelbin mlb.pickle \
	--image examples/example_03.jpg
Using TensorFlow backend.
[INFO] loading network...
[INFO] classifying image...
black: 0.00%
blue: 0.69%
dress: 0.00%
jeans: 0.00%
red: 100.00%
shirt: 100.00%

**Figure 6:** With 100% confidence, our deep learning multi-label classification script has correctly classified this red shirt.

The red shirt result is promising.

How about a blue shirt?

$ python classify.py --model fashion.model --labelbin mlb.pickle \
	--image examples/example_04.jpg
Using TensorFlow backend.
[INFO] loading network...
[INFO] classifying image...
black: 0.00%
blue: 99.99%
dress: 22.59%
jeans: 0.08%
red: 0.00%
shirt: 82.82%

**Figure 7:** Our Keras multi-label deep learning classifier correctly labels this image as a *“blue”* *“shirt”*.

Our model is very confident that it sees blue, but slightly less confident that it has encountered a shirt. That being said, this is still a correct multi-label classification!

Let’s see if we can fool our multi-label classifier with blue jeans:

$ python classify.py --model fashion.model --labelbin mlb.pickle \
	--image examples/example_05.jpg
Using TensorFlow backend.
[INFO] loading network...
[INFO] classifying image...
black: 0.00%
blue: 100.00%
dress: 0.01%
jeans: 99.99%
red: 0.00%
shirt: 0.00%

**Figure 8:** This deep learning multi-label classification result proves that blue jeans can be correctly classified as both *“blue”* and *“jeans”*.

Let’s try black jeans:

$ python classify.py --model fashion.model --labelbin mlb.pickle \
	--image examples/example_06.jpg
Using TensorFlow backend.
[INFO] loading network...
[INFO] classifying image...
black: 100.00%
blue: 0.00%
dress: 0.01%
jeans: 100.00%
red: 0.00%
shirt: 0.00%

**Figure 9:** Both labels, *“jeans”* and *“black”* are correct in this Keras multi-label classification deep learning experiment.

I can’t be 100% sure that these are denim jeans (they look more like leggings/jeggings to me), but our multi-label classifier is!

Let’s try a final example of a black dress (example_07.jpg). While our network has learned to predict “black jeans” and “blue jeans” along with both “blue dress” and “red dress”, can it be used to classify a “black dress”?

$ python classify.py --model fashion.model --labelbin mlb.pickle \
	--image examples/example_07.jpg
Using TensorFlow backend.
[INFO] loading network...
[INFO] classifying image...
black: 91.28%
blue: 7.70%
dress: 5.48%
jeans: 71.87%
red: 0.00%
shirt: 5.92%

Oh no — a blunder! Our classifier is reporting that the model is wearing black jeans when she is actually wearing a black dress.

What happened here?

Why are our multi-label predictions incorrect? To find out why, review the summary below.

Summary

In today’s blog post you learned how to perform multi-label classification with Keras.

Performing multi-label classification with Keras is straightforward and includes two primary steps:

Replace the softmax activation at the end of your network with a sigmoid activation
Swap out categorical cross-entropy for binary cross-entropy for your loss function

From there you can train your network as you normally would.
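
To see why those two swaps matter, here is a small NumPy sketch (not the Keras code from this post). The logits and the helper functions are made up for illustration. In Keras the same choices correspond to `activation="sigmoid"` on the final layer and `loss="binary_crossentropy"` when compiling the model.

```python
import numpy as np

def softmax(z):
    # probabilities compete with each other and sum to 1
    e = np.exp(z - np.max(z))
    return e / e.sum()

def sigmoid(z):
    # each label gets its own independent probability
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    # mean of the per-label binary cross-entropy terms
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_prob)
        + (1 - y_true) * np.log(1 - y_prob)))

# hypothetical raw scores from the final layer for the classes
# black, blue, dress, jeans, red, shirt
logits = np.array([-4.0, -3.0, 3.0, -5.0, 4.0, -2.0])
# multi-hot target for a red dress: both "dress" and "red" are 1
target = np.array([0, 0, 1, 0, 1, 0], dtype=float)

p_softmax = softmax(logits)
p_sigmoid = sigmoid(logits)

print("softmax:", np.round(p_softmax, 3), "sum =", round(float(p_softmax.sum()), 3))
print("sigmoid:", np.round(p_sigmoid, 3))
print("binary cross-entropy:", round(binary_cross_entropy(target, p_sigmoid), 4))
```

With softmax, “dress” and “red” must share a total probability of 1, so they cannot both be close to 100%. With sigmoid, each label is scored independently, so both can be high at the same time.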

The end result of applying the process above is a multi-label classifier.

You can use your Keras multi-label classifier to predict multiple labels with just a single forward pass.

However, there is a difficulty you need to consider:

You need training data for each combination of categories you would like to predict.

Just like a neural network cannot predict classes it was never trained on, your neural network cannot predict multiple class labels for combinations it has never seen. This behavior comes from how neurons activate inside the network.

Suppose your network is trained on examples of (1) black pants and (2) red shirts, and you now want to predict “red pants” (with no “red pants” images in your dataset). The neurons responsible for detecting “red” and “pants” will fire, but the fully-connected layers have never seen this combination of activations, so your output predictions will very likely be incorrect (i.e., you may get “red” or “pants”, but very unlikely both).

Again, your network cannot correctly make predictions on data it was never trained on (and you shouldn’t expect it to either). Keep this caveat in mind when training your own Keras networks for multi-label classification.
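
One way to spot such gaps before training is to list the label combinations that have no training images. The sketch below assumes one directory per combination, named `color_type`. The directory names and the `missing_combinations` helper are assumptions based on the combinations tested in this post.

```python
from itertools import product

def missing_combinations(dir_names, sep="_"):
    """List color/type pairs with no matching dataset directory.
    Hypothetical helper; assumes names of the form color<sep>type."""
    seen = {tuple(name.split(sep)) for name in dir_names}
    colors = sorted({color for color, _ in seen})
    types = sorted({kind for _, kind in seen})
    # every possible pairing that never appears in the dataset
    return [sep.join(combo) for combo in product(colors, types)
        if combo not in seen]

# assumed directory names, one per combination the network was trained on
dataset_dirs = ["black_jeans", "blue_dress", "blue_jeans",
    "blue_shirt", "red_dress", "red_shirt"]

print(missing_combinations(dataset_dirs))
```

Under these assumptions “black dress” shows up as a missing combination, which is consistent with the incorrect prediction in the final example above.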

I hope you enjoyed this post!

About the Author

Hi there, I’m Adrian Rosebrock, PhD. All too often I see developers, students, and researchers wasting their time, studying the wrong things, and generally struggling to get started with Computer Vision, Deep Learning, and OpenCV. I created this website to show you what I believe is the best possible way to get your start.

228 responses to: Multi-label classification with Keras

Nader Essam

May 7, 2018 at 10:30 am

Great detailed tutorial as usual. thanks for sharing it, Adrian
- Adrian Rosebrock
  
  May 7, 2018 at 10:52 am
  
  Thank you Nader, I’m glad you liked it 🙂
  - Viny
    
    June 7, 2018 at 3:34 am
    
    When I am implementing it for getting the prediction of the classes in real time i.e. taking input frames from webcam, the process became very slow.
    How to make it fast?
    - Adrian Rosebrock
      
      June 7, 2018 at 3:02 pm
      
      What network are you using? Is it the network detailed in this blog post? And what type of webcam? USB/builtin? IP?
  - Praveen
    
    March 4, 2020 at 8:43 am
    
    Sir, is it possible to know whether a person has tucked-in shirt or not?
    I would be grateful if you could answer my question.
Mohammad Raad

May 7, 2018 at 10:59 am

Amazing
- Adrian Rosebrock
  
  May 7, 2018 at 11:08 am
  
  Thank you Mohammad! 🙂
Vishal

May 7, 2018 at 11:14 am

Thank you for all the comprehensive articles you make
- Adrian Rosebrock
  
  May 7, 2018 at 11:39 am
  
  It is my pleasure, Vishal 🙂
Richard Holly

May 7, 2018 at 11:23 am

Interesting one, thanks Adrian.
Maybe the next blog post should be how to do the same, but with fine-tuning. Example – we would like to classify bottles (for that we reuse an already trained network – ) and additionally we would like to have a color attribute (without cheating with OpenCV :)).
- Adrian Rosebrock
  
  May 7, 2018 at 11:39 am
  
  This same approach is absolutely doable via fine-tuning as well 🙂
Dany

May 7, 2018 at 11:26 am

Thanks for sharing, much appreciated. I had the same question and this post addressed it.
- Adrian Rosebrock
  
  May 7, 2018 at 11:40 am
  
  Thanks Dany, I’m happy I could help clear up the question 🙂
Oliver R.

May 7, 2018 at 11:34 am

Once again, amazing tutorial, dude.

I implemented something similar to that at work a while ago and it was really, really slow and wonky (first project I did that I did not use a tutorial as inspiration for).

Your solution is better by far, I’ve learned a lot from your tutorial and will keep that in mind for the future!
- Adrian Rosebrock
  
  May 7, 2018 at 11:40 am
  
  Thanks Oliver, I really appreciate that. I certainly had a lot of fun creating this tutorial and I’m really happy I can share it with you.
  - Oliver R.
    
    May 7, 2018 at 1:10 pm
    
    No problem!
    
    I started out learning machine learning with your tutorials around December last year. I hope that I will one day be as well versed in the subject as you are!
    - Adrian Rosebrock
      
      May 7, 2018 at 1:10 pm
      
      Keep following the blog and I’m sure you will be 😉
  - ImranKhan
    
    May 7, 2018 at 1:53 pm
    
    proba = model.predict(image)[0] what does it mean by [0]
    - Adrian Rosebrock
      
      May 7, 2018 at 2:12 pm
      
      We normally pass batches of images through the network. The .predict method will return an array of predictions the size of the batch. Therefore, with only one image, we grab the first (0-th) entry for the returned result. If you are new to deep learning and Keras I would recommend working through Deep Learning for Computer Vision with Python where I cover these fundamental concepts and techniques in detail.
  - Hussain
    
    February 18, 2020 at 10:51 pm
    
    Dear Sir, your implementation is amazing. i m here to ask the methodology of implementing this through SVM means conventional machine learning approach.
Lorenzo

May 7, 2018 at 12:01 pm

Hi Adrian,

Thank you for your posts, they’re always very interesting. I do have a question though and it’s related to the sigmoid/softmax function. After some research I found that softmax is actually the one that is used for multiclass labeling, while sigmoid is used for binary classification. My understanding is that since the sigmoid function is like an s, one can theoretically set a threshold to understand whether a picture is of class A or class B. These are the links where I read about it and they explain further what I am talking about:
http://dataaspirant.com/2017/03/07/difference-between-softmax-function-and-sigmoid-function/
https://stats.stackexchange.com/questions/233658/softmax-vs-sigmoid-function-in-logistic-classifier
https://www.quora.com/Why-is-it-better-to-use-Softmax-function-than-sigmoid-function

Was there maybe a mistake in your post? Thank you!
Lorenzo
- Adrian Rosebrock
  
  May 7, 2018 at 12:11 pm
  
  Hey Lorenzo — there was no mistake in the post. I would suggest giving the tutorial another read or two as I think you may be struggling with the difference between multi-class classification and multi-label classification. The post will help clear up the differences for you.
  
  The gist is that there is a difference between multi-class classification and multi-label classification. In multi-class classification there are two or more class labels in our dataset. Our model is trained to predict one of these class labels. The softmax output is typically used for this task as it returns probabilities that are more easily understandable (among other reasons). A very hacky and often inaccurate method of turning a multi-class classification model into a multi-label classifier is to compute the probabilities for every label and then take all predictions that have some threshold T% or higher probability. This method doesn’t work well as our network wasn’t trained end-to-end to jointly learn multiple labels.
  
  In multi-label classification our goal is to train a model where each data point has one or more class labels and thus predict multiple labels. To accomplish multi-label classification we:
  
  1. Swap out the softmax classifier for a sigmoid activation
  2. Train the model using binary cross-entropy with multi-hot (binarized) label vectors, where more than one entry can be 1
  
  Again, give the post another read or two to help clear up your concept question.
ImranKhan

May 7, 2018 at 12:07 pm

i dont know why you dont give me the reply command argument not work
- Adrian Rosebrock
  
  May 7, 2018 at 1:05 pm
  
  Sorry, I’m not sure what you are referring to. If you’re having trouble with the command line arguments be sure to refer to this post.
ets

May 7, 2018 at 12:52 pm

Very interesting article! Would it be possible to train objects with a corresponding value in percentage as label?
- Adrian Rosebrock
  
  May 7, 2018 at 1:08 pm
  
  Could you elaborate a bit more on your question? Are you asking if we could use a real-valued output from the model? The answer is yes, we can, but that would involve training the model to perform regression rather than classification.
  - ets
    
    May 8, 2018 at 1:58 am
    
    I know how to extract real valued output instead of classes from the final network but I was wondering whether it would work for training too. Simple example: if you have 10 binary images, would it be possible to use the percentage of white pixels for training?
    
    Thanks
    - Adrian Rosebrock
      
      May 9, 2018 at 9:49 am
      
      What do you mean by “use the percentage of white pixels for training”? Your previous questions seemed to be based on classification but now it appears you are talking about segmentation?
Oliver R.

May 7, 2018 at 1:15 pm

Oh, just one more question:

Do you know of any way to do the reverse of this? As in: You don’t have multiple labels, but rather multiple photos.

For example: you have 4 photos of an object, each from a different perspective, but all 4 photos belong to the same label (e.g. Vans Shoe #12. Same shoe, but several photos of it from different angles).

And then evaluate a photo of a shoe and match it against every label, with the labels having multiple photos to match it against.

That would be really great blog post. I’ve been trying to do this for a few weeks now and I just can’t get anywhere with it.

Thank you!
- Adrian Rosebrock
  
  May 7, 2018 at 1:20 pm
  
  This sounds more like an image search engine/content-based image retrieval. I don’t have any public-facing tutorials on this but I demonstrate how to build such a system inside the PyImageSearch Gurus course. There are ~60 lessons on feature extraction, machine learning, and image search engines in the course — be sure to take a look as your exact case is covered where we have four images in a database and given a query image of one we are looking to find the other three.
  - Oliver R.
    
    May 7, 2018 at 1:25 pm
    
    I have been tempted a few times to purchase the instant access, but I’m worried that my skills in machine learning aren’t up to par to get the full benefit of the courses.
    
    I’ll definitely go ask my boss to get the instant access once I think I am ready to deliver on such an investment, though!
    - Adrian Rosebrock
      
      May 7, 2018 at 2:18 pm
      
      The Gurus course doesn’t assume any previous ML knowledge. It can certainly be helpful but it’s not a requirement. The fact that you are following this blog and that you are asking such questions shows me that you would be ready for the Gurus course. Don’t let the lack of ML experience hold you back. 100’s of other members have started the course knowing only Python, no prior OpenCV or ML/DL experience — they have successfully worked through the course and I’m sure you would be able to (and I’m always around to help as well).
Anirban Ghosh

May 7, 2018 at 1:21 pm

Really great article. Plus you always provide the trained parameters so that I don’t have to train the network and can just use it to classify. I am a student of your Deep Learning for Computer Vision, and I can say that your articles here perfectly complement what I learn there; the amount of free knowledge that you share is just not available anywhere else on the net. Moreover, I have never found even a single piece of your code not working. Thanks for these great articles. Regards,
- Adrian Rosebrock
  
  May 7, 2018 at 1:22 pm
  
  Thank you, Anirban. Your comment really made my day and put a smile on my face 🙂 I’m so happy to hear you are enjoying the blogs, books, and courses. It’s my pleasure to write and create such content and each day I feel lucky and privileged to do so.
Hammad

May 7, 2018 at 1:42 pm

can you please provide this code for real time detection in raspberry pi?
- Adrian Rosebrock
  
  May 7, 2018 at 2:16 pm
  
  I demonstrate how to run Keras + TensorFlow models on the Raspberry Pi in this blog post. I’m confident that you can take the code from the two posts and put together the solution 🙂
  - Hammad
    
    May 7, 2018 at 2:25 pm
    
    Thankyou Sir 🙂
    
    but i am new to this field and working on a project and unfortunately cannot merge two codes :p
    - Adrian Rosebrock
      
      May 7, 2018 at 2:56 pm
      
      It’s okay if you are new to the field but I would recommend investing in your skills a bit before trying to take on more advanced projects. It may be challenging to learn a new skill but if you intend on working on more computer vision or deep learning projects I would highly suggest taking a look at Deep Learning for Computer Vision with Python where I discuss DL and CV in detail (the book will certainly help you get up to speed and well past the “beginner” stage). Just a suggestion for the future and I wish you the very best of luck with the project!
Justin Hirsch

May 7, 2018 at 1:59 pm

Adrian,

Excellent article. My question is in practice, would Switaj’s solution of using multiple CNNs be a better option for identifying unknown fashion combinations. In a real-world case, you might not have training images of every combination of categories, whereas one would have an example of each clothing type, and color type, etc. Is there a way to train a single network to generalize these situations better, by separating the labels into classes (maybe using a different dense layer for each class with a softmax activation function per class?)

Thank You!

Justin
- Adrian Rosebrock
  
  May 7, 2018 at 2:16 pm
  
  Potentially, but that really depends on the data and the distribution of classes in the dataset. I would also be careful with your terminology here. A label belongs to a class and the terms are typically used interchangeably. It might be better to refer to “sub-classes” and “parent classes”.
  
  Unknown combinations are problematic, but the multi-label approach will likely be more accurate and more efficient once you have enough training data. My hybrid recommendation would be to train the network on single classes, have it reach reasonable accuracy, then perform transfer learning via feature extraction for any other more complex tasks. I would still recommend the multi-label approach and gathering more data though.
  - Monali
    
    April 17, 2019 at 6:04 am
    
    Hi Adrian,
    
    I am trying to solve a multi-label classification problem wherein I have an image and inside that image I have 3 different images (passport, DL, SSN). When I pass that single image to the model, it should return all the classes present in that single image. Could you please share a link or some thoughts on how to build a multi-label model for this?
    - Adrian Rosebrock
      
      April 18, 2019 at 6:42 am
      
      I wouldn’t recommend taking a classification approach. Instead, treat it like an object detection problem. You would train an object detector capable of detecting the passport, drivers license, and social security card. Based on the output of the detector you can determine which is present in the input image.
Suke

May 7, 2018 at 9:15 pm

Hi Adrian, another useful post! I want to know if it is possible for Single Shot Detector (SSD) to output “object pose estimation” too? The current implementation of SSD only outputs the predicted class and its bounding box.

Thank you.
- Adrian Rosebrock
  
  May 8, 2018 at 5:51 am
  
  No, your network would need to be trained to perform pose estimation as well. There are papers that discuss how to do this but you cannot take a network trained for object detection and have it output pose estimation as well unless the model was also trained to perform pose estimation.
Ritika

May 7, 2018 at 10:56 pm

Very good information shared…
- Adrian Rosebrock
  
  May 8, 2018 at 5:50 am
  
  Thank you, Ritika!
Lavz

May 8, 2018 at 3:31 am

Excellent Article..
- Adrian Rosebrock
  
  May 8, 2018 at 5:50 am
  
  Thanks Lavz!
Andrey

May 8, 2018 at 9:48 am

Another great tutorial!
Thank you, Adrian.
- Adrian Rosebrock
  
  May 9, 2018 at 9:38 am
  
  Thanks Andrey, I’m glad you liked the tutorial 🙂
Sivaramakrishan Rajaraman

May 8, 2018 at 1:17 pm

Great post, Adrian. There are instances where I have images having more than two labels, like for instance, blue shirt black jeans where I have a collection of images belonging to the folder “blue shirt black jeans”. I need to split the labels into four and append it to the list. How am I supposed to modify the code in the train.py and classify.py scripts to split the labels?
- Adrian Rosebrock
  
  May 9, 2018 at 9:34 am
  
  Exactly how you need to update the code depends on your directory structure but you should take a close look at Lines 11 and 12 followed by the Python shell output where I demonstrate how to parse out the labels. Do not try to train your network until you are 100% positive you are parsing your labels correctly.
Michel

May 9, 2018 at 5:16 am

Hello Adrian, great post !
I would argue that in this case of multi label classification, you would need to use 2 softmaxes (one on shirt/dress/jeans and one on red/blue/black) instead of 6 sigmoids, as there are really 2 classifications going on here. I have done that for an age + gender classification from face pictures and it worked well (it’s a little bit trickier to configure in Keras).
Again congrats on the very clear explanations.
Michel
- Adrian Rosebrock
  
  May 9, 2018 at 9:26 am
  
  Just to clarify, are you referring to two parallel FC layers that both receive the same input from the previous layer?
  - Michel
    
    May 9, 2018 at 3:43 pm
    
    Yes.
    The interest of this is, in my view, that it combines two classification problems into one network. The two parallel softmaxes “force” a selection of the garment type on one side and of the color on the other side.
    - Adrian Rosebrock
      
      May 10, 2018 at 6:13 am
      
      Got it, I understand now. What you are referring to is called “multi-output”, not to be confused with “multi-class” or “multi-label”. A multi-output network can actually combine both “multi-class” (via a softmax classifier) and “multi-label” (via the sigmoid method used in this post). I’m actually planning on covering multi-output networks in a future blog post but I’m going to move it up in the schedule as I agree with you, there would be a lot of interest in this and I would love to cover it.
Faizan Amin

May 9, 2018 at 5:24 am

Hi I get the following error while running the training file

usage: train.py [-h] -d DATASET -m MODEL -l LABELBIN [-p PLOT]
train.py: error: the following arguments are required: -d/--dataset, -m/--model, -l/--labelbin

can anyone help me
- Adrian Rosebrock
  
  May 9, 2018 at 9:24 am
  
  If you are new to command line arguments, that’s okay, but make sure you read up on them first. Once you have an understanding of command line arguments you can solve the problem.
Alex

May 9, 2018 at 1:13 pm

Excellent article. I have been following these posts for some time now as i skill up in deep learning and find these hands on examples an excellent way to learn/comprehend. Did have a question. I tailored your example to model on 2 classes and using a slight variation (sparse_categorical_crossentropy) as i was getting errors when using categorical_crossentropy. Is this a suitable approach? Secondly, if I were to train on a large dataset (i.e. 10-15k images+) is there a way to batch process loading into main memory (line 53) vs sequential to avoid out of memory errors?

Thanks for your help! I’ve purchased your Deep Learning for Computer Vision books and am already looking forward to taking my skills further moving from recognition –> detection –> segmentation
- Adrian Rosebrock
  
  May 10, 2018 at 6:17 am
  
  Hey Alex, thanks for being a PyImageSearch reader 🙂 So a few things to keep in mind here:
  
  1. If you’re only using two classes for classification you should be using binary cross-entropy, not categorical cross-entropy. We use binary cross-entropy for 2 classes and categorical cross-entropy for more than two classes.
  2. You mentioned tailoring this example to your own needs but if you’re using categorical cross-entropy + a softmax classifier I think you may be defeating the purpose of this tutorial. In this guide we are learning to perform multi-label classification where our network can return multiple predictions from a single FC layer. To accomplish this, we need a sigmoid activation + binary cross-entropy as our loss.
  3. You can absolutely work with large datasets. Since you already own a copy of Deep Learning for Computer Vision with Python make sure you refer to the “Working with HDF5” chapter in the Practitioner Bundle where I show you how to (1) serialize an image dataset to HDF5 and (2) write a custom generator to yield batches of data from the HDF5 dataset on disk. Using this approach you can scale your training algorithm to work with gigabytes/terabytes of data.
JJ

May 9, 2018 at 7:43 pm

Hey Adrian,

I have an open source project on Github which may be able to use this functionality but I’m wondering if I’m allowed to include parts of this code (in their original and/or modified forms) in a public repository (assuming I give credit to you including a link to this page in the comments and/or readme)?
- Adrian Rosebrock
  
  May 10, 2018 at 6:11 am
  
  Sure, feel free to use the code. I kindly ask that you give credit and include a link back to the PyImageSearch site in the README. If you can do that I would really appreciate it 🙂 Having a link back really helps me out. Thank you!
YOHANNIS KIFLE (JOE KIFLE)

May 10, 2018 at 2:10 am

Just awesome, perfect!
- Adrian Rosebrock
  
  May 10, 2018 at 6:10 am
  
  Thanks so much, I’m glad you enjoyed it!
Swapnil

May 12, 2018 at 12:48 am

Hello Adrian !!!
I started out learning machine learning with your tutorials around last 3months..and its awesome
Done with most of your tutorials.
Appreciate it.!!!
Thanks for sharing your knowledge with us.
I am doing one of the project where I want to detect person only with own custom only..
Then detect the facial expressions of that person is he (happy,angry,sad,neutral,surprise) as u had done face recognition .
And do body dynamics of the detected person, to check whether he is carrying a gun or a bomb in his clothes. All of this can help us catch thieves in a mall or in an airport.

Can u help me how will be body dynamic or to detect inner part of clothes,if he has something wrong inside his body..
Will it be helpful if you do this for us.
Or help me little bit .to get me in track.
Thank you
- Adrian Rosebrock
  
  May 12, 2018 at 6:15 am
  
  Thanks for the comment Swapnil, and I’m so happy to hear you have been finding the PyImageSearch blog helpful! 🙂
  
  Human activity recognition and pose estimation are still an open area of research. It’s far from perfect but we’re making good strides. I would suggest starting with something like OpenPose. That would help you obtain the “joints” for the human body. From there you might want to run a temporal analysis on how those points change over time. You might be able to find some sort of suspicious behavior.
  
  Otherwise, keep in mind that CV and DL algorithms are not magic. If we cannot “see” the object (gun, bomb, etc.) in the image/video then our CV algorithms won’t either.
  - aleena k varghese
    
    November 5, 2018 at 5:25 am
    
    Please share details about activity recognition
    - Adrian Rosebrock
      
      November 6, 2018 at 1:16 pm
      
      I don’t have any tutorials on activity recognition at the moment but I will certainly consider it for the future.
furkan

May 12, 2018 at 11:57 pm

help me pls systemexit 2 error
- Adrian Rosebrock
  
  May 14, 2018 at 6:54 am
  
  Can you share the full stacktrace of the error? Which script is generating the error?
MImranKhan

May 13, 2018 at 1:27 am

imagePath.split(os.path.sep)[-2].split("_") what does it mean by [-2]
- Adrian Rosebrock
  
  May 14, 2018 at 6:53 am
  
  It’s the index of the array/slice. Please give the post another read — I dissect this line of code thoroughly. I would also suggest opening up a Python shell and playing around with it. Insert some “print” statements, debug, and learn from it 🙂
Moni93

May 15, 2018 at 11:15 am

Hello Adrian! I Great tutorial for beginners who want to get started with a concrete case of multi_label classification.
1/ I have two question for you my friend: why use Accuracy as a metric? To me it’s not too significant because in each output have many possible zeros , so even in the worse case ( which is not predicting anything), your model will have high accuracy but still doesn’t reflect reality of things: example, imagine output of shape (30,) , what if you are supposed to predict two ones and all the rest are zeros, in the worst case your classifier will tell you that all labels are 0 , which in data case give you accuracy= 28/30 , near to 1 but not significant? Why don’t use recall and precision for each label? ( I think it’s not easy to implement directly in Keras because time ago some developers added recall and precision, but it was calculated by batch size, which can be misleading if you want to calculate later the recall and precision over an entire epoch, Best way is to use directly TensorFlow)

2/ Why assemble the images into groups where each group represents a (color, clothing) pair? Can't we just group images by color and by clothing type independently? I mean, instead of having one folder per (color, clothing) pair, construct one folder per attribute (where the attribute is either a color or a clothing type). I am asking because I am working on a case with 36 labels (2 for gender, 15 for clothes, 19 for color), so I did not construct folders by (gender, clothes, color) triplet; I just constructed my folders by attribute (gender, clothing item, or color).

Again, thank you for this tutorial, may be very useful to many since you are introducing many useful notions ( how to build the dataset of images, image augmentation, pre-processing data).
- Adrian Rosebrock
  
  May 17, 2018 at 7:09 am
  
  1. I think this really depends on how your accuracy measure is implemented. If it’s a naive accuracy measure like the one you mentioned, yes, that would be a problem. However, I believe the Keras implementation is actually looking deeper than that and checking whether the entire vector is classified correctly. Perhaps my understanding there is wrong, but we would need to consult the Keras code as I don’t think this specific use case is covered in the documentation. I would encourage you to take a look! Dive in. Research. Discover. That’s the best way to learn.
  
  In either case you can leave out the accuracy measure, swap in a different measure, or implement your own. All are possible using Keras.
  
  2. How you decide to build your dataset directory structure is entirely up to you. I used a simple directory structure here as that is all that is required for the project. It allows us to focus on the code and techniques for multi-label classification rather than dataset directory structure, which is an entirely different subject.
  
  To be honest, if you wanted more than two labels you should be storing the image filename + labels in a CSV or JSON file. I will leave that for readers to implement.
- Shijia
  
  June 14, 2018 at 6:23 am
  
  I am sure that your understanding of the accuracy is right. If the number of class is large, the default accuracy in Keras is misleading. In this blog, the number of class is 5, therefore it seems fine to use the accuracy metric.
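The concern raised in this thread can be checked with a small sketch. It reproduces the 30-label example from the question (two true labels, a model that predicts all zeros) and compares a naive element-wise accuracy with per-vector precision and recall. The helper names are hypothetical, and whether Keras' built-in "accuracy" behaves like the element-wise version should be verified against the Keras source, as the reply suggests.

```python
def elementwise_accuracy(y_true, y_pred):
    """Fraction of individual label slots where prediction equals truth."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive (1) labels of one vector."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# 30 labels, two of which are actually present in the image.
y_true = [1, 1] + [0] * 28
# A degenerate model that never predicts any label.
y_pred = [0] * 30

acc = elementwise_accuracy(y_true, y_pred)
precision, recall = precision_recall(y_true, y_pred)
print(acc, precision, recall)
```

The element-wise accuracy comes out at 28/30 even though not a single label was found, while recall is zero, which is why precision and recall are often reported alongside accuracy when label vectors are sparse.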
Arusze

May 16, 2018 at 12:30 pm

Amazing article! But any hint how get both test accuracy and validation accuracy for each epoch? Thanks!
- Adrian Rosebrock
  
  May 17, 2018 at 6:49 am
  
  Take a look at my reply to “Kangyue”. The gist is that you would take part of your training data and do another “train_test_split” on it to generate your validation data. If you’re interested in learning more about the fundamentals of machine learning/deep learning, including data split best practices and training your own neural networks, be sure to refer to Deep Learning for Computer Vision with Python.
Kangyue

May 16, 2018 at 1:39 pm

Hi there, it’s very clear that you split 80% data for training, 20% data for testing, any chance you would like to think split some data for validation? Thanks.
- Adrian Rosebrock
  
  May 17, 2018 at 6:47 am
  
  You could certainly do a validation split as well. I would recommend doing an 80/20 split to obtain your training data and testing data, then take 10-15% of your split training data for validation.
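The two-stage split described above can be sketched as follows. This uses a small hypothetical `split` helper built on the standard library so that it runs anywhere; in the tutorial's code the same idea would be expressed with two calls to scikit-learn's `train_test_split`. The 1,000-item dataset and the 15% validation fraction are illustrative assumptions.

```python
import random


def split(items, fraction, seed=42):
    """Shuffle a copy of `items` and hold out `fraction` of them.

    Returns (remaining, held_out).
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * fraction))
    return shuffled[cut:], shuffled[:cut]


# Stand-in for a dataset of 1,000 (image, label) pairs.
data = list(range(1000))

# First split: 80% training, 20% testing.
train_full, test = split(data, 0.20)

# Second split: hold out 15% of the training portion for validation.
train, val = split(train_full, 0.15)

print(len(train), len(val), len(test))
```

The test set is carved out first and never touched again, so the validation data used for tuning comes only from the training portion.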
MuTaTeD

May 17, 2018 at 2:26 am

Hi Adrian
Great tutorial, just about what I was looking for 😉
One question, how can we evaluate the accuracy / specificity etc for the multi-label setup.
For example I have 14 label and each (training and test) image will have exactly 3 labels (plus some structure of which labels can come together and which can not etc).

Now I wish to evaluate my system using different matrices… Some guidance please.

Regards
- Adrian Rosebrock
  
  May 17, 2018 at 6:42 am
  
  I’m not sure what you mean by evaluating using different matrices. Are you referring to your vectors of class labels? The “accuracy” metric in Keras will help you determine the accuracy. In fact, the accuracy is already computed using the method proposed in this blog post.
  - MuTaTeD
    
    May 17, 2018 at 12:22 pm
    
    Thanks for the response.
    I meant, how do I generate and interpret the confusion matrix etc. for this system? If the system is reporting 98% validation accuracy, is it just for each individual label or for a group of labels, especially since the setup has all the labels competing with each other in independent Bernoulli trials (lines 103 – 105 of your code)?
    
    For example if the system reports 92% accuracy in my case then doesn’t this mean that the system has 92% probability of accurately discovering (or rejecting) a particular correct label for the image. Thus if I have 3 labels to be applied for the said image then in effect I have 0.92^3 = 77.8% probability that all 3 labels are correct
    - Adrian Rosebrock
      
      May 22, 2018 at 6:56 am
      
      In that case I would recommend looping over each of your testing images individually, computing the labels, and comparing them to the ground-truth. You would compare each and every entry in the vector to determine if they match or differ. If it’s not identical you can increment your “incorrect” counter.
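The per-image comparison suggested in the reply, together with the arithmetic from the question, can be sketched like this. The label vectors are made-up examples, the helper name is hypothetical, and the 0.92^3 estimate assumes that errors on the three labels are independent, which real models will not satisfy exactly.

```python
def exact_match_accuracy(truth, preds):
    """Count an image as correct only if every entry of its vector matches."""
    incorrect = 0
    for y_true, y_pred in zip(truth, preds):
        if list(y_true) != list(y_pred):
            incorrect += 1
    return 1.0 - incorrect / len(truth)


# Three hypothetical test images with four labels each.
truth = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 0, 1]]
preds = [[1, 0, 1, 0], [0, 1, 1, 1], [1, 0, 0, 1]]

# The second image has one wrong entry, so only 2 of 3 vectors match.
acc = exact_match_accuracy(truth, preds)

# If each label were independently right 92% of the time, all three
# labels of an image would be right together with probability 0.92 ** 3.
joint = 0.92 ** 3

print(acc, joint)
```

Exact-match accuracy is the strictest of the common multi-label measures, so it is expected to sit below any per-label figure.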
Kangyue

May 18, 2018 at 10:16 am

That makes sense. Now I have another question: instead of testing the images one by one, is there a way to build a test data array and run the test on the whole data set at once?
- Adrian Rosebrock
  
  May 22, 2018 at 6:41 am
  
  Yes. You build a batch of images by stacking them together via NumPy. I don’t have an example of this on the PyImageSearch blog (yet) but I do cover it in detail inside Deep Learning for Computer Vision with Python.
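A minimal sketch of the stacking step mentioned above, using random arrays as stand-ins for preprocessed images. The 96x96x3 shape is an assumption about the input size, and the commented-out `model.predict` line presumes a trained Keras model named `model` that is not defined here.

```python
import numpy as np

# Stand-ins for four preprocessed images (height, width, channels).
images = [np.random.rand(96, 96, 3).astype("float32") for _ in range(4)]

# Stack along a new leading axis to form one batch:
# four arrays of shape (96, 96, 3) become one array of shape (4, 96, 96, 3).
batch = np.stack(images, axis=0)
print(batch.shape)

# With a trained model, the whole batch can then be classified in one call:
# proba = model.predict(batch)  # one row of label probabilities per image
```

Every image must already have the same shape and preprocessing, otherwise `np.stack` raises an error.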
Jeffry

May 19, 2018 at 9:48 am

Hi Adrian,
I found your explanation in this tutorial really interesting,
but I have some big questions on my mind.
My first question: I still don't understand the difference between VGGNet and SmallerVGGNet. From your explanation, both start with preprocessing and use ReLU and softmax. My guess is that SmallerVGGNet is used because I can't run VGGNet on my own computer, or because we don't have a big dataset, but I still don't have an answer.
My second question: how can we get predictions for all labels as output?
My third question: does this tutorial only apply to images? Can it work on real-time video?
- Adrian Rosebrock
  
  May 22, 2018 at 6:24 am
  
  1. Exactly which architecture you should use depends on your image dataset and the problem you are trying to solve. How to choose an architecture and optimizer is covered inside Deep Learning for Computer Vision with Python.
  
  2. Yes, the model.predict() function will return probabilities for every class label.
  
  3. Yes, you can use video as well. I would suggest using this post to help you get started.
Charles

May 21, 2018 at 10:18 pm

Hi Adrian, I think a “two-headed” model is more suitable for this kind of multi-label classification, and I implemented it in PyTorch (I am not familiar with Keras). I added two heads on top of a pretrained ResNet-18 by replacing its fully connected layer, one head for classifying color and another for classifying the clothing type, and in the end I got an accuracy of 0.99.
- Adrian Rosebrock
  
  May 22, 2018 at 5:54 am
  
  Hey Charles — what you are referring to is actually called “multi-output” classification. Each of the fully-connected heads can be used for either single label or multi-label classification. I’ll be discussing how to implement this method in a future blog post releasing soon.
Rafael

May 23, 2018 at 7:03 am

Hi Adrian, I had a question:
What about whenever you have different large datasets?
– imagine that dataset1 contains information about colours
– dataset2 contains information about types.
This is just an example, imagine you have 10 different datasets but each one contains different data (to be red is independent to be a shirt)

How would you train this?
- Adrian Rosebrock
  
  May 23, 2018 at 7:10 am
  
  You could train N separate networks, one for each dataset, or one network with N fully-connected heads, one for each of the datasets. The problem with the latter approach is that your datasets may not be entirely relevant to each other, and in that case a single network would fail to learn highly discriminative patterns.
Vincent

May 27, 2018 at 12:02 pm

Hi Adrian,

Are you going to cover transfer learning in medical imaging? It seems that transfer learning from ImageNet models doesn’t work nearly as well on medical imaging data because the medical images have so little in common with photographs.
- Adrian Rosebrock
  
  May 28, 2018 at 9:38 am
  
  I would consider it but I would need suggestions on which medical image dataset readers would like me to cover.
  - Vincent
    
    May 29, 2018 at 2:02 am
    
    Thanks for the reply. Do think about it because AI in medical imaging is a hot topic and not many bloggers teach how to do transfer learning on medical imaging. If you write a book to cover the application of deep learning to medical imaging I will definitely buy it!!
    - Adrian Rosebrock
      
      May 31, 2018 at 5:27 am
      
      Medical imaging is such an interesting topic but one of the problems is having enough “interesting” datasets to use that are also publicly available for others to access and work with. I’m always looking for recommendations on public use medical image datasets so if you have any recommendations let me know.
  - Ashong Nartey
    
    May 29, 2018 at 4:43 am
    
    Chest X ray data.
    - Adrian Rosebrock
      
      May 31, 2018 at 5:25 am
      
      Do you have a chest X-ray dataset that you are working with?
Mouad

May 29, 2018 at 12:25 am

Hi Adrian,

Great tutorial as always, thanks a lot.

I have a question concerning your choice of binary_crossentropy instead of categorical_crossentropy, or rather a problem. Is it true that the accuracy/loss is actually inaccurate in the model since you use the binary one? On my dataset, I got like 96% accuracy since the first epoch and I found it very weird, so I wrote a slow script to evaluate the validation set one by one to check the real accuracy and it was only 60%

It may be because of my naive accuracy check, but then I searched in stackoverflow and found this thread : https://stackoverflow.com/questions/42081257/keras-binary-crossentropy-vs-categorical-crossentropy-performance

Thanks again !
- Adrian Rosebrock
  
  May 31, 2018 at 5:33 am
  
  I think you may be confusing accuracy, loss, and our objective function we are minimizing. See my reply to “Lorenzo” on May 7, 2018.
Belhal Karimi

June 1, 2018 at 8:10 am

Hi Adrian, thanks a lot for this.
Stupid question: why do the probabilities of Dress + Jeans not sum to 1?
- Belhal Karimi
  
  June 1, 2018 at 8:12 am
  
  Actually of Dress + Jeans + Shirt: most of the time the total probabilities for labels of the same nature (color or type of clothes) sum to over 1
  - Adrian Rosebrock
    
    June 5, 2018 at 8:26 am
    
    See my reply to your comment (June 5, 2018 at 8:23 am).
Belhal Karimi

June 1, 2018 at 9:51 am

Thanks so much for this.

How come the probabilities for Dress and Shirt sum to over 100% in the first example?
- Adrian Rosebrock
  
  June 5, 2018 at 8:23 am
  
  Keep in mind that these are treated as two independent labels — that is the entire point of this tutorial. Since we are using a sigmoid activation with binary cross-entropy our single fully-connected layer can predict multiple labels.
  - Belhal Karimi
    
    June 25, 2018 at 11:27 am
    
    Got it. Thanks
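The difference can be seen numerically with a small sketch. The three raw scores below are invented for illustration and are not taken from the tutorial's model; only the behaviour of the two activation functions is the point.

```python
import math


def sigmoid(x):
    """Squash one score into (0, 1), independently of all other scores."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs of the final layer for (dress, shirt, jeans).
logits = [2.0, 1.5, -3.0]

sig = [sigmoid(s) for s in logits]
soft = softmax(logits)

print(sig, sum(sig))
print(soft, sum(soft))
```

With sigmoid, each label gets its own yes/no probability, so two labels can both be above 80% and the total can exceed 1. With softmax, the labels compete and the total is always 1.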
Bhushan Muthiyan

June 1, 2018 at 9:41 pm

Hello Adrian,

Thank you for your great tutorial.
I wanted to know about the “fashion.model” file.
How do I build this file for my own dataset with different set of images all together ?

Please let me know.

Thank you.
- Adrian Rosebrock
  
  June 5, 2018 at 8:13 am
  
  The fashion.model file is our serialized Keras model. If you would like to build your own dataset be sure to refer to this blog post.
Andre

June 11, 2018 at 9:03 am

Why not two outputs both with softmax activation functions? Since you know that your classes are disjoint (there cannot be jeans and dress) this would probably work better than a general sigmoid function.
- Adrian Rosebrock
  
  June 11, 2018 at 10:22 am
  
  I actually covered that in a followup post 🙂
Gyat

June 17, 2018 at 6:13 am

@Adrian,

First of all, thank you for your wonderful blog. I have learnt a lot from you. This tutorial too, is amazing. However, as I was trying to reproduce your results, I noticed something unusual:

Background: Your code is set to display the images in a new “Window”. However, I am on a system which does not support a GUI. Hence I had to change it to display the images inside a Jupyter Notebook. I did this by using “import matplotlib.pyplot as plt” and then using the imshow() function of plt.

The unusual: If I use your default code settings (specifically use the cv2.imread() function), the images come out to be blue-ish. Always. The predictions make sense though.
However, I can see the picture properly if I use imageio’s imread(), but the machine sees and predicts something else. Any idea why this behavior happens?
Here’s the link where I have uploaded the two images.
https://github.com/Gyat/queriesNReferences/tree/master/fashionExploration
They have been generated using the same code, the only difference being change in the imread() function. Correct output has cv2.imread() and incorrect output has imageio.imread().
- Adrian Rosebrock
  
  June 19, 2018 at 8:55 am
  
  It’s actually not unusual once you understand what’s going on under the hood. The cv2.imread function returns images in BGR order but matplotlib expects them in RGB order. See this blog post for more information.
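The channel-order issue can be illustrated without loading a real file. This sketch builds a tiny pure-red image in BGR order, as `cv2.imread` would return it, and reverses the last axis to obtain RGB for display. In OpenCV code the equivalent conversion is `cv2.cvtColor(image, cv2.COLOR_BGR2RGB)`. The array is synthetic.

```python
import numpy as np

# A 2x2 pure-red image stored in BGR order, as OpenCV would hold it:
# channel 0 = blue, channel 1 = green, channel 2 = red.
bgr = np.zeros((2, 2, 3), dtype="uint8")
bgr[..., 2] = 255

# Reversing the channel axis converts BGR to RGB, which is the order
# matplotlib's imshow expects.
rgb = bgr[..., ::-1]

print(bgr[0, 0], rgb[0, 0])
```

This likely also explains the second observation in the question: a model trained on BGR arrays from `cv2.imread` sees swapped red and blue channels when fed RGB arrays from another reader, so the conversion should only be applied to the copy used for display.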
Sarthak Ahuja

June 19, 2018 at 1:19 am

Hi Adrian !

First of all, thank you so much for your amazing tutorials! They are great for anyone trying to learn machine/Deep learning and even OpenCV.

I need to classify attributes in a face like colour of eye, hair, skin; facial hair, lighting and so on. Each has few sub-categories in it. So should I directly apply sigmoid on all the labels or separately apply softmax on each subcategory like hair/eye colour etc?
Which one will be better in this case?
Or should I combine both as some subclasses are binary?
So I should choose binary cross entropy for binary-class classification and categorical-cross entropy for multi-class classification? And combine them together afterwards in the same model?

Moreover, should I approach this a multi-output problem or a multi-label classification problem?

I have read your similar tutorial done using dlib and OpenCV, but I am using CNNs for it.

Thanks for your help!

Sarthak
- Adrian Rosebrock
  
  June 19, 2018 at 8:28 am
  
  It’s hard to say which method will work best without seeing the data itself. Normally in these situations a deep learning practitioner would run two experiments:
  
  – One with a single FC layer with sigmoid activations
  – And another experiment with multiple FC layers
  
  The short answer is that you’ll need to evaluate and run experiments to determine your best course of action.
Stephen Cruz

June 21, 2018 at 10:17 am

Hi. Thank you for the tutorial. I encountered a problem in the output when i tried to get an image from the internet and use it for testing .

Traceback (most recent call last):
…
(h, w) = image.shape[:2]
AttributeError: ‘NoneType’ object has no attribute ‘shape’

The examples work out just fine but when the image comes from google. Those lines are displayed in the prompt. I’m new to deep learning so i really don’t understand yet.
Thank you again for this wonderful tutorial
- Adrian Rosebrock
  
  June 21, 2018 at 2:21 pm
  
  Your path to your input images is incorrect. Double-check your file paths. Read this tutorial on NoneType errors. If you are new to OpenCV I would suggest you read through both:
  
  1. Practical Python and OpenCV
  2. Deep Learning for Computer Vision with Python
  
  Both of these books will help you quickly and efficiently get up to speed.
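Because `cv2.imread` returns `None` for a path it cannot read instead of raising an error, the failure only surfaces later at `image.shape`. A small guard before loading makes the cause explicit. The helper name and the example path below are hypothetical.

```python
import os


def require_image_file(path):
    """Fail early, with a clear message, if `path` is not an existing file."""
    if not os.path.isfile(path):
        raise FileNotFoundError("No image file found at: " + path)
    return path


try:
    require_image_file("examples/does_not_exist.jpg")
except FileNotFoundError as err:
    print(err)

# After loading, a second check catches files that exist but cannot be decoded:
# image = cv2.imread(path)
# if image is None:
#     raise ValueError("Could not decode image: " + path)
```

Running the check on every path before training or classifying points directly at the offending file.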
niloo

June 22, 2018 at 2:20 am

Dear Dr.Rosebrock

I am writing to you to ask a few questions about Multi-label classification with Keras in this link:”https://pyimagesearch.com/2018/05/07/multi-label-classification-with-keras/”
– What is the maximum number of classes we can learn using this network?
– How many number of samples per class should we have at least?
– and would you introduce any references to understand how “finalAct = sigmoid” will enable us to perform multi-label classification?

I am eagerly looking forward to hearing from you.
kind regards.
- Adrian Rosebrock
  
  June 25, 2018 at 2:06 pm
  
  1. There isn’t a real “maximum”. You would just need to modify your network architecture to accept more classes and likely deepen it as the classes become more complex.
  2. I recommend 250-1,000 images per class for simple NNs. You should try to get more than 1,000 for deeper and more complex NNs.
  3. Re-read the post. I discuss why the sigmoid activation function is used.
  
  I would also suggest you read through Deep Learning for Computer Vision with Python to help you get up to speed with deep learning. The book covers both fundamentals and advanced topics, ensuring you can effectively apply deep learning to your projects.
Paul

June 27, 2018 at 12:15 am

Dear Dr.Rosebrock

I am kind of confused about the metric ‘accuracy’. For single-label, accuracy can be easy to understand, the number of data points that are being correctly predicted among the whole data points.

But for multi-label classification, how to define correctly predicted? For example in your case, do both the output two labels(e.g. blue & dress) need to be same as the ground truth two labels(e.g. blue & dress)? Or the most possible label(e.g. blue) is the same as one of the ground truth labels( e.g. blue & dress)?
- Adrian Rosebrock
  
  June 28, 2018 at 8:10 am
  
  Both predicted labels should match the two ground-truth labels.
panpan zhu

July 11, 2018 at 8:33 am

It is a great work with a lot of details. Thanks for your sharing!
- Adrian Rosebrock
  
  July 13, 2018 at 5:15 am
  
  Thanks panpan, I’m glad you found it helpful! 🙂
Ario

July 22, 2018 at 1:24 am

Hi!
I renamed the output model and labelbin name to create a new model but I got the following error:

OSError: Unable to create file (unable to open file: name = ‘/keras-multi-label/my.model/’, errno = 21, error message = ‘Is a directory’, flags = 13, o_flags = 242)

what is the problem?
- Ario
  
  July 22, 2018 at 1:57 am
  
  Hi again!
  I found the problem and I fixed it but now I’ve got another error.
  my dataset is 5905.66MB and I turned all the data augmentation parameters to 0 like this :
  
  ImageDataGenerator(rotation_range=0, width_shift_range=0,
  height_shift_range=0, shear_range=0, zoom_range=0,
  horizontal_flip=False, fill_mode=”nearest”)
  
  and now I have ” MemoryError ”.
  any ideas?
  - Adrian Rosebrock
    
    July 25, 2018 at 8:25 am
    
    The code used in this post assumes that you can fit your entire image dataset into RAM. Your image dataset is too large to fit into RAM. I discuss techniques to create custom data generators that scale to terabytes of images inside Deep Learning for Computer Vision with Python — I suggest you start there.
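The general idea behind a data generator is to keep only file paths in memory and load one batch of images at a time. The following is a generic sketch of that idea, not the implementation from the book. `batch_generator` and `load_fn` are hypothetical names, and `str.upper` stands in for a real image-loading function.

```python
def batch_generator(paths, batch_size, load_fn):
    """Yield batches forever, loading each item only when it is needed."""
    while True:
        for start in range(0, len(paths), batch_size):
            chunk = paths[start:start + batch_size]
            # Only this chunk is loaded into memory at any one time.
            yield [load_fn(p) for p in chunk]


# Five hypothetical image paths and a placeholder "loader".
paths = ["img_%d.jpg" % i for i in range(5)]
gen = batch_generator(paths, batch_size=2, load_fn=str.upper)

first = next(gen)
print(first)
```

Because the generator loops indefinitely, the training loop needs to be told how many batches make up one epoch.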
Enjia Chen

August 3, 2018 at 7:16 am

Hi Adrian, I want to run the code in the server with 4 P40 GPUs (OS: CentOS7) because it is hard to run it in my local computer (Windows7). However I have a problem, you know for the safety reasons server can’t connect to Internet so I have to install Keras offline. And do you have any good idea to handle this tough problem for I tried to install .whl one by one but it seemed useless, meanwhile there’s a strange phenomenon that keras_preprocessing.whl and keras_application.whl rely on each other.Thanks!
- Adrian Rosebrock
  
  August 7, 2018 at 7:07 am
  
  Hey Enjia — can you physically access the machine? Or can you connect to it through a local intranet? If so, clone down all the repositories you need, move them to a thumb drive (or connect via intranet), copy the files to the server, and then install from source.
Abhishek

August 4, 2018 at 4:20 am

Hey can you please explain me the contents of fashion.model
- Adrian Rosebrock
  
  August 7, 2018 at 6:54 am
  
  The fashion.model file is a binary HDF5 file generated by Keras. It includes the model architecture and the weights for each layer. It is not meant to be human readable and it’s only meant to be read and interpreted by the Keras library.
Dhruv Chamania

August 5, 2018 at 7:30 am

What will be the mAP and all that stuff for the given network? How do I calculate it? Can you give a tutorial on that? I don't really understand the concept properly. Every trained network architecture I see reports those parameters.
- Adrian Rosebrock
  
  August 7, 2018 at 6:50 am
  
  Hey Dhruv, I cannot guarantee I will have an entire tutorial dedicated to mAP but I will consider it for the future.
Dhruv

August 6, 2018 at 6:17 am

Hi Adrian,

I went ahead and implemented your network into google colab.

https://github.com/dhruvchamania/Using-Google-Colab-To-Train-CNN

Could you provide a more detailed guide for this, especially for networks like VGG, GoogLeNet, AlexNet?

You could maybe refine the tedious approach of mine of training in GoogleColab.
- Adrian Rosebrock
  
  August 7, 2018 at 6:44 am
  
  Hi Dhruv, I actually provided super detailed guides for VGG, GoogLeNet, AlexNet, SqueezeNet, ResNet and many more inside Deep Learning for Computer Vision with Python. That would be by far my recommended starting point for you.
  - Dhruv
    
    August 8, 2018 at 4:43 am
    
    Ok thanks. In the process of saving to buy PyImageSearch, btw.
    - Adrian Rosebrock
      
      August 9, 2018 at 2:57 pm
      
      Sounds great, Dhruv 🙂 I promise that it is an excellent investment of both your time and finances. Always feel free to reach out if you have any questions.
Johan

August 15, 2018 at 3:51 pm

Hey Adrian, thanks for your tutorial! It helped me a lot for my personal project. I just have a question about an issue that I had:

I tried to personalize it by training a simple model to recognize triangles, squares, and circles in a picture. I only had to change the dataset by downloading pictures from Kaggle. The model that I trained is not accurate at all. It only detects triangles and always says 100%. I was wondering if I also had to change mlb.pickle.
- Adrian Rosebrock
  
  August 17, 2018 at 8:14 am
  
  Is there a particular reason you’re using multi-label classification here? A shape should be only one type: a triangle, square, or etc. A square cannot be a triangle, so again, I’m a bit confused why you are using multi-label classification.
Chandu

August 25, 2018 at 8:56 am

Hi Adrian,

Thanks a lot for the tutorial. Its helping me a lot 🙂 .
I tried to implement the same with a different dataset which is similar but the image sizes are less.Most of them are between 10k to 30k.
I can see that images are loaded but there is some prob while resizing.

When I try running this code after making necessary changes its throwing this error:

image = cv2.resize(image, (IMAGE_DIMS[1], IMAGE_DIMS[0]))
error: (-215:Assertion failed) !ssize.empty() in function ‘cv::resize’

can you help me on this?
- Adrian Rosebrock
  
  August 30, 2018 at 9:34 am
  
  Double-check your path to the input directory of images. It sounds like a path is invalid and then “cv2.imread” is returning None.
  - Aniket
    
    August 31, 2018 at 5:22 am
    
    I had the same issue a while ago.. Once double check the image data.
    add print (“[INFO] Image Name : ” + imagePath) line just after image = cv2.imread(imagepath). you can know which of the image file is corrupted or the filename is not correct.. also the file names must be in decimal number format.
    - Adrian Rosebrock
      
      September 3, 2018 at 5:02 pm
      
      Thanks for sharing, Aniket!
Aniket Srivastav

September 4, 2018 at 12:23 am

Hi Adrian,

I am a bit confused about training the model. I can see that you have around 2000 images in total, split 80:20 for training and validation, so ~1600 images for training and ~400 images for validation. So I expected STEPS PER EPOCH = 1600, but while training it appears to be trained on just 57. May I know the reason? I tried with my own dataset (9000 images with 9 classes), and there the STEPS PER EPOCH is only 257. I am confused about how it is splitting the images for training and validation.
- Adrian Rosebrock
  
  September 5, 2018 at 8:49 am
  
  Where are you seeing your steps per epoch value?
Aniket

September 6, 2018 at 2:40 am

I am sorry. I missed that you are dividing the step per epoch by batch size. Could you explain me why you are dividing the step per epoch by the batch size. I mean what’s the difference if I take the step per epoch as total no. of training data instead of dividing it by batch size. Is it making the training more robust. if so then could you please explain me how?
- Adrian Rosebrock
  
  September 11, 2018 at 8:51 am
  
  It has nothing to do with accuracy or making the model robust. It has everything to do with the training procedure itself. The .fit_generator function does not know how many steps there are per epoch since the generator will produce data indefinitely. Instead, we tell it how many steps there are by dividing the total number of training images by the batch size, resulting in the steps per epoch. If you need further help understanding the fundamentals of deep learning and training procedures (including my best practices, tips, and tricks), I would recommend you read through Deep Learning for Computer Vision with Python.
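The arithmetic behind the steps-per-epoch value from the question above can be sketched as follows. The figures are illustrative assumptions (1,600 training images and a batch size of 32), not the exact numbers from the tutorial's run.

```python
# Illustrative numbers: 1,600 training images, batches of 32 images.
num_train_images = 1600
BS = 32

# Each step processes one batch, so one pass over the training data
# takes (number of images) / (batch size) steps.
steps_per_epoch = num_train_images // BS

# Every image is still seen once per epoch: steps * batch size.
images_seen_per_epoch = steps_per_epoch * BS

print(steps_per_epoch, images_seen_per_epoch)
```

Setting the steps per epoch to the raw image count instead would make every "epoch" cover the dataset as many times as the batch size, which only lengthens training per reported epoch.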
KVS Setty

September 20, 2018 at 7:31 am

Hello Adrian,

A great tutorial with excellent algorithm and implementation (code) explanation.

But a small hiccup , please refer to train.py code explanation you have refereed to code lines 53 to 61, but no such line no’s in the block above the explanation, I think some how the line numbers are re-numbered to 1 to 12 . The code is still understandable anyway, except for some very newbies like me.

I appreciate your great effort in making most complicated concepts of DL and CV, understandable to a beginner.
Nihel

September 24, 2018 at 1:11 pm

hello Adrian,
Really an excellent article, just a question what the figure 54 when doing the training network and thank you in advance.
Jitender Saini

September 30, 2018 at 11:24 am

Hello,
It’s really a nice tutorial and I followed it up and created my model for clothes classifications. But I’m facing a serious problem with this approach is, I’m training around 19 GB of data with this that is taking infinite time and memory to train, I’m trying to change the approach in this model with custom generator but I’m not succeed, It will be a great help if you can really post an approach with custom generator for train.py file.

Thanks a lot
Jitender
- Adrian Rosebrock
  
  October 8, 2018 at 10:54 am
  
  Hey Jitender, I would suggest you take a look at Deep Learning for Computer Vision with Python, in particular the Practitioner Bundle, where I demonstrate how to build custom Keras generators that can be used to efficiently scale to terabytes of data. Be sure to give it a look as I’m more than confident it will help you with your own projects 🙂
sara

October 8, 2018 at 7:03 am

Hello Adrian,

what is the filter name used in this network ??
- Adrian Rosebrock
  
  October 8, 2018 at 9:30 am
  
  Hey Sara, the filters are automatically learned by the CNN. There is no “pre-defined” kernel such as Scharr, Sobel, Prewitt, etc.
zeyawu

October 21, 2018 at 11:12 pm

Hello, Adrian!
Thank you for the code provided.When I use code to load big data, there is not enough memory. So I tried to load the data in bulk. However, I am confused because the category data changes with the loaded data. Do you have any suggestions for improvement?
- Adrian Rosebrock
  
  October 22, 2018 at 7:53 am
  
  How much data are you working with? Inside my book, Deep Learning for Computer Vision with Python, I discuss how to train networks on datasets too large to fit into memory (on the order of terabytes of data). Check it out as I believe it will help you solve your problem/project.
Abhijit

October 31, 2018 at 1:19 am

Hi Adrian,

Thanks for the tutorial, Just a small doubt. How does binary cross entropy function work in the multilabel problem? Does it take the average or the sum of individual binary cross entropy?
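On the question of how the per-label losses are combined: to the best of my understanding, the Keras `binary_crossentropy` loss averages the individual binary cross-entropy terms over the label axis rather than summing them, but this is worth confirming against the Keras source for your version. The sketch below computes both the per-label terms and their mean in plain Python. The function name, the clipping constant, and the example values are assumptions.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Return (mean loss, per-label losses) for one multi-label vector."""
    per_label = []
    for t, p in zip(y_true, y_pred):
        # Clip predictions away from 0 and 1 to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        per_label.append(-(t * math.log(p) + (1 - t) * math.log(1 - p)))
    return sum(per_label) / len(per_label), per_label


# Hypothetical target and sigmoid outputs for three labels.
y_true = [1, 0, 1]
y_pred = [0.9, 0.2, 0.6]

mean_loss, per_label = binary_crossentropy(y_true, y_pred)
print(per_label, mean_loss)
```

Each label contributes its own independent term, which is what allows several labels to be active for the same image.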
Akash

November 6, 2018 at 6:24 am

from pyimagesearch.smallervggnet import SmallerVGGNet
Traceback (most recent call last):

File “”, line 1, in
from pyimagesearch.smallervggnet import SmallerVGGNet

ModuleNotFoundError: No module named ‘pyimagesearch’

I’am getting this error.Please help.
Also if I want to use some other dataset can I use this same code and only change the path to my required dataset?
- Adrian Rosebrock
  
  November 6, 2018 at 12:34 pm
  
  Make sure you use the “Downloads” section of this blog post to download the source code rather than copying and pasting. You likely accidentally introduced an error when copying and pasting the code.
  - Akash
    
    November 6, 2018 at 1:24 pm
    
    Sir I have used the downloads section.But I still get this error
    - Adrian Rosebrock
      
      November 6, 2018 at 1:37 pm
      
      Make sure you are in the directory where you downloaded the source code.
      
      $ cd path_to_your_download
      $ python train.py --dataset dataset --model fashion.model \
      	--labelbin mlb.pickle
Akash

November 6, 2018 at 1:22 pm

I want to use some other dataset can I use this same code and only change the path to my required dataset?
- Adrian Rosebrock
  
  November 6, 2018 at 1:37 pm
  
  Yes. Just make sure your dataset has the same directory structure as mine.
  - Akash
    
    November 7, 2018 at 3:16 am
    
    I don’t have the .model and .pickle file for my dataset.How do I work around that.?
    - Adrian Rosebrock
      
      November 10, 2018 at 10:20 am
      
      Those files are automatically created for you when you train your model.
Akash

November 12, 2018 at 8:42 am

Traceback (most recent call last):

…
(h, w) = image.shape[:2]
AttributeError: ‘NoneType’ object has no attribute ‘shape’

I get this error when I use a personal image but everything works perfectly fine when I use an image from examples.What could be the reason
- Adrian Rosebrock
  
  November 13, 2018 at 4:41 pm
  
  The path to your input image is incorrect. Double-check it and then read this tutorial on NoneType errors.
janak

December 6, 2018 at 5:40 am

Hi Adrian,

I am performing the multi-labelling classification but I do not know that how can I make labbeling for it. I have 10 labels for 3000 images.

Can you share any idea me how I can make Multi-hot encoding?

Thank you
- Adrian Rosebrock
  
  December 6, 2018 at 9:27 am
  
  The MultiLabelBinarizer class performs multi-label encoding for you. I would suggest taking a look at it (we use it in this tutorial).
  - janak
    
    December 10, 2018 at 9:29 am
    
    Thank you for nice guidance.
    
    can you suggest me some feed forward neural network, that can work as a multi labelling?
    - Adrian Rosebrock
      
      December 11, 2018 at 12:44 pm
      
      Nearly any CNN can work for multi-label classification, it just requires updating the final fully-connected layer at the end of the network.
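To show what multi-hot encoding produces, here is a plain-Python sketch that mimics the behaviour of scikit-learn's `MultiLabelBinarizer` (sorted class list, one 0/1 slot per class). The `multi_hot` helper and the example labels are hypothetical. In practice the scikit-learn class used in the tutorial does this for you.

```python
def multi_hot(label_sets):
    """Encode each collection of labels as a 0/1 vector over all classes."""
    # Collect every distinct label and fix an order for the vector slots.
    classes = sorted({label for labels in label_sets for label in labels})
    index = {c: i for i, c in enumerate(classes)}

    encoded = []
    for labels in label_sets:
        row = [0] * len(classes)
        for label in labels:
            row[index[label]] = 1  # switch on the slot for each present label
        encoded.append(row)
    return classes, encoded


classes, encoded = multi_hot([
    ("red", "dress"),
    ("blue", "jeans"),
    ("blue", "dress"),
])
print(classes)
print(encoded)
```

The length of each vector equals the number of distinct labels, which is also the number of units needed in the network's final sigmoid layer.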
Melvin

January 4, 2019 at 4:47 pm

Hi Adrian, this was an amazing tutorial, but how would you handle a mutually exclusive type, for example a dress that is red and black?

I see you were asking about a medical dataset for imaging. Kaggle has a contest about image cell classification. It’s quite a complex classification problem.
- Adrian Rosebrock
  
  January 5, 2019 at 8:38 am
  
  You could still use the same approach. You would just need to encode your training label vector such that the “red” and “dress” labels are activated in the one-hot encoding.
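As a rough illustration of what "activating" several labels in one vector looks like, here is a dependency-free sketch of multi-hot encoding. scikit-learn's MultiLabelBinarizer, which the tutorial uses, handles this for you. The functions below are hypothetical stand-ins that show the idea, and the annotations are made up.

```python
def fit_classes(label_sets):
    """Collect every distinct label and fix a sorted column order."""
    return sorted({label for labels in label_sets for label in labels})


def multi_hot(labels, classes):
    """Return a 0/1 vector with a 1 in each column whose label is present."""
    present = set(labels)
    return [1 if c in present else 0 for c in classes]


# Hypothetical annotations: one tuple of labels per image.
annotations = [
    ("red", "dress"),
    ("blue", "jeans"),
    ("red", "black", "dress"),  # a dress that is both red and black
]

classes = fit_classes(annotations)
encoded = [multi_hot(labels, classes) for labels in annotations]

print(classes)
for labels, vector in zip(annotations, encoded):
    print(labels, "->", vector)
```

A red and black dress simply gets three active entries instead of two, so no extra combined class is needed.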
Saptarshi Dutta Gupta

January 28, 2019 at 5:12 am

Is there any tutorial on multiclass image classification?
- Adrian Rosebrock
  
  January 28, 2019 at 5:51 pm
  
  This tutorial covers multi-class classification. Perhaps you mean multi-output classification? Could you be a bit more specific regarding what you are trying to accomplish?
David

January 28, 2019 at 10:53 am

Maybe a silly question:

If you have the computational resources, why not just train two models on the same dataset? One for clothing and one for color. Wouldn’t that have a better chance of requiring a sample of every combination of clothing/color?

Besides computation time, is there any real disadvantage to having two models?
- Adrian Rosebrock
  
  January 28, 2019 at 5:49 pm
  
  It really is dependent on your problem and whether the two may correlate. I think you’re focusing too strictly on the example of clothing classification. That may not be the case for other datasets. If you have the computational resources it may make sense, but you would need to run experiments to validate that. Let your empirical results guide you.
Exor

February 2, 2019 at 7:14 am

Any chance we can draw a box on the predicted image? For example, in case we have 2 items in the same image?
- Adrian Rosebrock
  
  February 5, 2019 at 9:36 am
  
  This tutorial covers image classification — you are trying to perform object detection.
Walid

February 19, 2019 at 2:50 pm

Amazing tutorial as usual
Thanks a million

Walid
- Adrian Rosebrock
  
  February 20, 2019 at 12:09 pm
  
  Thanks Walid!
Chee Siong

February 26, 2019 at 8:33 am

For the above example, may I know which bundle of your book covers it?
- Adrian Rosebrock
  
  February 27, 2019 at 5:38 am
  
  You can see which topics are covered in which bundles on the official Deep Learning for Computer Vision with Python page. Let me know if you have any questions on it.
anju

March 6, 2019 at 6:22 am

Hello sir,
Can I use this classification approach for shape detection through image classification?
- Adrian Rosebrock
  
  March 8, 2019 at 5:40 am
  
  Potentially yes but without seeing what your image dataset is it’s hard to know if that would work. Give it a try though!
  - anju
    
    March 16, 2019 at 5:18 am
    
    Can it work in real time?
    - Adrian Rosebrock
      
      March 19, 2019 at 10:16 am
      
      Yes, the model covered here today can run in real-time, even on a CPU.
      - anju
        
        March 23, 2019 at 3:37 am
        
        Thank you. It works.
      - Adrian Rosebrock
        
        March 27, 2019 at 9:21 am
        
        Awesome, I’m glad it worked for you!
      - anju
        
        March 25, 2019 at 5:22 am
        
        Thank you! It works 80%.
Sandra

March 16, 2019 at 10:53 pm

How to rectify this error on running train.py?

train.py: error: the following arguments are required: -d/--dataset, -m/--model, -l/--labelbin
- Adrian Rosebrock
  
  March 19, 2019 at 10:11 am
  
  You need to supply the command line arguments to the script.
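For readers unfamiliar with command line arguments, the sketch below rebuilds a parser with the three flags named in the error message and feeds it a sample argument list. The flag names come from the error above. The help text and the example paths are assumptions, not the tutorial's exact values.

```python
import argparse


def build_parser():
    # Flag names are taken from the error message above; help text is assumed.
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("-d", "--dataset", required=True,
                        help="path to the input dataset directory")
    parser.add_argument("-m", "--model", required=True,
                        help="path where the trained model is written")
    parser.add_argument("-l", "--labelbin", required=True,
                        help="path where the label binarizer is written")
    return parser


# Equivalent to running (placeholder paths):
#   python train.py --dataset my_dataset --model my.model --labelbin labels.pickle
args = vars(build_parser().parse_args(
    ["--dataset", "my_dataset", "--model", "my.model", "--labelbin", "labels.pickle"]
))
print(args)
```

Running the script with no arguments reproduces the "arguments are required" error, because all three flags are marked as required.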
feng

March 18, 2019 at 8:06 am

Hi Adrian, this was an amazing tutorial and a great project. I want to know how the dataset’s annotations.pickle file is obtained. Can you talk about the specific process? Thank you in advance, I look forward to your reply.
- Adrian Rosebrock
  
  March 19, 2019 at 10:00 am
  
  There isn’t a file named “annotations.pickle” in this project so I’m not sure what you’re referring to. Could you clarify?
wassan

March 22, 2019 at 4:25 pm

Hi Adrian ,

It is simply a great tutorial. I just want to ask about the accuracy metric: is it binary accuracy or categorical accuracy?

And is it better to consider the F1 score for a multi-label classification task?

Thanks,
anju

March 23, 2019 at 3:36 am

Hello sir,
can we use a tabular dataset (CSV files) for multi-label classification in real time?

Can you please suggest?
- Adrian Rosebrock
  
  March 27, 2019 at 9:21 am
  
  Yes, that is absolutely possible. However without knowing what your CSV file contains or what the goal of the project is I can’t provide you with any other suggestions.
Aarju Gupta

April 12, 2019 at 2:40 am

hello Adrian

If I want to make a simple cat/dog multi-label classifier, changing only softmax to sigmoid, with 2 dense layers and a binary loss function, will the classifier be able to predict both cat and dog at the same time if I give it an input image where both a cat and a dog are present?
- Adrian Rosebrock
  
  April 12, 2019 at 11:19 am
  
  In that case I would actually recommend object detection instead.
Lamin L Janneh

April 12, 2019 at 10:42 am

Hi Adrian,
I have a project which is 75% related to your blog post. I have learned a lot and currently I’m trying to implement a full-fledged project.
- Adrian Rosebrock
  
  April 12, 2019 at 11:15 am
  
  Thanks Lamin, I’m glad the tutorial was able to help you!
kartik

May 1, 2019 at 4:57 am

Adrian the man who made computer vision a piece of cake.
- Adrian Rosebrock
  
  May 1, 2019 at 11:20 am
  
  Thank you for the kind words, kartik 🙂
Bilal Ahmed

May 9, 2019 at 5:38 pm

I am new to deep learning and I am working on a multi-class classification problem. I have a dataset with 1501 classes and every class has 10 images, but these images are not divided into subdirectories. How can I divide these images into their 1501 directories? I can differentiate the images by image name, since each class has a special name to start with. Kindly help me. I am so confused because it’s tedious work to do manually.
- Adrian Rosebrock
  
  May 15, 2019 at 3:16 pm
  
  If you are new to deep learning I would recommend you first read Deep Learning for Computer Vision with Python. That book will teach you how to build your own datasets and train deep neural networks on your own data.
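For the specific reorganization asked about above, a short script can do the sorting. The sketch below assumes the class name is the part of each filename before the first underscore, for example cat_001.jpg. That naming scheme is an assumption, so adjust the separator or the split logic to match your own files. The function name is hypothetical.

```python
import shutil
from pathlib import Path


def sort_into_class_dirs(src_dir, dst_dir, sep="_"):
    """Move each file into dst_dir/<prefix>/, where <prefix> is the part of
    the filename before the first separator (an assumed naming scheme)."""
    src, dst = Path(src_dir), Path(dst_dir)
    moved = 0
    # sorted() materializes the listing before any file is moved.
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        class_name = path.stem.split(sep, 1)[0]
        target_dir = dst / class_name
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(target_dir / path.name))
        moved += 1
    return moved


# Example call (placeholder directory names, not executed here):
#   sort_into_class_dirs("all_images", "dataset")
```

It is worth trying this on a copy of the data first, since the files are moved rather than copied.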
amit raj

June 22, 2019 at 6:54 am

Hi Adrian,
Thanks for your great tutorial. I am trying to solve a multi-label problem too.
My training labels are in separate CSV files.
The CSV files contain the filename in the first column and a varying number of labels (alphanumeric) separated by commas in the second column.

I also have a separate CSV file where all labels (alphanumeric) are listed in the first column and their real-world meanings are given in the second column.
There are 5529 labels in total.
In the final result, I can have up to 100 labels per image.
I don’t know if I can use one-hot encoding in this case, since it is not exactly categorical like the example you provide. Can you please guide me? I am a newbie.
- Adrian Rosebrock
  
  June 26, 2019 at 1:40 pm
  
  Have you tried computing an embedding instead? Keep in mind that PyImageSearch is predominately a computer vision blog and I don’t do much work on non-CV problems. Good luck!
- Scott Quadrelli
  
  April 15, 2020 at 7:38 pm
  
  Hi Amit,
  
  I thought I would comment for anyone else who reads this comment.
  
  The way to solve your problem is with ‘datagen.flow_from_dataframe’ from the Keras preprocessing class. If you set ‘class_mode’ to ‘categorical’, this will encode the labels for you.
  
  Hope this helps anyone with a similar problem in the future.
  - Adrian Rosebrock
    
    April 16, 2020 at 7:47 am
    
    Thanks for sharing, Scott!
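Whichever encoder is used, the labels first need to be read into one list per image. Here is a minimal sketch using only the standard library, assuming the CSV layout Amit describes and assuming the second column is quoted so the commas between labels are not treated as column separators. The sample rows and label codes are made up. As far as I understand it, a column of label lists like this is also the form flow_from_dataframe expects for multi-label data, but check the Keras documentation for your version.

```python
import csv
import io

# Made-up sample in the layout described above; the second column is quoted
# so that the commas between labels are not read as column separators.
SAMPLE_CSV = '''filename,labels
img_001.jpg,"a12,b7"
img_002.jpg,"b7"
img_003.jpg,"a12,c3,d44"
'''


def read_label_rows(handle):
    """Return a list of (filename, [labels]) pairs from a two-column CSV."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    rows = []
    for filename, label_field in reader:
        labels = [part.strip() for part in label_field.split(",") if part.strip()]
        rows.append((filename, labels))
    return rows


rows = read_label_rows(io.StringIO(SAMPLE_CSV))
for filename, labels in rows:
    print(filename, labels)
```

With a real file, replace the io.StringIO object with open(path, newline="").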
ali

July 16, 2019 at 4:29 am

Hi Adrian
I’m using Google Colab to run your project, but I’m having trouble with it.
The problem is that to launch the program we must pass the input and output arguments with argparse, but in Colab we can’t do this. In short, I can’t run the program with Colab.
Could you help me fix this problem?
- Adrian Rosebrock
  
  July 25, 2019 at 10:01 am
  
  Follow this guide which will teach you how to hardcode the “args” dictionary instead of using command line arguments.
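As a rough idea of what hardcoding looks like, the sketch below replaces the parsed arguments with a plain dictionary in a notebook cell. The keys mirror the flags a reader reported for train.py earlier in this thread (dataset, model, labelbin). The values are placeholders, not the tutorial's actual paths.

```python
# Placeholder paths: replace them with locations in your own Colab session.
args = {
    "dataset": "dataset",
    "model": "my.model",
    "labelbin": "labels.pickle",
}

# The rest of the script can keep reading values the same way it would
# after vars(parser.parse_args()), for example:
dataset_path = args["dataset"]
print(dataset_path)
```

Because the script only ever indexes into args, swapping the parser for a dictionary leaves the remaining code unchanged.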
Steffan

August 28, 2019 at 4:43 am

Can I also get some direction on how to add Texture/Appearance to this, or how I can create another CNN classifier for texture? I have tried some approaches but the results are very bad. I think I have to do some type of preprocessing before passing the images into the CNN.
- Adrian Rosebrock
  
  September 5, 2019 at 10:43 am
  
  CNNs actually naturally learn texture. If you need help training your own custom CNNs I would recommend reading Deep Learning for Computer Vision with Python.
lhee

September 24, 2019 at 10:05 pm

Great….
- Adrian Rosebrock
  
  September 25, 2019 at 10:32 am
  
  I’m glad you liked it!
Anubhav

September 30, 2019 at 9:39 am

Hi Adrian ,

First of all, thanks a lot for the amazing tutorials. I am learning machine learning just for fun, and your tutorials have helped me a lot so far, until I stumbled upon these multi-label and multi-class models, which are still confusing me. I see that you tried to solve the same problem using two different models.

But is there any advantage to using one model over the other?

The multi-label model confuses me more because I still don’t understand why the sigmoid function and binary loss are used in multi-label classification.

In which cases can only one of the models be used?

What is the difference between classes and labels in your tutorials?
- Adrian Rosebrock
  
  October 3, 2019 at 12:33 pm
  
  Hey Anubhav, it’s great that you are studying machine learning for fun. I would honestly recommend you read through Deep Learning for Computer Vision with Python. That book will teach you the fundamentals and help remove some of the confusion that is surrounding you now.
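On the question of why a binary loss is paired with multi-label outputs: each label is treated as its own yes/no decision, so the loss is a binary cross-entropy averaged over the labels. Here is a small sketch of that arithmetic in plain Python. The label order, targets, and predicted probabilities are made up, and this is an illustration of the formula rather than the Keras implementation.

```python
import math


def binary_cross_entropy(targets, probs, eps=1e-7):
    """Mean binary cross-entropy over the labels of one sample."""
    total = 0.0
    for y, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(targets)


# Made-up example with label order ["black", "dress", "jeans", "red"].
targets = [0, 1, 0, 1]            # ground truth: a red dress
good = [0.05, 0.95, 0.02, 0.90]   # confident and correct on every label
bad = [0.05, 0.95, 0.02, 0.10]    # misses "red" only

loss_good = binary_cross_entropy(targets, good)
loss_bad = binary_cross_entropy(targets, bad)
print(f"good={loss_good:.4f} bad={loss_bad:.4f}")
```

Missing a single label raises the loss without penalizing the labels that were predicted correctly, which is the behavior you want when several labels can be true at once.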
mehrdad azizi

December 22, 2019 at 7:22 am

Hi,
I tried to train with other backbones (EfficientNet and ResNet50), but the accuracy is stuck at ~0.82.
Do you have any idea what the problem is?
Also, if I want to change the data generator so that it does not load everything at once, what should I do?
- Adrian Rosebrock
  
  December 26, 2019 at 9:58 am
  
  I would suggest reading Deep Learning for Computer Vision with Python where I include my tips, suggestions, and best practices to improve your model accuracy.
Beny Nugraha

January 14, 2020 at 11:05 am

Hello,

First of all, thank you for the guide, this helps me with one of my projects. I have one question though: if I have one new category, let’s say “green shoe”, but I don’t want to include it in the training dataset. I only want to include it in the testing dataset, and I want the model to classify the “green shoe” as an “unknown” category. What is your suggestion for doing such a thing?
Thank you in advance.
- Adrian Rosebrock
  
  January 16, 2020 at 10:30 am
  
  I would suggest creating an “unknown” class that contains images/classes that are different than the rest of your classes. Alternatively you could filter on the predicted probability of the classes and filter out low probability ones.
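The second suggestion, filtering on predicted probability, can be sketched in a few lines. The function name, the class list, the probabilities, and the 0.5 threshold are all assumptions for illustration. A suitable threshold would need to be chosen from validation data.

```python
def labels_above_threshold(probs, classes, threshold=0.5):
    """Keep labels whose probability clears the threshold; otherwise
    report "unknown". The 0.5 default is an arbitrary starting point."""
    kept = [(c, p) for c, p in zip(classes, probs) if p >= threshold]
    if not kept:
        return ["unknown"]
    # Most confident labels first.
    return [c for c, _ in sorted(kept, key=lambda cp: cp[1], reverse=True)]


classes = ["black", "blue", "dress", "jeans", "red"]
confident = labels_above_threshold([0.02, 0.01, 0.97, 0.03, 0.91], classes)
uncertain = labels_above_threshold([0.20, 0.31, 0.12, 0.28, 0.09], classes)
print(confident, uncertain)
```

Keep in mind that a network can still be confidently wrong on categories it has never seen, so thresholding alone is not guaranteed to catch every unseen item.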
Kishore

January 16, 2020 at 2:43 pm

Can I run your code in a Jupyter notebook? And can I test any jeans image downloaded to my Downloads folder?
- Adrian Rosebrock
  
  January 23, 2020 at 9:37 am
  
  Yes to both.
Joshua

February 15, 2020 at 5:02 pm

Hi Adrian,

I am worried that this tutorial might be misleading to people who don’t have their bearings yet with CV and ML. To me, it seems like you haven’t actually trained a multi-label classifier in the way that the asker was requesting.

Because you have condensed the labels from both the “Colour” category and the “Item” category into a single one-hot vector, the classifier has no semantic distinction between these two categories of labels. MultiLabelBinarizer is a tool better suited for labels within a singular category. (See the example in the docs: labels are within the “movie genre” category, but it would never combine “movie genre” with a separate category entirely.)

I think this is the reason the classifier overfit to the training data. You have effectively hacked together a multilabel classifier that isn’t doing multilabel classification! You are using tools that aren’t suited for the problem at hand. A true multilabel classifier would be able to predict black dresses **without using images of black dresses**. It would have a separate output node for colour (to find patterns shared between items of all item types) and for items (to find patterns shared between items of all colours).

This is exactly what the asker did when they say they trained separate classifiers. So, what I believe the asker was really asking about was if there was a way to train a multilabel classifier *while still maintaining distinctions between categories*. For example, there could be specific architectures that take in multiple separate sets of labels to try and find correlations between items on a category by category basis.

I hope this is not too critical — I look up to your tutorials and I really value how you have made CV more accessible! I just felt… disappointed to see a lower quality of work than what I’ve come to expect from you. 🙁

Hope to hear back!
- Adrian Rosebrock
  
  February 20, 2020 at 9:35 am
  
  Hey Joshua — while I appreciate the comment, I think you should have taken the time to read the multi-output and multi-losses post which does exactly what you are referring to.
  
  Secondly, it’s wonderful to have you as a long-term reader. It truly seems like you are getting a lot of value out of the PyImageSearch blog for free. If you truly are “expecting” high quality work from me you should support the PyImageSearch blog and purchase one of my books/courses.
  
  Purchases of my books/courses help fund PyImageSearch as a free resource for you and others to learn from.
  
  I hope to one day see you as a customer.
Petr

February 25, 2020 at 6:13 am

Great article!
Could you tell me how to get the coordinates of the bounding box of the predicted image so that you can circle the found article of clothing?
- Adrian Rosebrock
  
  February 27, 2020 at 9:14 am
  
  You need to look into object detection methods.
  - Petr
    
    March 1, 2020 at 9:39 am
    
    Thank you, but this is not a good option, since Keras is not Caffe. And how can I extract coordinates for a bounding box from the current model?
    - Adrian Rosebrock
      
      March 4, 2020 at 1:36 pm
      
      There are Keras object detector implementations for popular object detector frameworks. If you’re interested, they are covered inside Deep Learning for Computer Vision with Python.
AbbyH

April 15, 2020 at 1:29 pm

Hi Adrian,

I am using the dataset : https://www.kaggle.com/kmader/food41
It has 101 categories (folders) of food images, and each folder has 1000 images. Doing classification on this is easy, but I have to attach calorie values to it as well. So along with predicting the food, for example pizza, I have to say how many calories it has too.

I can predict average calorie per serving size and just pull the details from an excel file after I do the classification. But my requirement is to do predictions on calories too, not just pull a value from a hard coded file.

After I saw your article, I got this idea to divide food images into 2 sub-categories:
1. Small_portion, like a slice of pizza or apple pie (apple_pie_277 kcal)
2. Big_portion, like a whole pie or pizza (apple_pie_345 kcal).
Do you think similar to your code this can work?

The only problem I have is to manually create sub-categories folders and also, images in the already existing folders, for small & big portions, are not evenly distributed.

Please let me know if you have any suggestions for this type of model.
- Adrian Rosebrock
  
  April 16, 2020 at 7:49 am
  
  This sounds like a reasonable approach. I would suggest you give it a try and evaluate the accuracy. Run experiments, gather results, and refine your process. Let your empirical results guide you.
  
  Secondly, take a look into creating your own custom data generators for Keras/TensorFlow — that will help with the dataset organization process.
  - AbbyH
    
    April 16, 2020 at 11:45 am
    
    Thanks a lot Adrian, for a quick reply.
    Also, this article really saved me time, because it seems everyone else just uses the ‘MNIST’ dataset to explain the same things over and over.
    It’s hard to find original content for any real-world scenario.
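For readers wondering what a custom data generator might look like, here is a framework-agnostic sketch that loads images on demand, one batch at a time, instead of reading the whole dataset into memory. The function name and the load_fn parameter are hypothetical. For Keras/TensorFlow you would typically also convert each batch to NumPy arrays, which is left out here.

```python
import random


def batch_generator(samples, batch_size, load_fn, shuffle=True, seed=None):
    """Yield (inputs, labels) batches forever, loading each image on demand.

    samples: list of (image_path, label_vector) pairs
    load_fn: callable that turns a path into image data (assumed to be
             supplied by the caller, e.g. a wrapper around an image reader)
    """
    rng = random.Random(seed)
    order = list(range(len(samples)))
    while True:
        if shuffle:
            rng.shuffle(order)  # new order at the start of every pass
        for start in range(0, len(order), batch_size):
            chunk = [samples[i] for i in order[start:start + batch_size]]
            inputs = [load_fn(path) for path, _ in chunk]
            labels = [label for _, label in chunk]
            yield inputs, labels


# Example (placeholder loader that just returns the path):
#   gen = batch_generator(samples, batch_size=32, load_fn=lambda p: p)
#   inputs, labels = next(gen)
```

Because the sample list holds only paths and label vectors, the images can live in any folder layout, which may help when the existing directories are unevenly split.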

Comment section

Hey, Adrian Rosebrock here, author and creator of PyImageSearch. While I love hearing from readers, a couple years ago I made the tough decision to no longer offer 1:1 help over blog post comments.

At the time I was receiving 200+ emails per day and another 100+ blog post comments. I simply did not have the time to moderate and respond to them all, and the sheer volume of requests was taking a toll on me.

Instead, my goal is to do the most good for the computer vision, deep learning, and OpenCV community at large by focusing my time on authoring high-quality blog posts, tutorials, and books/courses.

If you need help learning computer vision and deep learning, I suggest you refer to my full catalog of books and courses — they have helped tens of thousands of developers, students, and researchers just like yourself learn Computer Vision, Deep Learning, and OpenCV.

Click here to browse my full catalog.


228 responses to: Multi-label classification with Keras
