Keras: Multiple Inputs and Mixed Data

In this tutorial, you will learn how to use Keras for multi-input and mixed data.

You will learn how to define a Keras architecture capable of accepting multiple inputs, including numerical, categorical, and image data. We’ll then train a single end-to-end network on this mixed data.

Today is the final installment in our three-part series on Keras and regression:

  1. Basic regression with Keras
  2. Training a Keras CNN for regression prediction
  3. Multiple inputs and mixed data with Keras (today’s post)

In this series of posts, we’ve explored regression prediction in the context of house price prediction.

The house price dataset we are using includes not only numerical and categorical data, but image data as well. We call this combination of data types mixed data: our model must accept multiple inputs that are not of the same type and compute a prediction from them.

In the remainder of this tutorial you will learn how to:

  1. Define a Keras model capable of accepting multiple inputs, including numerical, categorical, and image data, all at the same time.
  2. Train an end-to-end Keras model on the mixed data inputs.
  3. Evaluate our model using the multiple inputs.

To learn more about multiple inputs and mixed data with Keras, just keep reading!

Keras: Multiple Inputs and Mixed Data

In the first part of this tutorial, we will briefly review both the concept of mixed data and how Keras can accept multiple inputs.

From there we’ll review our house prices dataset and the directory structure for this project.

Next, I’ll show you how to:

  1. Load the numerical, categorical, and image data from disk.
  2. Pre-process the data so we can train a network on it.
  3. Prepare the mixed data so it can be applied to a multi-input Keras network.

Once our data has been prepared you’ll learn how to define and train a multi-input Keras model that accepts multiple types of input data in a single end-to-end network.

Finally, we’ll evaluate our multi-input and mixed data model on our testing set and compare the results to our previous posts in this series.

What is mixed data?

Figure 1: With Keras’ flexible deep learning framework, it is possible to define a multi-input model that includes both CNN and MLP branches to handle mixed data.

In machine learning, mixed data refers to the concept of having multiple types of independent data.

For example, let’s suppose we are machine learning engineers working at a hospital to develop a system capable of classifying the health of a patient.

We would have multiple types of input data for a given patient, including:

  1. Numeric/continuous values, such as age, heart rate, blood pressure
  2. Categorical values, including gender and ethnicity
  3. Image data, such as MRI scans, X-rays, etc.

All of these values constitute different data types; however, our machine learning model must be able to ingest this “mixed data” and make (accurate) predictions on it.

You will see the term “mixed data” in machine learning literature when working with multiple data modalities.

Developing machine learning systems capable of handling mixed data can be extremely challenging as each data type may require separate preprocessing steps, including scaling, normalization, and feature engineering.

Working with mixed data is still very much an open area of research and is often heavily dependent on the specific task/end goal.

We’ll be working with mixed data in today’s tutorial to help you get a feel for some of the challenges associated with it.

How can Keras accept multiple inputs?

Figure 2: As opposed to its Sequential API, Keras’ functional API allows for much more complex models. In this blog post we use the functional API to support our goal of creating a model with multiple inputs and mixed data for house price prediction.

Keras is able to handle multiple inputs (and even multiple outputs) via its functional API.

The functional API, as opposed to the sequential API (which you almost certainly have used before via the Sequential  class), can be used to define much more complex models that are non-sequential, including:

  • Multi-input models
  • Multi-output models
  • Models that are both multiple input and multiple output
  • Directed acyclic graphs
  • Models with shared layers

For example, we may define a simple sequential neural network as:

This network is a simple feedforward neural network with 10 inputs, a first hidden layer with 8 nodes, a second hidden layer with 4 nodes, and a final output layer used for regression.
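A minimal sketch of the 10-8-4-1 network described above. The `tensorflow.keras` import path and the ReLU activations in the hidden layers are assumptions; the text only fixes the layer sizes and the regression output.

```python
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential

# 10 inputs -> 8 hidden nodes -> 4 hidden nodes -> 1 linear output
model = Sequential([
    Input(shape=(10,)),
    Dense(8, activation="relu"),    # first hidden layer (activation assumed)
    Dense(4, activation="relu"),    # second hidden layer (activation assumed)
    Dense(1, activation="linear"),  # regression output
])
```

Each layer feeds only the next one, which is exactly the shape of model the Sequential class is limited to.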

We can define the same neural network using the functional API:

Notice how we are no longer relying on the Sequential  class.
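As a sketch, the same 10-8-4-1 network might be written with the functional API as follows. As before, the import path and hidden-layer activations are assumptions, and the variable names are illustrative.

```python
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

# each layer is called on the tensor produced by the previous one
inputs = Input(shape=(10,))
x = Dense(8, activation="relu")(inputs)
x = Dense(4, activation="relu")(x)
outputs = Dense(1, activation="linear")(x)

# the model is defined by naming its input and output tensors
functional_model = Model(inputs=inputs, outputs=outputs)
```

Because the graph is described by tensors rather than by an ordered list of layers, nothing stops us from adding a second `Input` and merging it in later.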

To see the power of Keras’ functional API consider the following code where we create a model that accepts multiple inputs:

Here you can see we are defining two inputs to our Keras neural network:

  1. inputA : 32-dim
  2. inputB : 128-dim

Lines 21-23 define a simple 32-8-4  network using Keras’ functional API.

Similarly, Lines 26-29 define a 128-64-32-4  network.

We then combine the outputs of x and y on Line 32. The outputs of x and y are both 4-dim, so once we concatenate them we have an 8-dim vector.

We then apply two more fully-connected layers on Lines 36 and 37. The first layer has 2 nodes followed by a ReLU activation while the second layer has only a single node with a linear activation (i.e., our regression prediction).

The final step to building the multi-input model is to define a Model  object which:

  1. Accepts our two inputs
  2. Defines the outputs  as the final set of FC layers (i.e., z ).
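The walkthrough above can be sketched as follows. This is a reconstruction from the description, not the original listing: line numbers will not match, and the ReLU activations inside the two branches are an assumption. Here `x` and `y` are kept as plain tensors rather than wrapped in intermediate models.

```python
import numpy as np
from tensorflow.keras.layers import Dense, Input, concatenate
from tensorflow.keras.models import Model

# two inputs of different dimensionality
inputA = Input(shape=(32,))
inputB = Input(shape=(128,))

# first branch: 32 -> 8 -> 4
x = Dense(8, activation="relu")(inputA)
x = Dense(4, activation="relu")(x)

# second branch: 128 -> 64 -> 32 -> 4
y = Dense(64, activation="relu")(inputB)
y = Dense(32, activation="relu")(y)
y = Dense(4, activation="relu")(y)

# merge the two 4-dim outputs into a single 8-dim vector
combined = concatenate([x, y])

# final FC layers: 2 nodes with ReLU, then 1 node with linear activation
z = Dense(2, activation="relu")(combined)
z = Dense(1, activation="linear")(z)

# the model accepts both inputs and produces a single regression value
model = Model(inputs=[inputA, inputB], outputs=z)

# usage: pass one array per input, in the same order as `inputs`
preds = model.predict([np.zeros((3, 32)), np.zeros((3, 128))], verbose=0)
```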

If you were to use Keras to visualize the model architecture it would look like the following:

Figure 3: This model has two input branches that ultimately merge and produce one output. The Keras functional API allows for this type of architecture and others you can dream up.

Notice how our model has two distinct branches.

The first branch accepts our 128-d input while the second branch accepts the 32-d input. These branches operate independently of each other until they are concatenated. From there a single value is output from the network.

In the remainder of this tutorial, you will learn how to create multiple input networks using Keras.

The House Prices dataset

Figure 4: The House Prices dataset consists of both numerical/categorical data and image data. Using Keras, we’ll build a model supporting the multiple inputs and mixed data types. The result will be a Keras regression model which predicts the price/value of houses.

In this series of posts, we have been using the House Prices dataset from Ahmed and Moustafa’s 2016 paper, House price estimation from visual and textual features.

This dataset includes both numerical/categorical data along with image data for each of the 535 example houses in the dataset.

The numerical and categorical attributes include:

  1. Number of bedrooms
  2. Number of bathrooms
  3. Area (i.e., square footage)
  4. Zip code

A total of four images are provided for each house as well:

  1. Bedroom
  2. Bathroom
  3. Kitchen
  4. Frontal view of the house

In the first post in this series, you learned how to train a Keras regression network on the numerical and categorical data.

Then, last week, you learned how to perform regression with a Keras CNN.

Today we are going to work with multiple inputs and mixed data with Keras.

We are going to feed both the numerical/categorical data and our image data to the network.

Two branches of a network will be defined to handle each type of data. The branches will then be combined at the end to obtain our final house price prediction.

In this manner, we will be able to leverage Keras to handle both multiple inputs and mixed data.

Obtaining the House Prices dataset

To grab the source code for today’s post, use the “Downloads” section. Once you have the zip file, navigate to where you downloaded it, and extract it:

And from there you can download the House Prices dataset via:

The House Prices dataset should now be in the keras-multi-input  directory which is the directory we are using for this project.

Project structure

Let’s take a look at how today’s project is organized:

The Houses-dataset folder contains our House Prices dataset that we’re working with for this series. When we’re ready to run the  script, you’ll just need to provide a path as a command line argument to the dataset (I’ll show you exactly how this is done in the results section).

Today we’ll be reviewing three Python scripts:

  • pyimagesearch/ : Handles loading and preprocessing our numerical/categorical data as well as our image data. We previously reviewed this script over the past two weeks, but I’ll be walking you through it again today.
  • pyimagesearch/ : Contains our Multi-layer Perceptron (MLP) and Convolutional Neural Network (CNN). These components are the input branches to our multi-input, mixed data model. We reviewed this script last week and we’ll briefly review it today as well.
  • : Our training script will use the pyimagesearch  module convenience functions to load + split the data and concatenate the two branches to our network + add the head. It will then train and evaluate the model.

Loading the numerical and categorical data

Figure 5: We use pandas, a Python package, to read CSV housing data.

We covered how to load the numerical and categorical data for the house prices dataset in our Keras regression post but as a matter of completeness, we will review the code (in less detail) here today.

Be sure to refer to the previous post if you want a detailed walkthrough of the code.

Open up the  file and insert the following code:

Our imports are handled on Lines 2-8.

From there we define the load_house_attributes  function on Lines 10-33. This function reads the numerical/categorical data from the House Prices dataset in the form of a CSV file via Pandas’ pd.read_csv  on Lines 13 and 14.

The data is filtered to accommodate an imbalance. Some zip codes are represented by only 1 or 2 houses, so we drop (Lines 23-30) any records from zip codes with fewer than 25 houses. The result is a more accurate model later on.
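To make the filtering step concrete, here is a small pandas sketch. The helper name `drop_rare_zipcodes`, the `zipcode` column name, and the toy data are all hypothetical; only the threshold of 25 comes from the text.

```python
import pandas as pd

def drop_rare_zipcodes(df, min_count=25):
    """Keep only rows whose zip code appears at least `min_count` times."""
    counts = df["zipcode"].value_counts()
    keep = counts[counts >= min_count].index
    # the original row index is left untouched on purpose
    return df[df["zipcode"].isin(keep)]

# toy data: one well-represented zip code and one with only two houses
df = pd.DataFrame({
    "bedrooms": [3] * 30 + [2] * 2,
    "zipcode": [85255] * 30 + [91901] * 2,
})
filtered = drop_rare_zipcodes(df)
```

The sketch deliberately does not reset the index, so each remaining row can still be matched to other per-house data (such as photos) by its original position.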

Now let’s define the process_house_attributes  function:

This function applies min-max scaling to the continuous features via scikit-learn’s MinMaxScaler  (Lines 41-43).

Then, one-hot encoding for the categorical features is computed, this time via scikit-learn’s LabelBinarizer  (Lines 47-49).

The continuous and categorical features are then concatenated and returned (Lines 53-57).
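A sketch of what such a preprocessing function might look like. The column names, the order of concatenation, and fitting the binarizer on the zip codes of the full dataframe (so both splits share the same one-hot columns) are assumptions; the scaler is fit on the training split only.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelBinarizer, MinMaxScaler

def process_house_attributes(df, train, test):
    continuous = ["bedrooms", "bathrooms", "area"]  # assumed column names

    # min-max scale the continuous features using training statistics only
    scaler = MinMaxScaler()
    train_cont = scaler.fit_transform(train[continuous])
    test_cont = scaler.transform(test[continuous])

    # one-hot encode the zip code; fit on all zip codes so the columns line up
    binarizer = LabelBinarizer().fit(df["zipcode"])
    train_cat = binarizer.transform(train["zipcode"])
    test_cat = binarizer.transform(test["zipcode"])

    # one feature vector per house: categorical columns, then continuous
    return (np.hstack([train_cat, train_cont]),
            np.hstack([test_cat, test_cont]))

# toy data with three zip codes
df = pd.DataFrame({
    "bedrooms": [2, 3, 4, 3, 5, 2],
    "bathrooms": [1, 2, 3, 2, 4, 1],
    "area": [800, 1200, 2000, 1500, 3000, 900],
    "zipcode": [85255, 91901, 93446, 85255, 91901, 93446],
})
train, test = df.iloc[:4], df.iloc[4:]
train_x, test_x = process_house_attributes(df, train, test)
```

With three zip codes and three continuous columns, each house becomes a 6-dim vector. Test rows can fall outside [0, 1] when their values exceed the training range.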

Be sure to refer to the previous posts in this series for more details on the two functions we reviewed in this section:

  1. Regression with Keras
  2. Keras, Regression, and CNNs

Loading the image dataset

Figure 6: One branch of our model accepts a single image, a montage of four images from the home. Combining the montage with the numerical/categorical data fed to another branch, our model uses regression to predict the value of the home with the Keras framework.

The next step is to define a helper function to load our input images. Again, open up the  file and insert the following code:

The load_house_images  function has three goals:

  1. Load all photos from the House Prices dataset. Recall that we have four photos per house (Figure 6).
  2. Generate a single montage image from the four photos. The montage will always be arranged as you see in the figure.
  3. Append all of these home montages to a list/array and return to the calling function.

Beginning on Line 59, we define the function which accepts a Pandas dataframe and dataset inputPath .

From there, we proceed to:

  • Initialize the images  list (Line 61). We’ll be populating this list with all of the montage images that we build.
  • Loop over houses in our data frame (Line 64). Inside the loop, we:
    • Grab the paths to the four photos for the current house (Lines 67 and 68).

Let’s keep making progress in the loop:

The code so far has accomplished the first goal discussed above (grabbing the four house images per house). Let’s wrap up the  load_house_images  function:

  • Still inside the loop, we:
    • Perform initializations (Lines 72 and 73). Our inputImages  will be in list form containing four photos of each record. Our outputImage  will be the montage of the photos (like Figure 6).
    • Loop over 4 photos (Line 76):
      • Load, resize, and append each photo to inputImages  (Lines 79-81).
    • Create the tiling (a montage) for the four house images (Lines 87-90) with:
      • The bathroom image in the top-left.
      • The bedroom image in the top-right.
      • The frontal view in the bottom-right.
      • The kitchen in the bottom-left.
    • Append the tiling/montage outputImage  to images  (Line 94).
  • Jumping out of the loop, we return  all the  images  in the form of a NumPy array (Line 97).
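The tiling step can be sketched with NumPy alone. The helper name `tile_montage` and the 32x32 tile size are assumptions, and solid-color arrays stand in for photos that would normally be loaded and resized from disk.

```python
import numpy as np

def tile_montage(photos, size=32):
    """Tile four size x size x 3 photos, ordered bathroom, bedroom,
    frontal, kitchen, into one (2*size) x (2*size) x 3 image."""
    montage = np.zeros((size * 2, size * 2, 3), dtype="uint8")
    montage[0:size, 0:size] = photos[0]                  # bathroom, top-left
    montage[0:size, size:size * 2] = photos[1]           # bedroom, top-right
    montage[size:size * 2, size:size * 2] = photos[2]    # frontal, bottom-right
    montage[size:size * 2, 0:size] = photos[3]           # kitchen, bottom-left
    return montage

# stand-in "photos": four solid-color tiles with distinct pixel values
photos = [np.full((32, 32, 3), v, dtype="uint8") for v in (10, 20, 30, 40)]
montage = tile_montage(photos)

# one montage per house, stacked into a single array for the network
images = np.array([montage, montage])
```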

We’ll have as many images as there are records we’re training with (remember, we dropped a few of them in the load_house_attributes function).

Each of our tiled images  will look like Figure 6 (without the overlaid text of course). You can see the four photos therein have been arranged in a montage (I’ve used larger image dimensions so we can better visualize what the code is doing). Just as our numerical and categorical attributes represent the house, these four photos (tiled into a single image) will represent the visual aesthetics of the house.

If you need to review this process in further detail, be sure to refer to last week’s post.

Defining our Multi-layer Perceptron (MLP) and Convolutional Neural Network (CNN)

Figure 7: Our Keras multi-input + mixed data model has one branch that accepts the numerical/categorical data (left) and another branch that accepts image data in the form of a 4-photo montage (right).

As you’ve gathered thus far, we’ve had to massage our data carefully using multiple libraries: Pandas, scikit-learn, OpenCV, and NumPy.

We’ve organized and pre-processed the two modalities of our dataset at this point via :

  • Numeric and categorical data
  • Image data

The skills we’ve used in order to accomplish this have been developed through experience + practice, machine learning best practices, and behind the scenes of this blog post, a little bit of debugging. Please don’t overlook what we’ve discussed so far using our data massaging skills as it is key to the rest of our project’s success.

Let’s shift gears and discuss our multi-input and mixed data network that we’ll build with Keras’ functional API.

In order to build our multi-input network we will need two branches:

  • The first branch will be a simple Multi-layer Perceptron (MLP) designed to handle the categorical/numerical inputs.
  • The second branch will be a Convolutional Neural Network to operate over the image data.
  • These branches will then be concatenated together to form the final multi-input Keras model.

We’ll handle building the final concatenated multi-input model in the next section — our current task is to define the two branches.

Open up the  file and insert the following code:

Lines 2-11 handle our Keras imports. You’ll see each of the imported functions/classes going forward in this script.

Our categorical/numerical data will be processed by a simple Multi-layer Perceptron (MLP).

The MLP is defined by create_mlp  on Lines 13-24.

Discussed in detail in the first post in this series, the MLP relies on the Keras Sequential  API. Our MLP is quite simple having:

  • A fully connected ( Dense ) input layer with ReLU activation  (Line 16).
  • A fully-connected hidden layer, also with ReLU activation  (Line 17).
  • And finally, an optional regression output with linear activation (Lines 20 and 21).

While we used the regression output of the MLP in the first post, it will not be used in this multi-input, mixed data network. As you’ll soon see, we’ll be setting  regress=False  explicitly even though it is the default as well. Regression will actually be performed later on the head of the entire multi-input, mixed data network (the bottom of Figure 7).

The MLP branch is returned on Line 24.
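One way create_mlp could be written, as a sketch. The width of the first layer (8) is an assumption; the 4-node hidden layer and the optional linear output follow the description above.

```python
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential

def create_mlp(dim, regress=False):
    model = Sequential()
    model.add(Input(shape=(dim,)))
    model.add(Dense(8, activation="relu"))   # first FC layer (width assumed)
    model.add(Dense(4, activation="relu"))   # hidden FC layer

    # optionally append a single linear node for standalone regression
    if regress:
        model.add(Dense(1, activation="linear"))
    return model

branch = create_mlp(10, regress=False)      # 4-dim output, used as a branch
standalone = create_mlp(10, regress=True)   # 1-dim output, used on its own
```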

Referring back to Figure 7, we’ve now built the top-left branch of our network.

Let’s now define the top-right branch of our network, a CNN:

The create_cnn  function handles the image data and accepts five parameters:

  • width : The width of the input images in pixels.
  • height : How many pixels tall the input images are.
  • depth : The number of channels in our input images. For RGB color images, it is three.
  • filters : A tuple of progressively larger filters so that our network can learn more discriminative features.
  • regress : A boolean indicating whether or not a fully-connected linear activation layer will be appended to the CNN for regression purposes.

The inputShape  of our network is defined on Line 29. It assumes “channels last” ordering for the TensorFlow backend.

The Input to the model is defined via the inputShape (Line 33).

From there we begin looping over the filters and create a set of CONV => RELU => BN => POOL layers. Each iteration of the loop appends these layers. Be sure to check out Chapter 11 from the Starter Bundle of Deep Learning for Computer Vision with Python for more information on these layer types if you are unfamiliar.

Let’s finish building the CNN branch of our network:

We Flatten  the next layer (Line 49) and then add a fully-connected layer with BatchNormalization  and Dropout  (Lines 50-53).

Another fully-connected layer is applied to match the four nodes coming out of the multi-layer perceptron (Lines 57 and 58). Matching the number of nodes is not a requirement but it does help balance the branches.

On Lines 61 and 62, a check is made to see if the regression node should be appended; it is then added in accordingly. Again, we will not be conducting regression at the end of this branch either. Regression will be performed on the head of the multi-input, mixed data network (the very bottom of Figure 7).

Finally, the model is constructed from our inputs  and all the layers we’ve assembled together, x  (Line 65).

We can then  return  the CNN branch to the calling function (Line 68).
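Putting the description together, create_cnn might look like the sketch below. The default filter sizes, the 3x3 kernels, the 16-node FC layer, and the 0.5 dropout rate are assumptions; the layer ordering and the 4-node output follow the text.

```python
from tensorflow.keras.layers import (Activation, BatchNormalization, Conv2D,
                                     Dense, Dropout, Flatten, Input,
                                     MaxPooling2D)
from tensorflow.keras.models import Model

def create_cnn(width, height, depth, filters=(16, 32, 64), regress=False):
    # "channels last" ordering: (rows, cols, channels)
    inputs = Input(shape=(height, width, depth))
    x = inputs

    # one CONV => RELU => BN => POOL block per filter count
    for f in filters:
        x = Conv2D(f, (3, 3), padding="same")(x)
        x = Activation("relu")(x)
        x = BatchNormalization(axis=-1)(x)
        x = MaxPooling2D(pool_size=(2, 2))(x)

    # flatten, then FC => RELU => BN => DROPOUT
    x = Flatten()(x)
    x = Dense(16)(x)
    x = Activation("relu")(x)
    x = BatchNormalization(axis=-1)(x)
    x = Dropout(0.5)(x)

    # second FC layer sized to match the 4 nodes of the MLP branch
    x = Dense(4)(x)
    x = Activation("relu")(x)

    # optionally append a single linear node for standalone regression
    if regress:
        x = Dense(1, activation="linear")(x)
    return Model(inputs, x)

cnn = create_cnn(64, 64, 3, regress=False)
```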

Now that we’ve defined both branches of the multi-input Keras model, let’s learn how we can combine them!

Multiple inputs with Keras

We are now ready to build our final Keras model capable of handling both multiple inputs and mixed data. This is where the branches come together and ultimately where the “magic” happens. Training will also happen in this script.

Create a new file named , open it up, and insert the following code:

Our imports and command line arguments are handled first.

Notable imports include:

  • datasets : Our three convenience functions for loading/processing the CSV data and loading/pre-processing the house photos from the Houses Dataset.
  • models : Our MLP and CNN input branches which will serve as the two branches of our multi-input, mixed data model.
  • train_test_split : A scikit-learn function to construct our training/testing data splits.
  • concatenate : A Keras function that merges multiple tensors (here, the outputs of our two branches) into a single tensor.
  • argparse : Handles parsing command line arguments.

We have one command line argument to parse on Lines 15-18, --dataset , which is the path to where you downloaded the House Prices dataset.

Let’s load our numerical/categorical data and image data:

Here we’ve loaded the House Prices dataset as a Pandas dataframe (Lines 23 and 24).

Then we’ve loaded our images  and scaled them to the range [0, 1] (Lines 29-30).

Be sure to review the load_house_attributes  and load_house_images  functions above if you need a reminder on what these functions are doing under the hood.

Now that our data is loaded, we’re going to construct our training/testing splits, scale the prices, and process the house attributes:

Our training and testing splits are constructed on Lines 35 and 36. We’ve allocated 75% of our data for training and 25% of our data for testing.

From there, we find the maxPrice  from the training set (Line 41) and scale the training and testing data accordingly (Lines 42 and 43). Having the pricing data in the range [0, 1] leads to better training and convergence.
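A tiny numeric sketch of the price scaling, using made-up prices. The variable names are illustrative.

```python
import numpy as np

# made-up prices for illustration only
train_prices = np.array([250000.0, 500000.0, 1000000.0])
test_prices = np.array([400000.0, 1200000.0])

# the scaling constant comes from the training split only
max_price = train_prices.max()
train_y = train_prices / max_price
test_y = test_prices / max_price
```

Training targets land in [0, 1] by construction. A test house priced above the training maximum scales to a value slightly above 1, which is the accepted cost of keeping test information out of the preprocessing.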

Finally, we go ahead and process our house attributes by performing min-max scaling on continuous features and one-hot encoding on categorical features. The process_house_attributes  function handles these actions and concatenates the continuous and categorical features together, returning the results (Lines 48 and 49).

Ready for some magic?

Okay, I lied. There isn’t actually any “magic” going on in this next code block! But we will concatenate  the branches of our network and finish our multi-input Keras network:

Handling multiple inputs with Keras is quite easy when you’ve organized your code and models.

On Lines 52 and 53, we create our mlp  and cnn  models. Notice that regress=False  — our regression head comes later on Line 62.

We’ll then concatenate  the mlp.output  and cnn.output  as shown on Line 57. I’m calling this our combinedInput because it is the input to the rest of the network (from Figure 3 this is concatenate_1  where the two branches come together).

The combinedInput is built from the outputs of both the MLP and CNN branches: each branch ends in a 4-dim FC layer, so concatenating them yields an 8-dim vector. The final FC layers of the network therefore follow an 8-4-1 pattern.

We tack on a fully connected layer with four neurons to the combinedInput  (Line 61).

Then we add our "linear"  activation  regression head (Line 62), the output of which is the predicted price.

Our Model  is defined using the inputs  of both branches as our multi-input and the final set of layers x  as the output  (Line 67).
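The merge step can be sketched as below. To keep the block self-contained, two small stand-in branches are built inline with the functional API; they are simplified placeholders, not the create_mlp and create_cnn functions from the project, and the input sizes and ReLU activation on the 4-node layer are assumptions.

```python
from tensorflow.keras.layers import (Conv2D, Dense, Flatten, Input,
                                     MaxPooling2D, concatenate)
from tensorflow.keras.models import Model

# stand-in MLP branch ending in a 4-dim layer (10 input features assumed)
mlp_in = Input(shape=(10,))
m = Dense(8, activation="relu")(mlp_in)
m = Dense(4, activation="relu")(m)
mlp = Model(mlp_in, m)

# stand-in CNN branch ending in a 4-dim layer (64x64 RGB montage assumed)
cnn_in = Input(shape=(64, 64, 3))
c = Conv2D(16, (3, 3), padding="same", activation="relu")(cnn_in)
c = MaxPooling2D(pool_size=(2, 2))(c)
c = Flatten()(c)
c = Dense(4, activation="relu")(c)
cnn = Model(cnn_in, c)

# 4-dim + 4-dim -> 8-dim input to the head of the network
combinedInput = concatenate([mlp.output, cnn.output])

# head: FC layer with four neurons, then the linear regression node
x = Dense(4, activation="relu")(combinedInput)
x = Dense(1, activation="linear")(x)

# the final model takes both branch inputs and outputs one value
model = Model(inputs=[mlp.input, cnn.input], outputs=x)
```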

Let’s go ahead and compile, train, and evaluate our newly formed model :

Our model  is compiled with "mean_absolute_percentage_error"  loss  and an Adam  optimizer with learning rate decay  (Lines 72 and 73).

Training is kicked off on Lines 77-80. This is known as fitting the model (and is also where all the weights are tuned by the process known as backpropagation).
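A compact sketch of compiling and fitting a two-input model. It uses random arrays and a deliberately tiny model so it runs in seconds; the shapes, epoch count, and batch size are placeholders, and the optimizer uses a fixed learning rate (the learning rate decay mentioned above is omitted here).

```python
import numpy as np
from tensorflow.keras.layers import Dense, Flatten, Input, concatenate
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# random stand-in data: 16 "houses" with attributes, images, and prices
rng = np.random.default_rng(0)
attr_x = rng.random((16, 10)).astype("float32")
image_x = rng.random((16, 8, 8, 3)).astype("float32")
y = rng.uniform(0.1, 1.0, size=(16,)).astype("float32")

# tiny two-branch model, just enough to demonstrate the calls
attr_in = Input(shape=(10,))
image_in = Input(shape=(8, 8, 3))
a = Dense(4, activation="relu")(attr_in)
b = Dense(4, activation="relu")(Flatten()(image_in))
out = Dense(1, activation="linear")(concatenate([a, b]))
model = Model(inputs=[attr_in, image_in], outputs=out)

# mean absolute percentage error as the loss, Adam as the optimizer
model.compile(loss="mean_absolute_percentage_error",
              optimizer=Adam(learning_rate=1e-3))

# inputs are passed as a list, in the same order as the model's inputs
history = model.fit(x=[attr_x, image_x], y=y,
                    epochs=2, batch_size=8, verbose=0)
preds = model.predict([attr_x, image_x], verbose=0)
```

The only difference from a single-input workflow is that `x` is a list with one array per input.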

Calling model.predict  on our testing data (Line 84) allows us to grab predictions for evaluating our model. Let’s perform evaluation now:

To evaluate our model, we have computed absolute percentage difference (Lines 89-91) and used it to derive our final metrics (Lines 95 and 96).

These metrics (price mean, price standard deviation, and mean + standard deviation of the absolute percentage difference) are printed to the terminal with proper currency locale formatting (Lines 100-103).
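The evaluation arithmetic can be sketched with NumPy on made-up numbers. The variable names and values are illustrative only.

```python
import numpy as np

# made-up scaled predictions (shape N x 1) and true scaled prices
preds = np.array([[0.5], [0.9], [0.3]])
test_y = np.array([0.4, 1.0, 0.3])

# absolute percentage difference between prediction and ground truth
diff = preds.flatten() - test_y
abs_percent_diff = np.abs(diff / test_y * 100)

# final metrics: mean and standard deviation of that difference
mean = np.mean(abs_percent_diff)
std = np.std(abs_percent_diff)
```

Here the per-house errors are 25%, 10%, and 0%, so the mean is about 11.67%. Because the metric is a ratio, it is the same whether computed on scaled or unscaled prices.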

Multi-input and mixed data results

Figure 8: Real estate price prediction is a difficult task, but our Keras multi-input + mixed input regression model yields relatively good results on our limited House Prices dataset.

Finally, we are ready to train our multi-input network on our mixed data!

Make sure you have:

  1. Configured your dev environment according to the first tutorial in this series.
  2. Used the “Downloads” section of this tutorial to download the source code.
  3. Downloaded the house prices dataset using the instructions in the “Obtaining the House Prices dataset” section above.

From there, open up a terminal and execute the following command to kick off training the network:

Our mean absolute percentage error starts off very high but continues to fall throughout the training process.

By the end of training, we obtain a mean absolute percentage error of 22.41% on our testing set, implying that, on average, our network will be ~22% off in its house price predictions.

Let’s compare this result to our previous two posts in the series:

  1. Using just an MLP on the numerical/categorical data: 26.01%
  2. Using just a CNN on the image data: 56.91%

As you can see, working with mixed data by:

  1. Combining our numerical/categorical data along with image data
  2. And training a multi-input model on the mixed data…

…has led to a better performing model!


In this tutorial, you learned how to define a Keras network capable of accepting multiple inputs.

You learned how to work with mixed data using Keras as well.

To accomplish these goals we defined a multiple input neural network capable of accepting:

  • Numerical data
  • Categorical data
  • Image data

The numerical data was min-max scaled to the range [0, 1] prior to training. Our categorical data was one-hot encoded (also ensuring the resulting integer vectors were in the range [0, 1]).

The numerical and categorical data were then concatenated into a single feature vector to form the first input to the Keras network.

Our image data was also scaled to the range [0, 1] — this data served as the second input to the Keras network.

One branch of the model included strictly fully-connected layers (for the concatenated numerical and categorical data) while the second branch of the multi-input model was essentially a small Convolutional Neural Network.

The outputs of both branches were combined and a single output (the regression prediction) was defined.

In this manner, we were able to train our multiple input network end-to-end, resulting in a lower error than using just one of the inputs alone.

I hope you enjoyed today’s blog post — if you ever need to work with multiple inputs and mixed data in your own projects definitely consider using the code covered in this tutorial as a template.

From there you can modify the code to your own needs.

To download the source code, and be notified when future tutorials are published here on PyImageSearch, just enter your email address in the form below!


75 Responses to Keras: Multiple Inputs and Mixed Data

  1. Rich February 4, 2019 at 11:39 am #

    This is a great example of fusion, thank you Adrian!

    Do you have an idea of how you would use pre-trained weights, with the features already mapped, of something trained on VOC or Imagenet in the CNN branch? Where would something like that fit in the process?

    • Adrian Rosebrock February 4, 2019 at 11:51 am #

      Do you already know how to perform fine-tuning? If not, make sure you read up on the topic. Deep Learning for Computer Vision with Python covers fine-tuning and transfer learning in detail.

      For a project such as this one you would perform the standard fine-tuning process for a CNN but this time have a separate branch for your categorical and numeric value inputs. From there you can train your network.

      • Hendrick October 28, 2019 at 5:17 am #

        Hi Adrian . I am your big fans. How to create a dataset for this tutorial

        • Adrian Rosebrock November 7, 2019 at 10:38 am #

          Hey Hendrick — could you be a bit more specific? What type of dataset are you trying to create?

    • Khine March 7, 2019 at 11:40 am #

      Dear Rich,

      For your case, you would need to write custom function to load pre-trained weight first. And, you can use resulted matrix in input layer.

    • riyaz July 8, 2019 at 7:17 am #

      saving images to numpy array will require so much space what is another option to save it ?
      if we want to train a network on different images of same class , how do i use image generator during training to pass train directory of two diff images?

  2. Arslan February 4, 2019 at 11:53 am #

    You are really inspiration for me to always learn new concepts in machine learning. Let me know if you have any plans for Speech Recognition/Voice related projects?
    Thanks again for the great post.

    • Adrian Rosebrock February 4, 2019 at 12:26 pm #

      Thanks Arslan 🙂 I primarily cover computer vision here so I don’t currently have any plans for speech recognition tutorials but I do think it’s an interesting topic. Thanks for the suggestion but to be honest it’s probably a bit unlikely for me to cover here.

  3. Laxmi February 5, 2019 at 1:28 am #

    Thanks Adrian for the great tutorial as always! Being a newbie in deep learning I have been following your post rapidly these days with resourceful information. I am not sure if I can ask this question here; my apology in advance if not. I studied Francis´s keras book mentioning about the usages of Sequential and functional APIs; also seen in your posts and here as well. But I have noticed some examples like this have used Sequential model also for the model architecture from Inception too. Is it possible to use Sequential model instead for functional API for Inception, ResNet or Xception too like this medium post?

    • Adrian Rosebrock February 5, 2019 at 9:16 am #

      Any Sequential API network can be implemented with a Functional API. The reverse is not true.

      • Laxmi February 6, 2019 at 8:11 pm #

        Thank you for the response. So, we can not use Sequential Model for inception, resnet or other deeper CNN then, is it?

        • Adrian Rosebrock February 7, 2019 at 6:57 am #

          Any model that is non-sequential, such as Inception or ResNet, cannot be implemented using the Sequential class. You would need the Functional API instead.

  4. Yash Rathod February 5, 2019 at 8:20 am #

    Hey Adrian, you got some wonderful stuff here. I follow each and every article you post and certainly I love each of them. Also its due to you I’ve gained so much interest in computer vision and machine learning.

    • Adrian Rosebrock February 5, 2019 at 9:10 am #

      Thanks for being a reader Yash, I’m glad you’re enjoying the tutorials!

  5. Henry February 5, 2019 at 4:15 pm #

    For the maxPrice, would it be better if we use maxPrice = df["price"].max()

    This will guarantee the scaled trainY and testY values are within the range [0, 1].

    maxPrice = trainAttrX["price"].max() will work most of time but it is possible the real maxPrice is in testAttrX["price"], not in trainAttrX["price"]. The outcome may not be significant but I feel maxPrice = df["price"].max() is more logical.

    • Adrian Rosebrock February 7, 2019 at 7:21 am #

      Technically you are right, that would absolutely guarantee the target values are in the range [0, 1].

      However, that is incorrect in the context of running a deep learning experiment.

      We are not allowed to use our test set to determine any information on the training process. We can only use the training set to determine any values required for normalization, scaling, preprocessing, etc.

      • Henrik T. March 21, 2019 at 7:53 am #

        Also, for future predictions of unknown “samples”, you might encounter samples with even higher price. So it will be impossible to ensure that future samples stay within the [0, 1] range (unless you set a very high “maxPrice” – but then you no longer span training data from [0, 1] which is sub-optimal).

  6. Juan E. Tapiero February 5, 2019 at 11:23 pm #

    Hi Adrian, I have been happily following this project (and all the cool stuff you post). I just have a small question… why do you think I obtain different mean and standard deviation results without changing anything at all on the code? Thanks!

    • Adrian Rosebrock February 7, 2019 at 7:18 am #

      The weights in the neural network are randomly initialized. Neural networks are stochastic algorithms. You will get slightly different results each time and since we’re using a very small dataset a poor initialization of weights may lead to worse results.

      • Denis Brion February 8, 2019 at 9:25 am #

        Thanks Juan for asking a question I was asking myself and Adrian for answering.
        I hesitated between a poor random initialisation and overflow/numerical accidents. (Is there a way to fix the random seed with Keras? For debugging purposes, it would be more comfortable to be able to reproduce a bug.) As I use an RPi, I was unsure of the quality of Keras there (I noticed that it used 3 processors, with learning times 10 times slower than on Adrian’s PC…), which leaves very little time to try to fix or adapt things.
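On the fixed-seed question above, here is a hedged sketch of one common approach: seed every random number generator the training script touches before building the model. The helper name `set_seeds` is hypothetical, and the TensorFlow call assumes the TensorFlow 2.x API; it is skipped if TensorFlow is not installed.

```python
import random

import numpy as np


def set_seeds(seed=42):
    """Seed the common sources of randomness (hypothetical helper)."""
    random.seed(seed)      # Python's built-in RNG
    np.random.seed(seed)   # NumPy's global RNG (used by many shuffling/splitting helpers)
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)  # TensorFlow 2.x global seed (weight initialization, dropout)
    except (ImportError, AttributeError):
        # TensorFlow missing, or an older release without tf.random.set_seed.
        pass


# Re-seeding reproduces the same random draws.
set_seeds(7)
first = np.random.rand(3)
set_seeds(7)
second = np.random.rand(3)
print(np.array_equal(first, second))  # True
```

Even with fixed seeds, some GPU operations and multi-threaded execution can remain nondeterministic, so small run-to-run differences may persist.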

  7. Stalin Amirtharaj K February 6, 2019 at 7:01 am #

    Another great article, Adrian! Your articles serve as a reference for all of my learning and POCs. Thanks a lot 🙂

    • Adrian Rosebrock February 7, 2019 at 7:08 am #

      Thanks so much Stalin, I’m glad you’re enjoying the tutorials!

  8. Yingjie February 7, 2019 at 10:50 pm #

    Thank you for your explanation! This article helps me a lot! However, I’m still confused about implementing multiple inputs and multiple outputs case. In this article, different inputs have the same size, how does Keras deal with inputs with different size and multiple outputs? Especially when I’d like to iteratively sample a mini-batch from dataset A and then sample a mini-batch from dataset B. Thanks!

    • Xu Zhang March 13, 2019 at 2:45 pm #

      @Yingjie & @Adrian Rosebrock,
      In this post, there are two datasets with the same number of samples. If some samples don’t have images, what should we do? Does the Keras API for multi-input models require the same number of samples for each input? Many thanks

      • Adrian Rosebrock March 13, 2019 at 3:04 pm #

        There is an entire body of literature that governs “missing values” in machine learning. You should spend some time researching it and looking into it.

  9. Fadwa Fawzy February 9, 2019 at 3:12 am #

    Hello Adrian,

    Thank you so much for this awesome tutorial.

    I have a question though: why did you choose #bedRs and #BathRs to be continuous features rather than categorical?

    I thought a continuous feature means it can accept real values like the area, but a house can’t have 2.5 bathrooms :D.

    • Adrian Rosebrock February 10, 2019 at 6:49 am #

      Actually a house can have 2.5 bathrooms. The “.5” is normally referred to as a “half bath”, such as a toilet + sink but no shower/bath.

  10. Fadwa Fawzy February 9, 2019 at 4:30 am #

    Another question. What if the number of images per entry is not consistent? For example, the number of images per house entry ranges from 0 to 10.

    • Adrian Rosebrock February 10, 2019 at 6:49 am #

      See the other comments on this post where I address the same question.

  11. Kent February 11, 2019 at 1:27 pm #

    Hi Adrian,
    Sorry for late post. Thank you for the tutorial. Your efforts are appreciated. Sometimes intermediate results can be pulled as an embedding, to my understanding. If a joint embedding was the goal, could that be the output of the concatenation function ? Could this embedding be used in a similarity calculation ? Thank you for patience with my question. I am still trying to understand nuances. Kent

    • Adrian Rosebrock February 14, 2019 at 1:29 pm #

      You are correct, an embedding can be used here as well.

  12. Marco February 14, 2019 at 3:58 pm #

    Did you get a chance to try this with individual images that are larger than 32×32? To me this feels like it cannot capture much information about the house.

    • Adrian Rosebrock February 15, 2019 at 6:18 am #

      You can modify the architecture and try experiments with larger images if you would like. The problem is that if you increase the spatial dimensions of the images you will need to deepen your network as well. That raises a big problem as our dataset itself is so small. Realistically we would need a larger dataset to obtain higher accuracy.

      • Ravi March 6, 2019 at 5:41 am #

        Hi Adrian,

        Thanks for your tutorial. Please suggest how I can get past “from pyimagesearch import models”, as I could not find the respective packages. Many thanks.

        • Adrian Rosebrock March 8, 2019 at 5:43 am #

          You need to use the “Downloads” section of the post to download the “pyimagesearch” module.

  13. Kyoosik Kim March 11, 2019 at 7:54 pm #

    Thank you so much for your work! Inspired by the tutorial, I have been doing some NLP project. I wish you could give me some advice if I can ask. There are two types of inputs; text and numerical. The texts are converted into TfIdf vectors in 1000 length and the numerical feature is simply one column added onto each vector. Now, concatenating them creates a new set of features of 1001 size. Each row should be sparse since 1000 of them are TfIdf vector values. My question is whether or not the only numerical feature may be overpowered by the TfIdf vector values. If so, how can I put more weight on the numerical feature?

    • Adrian Rosebrock March 13, 2019 at 3:24 pm #

      I don’t do much work in NLP. You should reach out to Jason over at Machine Learning Mastery. He knows a bit more about NLP than I do and would likely be able to give you a more detailed answer.

  14. Henrik T. March 21, 2019 at 9:26 am #

    Hi Adrian,
    Excellent tutorial (the series)! Previously I have been searching “high and low” for good introductions to CNN-regression, without really finding any. And the introduction to the Keras functional API and example of mixed input is equally great.
    I have two questions that I hope you can address:
    1) With a mixed input setup, is it possible to use augmentation on the CNN branch (I know it doesn’t make sense with the Houses example), or will that break the “alignment” of the two branches?
    2) Again, in relation to the mixed input, I am trying to get my head straight on the following: does merging in the MLP impact the learning in the CNN branch? What I mean is: assuming IDENTICAL initialization of the weights in the “pure CNN” regression example and the “mixed input CNN” regression example, will the weights end up different due to the concatenated MLP? Or does the CNN branch act as an “independent feature extractor”?

    • Adrian Rosebrock March 22, 2019 at 8:38 am #

      Thanks Henrik, I’m glad you enjoyed the guide!

      1. You can certainly apply data augmentation to the CNN branch. I would encourage it, actually. You can either augment each image individually or simply augment the entire montage. I would test both.

      2. The two branches are independent until they merge. They don’t share any intermediary layers until they are concatenated.

  15. ashish March 27, 2019 at 4:52 am #

    Hi, thanks for the great tutorial. I have a question: what will happen if we set regress=True for the individual networks (i.e., the CNN and the MLP)? Is it okay to use regress=True for both networks? I have tried this and am getting good results for my application.

  16. Peter Yu April 14, 2019 at 5:04 am #

    Hi Adrian,

    Many thanks for your excellent and detailed tutorial, which exactly solves the puzzles I am now coming across.

    I hope to be allowed for a further question. What if I use the ImageDataGenerator to preprocess trainImagesX, how do I get to combine the preprocessed trainImagesX and another categorical or numeric attribute trainAttrX using model.fit_generator?

    • Adrian Rosebrock April 18, 2019 at 7:28 am #

      I would suggest reading this tutorial to learn how to create your own custom data generator for Keras.
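As a rough illustration of what such a custom generator could look like for mixed data, here is a minimal NumPy-only sketch. The function name `mixed_batch_generator`, the random horizontal flip, and the array shapes are assumptions for illustration, not code from the tutorial. The key idea is that the same indices select the attributes, images, and targets of each batch, so augmenting the images never breaks the alignment between inputs:

```python
import numpy as np


def mixed_batch_generator(attrs, images, targets, batch_size=8, seed=None):
    """Yield ([attrs_batch, images_batch], targets_batch) forever (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n = len(targets)
    while True:
        order = rng.permutation(n)  # reshuffle once per pass over the data
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            batch_images = images[idx].copy()
            # Example augmentation: flip roughly half of the images left-to-right.
            # Images are assumed to have shape (N, height, width, channels).
            flip = rng.random(len(idx)) < 0.5
            batch_images[flip] = batch_images[flip][:, :, ::-1, :]
            # The same idx selects attributes and targets, keeping rows aligned.
            yield [attrs[idx], batch_images], targets[idx]


# Tiny synthetic dataset: column 0 of attrs equals the target, to make alignment visible.
attrs = np.stack([np.arange(10), np.arange(10)], axis=1).astype("float32")
images = np.zeros((10, 64, 64, 3), dtype="float32")
targets = np.arange(10, dtype="float32")

gen = mixed_batch_generator(attrs, images, targets, batch_size=4, seed=0)
(batch_attrs, batch_images), batch_y = next(gen)
print(batch_attrs.shape, batch_images.shape, batch_y.shape)
```

A generator like this would be passed to `model.fit_generator` (or to `model.fit` in recent TensorFlow releases, which accept generators directly) together with `steps_per_epoch`. The input order must match the order of the model's inputs, and depending on the Keras version the inputs may need to be yielded as a tuple or dict instead of a list.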

  17. Ismail April 18, 2019 at 3:49 am #

    Excellent tutorial Adrian. Please keep this up.

    • Adrian Rosebrock April 18, 2019 at 6:25 am #

      Thanks Ismail!

  18. Ulf Wallgren May 4, 2019 at 5:10 am #

    Hi Adrian
    Excellent tutorial!
    Do you have any plans to port this or any older tutorial to TensorFlow 2.0 alpha0?

    • Adrian Rosebrock May 8, 2019 at 1:28 pm #

      I’m not sure what you mean. This code will work using TensorFlow 2.0. Either use TF 2 as a backend to Keras or use the “tf.keras” implementation directly.

  19. sara May 7, 2019 at 10:21 am #

    Dear Adrian:

    I really appreciate your useful tutorials, Just simple question. If I have multiple EEG time-series file from different patients, could I concatenate all of them then feed the data to the NN?


  20. Daniel May 17, 2019 at 6:11 pm #

    Hey Adrian, many thanks for this very detailed analysis; unfortunately, you don’t find enough of these awesome approaches. However, I have a short question about your analysis: why is the last step, in which testY is subtracted from preds, carried out?

    diff = preds.flatten() - testY

    I did the analysis once and the values in preds[1].flatten() and without .flatten() are always constant, is that right?
    Only by subtracting the values change (as in your analysis), is this common when working with multiple inputs, or why are the values constant?

    • Adrian Rosebrock May 23, 2019 at 10:06 am #

      Both “preds” and “testY” are vectors, holding the predicted values and the ground-truth values, respectively. We take the element-wise difference using that line of code.
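A small sketch of that element-wise difference and the percentage statistics derived from it, using made-up numbers. The variable names follow the snippet quoted above where possible; the values are hypothetical. It also shows why `.flatten()` matters: predictions come back with shape (N, 1), and subtracting a length-N vector from that would broadcast to an N×N matrix instead of N differences.

```python
import numpy as np

# Hypothetical scaled predictions with shape (N, 1), as a single-output
# regression model returns them, and the matching ground-truth vector.
preds = np.array([[0.50], [0.30], [0.80]])
test_y = np.array([0.40, 0.30, 1.00])

# Element-wise difference between predictions and ground truth.
diff = preds.flatten() - test_y
percent_diff = (diff / test_y) * 100
abs_percent_diff = np.abs(percent_diff)

# Mean and standard deviation of the absolute percentage difference.
mean = np.mean(abs_percent_diff)
std = np.std(abs_percent_diff)
print(mean, std)

# Without flatten(), broadcasting silently produces a 3x3 matrix.
print((preds - test_y).shape)
```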

  21. Nishu June 5, 2019 at 9:27 am #

    Hello Adrian,

    Loved your Tutorial!!! I got some new ideas just reading through your tutorial!! Thank you very much! Please post such tutorials more!!

    I have a question. Presently I have a data set that contains a label for two images simultaneously. Basically if two images are related, label is 1 and 0 if not. I don’t know how to put these two images in Keras model. I have converted these two images into array. But I am stuck now!!

    Description about data set:
    Total rows: 50000
    Each row contains two images in one column and the label in another column.

  22. Samuel Howell August 15, 2019 at 2:06 pm #

    Hello Adrian,

    Awesome tutorial! This is the only thing I could find online to show me how to use image and non image data together, very awesome!

    But now that I have done the tutorial, I am confused about how to use the model to evaluate new data.

    For example, if I have 4 images for a house, and all the data except for the price, how would I use my model to predict a price for it? I have my model saved and I know how to load it, I’m just not sure how to correctly plug in my data….

    • Adrian Rosebrock August 16, 2019 at 5:27 am #

      Thanks for the kind words, Samuel. I’m glad you enjoyed the post. To make predictions you use the “model.predict” method. Form your input vector in the same way as we do in the guide (concatenate the categorical values, etc.)
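A hedged sketch of what that could look like for a single new house. The helper `predict_price`, the `StubModel` class, and all shapes and values are hypothetical: the stub only stands in for a loaded Keras model so the example runs without one. The sketch assumes the attribute vector has already been preprocessed with the same scaler and encoders fitted on the training data, that images are scaled to [0, 1], and that prices were scaled by the training maximum.

```python
import numpy as np


def predict_price(model, attr_vector, montage, max_price):
    """Predict the price of one house with a two-input model (hypothetical helper)."""
    # Add a batch dimension: the model expects (N, num_attrs) and (N, H, W, C).
    attrs = np.asarray(attr_vector, dtype="float32")[np.newaxis, :]
    image = np.asarray(montage, dtype="float32")[np.newaxis, ...] / 255.0
    # Input order must match the order used when the model was defined.
    scaled = np.asarray(model.predict([attrs, image]))
    # Undo the target scaling applied at training time.
    return float(scaled[0, 0]) * max_price


class StubModel:
    """Placeholder for a loaded Keras model; always predicts 0.5 (hypothetical)."""

    def predict(self, inputs):
        attrs, image = inputs
        assert attrs.shape == (1, 10) and image.shape == (1, 64, 64, 3)
        return np.full((len(attrs), 1), 0.5)


new_attrs = np.zeros(10)                         # already scaled / one-hot encoded
new_montage = np.zeros((64, 64, 3), dtype="uint8")  # montage of the four house images
price = predict_price(StubModel(), new_attrs, new_montage, max_price=800000.0)
print(price)  # 400000.0
```

With a real model, `StubModel()` would be replaced by the object returned from loading the saved model, and `max_price` by the value computed on the training set.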

  23. Mark Ryan August 18, 2019 at 6:04 pm #

    Great, clear article. Thanks very much for sharing it.

    I am very interested in two aspects of getting multi-input Keras models into production:
    1. Creating a pipeline to encapsulate the data preparation steps so that the same steps can be applied before running the model on new data values. I’ve looked at wrapping the Keras model so it can be used with a scikit-learn pipeline, but I haven’t seen any examples of this working for multi-input Keras models.

    2. deploying multi-input Keras models. I’ve attempted to deploy a multi-input Keras model with AWS Sagemaker, but there seem to be some showstopper issues with the needed libraries that expect single input for Keras models.

    Any tips on the above two aspects of putting multi-input Keras models into production? Have you seen any end-to-end examples of multi-input Keras models that incorporate a pipeline for data prep as well as deployment? THANKS!!

  24. j September 22, 2019 at 11:07 pm #

    Good job, a great contributor to the advancement of knowledge

    • Adrian Rosebrock September 25, 2019 at 10:41 am #

      Thanks so much! 🙂

  25. quzhou October 7, 2019 at 7:43 pm #

    Hi Adrian,
    Thank you so much for the tutorial. How do you make sure that the house attributes (line 24) and the house image (line 29) correspond to the same house?


    • Adrian Rosebrock October 10, 2019 at 10:19 am #

      Both functions access our Pandas’ DataFrame and loop over the valid entries sequentially, thus ensuring that the entries correspond to the same house.
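When the two data sources are not read through the same sequential loop, one way to get the same guarantee is to align them explicitly on a shared ID. The following is a minimal sketch with made-up records; the dictionaries, IDs, and shapes are hypothetical and not part of the tutorial's code.

```python
import numpy as np

# Hypothetical records keyed by a shared house ID. The image store has an
# extra entry (ID 4) with no attributes and a different insertion order.
attributes = {1: [3, 2.0, 1500], 2: [4, 2.5, 2100], 3: [2, 1.0, 900]}
images = {
    3: np.full((4, 4, 3), 3, dtype="uint8"),
    1: np.full((4, 4, 3), 1, dtype="uint8"),
    4: np.full((4, 4, 3), 4, dtype="uint8"),
    2: np.full((4, 4, 3), 2, dtype="uint8"),
}

# Keep only IDs present in both sources, in one fixed order.
ids = sorted(set(attributes) & set(images))

# Build both arrays from the same ID list so that row k of each
# array always refers to the same house.
attr_x = np.array([attributes[i] for i in ids], dtype="float32")
image_x = np.stack([images[i] for i in ids])

print(ids)                          # [1, 2, 3]
print(attr_x.shape, image_x.shape)  # (3, 3) (3, 4, 4, 3)
```

Any train/test split should then be applied to both arrays with the same indices, for example by splitting them together in one call.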

      • bit_scientist January 14, 2020 at 4:01 am #

        I was about to ask a similar question and looked through the comments as you recommended.

        I have both image and text datasets of patients. The image dataset is in a folder (not in a Pandas DataFrame) and the text dataset is in an .xls file. What puzzles me is how I should create the respective models for them so that I can safely concatenate them later. The image file names and a column in the text data contain ID values that let me keep the two consistent.

        Now my question: do you think it is a good approach to convert both the image and text data into Pandas DataFrames separately, like you did, and proceed from there?

        Thanks for all your contribution to the field.

        • Adrian Rosebrock January 16, 2020 at 10:31 am #

          It’s hard to say without knowing more details about your project along with your current experience level with deep learning. Feel free to send me a note and we can continue the conversation over email.

  26. Шахбоз Кодиров October 21, 2019 at 3:09 am #

    Dear Mr. Adrian Rosebrock! My name is Shahboz Qodirov. I am a post-graduate student of the South Ural State University, the High School of Electronics and Computer Science. I’m attached to the department of Informational and Measuring Equipment.
    My field of research is a computer science and information processing. The topic of my thesis is “Development of artificial neural network for predicting drill pipe sticking”.
    Dear Mr. Adrian Rosebrock, I read your work (Keras: Multiple Inputs and Mixed Data), it’s awesome work! I haven’t seen such work on the Internet, you are genius! I really want to reproduce your work, but it doesn’t work out for me, since I use Jupyter notebook. I downloaded codes for this work from your (post) site, and really want to repeat your work. I want to deal with your approach. Dear Adrian Rosebrock, please, send me the codes of your work suitable for jupyter notebook, I will be very grateful to you.
    Yours sincerely, Shahboz Qodirov.

  27. Arjun October 24, 2019 at 7:44 am #

    Hi adrian,
    I have been a fan of your work for the last few months.
    I am currently working on a project on music generation from lyrics. It is an interesting one but seems a bit too much to handle for a beginner like me. It basically deals with matching the syllables of the lyrics to the musical notes and building an RNN model. I am basing this project on a paper on the same topic, but it deals with two sequences as input. I am attaching a link to the paper and I would like some insight from you on this project.
    Please do help me with your valuable insight.
    Thank you

    • Adrian Rosebrock October 25, 2019 at 10:16 am #

      That sounds like a very neat project Arjun; however, I ask that you be respectful of my time and read this entry in my FAQ. While it’s my pleasure to help you on your journey to understanding computer vision and deep learning, I do not dissect papers/code on a 1:1 basis. It’s just too time consuming on my part — I hope you understand.

  28. Abeer November 7, 2019 at 5:16 pm #

    I love this tutorial. It is exactly what I need. Thank you very much for your hard work!!

    • Adrian Rosebrock November 14, 2019 at 9:37 am #

      Thanks Abeer! 🙂

  29. Edward November 15, 2019 at 9:51 pm #

    Hi Adrian,

    Have you tried multiple inputs (numerical + categorical and image) with data augmentation for the image?

    • Furong December 3, 2019 at 3:22 am #

      Hi, I am interested to know what kind of data augmentation you mean? Thanks

  30. cham November 18, 2019 at 12:09 am #

    Dear Mr.Adrian Rosebrock.
    I love this tutorial and the tutorials from the previous two weeks.
    I read three part series on Keras and regression.
    I really appreciate your useful and detail explanation!

    However, I have one question.
    How can I create a regression model which predicts an n-dimensional vector, not just a scalar like the house price?

    For example, I want a model to predict a 3-dimensional vector (x, y, z) as below.

    input: image & some attributes.
    output: (x,y,z)

    How can I create such a model based on “keras-multi-input” ?


    Of course, I will use (x,y,z) training data instead of house price.

    • Adrian Rosebrock November 21, 2019 at 9:15 am #

      (x, y, z) data is just a standard integer/floating point to a neural network. You could just pass the vector in to the network as a flattened list.

  31. Nicolas Morales November 21, 2019 at 12:33 pm #

    Thank you for this tutorial! It has been very informative. Wouldn’t you want to use the third dimension in your montage, on lines 87 to 90?

  32. Defneh December 2, 2019 at 11:36 am #

    Adrian, you are amazing!!! Thank you for this tutorial. I am new to machine learning. I don’t know if my question makes sense, but is it possible to use this method for image classification, for example classifying the health of a patient (has the disease or not), as you mentioned in the tutorial? What are the general changes that I have to make to this code?
    Thank you again

    • Adrian Rosebrock December 5, 2019 at 10:32 am #

      Yes, you can use this method for any neural network that accepts multi-modal data. Exactly what changes are required depends on your dataset.

  33. Ernie December 9, 2019 at 9:49 am #

    Looking over this, it appears to be a great tutorial.
    But, Python does not manage memory very well and is not good at using GPU’s either.
    Have you ported this to C or C++ ?
    Or has anyone else done this? I would love a copy of that.
    Thanks for the great tutorial !

  34. aravali December 20, 2019 at 6:56 pm #

    Is it possible to fuse image and time-series ECG signals in an LSTM model?
    I want to use time series ECG signal along with images of patients.
    I am unable to use them together in LSTM.
    Please help me if it is possible.

  35. Cristian Arteaga January 12, 2020 at 2:56 pm #

    Dear Adrian. Thank you so much for this great post. It’s awesome to find such high quality and complete explanations with code like the ones you provide. I hope you can help me with a quick question about the difference in training regimes for each data type and how it affects when I concatenate both NN models. I am facing two main challenges.
    First, in my problem, the MLP for numerical data converges after 150 epochs and the CNN for image data converges after 13 epochs. Any advice on how to deal with such difference (of required epochs) when I concatenate both the MLP and CNN? Can the Adam decay help to this?
    Second, the loss values have different ranges or scale. For the numerical MLP the range is (0,8) and for the image CNN the range is (0,2). Any advice on how to deal with this difference in ranges of loss values when I concatenate both NN?

    I think my two challenges are related. I suspect the different training regimes for each NN is giving me bad results when I concatenate them (bad metrics). I would appreciate any advice you can give me. Thanks in advance!
