Keras vs. tf.keras: What’s the difference in TensorFlow 2.0?

In this tutorial you’ll discover the difference between Keras and tf.keras, including what’s new in TensorFlow 2.0.

Today’s tutorial is inspired by an email I received last Tuesday from PyImageSearch reader, Jeremiah.

Jeremiah asks:

Hi Adrian, I saw that TensorFlow 2.0 was released a few days ago.

TensorFlow developers seem to be promoting Keras, or rather, something called tf.keras, as the recommended high-level API for TensorFlow 2.0.

But I thought Keras was its own separate package?

I’m so confused about “which Keras package” I should be using when training my own networks.

Secondly, is TensorFlow 2.0 worth upgrading to?

I’ve seen a few tutorials in the deep learning blogosphere discussing TensorFlow 2.0, but with all the confusion regarding Keras, tf.keras, and TensorFlow 2.0, I’m at a loss for where to start.

Could you shed some light on this area?

Great questions, Jeremiah.

Just in case you didn’t hear, the long-awaited TensorFlow 2.0 was officially released on September 30th.

And while it’s certainly a time for celebration, many deep learning practitioners such as Jeremiah are scratching their heads:

  • What does the TensorFlow 2.0 release mean for me as a Keras user?
  • Am I supposed to use the keras package for training my own neural networks?
  • Or should I be using the tf.keras submodule inside TensorFlow 2.0 instead?
  • Are there TensorFlow 2.0 features that I should care about as a Keras user?

The transition from TensorFlow 1.x to TensorFlow 2.0 is going to be a bit of a rocky one, at least to start, but with the right understanding, you’ll be able to navigate the migration with ease.

Inside the rest of this tutorial, I’ll be discussing the similarities and differences between Keras, tf.keras, and the TensorFlow 2.0 release, including the features you should care about.

To learn the difference between Keras, tf.keras, and TensorFlow 2.0, just keep reading!

In the first part of this tutorial, we’ll discuss the intertwined history of Keras and TensorFlow, including how their popularity fed and nurtured each other, leading us to where we are today.

I’ll then discuss why you should be using tf.keras for all your future deep learning projects and experiments.

Next, I’ll discuss the concept of a “computational backend” and how TensorFlow’s popularity enabled it to become Keras’ most prevalent backend, paving the way for Keras to be integrated into the tf.keras submodule of TensorFlow.

Finally, we’ll discuss some of the most popular TensorFlow 2.0 features you should care about as a Keras user, including:

  • Sessions and eager execution
  • Automatic differentiation
  • Model and layer subclassing
  • Better multi-GPU/distributed training support

Included in TensorFlow 2.0 is a complete ecosystem comprising TensorFlow Lite (for mobile and embedded devices) and TensorFlow Extended (TFX) for building and deploying production machine learning pipelines.

Let’s get started!

The intertwined relationship between Keras and TensorFlow

Figure 1: Keras and TensorFlow have a complicated history together. Read this section for the CliffsNotes of their love affair. With TensorFlow 2.0, you should be using tf.keras rather than the separate Keras package.

Understanding the complicated, intertwined relationship between Keras and TensorFlow is like listening to the love story of two high school sweethearts who start dating, break up, and eventually find their way back together — it’s long, detailed, and at some points even contradictory.

Instead of recounting the full love story, we’ll review the CliffsNotes:

  • Keras was originally created and developed by Google AI Developer/Researcher, Francois Chollet.
  • Francois committed and released the first version of Keras to his GitHub on March 27th, 2015.
  • Initially, Francois developed Keras to facilitate his own research and experiments.
  • However, with the explosion of deep learning popularity, many developers, programmers, and machine learning practitioners flocked to Keras due to its easy-to-use API.
  • Back then, there weren’t too many deep learning libraries available — the popular ones included Torch, Theano, and Caffe.
    • The problem with these libraries was that it was like trying to write assembly/C++ to perform your experiments — tedious, time-consuming, and inefficient.
    • Keras, on the other hand, was extremely easy to use, making it possible for researchers and developers to iterate on their experiments faster.
  • In order to train your own custom neural networks, Keras required a backend.
    • A backend is a computational engine — it builds the network graph/topology, runs the optimizers, and performs the actual number crunching.
    • To understand the concept of a backend, consider building a website from scratch. Here you may use the PHP programming language and a SQL database. Your SQL database is your backend. You could use MySQL, PostgreSQL, or SQL Server as your database; however, your PHP code used to interact with the database will not change (provided you’re using some sort of MVC paradigm that abstracts the database layer, of course). Essentially, PHP doesn’t care what database is being used, as long as it plays by PHP’s rules.
    • The same is true with Keras. You can think of the backend as your database and Keras as your programming language used to access the database. You can swap in whatever backend you like, and as long as it abides by certain rules, your code doesn’t have to change.
    • Therefore, you can think of Keras as a set of abstractions that makes it easier to perform deep learning (side note: While Keras always enabled rapid prototyping, it was not flexible enough for researchers. That’s changing in TensorFlow 2.0 — more on that later in this article).
  • Keras’ original default backend was Theano, and it remained the default until v1.1.0.
  • At the same time, Google had released TensorFlow, a symbolic math library used for machine learning and training neural networks.
    • Keras started supporting TensorFlow as a backend, and slowly but surely, TensorFlow became the most popular backend, resulting in TensorFlow being the default backend starting from the release of Keras v1.1.0.
  • Once TensorFlow became the default backend for Keras, by definition, both TensorFlow and Keras usage grew together — you could not have Keras without TensorFlow, and if you installed Keras on your system, you were also installing TensorFlow.
    • Similarly, TensorFlow users were becoming increasingly more drawn to the simplicity of the high-level Keras API.
  • The tf.keras submodule was introduced in TensorFlow v1.10.0, the first step in integrating Keras directly within the TensorFlow package itself.
    • The tf.keras package is/was separate from the keras package you would install via pip (i.e., pip install keras).
    • The original keras package was not subsumed into tensorflow to ensure compatibility and so that they could both organically develop.
  • However, that’s now changing — when Google announced TensorFlow 2.0 in June 2019, they declared that Keras is now the official high-level API of TensorFlow for quick and easy model design and training.
  • With the release of Keras 2.3.0, Francois has stated that:
    • This is the first release of Keras that brings the keras package in sync with tf.keras
    • It is the final release of Keras that will support multiple backends (i.e., Theano, CNTK, etc.).
    • And most importantly, going forward all deep learning practitioners should switch their code to TensorFlow 2.0 and the tf.keras package.
    • The original keras package will still receive bug fixes, but moving forward, you should be using tf.keras.

As you can tell, the history between Keras and TensorFlow is long, complicated, and intertwined.

But the most important takeaway for you, as a Keras user, is that you should be using TensorFlow 2.0 and tf.keras for future projects.

Start using tf.keras in all future projects

Figure 2: What’s the difference between Keras and tf.keras in TensorFlow 2.0?

On September 17th, 2019, Keras v2.3.0 was officially released — in the release notes, Francois Chollet (the creator and chief maintainer of Keras) stated that:

Keras v2.3.0 is the first release of Keras that brings keras in sync with tf.keras

It will be the last major release to support backends other than TensorFlow (i.e., Theano, CNTK, etc.)

And most importantly, deep learning practitioners should start moving to TensorFlow 2.0 and the tf.keras package

For the majority of your projects, that’s as simple as changing your import lines from:
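    # representative imports using the standalone keras package
    # (an illustrative sketch; your project's exact imports will vary)
    from keras.models import Sequential
    from keras.layers import Dense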

To prefacing the import with tensorflow:
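    # the same imports, prefaced with tensorflow to use tf.keras instead
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense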

If you are using custom training loops or Sessions, then you’ll have to update your code to use the new GradientTape feature, but overall, it’s fairly easy to update your code.

To help you in (automatically) updating your code from keras to tf.keras, Google has released a script named tf_upgrade_v2, which, as the name suggests, analyzes your code and reports which lines need to be updated — the script can even perform the upgrade process for you.
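For example, upgrading a single file might look something like the following (the file names here are hypothetical):

    tf_upgrade_v2 --infile legacy_train.py --outfile upgraded_train.py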

You can refer here to learn more about automatically updating your code to TensorFlow 2.0.

Computational “backends” for Keras

Figure 3: What computational backends does Keras support? What does it mean to use Keras directly in TensorFlow via tf.keras?

As I mentioned earlier in this post, Keras relies on the concept of a computational backend.

The computational backend performs all the “heavy lifting” in terms of constructing a graph of the model, numeric computation, etc.

Keras then sits on top of this computational engine as an abstraction, making it easier for deep learning developers/practitioners to implement and train their models.

Originally, Keras supported Theano as its preferred computational backend — it later supported other backends, including CNTK and MXNet, to name a few.

However, the most popular backend, by far, was TensorFlow, which eventually became the default computational backend for Keras.

As more and more TensorFlow users started using Keras for its easy-to-use, high-level API, TensorFlow developers had to seriously consider subsuming the Keras project into a separate module in TensorFlow called tf.keras.

TensorFlow v1.10 was the first release of TensorFlow to include a branch of keras inside tf.keras.

Now that TensorFlow 2.0 is released, keras and tf.keras are in sync. They are still separate projects; however, developers should start using tf.keras moving forward, as the keras package will receive only bug fixes.

To quote Francois Chollet, the creator and maintainer of Keras:

This is also the last major release of multi-backend Keras. Going forward, we recommend that users consider switching their Keras code to tf.keras in TensorFlow 2.0.

It implements the same Keras 2.3.0 API (so switching should be as easy as changing the Keras import statements), but it has many advantages for TensorFlow users, such as support for eager execution, distribution, TPU training, and generally far better integration between low-level TensorFlow and high-level concepts like Layer and Model.

It is also better maintained.

If you’re both a Keras and TensorFlow user, you should consider switching your code over to TensorFlow 2.0 and tf.keras.

Sessions and Eager Execution in TensorFlow 2.0

Figure 4: Eager execution is a more Pythonic way of working with dynamic computational graphs. TensorFlow 2.0 supports eager execution (as does PyTorch). You can take advantage of eager execution and sessions with TensorFlow 2.0 and tf.keras. (image source)

TensorFlow 1.10+ users who utilize the Keras API within tf.keras will be familiar with creating a Session to train their model:
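    # TensorFlow 1.x-style graph construction and Session-based execution
    # (an illustrative sketch, not the post's exact example)
    import numpy as np
    import tensorflow as tf

    # the entire graph must be defined ahead of time
    x = tf.placeholder(tf.float32, shape=(None, 784))
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    logits = tf.matmul(x, W) + b

    # nothing executes until we open a Session and call .run()
    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        batch = np.random.rand(32, 784).astype("float32")  # dummy data
        output = session.run(logits, feed_dict={x: batch})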

Creating the Session object and requiring the entire model graph to be built ahead of time was a bit of a pain, so TensorFlow 2.0 made Eager Execution the default, simplifying the code to:
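    # the same computation under TensorFlow 2.0's eager execution
    # (an illustrative sketch); operations are evaluated immediately
    import numpy as np
    import tensorflow as tf

    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    batch = np.random.rand(32, 784).astype("float32")  # dummy data

    logits = tf.matmul(batch, W) + b  # runs right away; no Session required
    print(logits)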

The benefit of Eager Execution is that the entire model graph does not have to be built.

Instead, operations are evaluated immediately, making it easier to get started building your models (as well as debugging them).

For more details on Eager Execution, including how to use it with TensorFlow 2.0, refer to this article.

And if you want a comparison on Eager Execution vs. Sessions and the impact it has on the speed of training a model, refer to this page.

Automatic differentiation and GradientTape with TensorFlow 2.0

Figure 5: How is TensorFlow 2.0 better at handling custom layers or loss functions? The answer lies in automatic differentiation and GradientTape. (image source)

If you’re a researcher who needed to implement custom layers or loss functions, you likely didn’t like TensorFlow 1.x (and rightfully so).

TensorFlow 1.x’s custom implementations were clunky to say the least — a lot was left to be desired.

With the release of TensorFlow 2.0 that is starting to change — it’s now far easier to implement your own custom losses.

One way it’s becoming easier is through automatic differentiation and the GradientTape implementation.

To utilize GradientTape all we need to do is implement our model architecture:
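    # a minimal stand-in architecture for the GradientTape example
    # (hypothetical; the post's exact model may differ)
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10),
    ])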

Define our loss function and optimizer:
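    # a representative loss/optimizer pairing (your choices may differ)
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)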

Create the function responsible for performing a single batch update:
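    @tf.function
    def train_step(images, labels):
        # record the forward pass on the tape so gradients can be computed
        with tf.GradientTape() as tape:
            logits = model(images, training=True)
            loss = loss_fn(labels, logits)
        # differentiate the loss w.r.t. the weights and apply the update
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss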

And then train the model:
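    # 'dataset' is assumed to be a tf.data.Dataset yielding (images, labels)
    # batches, and 'num_epochs' is a hypothetical hyperparameter
    for epoch in range(num_epochs):
        for images, labels in dataset:
            loss = train_step(images, labels)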

The GradientTape magic handles differentiation for us behind the scenes, making it far easier to work with custom losses and layers.

And speaking of custom layer and model implementations, be sure to refer to the next section.

Model and layer subclassing in TensorFlow 2.0

TensorFlow 2.0 and tf.keras provide us with three separate methods to implement our own custom models:

  1. Sequential
  2. Functional
  3. Subclassing

Both the sequential and functional paradigms have been inside Keras for quite a while, but the subclassing feature is still unknown to many deep learning practitioners.

I’ll be doing a dedicated tutorial on the three methods next week, but for the time being, let’s take a look at how to implement a simple CNN based on the seminal LeNet architecture using (1) TensorFlow 2.0, (2) tf.keras, and (3) the model subclassing feature:
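    # LeNet-style CNN via model subclassing (an illustrative sketch; the
    # layer hyperparameters are representative, not the post's exact values)
    import tensorflow as tf
    from tensorflow.keras import Model
    from tensorflow.keras.layers import Conv2D, Dense, Flatten, MaxPooling2D

    class LeNet(Model):
        def __init__(self, num_classes=10):
            super(LeNet, self).__init__()
            # each individual layer is defined in the constructor
            self.conv1 = Conv2D(20, (5, 5), padding="same", activation="relu")
            self.pool1 = MaxPooling2D(pool_size=(2, 2), strides=(2, 2))
            self.conv2 = Conv2D(50, (5, 5), padding="same", activation="relu")
            self.pool2 = MaxPooling2D(pool_size=(2, 2), strides=(2, 2))
            self.flatten = Flatten()
            self.fc1 = Dense(500, activation="relu")
            self.fc2 = Dense(num_classes, activation="softmax")

        def call(self, x):
            # the forward pass, which you are free to customize
            x = self.pool1(self.conv1(x))
            x = self.pool2(self.conv2(x))
            x = self.flatten(x)
            return self.fc2(self.fc1(x))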

Notice how the LeNet class is a subclass of Model.

The constructor (i.e., __init__) of LeNet defines each of the individual layers inside the model.

The call method then performs the forward pass, enabling you to customize it as you see fit.

The benefit of using model subclassing is that your model:

  • Becomes fully customizable.
  • Enables you to implement and utilize your own custom loss implementations.

And since your architecture inherits the Model class, you can still call methods like .fit(), .compile(), and .evaluate(), thereby maintaining the easy-to-use (and familiar) Keras API.
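For instance, training the subclassed LeNet above looks just like training any other Keras model (a sketch; the data variables are hypothetical placeholders):

    # compile and train the subclassed model with the familiar Keras API
    model = LeNet(num_classes=10)
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(trainX, trainY, validation_data=(testX, testY), epochs=10)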

If you’re interested in learning more about LeNet, you can refer to this previous article.

TensorFlow 2.0 introduces better multi-GPU and distributed training support

Figure 6: Is TensorFlow 2.0 better with multiple GPU training? Yes, with the single-worker MirroredStrategy. (image source)

TensorFlow 2.0 and tf.keras provide better multi-GPU and distributed training through their MirroredStrategy.

To quote the TensorFlow 2.0 documentation, “The MirroredStrategy supports synchronous distributed training on multiple GPUs on one machine”.

If you want to use multiple machines (each having potentially multiple GPUs), you should take a look at the MultiWorkerMirroredStrategy.

Or, if you are using Google’s cloud for training, check out the TPUStrategy.

For now though, let’s assume you are on a single machine that has multiple GPUs and you want to ensure all of your GPUs are used for training.

You can accomplish this by first creating your MirroredStrategy:
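    # a single line creates the single-machine, multi-GPU strategy
    import tensorflow as tf

    strategy = tf.distribute.MirroredStrategy()
    print("Number of devices: {}".format(strategy.num_replicas_in_sync))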

You then need to declare your model architecture and compile it within the scope of the strategy:
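    # building and compiling inside the scope mirrors the model's variables
    # across all detected GPUs (the architecture here is a hypothetical sketch)
    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.layers.Dense(64, activation="relu", input_shape=(784,)),
            tf.keras.layers.Dense(10, activation="softmax"),
        ])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])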

And from there you can call .fit to train the model:
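    # .fit is unchanged; 'trainX' and 'trainY' are hypothetical placeholders
    # for your training data
    model.fit(trainX, trainY, batch_size=64, epochs=10)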

Provided your machine has multiple GPUs, TensorFlow will take care of the multi-GPU training for you.

TensorFlow 2.0 is an ecosystem, including TF 2.0, TF Lite, TFX, quantization, and deployment

Figure 7: What is new in the TensorFlow 2.0 ecosystem? Should I use Keras separately or should I use tf.keras?

TensorFlow 2.0 is more than a computational engine and a deep learning library for training neural networks — it’s so much more.

With TensorFlow Lite (TF Lite) we can train, optimize, and quantize models that are designed to run on resource-constrained devices such as smartphones and other embedded devices (i.e., Raspberry Pi, Google Coral, etc.).
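As a quick illustration, converting a trained tf.keras model to the TF Lite format can be as short as the following sketch (assuming model is a trained tf.keras model):

    # convert a trained tf.keras model to the TF Lite flat buffer format
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()

    # write the converted model to disk for on-device deployment
    with open("model.tflite", "wb") as f:
        f.write(tflite_model)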

Or, if you need to deploy your model to production, you can use TensorFlow Extended (TFX), an end-to-end platform for model deployment.

Once your research and experiments are complete, you can leverage TFX to prepare the model for production and scale your model using Google’s ecosystem.

With TensorFlow 2.0 we are truly starting to see a better, more efficient bridge between research, experimentation, model preparation/quantization, and deployment to production.

I’m truly excited about the release of TensorFlow 2.0 and the impact it will have on the deep learning community.

Credits

All code examples from this post came from TensorFlow 2.0’s official examples. Be sure to refer to the complete code examples provided by Francois Chollet for more details.

Additionally, definitely check out Sayak Paul’s Ten Important Updates from TensorFlow 2.0 article which helped inspire today’s blog post.

Summary

In this tutorial, you learned about Keras, tf.keras, and TensorFlow 2.0.

The first important takeaway is that deep learning practitioners using the keras package should start using tf.keras inside TensorFlow 2.0.

Not only will you enjoy the added speed and optimization of TensorFlow 2.0, but you’ll also receive new feature updates — the latest release of the keras package (v2.3.0) will be the last release to support multiple backends and feature updates. Moving forward, the keras package will receive only bug fixes.

You should seriously consider moving to tf.keras and TensorFlow 2.0 in your future projects.

The second takeaway is that TensorFlow 2.0 is more than a GPU-accelerated deep learning library.

Not only do you have the ability to train your own models using TensorFlow 2.0 and tf.keras, but you can now:

  • Take those models and prepare them for mobile/embedded deployment using TensorFlow Lite (TF Lite).
  • Deploy the models to production using TensorFlow Extended (TFX).

From my perspective, I’ve already started porting my original keras code to tf.keras. I would suggest you start doing the same.

I hope you enjoyed today’s tutorial — I’ll be back with new TensorFlow 2.0 and tf.keras tutorials soon.

To be notified when future tutorials are published here on PyImageSearch (and receive my free 17-page Resource Guide PDF on Computer Vision, Deep Learning, and OpenCV), just enter your email address in the form below!


33 Responses to Keras vs. tf.keras: What’s the difference in TensorFlow 2.0?

  1. Urvish October 21, 2019 at 11:32 am #

    Great post. Waiting for more posts like this on TF 2.0, from using tf.data to deploying models with Flask or Django in Python. Please create some more posts on how to use tf.keras for creating custom models, losses, etc. Appreciate your work Adrian. Thanks a lot.

    • Adrian Rosebrock October 21, 2019 at 12:27 pm #

      Thanks Urvish. I have plans to write a few guides on using “tf.data” to improve training time. Next week you’ll see an example of implementing and training a custom model via model subclassing as well.

  2. Gregory Pierce October 21, 2019 at 1:24 pm #

    Thanks for posting. I had already started porting all of my keras code to tf.keras when it became available. One thing I’m starting to see between TF2 and PyTorch is some level of convergence in what those low-level APIs look like.

    • Adrian Rosebrock October 21, 2019 at 1:30 pm #

      Absolutely. TensorFlow 2.0’s gradient tape feature is fantastic and certainly helps with that.

  3. David Bonn October 21, 2019 at 3:02 pm #

    Hi Adrian,

    This post is very helpful and clarifies a lot of things about why I would want to upgrade to TF 2.0.

    When the smoke clears a bit I will probably do exactly that. I am excited about the improvements for multi-gpu (and possibly multi-system) training, which was always kind of a weak spot in Keras.

    Thanks for the post.

    • Adrian Rosebrock October 21, 2019 at 6:00 pm #

      Thanks David. Expect some more TensorFlow 2.0 posts in the future as well! 🙂

  4. gab October 21, 2019 at 4:27 pm #

    Great Article!

    • Adrian Rosebrock October 21, 2019 at 5:59 pm #

      Thanks Gab!

  5. Kantha Girish October 21, 2019 at 6:27 pm #

    Hi Adrian,

    Thank you for a detailed write-up on TensorFlow 2.0. I have been using PyTorch for quite some time. Subclassing has been in PyTorch (probably since its inception). I like how the implementations look quite similar. It helps someone like me switch back and forth between TensorFlow and PyTorch easily.

    Thanks,
    Kantha Girish

    • Adrian Rosebrock October 25, 2019 at 10:05 am #

      Yep, it will get much easier to switch back and forth if you choose to use PyTorch.

  6. John Viguerie October 21, 2019 at 7:41 pm #

    Timely post. Thanks.

    • Adrian Rosebrock October 25, 2019 at 10:05 am #

      You are welcome, John!

  7. Pawel Rolbiecki October 22, 2019 at 4:49 am #

    Hello Adrian.
    I have a very short question. Does TensorFlow Lite in version 2.0 work with the Raspberry Pi and Intel’s Neural Compute Stick 2? I tried searching for it but found no results.
    I suppose that if the answer is “yes” then it would be obvious to use it, because of the speed benefits.
    Thanks.
    Pawel

  8. Borys Kabakov October 22, 2019 at 5:20 am #

    At last someone wrote a good digest on this topic, thank you very much!

    • Adrian Rosebrock October 25, 2019 at 10:05 am #

      Thanks Borys!

  9. Reyhan October 22, 2019 at 10:19 am #

    I understand basic machine learning / neural networks (without using a library). Where do I start if I want to learn to use TensorFlow 2.0?

  10. Julian F Winter October 22, 2019 at 2:16 pm #

    The problem is that upgrading TensorFlow is a painful process, with cuDNN updates and Anaconda lagging behind TensorFlow dependencies. TensorFlow under an Anaconda environment is too fragile.

    • Adrian Rosebrock October 25, 2019 at 10:04 am #

      That’s also part of the reason why I don’t use Anaconda.

    • JaganY October 28, 2019 at 1:04 am #

      conda install -c anaconda tensorflow-gpu

      I could install TF 2.0 with one command after creating an environment with Python 3.7.4 on Ubuntu 18.04 and a GTX 1060. Zero errors.

  11. Oscar Rangel October 22, 2019 at 6:00 pm #

    Great Article!! Thanks. So I am gathering that in the near future there is no need to install Keras if we use tf.keras in all our code. Is this correct?

    Thanks.

    • Adrian Rosebrock October 25, 2019 at 10:04 am #

      That is correct.

  12. Xu Zhang October 22, 2019 at 6:18 pm #

    Thank you so much for your great post.
    I will use tf.keras instead of keras in my future projects.
    However, how about trained keras models? Is it possible to transfer those trained keras models into tf.keras models? Many thanks.

    • Adrian Rosebrock October 25, 2019 at 10:04 am #

      Most models should be transferrable between the two; however, once you diverge between versions you can run into implementation differences and bug fixes. The models very well may run but could lead to inconsistent results. I try to match versions whenever possible.

      • Xu Zhang October 25, 2019 at 2:15 pm #

        Thank you so much for your reply.

        Do you mean that I need to account for implementation differences and retrain the models? That is not feasible, because it took so long to train, evaluate, and finalize the models. Is there any way to transfer the trained models into the new tf.keras for deployment? Thanks again.

        • Adrian Rosebrock October 30, 2019 at 9:26 am #

          Try loading the models via Keras/tf.keras’ load_model function and then evaluate the model on your existing test set. If the results are near identical then you can safely move on to deployment. If they are substantially different you’ll want to re-train the model using tf.keras.

  13. Mohanad October 24, 2019 at 1:11 am #

    Thank you very much.
    I am interested in the Raspberry Pi, Google Coral, and mobile applications using TensorFlow Lite.
    I really appreciate your efforts in this post and look forward to seeing more posts in this field.

    • Adrian Rosebrock October 25, 2019 at 10:03 am #

      Thanks Mohanad. If you’re interested in The RPi, Google Coral, and embedded devices for deep learning then you should take a look at Raspberry Pi for Computer Vision where I cover them in detail.

      • Mohanad October 26, 2019 at 2:31 pm #

        Thank you so much Adrian. I am wondering if there is a way to buy the complete bundle and use it for educational purposes at a specific college, so that more students can benefit from it?

        • Adrian Rosebrock October 30, 2019 at 9:24 am #

          Hey Mohanad — send me an email and we can discuss there.

  14. Syed Talha Bukhari October 26, 2019 at 10:32 am #

    Hi Adrian!

    I had questions about Eager Execution after reading this article:

    1.) You stated:
    “… so TensorFlow 2.0 introduced the concept of Eager Execution …”
    But Eager Execution was available even before TensorFlow 2.0 was released; TensorFlow 2.0 simply has it enabled by default.

    2.) ‘model.fit’ is older than Eager Execution, and in no way demonstrates the concept. Eager Execution basically means TensorFlow immediately computes the output of a computation and returns it to Python. ‘model.fit’ can (and usually is) run without Eager Execution.

    Thank you for time!

  15. Hassan October 29, 2019 at 4:03 pm #

    Excellent Article about Keras benchmarking

    • Adrian Rosebrock October 30, 2019 at 9:24 am #

      Thanks Hassan!
