I’m writing a book on Deep Learning and Convolutional Neural Networks (and I need your advice).


You may have heard me mention it in a passing comment on the PyImageSearch blog…

Maybe I even hinted at it in a 1-on-1 email…

Or perhaps you simply saw the writing on the wall due to the recent uptick in Deep Learning/Neural Network tutorials here on the blog…

But I’m here today to tell you that the rumors are true:

I am writing a new book on Deep Learning with a focus on:

  • Neural Networks and Machine Learning.
  • Convolutional Neural Networks (CNNs).
  • Object detection/localization with Deep Learning.
  • Training large-scale (ImageNet-level) networks.
  • Hands-on implementation using the Python programming language and the Keras + mxnet libraries.

As you can see, this book will focus mainly on Deep Learning in the context of image classification and understanding. While there are many facets of Deep Learning, I feel most qualified (and most capable) to write a Deep Learning book specific to Convolutional Neural Networks and image classification (and, by extension, computer vision in general).

But before I can publish the book, I need your help first…

To start, I haven’t finalized the name of the book yet. I’ll be sending an email with a survey to nail down the book title next week, so if you’re interested in helping name the book, stay tuned for this email.

After the book is named, I’ll be launching a Kickstarter campaign in mid-January 2017 to finish funding the Deep Learning book.

Once the funding is complete, I fully intend to have the book completed and published by Summer/Autumn 2017.

To learn more about the upcoming Deep Learning + Convolutional Neural Network + Image Classification book, and more importantly, lend your opinion to help shape the future of the book, keep reading.

I need your advice on my upcoming Deep Learning book

Every year I take a 3-day “retreat” of sorts.

I venture out to a National Park in the western part of the United States (normally somewhere in Utah, Arizona, or Nevada).

I sit by myself along the peaks of Angels Landing, The Subway, or even Observation Point, totally disconnected:

Figure 1: Sitting on top of Observation Point, contemplating life, existence...and what to put in the upcoming Deep Learning book.

Besides the clothes on my back, my hiking gear, and a supply of food and water, I’m essentially “naked” — I don’t bring my laptop, iPad, or anything electronic (besides a cellphone, turned off and safely waterproofed and buried in my bag for emergency purposes).

I sit there overlooking the majestic views with only two “natural” resources: my pen and notebook.

About 6 months ago I took one of these 3-day retreats and spent hours enraptured in thought, pondering what topics I should cover in this upcoming Deep Learning + Image Classification book.

These days were some of the most fruitful, productive periods of my life.

After only 3 days of hiking I had over 60 pages of notes in my notebook.

As soon as I got home from the trip I immediately broke ground on the book.

I started writing code.

Gathering results.

And I’m now pleased to say that I’ve coded over 70% of the projects and gathered over 60% of the results required to put this Deep Learning + Convolutional Neural Network book together…

…but I need your help to get the rest of the way there.

Any successful writer, entrepreneur, or business owner (especially in a highly technical field) will tell you the importance of sharing what you’re doing/building with your audience so they can provide you with feedback, input, and insights.

The last thing you would want to do is write a book (or build a product) that no one wants.

I’m no different.

In order to make this Deep Learning book a success, I need your help.

In the following section I’ve included a rough sketch of what I plan to cover in this Deep Learning book.

This sketch is by no means finalized, but it does reflect what I think is important to review in a book focusing on Deep Learning, Convolutional Neural Networks, and Image Classification.

Take a look at this list of topics, then be sure to send me an email, shoot me a message, or reply in the comments section at the bottom of this post with your feedback/suggestions.

What is going to be covered in this Deep Learning + Convolutional Neural Network book?

My general plan for this upcoming Deep Learning book is to focus primarily on Convolutional Neural Networks (CNNs) and image classification/understanding using the Python programming language.

This book will be very detailed, including academic citations and references to current state-of-the-art work, while at the same time being super practical and hands-on (with lots of code) in the same style as the rest of the PyImageSearch blog. All code examples will be done in Python using either Keras or the mxnet library.

While there are many facets of Deep Learning, my primary area of expertise lies within computer vision and image classification — thus I think it’s important that I write what I know about.

When it comes to structuring the Deep Learning book, I’ll be breaking the book into “tiers”, just like I do for Practical Python and OpenCV.

By breaking the book into tiers I’ll be able to allow you (the reader) to select the tier that best fits your needs and budget.

This means that if you’re just getting started with Deep Learning, you’ll be able to afford one of the cheaper bundles to help you get up to speed.

And if you already have experience in the Deep Learning world (or if you simply want the complete package) and want to learn more advanced, scalable techniques, then you’ll be able to go for one of the higher tier bundles as well.

I haven’t fully fleshed out where the “line” will be drawn separating each of the tiers/bundles (although I have a pretty good idea), but below you can find a list of the topics I plan on covering in the book.

If you have any suggestions for additional topics to be covered, please send me an email, leave a comment on this blog post, or shoot me a message.

Deep Learning + Convolutional Neural Network book topics

As promised, here is a rough outline of the topics I plan to cover inside this Deep Learning + Convolutional Neural Network book.

If you have a suggestion for a topic to cover, just leave a comment on this post or shoot me a message and I’ll see if we can make it happen!

Machine Learning Basics

  • Learn how to setup and configure your development environment to study Deep Learning.
  • Understand image basics, including coordinate systems; width, height, and depth; and aspect ratios.
  • Review popular image datasets used to benchmark Machine Learning, Deep Learning, and Convolutional Neural Network algorithms.
  • Form a solid understanding of machine learning basics, including:
    • The simple k-NN classifier.
    • Parameterized learning (i.e., “learning from data”):
      • Data and feature vectors.
      • Understanding scoring functions.
      • How loss functions work.
      • Defining weight matrices and bias vectors (and how they facilitate learning).
    • Basic optimization methods (i.e., how “learning” is actually done) via:
      • Gradient descent.
      • Stochastic Gradient Descent (SGD).
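
The parameterized learning ideas above (scoring function, loss, weights, bias, gradient descent) can be sketched in a few lines of NumPy. This is a minimal illustration of mine on a toy least-squares problem — all names and values here are made up for the demo, not an excerpt from the book:

```python
import numpy as np

# Toy "parameterized learning" setup: a linear scoring function
# f(x) = Wx + b trained with vanilla gradient descent on a
# mean-squared-error loss.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))            # 100 samples, 3 features
true_W = np.array([2.0, -1.0, 0.5])      # ground-truth weights (made up)
y = X @ true_W + 0.25                    # noise-free targets

W = np.zeros(3)                          # weight vector to learn
b = 0.0                                  # bias term to learn
lr = 0.1                                 # learning rate

for _ in range(500):
    error = (X @ W + b) - y              # scoring function minus targets
    grad_W = 2 * X.T @ error / len(y)    # gradient of MSE w.r.t. W
    grad_b = 2 * error.mean()            # gradient of MSE w.r.t. b
    W -= lr * grad_W                     # gradient descent updates
    b -= lr * grad_b

print(np.round(W, 3), round(b, 3))       # recovers roughly [2, -1, 0.5] and 0.25
```

Swapping the full-batch gradient for a gradient computed on a small random subset of `X` at each step is all it takes to turn this into Stochastic Gradient Descent.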

Neural Network Basics

  • Discover feedforward neural network architectures.
  • Implement the classic Perceptron algorithm by hand.
    • Use the Perceptron algorithm to learn actual functions (and understand the limitations of the Perceptron algorithm).
  • Take an in-depth dive into the Backpropagation algorithm.
    • Implement Backpropagation by hand using Python + NumPy.
    • Utilize a worksheet to help you practice this critical algorithm.
  • Grasp multi-layer networks (and train them from scratch).
  • Implement neural networks both by hand and with the Keras library.
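
As a taste of what “implementing by hand” looks like, here is a minimal Perceptron sketch in plain NumPy (an illustrative example of mine, not book material), trained on the logical AND function. The same procedure famously fails on XOR, which is not linearly separable — exactly the limitation mentioned above:

```python
import numpy as np

# Classic Perceptron trained on the logical AND function.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])               # AND truth table

w = np.zeros(2)                          # weights
b = 0.0                                  # bias
lr = 0.1                                 # learning rate

def predict(x):
    return int(np.dot(w, x) + b > 0)     # step activation function

for _ in range(20):                      # epochs
    for xi, target in zip(X, y):
        update = lr * (target - predict(xi))
        w += update * xi                 # Perceptron update rule
        b += update

print([predict(xi) for xi in X])         # -> [0, 0, 0, 1]
```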

Introduction to Convolutional Neural Networks

  • Understand convolutions (and why they are so much easier to grasp than they seem).
  • Study Convolutional Neural Networks (what they are used for, why we use them, etc.).
  • Review the building blocks of Convolutional Neural Networks, including:
    • Convolutional layers
    • Activation layers
    • Pooling layers
    • Batch Normalization
    • Dropout
    • …etc.
  • Discover common network architecture patterns you can use to design architectures of your own with minimal frustration and headaches.
  • Utilize out-of-the-box CNNs for classification that are pre-trained and ready to be applied to your own images/image datasets (VGG16, VGG19, ResNet50, etc.).
  • Train your first Convolutional Neural Network from scratch.
  • Save and load your own network models from disk.
  • Checkpoint your models to spot high-performing epochs and restart training.
  • Learn how to spot underfitting and overfitting, allowing you to correct for them and improve your classification accuracy.
  • Utilize decay and learning rate schedulers.
  • Train the classic LeNet architecture from scratch to recognize handwritten digits.
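
To demystify convolutions a bit, here is a bare-bones sketch (illustrative only) of a 2D convolution as nothing more than a sliding dot product between a small kernel and each image neighborhood. Note that, like most CNN libraries, this doesn’t flip the kernel, so it is technically cross-correlation:

```python
import numpy as np

# "Convolve" (really cross-correlate, as CNN libraries do) a kernel
# with every neighborhood of an image: no padding, stride of 1.
def convolve2d(image, kernel):
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1      # output shrinks by kernel size - 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            region = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(region * kernel)   # sliding dot product
    return out

image = np.arange(25, dtype=float).reshape(5, 5)  # tiny 5x5 "image"
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)     # horizontal-gradient kernel

result = convolve2d(image, sobel_x)
print(result.shape)   # (3, 3)
```

A convolutional layer is just a stack of such kernels whose values are learned rather than hand-picked like the Sobel kernel above.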

Work With Your Own Datasets

  • Learn how to gather your own training images.
  • Discover how to annotate and label your dataset.
  • Train a Convolutional Neural Network on top of your dataset.
  • Evaluate the accuracy of your model.
  • …all of this explained by demonstrating how to gather, annotate, and train a CNN to break image captchas.

Advanced Convolutional Neural Networks

  • Discover how to use transfer learning to:
    • Treat pre-trained networks as feature extractors to obtain high classification accuracy with little effort.
    • Utilize fine-tuning to boost the accuracy of pre-trained networks.
    • Apply data augmentation to increase network classification accuracy without gathering more training data.
  • Understand rank-1 and rank-5 accuracy (and how we use them to measure the classification power of a given network).
  • Explore how network ensembles can be used to increase classification accuracy simply by training multiple networks.
  • Discover my optimal pathway for applying Deep Learning techniques to maximize classification accuracy (and which order to apply these techniques in to achieve greatest effectiveness).
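
For the curious, rank-1 and rank-5 accuracy are straightforward to compute. The sketch below (my own toy example, with made-up scores) counts a prediction as a rank-k hit whenever the true label lands anywhere in the top k scored classes:

```python
import numpy as np

# A prediction counts as a rank-k "hit" if the true label appears
# anywhere in the top k scored classes.
def rank_accuracy(scores, labels, k):
    # scores: (num_samples, num_classes), labels: (num_samples,)
    top_k = np.argsort(scores, axis=1)[:, ::-1][:, :k]  # top-k class indices
    hits = np.any(top_k == labels[:, None], axis=1)
    return float(hits.mean())

rng = np.random.default_rng(0)
scores = rng.random((4, 10))                # fake scores: 4 samples, 10 classes
labels = scores.argsort(axis=1)[:, -2]      # force each true label to rank 2nd

print(rank_accuracy(scores, labels, k=1))   # -> 0.0 (never the top prediction)
print(rank_accuracy(scores, labels, k=5))   # -> 1.0 (always within the top 5)
```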

Scaling to Large Image Datasets

  • Learn how to convert an image dataset from raw images on disk to HDF5 format, making networks easier (and faster) to train.
  • Compete in Stanford’s cs231n Tiny ImageNet classification challenge and take home the #1 position.
  • Implement the VGGNet architecture (and variants of it).
  • Hand-code the AlexNet architecture.
  • Utilize image cropping for an easy way to boost accuracy on your test set.
  • Explore more advanced optimization algorithms, including:
    • RMSprop
    • Adagrad
    • Adadelta
    • Adam
    • Adamax
    • Nadam
    • …and how to fine-tune SGD parameters.

Object Detection and Localization using Deep Learning

  • Utilize naive image pyramids and sliding windows for object detection.
  • Train your own YOLO detector for recognizing objects in images/video streams in real-time.
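
As a rough illustration of the “naive” approach, an image pyramid and a sliding window can each be written as a short generator. This is a sketch with arbitrary sizes and steps of my choosing, not code from the book — in practice you would run a classifier on every yielded window:

```python
import numpy as np

# Yield progressively smaller versions of the image until it is too small.
def pyramid(image, scale=2, min_size=32):
    while min(image.shape[:2]) >= min_size:
        yield image
        image = image[::scale, ::scale]   # crude downsample by subsampling

# Slide a fixed-size window across the image, left to right, top to bottom.
def sliding_window(image, step, window):
    wh, ww = window
    for y in range(0, image.shape[0] - wh + 1, step):
        for x in range(0, image.shape[1] - ww + 1, step):
            yield x, y, image[y:y + wh, x:x + ww]

image = np.zeros((128, 128))              # stand-in for a grayscale image
windows = [
    (x, y)
    for layer in pyramid(image)
    for x, y, win in sliding_window(layer, step=32, window=(32, 32))
]
print(len(windows))   # 16 + 4 + 1 = 21 windows across three pyramid levels
```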

ImageNet: Large Scale Visual Recognition Challenge

  • Discover what the massive ImageNet (1,000 category) dataset is and why it’s considered the de facto image classification challenge to benchmark algorithms.
  • Obtain and convert the ImageNet dataset into a format suitable for training.
  • Learn how to utilize multiple GPUs to train your networks in parallel, greatly reducing training time.
  • Find out how to train the AlexNet and VGGNet architectures on ImageNet.
  • Apply the SqueezeNet architecture to ImageNet to obtain a (high accuracy) model, fully deployable to smaller devices such as the Raspberry Pi.

ImageNet: Tips and Tricks

  • Save weeks (and even months) of training time by discovering learning rate schedulers that actually work.
  • Spot overfitting on ImageNet and catch it before you waste hours (or even days) watching your validation accuracy stall.
  • Learn how to restart training from saved epochs, lower learning rates, and boost accuracy.
  • Unlock the same techniques deep learning pros use to tune the hyperparameters of their networks.
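
As one concrete (hypothetical) example of a scheduler, step decay simply drops the learning rate by a fixed factor every few epochs. The initial rate, factor, and drop interval below are placeholder values, not recommendations from the book:

```python
# Drop the learning rate by a fixed factor every `drop_every` epochs.
def step_decay(epoch, init_lr=0.01, factor=0.5, drop_every=10):
    return init_lr * (factor ** (epoch // drop_every))

# Learning rate at a few points during training:
print([step_decay(e) for e in (0, 9, 10, 25)])   # -> [0.01, 0.01, 0.005, 0.0025]
```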

Case Studies

  • Train a network to predict the gender and age of people in images using Deep Learning techniques.
  • Automatically classify the make and model of a car using Convolutional Neural Networks.
  • Determine (and correct) image orientation using pre-trained CNNs.

So, what do you think?

As you can see, this is shaping up to be quite the Deep Learning + Convolutional Neural Network + Image Classification book.

If you have any feedback or suggestions on topics that should (or even should not) be covered, please feel free to contact me or leave a comment at the bottom of this blog post.

Why a Kickstarter campaign?

I’ll have more details on the upcoming Kickstarter campaign for the new Deep Learning book in the next couple of weeks, but since I know I’ll be getting asked that question now, I wanted to say a few words about it.

First, I love Kickstarter campaigns.

They are a fantastic way to spread the word about a project beyond your own audience (this is especially true if the project is successfully funded).

Secondly, I already have experience running a Kickstarter campaign.

Perhaps not many readers know this, but the PyImageSearch Gurus course was originally funded by a Kickstarter campaign back in 2015.

Without that Kickstarter campaign I would not have had the funds necessary to dedicate my time to writing, licensing course software, and putting the entire course experience together.

In the context of this deep learning book, I’ll be running a Kickstarter campaign to raise funding for (additional) GPU instances in the Amazon EC2 ecosystem.

I already have my NVIDIA DIGITS DevBox which has been running around the clock performing experiments and gathering results for nearly 6 months — but at the end of the day, it’s still only one machine.

The more GPU instances I have running, the faster I can gather results, which in turn allows me to publish the Deep Learning book sooner.

These GPU instances are also not cheap.

The p2.8xlarge (8 GPUs) and p2.16xlarge (16 GPUs) instances on Amazon EC2 clock in at $7.20 and $14.40 per hour, respectively.

I want to stress that, as a reader of this book, you will not have to utilize high-end GPUs or be expected to have access to them.

The basic chapters of this book will easily be able to run on your CPU while the more advanced chapters can utilize more commodity-based GPUs.

Instead, the point of my using the higher-tier GPU instances in the EC2 ecosystem is that I’ll be able to publish the Deep Learning book faster (and, more importantly, get it into your hands quicker).

Interested in learning more?

To stay in the loop regarding this upcoming Deep Learning + Convolutional Neural Network book, please click the following button and enter your email address:

Along with book updates, I’ll also be sending a short survey to help name the book within the next week, so be sure to keep an eye on your inbox — I really need your input!

Keep in mind: I am writing this book for you.

I want to wrap up this post by saying that I am writing this Deep Learning book for YOU.

If you see any topics that you would like to be included in the book, please email me, send me a message, or post in the comments section below.

I cannot guarantee that I’ll be able to accommodate all (or even most) of the requests/suggestions, but I will do my absolute best to consider all opinions and suggestions to help make this the BEST deep learning book available today.

Keep an eye on your inbox for the book title survey, otherwise I will be back in early January with the finalized details on the Kickstarter campaign and book topics list.


32 Responses to I’m writing a book on Deep Learning and Convolutional Neural Networks (and I need your advice).

  1. Salvatore December 12, 2016 at 12:52 pm #

    It would be very nice to see more information about using Keras with multiple GPUs in the hands-on section…

    We can’t wait long after this exciting announcement!

    • Adrian Rosebrock December 14, 2016 at 8:49 am #

      I’ll absolutely be covering multiple GPUs 🙂

  2. Alexon December 12, 2016 at 2:03 pm #

    Hugely excited, I will definitely be getting a copy.
    Any chance we can get our hands on a physical copy? Something to display proudly on the bookcase.

    One thing that I would love to see, where you mention getting your own datasets, is making the most of what you can get (image flipping, background swapping, and whatever other tricks we can add to the arsenal).
    Where you talk about a coordinate system you mention depth; does that mean we can play with some 3D point clouds? Kinect or similar would be totally awesome.

    • Adrian Rosebrock December 14, 2016 at 8:47 am #

      I will be breaking the book into tiers/bundles similar to Practical Python and OpenCV. One of the tiers will absolutely include a physical copy 🙂

      I’ll also be covering data augmentation in detail as well.

      Regarding coordinate systems, I was simply referring to the basics of image dimensions (width, height, number of channels). I don’t have any plans to cover 3D point clouds/Kinect.

  3. dbv December 12, 2016 at 3:52 pm #

    Good to know you’re working on this book. Some initial thoughts:

    * Focus on one framework (eg. keras) otherwise it becomes confusing very quickly.
    * Look into TensorFlow Slim as it could usurp keras in the near future.
    * Better application case studies outside of the conventional examples. Maybe a survey to readers to find out what applications people are building and use that to focus on industrial uses.
    * Andrew Ng did a good session at NIPS 2016 on the “Nuts and Bolts of Deep-Learning …” and skewing towards the practical will be of value.

    • Adrian Rosebrock December 14, 2016 at 8:43 am #

      Thanks for the suggestion on TensorFlow Slim. I’d be really surprised if any library knocks Keras off as the top library for modular deep learning development. Theano and TensorFlow can do everything Keras does, but Keras makes life so much easier since it provides a nice little API wrapper.

      I’ll also be sure to review Andrew Ng’s talk in more detail.

  4. Bapireddy.K December 12, 2016 at 8:49 pm #

    Hi Adrian, I would point out that there is a lot of cutting-edge research being done by companies like Google, Facebook, and other big tech companies in the deep learning field. Also, most of these companies are open sourcing their work. It would be really helpful to users like us if there were a section where we could learn about the latest research in these fields, and how to apply those models.

    • Adrian Rosebrock December 14, 2016 at 8:40 am #

      By definition, cutting-edge and state-of-the-art research changes on a daily basis. By the time I included information on (for example) the latest NIPS proceedings, new papers would be published on arXiv that already build on those methods. I could see doing supplementary material/a companion website for this, but again, by definition the state-of-the-art changes constantly, so that’s likely not the most appropriate topic for a book.

  5. Surya Prakash December 12, 2016 at 9:58 pm #

    Hello Adrian! I am a beginner to the subject, so I don’t think I have much to say on the topics themselves. Although I think it would be a good idea to compare and contrast the hardware available in today’s market (why can GPUs do the job better than CPUs? why not FPGAs?), because it would give beginners like me something sensible to work with instead of being blindly guided through the concepts. I wish you good luck on your upcoming book!

    • Adrian Rosebrock December 14, 2016 at 8:38 am #

      Great points on the hardware Surya, thank you for the suggestion.

  6. Sumana Bhandari December 12, 2016 at 11:01 pm #

    Hi Adrian,

    Congratulations for writing new book on Deep Learning which will be a rich source of information and will be of immense benefit to learn!! Your rough outline of the topics look excellent! I have a suggestion for an additional topic. Any chance of including Extreme Learning Machine in the advanced topics?

    • Adrian Rosebrock December 14, 2016 at 8:37 am #

      Thank you for the input Sumana, I appreciate it!

      ELMs, while interesting, I’m not sure have a context in this book related to deep learning, especially related to image classification. I’ll consider it, but must admit that I don’t think I’ll be covering it.

  7. AMITAVA KARAK December 13, 2016 at 2:43 am #

    Hello… Very warm greetings from Amitava Karak from India.

    It would be great if you could include some NVIDIA Jetson TX1 hands-on experiments in this book.

    • Adrian Rosebrock December 14, 2016 at 8:33 am #

      I’ll consider this, but I’d like to ensure the book isn’t specific to a set of hardware. I could see doing a separate Jetson TX1 mini-book/guide, but I’m not sure it will fit into a book that focuses on the broader concepts of deep learning and visual recognition.

  8. Stefanie Lueck December 13, 2016 at 2:50 am #

    Simply awesome!

  9. Robin December 13, 2016 at 4:13 am #

    Really? I was very glad when I saw the news on this blog. In my opinion, you could consider adding more popular ML/DL libraries, such as sklearn, TensorFlow, etc. Recently, the good news is that installing TensorFlow on Windows is easier than before.

    • Adrian Rosebrock December 14, 2016 at 8:32 am #

      The scikit-learn library will be used when appropriate. Keras can also run with either a TensorFlow or Theano backend.

  10. Abkul December 13, 2016 at 7:19 am #

    Hi Adrian,

    Cant wait to buy the book!!!!!

    The Deep learning book will be a major leap forward to bridge the “scanty” literature available today.
    The following are my suggestions:
    • Introductory/refresher mathematical topics relevant to Deep Learning/Neural Networks
    • Deep Belief Networks
    • Autoencoders
    • Representation learning
    • Techniques for learning from small datasets
    • Deep generative models
    • Deep feedforward networks
    • Major algorithms
    • Implementation approaches
    • Real-world case studies

    As always I believe something excellent will come out of your effort.

    • Adrian Rosebrock December 14, 2016 at 8:32 am #

      Thank you for sharing your input Abkul! Restricted Boltzmann Machines and Deep Belief Networks should certainly be covered.

  11. Jay December 13, 2016 at 1:02 pm #

    That’d be wonderful, Adrian, and how amazing you are working! No doubt it’d be another great source of knowledge for anyone interested in the area.

  12. Paulo December 13, 2016 at 3:49 pm #

    Hi Adrian,

    Great news could you please add R-CNN module on your book 🙂

    I will really appreciate that

    Regards ,
    Paulo

    • Adrian Rosebrock December 14, 2016 at 8:28 am #

      I will certainly consider covering R-CNNs as well.

  13. Purin December 13, 2016 at 10:21 pm #

    Hi Adrian,

    I am really interested in learning about deep learning and computer vision. I have a manufacturing background, and I have been using several commercial machine vision software packages for years. Halcon is one package that utilizes deep learning for OCR and pass/fail detection. For this reason, I would like to learn how to use deep learning and open source software like OpenCV in computer vision applications.

    • Adrian Rosebrock December 14, 2016 at 8:27 am #

      Thanks for sharing Purin, I appreciate your input!

  14. Toribio December 14, 2016 at 12:25 pm #

    Congratulations !!

  15. WILLIAM REIS FERNANDES December 14, 2016 at 1:06 pm #

    Hello Adrian,

    I’d love to read you talking about Tensorflow and Train Image to set standards.

    With your articles, you can already develop something.

    • Adrian Rosebrock December 18, 2016 at 9:12 am #

      I primarily used Keras for deep learning (along with a bit of mxnet). Keras can wrap around either TensorFlow or Theano, so the end result is essentially the same.

  16. Bonzadog December 14, 2016 at 2:52 pm #

    I will look forward to this book and to start learning about deep learning. I hope that it will not be too complicated.

    • Adrian Rosebrock December 18, 2016 at 9:11 am #

      The book will be designed for both:

      1. Beginners with little-to-no machine learning or neural network experience.
      2. Advanced readers who already use deep learning on a regular basis.

      Regardless of your experience level, this book will be appropriate for you.

  17. Ashwin Kumr December 28, 2016 at 2:59 am #

    Hi Adrian,

    Eagerly looking forward to the release of your new book. When can we expect this book to be released?

    Regards,
    Dr. Ashwin Kumr

    • Adrian Rosebrock December 31, 2016 at 1:31 pm #

      Hey Ashwin — I’ll be running a Kickstarter to fund the book in January. The actual book will be released in mid-2017.

  18. jasstion February 8, 2017 at 9:03 am #

    I hope this book can cover image clustering, not only image classification.
