Take a sneak peek at what's inside...
This book has one goal — to help developers, researchers, and students just like yourself become experts in deep learning for image recognition and classification.
Inside this book you'll find:
- Super practical walkthroughs that present solutions to actual, real-world image classification problems, challenges, and competitions.
- Hands-on tutorials (with lots of code) that not only show you the algorithms behind deep learning for computer vision but their implementations as well.
- A no-nonsense teaching style that is guaranteed to cut through all the cruft and help you master deep learning for image understanding and visual recognition.
Just getting started with deep learning? Or already a pro?
No problem, I have you covered either way.
Are you just getting started in deep learning? Don't worry; you won't get bogged down by tons of theory and complex equations. We'll start off with the basics of machine learning and neural networks. Learn in a fun, practical way with lots of code. You'll be a neural network ninja in no time, and be able to graduate to the more advanced content.
Are you already a seasoned deep learning pro? This book isn't just for beginners — there's advanced content in here, too. You'll discover how to train your own custom object detectors using deep learning. You'll build a custom framework that can be used to train very deep architectures on the challenging ImageNet dataset from scratch. I'll even show you my personal blueprint which I use to determine which deep learning techniques to apply when confronted with a new problem. Best of all, these solutions and tactics can be directly applied to your current job and research.
Regardless of your experience level, you'll find tremendous value inside Deep Learning for Computer Vision with Python, I guarantee it.
What is this book? And what is it going to cover?
Deep Learning for Computer Vision with Python will make you an expert in deep learning for computer vision and visual recognition tasks.
Inside the book we will focus on:
- Neural Networks and Machine Learning
- Convolutional Neural Networks (CNNs)
- Object detection/localization with deep learning
- Training large-scale (ImageNet-level) networks
- Hands-on implementations using the Python programming language and the Keras (which is compatible with either TensorFlow or Theano) + mxnet libraries
After going through Deep Learning for Computer Vision with Python, you'll be able to solve real-world problems with deep learning.
Utilize Python, Keras (with either a TensorFlow or Theano backend), and mxnet to build deep learning networks.
Python, Keras, and mxnet are all well-built tools that, when combined, create a powerful deep learning development environment that you can use to master deep learning for computer vision and visual recognition.
We'll be utilizing the Python programming language for all examples in this book. Python is an easy language to learn and is hands-down the best way to work with deep learning algorithms.
To build and train our deep learning networks we'll primarily be using the Keras library. Keras supports both TensorFlow and Theano, making it super easy to build and train networks quickly.
We'll also use mxnet, a deep learning library that specializes in distributed, multi-machine learning. The ability to parallelize training across GPUs/devices is critical when training deep neural network architectures on massive datasets (such as ImageNet).
Each library that we use in this book will be thoroughly reviewed to ensure you understand how to build & train your own deep learning networks.
You're probably wondering...
"Is this book right for me?"
This book is for developers, researchers, and students who have at least some programming experience and want to become proficient in deep learning for computer vision & visual recognition. It's a good fit if you:
- Are a computer vision developer that utilizes OpenCV (among other image processing libraries) and are eager to level-up your skills.
- Have experience with machine learning and want to break into neural networks/deep learning for image understanding.
- Are a college student and want more than your university offers (or want to get ahead of your class).
- Are a scientist looking to apply deep learning + computer vision algorithms to your research.
- Utilize computer vision algorithms in your own projects but have yet to try deep learning.
- Used deep learning in projects before, but never in the context of visual recognition and image understanding.
- Write Python/machine learning code at your day job and are motivated to stand out from your coworkers.
- Are a "machine learning hobbyist" who knows how to program and wants to understand what this "deep learning" thing is all about.
If any of these descriptions fit you, rest assured: you're the target student.
I wrote this book for you.
I'm constantly recommending your [PyImageSearch.com] site to people I know at Georgia Tech and Udacity. While I consider Udacity the gold standard, I would rate your material at the same level. Keep up the good work.
Adrian possesses a very rare talent of making complex concepts easy to grasp.
A three volume book — customized to what you want to learn.
Since this book covers a huge amount of content, I've decided to break the book down into three volumes called "bundles". A bundle includes the eBook, video tutorials, and source code for a given volume.
Each bundle builds on top of the others and includes all content from the lower volumes. You should choose a bundle based on: (1) how in-depth you want to study deep learning, computer vision, and visual recognition and (2) your particular budget.
You can find a quick breakdown of the three bundles below — the full list of topics to be covered can be found later on this page:
- Starter Bundle: A great fit for those taking their first steps towards deep learning for image classification mastery. You'll learn the basics of (1) machine learning, (2) neural networks, (3) Convolutional Neural Networks, and (4) how to work with your own custom datasets.
- Practitioner Bundle: Perfect for readers who are ready to study deep learning in-depth, understand advanced techniques, and discover common best practices and rules of thumb.
- ImageNet Bundle: The complete deep learning for computer vision experience. In this bundle, I demonstrate how to train large-scale neural networks on the massive ImageNet dataset. You just can't beat this bundle if you want to master deep learning for computer vision.
More than just a book — this is your gateway to mastering deep learning.
Deep Learning for Computer Vision with Python is more than just a book. It's a complete package that is designed from the ground-up to help you master deep learning.
Each bundle includes:
- The eBook files in PDF, .mobi, and .epub format.
- Video tutorials and walkthroughs for each chapter in the book.
- All source code listings so you can run the examples from the book out-of-the-box.
- A downloadable pre-configured Ubuntu VirtualBox virtual machine that ships with all necessary Python + deep learning libraries you will need to be successful pre-installed.
- Access to the Deep Learning for Computer Vision with Python companion website, so you can further your knowledge, even when you're done reading the book.
The ImageNet Bundle also includes a hardcopy edition of the complete book delivered to your doorstep.
New to machine learning and neural networks? Go with the Starter Bundle.
The Starter Bundle begins with a gentle introduction to the world of computer vision and machine learning, builds to neural networks, and then turns full steam into deep learning and Convolutional Neural Networks. You'll even solve fun and interesting real-world problems using deep learning along the way.
The Starter Bundle is appropriate if:
- You are new to the world of machine learning/neural networks.
- You are on a budget.
While this is the lowest tier bundle, you'll still be getting a complete education. That said, for a more in-depth treatment of deep learning for computer vision, I would recommend either the Practitioner Bundle or ImageNet Bundle.
Want an in-depth treatment of deep learning? Choose the Practitioner Bundle.
The Practitioner Bundle is appropriate if you want to take a deeper dive in deep learning. Inside this bundle, I cover more advanced techniques and best practices/rules of thumb. When you factor in the cost/time of training these deeper networks, the techniques I cover in the Practitioner Bundle will save you so much time that the bundle will pay for itself, guaranteed.
While the Starter Bundle focuses on learning the fundamentals of deep learning, the Practitioner Bundle takes the next logical step and covers more advanced techniques, including transfer learning, fine-tuning, networks as feature extractors, working with HDF5 + large datasets, and object detection and localization.
I also review Deep Dreaming and Neural Style, Generative Adversarial Networks (GANs), and Image Super Resolution in detail.
Using the techniques discussed in this bundle, you'll be able to compete in image classification competitions such as the Kaggle Dogs vs. Cats challenge (claiming a position in the top-25 leaderboard) and Stanford's cs231n Tiny ImageNet challenge.
This bundle is perfect for you if you are ready to study deep learning in-depth, understand advanced techniques, and discover common best practices and rules of thumb.
The Practitioner Bundle gives you the best bang for your buck. If you're even remotely serious about studying deep learning, you should go with this bundle.
Interested in a complete deep learning education? Go with the ImageNet Bundle.
The ImageNet Bundle is the most in-depth bundle and is a perfect fit if you want to train large-scale deep neural networks. This is also the only bundle that includes a hardcopy edition of the complete Deep Learning for Computer Vision with Python book, mailed to your doorstep.
Inside this bundle, I demonstrate how to build a custom Python framework to train network architectures from scratch — this is the exact same framework I use when training my own neural networks. We'll use this framework to train AlexNet, VGGNet, SqueezeNet, GoogLeNet, and ResNet on the challenging ImageNet dataset.
Using the training techniques I outline in this bundle, you'll be able to reproduce the results you see in popular deep learning papers and publications — this is an absolute must for anyone doing research and development in the deep learning space.
To demonstrate advanced deep learning techniques in action, I provide a number of case studies, including age + gender recognition, emotion and facial expression recognition, car make + model recognition, and automatic image orientation correction.
This bundle also includes a special BONUS GUIDE that reviews Faster R-CNNs and Single Shot Detectors (SSDs) and how to use them.
You should choose the ImageNet Bundle if:
- You want the complete deep learning for computer vision experience.
- You intend to train deep neural networks on large datasets from scratch.
When it comes to studying deep learning, you can't beat this bundle!
Here's the full breakdown of what you'll learn inside Deep Learning for Computer Vision with Python
Since this book covers a huge amount of content, I've decided to break the book down into three volumes called "bundles". Each bundle builds on top of the others and includes all content from the lower tiers. Use the list of topics below (broken down by bundle) to help you (1) identify which topics you would like to study and then (2) choose a bundle based on this list.
Take Your First Steps
Learn how to setup and configure your development environment to study deep learning using Python, Keras, and mxnet.
Understand Image Basics
Review how we represent images as arrays; coordinate systems; width, height, and depth; and aspect ratios.
Machine Learning Principles
Discover "parameterized learning" (i.e., learning from data) and how we use data, feature vectors, scoring functions, and loss functions to create machine learning classifiers.
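To make the idea concrete, here is a minimal sketch of a linear scoring function paired with a softmax cross-entropy loss, written in plain Python. It is only an illustration, not code from the book; the names `scores` and `cross_entropy_loss`, along with the weights and feature vector, are made up for this example.

```python
import math

def scores(W, b, x):
    """Linear scoring function: one score per class, s_k = W[k] . x + b[k]."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b_k
            for row, b_k in zip(W, b)]

def cross_entropy_loss(s, label):
    """Softmax cross-entropy loss for one example with the given true label."""
    m = max(s)  # subtract the max before exponentiating for numerical stability
    log_sum = m + math.log(sum(math.exp(v - m) for v in s))
    return log_sum - s[label]

# Hypothetical 2-class classifier over 3-dimensional feature vectors.
W = [[0.5, -0.2, 0.1],
     [-0.3, 0.8, 0.0]]
b = [0.0, 0.1]
x = [1.0, 2.0, 3.0]

s = scores(W, b, x)
loss_if_class_1 = cross_entropy_loss(s, 1)  # true label matches the highest score
loss_if_class_0 = cross_entropy_loss(s, 0)  # true label disagrees with the scores
```

The loss is small when the highest score belongs to the true class and larger when it does not, which is the signal a learning algorithm uses to adjust `W` and `b`.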
Gradient Descent algorithms allow our classifiers to learn from data. I'll teach you how these methods work and show you how to implement them by hand.
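As a rough sketch of what "by hand" can look like, the snippet below fits a straight line with plain batch gradient descent on mean squared error. It assumes a toy, noise-free dataset and illustrative hyperparameters (`lr`, `epochs`); it is not the book's implementation.

```python
def gradient_descent(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w * x + b by minimizing mean squared error with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient to reduce the loss.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data generated from y = 2x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = gradient_descent(xs, ys)
```

On this toy data the learned `w` and `b` approach 2 and 1. Training a neural network follows the same loop, only with many more parameters and gradients computed by backpropagation.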
We'll discuss & implement the classic Perceptron algorithm, then move on to multi-layer networks, which we'll code from scratch via Python + Keras.
We'll take an in-depth dive into the Backpropagation algorithm, the cornerstone of neural networks. I'll even show you how to implement backpropagation by hand using Python + NumPy.
Intro to Convolutional Neural Networks (CNNs)
I'll discuss exactly what a convolution is, followed by explaining Convolutional Neural Networks (what they are used for, why they work so well for image classification, etc.).
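For a feel of the operation itself, here is a small pure-Python sketch that slides a kernel over a tiny grayscale image. It assumes no padding and a stride of 1, and it uses the unflipped (cross-correlation) form that CNN layers commonly compute; the function name `convolve2d` and the sample data are illustrative only.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum elementwise products.

    This is the cross-correlation form commonly used in CNN layers: the kernel
    is not flipped, unlike the strict mathematical definition of convolution.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    output = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            total = 0
            for ky in range(kh):
                for kx in range(kw):
                    total += image[y + ky][x + kx] * kernel[ky][kx]
            row.append(total)
        output.append(row)
    return output

# A tiny 4x5 grayscale "image": dark on the left, bright on the right.
image = [
    [0, 0, 0, 255, 255],
    [0, 0, 0, 255, 255],
    [0, 0, 0, 255, 255],
    [0, 0, 0, 255, 255],
]

# A 3x3 kernel that responds to left-to-right increases in intensity.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

edges = convolve2d(image, kernel)
```

The output is zero over the flat dark region and large wherever the window covers the dark-to-bright transition. A CNN learns kernel values like these from data instead of having them designed by hand.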
CNN Building Blocks
Convolutional Neural Networks are built using different layer types, including convolutional layers, activation layers, pooling layers, batch normalization layers, dropout layers and others — you'll discover how to use these layers to build your own CNNs.
Uncover Common Architectures & Training Patterns
Discover common network architecture patterns you can use to design architectures of your own with minimal frustration and headaches.
Pre-trained ImageNet Networks
Utilize out-of-the-box CNNs for classification that are pre-trained on 1,000 common object categories and are ready to be applied to your own images/image datasets (VGG16, VGG19, ResNet50, etc.).
Learn how to save and load your network models from disk during training, allowing you to checkpoint models and spot high performing epochs.
Spot Underfitting and Overfitting
Save yourself hours (or even days) of training time by using these techniques to determine if your network is underfitting or overfitting on your training data.
Decay and Learning Rate Schedulers
Learning rate decay/schedulers can help prevent overfitting and increase your classification accuracy. I'll discuss how to use these methods to maximize your model accuracy.
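One common schedule is step decay, where the learning rate drops by a fixed factor every few epochs. The sketch below is an assumption-laden illustration: the function name `step_decay` and its default values are hypothetical, not settings taken from the book.

```python
def step_decay(epoch, initial_lr=0.01, factor=0.5, drop_every=5):
    """Return the learning rate for a given 0-based epoch under step decay."""
    # Integer division counts how many drops have happened so far.
    return initial_lr * (factor ** (epoch // drop_every))

# Learning rate for each of the first 15 epochs.
schedule = [step_decay(e) for e in range(15)]
```

With these defaults the rate stays at 0.01 for epochs 0 to 4, halves to 0.005 for epochs 5 to 9, and halves again to 0.0025 for epochs 10 to 14. A function of this shape can typically be passed to Keras through its `LearningRateScheduler` callback.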
Work With Your Own Datasets
Learn how to gather your own training images, label them, and train a Convolutional Neural Network from scratch on top of your dataset.
Train the classic LeNet architecture from scratch to recognize handwritten digits in images.
Case Study: Smile Detection
I'll show you how to train a custom smile detector using Convolutional Neural Networks.
Don't train your CNN from scratch — use transfer learning and train your network in a fraction of the time and obtain higher classification accuracy.
Networks As Feature Extractors
Treat pre-trained networks as feature extractors to obtain high classification accuracy with little effort.
Utilize fine-tuning to boost the accuracy of pre-trained networks, allowing you to work with small image datasets (and still reach high accuracy).
Apply data augmentation to increase network classification accuracy without gathering more training data.
Learn how to implement seminal CNN architectures from scratch, including AlexNet, VGGNet, SqueezeNet, GoogLeNet, and ResNet.
Advanced Optimization Algorithms
SGD is just the tip of the iceberg — you can also train your networks using RMSprop, Adagrad, Adadelta, Adam, Adamax, and Nadam. I'll show you how.
Utilize image cropping for an easy way to boost accuracy on your testing set.
Explore how network ensembles can be used to increase classification accuracy simply by training multiple networks.
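A simple way to combine networks is to average their predicted class probabilities and take the most likely class. The sketch below assumes each model has already produced softmax probabilities for the same image; the numbers are made up for illustration and `ensemble_predict` is a hypothetical helper name.

```python
def ensemble_predict(per_model_probs):
    """Average class probabilities across models and return (label, averaged probabilities)."""
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    averaged = [sum(p[k] for p in per_model_probs) / n_models
                for k in range(n_classes)]
    # Pick the class index with the highest averaged probability.
    label = max(range(n_classes), key=lambda k: averaged[k])
    return label, averaged

# Made-up softmax outputs from three hypothetical networks for one image.
probs = [
    [0.60, 0.30, 0.10],
    [0.20, 0.50, 0.30],
    [0.25, 0.55, 0.20],
]
label, averaged = ensemble_predict(probs)
```

Here the first network alone would pick class 0, but the averaged prediction picks class 1, showing how an ensemble can smooth over one model's mistake.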
Best Practices to Boost Network Performance
Discover my optimal pathway for applying deep learning techniques to maximize classification accuracy (and which order to apply these techniques in to achieve the greatest effectiveness).
Work With Datasets Too Large to Fit Into Memory
Learn how to convert an image dataset from raw images on disk to HDF5 format, making networks easier (and faster) to train.
Compete In Deep Learning Competitions
I'll show you how to train a network on the Kaggle Dogs vs. Cats challenge and claim a position in the top-25 leaderboard with minimal effort. We'll also review how to rank high on the cs231n Tiny ImageNet classification challenge leaderboard.
Object Detection & Localization
Discover how to train your own custom object detectors using deep learning.
Deep Dreaming and Neural Style
Discover how to use deep learning to transform the artistic styles from one image to another.
Generative Adversarial Networks (GANs)
I'll show you how to utilize two neural networks (a generative model and a discriminative model) to produce photorealistic images that look authentic to humans.
Image Super Resolution
Learn how to construct high-resolution images from a single, low-resolution input using deep learning algorithms.
ImageNet: Large Scale Visual Recognition Challenge
Discover what the massive ImageNet (1,000 categories) dataset is and why it’s considered the de facto challenge to benchmark image classification algorithms.
Work With ImageNet
I'll show you how to obtain the ImageNet dataset and convert it to an efficiently packed record file suitable for training.
Learn how to utilize multiple GPUs to train your network in parallel, greatly reducing training time.
AlexNet, VGGNet, GoogLeNet, SqueezeNet, and ResNet
Train state-of-the-art network architectures to replicate the results of the authors in their original papers.
ImageNet: Tips, Tricks, & Rules of Thumb
Save weeks (and even months) of training time by discovering learning rate schedules that actually work — this chapter alone will save you enough time to actually pay for the book itself.
Boost ImageNet Accuracy
Learn how to restart training from saved epochs, lower learning rates, and increase classification accuracy on your testing set.
Running Networks in the Cloud
Curious how to build your own Clarifai API or Google Vision API? In this chapter, I'll demonstrate how to wrap network architectures in a scalable web API to help you build deep learning web services. Perhaps your web API will become the next AI SaaS!
Faster R-CNNs and Single Shot Detectors (SSDs)
In this bonus guide, I'll discuss object detection with deep learning, explain how the Faster R-CNN and Single Shot Detector (SSD) architectures work, and demonstrate how to use these architectures using the Caffe framework.
Case Study: Age + Gender Recognition
Train your own custom CNN to (accurately) recognize the age + gender of a person in an image using deep learning.
Case Study: Emotion and Facial Expression Recognition
Compete in Kaggle's Facial Expression Recognition Challenge and train a CNN (from scratch) capable of recognizing emotions/facial expressions in real-time.
Case Study: Vehicle Make + Model Classification
Utilize fine-tuning to train a network capable of recognizing the make + model of over 164 vehicles with over 96.52% accuracy.
Case Study: Image Orientation Correction
Learn how features extracted from a pre-trained Convolutional Neural Network can be used to not only detect image orientation but correct it as well.
Trusted by members of top machine learning companies and schools.
Join them in deep learning mastery.
Enjoy a 100% money back guarantee.
Since this book is currently in pre-order, I am offering a 30 day Money Back Guarantee on all orders. If I do not release this book (extremely unlikely), I will, of course, refund your purchase. If after reading my book, you haven't learned the fundamentals of deep learning for computer vision, then I don't want your money. That's why I offer a 100% Money Back Guarantee. Simply send me an email and ask for a refund, up to 30 days after your purchase.
Which bundle should I buy?
Each bundle builds on top of the others and includes all content from lower volumes. You should choose a bundle based on (1) how in depth you want to study deep learning, computer vision & visual recognition and (2) your particular budget. Use the "Here's the full breakdown of what you'll learn inside Deep Learning for Computer Vision with Python" section above to help you decide which topics you want to learn, then pick a bundle based on your choices.
What is a "pre-order"?
In order to fund the creation of Deep Learning for Computer Vision with Python, I ran a Kickstarter campaign to pay for additional servers (to gather results faster and get the book finished ASAP), my editor, and for my own time so I could consult less and focus on this book. Your pre-order helps buy my time so I can focus on finishing this book and releasing it faster.
Why should I pre-order?
The less I have to do consulting/contracting work, the more time I can spend writing this book. Your pre-order enables me to do this. Also, I am offering a discount on Deep Learning for Computer Vision with Python if you pre-order now — the book will be more expensive once it is released publicly later this year.
What happens after I pre-order?
After you pre-order your copy you will (1) receive an email receipt for your purchase and (2) you will be added to my email contact list where I'll be sharing exclusive behind the scenes looks as I finish up the book. You'll also be the first to receive drafts of Deep Learning for Computer Vision with Python before the book is released to the general public.
What is a "money back guarantee"?
Since this book is currently in pre-order, I am offering a 30 day Money Back Guarantee on all orders. If I do not release this book (extremely unlikely), I will, of course, refund your purchase.
Why are we using the Python programming language?
First of all, Python is awesome. It is an easy language to learn and hands-down the best way to work with deep learning algorithms. The simple, intuitive syntax allows you to focus on learning the basics of deep learning, rather than spending hours fixing crazy compiler errors in other languages.
What deep learning libraries/packages are used?
We will be using the Keras and mxnet libraries inside this book. Keras supports both TensorFlow and Theano, making it super easy to quickly build and train networks. The mxnet library specializes in distributed learning, making it a great choice for training deep network architectures on massive datasets. Each library in the book will be thoroughly reviewed to ensure you understand how to build & train your own deep learning networks.
Do I need any programming experience before reading this book?
This book assumes you have some prior programming experience (e.g., you know what a variable, function, loop, etc. are). You should have more skills than a novice, but you certainly don't need to be an intermediate or advanced developer. As long as you understand basic programming logic flow, you'll be successful in reading (and understanding) the contents of this book.
Do I need to know OpenCV?
You do not need to know the OpenCV library to be successful when going through this book. We only use OpenCV to facilitate basic image processing operations such as loading an image from disk, displaying it to our screen, and a few other basic operations. That said, a little bit of OpenCV experience goes a long way, so if you're new to OpenCV I highly recommend that you go through my other book, Practical Python and OpenCV, to learn the fundamentals.
Do I need any special hardware to run the examples in the Starter Bundle or Practitioner Bundle?
All examples inside the Starter Bundle can be executed on a CPU without a problem. The same is true for the examples in the Practitioner Bundle, although some examples will take longer to run. In either case, a GPU will dramatically speed up the network training process but is not a requirement.
Do I need special hardware for the ImageNet Bundle?
If you intend on going with the ImageNet Bundle, you are expected to have a GPU with at least 6GB of memory. The more GPUs you have available, the better. You should also have at least 1-2TB of free space on your machine. The ImageNet Bundle covers very advanced deep learning techniques on massive datasets, so make sure you make the necessary hardware preparations.
Which GPUs do you recommend for the course?
I personally use the NVIDIA Titan X (12GB) on a daily basis for training my own deep learning networks. The Titan X is a bit expensive, so NVIDIA has released the GTX 1080 with 8GB of memory for half the cost of the Titan X. The latest addition to the NVIDIA family, the 1080 Ti (11GB), is also highly recommended. Alternatively, I would recommend using Amazon EC2 and their GPU instances (particularly p2.* and g2.*) in the cloud to train your networks if you do not want to purchase physical hardware.
What if I'm a beginner at deep learning?
Don't worry; you won't get bogged down by tons of theory and complex equations. We'll start off with the basics of machine learning and neural networks. You'll learn in a fun, practical way with lots of code. You'll be a neural network ninja in no time, and be able to graduate to the more advanced content.
What if I'm already experienced in deep learning?
This book isn't just for beginners; there's advanced content in here too. You'll discover how to train your own custom object detectors using deep learning. You'll build a custom framework that can be used to train very deep architectures on the challenging ImageNet dataset from scratch. I'll even show you my personal blueprint, which I use to determine which deep learning techniques to apply when confronted with a new problem. Best of all, these solutions and tactics can be directly applied to your current job and research.
Where can I learn more about you?
I have authored over 175 blog posts about computer vision, OpenCV, and deep learning over at PyImageSearch.com. Check out the posts to get a feel for my teaching and writing style (not to mention the quality and depth of the tutorials). I would also highly suggest that you sign up for the (free) Table of Contents I am offering using the form at the bottom-right corner of this page.
I have another question.
If you have any other questions, please send me a message, and I'll get back to you immediately.
Who's behind this?
Hey, I'm Adrian Rosebrock, a Ph.D. and entrepreneur who has spent his entire adult life studying computer vision and machine learning. Over the past 3 years alone I have:
- Started the PyImageSearch.com blog and published over 175 tutorials and articles aimed at teaching computer vision, image processing, and machine learning.
- Authored Practical Python and OpenCV, which has been featured on the official OpenCV.org website.
- Created PyImageSearch Gurus, an actionable, real-world course on computer vision and OpenCV. This course is the most comprehensive computer vision education online today, covering 13 modules broken out into 168 lessons with over 2,161 pages of content.
- Ran a successful Kickstarter campaign (over 5,000% funded) to fund the creation of this book.
- Answered 10,000 emails and helped thousands of developers, researchers, and students learn the ropes of computer vision and machine learning.
If studying deep learning and visual recognition sounds interesting to you, I hope you'll consider pre-ordering this book. You'll learn a ton about deep learning and computer vision in a practical, hands-on way. And you'll have fun doing it. See you on the other side!