Take a sneak peek at what's inside...
This book has one goal — to help developers, researchers, and students just like yourself become experts in deep learning for image recognition and classification.
Inside this book you'll find:
- Super practical walkthroughs that present solutions to actual, real-world image classification problems, challenges, and competitions.
- Hands-on tutorials (with lots of code) that not only show you the algorithms behind deep learning for computer vision but their implementations as well.
- A no-nonsense teaching style that is guaranteed to cut through all the cruft and help you master deep learning for image understanding and visual recognition.
I highly recommend grabbing a copy of Deep Learning for Computer Vision with Python. It goes into a lot of detail and has tons of detailed examples. It’s the only book I’ve seen so far that covers both how things work and how to actually use them in the real world to solve difficult problems. Check it out!
Phenomenal. The concepts on deep learning are so well explained that I will be recommending this book to anybody not just involved in computer vision but AI in general.
Just getting started with deep learning? Or already a pro?
No problem, I have you covered either way.
Are you just getting started in deep learning? Don't worry; you won't get bogged down by tons of theory and complex equations. We'll start off with the basics of machine learning and neural networks. Learn in a fun, practical way with lots of code. You'll be a neural network ninja in no time, and be able to graduate to the more advanced content.
Are you already a seasoned deep learning pro? This book isn't just for beginners — there's advanced content in here, too. You'll discover how to train your own custom object detectors using deep learning. You'll build a custom framework that can be used to train very deep architectures on the challenging ImageNet dataset from scratch. I'll even show you my personal blueprint which I use to determine which deep learning techniques to apply when confronted with a new problem. Best of all, these solutions and tactics can be directly applied to your current job and research.
Regardless of your experience level, you'll find tremendous value inside Deep Learning for Computer Vision with Python, I guarantee it.
What is this book? And what does it cover?
Deep Learning for Computer Vision with Python will make you an expert in deep learning for computer vision and visual recognition tasks.
Inside the book we will focus on:
- Neural Networks and Machine Learning
- Convolutional Neural Networks (CNNs)
- Object detection/localization with deep learning
- Training large-scale (ImageNet-level) networks
- Hands-on implementations using the Python programming language and the Keras, TensorFlow 2.0, and mxnet deep learning libraries
After going through Deep Learning for Computer Vision with Python, you'll be able to solve real-world problems with deep learning.
Utilize Python, Keras, TensorFlow 2.0, and mxnet to build deep learning networks.
Python, TensorFlow 2.0, Keras, and mxnet are all well-built tools that, when combined, create a powerful deep learning development environment that you can use to master deep learning for computer vision and visual recognition.
We'll be utilizing the Python programming language for all examples in this book. Python is an easy language to learn and is hands-down the best way to work with deep learning algorithms.
To build and train our deep learning networks we'll primarily be using TensorFlow 2.0 and the Keras API inside of TF 2.0 (i.e., tf.keras). Using Keras and TensorFlow 2.0 is the fastest, easiest way to go from idea, to experimentation, to result.
We'll also use mxnet, a deep learning library that specializes in distributed, multi-machine learning. The ability to parallelize training across GPUs/devices is critical when training deep neural network architectures on massive datasets (such as ImageNet).
Each library that we use in this book will be thoroughly reviewed to ensure you understand how to build & train your own deep learning networks.
Dr. Rosebrock delivers on what he promises! [Inside the book] he focuses on mastering deep learning concepts, lays down the theoretical foundation, develops interesting deep learning and computer vision projects with detailed explanation of Python scripts, and puts to your disposal priceless expertise to get you quickly engaged in the incredible field of deep learning. There is just no other book like this that I know of!
You're probably wondering...
"Is this book right for me?"
This book is for developers, researchers, and students who have at least some programming experience and want to become proficient in deep learning for computer vision & visual recognition.
This book is for you if you:
- Are a computer vision developer that utilizes OpenCV (among other image processing libraries) and are eager to level-up your skills.
- Have experience with machine learning and want to break into neural networks/deep learning for image understanding.
- Are a college student and want more than your university offers (or want to get ahead of your class).
- Are a scientist looking to apply deep learning + computer vision algorithms to your research.
- Utilize computer vision algorithms in your own projects but have yet to try deep learning.
- Have used deep learning in projects before, but never in the context of visual recognition and image understanding.
- Write Python/machine learning code at your day job and are motivated to stand out from your coworkers.
- Are a "machine learning hobbyist" who knows how to program and wants to understand what this "deep learning" thing is all about.
If any of these descriptions fit you, rest assured: you're the target student.
I wrote this book for you.
I'm constantly recommending your [PyImageSearch.com] site to people I know at Georgia Tech and Udacity. While I consider Udacity the gold standard, I would rate your material at the same level. Keep up the good work.
Adrian possesses a very rare talent of making complex concepts easy to grasp.
A three-volume book, customized to what you want to learn.
Since this book covers a huge amount of content, I've decided to break the book down into three volumes called "bundles". A bundle includes the eBook, video tutorials, and source code for a given volume.
Each bundle builds on top of the others and includes all content from the lower volumes. You should choose a bundle based on: (1) how in-depth you want to study deep learning, computer vision, and visual recognition and (2) your particular budget.
You can find a quick breakdown of the three bundles below — the full list of topics to be covered can be found later on this page:
- Starter Bundle: A great fit for those taking their first steps towards deep learning for image classification mastery. You'll learn the basics of (1) machine learning, (2) neural networks, (3) Convolutional Neural Networks, and (4) how to work with your own custom datasets.
- Practitioner Bundle: Perfect for readers who are ready to study deep learning in-depth, understand advanced techniques, and discover common best practices and rules of thumb.
- ImageNet Bundle: The complete deep learning for computer vision experience. In this bundle, I demonstrate how to train large-scale neural networks on the massive ImageNet dataset. You just can't beat this bundle if you want to master deep learning for computer vision.
More than just a book — this is your gateway to mastering deep learning.
Deep Learning for Computer Vision with Python is more than just a book. It's a complete package that is designed from the ground-up to help you master deep learning.
Each bundle includes:
- The eBook files in PDF, .mobi, and .epub format.
- Video tutorials and walkthroughs for each chapter in the book.
- All source code listings so you can run the examples from the book out-of-the-box.
- A downloadable pre-configured Ubuntu VirtualBox virtual machine that ships with all necessary Python + deep learning libraries you will need to be successful pre-installed.
- Access to the Deep Learning for Computer Vision with Python companion website, so you can further your knowledge, even when you're done reading the book.
The ImageNet Bundle also includes a hardcopy edition of the complete book delivered to your doorstep.
New to machine learning and neural networks? Go with the Starter Bundle.
The Starter Bundle begins with a gentle introduction to the world of computer vision and machine learning, builds to neural networks, and then turns full steam into deep learning and Convolutional Neural Networks. You'll even solve fun and interesting real-world problems using deep learning along the way.
The Starter Bundle is appropriate if:
- You are new to the world of machine learning/neural networks.
- You are on a budget.
While this is the lowest tier bundle, you'll still be getting a complete education. That said, for a more in-depth treatment of deep learning for computer vision, I would recommend either the Practitioner Bundle or ImageNet Bundle.
Want an in-depth treatment of deep learning? Choose the Practitioner Bundle.
The Practitioner Bundle is appropriate if you want to take a deeper dive in deep learning. Inside this bundle, I cover more advanced techniques and best practices/rules of thumb. When you factor in the cost/time of training these deeper networks, the techniques I cover in the Practitioner Bundle will save you so much time that the bundle will pay for itself, guaranteed.
While the Starter Bundle focuses on learning the fundamentals of deep learning, the Practitioner Bundle takes the next logical step and covers more advanced techniques, including transfer learning, fine-tuning, networks as feature extractors, working with HDF5 + large datasets, and object detection and localization.
I also review Deep Dreaming and Neural Style, Generative Adversarial Networks (GANs), and Image Super Resolution in detail.
Using the techniques discussed in this bundle, you'll be able to compete in image classification competitions such as the Kaggle Dogs vs. Cats challenge (claiming a position in the top-25 leaderboard) and Stanford's cs231n Tiny ImageNet challenge.
This bundle is perfect for you if you are ready to study deep learning in-depth, understand advanced techniques, and discover common best practices and rules of thumb.
The Practitioner Bundle gives you the best bang for your buck. If you're even remotely serious about studying deep learning, you should go with this bundle.
Interested in a complete deep learning education? Go with the ImageNet Bundle.
The ImageNet Bundle is the most in-depth bundle and is a perfect fit if you want to train large-scale deep neural networks. This is also the only bundle that includes a hardcopy edition of the complete Deep Learning for Computer Vision with Python book, mailed to your doorstep.
Inside this bundle, I demonstrate how to build a custom Python framework to train network architectures from scratch — this is the exact same framework I use when training my own neural networks. We'll use this framework to train AlexNet, VGGNet, SqueezeNet, GoogLeNet, and ResNet on the challenging ImageNet dataset.
Using the training techniques I outline in this bundle, you'll be able to reproduce the results you see in popular deep learning papers and publications — this is an absolute must for anyone doing research and development in the deep learning space.
To demonstrate advanced deep learning techniques in action, I provide a number of case studies, including age + gender recognition, emotion and facial expression recognition, car make + model recognition, and automatic image orientation correction.
This bundle also includes special BONUS GUIDES on object detection (Faster R-CNNs, Single Shot Detectors, RetinaNet) and instance/semantic segmentation (Mask R-CNN). Your copy of the ImageNet Bundle includes these bonus guides.
You should choose the ImageNet Bundle if:
- You want the complete deep learning for computer vision experience.
- You intend to train deep neural networks on large datasets from scratch.
- You want to learn how to train object detection or instance/semantic segmentation networks.
When it comes to studying deep learning, you can't beat this bundle!
Here's the full breakdown of what you'll learn inside Deep Learning for Computer Vision with Python
Since this book covers a huge amount of content, I've decided to break the book down into three volumes called "bundles". Each bundle builds on top of the others and includes all content from the lower tiers. Use the list of topics below (broken down by bundle) to help you (1) identify which topics you would like to study and then (2) choose a bundle based on this list.
Take Your First Steps
Learn how to setup and configure your development environment to study deep learning using Python, TensorFlow 2.0, Keras, and mxnet.
Understand Image Basics
Review how we represent images as arrays; coordinate systems; width, height, and depth; and aspect ratios.
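To make the idea concrete, here is a minimal NumPy sketch (an illustration, not a listing from the book). The 300x400 image size and the `resize_dims` helper are hypothetical, chosen only to show how height, width, depth, the top-left origin, and aspect ratio relate to one another.

```python
import numpy as np

# A hypothetical blank image: 300 rows (height), 400 columns (width),
# and 3 channels (depth), stored as 8-bit unsigned integers.
image = np.zeros((300, 400, 3), dtype="uint8")

# NumPy reports the shape as (height, width, depth).
height, width, depth = image.shape

# The origin (0, 0) is the top-left corner, and arrays are indexed
# row-first, so the pixel at x=50, y=10 is image[10, 50].
image[10, 50] = (255, 0, 0)

# Aspect ratio is the ratio of width to height.
aspect_ratio = width / height


def resize_dims(width, height, new_width):
    """Return (new_width, new_height) preserving the aspect ratio."""
    scale = new_width / width
    return new_width, int(round(height * scale))


print(height, width, depth, aspect_ratio)
print(resize_dims(width, height, 200))
```

Note that libraries differ in channel ordering (OpenCV, for example, loads images in BGR order), so the meaning of the three values in a pixel depends on how the image was loaded.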
Machine Learning Principles
Discover "parameterized learning" (i.e., learning from data) and how we use data, feature vectors, scoring functions, and loss functions to create machine learning classifiers.
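As a rough sketch of how these pieces fit together (an illustration under stated assumptions, not code from the book): a feature vector `x` is mapped to class scores by a linear scoring function with parameters `W` and `b`, and a loss function measures how poorly those scores match the true label. The choice of a linear scoring function and a softmax cross-entropy loss here is an assumption made for illustration, and the shapes and random values are made up.

```python
import numpy as np


def scores(W, b, x):
    """Linear scoring function: returns one score per class."""
    return W.dot(x) + b


def cross_entropy_loss(s, true_class):
    """Softmax cross-entropy loss for a single example."""
    shifted = s - np.max(s)  # subtract the max for numerical stability
    probs = np.exp(shifted) / np.sum(np.exp(shifted))
    return -np.log(probs[true_class])


rng = np.random.default_rng(42)
x = rng.random(4)                  # hypothetical 4-dim feature vector
W = rng.standard_normal((3, 4))    # parameters for 3 classes
b = np.zeros(3)

s = scores(W, b, x)
loss = cross_entropy_loss(s, true_class=1)
print(s, loss)
```

Learning then amounts to adjusting `W` and `b` so that the loss decreases across the training data.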
Gradient descent algorithms allow our models to learn from data. I'll teach you how these methods work and show you how to implement them by hand.
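For a feel of what a by-hand implementation can look like, here is a minimal sketch of vanilla (batch) gradient descent fitting a line with a mean squared error loss. The toy dataset, the learning rate, and the `gradient_descent` function name are assumptions for illustration and are not taken from the book.

```python
import numpy as np


def gradient_descent(X, y, lr=0.1, epochs=1000):
    """Fit weights (plus a bias) by minimizing mean squared error."""
    # Append a column of ones so the bias is learned as the last weight.
    Xb = np.c_[X, np.ones(len(X))]
    w = np.zeros(Xb.shape[1])

    for _ in range(epochs):
        error = Xb.dot(w) - y                   # prediction error
        grad = 2 * Xb.T.dot(error) / len(y)     # gradient of the MSE
        w -= lr * grad                          # step against the gradient

    return w


# Toy data generated from y = 2x + 1 (made up for this example).
X = np.linspace(0, 1, 20).reshape(-1, 1)
y = 2 * X[:, 0] + 1

w = gradient_descent(X, y)
print(w)  # slope and bias, which should approach 2 and 1
```

Each pass computes the gradient of the loss with respect to the weights and takes a small step in the opposite direction; the learning rate `lr` controls the size of that step.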
We'll discuss & implement the classic Perceptron algorithm, then move on to multi-layer networks, which we'll code from scratch via Python + Keras.
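A minimal sketch of the classic Perceptron learning rule, trained on the bitwise OR dataset, might look like the following. This is an illustration only; the class layout, the learning rate, and the choice of the OR dataset are assumptions rather than the book's implementation.

```python
import numpy as np


class Perceptron:
    def __init__(self, n_inputs, lr=0.1):
        # One weight per input, plus a bias stored as the last entry.
        self.W = np.zeros(n_inputs + 1)
        self.lr = lr

    @staticmethod
    def step(value):
        """Step activation: 1 if the input is positive, otherwise 0."""
        return 1 if value > 0 else 0

    def fit(self, X, y, epochs=20):
        Xb = np.c_[X, np.ones(len(X))]  # append the bias column
        for _ in range(epochs):
            for x, target in zip(Xb, y):
                pred = self.step(np.dot(x, self.W))
                # Update the weights only when the prediction is wrong.
                if pred != target:
                    self.W += self.lr * (target - pred) * x

    def predict(self, X):
        X = np.atleast_2d(X)
        Xb = np.c_[X, np.ones(len(X))]
        return [self.step(v) for v in Xb.dot(self.W)]


# The bitwise OR dataset, which is linearly separable.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])

p = Perceptron(n_inputs=2)
p.fit(X, y)
print(p.predict(X))
```

A single Perceptron can only separate classes with a straight line, which is why datasets such as XOR call for the multi-layer networks mentioned above.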
We'll take an in-depth dive into the Backpropagation algorithm, the cornerstone of neural networks. I'll even show you how to implement backpropagation by hand using Python + NumPy.
Intro to Convolutional Neural Networks (CNNs)
I'll discuss exactly what a convolution is, followed by explaining Convolutional Neural Networks (what they are used for, why they work so well for image classification, etc.).
CNN Building Blocks
Convolutional Neural Networks are built using different layer types, including convolutional layers, activation layers, pooling layers, batch normalization layers, dropout layers and others — you'll discover how to use these layers to build your own CNNs.
Uncover Common Architectures & Training Patterns
Discover common network architecture patterns you can use to design architectures of your own with minimal frustration and headaches.
Pre-trained ImageNet Networks
Utilize out-of-the-box CNNs for classification that are pre-trained on 1,000 common object categories and are ready to be applied to your own images/image datasets (VGG16, VGG19, ResNet50, etc.).
Learn how to save and load your network models from disk during training, allowing you to checkpoint models and spot high performing epochs.
Spot Underfitting and Overfitting
Save yourself days (or even weeks) of training time by using these techniques to determine if your network is underfitting or overfitting on your training data.
Decay and Learning Rate Schedulers
Learning rate decay/schedulers can help prevent overfitting and increase your classification accuracy. I'll discuss how to use these methods to maximize your model accuracy.
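One common scheduler is step decay, where the learning rate is multiplied by a fixed factor every few epochs. The sketch below is illustrative only; the function name `step_decay` and the default values (initial rate 0.01, factor 0.5, a drop every 5 epochs) are assumptions, not settings recommended by the book.

```python
import math


def step_decay(epoch, init_lr=0.01, factor=0.5, drop_every=5):
    """Return the learning rate for a given (zero-indexed) epoch."""
    # Count how many full "drop" intervals have elapsed so far.
    num_drops = math.floor(epoch / drop_every)
    return init_lr * (factor ** num_drops)


for epoch in (0, 4, 5, 10):
    print(epoch, step_decay(epoch))
```

A function with this shape (epoch in, learning rate out) can be passed to the `LearningRateScheduler` callback in `tf.keras` so the rate is updated at the start of each epoch.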
Work With Your Own Datasets
Learn how to gather your own training images, label them, and train a Convolutional Neural Network from scratch on top of your dataset.
Train the classic LeNet architecture from scratch to recognize handwritten digits in images.
Case Study: Smile Detection
I'll show you how to train a custom smile detector using Convolutional Neural Networks.
Don't train your CNN from scratch — use transfer learning and train your network in a fraction of the time and obtain higher classification accuracy.
Networks As Feature Extractors
Treat pre-trained networks as feature extractors to obtain high classification accuracy with little effort.
Utilize fine-tuning to boost the accuracy of pre-trained networks, allowing you to work with small image datasets (and still reach high accuracy).
Apply data augmentation to increase network classification accuracy without gathering more training data.
Learn how to implement seminal CNN architectures from scratch, including AlexNet, VGGNet, SqueezeNet, GoogLeNet, and ResNet.
Advanced Optimization Algorithms
SGD is just the tip of the iceberg — you can also train your networks using RMSprop, Adagrad, Adadelta, Adam, Adamax, and Nadam. I'll show you how.
Utilize image cropping for an easy way to boost accuracy on your testing set.
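One widely used form of test-time cropping is "10-crop" over-sampling: take the four corner crops and the center crop of an image, add the horizontal flip of each, classify all ten, and average the predictions. Whether the book uses exactly this scheme is an assumption; the `ten_crops` helper and the 256x256 to 227x227 sizes below are hypothetical.

```python
import numpy as np


def ten_crops(image, crop_h, crop_w):
    """Return 10 crops: 4 corners + center, each with its mirror."""
    h, w = image.shape[:2]

    # Top-left (y, x) coordinates of the four corner crops and center.
    coords = [
        (0, 0),
        (0, w - crop_w),
        (h - crop_h, 0),
        (h - crop_h, w - crop_w),
        ((h - crop_h) // 2, (w - crop_w) // 2),
    ]

    crops = []
    for y, x in coords:
        crop = image[y:y + crop_h, x:x + crop_w]
        crops.append(crop)
        crops.append(crop[:, ::-1])  # horizontal flip of the same crop

    return np.array(crops)


# A made-up 256x256 RGB image used only to check the output shape.
image = np.arange(256 * 256 * 3).reshape(256, 256, 3)
crops = ten_crops(image, 227, 227)
print(crops.shape)
```

At test time, each crop would be passed through the network and the ten probability vectors averaged to produce the final prediction for the image.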
Explore how network ensembles can be used to increase classification accuracy simply by training multiple networks.
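A simple way to combine several trained networks is to average their predicted class probabilities and take the most likely class. The sketch below assumes each model has already produced a probability matrix of shape (samples, classes); the numbers are made up for illustration, and `ensemble_predict` is a hypothetical helper.

```python
import numpy as np


def ensemble_predict(prob_list):
    """Average per-model probabilities and return (average, labels)."""
    avg = np.mean(np.stack(prob_list, axis=0), axis=0)
    return avg, avg.argmax(axis=1)


# Made-up probabilities from three models for 2 samples and 3 classes.
model_a = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
model_b = np.array([[0.4, 0.5, 0.1], [0.1, 0.3, 0.6]])
model_c = np.array([[0.5, 0.2, 0.3], [0.2, 0.6, 0.2]])

avg, labels = ensemble_predict([model_a, model_b, model_c])
print(avg)
print(labels)
```

In this toy example the second model disagrees with the others on both samples, but the averaged prediction follows the majority.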
Best Practices to Boost Network Performance
Discover my optimal pathway for applying deep learning techniques to maximize classification accuracy (and which order to apply these techniques in to achieve the greatest effectiveness).
Work With Datasets Too Large to Fit Into Memory
Learn how to convert an image dataset from raw images on disk to HDF5 format, making networks easier (and faster) to train.
Compete In Deep Learning Competitions
I'll show you how to train a network on the Kaggle Dogs vs. Cats challenge and claim a position in the top-25 leaderboard with minimal effort. We'll also review how to rank high on the cs231n Tiny ImageNet classification challenge leaderboard.
Object Detection & Localization
Discover how to use deep learning to detect and localize objects in images.
Deep Dreaming and Neural Style
Discover how to use deep learning to transfer the artistic style of one image to another.
Generative Adversarial Networks (GANs)
I'll show you how to utilize two neural networks (a generative model and a discriminative model) to produce photorealistic images that look authentic to humans.
Image Super Resolution
Learn how to construct high-resolution images from a single, low-resolution input using deep learning algorithms.
ImageNet: Large Scale Visual Recognition Challenge
Discover what the massive ImageNet (1,000 categories) dataset is and why it’s considered the de-facto challenge to benchmark image classification algorithms.
Work With ImageNet
I'll show you how to obtain the ImageNet dataset and convert it to an efficiently packed record file suitable for training.
Learn how to utilize multiple GPUs to train your network in parallel, greatly reducing training time.
AlexNet, VGGNet, GoogLeNet, SqueezeNet, and ResNet
Train state-of-the-art network architectures to replicate the results of the authors in their original papers.
ImageNet: Tips, Tricks, & Rules of Thumb
Save weeks (and even months) of training time by discovering learning rate schedules that actually work — this chapter alone will save you enough time to actually pay for the book itself.
Boost ImageNet Accuracy
Learn how to restart training from saved epochs, lower learning rates, and increase classification accuracy on your testing set.
Faster R-CNNs and Single Shot Detectors (SSDs)
In this bonus guide, I'll discuss object detection with deep learning, explain how the Faster R-CNN and Single Shot Detector (SSD) architectures work, and demonstrate how to use these architectures using the Caffe framework.
Case Study: Age + Gender Recognition
Train your own custom CNN to (accurately) recognize the age + gender of a person in an image using deep learning.
Case Study: Emotion and Facial Expression Recognition
Compete in Kaggle's Facial Expression Recognition Challenge and train a CNN (from scratch) capable of recognizing emotions/facial expressions in real-time.
Case Study: Vehicle Make + Model Classification
Utilize fine-tuning to train a network capable of recognizing the make + model of over 164 vehicles with over 96.52% accuracy.
Case Study: Image Orientation Correction
Learn how features extracted from a pre-trained Convolutional Neural Network can be used to not only detect image orientation but correct it as well.
BONUS: Logo Detection with the RetinaNet Object Detector
Inside this chapter you will learn how to train the RetinaNet object detection framework to automatically detect logos in images with high accuracy.
BONUS: Weapon Detection
Learn how to train an object detector capable of detecting weapons in images and video streams.
BONUS: Mask R-CNN and Skin Lesion Segmentation
I'll teach you how to train a Mask R-CNN instance segmentation network to automatically detect skin lesions, a first step in cancer identification.
BONUS: Annotate and Train Your Own Mask R-CNN
Learn how to label and annotate your own image dataset for instance segmentation. I'll then show you how to train your own Mask R-CNN on your data.
Trusted by members of top machine learning companies and schools.
Join them in deep learning mastery.
This book is a great, in-depth dive into practical deep learning for computer vision. I found it to be an approachable and enjoyable read: explanations are clear and highly detailed. You'll find many practical tips and recommendations that are rarely included in other books or in university courses. I highly recommend it, both to practitioners and beginners.
Enjoy a 100% money back guarantee.
After reading my book, if you haven't learned the fundamentals of deep learning for computer vision, then I don't want your money. That's why I offer a 100% Money Back Guarantee. Simply send me an email and ask for a refund, up to 30 days after your purchase. With all the copies I've sold, I count the number of refunds on one hand. My readers are satisfied and I'm sure you will be too.
Which bundle should I buy?
Each bundle builds on top of the others and includes all content from lower volumes. You should choose a bundle based on (1) how in depth you want to study deep learning, computer vision & visual recognition and (2) your particular budget. Use the "Here's the full breakdown of what you'll learn inside Deep Learning for Computer Vision with Python" section above to help you decide which topics you want to learn, then pick a bundle based on your choices.
What happens after I purchase?
After you purchase your copy of Deep Learning for Computer Vision with Python you will (1) receive an email receipt for your purchase and (2) be able to download your books, code, datasets, etc. immediately. If you purchased the ImageNet Bundle, the only bundle to include a hardcopy edition, you will receive a second email to enter your shipping information.
Your book is more expensive than other online courses and books — why is your book worth the price?
First, it's important to understand that Deep Learning for Computer Vision with Python is the most complete, comprehensive deep learning education online (the ImageNet Bundle is over 900 pages). It covers not only the theory behind deep learning but also the details of the implementation. You won't find this level of detail in any other online platform, MOOC, or book.
Secondly, I personally dedicate time daily to answering your questions, providing help, and offering suggestions — no other book or course online gives you this level of access to authors. To be totally honest with you, I've considered raising the price of this book multiple times but haven't (yet).
My book may seem expensive, but the value you are getting is multiple orders of magnitude higher than any other book or course. I encourage you to give my book a try. Once you dig into the content I'm confident you'll agree that the book is well worth the price.
Why are we using the Python programming language?
First of all, Python is awesome. It is an easy language to learn and hands-down the best way to work with deep learning algorithms. The simple, intuitive syntax allows you to focus on learning the basics of deep learning, rather than spending hours fixing crazy compiler errors in other languages.
What deep learning libraries and packages are we using?
We use Keras, TensorFlow 2.0, and mxnet in this book. After years in the trenches as a deep learning researcher and practitioner, I can tell you that the combination of Keras and TensorFlow 2.0 is the fastest, easiest way to go from idea, to experimentation, to result.
The mxnet library specializes in distributed learning, making it a great choice for training deep network architectures on massive datasets.
Each library in the book is thoroughly reviewed to ensure you understand how to build & train your own deep learning networks.
I want to learn TensorFlow 2.0. Is TensorFlow 2.0 covered?
Yes, TensorFlow 2.0 is covered inside the text. We primarily use TensorFlow 2.0 and the Keras API inside TensorFlow (i.e., tf.keras) when training our deep neural networks. You'll also learn how to use TensorFlow 2.0 specific features such as GradientTape and eager execution.
What if I'm a beginner at deep learning?
Don't worry; you won't get bogged down by tons of theory and complex equations. We'll start off with the basics of machine learning and neural networks. You'll learn in a fun, practical way with lots of code. You'll be a neural network ninja in no time, and be able to graduate to the more advanced content.
What if I'm already experienced in deep learning?
This book isn't just for beginners — there's advanced content in here too. You'll discover how to train your own custom object detectors using deep learning. You'll build a custom framework that can be used to train very deep architectures on the challenging ImageNet dataset from scratch. I'll even show you my personal blueprint that I use to determine which deep learning techniques to apply when confronted with a new problem. Best of all, these solutions and tactics can be directly applied to your current job, research, and projects.
Do I need any programming experience before reading this book?
This book assumes you have some prior programming experience (e.g., you know what variables, functions, and loops are). You should have more skill than a complete novice, but you certainly don't need to be an intermediate or advanced developer. As long as you understand basic programming logic flow you'll be successful in reading (and understanding) the contents of this book.
Do I need to know OpenCV?
You do not need to know the OpenCV library to be successful when going through this book. We only use OpenCV to facilitate basic image processing operations such as loading an image from disk, displaying it to our screen, and a few other basic operations. That said, a little bit of OpenCV experience goes a long way, so if you're new to OpenCV I highly recommend that you (1) purchase a copy of Deep Learning for Computer Vision with Python and (2) work through my other book, Practical Python and OpenCV, to learn the fundamentals.
Do I need any special hardware to run the examples in the Starter Bundle or Practitioner Bundle?
All examples inside the Starter Bundle can be executed on a CPU without a problem. The same is true for most examples in the Practitioner Bundle, although some examples will take longer to run. In either case, a GPU will dramatically speed up the network training process but is not a requirement.
Do I need special hardware for the ImageNet Bundle?
If you intend on going with the ImageNet Bundle, you are expected to have a GPU with at least 6GB of memory. The more GPUs you have available, the better. You should also have at least 1TB of free space on your machine. The ImageNet Bundle covers very advanced deep learning techniques on massive datasets, so make sure you make the necessary hardware preparations.
Can I upgrade from a lower tier bundle to a higher one? How does the upgrade process work?
Yes, you can always upgrade your bundle to a higher one. For example, you could purchase the Starter Bundle and then upgrade to the Practitioner Bundle or ImageNet Bundle at a later date.
The cost to upgrade would simply be the price difference between your current bundle and the bundle you wanted to upgrade to (you would not need to "repurchase" the content you already own). To upgrade your bundle just send me an email and I can get you the upgrade link.
What GPUs do you recommend for the book?
I personally use the NVIDIA Titan X (12GB) on a daily basis for training my own deep learning networks. The Titan X is a bit expensive, so NVIDIA has released the GTX 1080 with 8GB of memory for half the cost of the Titan X. The latest addition to the NVIDIA family, the 1080 Ti (11GB), is also highly recommended. Alternatively, I would recommend using Amazon EC2 and their GPU instances (particularly p2.* and g2.*) in the cloud to train your networks if you do not want to purchase physical hardware.
Can I use the cloud for deep learning?
Yes, you can absolutely use cloud services such as Amazon Web Services (AWS) or Microsoft Azure either with or without a GPU to work through the examples in this book. To jumpstart your education, I have released my own personal pre-configured Amazon Machine Image (AMI) to help you with your studies and projects. Simply launch an EC2 instance using this pre-configured AMI and you'll be ready to train your own deep neural networks in a matter of minutes! If you're a Microsoft Azure user, you can spin up a Microsoft Data Science Virtual Machine (DSVM) instance and be up and running in a few minutes as well.
Are the hardcopy editions shipping?
Yep, the hardcopies are indeed shipping! The ImageNet Bundle is the only bundle that includes a hardcopy edition. After you purchase, you will receive an email with a link to enter your shipping information. Once I have your shipping address I can get your hardcopy edition in the mail, normally within 48 hours.
Where can I learn more about you?
I have authored over 250 blog posts about computer vision, OpenCV, and deep learning over at PyImageSearch.com. Check out the posts to get a feel for my teaching and writing style (not to mention the quality and depth of the tutorials). I would also highly suggest that you sign up for the (free) Table of Contents and sample chapters I am offering using the form at the bottom-right corner of this page.
I have another question.
If you have any other questions, please send me a message, and I'll get back to you ASAP.
Who's behind this?
Hey, I'm Adrian Rosebrock, a Ph.D. and entrepreneur who has spent his entire adult life studying computer vision and machine learning. Over the past 3 years alone I have:
- Started the PyImageSearch.com blog and published over 200 tutorials and articles aimed at teaching computer vision, image processing, and machine learning.
- Authored Practical Python and OpenCV, which has been featured on the official OpenCV.org website.
- Created PyImageSearch Gurus, an actionable, real-world course on computer vision and OpenCV. This course is the most comprehensive computer vision education online today, covering 13 modules broken out into 168 lessons with over 2,161 pages of content.
- Ran a successful Kickstarter campaign (over 5,000% funded) to fund the creation of this book.
- Answered 10,000 emails and helped thousands of developers, researchers, and students learn the ropes of computer vision and machine learning.
If studying deep learning and visual recognition sounds interesting to you, I hope you'll consider grabbing a copy of this book. You'll learn a ton about deep learning and computer vision in a practical, hands-on way. And you'll have fun doing it. See you on the other side!