In this post, I interview Brandon Gilles, a longtime PyImageSearch reader, and creator of the OpenCV AI Kit (OAK), which is revolutionizing how we are performing embedded computer vision and deep learning.
To celebrate the 20th anniversary of the OpenCV library, Brandon partnered with the official OpenCV.org organization to create the OpenCV AI Kit, an MIT-licensed open-source software API and Myriad X-based embedded board/camera.
OAK comes in two flavors:
- OAK-1: The standard OpenCV AI board that can perform neural network inference, object tracking, April Tags recognition, feature detection, and basic image processing operations.
- OAK-D: Everything in OAK-1, but with a stereo depth camera, 3D object localization, and tracking objects in a 3D space.
Last week, Brandon and the OpenCV organization launched a Kickstarter campaign to fund the creation of these amazing embedded AI boards. At the time of this writing, the OpenCV AI Kit is sitting at $419,833 in funding from its Kickstarter supporters, potentially making it one of the most successful embedded boards in crowdfunding history!
Brandon and the OpenCV team were gracious enough to give the PyImageSearch team early access to the OpenCV AI Kit, so expect OAK tutorials on the PyImageSearch blog in the future.
Additionally, I’ve decided that OAK will be covered inside the next edition of our embedded computer vision and deep learning book, Raspberry Pi for Computer Vision (specifically the Complete Bundle).
- If you already have a copy of the Complete Bundle, you will receive the OAK chapter updates for free via email update when they release.
- If you haven’t already picked up a copy of RPi4CV, make sure you do, and you will receive the OAK chapters when they are ready.
The OpenCV AI Kit is going to revolutionize how embedded computer vision and deep learning are performed. We can’t wait to work with the board (and help you get your start with it as well).
Let’s give a warm welcome to Brandon Gilles as he shares his work.
An interview with Brandon Gilles, creator of the OpenCV AI Kit (OAK)
Adrian: Hi Brandon! It’s such a pleasure to have you here on the PyImageSearch blog. I’m sure you’re swamped with running the OpenCV AI Kit (OAK) Kickstarter campaign – we all appreciate you taking the time to do this interview.
Brandon: Thanks a ton for having me. As you know the whole team and I are all huge fans of PyImageSearch. What you guys do is so great for democratizing computer-vision and deep-learning know-how and is such a great resource. And the constant stream of tutorials, new content, etc. is invaluable for keeping up to speed in this rapidly-advancing field. So on behalf of myself and the whole OAK team – thanks for all that you and your team do! I don’t think I’ve met a CV/AI person who hasn’t used your tutorials or books.
Adrian: Thanks so much for the kind words, Brandon! We really appreciate it. Can you tell us a bit about yourself? Where do you work and what is your job?
Brandon: I’m an Electrical Engineer. I’m the founder and CEO of Luxonis.
I’ve always wanted to start my own business. And I’m constantly searching for what to build that could have a positive impact on the world.
Funnily enough, I’ve also always been risk-averse… so I spent over a decade dreaming about starting a business, but never actually diving full-in to do it. So this is my first attempt at going 100% all-in to build a product that matters and a company from scratch.
Adrian: What is your background in computer vision and deep learning? And how did you first become interested in computer vision and deep learning?
Brandon: So like most EEs I had done basic computer vision in undergrad classes (now 16 years ago!), implementing traditional computer vision functions in classes like linear systems, but my career trajectory hadn’t involved computer vision or deep learning at all until one of my mentors quit the company for which we were both working.
I loved it there, he loved it there (he was an antenna designer there, actually), and so it was perplexing.
When I asked him why he was leaving, he told me ‘AI and deep learning is about to up-end every industry; it’s the biggest opportunity of my career – I have to go try’.
And at the time I knew very little about deep learning, and the last time I had even thought about the term ‘AI’ was 2004, when a college roommate was trying out some AI techniques in Lisp and described to me how useless it was. So my mental model of AI from 2004 until mid-2016 was ‘it’s useless’.
Then this mentor (who I hold in super-high-regard) left to pursue AI/machine learning/deep learning full-time, leaving a job we both loved to do so. It was a shock and opened up my eyes.
I went home that day and started Googling deep learning, machine learning, AI, etc. Over the next ~12 months I was obsessed, spending any spare time I had (i.e. reading on the toilet) reading/lurking on PyImageSearch (primarily), following the PyImageSearch tutorials, and keeping tabs on TowardsDataScience, MIT Technology Review, etc.
Adrian: So tell me about this OpenCV AI Kit Kickstarter campaign. I remember when you first mentioned this project back to me in 2018, but back then it was called “DepthAI”. Can you tell us your story of how you came up with DepthAI, validated the idea, and then turned that into the OpenCV AI Kit?
Brandon: So it’s a bit circuitous, and has changed names a bunch of times – so sorry about that – but this actually started as me trying to solve a problem as a result of tragedy for several people in my circle of close and extended friends and family.
So I was actually wanting to make a non-1980s-technology version of laser-tag (imagine fully augmented players, walls, structures, etc.) when I left my job to pursue CV/ML full-time, thinking the breakthroughs in these techniques could allow making such a multi-player AR system possible as an inexpensive retrofit to existing laser tag facilities. This was 2017.
But then in what I remember as like a single week, it seemed everyone I knew was telling me about how their friend, family member, or colleague had been hit by a distracted driver while riding their bike to or from work. Fortunately for me, none of these folks were in my immediate circle… but it had a huge impact on me nonetheless.
My Dad’s best friend was hit from behind, it broke his back in multiple places, his femur, and shattered his hips bad enough that he had to be bedridden for 9 months before he could have surgery just to get the swelling down. My good friend’s business partner had a nearly identical experience. My college ski buddy (I found out through a mutual friend) had the same thing, but additionally suffered a traumatic brain injury. And the founder of the hackerspace I sometimes attend, his business partner died from the same impact (the vehicle’s mirror hit him in the neck and killed him).
So all of a sudden this Augmented Reality laser tag system I was prototyping didn’t feel like the right thing to do. And having already spent years at this point studying all the state of the art behind computer vision, machine learning, deep learning, and other nascent technologies, I wondered if you could build an embedded system that could have prevented these accidents.
In all of the cases, if the car had just been as little as 12 inches further to the left, none of this would have happened. Having seen Jonathan Lansey’s LOUD Bicycle Kickstarter years prior – and been latently fascinated by how effective it was – I wanted, I had to find out if it was possible to build a system that would warn the person riding the bike – and warn the driver – to prevent this type of tragedy.
It seemed to me that if computer vision and deep learning could outperform doctors on analyzing CT-scan imagery, outperform humans at image classification, object detection, 3D perception, etc. that there had to be a way to make this life-safety device.
What I envisioned was something that was equivalent to a friend riding on your bike backwards who could tap you on the shoulder when things weren’t looking right, start flashing ultra-bright LEDs at the car when the driver is on a trajectory to hit you, and then actually honk a car horn (which are remarkably small!) if the car is definitely going to hit you… with just enough time to allow the swerve. So leveraging what I had learned from PyImageSearch and a big step up from Katsuya Hyodo (PINTO0309) on GitHub, I made a prototype that could do this:
It was a huge, fantastically-ugly prototype, but it totally worked! And this whole prototyping and testing process taught me one super-important thing:
- Depth + modern object detection made solving this problem tractable… and it worked insanely well with very little code/effort.
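To make the three-stage warning Brandon describes (tap the rider, flash LEDs at the driver, sound the horn) a bit more concrete, here is a minimal Python sketch of how such an escalation could be decided once depth and object detection give you a vehicle's distance, closing speed, and sideways offset. This is not the logic from Brandon's prototype: the `choose_alert` function, the `Alert` levels, and every threshold below are hypothetical values picked purely for illustration.

```python
from enum import Enum


class Alert(Enum):
    NONE = 0  # nothing to do
    TAP = 1   # haptic nudge to the rider
    LEDS = 2  # flash bright LEDs at the driver
    HORN = 3  # sound the horn


def choose_alert(distance_m, closing_speed_mps, lateral_offset_m,
                 safe_lateral_m=1.0, led_ttc_s=4.0, horn_ttc_s=2.0):
    """Pick an alert level for one tracked vehicle approaching from behind.

    All thresholds are illustrative assumptions, not tuned values.
    """
    if closing_speed_mps <= 0:
        return Alert.NONE  # the vehicle is not gaining on the rider
    if abs(lateral_offset_m) >= safe_lateral_m:
        return Alert.NONE  # the vehicle will pass with enough clearance

    # Time to collision: how long until the gap closes at the current speed.
    ttc_s = distance_m / closing_speed_mps
    if ttc_s <= horn_ttc_s:
        return Alert.HORN
    if ttc_s <= led_ttc_s:
        return Alert.LEDS
    return Alert.TAP


# A car 15 m back, closing at 5 m/s, 0.3 m off the rider's line.
print(choose_alert(15.0, 5.0, 0.3))
```

The point of the sketch is that every input is a physical quantity in meters or meters per second, which is exactly what a depth camera combined with an object detector can supply.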
Now that the concept was proven, including actually warning the biker with feedback, flashing LEDs at the car, and even honking real car horns (which are super loud), I went to see about building an actual product. Which taught me a second important thing:
- Although individual components existed for this (depth cameras, AI processors, embedded CPUs, etc.) there was no way to build a small, embedded product around these solutions (they were too big and cumbersome to realistically fit under a bike seat, or be cost reasonable).
Interestingly enough, the Myriad X had just started shipping at this time – and it had every single CV/AI capability to solve this problem – but it was only available in a USB stick or a PCIE card… neither of which allowed embedded use and also neither allowed making use of the depth engine and its other CV acceleration capabilities (motion estimation, edge detection, object tracking, feature tracking, video encoding, etc.). So the USB stick/PCIE card were not usable in this sort of application that was (1) embedded and (2) required more than just neural inference.
So at this point, now with a team behind me, we either had to give up on the mission or build the platform.
We chose to build the platform. And at the time we called it DepthAI … but actually, after some initial sales, you tweeted about our platform (now forever ago), which resulted in Dr. Mallick from OpenCV reaching out with the idea to take our mission, this platform, and make it a core part of OpenCV for solving such embedded CV & AI problems.
Adrian: Tell us about the hardware behind the OpenCV AI Kit. What’s powering this embedded device? And more to the point, as deep learning/computer vision practitioners, why should we care about the OpenCV AI Kit?
Brandon: So what led us to find the Myriad X was that it combined all the capabilities for solving the problem we were trying to solve, in one single low-power chip:
- Real-time neural inference (covering nearly every type of network)
- Stereo depth
- Feature tracking including Motion Estimation and platform-motion estimation
- Object tracking
- H.265/H.264 encoding (for recording incidents and/or generally just recording)
- JPEG encoding
- 16x efficient vector processors for combining these capabilities together and running accelerated CV functions (think of these as like an embedded vision-specific GPU)
Being an EE, I said ‘OK, well that’s great, I just need the chip, the SDK, and I’m good to go’. What I found out is that the only SDK for the chip was for neural inference… all the other functionality was not usable by any products on the market (compute sticks, PCIE cards, etc.). They could only do inference.
So the thing that makes it unique (and why we chose it) was the combination of all those features. When used in a USB stick or PCIE card, all you get is 1/7th of the capabilities the chip has (and there are others I’m forgetting… so it’s probably more like 1/10th of the chip’s power).
So with OAK we take the chip and expose the power that we needed, and that we figured other people must need as well. In short, we expose the real power of the Myriad X… which is the combination of all these features to solve Spatial AI problems.
And on that – Spatial AI is the capability to use AI to get results in physical coordinates – position of objects or features in meters. And this is exactly what we needed to monitor and track the 3D trajectories of vehicles (and their edges/mirrors) in real-time.
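As a rough illustration of what "results in physical coordinates" means, the sketch below converts a stereo disparity to depth and then back-projects a detected pixel into camera-frame meters using the standard pinhole camera model. It does not use the OAK API; the `Intrinsics` class, both helper functions, and all the numbers (focal length, principal point, baseline, disparity) are made-up assumptions, not OAK calibration values.

```python
from dataclasses import dataclass


@dataclass
class Intrinsics:
    """Pinhole camera parameters in pixels (hypothetical values)."""
    fx: float
    fy: float
    cx: float
    cy: float


def disparity_to_depth_m(disparity_px, focal_px, baseline_m):
    """Classic stereo relation: depth = focal length * baseline / disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def pixel_to_xyz_m(u, v, depth_m, k):
    """Back-project pixel (u, v) at a known depth into camera-frame meters."""
    x = (u - k.cx) * depth_m / k.fx
    y = (v - k.cy) * depth_m / k.fy
    return (x, y, depth_m)


# Assumed camera: 800 px focal length, 1280x720 image, 7.5 cm stereo baseline.
k = Intrinsics(fx=800.0, fy=800.0, cx=640.0, cy=360.0)

# Center of a detected bounding box and the stereo disparity measured there.
depth = disparity_to_depth_m(disparity_px=20.0, focal_px=k.fx, baseline_m=0.075)
xyz = pixel_to_xyz_m(u=840.0, v=360.0, depth_m=depth, k=k)
print(xyz)
```

Doing this for each detection on each frame gives a stream of 3D positions, which is the raw material for tracking a vehicle's trajectory over time.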
Adrian: So just to clarify, the OpenCV AI Kit is more than just an embedded Artificial Intelligence board. It’s also an API, correct? How do users interact with the API? Is it Python and OpenCV compatible?
Brandon: Great question. The number-one confusion about the platform is that ‘it’s just another board’. Although the hardware is valuable, because it is what affords the Myriad X to be used as it should, the key is the API and the way we configure the Myriad X’s hardware blocks in novel ways to afford all this functionality in a simple way. We spent a TON of time optimizing cache, inner-chip communication systems, and DDR communication – including tons of rewrites – to allow the Myriad X to do all of these things.
And we expose this through a simple API which is, yes, Python and OpenCV compatible. It is actually written in C++ (also open-source, so it can be compiled on anything that can compile OpenCV), with pybind11 providing the API functionality in Python. So in short, the C++ interface and direct programming can be used as well.
The Kickstarter focuses on the USB-interface version of OAK, but we are working in the background on SPI-interface versions… so this power could be connected to other systems like microcontrollers that don’t even have an operating system. This will be open-sourced likely around the closing of the Kickstarter campaign.
Adrian: You clearly have extensive experience in embedded systems, board design, and manufacturing. What’s the most challenging part of manufacturing the OpenCV AI Kit boards? How are you anticipating keeping up with the demand?
Brandon: Thanks for the kind words! So the most challenging part is the sourcing of components, specifically the camera modules. Over the last several decades we’ve seen this move to information transparency and an ease of buying all sorts of components… this wave has not yet hit the camera module market. It remains very opaque and rigid. As an aside, it seems to me like a market ripe for disruption.
So we’ve spent a tremendous amount of time simply communicating with suppliers, doing build orders, and negotiating. Other than that, the manufacturing and supply chain has been pain-free. We did a decent-quantity control run prior to the Kickstarter and got 99.7% yield, which we were incredibly excited about.
Adrian: You’ve been a longtime reader and customer of PyImageSearch having read both Raspberry Pi for Computer Vision and Deep Learning for Computer Vision with Python – how have those books helped you develop the OpenCV AI Kit?
Brandon: Yes, our team has collectively purchased every one of your books actually (most of which independently before we started working together).
So the #1 way that PyImageSearch has helped is in accelerating the time to complete a prototype. Instead of fighting 10 hours to figure out how to get something going, we leverage your tutorials and books and get it done as a 20-second step because you and your team have already fought the good fight to figure out how exactly to make codebase-x or machine-learning technique-x actually run, and run properly.
So PyImageSearch is what allowed us to quickly prototype, discover the power and feasibility of applying computer vision to solve such problems. And this led to the discovery that there was a hole in the hardware ecosystem, that the capabilities of the Myriad X weren’t actually available and usable for everyone. So PyImageSearch was a core part of discovering our mission, and continues to help us get things running smoothly and without unnecessary hassle.
Adrian: Would you recommend Raspberry Pi for Computer Vision and Deep Learning for Computer Vision with Python to other developers, students, and researchers who are trying to learn computer vision and deep learning?
Brandon: Absolutely. So I wasn’t at all abreast of what was technically doable, and PyImageSearch was instrumental in showing me what was, and allowing me to get working systems (like the crate above) going right away.
For anyone who is interested, these books allow you to get proficient and get to solving real-world problems right away.
Adrian: Is there any advice you would give to someone who wants to follow in your footsteps, learn computer vision and deep learning, and then launch a product in the CV/DL space?
Brandon: Start by following a PyImageSearch tutorial. I spent too much time reading articles about theory/etc. and the state-of-the-art. Things really started happening once I started actually running PyImageSearch code, and modifying it to solve my actual problems and to satisfy my curiosity.
Adrian: If a PyImageSearch reader wants to chat, what’s the best place to contact you?
Brandon: I’m a huge believer in engineers being able to communicate directly with anyone interested, so we have a couple ways that allow this – our community slack channel (https://luxonis-community.slack.com/) where I am available effectively when I’m awake, and our discussion forum (https://discuss.luxonis.com/). Folks can feel free to email me at brandon [at] luxonis [dot] com as well!
In this blog post, we interviewed Brandon Gilles, an entrepreneur, PyImageSearch reader/customer, and creator of the OpenCV AI Kit.
Brandon and the OpenCV team recently launched a crowdfunding campaign for the OpenCV AI Kit (OAK), a piece of embedded hardware that is making it easier than ever for computer vision and deep learning practitioners to apply CV/DL to embedded devices.
The Kickstarter campaign for the OpenCV AI Kit will be live for another three weeks. Brandon and OpenCV are offering discounts on the OpenCV AI Kit until the campaign closes, so if you’re interested in grabbing this hardware at the special crowdfunding deal, now would be a good time to do so.
And if you’re interested in following in Brandon’s footsteps and learning how to apply computer vision/deep learning to your own projects and research, definitely grab a copy of Deep Learning for Computer Vision with Python and Raspberry Pi for Computer Vision – these are the exact same resources Brandon used.