We’re not even halfway through 2014 yet, but there have been some really amazing Python books released this year that have not been getting much attention.
Some of these books are related to computer vision, some to machine learning and statistical analysis, and others to parallel computing.
While not all of these books are directly related to building content-based image retrieval systems or image search engines, they are tangentially related in one way or another.
For instance, how are you going to deploy an image search engine? By using a Python web framework, of course.
And how do you intend to index your dataset of images? I would hope that you’re using parallel computing.
Definitely take a second to check out these books. You won’t be disappointed.
And if you think I’m missing a particularly important book, leave a comment or send me a message.
The Best Python Books of 2014
If you’re a reader of this blog, you know that I love hands-on, easy-to-follow tutorials and guides to solving problems. And let’s face it, parallel computing makes sense on both a theoretical and practical level. But the big question is, how do you actually do it?
This book answers all your parallel computing questions and discusses pipes and queues, distributed tasks using one of my favorite Python packages, Celery, and how to perform asynchronous I/O.
In the context of computer vision and image search engines, parallel computing is practically a must. Imagine that we are tasked with extracting features from a dataset with millions of images. Indexing this dataset would take weeks, even months, on a single-core machine.
The solution is to distribute the indexing across multiple processes and even multiple machines.
And in order to index images in parallel across multiple processes and machines, you first need to understand how to do this in practice.
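To make the idea concrete, here is a minimal sketch of indexing across multiple processes on a single machine using the standard library's `multiprocessing.Pool`. It is not taken from the book. The `extract_features` function is a hypothetical stand-in (a 4-bin histogram over raw bytes) for whatever image descriptor you would really use, and `index_dataset` is a name I made up for illustration.

```python
from multiprocessing import Pool


def extract_features(item):
    """Hypothetical stand-in for a real image descriptor.

    `item` is an (image_id, raw_bytes) pair. The "feature vector" is a
    normalized 4-bin histogram of the byte values.
    """
    image_id, raw = item
    bins = [0, 0, 0, 0]
    for value in raw:
        bins[value // 64] += 1  # byte values 0-255 fall into bins 0-3
    total = len(raw) or 1  # avoid dividing by zero on empty input
    return image_id, [count / total for count in bins]


def index_dataset(items, workers=4):
    """Build {image_id: features}, spreading the work over `workers` processes."""
    items = list(items)
    if workers <= 1:
        # Serial fallback, handy for debugging
        return dict(map(extract_features, items))
    with Pool(processes=workers) as pool:
        # chunksize hands items to workers in batches to cut per-task overhead
        return dict(pool.map(extract_features, items, chunksize=64))
```

When you use more than one worker, call `index_dataset` from under an `if __name__ == "__main__":` guard, since some platforms start worker processes by re-importing your script. Spreading the work across multiple machines needs a task queue on top of this, which is where a package like Celery comes in.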
I highly recommend this book and suggest you add it to your reading list.
Think back to my top 9 favorite Python libraries for building image search engines post.
Do you know which Python package made the cut?
That’s right. Matplotlib.
Back when I was running experiments and gathering results for my dissertation, matplotlib was always there to make plotting simple and easy.
Whether you are a scientific researcher publishing your results, a student at a university working on a project, or a programmer or developer working the 9-to-5, you’ve likely had to generate some plots and figures in your lifetime.
And while the documentation for matplotlib is fantastic, you just can’t beat the cookbook approach — actual Python scripts to solve actual plotting problems using matplotlib.
Check out this book and let me know what you think.
Before you can even think about building a content-based image retrieval system or an image search engine, there is one paramount, absolutely critical step you must take first.
And that step is…gathering your dataset.
Makes sense, right?
You can’t build an image search engine if you don’t have anything to search against!
In some cases, your dataset may already exist for you. This is especially true if you are doing research in academia and need to compare your results to other methods utilizing a common dataset.
But on the business, enterprise, and startup side of things, this isn’t always true.
Here’s an example from my own personal experience…
When I first built Chic Engine, I had to create a web crawler to go out and scrape the webpages and images associated with fashion content, like shirts, shoes, jackets, etc.
In order to create this web crawler, I utilized Beautiful Soup, which makes parsing and navigating the DOM (Document Object Model) tree dead simple.
Honestly, it took me less than 30 minutes to create a crawler to scrape both Amazon and ShopStyle for the latest fashion finds. Thank you, Beautiful Soup.
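To give you a feel for it, here is a small sketch of the kind of DOM navigation Beautiful Soup makes easy. This is not the actual Chic Engine crawler. The HTML snippet, the `product` class name, and the `product_images` helper are all made up for illustration, and real retailer pages will be structured differently.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for a product listing page.
HTML = """
<html><body>
  <div class="product">
    <a href="/items/1"><img src="/img/shirt.jpg" alt="Blue shirt"></a>
  </div>
  <div class="product">
    <a href="/items/2"><img src="/img/shoes.jpg" alt="Running shoes"></a>
  </div>
  <img src="/img/logo.png" alt="Site logo">
</body></html>
"""


def product_images(html):
    """Return (link, image URL, alt text) for each product block."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for block in soup.find_all("div", class_="product"):
        link = block.find("a")
        image = block.find("img")
        if link is None or image is None:
            continue  # skip blocks that lack a link or an image
        results.append((link.get("href"), image.get("src"), image.get("alt", "")))
    return results
```

Notice that the site logo is ignored because it does not sit inside a `product` block. Scoping the search to the part of the tree you care about is most of the work in a scraper like this.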
Once I had my screen scrapers coded up, I utilized multi-threading and multi-processing methods (similar to what is discussed in Parallel Programming in Python mentioned above) to quickly crawl and scrape the web for my content.
After I had scraped all the data I needed, I indexed my dataset using (once again) parallel methods.
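As a rough sketch of the multi-threaded crawling step, the snippet below downloads a list of pages concurrently with the standard library's `ThreadPoolExecutor`. Again, this is an illustration under my own assumptions rather than the code I actually ran. The `fetch` and `crawl` names are hypothetical, and a polite crawler would also need rate limiting and robots.txt handling, which are left out here.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one page and return its raw bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def crawl(urls, fetch_page=fetch, workers=8):
    """Fetch every URL concurrently and return ({url: body}, {url: error})."""
    pages, errors = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit everything up front so the downloads overlap
        futures = {url: pool.submit(fetch_page, url) for url in urls}
        for url, future in futures.items():
            try:
                pages[url] = future.result()
            except Exception as exc:  # one bad page should not stop the crawl
                errors[url] = exc
    return pages, errors
```

Threads are a good fit here because the work is mostly waiting on the network. The `fetch_page` parameter lets you swap in a fake downloader when testing.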
Noticing a parallel computing trend here?
All of this wouldn’t have been possible without the Beautiful Soup package.
If it’s not on the web, then it probably doesn’t exist.
That’s not exactly a true statement, but I think you get my sentiment.
For example, let’s say that you just built the next big startup.
Or better yet, the next Google Image Search or TinEye.
How are you going to get it out there? How are users going to utilize your new technology?
You’ll likely be putting together some sort of website and maybe even an API wrapped around it.
In order to do that, you’ll likely need a Python web framework.
But which one do you choose?
And speaking of Django, I highly recommend picking up a copy of Two Scoops of Django 1.6. Daniel Greenfeld and Audrey Roy have done a phenomenal job putting this book together, and it is a must-own for Django developers.
#5. Statistics, Data Mining, and Machine Learning in Astronomy: A Practical Python Guide for the Analysis of Survey Data
How big is “big data”? What does it mean for data to be “big”?
And is big data, like the cake, just a lie?
While we all may have different definitions for big data, one thing is for sure: the volume of data that astronomers and astrophysicists interact with is reaching the petabyte domain.
Tell me, how well do your algorithms scale?
Can they (efficiently) handle a petabyte of data?
If you’re working with a ton of data, the Python code and datasets associated with Statistics, Data Mining, and Machine Learning in Astronomy are well worth a look.
Personally, I find that I learn the best when I’m getting my hands dirty — when I’m neck deep in the code and I’m just trying to tread water.
That’s why I’m such a big fan of programming cookbooks…
You get to jump right into practical, working examples. You get to break the code and then get it working again, all while tweaking it to your liking.
If you’ve been considering tinkering with Raspberry Pi, definitely check out this book by Tim Cox.
Alright. I’m clearly biased. But…
If you’ve ever been curious about learning the basics of computer vision and image processing using OpenCV and Python, then this is the book for you.
And I can teach you the basics in a single weekend.
I know, it sounds crazy.
See, Practical Python and OpenCV covers the fundamentals of computer vision and image processing. I’ve included tons of code examples that allow you to get your hands dirty, quickly and easily.
Seriously, this is your guaranteed quick start guide to learning the fundamentals of computer vision and image processing using Python and OpenCV. It doesn’t matter if you’re a developer, programmer, or student — I can teach you the fundamentals in a single weekend.
So if you’re even remotely interested in computer vision and how to make machines see and interpret photos and images, take a second to check out my book.
It won’t disappoint, I promise.
So there you have it!
The best Python books of 2014 (thus far anyway)!
This year is not even halfway over and we already have some phenomenal Python content to digest. Honestly, it just shows how dedicated and great the Python community is.
Anyway, if there is a book that you think I am missing on this list, feel free to leave a comment or shoot me a message. I would be happy to chat.