Deep dream: Visualizing every layer of GoogLeNet

A few weeks ago I introduced bat-country, my implementation of a lightweight, extendible, easy to use Python package for deep dreaming and inceptionism.

The reception of the library was very good, so I decided it would be interesting to do a follow-up post — but instead of generating more really trippy images like the ones on the Twitter #deepdream stream, I thought it would be more captivating to visualize every layer of GoogLeNet using bat-country.

Looking for the source code to this post?
Jump right to the downloads section.

Visualizing every layer of GoogLeNet with Python

Below is my Python script to load an image, loop over every layer of the network, and write each output image to file. The listing here is a minimal sketch built on the BatCountry class (and its layers, dream, and cleanup methods) introduced in the original bat-country post:
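
    # import the necessary packages
    from __future__ import print_function
    from batcountry import BatCountry
    from PIL import Image
    import numpy as np
    import argparse

    # construct the argument parser and parse the arguments
    ap = argparse.ArgumentParser()
    ap.add_argument("-b", "--base-model", required=True, help="base model path")
    ap.add_argument("-i", "--image", required=True, help="path to input image")
    ap.add_argument("-o", "--output", required=True, help="path to output directory")
    args = vars(ap.parse_args())

    # initialize BatCountry and grab the layer names of the network
    bc = BatCountry(args["base_model"])
    layers = bc.layers()

    # loop over the layers of the network
    for (i, layer) in enumerate(layers):
        print("[INFO] processing layer `{}` {}/{}".format(layer, i + 1, len(layers)))

        try:
            # pass the image through the network, guided by the current layer
            image = bc.dream(np.float32(Image.open(args["image"])), end=layer)

            # write the visualization to file
            result = Image.fromarray(np.uint8(image))
            result.save("{}/{}_{}.jpg".format(args["output"], str(i + 1).zfill(4),
                layer.replace("/", "_")))

        # some layers cannot be used for visualization, so catch the
        # error and move on to the next layer
        except KeyError:
            print("[ERROR] cannot use layer `{}`".format(layer))

    # perform housekeeping and free up network resources
    bc.cleanup()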

This script requires three command line arguments: the --base-model directory where our Caffe model lives, the path to our input --image, and finally the --output directory where our images will be stored after being passed through the network.

As you’ll also see, I am using a try/except block to catch any layers that cannot be used for visualization.

Below is the image that I fed into the network:

Figure 1: The iconic input image of Dr. Grant and the T-Rex from Jurassic Park.

I then executed the Python script using the following command (the script filename and paths here are placeholders; substitute your own model, image, and output locations):
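
    $ python visualize_layers.py --base-model $CAFFE_ROOT/models/bvlc_googlenet \
        --image images/jurassic_park.jpg --output output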

The visualization process then kicks off. I generated my results on an Amazon EC2 g2.2xlarge instance with GPU support enabled, so the script finished within 30 minutes.

You can see a .gif of all layer visualizations below:

Figure 2: Visualizing every layer of GoogLeNet using bat-country.

The .gif is pretty large at 9.6MB, so give it a few seconds to load, especially if you are on a slow connection.

In the meantime, here are some of my favorite layers:

Figure 3: This is by far my favorite of the bunch. The lower layers of the network reflect edge-like regions in the input image.

Figure 4: The inception_3a/3x3 layer also produces a nice effect.

Figure 5: The same goes for the inception_3b/3x3_reduce layer.

Figure 6: This one I found amusing — it seems that Dr. Grant has developed a severe case of dog-ass.

Figure 7: Eventually, our Dr. Grant and T-Rex have morphed into something else entirely.

Summary

This blog post was a quick “just for fun” tutorial on visualizing every layer of a CNN, which also made for a nice demonstration of how to use the bat-country library.

If you haven’t had a chance to play around with deep dreaming or inceptionism, definitely give the original post on bat-country a read — I think you’ll find it amusing and enjoyable.

See you next week!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, but I’ll also send you a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you’ll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL! Sound good? If so, enter your email address and I’ll send you the code immediately!


12 Responses to Deep dream: Visualizing every layer of GoogLeNet

  1. Aaron August 21, 2015 at 2:41 pm #

    Hi,

    What are the processing requirements for each picture?

    I’d love to do some real-time video processing.

    • Adrian Rosebrock August 22, 2015 at 7:15 am #

      It really depends on the size of the image and whether you are using the CPU or the GPU. The GPU is immensely faster than the CPU. Smaller images are better if you want to process them in real time, and stopping at a lower layer of the network will also be faster, since each additional layer requires additional computation. If you are trying to do real-time processing, I would suggest making your images as small as you can tolerate and using the GPU.
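
      As a rough sketch of that setup (assuming the same BatCountry API used in the post above; the paths, frame size, and layer name are just placeholders):

          from batcountry import BatCountry
          from PIL import Image
          import numpy as np

          # point this at your own GoogLeNet model directory
          bc = BatCountry("models/bvlc_googlenet")

          # downscale the frame first; smaller inputs dream much faster
          frame = Image.open("frame.jpg")
          frame.thumbnail((320, 240))

          # stop at an early, cheaper layer rather than a deep one
          image = bc.dream(np.float32(frame), end="conv2/3x3")
          Image.fromarray(np.uint8(image)).save("frame_dreamed.jpg")
          bc.cleanup()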

  2. Tim August 29, 2015 at 7:11 pm #

    Hey Adrian,

    Have you experimented with other Caffe models? Or just Google’s?

    Cheers.

    • Adrian Rosebrock August 30, 2015 at 7:55 am #

      I’ve played around with MIT Places with BatCountry, but mostly GoogLeNet.

  3. Pawan Dhananjay April 27, 2016 at 1:43 pm #

    Hi,
    Will running on the CPU make any difference with respect to the visualisations obtained?
    I ran the same code on my laptop without a GPU and I am getting very hazy visualisations compared to the ones you have put up!

    • Adrian Rosebrock April 28, 2016 at 2:32 pm #

      Running on the CPU shouldn’t have any impact on the output visualizations — it will just run much, much slower than using the GPU.
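
      If your Caffe install was compiled with GPU support, switching pycaffe over to the GPU takes two calls, made before the network is loaded:

          import caffe

          # route all subsequent Caffe computation to the first GPU
          caffe.set_device(0)
          caffe.set_mode_gpu()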

  4. xibo July 14, 2016 at 6:00 am #

    I have a question: if, for example, the input image is 256×256, is the output of deep dream also 256×256 (for each layer)? I ask because a convolution layer (with a 3×3 filter and no padding, say) reduces the size of the image to 254×254.

    • Adrian Rosebrock July 14, 2016 at 1:08 pm #

      The output image will be the same spatial dimensions as your input image. Deep dream works by modifying the input image itself via backpropagated gradients, so while the activations inside the network shrink from layer to layer, the image being updated never does.

  5. xibo July 14, 2016 at 6:02 am #

    The output of each layer of the CNN is a tensor, so how do you change it into an image?

    • Adrian Rosebrock July 14, 2016 at 1:07 pm #

      That really depends on your deep learning library. With Caffe + Python, one simple approach is to grab the layer’s blob after a forward pass, normalize it, and save it, along the lines of the sketch below (the layer name is just an example):
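
          import numpy as np
          from PIL import Image

          def blob_to_image(blob):
              # blob is one image's activation, shaped (channels, height, width);
              # average across channels to collapse it to a single 2D map
              data = blob.mean(axis=0)

              # scale to [0, 255] and convert to an 8-bit grayscale image
              data -= data.min()
              data /= max(data.max(), 1e-8)
              return Image.fromarray(np.uint8(data * 255))

          # after calling net.forward(), e.g.:
          # blob_to_image(net.blobs["inception_3a/output"].data[0]).save("act.png")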

      • xibo July 14, 2016 at 9:05 pm #

        Thanks for answering my questions. I now have a basic understanding of the output of the CNN, and that it depends on the model being used.

  6. praveen vijayan November 27, 2016 at 8:31 am #

    Great, helpful article. Thanks!
