Convolutions with OpenCV and Python

I’m going to start today’s blog post by asking a series of questions which will then be addressed later in the tutorial:

  • What are image convolutions?
  • What do they do?
  • Why do we use them?
  • How do we apply them?
  • And what role do convolutions play in deep learning?

The word “convolution” sounds like a fancy, complicated term — but it’s really not. In fact, if you’ve ever worked with computer vision, image processing, or OpenCV before, you’ve already applied convolutions, whether you realize it or not!

Ever apply blurring or smoothing? Yep, that’s a convolution.

What about edge detection? Yup, convolution.

Have you opened Photoshop or GIMP to sharpen an image? You guessed it — convolution.

Convolutions are one of the most critical, fundamental building blocks in computer vision and image processing. But the term itself tends to scare people off. In fact, on the surface, the word even appears to have a negative connotation.

Trust me, convolutions are anything but scary. They’re actually quite easy to understand.

In reality, an (image) convolution is simply an element-wise multiplication of two matrices followed by a sum.

Seriously. That’s it. You just learned what convolution is:

  1. Take two matrices (which both have the same dimensions).
  2. Multiply them, element-by-element (i.e., not the dot-product, just a simple multiplication).
  3. Sum the elements together.
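As a quick sanity check, here is a minimal NumPy sketch of those three steps. The matrix values and the variable names (region, kernel, result) are made up purely for illustration:

```python
import numpy as np

# Step 1: two matrices with the same dimensions (illustrative values)
region = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]], dtype="float32")
kernel = np.ones((3, 3), dtype="float32") / 9.0

# Step 2: multiply element-by-element (the * operator, not np.dot)
products = region * kernel

# Step 3: sum the elements into a single value
result = products.sum()
print(result)  # approximately 5.0, the mean of the region
```

Because every kernel entry here is 1/9, the single output value is just the average of the nine inputs.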

To understand more about convolutions, why we use them, how to apply them, and the overall role they play in deep learning + image classification, be sure to keep reading this post.

A Quick Note on PyImageSearch Gurus

Before we get started, I just wanted to mention that the first half of this blog post on kernels and convolutions is based on the “Kernels” lesson inside the PyImageSearch Gurus course.

While the Kernels lesson goes into a lot more detail than what this blog post does, I still wanted to give you a taste of what PyImageSearch Gurus — my magnum opus on computer vision — has to offer.

If you like this tutorial, there are over 168 lessons covering 2,161+ pages of content on image basics, deep learning, automatic license plate recognition (ANPR), face recognition, and much more inside PyImageSearch Gurus.

Convolutions with OpenCV and Python

Think of it this way — an image is just a multi-dimensional matrix. Our image has a width (# of columns) and a height (# of rows), just like a matrix.

But unlike the traditional matrices you may have worked with back in grade school, images also have a depth to them — the number of channels in the image. For a standard RGB image, we have a depth of 3 — one channel for each of the Red, Green, and Blue channels, respectively.
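To make the "image as a matrix" idea concrete, here is a small sketch using a NumPy array as a stand-in for an image. The 640 x 480 size is an arbitrary assumption; a real image would normally be loaded with cv2.imread (which, as a side note, orders the channels as BGR rather than RGB):

```python
import numpy as np

# a hypothetical 640 x 480 color image filled with zeros (all black)
image = np.zeros((480, 640, 3), dtype="uint8")

# NumPy reports the shape as (rows, columns, channels),
# i.e., (height, width, depth)
(height, width, depth) = image.shape
print(height, width, depth)
```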

Given this knowledge, we can think of an image as a big matrix and a kernel (or convolutional matrix) as a tiny matrix that is used for blurring, sharpening, edge detection, and other image processing functions.

Essentially, this tiny kernel sits on top of the big image and slides from left-to-right and top-to-bottom, applying a mathematical operation (i.e., a convolution) at each (x, y)-coordinate of the original image.

It’s normal to hand-define kernels to obtain various image processing functions. In fact, you might already be familiar with blurring (average smoothing, Gaussian smoothing, etc.), edge detection (Laplacian, Sobel, Scharr, Prewitt, etc.), and sharpening. All of these operations are forms of hand-defined kernels that are specifically designed to perform a particular function.

So that raises the question, is there a way to automatically learn these types of filters? And even use these filters for image classification and object detection?

You bet there is.

But before we get there, we need to understand kernels and convolutions a bit more.


Again, let’s think of an image as a big matrix and a kernel as a tiny matrix (at least with respect to the original “big matrix” image):

Figure 1: A kernel is a small matrix that slides from left-to-right and top-to-bottom across a larger image. At each pixel in the input image, the neighborhood of the image is convolved with the kernel and the output stored. Source: PyImageSearch Gurus

As the figure above demonstrates, we are sliding the kernel from left-to-right and top-to-bottom along the original image.

At each (x, y)-coordinate of the original image, we stop and examine the neighborhood of pixels centered under the kernel. We then take this neighborhood of pixels, convolve it with the kernel, and obtain a single output value. This output value is then stored in the output image at the same (x, y)-coordinates as the center of the kernel.

If this sounds confusing, no worries, we’ll be reviewing an example in the “Understanding Image Convolutions” section later in this blog post.

But before we dive into an example, let’s first take a look at what a kernel looks like:

Figure 2: A 3 x 3 kernel that can be convolved with an image using OpenCV and Python. Source: PyImageSearch Gurus

Above we have defined a square 3 x 3 kernel (any guesses on what this kernel is used for?)

Kernels can be an arbitrary size of M x N pixels, provided that both M and N are odd integers.

Note: Most kernels you’ll typically see are actually square N x N matrices.

We use an odd kernel size to ensure there is a valid integer (x, y)-coordinate at the center of the kernel:

Figure 3: A 3 x 3 kernel with a valid integer center (x, y)-coordinate (left). A 2 x 2 kernel without a valid integer (x, y)-center (right). Source: PyImageSearch Gurus

On the left, we have a 3 x 3 matrix. The center of the matrix is obviously located at x=1, y=1 where the top-left corner of the matrix is used as the origin and our coordinates are zero-indexed.

But on the right, we have a 2 x 2 matrix. The center of this matrix would be located at x=0.5, y=0.5. But as we know, without applying interpolation, there is no such thing as pixel location (0.5, 0.5) — our pixel coordinates must be integers! This reasoning is exactly why we use odd kernel sizes — to always ensure there is a valid (x, y)-coordinate at the center of the kernel.
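The same reasoning can be written down in a few lines. This is only a sketch; kernel_center is a hypothetical helper name, not something from OpenCV or NumPy:

```python
def kernel_center(kW, kH):
    """Return the zero-indexed (x, y) center of a kW x kH kernel."""
    # an even dimension would put the center "between" two pixels
    if kW % 2 == 0 or kH % 2 == 0:
        raise ValueError("kernel dimensions must be odd to have an integer center")
    return ((kW - 1) // 2, (kH - 1) // 2)

print(kernel_center(3, 3))  # (1, 1)
```

Calling kernel_center(2, 2) raises a ValueError, which mirrors the (0.5, 0.5) problem described above.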

Understanding Image Convolutions

Now that we have discussed the basics of kernels, let’s talk about a mathematical term called convolution.

In image processing, a convolution requires three components:

  1. An input image.
  2. A kernel matrix that we are going to apply to the input image.
  3. An output image to store the output of the input image convolved with the kernel.

Convolution itself is actually very easy. All we need to do is:

  1. Select an (x, y)-coordinate from the original image.
  2. Place the center of the kernel at this (x, y)-coordinate.
  3. Take the element-wise multiplication of the input image region and the kernel, then sum up the values of these multiplication operations into a single value. The sum of these multiplications is called the kernel output.
  4. Use the same (x, y)-coordinates from Step #1, but this time, store the kernel output at that (x, y)-location in the output image.
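The four steps above can be sketched for a single location as follows. This is a minimal illustration under a few assumptions: convolve_at is a hypothetical helper, the kernel has odd dimensions, and the chosen (x, y) is far enough from the border that the kernel fits entirely inside the image:

```python
import numpy as np

def convolve_at(image, kernel, x, y):
    """Compute the kernel output for one (x, y)-coordinate."""
    (kH, kW) = kernel.shape
    padY, padX = (kH - 1) // 2, (kW - 1) // 2
    # Step 2: center the kernel at (x, y) by slicing out its neighborhood
    roi = image[y - padY:y + padY + 1, x - padX:x + padX + 1]
    # Step 3: element-wise multiplication followed by a sum
    return float((roi * kernel).sum())

# illustrative 5 x 5 "image" and a 3 x 3 averaging kernel
image = np.arange(25, dtype="float32").reshape(5, 5)
kernel = np.ones((3, 3), dtype="float32") / 9.0
output = np.zeros_like(image)

# Step 1: pick a coordinate; Step 4: store the result at the same place
(x, y) = (2, 2)
output[y, x] = convolve_at(image, kernel, x, y)
print(output[y, x])  # approximately 12.0
```

Note that NumPy indexes arrays as [row, column], which is why the code writes output[y, x] rather than output[x, y].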

Below you can find an example of convolving (denoted mathematically as the “*” operator) a 3 x 3 region of an image with a 3 x 3 kernel used for blurring:

Figure 4: Convolving a 3 x 3 input image region with a 3 x 3 kernel used for blurring. Source: PyImageSearch Gurus


Figure 5: The output of the convolution operation is stored in the output image. Source: PyImageSearch Gurus

After applying this convolution, we would set the pixel located at the coordinate (i, j) of the output image O to O_i,j = 126.

That’s all there is to it!

Convolution is simply the sum of the element-wise multiplication between the kernel and the neighborhood of the input image that the kernel covers.
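One technical footnote: in the strict mathematical definition, convolution flips the kernel horizontally and vertically before the multiply-and-sum, and the unflipped version described here is usually called cross-correlation. For symmetric kernels (such as an averaging blur) the two give identical results, so the distinction rarely matters in practice. The sketch below, with made-up values, shows where it does matter:

```python
import numpy as np

# illustrative 3 x 3 image region
roi = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]], dtype="float64")

# an asymmetric kernel (Sobel-x) and a symmetric one (box blur)
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype="float64")
box = np.ones((3, 3), dtype="float64") / 9.0

def multiply_and_sum(region, kernel):
    return float((region * kernel).sum())

# unflipped (cross-correlation) vs. flipped on both axes (strict convolution)
print(multiply_and_sum(roi, sobel_x))              # 8.0
print(multiply_and_sum(roi, sobel_x[::-1, ::-1]))  # -8.0

# for the symmetric box kernel, flipping changes nothing
print(multiply_and_sum(roi, box) == multiply_and_sum(roi, box[::-1, ::-1]))  # True
```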

Implementing Convolutions with OpenCV and Python

That was fun discussing kernels and convolutions — but now let’s move on to looking at some actual code to ensure you understand how kernels and convolutions are implemented. This source code will also help you understand how to apply convolutions to images.

Open up a new Python file and let’s get to work:

We start on Lines 2-5 by importing our required Python packages. You should already have NumPy and OpenCV installed on your system, but you might not have scikit-image installed. To install scikit-image, just use pip (e.g., pip install scikit-image).

Next, we can start defining our custom convolve method:

The convolve function requires two parameters: the (grayscale) image that we want to convolve and the kernel to convolve it with.

Given both our image and kernel (which we presume to be NumPy arrays), we then determine the spatial dimensions (i.e., width and height) of each (Lines 10 and 11).

Before we continue, it’s important to understand that the process of “sliding” a convolutional matrix across an image, applying the convolution, and then storing the output will actually decrease the spatial dimensions of our output image.

Why is this?

Recall that we “center” our computation around the (x, y)-coordinate of the input image that the kernel is currently positioned over. For pixels that fall along the border of the image, the kernel would extend past the image boundary, so there is no complete neighborhood to center on. The decrease in spatial dimension is simply a side effect of applying convolutions to images. Sometimes this effect is desirable and other times it’s not; it simply depends on your application.

However, in most cases, we want our output image to have the same dimensions as our input image. To ensure this, we apply padding (Lines 16-19). Here we are simply replicating the pixels along the border of the image, such that the output image will match the dimensions of the input image.

Other padding methods exist, including zero padding (filling the borders with zeros — very common when building Convolutional Neural Networks) and wrap around (where the border pixels are determined by examining the opposite end of the image). In most cases, you’ll see either replicate or zero padding.
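Here is a rough sketch of the three padding styles. To keep it dependency-light it uses NumPy's np.pad as a stand-in; with OpenCV, replicate padding is typically done with cv2.copyMakeBorder and the cv2.BORDER_REPLICATE flag. The 4 x 4 image is made up for illustration:

```python
import numpy as np

image = np.arange(16, dtype="float32").reshape(4, 4)
(iH, iW) = image.shape
kW = kH = 3

# without padding, only positions where the kernel fits are valid,
# so the output shrinks
print((iH - kH + 1, iW - kW + 1))  # (2, 2)

# integer division keeps pad an int (slicing and padding need ints)
pad = (kW - 1) // 2

replicate = np.pad(image, pad, mode="edge")      # repeat the border pixels
zeros = np.pad(image, pad, mode="constant", constant_values=0)  # zero padding
wrap = np.pad(image, pad, mode="wrap")           # take pixels from the opposite side

# each padded image is (iH + 2 * pad) x (iW + 2 * pad), so sliding a
# 3 x 3 kernel over it yields an output the same size as the input
print(replicate.shape)  # (6, 6)
```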

We are now ready to apply the actual convolution to our image:

Lines 24 and 25 loop over our image, “sliding” the kernel from left-to-right and top-to-bottom 1 pixel at a time.

Line 29 extracts the Region of Interest (ROI) from the image using NumPy array slicing. The roi will be centered around the current (x, y)-coordinates of the image. The roi will also have the same size as our kernel, which is critical for the next step.

Convolution is performed on Line 34 by taking the element-wise multiplication between the roi and kernel, followed by summing the entries in the matrix.

The output value k is then stored in the output array at the same (x, y)-coordinates (relative to the input image).
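Putting the padding and the sliding loop together, a sketch of the core of such a function might look like the following. It is not a reproduction of the original listing (so the line numbers quoted in the text will not match): convolve_float is a hypothetical name, np.pad stands in for OpenCV's border function, and the kernel is assumed to be square with an odd size:

```python
import numpy as np

def convolve_float(image, kernel):
    """Slide a square, odd-sized kernel over a grayscale image."""
    (iH, iW) = image.shape[:2]
    (kH, kW) = kernel.shape[:2]

    # replicate-pad the borders so the output keeps the input's size
    pad = (kW - 1) // 2
    padded = np.pad(image, pad, mode="edge").astype("float32")
    output = np.zeros((iH, iW), dtype="float32")

    # slide left-to-right and top-to-bottom, one pixel at a time
    for y in np.arange(pad, iH + pad):
        for x in np.arange(pad, iW + pad):
            # the ROI is centered on (x, y) and matches the kernel size
            roi = padded[y - pad:y + pad + 1, x - pad:x + pad + 1]
            # element-wise multiplication, then a sum
            k = (roi * kernel).sum()
            # store at the same coordinates relative to the unpadded input
            output[y - pad, x - pad] = k

    return output

# a flat image stays flat under an averaging kernel
flat = np.full((5, 5), 10, dtype="uint8")
blur = np.ones((3, 3), dtype="float32") / 9.0
print(convolve_float(flat, blur).shape)  # (5, 5)
```

The double Python loop is easy to read but slow on large images, which is one reason to prefer an optimized routine once the idea is clear.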

We can now finish up our convolve method:

When working with images, we typically deal with pixel values falling in the range [0, 255]. However, when applying convolutions, we can easily obtain values that fall outside this range.

In order to bring our output image back into the range [0, 255], we apply the rescale_intensity function of scikit-image (Line 41). We also convert our image back to an unsigned 8-bit integer data type on Line 42 (previously, the output image was a floating point type in order to handle pixel values outside the range [0, 255]).

Finally, the output image is returned to the calling function on Line 45.
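For readers without scikit-image at hand, the final step can be approximated in plain NumPy. This is a simplified stand-in rather than the rescale_intensity call itself: it assumes the goal is simply to clamp values to [0, 255] before the cast, and to_uint8 is a hypothetical helper name:

```python
import numpy as np

def to_uint8(output):
    """Clamp a float array to [0, 255] and convert it to 8-bit."""
    # values below 0 become 0, values above 255 become 255
    clipped = np.clip(output, 0, 255)
    # the cast drops the fractional part
    return clipped.astype("uint8")

raw = np.array([-5.0, 100.7, 300.0], dtype="float32")
print(to_uint8(raw))  # [  0 100 255]
```

Skipping the clamp and casting directly would let out-of-range values wrap around, which tends to show up as speckled artifacts in the output image.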

Now that we’ve defined our convolve function, let’s move on to the driver portion of the script. This section of our program will handle parsing command line arguments, defining a series of kernels we are going to apply to our image, and then displaying the output results:

Lines 48-51 handle parsing our command line arguments. We only need a single argument here, --image, which is the path to our input image.
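A minimal sketch of that argument parsing with the standard library's argparse is shown below. The short flag -i, the help text, and the example.jpg file name are assumptions for illustration:

```python
import argparse

# a single required argument: the path to the input image
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to the input image")

# an explicit argument list is passed here so the sketch runs anywhere;
# in a real script, call ap.parse_args() with no arguments to read the
# actual command line
args = vars(ap.parse_args(["--image", "example.jpg"]))
print(args["image"])  # example.jpg
```

If the argument is left out, argparse prints a usage message and exits with an error saying the argument is required.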

We then move on to Lines 54 and 55 which define a 7 x 7 kernel and a 21 x 21 kernel used to blur/smooth an image. The larger the kernel is, the more the image will be blurred. Examining this kernel, you can see that the output of applying the kernel to an ROI will simply be the average of the input region.
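As a sketch, averaging kernels of these sizes can be built by filling a matrix with ones and dividing by the number of entries. The name smallBlur appears later in this post; largeBlur is assumed by analogy:

```python
import numpy as np

# every entry is 1 / (number of entries), so the entries sum to 1
smallBlur = np.ones((7, 7), dtype="float") * (1.0 / (7 * 7))
largeBlur = np.ones((21, 21), dtype="float") * (1.0 / (21 * 21))

# applying the kernel to a region yields the average of that region
roi = np.arange(49, dtype="float").reshape(7, 7)
print((roi * smallBlur).sum(), roi.mean())  # both approximately 24.0
```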

We define a sharpening kernel on Lines 58-61, used to enhance line structures and other details of an image. Explaining each of these kernels in detail is outside the scope of this tutorial, so if you’re interested in learning more about kernel construction, I would suggest starting here and then playing around with the excellent kernel visualization tool on
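For reference, one commonly used 3 x 3 sharpening kernel is sketched below. Treat the exact values as an assumption, since the original listing is not reproduced here:

```python
import numpy as np

# the center pixel is boosted while its 4-connected neighbors are subtracted
sharpen = np.array((
    [0, -1, 0],
    [-1, 5, -1],
    [0, -1, 0]), dtype="int")

# the entries sum to 1, so a perfectly flat region is left unchanged
flat = np.full((3, 3), 100)
print((flat * sharpen).sum())  # 100
```

Where the center pixel differs from its neighbors, that difference is amplified, which is what makes edges and fine details stand out.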

Let’s define a few more kernels:

Lines 65-68 define a Laplacian operator that can be used as a form of edge detection.

Note: The Laplacian is also very useful for detecting blur in images.

Finally, we’ll define two Sobel filters on Lines 71-80. The first (Lines 71-74) responds to intensity changes along the x-axis and is used to find vertical edges. Similarly, Lines 77-80 construct a filter that responds to intensity changes along the y-axis and is used to find horizontal edges.
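The standard textbook forms of these three kernels are sketched below, together with a tiny made-up region containing a vertical edge. The variable names are assumptions, and sign conventions for these kernels vary between references:

```python
import numpy as np

# 4-neighbor Laplacian
laplacian = np.array((
    [0, 1, 0],
    [1, -4, 1],
    [0, 1, 0]), dtype="int")

# Sobel x-axis kernel (responds to left-to-right intensity changes)
sobelX = np.array((
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1]), dtype="int")

# Sobel y-axis kernel (responds to top-to-bottom intensity changes)
sobelY = np.array((
    [-1, -2, -1],
    [0, 0, 0],
    [1, 2, 1]), dtype="int")

# a region with a vertical edge: dark on the left, bright on the right
edge = np.array((
    [0, 0, 255],
    [0, 0, 255],
    [0, 0, 255]), dtype="int")

print((edge * sobelX).sum())  # 1020 (strong response)
print((edge * sobelY).sum())  # 0 (no response)
```

All three kernels sum to zero, so flat regions produce no response and only intensity changes show up in the output.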

Given all these kernels, we lump them together into a set of tuples called a “kernel bank”:
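A kernel bank along these lines can be sketched as a tuple of (name, kernel) pairs. The kernelBank name is used later in the text; the string labels and the particular kernels included here are illustrative assumptions:

```python
import numpy as np

# a few kernels to put in the bank (see the sections above)
smallBlur = np.ones((7, 7), dtype="float") * (1.0 / (7 * 7))
sharpen = np.array(([0, -1, 0], [-1, 5, -1], [0, -1, 0]), dtype="int")
laplacian = np.array(([0, 1, 0], [1, -4, 1], [0, 1, 0]), dtype="int")

# each entry pairs a human-readable name with its kernel
kernelBank = (
    ("small_blur", smallBlur),
    ("sharpen", sharpen),
    ("laplacian", laplacian),
)

# looping over the bank gives us the name and the kernel together
for (kernelName, kernel) in kernelBank:
    print(kernelName, kernel.shape)
```

Keeping the name next to the kernel makes it easy to label each result when the outputs are displayed.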

Finally, we are ready to apply our kernelBank to our --image input:

Lines 95 and 96 load our image from disk and convert it to grayscale. Convolution operators can certainly be applied to RGB (or other multi-channel) images, but for the sake of simplicity in this blog post, we’ll only apply our filters to grayscale images.

We start looping over our set of kernels in the kernelBank on Line 99 and then apply the current kernel to the gray image on Line 104 by calling our custom convolve method which we defined earlier.

As a sanity check, we also call cv2.filter2D, which also applies our kernel to the gray image. The cv2.filter2D function is a much more optimized version of our convolve function. The main reason I included the implementation of convolve in this blog post is to give you a better understanding of how convolutions work under the hood.

Finally, Lines 108-112 display the output images to our screen.

Example Convolutions with OpenCV and Python

Today’s example image comes from a photo I took a few weeks ago at my favorite bar in South Norwalk, CT: Cask Republic. In this image you’ll see a glass of my favorite beer (Smuttynose Finestkind IPA) along with three 3D-printed Pokemon from the (unfortunately, now closed) Industrial Chimp shop:

Figure 6: The example image we are going to apply our convolutions to.

To run our script, just issue the following command:

You’ll then see the results of applying our smallBlur kernel to the input image:

Figure 7: Applying a small blur convolution with our “convolve” function and then validating it against the results of OpenCV’s “cv2.filter2D” function.

On the left, we have our original image. Then in the center we have the results from the convolve function. And on the right, the results from cv2.filter2D. As the results demonstrate, our output matches cv2.filter2D, indicating that our convolve function is working properly. Furthermore, our original image now appears “blurred” and “smoothed”, thanks to the smoothing kernel.

Next, let’s apply a larger blur:

Figure 8: As we convolve our image with a larger smoothing kernel, our image becomes more blurred.

Comparing Figure 7 and Figure 8, notice how as the size of the averaging kernel increases, the amount of blur in the output image increases as well.

We can also sharpen our image:

Figure 9: Using a sharpening kernel enhances edge-like structures and other details in our image.

Let’s compute edges using the Laplacian operator:

Figure 10: Applying the Laplacian operator via convolution with OpenCV and Python.

Find vertical edges with the Sobel operator:

Figure 11: Utilizing the Sobel-x kernel to find vertical edges.

And find horizontal edges using Sobel as well:

Figure 12: Finding horizontal gradients in an image using the Sobel-y operator and convolutions.

The Role of Convolutions in Deep Learning

As you’ve gathered through this blog post, we must hand-define each of our kernels for applying various operations such as smoothing, sharpening, and edge detection.

That’s all fine and good, but what if there was a way to learn these filters instead? Is it possible to define a machine learning algorithm that can look at images and eventually learn these types of operators?

In fact, there is. These types of algorithms are a sub-type of Neural Networks called Convolutional Neural Networks (CNNs). By applying convolutional filters, nonlinear activation functions, pooling, and backpropagation, CNNs are able to learn filters that can detect edges and blob-like structures in lower-level layers of the network, and then use the edges and structures as building blocks, eventually detecting higher-level objects (e.g., faces, cats, dogs, cups, etc.) in the deeper layers of the network.

Exactly how do CNNs do this?

I’ll show you — but it will have to wait for another few blog posts until we cover enough basics.


Summary

In today’s blog post, we discussed image kernels and convolutions. If we think of an image as a big matrix, then an image kernel is just a tiny matrix that sits on top of the image.

This kernel then slides from left-to-right and top-to-bottom, computing the sum of element-wise multiplications between the input image and the kernel along the way — we call this value the kernel output. The kernel output is then stored in an output image at the same (x, y)-coordinates as the input image (after accounting for any padding to ensure the output image has the same dimensions as the input).

Given our newfound knowledge of convolutions, we defined an OpenCV and Python function to apply a series of kernels to an image. These operators allowed us to blur an image, sharpen it, and detect edges.

Finally, we briefly discussed the roles kernels/convolutions play in deep learning, specifically Convolutional Neural Networks, and how these filters can be learned automatically instead of needing to manually define them first.

In next week’s blog post, I’ll be showing you how to train your first Convolutional Neural Network from scratch using Python.


56 Responses to Convolutions with OpenCV and Python

  1. Winston Chen July 25, 2016 at 2:57 pm #

    Thanks for sharing the concept of Convolution. Very clear introductions and simple examples.

    • Adrian Rosebrock July 27, 2016 at 2:30 pm #

      No problem, I’m happy I could help introduce the topic Winston! 🙂

  2. Juan Tapia July 25, 2016 at 10:03 pm #

    Dear Adria

    I have a Doubt with this!

    This procedure describe the correlation between matrix and not the convolution.
    In a convolution we have a minus sign in the middle of the equation, thus we need to turn and swipe the second matrix. Make sense for you?

    • Adrian Rosebrock July 27, 2016 at 2:26 pm #

      Hey Juan — thanks for the comment, although I’m not sure I understand your question. Can you please elaborate?

      • buchtak July 31, 2016 at 4:59 am #

        Juan is right. When you’re doing convolution, you’re supposed to flip the kernel both horizontally and vertically in the case of 2D images. Hence the minus sign. It obviously doesn’t matter for symmetric kernels like averaging etc., but in general it can lead to nasty bugs, for example when trying to accelerate the computation using the convolution theorem and FFT.

        On the other hand, as far as I’m aware, Caffe framework also only performs correlation in their convolutional layers, while several other libraries do it by the book. So, be aware of these things when trying to convert pre-trained models for instance…

        • Adrian Rosebrock July 31, 2016 at 10:35 am #

          Oh I see — now I understand the question. Thanks for the clarification buchtak.

  3. DD July 25, 2016 at 11:28 pm #

    Thanks for this beautifully written post. It very well explains the concept in a simple language. Code example and visuals are real bonus. Keep up the good work.


    • Adrian Rosebrock July 27, 2016 at 2:26 pm #

      Thanks, I’m happy I could help!

  4. vani July 26, 2016 at 7:31 am #

    great job ..

    • Adrian Rosebrock July 27, 2016 at 2:00 pm #

      Thank you Vani.

  5. Kenny July 27, 2016 at 11:57 am #

    Cool stuff Adrian 😉 A pleasure to read your enthusiasm and excitement 🙂 Keep going!

    • Adrian Rosebrock July 27, 2016 at 1:53 pm #

      Thanks Kenny! 🙂

      • bashir August 21, 2017 at 7:59 pm #

        please tell me which method is the best for detect any object into the large image by using NNs?

        • Adrian Rosebrock August 22, 2017 at 10:47 am #

          Which method is “best” really depends on your application and what you’re actually trying to detect. I would suggest looking into popular object detection frameworks such as YOLO, Faster R-CNNs, and SSDs.

  6. Ian August 29, 2016 at 7:23 pm #

    Dear Adrian,

    These examples require the skimage library. Is it possible to install that library on the Raspberry Pi 3 model B?


    – Ian

    • Adrian Rosebrock August 30, 2016 at 12:44 pm #

      Yes, please refer to the scikit-image documentation.

  7. amrosik September 14, 2016 at 3:29 am #

    applying a laplacian operation twice, does that correspond to a sqared-laplacian operator?

    • Adrian Rosebrock September 15, 2016 at 9:35 am #

      Why would you take the Laplacian of the Laplacian? Is there a particular reason you need to do that?

  8. Hygo Oliveira September 23, 2016 at 8:42 pm #

    Wonderful tutorial. It helped me very much.

    • Adrian Rosebrock September 27, 2016 at 8:54 am #

      Thanks Hygo, I’m glad it was able to help you understand convolutions 🙂

  9. Luis José September 26, 2016 at 9:23 am #

    Hi Adrian Klose,

    Nice tutorial! I wonder I you have experience in performing the opposite operation: deconvolution.

    Thanks again for sharing your knowledge to the world!


  10. pyofey October 11, 2016 at 4:54 am #

    Nice post Adrian,

    I wanted to know if there is some method to intuitively de-blur blurred images. Like of course we need to de-convolve with the blur causing kernel but in most practical scenarios we dont know that kernel and resort to brute-force blind de-convolution.

    So can we perform blind deconvolution using (say) some ML algorithm?

    • Adrian Rosebrock October 11, 2016 at 12:51 pm #

      It really depends on the level of which you are trying to deblur the image. Applying deblurring using a simple kernel is unlikely to give you ideal results. The current state-of-the-art involves applying machine learning to deblur images. Here is a link to a recent NIPS paper so you can learn more about the topic.

  11. Atti November 29, 2016 at 5:21 am #

    Thanks for a great post Adrian,

    I encountered a small issue with one of the snippets. I had to convert pad to an int since cv2.copyMakeBorder expects ints as paddings. Thought i`d let you know.

    # allocate memory for the output image, taking care to
    # “pad” the borders of the input image so the spatial
    # size (i.e., width and height) are not reduced
    pad = int((kW - 1) / 2)
    image = cv2.copyMakeBorder(image, pad, pad, pad, pad,

    • Adrian Rosebrock November 29, 2016 at 7:55 am #

      Thanks for sharing Atti! Just to clarify, were you using Python 2.7 or Python 3?

      • Joel January 1, 2017 at 2:15 pm #

        I encountered the same, using Python 3.5.2, opencv 3.1.0. I applied the same fix as Atti.

        Thank you for the great blog!

        • Joel January 1, 2017 at 3:05 pm #

          I used Anaconda 3 to make the whole installation process simpler. Using Anaconda has the added bonus of a more consistent experience between Linux and Win10.

    • oleg August 24, 2017 at 9:52 am #

      Atti, thank you for your message. I also had a problem with this code, but I added int (pad = int((kW - 1) / 2)) as you wrote and the code works. Thank you.

  12. Lugia February 7, 2017 at 5:57 pm #

    Thanks for sharing the post. I’ve subscribed one of your book and really like it.

    • Adrian Rosebrock February 10, 2017 at 2:15 pm #

      Thanks for picking up a copy Lugia, I appreciate it!

  13. zoya February 24, 2017 at 12:30 pm #

    sir, i encountered this error while running that code… can u help me through this

    ubuntu@ubuntu-Inspiron-5559:~/myproject$ python
    usage: [-h] -i IMAGE error: argument -i/--image is required

    • Adrian Rosebrock February 27, 2017 at 11:24 am #

      You need to supply the --image command line argument to the script. I would suggest you read up on command line arguments before continuing.

  14. Gero Noerenberg March 1, 2017 at 7:15 am #

    Hi Adrian,
    wonderful tutorial as all your posts! Thank you.
    I need help with an issue I’m running in:
    Applying the sharpening filter the call to cv2.filter2D(gray, -1, kernel) run into an exception:

    cv2.error: C:\slave\WinInstallerMegaPack\src\opencv\modules\imgproc\src\templmatch.cpp:61: error: (-215) depth == tdepth || tdepth == CV_32F

    would be great to get an hint how to solve this.
    Many thanks

    • Adrian Rosebrock March 2, 2017 at 6:51 am #

      Is it only the sharpening kernel? Or all other kernels?

  15. Shadab April 1, 2017 at 6:45 am #

    Hi Adrian! Thanks for the amazing post. Although I am a little stuck on the range of ‘for’ loops in convolve function.

    Instead of , for e.g. for ‘x’, np.arange(pad, iW + pad), shoudn’t it be just np.arange(pad, iW) since while cutting out the ROI you are considering the extra pad width ( by adding ‘pad’ value to x ) ?

    Thank you.

    • Adrian Rosebrock April 3, 2017 at 2:11 pm #

      If I understand your question correctly, the np.arange function is non-inclusive on the upper end, hence we add the extra pad value.

  16. Yashaswini April 16, 2017 at 7:49 am #

    Thanks for the detailed and clear explanation. I have to define a kernel for a specific template (a part of the image ) and match it with a series of other images. When I do so, The shapes of the kernel and images are not the same. It pops the error message saying “Operands could not be broadcast together ” . Is there a different kind of padding that i should follow?

    Thanks in advance 🙂

    • Adrian Rosebrock April 16, 2017 at 8:49 am #

      The shapes of the kernel and image shouldn’t be the same since the kernel essentially slides across the input image. It sounds like you’re not extracting the ROI of the input image correctly before applying the kernel. If the input region is smaller than the kernel size, simply pad the input ROI.

  17. mashariki September 5, 2017 at 6:22 pm #

    Thanks a lot for demystifying these hard topics.

  18. UnlimitedJava November 7, 2017 at 2:14 am #

    Thanks for your sharing good information.
    I have tested this source code for height 1640, width 1190 bitmap image.
    It’s too slow in my VirtualBox Ubuntu 16.04.
    Thank you.

    • Adrian Rosebrock November 9, 2017 at 7:04 am #

      We normally don’t process images larger than 600px along its maximum dimension (unless we are applying a specific technique that is geared towards large images). Resize your image and it will run significantly faster.

  19. Kesava Prasad November 7, 2017 at 11:13 pm #

    Hi Adrian,

    Great post. BTW, to find scratches from an image (of a metal part) is it a good idea to use convolution?

    • Adrian Rosebrock November 9, 2017 at 6:44 am #

      That really depends on your input images. I’ve never tried to detect scratches on metal but I imagine you might be able to (1) devise a kernel that reveals scratch-like regions or (2) train a network that learns a set of filters that activates under scratch regions.

  20. Robert Maria December 12, 2017 at 1:51 am #

    Hi Adrian!

    Thank you for this post! Could you please help me understand how 3D convolutions store color information? Am I able to detect green cats from RGB images if my first convolutional layer uses 3D filters? Or do we use 3D filters to capture information related to shape, edges?

    Thank you,

    • Adrian Rosebrock December 12, 2017 at 9:02 am #

      I assume you are referring to deep learning in which case the convolutions are learned from your input images. If your input images contain green cats then the lower layers of the network will learn color blobs and edge-like regions. Mid-layers of the network combine this information to form contours, outlines, and intersections. The highest layers of the network start to form these semantic concepts such as “cat”, “dog”, etc. CNNs are able to encode color information starting from the input layer.

  21. Jan December 15, 2017 at 4:18 pm #

    Thank you for this post! could you please help me how to apply convolution to apply “Directional Weighted Median Filter”. which uses the absolute sum of differences between center pixel and pixels aligned in four main direction, to detect Random valued noise.

  22. Foad August 10, 2018 at 6:14 pm #

    I’m also trying to implement the convolution of arbitrary shaped ndarrays in NumPy here:
    you may wanna take a look.

  23. tool_elucidator November 6, 2018 at 8:49 am #

    Hello sir, ia have a question do you know how the inbuilt convolution function performs this operation ?

    I mean the cv2.filter2D or the cuda::createConvolution ?

    Thanks a lot sir your post solve all my questions about convolution

  24. Sajid Iqbal October 20, 2019 at 5:08 pm #

    I am trying to get convolution output using OpenCV filter2D method and using small matrices however the output given by scipy.signal.convolve2D is correct but cv.filter2d is not correct. Can you please explain. Best

  25. Yeshwanth Sai Neeli November 12, 2019 at 4:58 pm #

    Hi Adrian,

    Thank you so much for your detailed explanations. I am using kernels of size 49×49 (from L-M filterbank) on images of size 4800 x 3200. But for some reason I am getting images that are all black. Are there any changes that I have to make in the code to get this working. Kindly let me know what you think could be the problem.


    • Yeshwanth Sai Neeli November 12, 2019 at 5:02 pm #

      I also read the images to read the pixel values. They are almost close to 0. The range of the values is from 15-20.

    • Adrian Rosebrock November 14, 2019 at 9:20 am #

      Check your data type. Are you using a floating point type? If so, convert back to uint8 (which is what OpenCV is expecting).

      • Yeshwanth Sai Neeli November 14, 2019 at 11:22 am #

        I am using opencv to read the input image and it is taken as uint8. So, I don’t think that is the problem. The output from your convolve function and the filter2D function from Opencv are different for my images. I am not sure what I am doing wrong.

      • Yeshwanth Sai Neeli November 20, 2019 at 6:24 pm #

        Do kernels have to be of dtype int?

