4 Point OpenCV getPerspectiveTransform Example


4:18am. Alarm blaring. Still dark outside. The bed is warm. And the floor will feel so cold on my bare feet.

But I got out of bed. I braved the morning, and I took the ice cold floor on my feet like a champ.

Why?

Because I’m excited.

Excited to share something very special with you today…

You see, over the past few weeks I’ve gotten some really great emails from fellow PyImageSearch readers. These emails were short, sweet, and to the point. They were simple “thank you’s” for posting actual, honest-to-goodness Python and OpenCV code that you could take and use to solve your own computer vision and image processing problems.

And upon reflection last night, I realized that I’m not doing a good enough job sharing the libraries, packages, and code that I have developed for myself for everyday use — so that’s exactly what I’m going to do today.

In this blog post I’m going to show you the functions in my transform.py module. I use these functions whenever I need to do a 4 point cv2.getPerspectiveTransform using OpenCV.

And I think you’ll find the code in here quite interesting…and you’ll even be able to utilize it in your own projects.

So read on. And check out my 4 point OpenCV cv2.getPerspectiveTransform example.


OpenCV and Python versions:
This example will run on Python 2.7/Python 3.4+ and OpenCV 2.4.X/OpenCV 3.0+.

4 Point OpenCV getPerspectiveTransform Example

You may remember back to my posts on building a real-life Pokedex, specifically, my post on OpenCV and Perspective Warping.

In that post I mentioned how you could use a perspective transform to obtain a top-down, “birds eye view” of an image — provided that you could find reference points, of course.

This post will continue the discussion on the top-down, “birds eye view” of an image. But this time I’m going to share with you personal code that I use every single time I need to do a 4 point perspective transform.

So let’s not waste any more time. Open up a new file, name it transform.py, and let’s get started.
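
Here’s a sketch of the first half of transform.py, covering our imports and the order_points function (a reconstruction matching the walkthrough below; the line numbers cited in the text refer to this listing, counting the first comment as Line 1):

# import the necessary packages
import numpy as np
import cv2

def order_points(pts):
    # initialize a list of coordinates that will be ordered
    # such that the first entry in the list is the top-left,
    # the second entry is the top-right, the third is the
    # bottom-right, and the fourth is the bottom-left
    rect = np.zeros((4, 2), dtype = "float32")

    # the top-left point will have the smallest sum, whereas
    # the bottom-right point will have the largest sum
    s = pts.sum(axis = 1)
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]

    # now, compute the difference between the points: the
    # top-right point will have the smallest difference,
    # whereas the bottom-left will have the largest difference
    diff = np.diff(pts, axis = 1)
    rect[1] = pts[np.argmin(diff)]
    rect[3] = pts[np.argmax(diff)]

    # return the ordered coordinates
    return rect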

We’ll start off by importing the packages we’ll need: NumPy for numerical processing and cv2  for our OpenCV bindings.

Next up, let’s define the order_points function on Line 5. This function takes a single argument, pts, which is a list of four points specifying the (x, y) coordinates of each point of the rectangle.

It is absolutely crucial that we have a consistent ordering of the points in the rectangle. The actual ordering itself can be arbitrary, as long as it is consistent throughout the implementation.

Personally, I like to specify my points in top-left, top-right, bottom-right, and bottom-left order.

We’ll start by allocating memory for the four ordered points on Line 10.

Then, we’ll find the top-left point, which will have the smallest x + y sum, and the bottom-right point, which will have the largest x + y sum. This is handled on Lines 14-16.

Of course, now we’ll have to find the top-right and bottom-left points. Here we’ll take the difference (i.e. y - x, which is what np.diff returns) between the points using the np.diff function on Line 21.

The coordinates associated with the smallest difference will be the top-right point, whereas the coordinates with the largest difference will be the bottom-left point (Lines 22 and 23).

Finally, we return our ordered points to the calling function on Line 26.

Again, I can’t stress enough how important it is to maintain a consistent ordering of points.

And you’ll see exactly why in this next function:
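
(A reconstruction matching the walkthrough below; the line numbering continues from the listing above, so the function definition falls on Line 28.)

def four_point_transform(image, pts):
    # obtain a consistent order of the points and unpack them
    # individually
    rect = order_points(pts)
    (tl, tr, br, bl) = rect

    # compute the width of the new image, which will be the
    # maximum distance between bottom-right and bottom-left
    # x-coordinates or the top-right and top-left x-coordinates
    widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
    maxWidth = max(int(widthA), int(widthB))

    # compute the height of the new image, which will be the
    # maximum distance between the top-right and bottom-right
    # y-coordinates or the top-left and bottom-left y-coordinates
    heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    maxHeight = max(int(heightA), int(heightB))

    # now that we have the dimensions of the new image, construct
    # the set of destination points to obtain a "birds eye view"
    # (i.e. top-down view) of the image, again specifying points
    # in the top-left, top-right, bottom-right, and bottom-left
    # order
    dst = np.array([
        [0, 0],
        [maxWidth - 1, 0],
        [maxWidth - 1, maxHeight - 1],
        [0, maxHeight - 1]], dtype = "float32")

    # compute the perspective transform matrix and then apply it
    M = cv2.getPerspectiveTransform(rect, dst)
    warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))

    # return the warped image
    return warped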

We start off by defining the four_point_transform function on Line 28, which requires two arguments: image and pts.

The image variable is the image we want to apply the perspective transform to. And the pts list is the list of four points that contains the ROI of the image we want to transform.

We make a call to our order_points function on Line 31, which places our pts variable in a consistent order. We then unpack these coordinates on Line 32 for convenience.

Now we need to determine the dimensions of our new warped image.

We determine the width of the new image on Lines 37-39, where the width is the largest distance between the bottom-right and bottom-left x-coordinates or the top-right and top-left x-coordinates.

In a similar fashion, we determine the height of the new image on Lines 44-46, where the height is the maximum distance between the top-right and bottom-right y-coordinates or the top-left and bottom-left y-coordinates.

Note: Big thanks to Tom Lowell who emailed in and made sure I fixed the width and height calculation!

So here’s the part where you really need to pay attention.

Remember how I said that we are trying to obtain a top-down, “birds eye view” of the ROI in the original image? And remember how I said that a consistent ordering of the four points representing the ROI is crucial?

On Lines 53-57 you can see why. Here, we define 4 points representing our “top-down” view of the image. The first entry in the list is (0, 0) indicating the top-left corner. The second entry is (maxWidth - 1, 0) which corresponds to the top-right corner. Then we have (maxWidth - 1, maxHeight - 1) which is the bottom-right corner. Finally, we have (0, maxHeight - 1) which is the bottom-left corner.

The takeaway here is that these points are defined in a consistent ordering representation — and will allow us to obtain the top-down view of the image.

To actually obtain the top-down, “birds eye view” of the image we’ll utilize the cv2.getPerspectiveTransform function on Line 60. This function requires two arguments, rect, which is the list of 4 ROI points in the original image, and dst, which is our list of transformed points. The cv2.getPerspectiveTransform function returns M, which is the actual transformation matrix.

We apply the transformation matrix on Line 61 using the cv2.warpPerspective function. We pass in the image, our transform matrix M, along with the width and height of our output image.

The output of cv2.warpPerspective is our warped image, which is our top-down view.

We return this top-down view on Line 64 to the calling function.

Now that we have code to perform the transformation, we need some code to drive it and actually apply it to images.

Open up a new file, name it transform_example.py, and let’s finish this up:
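
(Again, a reconstruction matching the walkthrough below; the cited line numbers refer to this listing.)

# import the necessary packages
from pyimagesearch.transform import four_point_transform
import numpy as np
import argparse
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", help = "path to the image file")
ap.add_argument("-c", "--coords",
    help = "comma separated list of source points")
args = vars(ap.parse_args())

# load the image and grab the source coordinates (i.e. the list
# of (x, y) points)
# NOTE: using the 'eval' function is bad form, but for this example
# let's just roll with it; in next week's post I'll show you how to
# automatically determine the coordinates without pre-supplying them
image = cv2.imread(args["image"])
pts = np.array(eval(args["coords"]), dtype = "float32")

# apply the four point transform to obtain a "birds eye view" of
# the image
warped = four_point_transform(image, pts)

# show the original and warped images
cv2.imshow("Original", image)
cv2.imshow("Warped", warped)
cv2.waitKey(0)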

The first thing we’ll do is import our four_point_transform function on Line 2. I decided to put it in the pyimagesearch sub-module for organizational purposes.

We’ll then use NumPy for the array functionality, argparse for parsing command line arguments, and cv2 for our OpenCV bindings.

We parse our command line arguments on Lines 8-12. We’ll use two switches, --image, which is the image that we want to apply the transform to, and --coords, which is the list of 4 points representing the region of the image we want to obtain a top-down, “birds eye view” of.

We then load the image on Line 19 and convert the points to a NumPy array on Line 20.

Now before you get all upset at me for using the eval function, please remember, this is just an example. I don’t condone performing a perspective transform this way.

And, as you’ll see in next week’s post, I’ll show you how to automatically determine the four points needed for the perspective transform — no manual work on your part!

Next, we can apply our perspective transform on Line 24.

Finally, let’s display the original image and the warped, top-down view of the image on Lines 27-29.

Obtaining a Top-Down View of the Image

Alright, let’s see this code in action.

Open up a shell and execute the following command:
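
For example (these are the coordinates for example_01.png, the notecard image included in the downloads):

$ python transform_example.py --image images/example_01.png --coords "[(73, 239), (356, 117), (475, 265), (187, 443)]"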

You should see a top-down view of the notecard, similar to below:

Figure 1: Applying an OpenCV perspective transform to obtain a “top-down” view of an image.

Let’s try another image:

Figure 2: Applying an OpenCV perspective transform to warp the image and obtain a top-down view.

And a third for good measure:

Figure 3: Yet another OpenCV getPerspectiveTransform example to obtain a birds eye view of the image.

As you can see, we have successfully obtained a top-down, “birds eye view” of the notecard!

In some cases the notecard looks a little warped — this is because the angle the photo was taken at is quite severe. The closer we come to the 90-degree angle of “looking down” on the notecard, the better the results will be.

Summary

In this blog post I provided an OpenCV cv2.getPerspectiveTransform example using Python.

I even shared code from my personal library on how to do it!

But the fun doesn’t stop here.

You know those iPhone and Android “scanner” apps that let you snap a photo of a document and then have it “scanned” into your phone?

That’s right — I’ll show you how to use the 4 point OpenCV getPerspectiveTransform example code to build one of those document scanner apps!

I’m definitely excited about it, I hope you are too.

Anyway, be sure to sign up for the PyImageSearch Newsletter to hear when the post goes live!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL! Sound good? If so, enter your email address and I’ll send you the code immediately!


143 Responses to 4 Point OpenCV getPerspectiveTransform Example

  1. Vivek February 8, 2015 at 4:49 am #

    Hello Adrian,
    This was really a wonderful post; it gave me very insightful knowledge of how to apply the perspective transform. I just have a very small question about the part where you were finding the maxHeight and maxWidth. For maxHeight (just considering heightA) you wrote

    np.sqrt(((tr[1] - br[1]) ** 2) + ((tr[1] - br[1]) ** 2))

    but I think that the height should be
    np.absolute(tr[1] - br[1])

    because you know this gives us the difference in the Y coordinate
    but the equation that you wrote gives us

    1.4142 * the difference of the y coordinates. Why is that so?

    • Adrian Rosebrock February 8, 2015 at 6:49 am #

      Hi Vivek. The equation is utilizing the sum of squared differences (Euclidean distance), whereas the equation you proposed is just the absolute value of the differences (Manhattan distance). Try converting the code to use the np.absolute function and let me know how the results look.

      • Ashish July 23, 2018 at 6:14 am #

        I know this is old but still, I actually had the same question. Then I tried replacing it with

        widthA = abs(br[0] - bl[0])
        widthB = abs(tr[0] - tl[0])

        heightA = abs(tr[1] - br[1])
        heightB = abs(tl[1] - bl[1])

        And I’m getting pretty similar results.

  2. Vertex February 15, 2015 at 5:02 pm #

    Hi Adrian,

    Is it possible that you mixed up top and bottom in the comments of the function order_points()? When I did an example rect[0] was BL, rect[1] was BR, rect[2] was TR and rect[3] was TL.

    • Adrian Rosebrock February 16, 2015 at 7:00 am #

      Hi Vertex. Hm, I don’t think so. The dst array assumes the ordering that I mentioned above and it’s important to maintain that order. If the order was not maintained then the results from applying the perspective transform would not be correct.

      • Vertex February 16, 2015 at 1:04 pm #

        Hi Adrian, thanks for your answer. I have to say I am a newbie and I tried the following to get a better understanding:

        import numpy as np

        rect = np.zeros((4, 2), dtype = "float32")

        # TL,BR,TR,BR
        a = [[3,6],[3,3],[6,6],[6,3]]
        rect[0] = np.argmin(np.sum(a,axis=1))
        rect[2] = np.argmax(np.sum(a,axis=1))
        rect[1] = np.argmin(np.diff(a,axis=1))
        rect[3] = np.argmax(np.diff(a,axis=1))
        print(rect)

        [[ 1. 1.]
        [ 3. 3.]
        [ 2. 2.]
        [ 0. 0.]]

        I guess I got a faulty reasoning.

        • Adrian Rosebrock February 16, 2015 at 6:16 pm #

          Ah, I see the problem. You are taking the argmin/argmax, but not grabbing the point associated with it. Try this, for example:

          rect[0] = a[np.argmin(np.sum(a,axis=1))]

          The argmin/argmax functions give you the index, which you can then apply to the original array.

      • Nithin SK June 4, 2015 at 9:49 am #

        Hi Adrian! I’m a newbie.
        Spent lots of time mulling over this.
        Lines 53-57
        dst = np.array([
        [0, 0],
        [maxWidth - 1, 0],
        [maxWidth - 1, maxHeight - 1],
        [0, maxHeight - 1]], dtype = "float32")

        Isn’t
        [0,0] - Bottom Left
        [maxWidth - 1, 0] - Bottom Right
        [maxWidth - 1, maxHeight - 1] - Top Right
        [0, maxHeight - 1]] - Top Left

        So it’s bl,br,tr,tl? I’m a bit confused. Could you please explain?

        • Adrian Rosebrock June 4, 2015 at 10:52 am #

          Hey Nithin, it’s actually:

          • [0, 0] - top left
          • [maxWidth - 1, 0] - top right
          • [maxWidth - 1, maxHeight - 1] - bottom right
          • [0, maxHeight - 1] - bottom-left

          Python arrays are zero-indexed, so we start counting from zero. Furthermore, the top-left corner of the image is located at point (0,0). For more information on the basics of image processing, including the coordinate system, I would definitely look at Practical Python and OpenCV. I have an entire chapter dedicated to learning the coordinate system and image basics.

          • Bogdan April 12, 2016 at 4:21 am #

            might be super simple, but I still don't get it: why do you subtract 1 from maxWidth and maxHeight for the tr, br and bl?

          • Adrian Rosebrock April 13, 2016 at 7:01 pm #

            In order to apply a perspective transform, the coordinates must be in a consistent order. In this case, we supply them in top-left, top-right, bottom-right, and bottom-left order.

  3. Ken Wood March 9, 2015 at 3:53 pm #

    Great sample. My question is regarding the transformation matrix. Could it be used to transform only a small region from the original image to a new image instead of warping the entire image? Say you used the Hello! example above but you wanted to only relocate the exclamation mark from the original image to a new image to end up with exactly the same output you have except without the “Hello” part, just the exclamation mark. I guess the question is whether you can use the TM directly without using the warping function.

    Thanks!

    • Adrian Rosebrock March 9, 2015 at 3:56 pm #

      Hi Ken, the transformation matrix M is simply a matrix. On its own, it cannot do anything. It’s not until you plug it into the transformation function that the image gets warped.

      As for only warping part of an image, you could certainly only transform the exclamation point. However, this would require you to find the four point bounding box around the exclamation point and then apply the transform — which is exactly what we do in the blog post. At that point you’re better off transforming the entire index card and cropping out the exclamation point from there.

  4. Ken Wood March 14, 2015 at 3:10 pm #

    Hey Adrian,
    I agree completely with what you say…I apologise, it was a poor example…what I was wondering about was how the mapping worked.

    Fwiw, given the transformation matrix M you can calculate the mapped location of any source image pixel (x,y) (or a range of pixels) using:
    dest(x) = [M11x + M12y + M13]/[M31x + M32y + M33]
    dest(y) = [M21x + M22y + M23]/[M31x + M32y + M33]

    Why bother?
    I used this method to map a laser pointer from a keystoned camera image of a large screen image back onto the original image…allowing me to “draw” on the large screen.

    Thanks!

    • Adrian Rosebrock March 14, 2015 at 4:18 pm #

      Wow, that’s really awesome Ken! Do you have an example video of your application in action? I would love to see it.

  5. wiem April 17, 2015 at 1:04 pm #

    Hi Adrian,

    I am newbie in opencv.
    is it possible to measure angles with getPerspectiveTransform?
    Can you give the function?

    Thanks in advance

    • Adrian Rosebrock April 17, 2015 at 1:19 pm #

      Hi Wiem, I’m not sure I understand what you’re asking? If you want to get the angle of rotation for a bounding box, you might want to look into the cv2.minAreaRect function. I cover it a handful of times on the PyImageSearch blog, but you’ll want to lookup the actual documentation on how to get the angle.

      • wiem April 18, 2015 at 11:26 am #

        Hi Adrian, thanks for your answer
        I looked at the “hello image” (original vs warped); there is an angle of rotation.
        I want to know how to get the angle.
        Sorry for my English.

        Hope to hear from you
        Regards.

        • Adrian Rosebrock April 18, 2015 at 1:31 pm #

          Hey, Wiem — please see my previous comment. To get the angle of rotation of the bounding box just use the cv2.minAreaRect function, which you can read more about here. Notice how the angle of rotation is returned for the bounding box.

  6. wiem April 18, 2015 at 1:45 pm #

    Ups yeah.

    thanks for fast response.
    Thank you very much
    regards

  7. Aamir April 27, 2015 at 10:04 am #

    Is there an equivalent function for order_points(Rect) in OpenCV for C++?

    P.S. Thanks for your tutorials.

    Thanks.

    • Adrian Rosebrock May 1, 2015 at 7:06 pm #

      Hey Aamir, if there is a C++ version, I do not know of one. The order_points function I created was entirely specific to Python and is not part of the core of OpenCV.

    • Akanskha Singh November 13, 2017 at 12:11 pm #

      hiii Aamir,

      Do you have the C++ version of the above code?

  8. palom May 3, 2015 at 2:49 pm #

    The code shows this error:

    “TypeError: eval() arg 1 must be a string or code object”

    thanks

    • Adrian Rosebrock May 4, 2015 at 6:19 am #

      Hi Palom, make sure you wrap the coordinates in quotes: python transform_example.py --image images/example_01.png --coords "[(73, 239), (356, 117), (475, 265), (187, 443)]"

      • Kunal Patil December 4, 2015 at 5:21 pm #

        That (wrapping in quotes) is already done in the code and the problem persists!

        • Adrian Rosebrock December 5, 2015 at 6:23 am #

          Try removing the argument parsing code and then hardcoding the points, like this:

          pts = np.array([(73, 239), (356, 117), (475, 265), (187, 443)], dtype = "float32")

          • Ruijie September 21, 2017 at 5:14 pm #

            error: (-215) src.cols > 0 && src.rows > 0 in function warpPerspective
            I have an error after hardcoding

          • Adrian Rosebrock September 23, 2017 at 10:14 am #

            It sounds like the path you supplied to cv2.imread does not exist. I would suggest reading up on NoneType errors.

          • Oleg January 5, 2018 at 4:26 am #

            It works in my Spyder on Debian:

            coords_list = eval(eval(args["coords"]))
            pts = np.array(coords_list, dtype = "float32")

  9. Singh June 9, 2015 at 3:55 pm #

    Hey, I am trying to implement the same thing in Java using OpenCV but I can’t seem to find a workaround for the numpy functions, can you help me out please….. My aim is to implement a document scanner in Java (ref: https://www.pyimagesearch.com/2014/09/01/build-kick-ass-mobile-document-scanner-just-5-minutes/)…..
    Thanking you in anticipation

    • Adrian Rosebrock June 9, 2015 at 4:09 pm #

      Hey Singh, I honestly don’t do much work in Java, so I’m probably not the best person to answer that question. But you could probably use something like jblas or colt.

  10. Tarik August 11, 2015 at 1:18 pm #

    Hey Adrian,
    I believe your order_points algorithm is not ideal; there are certain perspectives in which it will fail, giving non-contiguous points, even for a rectangular subject. A better approach is to find the top 2 points and 2 bottom points on the y axis, then sort these pairs on the x axis.

    • Tarik August 11, 2015 at 8:30 pm #

      Actually my solution can also fail.

      If the points in the input are contiguous, the best would be to choose a begin point meeting an arbitrarily chosen ordering constraint, whilst conserving their original order.

      Otherwise, a correct solution involves tracing a polygon without intersection, e.g. using the gift wrapping algorithm — simplified for a quadrilateral.

      • Adrian Rosebrock August 12, 2015 at 6:20 am #

        Thanks for the tip Tarik!

  11. dan September 5, 2015 at 11:29 am #

    i am receiving an error message no module named pyimageseach.transform any idea what i have missed?

    • Adrian Rosebrock September 5, 2015 at 1:27 pm #

      Please download the source code using the form at the bottom of this post that includes the pyimagesearch module.

  12. dan September 6, 2015 at 1:55 pm #

    Thanks for the quick reply. Okay, I downloaded the file on my old laptop running Windows 7. What do I do with it? Sorry for the stupid question, I am a newbie and I am also old, so I have two strikes against me, but if you learn from mistakes I should be a genius in no time!

    • Adrian Rosebrock September 7, 2015 at 8:24 am #

      Please see the “Obtaining a Top-Down View of the Image” section of this post. In that section I provide the example commands you need to run the Python script.

  13. Ari October 28, 2015 at 2:39 am #

    Hi Adrian,

    Thanks so much for the code. I have tried your code and it runs well on my MacBook, OSX Yosemite, Python 2.7.6 and OpenCV 3.0.

    I am just wondering if we can improve it by automatically detecting the four points. So the only input variable will be the image. 🙂 Would it be possible? What would be the algorithm to auto-detect the points?

    🙂

    Thanks!
    Ari

  14. HongQQ November 11, 2015 at 3:42 pm #

    Thanks for your kind sharing of information about Python & OpenCV.

    Wonderful!

  15. nami December 10, 2015 at 7:50 am #

    hi adrian,

    thanks for sharing..
    what do index 0 and 1 mean in the equation widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2)) ?

    • Adrian Rosebrock December 10, 2015 at 2:25 pm #

      The 0 index is the x-coordinate and the 1 index is the y-coordinate.

  16. Ben January 9, 2016 at 7:23 pm #

    Hi!

    Thank you for the tutorial. Have you got any tutorial on how to transform perspective for the whole image? So the user chooses 4 reference points and then the whole image is transformed (in the resulting image some black fragments are added). I know that I should calculate appropriate points for the input image, but I have no idea how to do it. Can you help?

    Regards

    • Adrian Rosebrock January 10, 2016 at 7:47 am #

      Given your set of four input points, all you need to do is build your transformation matrix, and then apply a perspective warp, like this tutorial.

  17. Jan January 22, 2016 at 6:55 pm #

    Excellent tutorial! Would it be possible to do the opposite? I mean, given a top-view of an image, produce a distorted one?

    • Adrian Rosebrock January 23, 2016 at 2:10 pm #

      Sure, the same principles apply — just modify the transformation matrix prior to applying it.

  18. Matheus Torquato February 16, 2016 at 9:36 pm #

    Hi, Adrian
    I’m passing by just to say this post is really helpful and thank you very much for it.
    I’m starting to study Computer Vision and your blog is really helping my development.
    Well done and keep going. =)

    Cheers,
    Matheus Torquato

    • Adrian Rosebrock February 17, 2016 at 12:37 pm #

      Thanks Matheus! 🙂

  19. Matt February 19, 2016 at 5:16 pm #

    You have a slight typo in your description of line 21. It should say (i.e. y - x).

  20. Damien Mac Namara March 23, 2016 at 12:40 pm #

    Hi Adrian,

    Can the getPerspectiveTransform method return the angle that the image makes with the Y-axis?

    • Adrian Rosebrock March 24, 2016 at 5:15 pm #

      Can you be more specific what you mean by “the angle the image makes with the y-axis”? I’m not sure I understand your question.

  21. Bozhidar Stanchev April 6, 2016 at 10:01 am #

    Do you happen to have the c++ version of this?

    • Adrian Rosebrock April 7, 2016 at 12:44 pm #

      Sorry, I only have Python versions, I don’t do much C++ coding anymore.

  22. Ayush June 10, 2016 at 8:14 am #

    I’m a complete newbie to OpenCV but shouldn’t warping perspective off a Mat using the output of minAreaRect be a one-line command? I mean, you clearly have extracted some of these things out as ‘utils’ and a nice importable github repo for that, too, for which, we all thank you but don’t you think that if it were so “regularly used by devs for their image processing tasks”, they better lie in vanilla OpenCV? To be *really really* honest, my “duckduckgo-ing” about warping off a rect perspective led me to this post of yours among the very first results and I *knew* the code presented obviously works but I didn’t immediately start using it *ONLY AND ONLY* because I *believed* there would be something to the effect of

    warpedMat = cv2.warpPerspective(imageToWarp, areaToWarp)

    Ultimately, on asking my colleague on how to do it, she suggested goin’ ahead with your written utils only! 🙂

  23. Christine B June 17, 2016 at 3:25 pm #

    Adrian,
    Thank you so much for your awesome tutorials!! I’ve been learning how to use the raspberry pi the past few weeks in order to make an automated testing system, and your tutorials have been so thorough and helpful. Thank you so so much for making these!!

    • Adrian Rosebrock June 18, 2016 at 8:14 am #

      Thanks Christine, I’m happy I could help 🙂

  24. Gabe July 4, 2016 at 5:01 am #

    I tried this code and it’s pretty cool but it can’t handle images like this:
    http://vari-print.co.uk/wp-content/uploads/2013/05/business-cards-1.jpg

    I tried to cut out the business card but I couldn’t. I got some strange results. Why and how can I fix it?

    • Adrian Rosebrock July 5, 2016 at 1:52 pm #

      In order to apply a perspective transform (and get reasonable results), you need to be able to detect all four corners of the object. If you cannot, your results will look very odd indeed. Please see this post for more details on applying perspective transforms to rectangular objects.

      • Chris July 20, 2016 at 5:30 am #

        Hi Adrian. Sorry for my English. 🙂
        I’m a newbie in OpenCV. Thank you so much for the awesome tutorial 😀
        I want to crop this image https://goo.gl/photos/kAmDRokUeLcqpycX7 with the original on the left and the crop I want on the right. Can you help me?
        thanks in advance

        • Adrian Rosebrock July 20, 2016 at 2:34 pm #

          If you are trying to manually crop the region, I would use something like the GUI code I detail in this blog post.

          • Chris July 21, 2016 at 2:31 am #

            thanks for your tut, it’s very exciting!
            in this image, I want to get the road, and outside will be black, white or another color, because I’m researching the Raspberry Pi, and I want it to process
            as little as possible. do you have any idea?

          • Zubaer August 31, 2017 at 2:42 pm #

            Thank you sir. I accept you as my Guru.

  25. Karthikey July 20, 2016 at 8:28 am #

    Hi Adrian,

    I don’t understand how (x-y) will be minimum for the top right corner…. Consider this square:
    tl= 0,0
    tr= 1,0
    br= 1,1
    bl =0,1

    (x-y) is minimum for bl, i.e. 0-1 = -1, and not tr… Am I going wrong somewhere?

    • Adrian Rosebrock July 20, 2016 at 2:33 pm #

      The origin (x, y)-coordinates start at the top-left corner and increase going down and to the right. For what it’s worth, I recommend using this updated version of the coordinate ordering function.

      • afwefawef September 9, 2017 at 10:11 pm #

        I agree with Karthikey,

        you made a mistake.

        it should be y-x, not x-y

        diff = np.diff(pts, axis=1)
        rect[1] = pts[np.argmax(diff)]
        rect[3] = pts[np.argmin(diff)]

        • Adrian Rosebrock September 11, 2017 at 9:15 am #

          Yes, there is actually an updated, better tutorial on ordering coordinates here.

  26. Rima Borah August 3, 2016 at 1:01 am #

    I gave the input image of an irregular polygon formed after applying the convex hull function to a set of points, supplying the four end points of the polygon in the order you mentioned. However the output I get is a blank screen. No polygon in it.
    Can you please tell how to give irregular polygons as input to the above code?

    • Adrian Rosebrock August 4, 2016 at 10:20 am #

      It’s hard to say without seeing an example, but I imagine your issue is that you did not supply the coordinates in top-left, top-right, bottom-right, and bottom-left order prior to applying the transformation.

  27. Abhishek Mishra October 3, 2016 at 3:09 pm #

    Adrian, great post. I was trying to build this for a visiting card. My problem is that the card could be aligned at any arbitrary angle with respect to the desk as well as with respect to the camera.

    When I use this logic, there are alignments at which the resultant image is like a rotated and shrunken one. In your case, the image is rotated towards the right to make it look correct. However, if the original card was rotated clockwise 90 degrees, then the logic of top right, top left does not work correctly.

    I tried using the “width will always be greater than height” approach but that too fails at times.

    Any suggestions?

    • Adrian Rosebrock October 4, 2016 at 6:56 am #

      This sounds like it may be a problem with the coordinate ordering prior to applying the perspective transform. I would instead suggest using the updated implementation.

  28. sandesh chand October 4, 2016 at 10:19 am #

    Hello Adrian,
    I am quite new to OpenCV. I am working on an image processing program in Python 2.7. Actually I am facing a problem when cropping images. I am working with different images having some black side background. The picture is taken by the camera manually, so the image also differs in size. I want to crop the image and separate it from the black background. Could you suggest how I can make a program that detects the image first and crops it automatically?
    Thanks

    • Adrian Rosebrock October 6, 2016 at 7:02 am #

      Hey Sandesh — do you have any examples of images that you’re working with?

  29. Manuel October 16, 2016 at 1:54 pm #

    Hi Adrian,

    I have a doubt when thinking about the generalization of this example regarding the destination points. This example is specifically aimed at quadrangular objects, right? I mean, you get the destination image points because you simply say “hey, sure it will be a box, let’s get the max height, max width and that’s it”.

    But it wouldn’t be so easy if the original object had a different shape, right?

    Thanks.

    • Adrian Rosebrock October 17, 2016 at 4:05 pm #

      Yes, this example presumes that the object you are trying to transform is rectangular. I’m not sure I understand what you mean by the “generalization” of this technique?

  30. Siladittya Manna December 8, 2016 at 3:17 pm #

    I want to know if there is a way that the program can automatically detect the corners??

  31. rene godbout December 14, 2016 at 9:16 pm #

    When the rectangular dimensions of the source target are known, the result is much better if you input x,y values for the destination image that conform to the x/y ratio of the source target. The estimation of the output size described here will only be perfect if the target is perpendicular to the viewpoint. This is why the distortion increases in proportion to the angle(s) away from perpendicular. Otherwise, you have to know the focal length of the camera, focus distance, etc. (much more complicated calculations…) to estimate the “real” proportions of the target’s dimensions (or x/y ratio).

  32. rene godbout December 14, 2016 at 10:16 pm #

    As an example, the page size in your samples looks like “legal” 8.5 x 14 paper. Once this is established, if you replace the maxHeight calculation with “maxHeight = (maxWidth * 8) / 14”,
    the output image(s) are much better looking as far as the x/y ratio is concerned (no apparent distortion on the last sample). Of course, one must know the target’s x/y ratio…

    • Adrian Rosebrock December 18, 2016 at 9:09 am #

      Good point, thanks for sharing Rene. If the aspect ratio is known then the output image can be much better. There are methods that can attempt to (automatically) determine the aspect ratio but they are outside the scope of this post. I’ll try to do a tutorial on them in the future.

  33. Mazine January 28, 2017 at 10:09 am #

    Hi, this is really nice! What bugs me is how to find those four corners, since the picture does not have the right perspective.

  34. Patrick February 5, 2017 at 5:04 pm #

    Hi Adrian, thanks for the tutorial. I cannot help noticing that you mentioned the difference of the coordinates is (x-y), but np.diff([x, y]) actually returns (y-x).

  35. kamell February 6, 2017 at 6:34 am #

    Traceback (most recent call last):
    File “transformexple.py”, line 2, in
    from pyimagesearch.transform import four_point_transform
    ImportError: No module named pyimagesearch.transform
    how can I install it????

    • Adrian Rosebrock February 7, 2017 at 9:12 am #

      Make sure you use the “Downloads” section to download the source code associated with this blog post. It includes the “pyimagesearch” directory structure for the project.

  36. Ahmed March 2, 2017 at 4:14 pm #

    It is a really nice way to get the bird’s eye view, but when I tried to use it in my algorithm I failed to get the bird’s eye view.
    I want to get a top view of a lane and a car.

    • Adrian Rosebrock March 4, 2017 at 9:44 am #

      This method is an example of applying a top-down transform using pre-defined coordinates. To automatically determine those coordinates you’ll have to write a script that detects the top of a car.

  37. Christian March 8, 2017 at 12:20 pm #

    Hi Adrian,

    Thanks a lot for the post, this is great for the app I’m trying to build. The thing is that I’m translating all your code to Java and I don’t know if everything is correctly translated because the image I get after the code is rotated 90º and flipped… I’m investigating, but maybe you can think of something that could be happening. Thanks again for the post.

    • Adrian Rosebrock March 8, 2017 at 12:56 pm #

      Hi Christian — thanks for the comment. However, I haven’t used the OpenCV + Java bindings in a long time, so I’m not sure what the exact issue is.

      • Christian March 9, 2017 at 3:57 am #

        Well, thanks anyway. I’ll be giving it a second shot today 🙂

  38. Arvind Mohan March 22, 2017 at 3:10 am #

    Hi Adrian,

    Would affine transform make any sense in this context?

    • Adrian Rosebrock March 22, 2017 at 6:34 am #

      This page provides a nice discussion on affine vs. perspective transforms. An affine transform is a special case of a perspective transform. In this case perspective transforms are more appropriate — we can make them even better if we can estimate the aspect ratio of the object we are trying to transform. See the link for more details.

  39. Muhammed March 27, 2017 at 2:43 pm #

    There is another idea: order the four points by
    comparing the centroid with the four points.
    There is one limitation when the object orientation = 45 degrees.

  40. Michael April 29, 2017 at 10:02 am #

    Hi Adrian,

    Thanks for sharing the example, very inspiring!

    Perspective transform works great here as the object being warped is 2D. What if the object is 3D, like a cylinder, http://cdn.free-power-point-templates.com/articles/wp-content/uploads/2012/07/3d-cilinder-wrap-text-ppt.jpg?

  41. Bruno Andrade June 16, 2017 at 2:18 pm #

    Hi Adrian,

    I followed your tutorial in one of my projects but the object height decreases.
    I used your tutorial ‘Building a Pokedex in Python’ parts 4 and 5 in order to get the corner points.

    http://imgur.com/a/vt4Er

    Can you tell me what I’m doing wrong?
    Thanks in advance.

  42. Anna June 26, 2017 at 8:57 am #

    Hello Adrian,

    Could you help with the following problem?

    $ pip install pyimagesearch
    Collecting pyimagesearch
    Could not find a version that satisfies the requirement pyimagesearch (from versions: )
    No matching distribution found for pyimagesearch

    Also I’ve checked comments with similar problems, but all solutions still don’t work…
    – downloaded the examples at the bottom of the page
    – placed them in the folder with Python.

    Thank you!

    • Adrian Rosebrock June 27, 2017 at 6:24 am #

      There is no “pyimagesearch” module available on PyPI, hence your error when installing via “pip”. You need to use the “Downloads” section at the bottom of this page, download the code, and put the “pyimagesearch” directory in the same directory as the code you are executing. Alternatively, you could place it in the site-packages directory of your Python install, but this is not recommended.

  43. Don July 4, 2017 at 4:21 am #

    Hi, thanks for the cool tutorial, it helped me a lot. But there were projection errors. Do you know how to dewarp an image, for example a page from a book?
    Thanks in advance.
    Thanks in advance.

  44. Ahmad Hasbini July 31, 2017 at 7:03 am #

    Hey Adrian,

    Thanks for the tutorial.

    I am porting the order_points func to C++, but I am confused about:
    diff = np.diff(pts, axis = 1)
    The tutorial states that in order to find top right and bottom left points, find the diff of each point and the min will be the top right and max will be bottom left. The confusion is: is the function doing x - y or y - x or |x - y|?

    I have a hunch that it’s |x - y|, I’ll try it out soon and post an update.

    • Adrian Rosebrock August 1, 2017 at 9:41 am #

      Hi Ahmad — please see my reply to “Eugene”. I would port the function I linked to over to C++ instead.

  45. Eugene July 31, 2017 at 12:05 pm #

    Hi, Adrian

    There is a bug in “order_points” function.

    For input [(0,0),(20,0),(5,0),(5,5)] it classifies (20,0) as bottom-right, because 20+0 is largest sum. But it is top-right. Real bottom-right is (5,5)

    It causes incorrect image processing in some cases

    I fixed it https://pastebin.com/WXkhw6tU . Code is less pretty, but works in all cases. Maybe you can rewrite it to make pretty 🙂

    • Adrian Rosebrock August 1, 2017 at 9:39 am #

      Hi Eugene — thanks for the comment. I actually provide an updated method to order the points in this post.

  46. Rosy August 18, 2017 at 5:29 am #

    Why do we hard code this? Can we automate it, so that when we pass the input image it automatically detects the coordinates and warps it according to the reference image? How do we do that? Is it possible?

    • Adrian Rosebrock August 21, 2017 at 3:50 pm #

      Yes, please see the follow up blog post where we build a document scanner that automatically determines the coordinates.

  47. Passerin September 6, 2017 at 9:27 pm #

    Hi Adrian!

    Your solutions are helping me a lot building a Document Classifier with sklearn and other machine learning related libraries! I already managed to successfully build the classifier model and now I’m trying to get documents from photos to predict their classes.
    In order to make this work I get features from the documents I try to classify so I can build an array with them and pass it to the classifier to predict its class.
    It’s very important for the classifier to get information about the colors of the document. I already applied your scanner but I’m struggling a lot to get the colors.

    I understand it’s crucial to do the COLOR_BGR2GRAY transformation for the code to work, both for getting the contour points and for doing the warping. Is there any way I could achieve a colored perspective transformation?

    Thank you for sharing your ideas! Sorry for my bad English! I hope my explanation makes sense. I’m a Spanish speaker!

    • Adrian Rosebrock September 7, 2017 at 6:56 am #

      You can certainly obtain a color version of the warped image. Just resize the image as we do in the blog post, but keep a copy of the color image via the .copy() method. From there you can apply the perspective transform to the resized (but still color) image. All other processing can be done on the grayscale image.

  48. Martin September 12, 2017 at 11:15 am #

    Hello Adrian,
    thanks for the post! I had fun playing around with the code and managed to get a nice bird’s eye view for an image of my own.

    I would have one interesting question: Let’s say that we have an image with a circular pattern (imagine identical objects regularly inter-spaced along the circumference of a circle, kind of like a clock). This pattern is viewed under perspective projection so it appears like an ellipse. My question is: what is the best way to get the bird’s eye view for this pattern? (The approach presented above doesn’t seem directly applicable as there are no straight lines in the image of a circular pattern.)

    Thanks a lot!

    • Adrian Rosebrock September 12, 2017 at 6:49 pm #

      This is certainly a much more challenging problem. To start I would be concerned with how you would detect the objects along the circumference of the circle under perspective transforms. Can each object be localized? If you can localize them then you could actually treat it like a 4 point perspective transform. You would basically use each of the detected objects and compute the bounding box for the entire circle. Since the bounding box would be a rectangle you could then apply the perspective transform.

  49. F1LT3R October 12, 2017 at 12:04 am #

    What software license are you using for the code in this post/repo?

    • Adrian Rosebrock October 13, 2017 at 8:51 am #

      The license is MIT. I would appreciate attribution and a link back to the PyImageSearch blog if at all possible.

  50. Jitesh Shah October 12, 2017 at 3:50 am #

    Is there any way we can automatically find out those four coordinates ?

  51. erdem January 13, 2018 at 9:59 am #

    when I run this code it just prints None and does not display any picture. Why is that?

    • Adrian Rosebrock January 15, 2018 at 9:19 am #

      Hey Erdem — can you elaborate a bit more? Is the script automatically exiting? Are you receiving an error message of some kind?

  52. Prashant March 28, 2018 at 10:11 am #

    Hi Adrian,

    Thanks for the excellent post. I think the width & height calculation may have a problem. Check the results at https://photos.app.goo.gl/2p93mTghSw0AQezD3

    For this case the width needs to be scaled, we can’t work with
    “width is the largest distance between the bottom-right and bottom-left x-coordinates or the top-right and top-left x-coordinates” logic

    • Adrian Rosebrock March 30, 2018 at 7:19 am #

      This code does not handle the scaling of the width and height to maintain aspect ratio. You can update the code to do so if you wish.

      • Prashant April 4, 2018 at 6:58 am #

        Thanks, I was thinking about that but I’m not sure how we can find the aspect ratio from the image.

  53. houjie March 31, 2018 at 5:46 am #

    Hello, Adrian. After reading your article, I learned a lot. But I have two confusions I would like to ask you about. The first problem: when acquiring contours, because the image is noisy, the points obtained are not four, so I do not know what to do when building the dst matrix. If the contour has multiple points, can you still construct the dst matrix? The dst I mean is the second parameter in cv2.getPerspectiveTransform(rect, dst). The second question: once I get the matrix M through cv2.getPerspectiveTransform(rect, dst), I want to change the perspective of the original image through M, not just the contents of the four points in the image, for example the blank part of the picture in your example. My English is not very good, a tool is helping me translate. I hope to get your reply, I will be extremely grateful.

    • Adrian Rosebrock April 4, 2018 at 12:43 pm #

      The answer here is “contour approximation”. Take a look at this blog post to see how you can use contours, contour approximation, and a perspective transformation to obtain the desired result.

  54. Koustav Dutta May 14, 2018 at 12:42 am #

    ap.add_argument("-i", "--image", help = "path to the image file")

    The above line always gives an error on Windows in Python 3.6.4, can you please help me out?

  55. Himanshu Singh May 28, 2018 at 9:49 am #

    Hi Adrian,

    Thanks for this great post. I was trying to follow along with this post with slight modifications. First I obtained four_point_transform of the given image. After getting the warped image, I plotted bounding box around text “Hello” using cv2.rectangle.

    cv2.rectangle(warped, (37, 20), (222, 97), (255, 0, 0), 2)

    Now, I want to plot the bounding box of text “Hello” in the warped image on the original image. I referred to a similar question on stackoverflow: https://stackoverflow.com/questions/14343420/retrieving-the-original-coordinates-of-a-pixel-taken-from-a-warped-image/14346147.
    But I do not have the coordinates on the source image so could not follow along. Any help would be very much appreciated.

    • Adrian Rosebrock May 28, 2018 at 9:57 am #

      I’m not sure what you mean by “warped image on the original image”. Could you clarify?

      • Himanshu Singh May 28, 2018 at 1:10 pm #

        I have these coordinates (37, 20), (222, 97) on the warped image. Now I have to find out what their coordinates would have been on the original image. Hope it’s clear now.

        • Adrian Rosebrock May 31, 2018 at 5:36 am #

          Thank you for the clarification, I understand. You linked to the StackOverflow thread you read which contains the correct answer — you use an inverse perspective transform.
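
          A minimal sketch of the idea (assuming M is the 3x3 matrix computed by cv2.getPerspectiveTransform inside four_point_transform; you would need to modify that function to also return M):

          # map a point from the warped image back onto the original
          # image by applying the inverse of the transform matrix M
          import numpy as np
          import cv2
          pt = np.array([[[37, 20]]], dtype = "float32")
          orig_pt = cv2.perspectiveTransform(pt, np.linalg.inv(M))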

  56. Antonio Peixoto June 7, 2018 at 12:43 pm #

    Many Thanks for sharing your expertise.

    I worked out the code to show the original image with matplotlib so I can select the coordinates of the points more easily.

    It works quite fine.

  57. Allen June 30, 2018 at 3:56 am #

    Thanks for sharing.
    I found a typo in the blog post where you wrote, “Here we’ll take the difference (i.e. x – y) between the points using the np.diff function on Line 21.”
    I think you mean y-x.

  58. Vaibhav Gupta August 22, 2018 at 6:36 am #

    Hi Adrian,
    How can we automatically determine the coordinates of the input image, instead of writing them in the command, so as to make this project more practical to use? Some reference links would be a great help.
    Thank you!!

    • Adrian Rosebrock August 22, 2018 at 9:19 am #

      This guide will show you how 🙂

  59. Akshay August 27, 2018 at 1:42 pm #

    Hi Adrian,

    I’m trying to find all the rectangular contours and lines (strictly horizontal or vertical) in a screenshot image. For example, I took a screenshot of this page and would like to identify all the code blocks, image blocks, comment separation lines, etc. Now since it is a screenshot image, I know that all my required rectangle contour edges are strictly horizontal and strictly vertical without any slant. In my scenario, the edges would be lines with some background or segment border.
    I’ve tried canny edge (https://www.pyimagesearch.com/2015/04/06/zero-parameter-automatic-canny-edge-detection-with-python-and-opencv/) followed by a hough transform. I think I’m not able to decide the parameters. I need your help in understanding the parameters for the hough transform; a few suggestions of parameters or approaches would be a great help.

    Thank you

    • Adrian Rosebrock August 30, 2018 at 9:14 am #

      The Hough lines function can be a bit of a pain to tune the parameters to. You can actually use simple heuristics to detect text and code blocks in a screenshot. All you would need is the image gradients and some morphological operations. I would start by following this example and working from there.

  60. mohamed September 22, 2018 at 2:09 pm #

    I do not know what to say.
    Everything here is useful.
    I remember the day you published this post I did not care very much; maybe I did not understand it.
    Today, after long research, I have really benefited from it.
    Thanks Adrian

    • Adrian Rosebrock October 8, 2018 at 1:02 pm #

      Thanks Mohamed 🙂

  61. marxious October 4, 2018 at 2:23 pm #

    Thanks for the tutorial, it works wonderfully. I have a question regarding the transformation: say I have a point on the source image; after the warp transform, how do I find out where this point is in the new image?

  62. Phong October 9, 2018 at 4:42 am #

    Hi Adrian
    I tested and saw that it may not transform perspective back to the original dimensions in the top down view. If we already know the original dimensions of the object, and scale the result to fit these dimensions (width x height), can it be returned to the correct size? How do we do that?

    • Adrian Rosebrock October 9, 2018 at 5:56 am #

      If you know the original aspect ratio of the object you scale the output image width and height (the values you pass into the perspective transform) by that aspect ratio, making the output image match the original aspect ratio.

      • Phong October 9, 2018 at 8:24 am #

        It works, and I use cv2.resize. I wonder if interpolation = cv2.INTER_CUBIC is correct.

  63. John T. Draper November 7, 2018 at 4:20 pm #

    In this scanning lesson, you referenced and challenged us to write an iPhone app that does this. I wasn’t aware that apps can be written in Python. Can you please give me references on how to do this? I know this is not part of your course (at least I haven’t come across it yet), but a reference for this would be great.

    Does the new XCode for Apple support python driven iPhone apps? Or is there some other 3rd party outfit that offers this capability.

    John

    • Adrian Rosebrock November 10, 2018 at 10:06 am #

      Sorry for any confusion there — I was providing you with an example implementation in Python. You could then translate the code into whatever language you wished and use it as a base for a mobile application.

  64. Venkat November 29, 2018 at 4:11 am #

    Hi Adrian, that is a great post and good code…. Can you please let me know how to download the pyimagesearch module?

    • Adrian Rosebrock November 30, 2018 at 8:58 am #

      You can use the “Downloads” section of the tutorial to download the source code and “pyimagesearch” module for the post.

Trackbacks/Pingbacks

  1. Recognizing digits with OpenCV and Python - PyImageSearch - February 13, 2017

    […] with a rectangular shape — the largest rectangular region should correspond to the LCD. A perspective transform will give me a nice extraction of the […]
