Bank check OCR with OpenCV and Python (Part I)

Today’s blog post is inspired by Li Wei, a PyImageSearch reader who emailed me last week and asked:

Hi Adrian,

Thank you for the PyImageSearch blog. I read it each week and look forward to your new posts every Monday. I really enjoyed last week’s tutorial on credit card OCR.

I was wondering: Can this same technique be used for bank check OCR?

I’m working on a project that requires me to OCR bank account and routing numbers from check images, but I’m struggling to extract the digits/symbols. Could you do a blog post covering this?

Thank you.

Great question, Li Wei, thank you for asking.

The short answer is yes, you can use the same template matching techniques that we used for credit card OCR and apply them to bank check OCR…

…but there’s a catch.

As Li Wei found out, it’s much harder to extract the routing number and account number digits and symbols from a check.

The reason is that bank checks use special fonts where a particular symbol consists of multiple parts. This implies that we need to devise a method that can automatically compute the bounding boxes for these symbols and extract them, just like in the image at the top of this post.

To get started building your own bank check OCR system with OpenCV and Python, just keep reading.


Bank check OCR with OpenCV and Python (Part I)

Since OCR’ing a bank check with OpenCV and Python is much more complicated than OCR’ing a credit card, I’ve decided to break this guide into two parts (just one post would have been far too lengthy).

In Part I (today’s blog post), we will discuss two topics:

  1. First, we’ll learn about the MICR E-13B font, used by countries including the United States, United Kingdom, Canada, and others for check recognition.
  2. Second, we’ll discuss how to extract both the digits and symbols from a MICR E-13B reference image. This will enable us to extract ROIs for each of the characters and later use them to OCR a bank check. We’ll accomplish this using OpenCV contours and a bit of Python iterator “magic”.

Next week in Part II of this series, I’ll review how we can actually recognize each of these extracted symbols and digits using our reference ROIs and template matching.

The MICR E-13B font

Figure 1: The MICR E-13B font, commonly used for bank check recognition. We’ll be OCR’ing this bank check font using Python and OpenCV

MICR (Magnetic Ink Character Recognition) is a financial industry technology for processing documents. You will often find this magnetic ink in the E-13B format on the bottom of account statements and checks.

The E-13B variant of MICR contains 14 characters:

  • numerals: digits 0-9
  • ⑆ transit: bank branch delimiter
  • ⑇ amount: transaction amount delimiter
  • ⑈ on-us: customer account number delimiter
  • ⑉ dash: number delimiter (between routing and account number, for example)

For the four symbols shown above, we will later take advantage of the fact that each symbol consists of exactly three contours.

Now that we’ve learned about the MICR E-13B check font, let’s make some considerations before we dive into code.

Bank check character recognition is harder than it seems

Figure 2: Extracting digits and symbols from a bank check isn’t as simple as computing contours and extracting them as some symbols consist of multiple parts.

In our previous credit card OCR post, we had the simpler task of computing bounding boxes of a single contour for each digit.

However, that’s not the case for MICR E-13B.

In the MICR E-13B font used on bank checks, digits still have one contour each.

However, the control symbols have three contours for each character, making the task slightly more challenging.

We can’t use a simple contour and bounding box approach. Instead, we need to devise our own method to reliably extract both digits and symbols.

In the following section, we’ll walk through the steps to accomplish this.

Extracting MICR digits and symbols with OpenCV

Given the challenges associated with extracting bank check characters, it seems we have our work cut out for us.

Let’s begin tackling this problem by opening up a new file and inserting the following code:

Lines 2-7 handle importing our packages. Make sure that you have the following installed in your environment:

  • OpenCV: Select the installation guide appropriate for your system from this page.
  • scikit-image : This is pip-installable via pip install -U scikit-image .
  • numpy : Install via pip install numpy
  • imutils : This is pip-installable via pip install --upgrade imutils . I add features to imutils often, so if you already have a copy, this would be a good time to update it with the --upgrade  flag shown.

Tip: You can find installation tutorials by checking the “install” tag associated with my blog.

Now that we’ve imported our relevant packages (and installed them if needed), let’s build a function to extract the characters from the MICR font:

Line 9 begins what will be a fairly lengthy function for extracting the MICR digits and symbols. This function is broken down into five digestible chunks that we’ll review in the remainder of this section.

For starters, our function takes 4 parameters:

  • image : The MICR E-13B font image (provided in the code downloads).
  • charCnts : A list containing the contours of the characters in the reference image (we’ll explain how to obtain these contours later in the post).
  • minW : An optional parameter which represents the minimum character width. This helps us account for when we encounter 2 or 3 small contours which, together, make up one MICR character. The default value is a width of 5 pixels.
  • minH : The minimum character height. This parameter is optional and has a default value of 15 pixels. The usage rationale is the same as for minW .

On Line 13, we initialize an iterator for our charCnts list. List objects are inherently “iterable”: calling iter on a list returns an iterator object, and each call to next on that iterator yields the following item.

Note: Since we don’t have any special requirements for our list iterator (other than the typical traversing from left to right), we use the one built in to the standard Python list. If we did have special needs, we might create a special class with a custom generator or iterator. Unfortunately, Python iterators do not have a “hasNext” method like you may find in languages such as Java. Instead, Python raises a StopIteration exception when there are no more items in the iterable object. We account for this exception with a try-except block in this function.
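A minimal sketch of the iterator behavior described in the note, using a plain list of strings as a stand-in for the contour list. The names items, item_iter, and collected are illustrative only and do not come from the script:

```python
# stand-in for the list of character contours
items = ["a", "b", "c"]

# a list is iterable: iter() returns a separate iterator object,
# and next() pulls one item at a time from it
item_iter = iter(items)
collected = []

while True:
    try:
        collected.append(next(item_iter))
    except StopIteration:
        # there is no hasNext(); running out of items raises
        # StopIteration, which we treat as the loop's exit condition
        break

print(collected)
```

The same while/try/except shape is what lets the extraction function pull one, two, or three contours per pass without tracking an index by hand.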

Lines 14 and 15 initialize empty lists to hold our rois  (regions of interest), and locs  (ROI locations). We’ll return these lists in a tuple at the end of the function.

Let’s begin looping and see how iterators work:

In our function, we start an infinite loop on Line 19. Our exit condition will be part of the body of the loop (when we catch a StopIteration exception). To catch this exception, we need to open a try-except block on Line 20.

For each iteration of the loop, we grab the next  character contour by simply calling next(charIter)  (Line 23).

Let’s compute the bounding rectangle around the contour, c , on Line 24. From this function call, we can extract the (x, y)-coordinates and width/height of the rectangle.

We initialize roi on Line 25; we will store the character image in it shortly.

Next, we’ll check our bounding box width and height for size and take actions accordingly:

If the character contour’s dimensions are greater than or equal to the minimum width and height, respectively (Line 29), we take the following actions:

  1. Extract the roi  from the image using the coordinates and width/height from our bounding rectangle call (Line 31).
  2. Append roi  to rois  (Line 32).
  3. Append a tuple to locs  (Line 33). This tuple consists of the (x, y)-coordinates of two corners of the rectangle. We will return this list of locations later.
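The three steps above can be sketched with NumPy slicing. This is an illustration on a synthetic array rather than the reference image, and the box values are made up; it assumes the (x, y, w, h) layout that cv2.boundingRect returns:

```python
import numpy as np

# synthetic 20x30 single-channel "image" (rows x columns)
image = np.arange(20 * 30, dtype=np.uint16).reshape(20, 30)

# hypothetical bounding box in (x, y, w, h) order
(x, y, w, h) = (4, 2, 10, 15)

# NumPy indexes rows (y) first, then columns (x)
roi = image[y:y + h, x:x + w]

# store the location as two corners: (top-left x, top-left y,
# bottom-right x, bottom-right y)
loc = (x, y, x + w, y + h)

print(roi.shape, loc)
```

Note that the slice is written rows first, so the ROI's shape is (height, width) even though the box is given as width then height.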

Otherwise, we assume we are working with a MICR E-13B symbol and need to apply a slightly more advanced set of processing operations:

The else  block of the if-else has logic to analyze the special symbols containing multiple contours found in the MICR E-13B font. The first thing we do is build the parts  of the symbol on Line 41. The list, parts , contains three contours: (1) the contour we already grabbed on Line 23, (2) the next contour, and (3) the next-next contour. That’s the way iterators work — each time we call next, we are provided with the subsequent item.

Just as we need to know the bounding box for a character with one contour, we need to know the bounding box for a character containing three contours. To accomplish this, initialize four bounding box parameters, sXA  through sYB (Lines 42 and 43).

Now we’ll loop through the list of parts, which ideally represents one character/symbol. Line 46 begins this loop, and we first compute the bounding rectangle for the current item, p, on Line 49.

Using the bounding rectangle parameters, we compare and compute the minimums and maximums in relation to previous values (Lines 50-53). This is the reason we first initialized sXA  through sYB  to positive/negative infinity values — for code conciseness and readability this is a convenient way to do it.

Now that we’ve found the coordinates of the box surrounding the symbol, let’s extract the roi  from the image, append the roi  to rois , and append the box coordinates tuple to locs  (Lines 56-58).
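To make the min/max bookkeeping concrete, here is a small sketch that merges three part boxes into one enclosing box. The helper name merge_boxes and the sample coordinates are hypothetical; it works on plain (x, y, w, h) tuples so it runs without OpenCV:

```python
def merge_boxes(boxes):
    """Return the (xA, yA, xB, yB) box enclosing every (x, y, w, h) box."""
    # start at +/- infinity so the first part always replaces these values
    (sXA, sYA) = (float("inf"), float("inf"))
    (sXB, sYB) = (float("-inf"), float("-inf"))

    for (pX, pY, pW, pH) in boxes:
        # the top-left corner takes the smallest x and y seen so far
        sXA = min(sXA, pX)
        sYA = min(sYA, pY)
        # the bottom-right corner takes the largest right and bottom edges
        sXB = max(sXB, pX + pW)
        sYB = max(sYB, pY + pH)

    return (sXA, sYA, sXB, sYB)


# three made-up parts of a single symbol
parts = [(10, 5, 4, 12), (16, 8, 3, 6), (21, 5, 4, 12)]
print(merge_boxes(parts))  # (10, 5, 25, 17)
```

Because min and max return one of their arguments, the merged coordinates come back as integers once at least one part has been processed.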

The remaining code block of our function handles our while-loop exit condition and return statement.

If calling next on charIter (our iterator object) throws a StopIteration exception, then we have reached the last contour in the list and are attempting to grab a contour that does not exist. In this case, we break out of our loop. This logic is shown on Lines 62 and 63.

Finally, we return rois  and locs  in a convenient tuple on Line 66.
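Putting the pieces together, the control flow of the function can be sketched as follows. This is a simplified stand-in, not the script's listing: the function name group_character_boxes is hypothetical, it takes precomputed (x, y, w, h) boxes sorted left to right instead of contours, and it returns only the locations (no ROI slicing), so it runs in plain Python:

```python
def group_character_boxes(boxes, minW=5, minH=15):
    """Group left-to-right (x, y, w, h) boxes into one box per character."""
    boxIter = iter(boxes)
    locs = []

    while True:
        try:
            (x, y, w, h) = next(boxIter)

            if w >= minW and h >= minH:
                # large enough on both axes: treat it as a digit
                locs.append((x, y, x + w, y + h))
            else:
                # otherwise assume a three-part symbol and pull the
                # next two boxes from the same iterator
                parts = [(x, y, w, h), next(boxIter), next(boxIter)]
                xA = min(p[0] for p in parts)
                yA = min(p[1] for p in parts)
                xB = max(p[0] + p[2] for p in parts)
                yB = max(p[1] + p[3] for p in parts)
                locs.append((xA, yA, xB, yB))

        except StopIteration:
            # no boxes left (possibly mid-symbol): stop looping
            break

    return locs


# made-up input: digit, three symbol parts, digit
boxes = [(0, 0, 12, 20), (20, 0, 3, 20), (25, 4, 3, 12),
         (30, 4, 3, 12), (40, 0, 12, 20)]
locs = group_character_boxes(boxes)
print(locs)
```

One consequence of this design worth knowing: if the input ends partway through a symbol, the StopIteration raised inside the parts list ends the loop and that incomplete symbol is silently dropped.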

Now we are ready to parse command line arguments and continue on with the script:

On Lines 69-74 we establish two required command line arguments:

  • --image : Our query image. We won’t use this argument until next week’s post.
  • --reference : Our reference MICR E-13B font image.
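A minimal argparse sketch of these two required arguments. The -i and -r short flags match the usage message quoted in the comments below; the helper name build_arg_parser, the help strings, and the file names are assumptions for illustration:

```python
import argparse


def build_arg_parser():
    # both arguments are required, so argparse exits with an error
    # message if either one is missing
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--image", required=True,
                    help="path to the input (query) image")
    ap.add_argument("-r", "--reference", required=True,
                    help="path to the reference MICR E-13B font image")
    return ap


# parse an explicit list so the sketch runs without a real command line;
# in a script you would call parse_args() with no arguments instead
args = vars(build_arg_parser().parse_args(
    ["--image", "check.png", "--reference", "reference.png"]))
print(args["reference"])
```

Running a script built this way without the two arguments produces the “the following arguments are required” error that several readers report in the comments.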

Next, we’ll create “names” for each of the symbols/characters and store them in a list.

The above block is rather simple — we are just establishing names for the symbols we encounter from left to right in the reference image. These charNames  are specified in list form on Lines 83 and 84.

Note: Since OpenCV does not support drawing characters in unicode, we need to define “T” for transit, “U” for on-us, “A” for amount, and “D” for dash.
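As a sketch, the names list might look like the following. The ordering here (digits 1 through 9, then 0, then the four symbols) is an assumption; it has to match the left-to-right order of the characters in whatever reference image you use:

```python
# ASCII stand-ins for the characters, in the (assumed) left-to-right
# order in which they appear in the reference image
charNames = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "0",
             "T", "U", "A", "D"]

# what each single-letter symbol name stands for
symbolMeanings = {"T": "transit", "U": "on-us", "A": "amount", "D": "dash"}

print(len(charNames))  # 14
```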

Next, we’ll load our reference image into memory and perform some pre-processing:

In the above block we complete four tasks:

  1. Loading the --reference  image into memory as ref  (Line 89).
  2. Converting to grayscale (Line 90).
  3. Resizing to a width=400  (Line 91).
  4. Binary inverse threshold using Otsu’s method (Lines 92-93).

The result of these simple operations can be seen in Figure 3:

Figure 3: Thresholding our MICR E-13B image to reveal the individual characters and symbols.

The rest of this code walk-through is broken down into two parts. First, I’ll show you a logical and simple contour method along with the resulting image. Then, we’ll move on to a more advanced method which takes advantage of the function we wrote at the top of the script — extract_digits_and_symbols .

For both parts, there are some common pieces of data we’ll use including ref  (the reference image, which we just pre-processed) and refCnts  (reference contours, which we are just about to extract).

To extract the contours from the reference image, we make use of OpenCV’s  cv2.findContours  function (Lines 97 and 98).

Note: OpenCV 2.4, 3, and 4 return contours differently, so Line 99 has some logic to account for this.

Next, we sort the contours from left to right on Line 100.

We’re going to draw on the image, so we copy all channels to an image called clone  on Line 103.

The last step to the simple contour method, before displaying the results, is to loop over the sorted contours (Line 106). In this loop, we compute the bounding box of each contour (Line 109) and then draw a rectangle around it (Line 110).

Results are displayed by showing the image (Line 113) and pausing here until a key is pressed (Line 114) — see Figure 4:

Figure 4: The naïve method of extracting bank check symbols with OpenCV can extract digits, but fails to correctly compute the bounding box for each of the control symbols.

Do you see the problem with this approach? The issue is that we have 22 bounding boxes rather than the desired 14 bounding outlines (one for each character). Obviously this problem is solvable with a more advanced methodology.

The more advanced method is shown and described below:

Remember that long function, extract_digits_and_symbols , we wrote at the beginning of this script? It is now being put to use on Lines 118 and 119.

Next, we initialize an empty dictionary, chars , which will hold the name  and roi  of each symbol.

We follow this action by overwriting the clone  image (Line 123) with a new copy of ref  (to get rid of the rectangles we just drew).

Lastly, we loop over the characters (Line 126). We have three lists of the same length, which we can conveniently zip together to produce one 3-tuple per character. These 3-tuples are what we’re looping through.

In the body of the for-loop, first we draw a rectangle for each character on our clone  image (Lines 129-130).

Second, we resize the roi  to 36 by 36 pixels (Line 134) and update our chars  dictionary with the name  and roi  as the key-value pair.

The last step (mainly for debugging/developmental purposes), is to display each roi  on the screen until a key is pressed.

The resulting “better method” image is shown to the screen (Line 142) until a key is pressed (Line 143), and that ends our script.

Figure 5 shows the outcome:

Figure 5: By examining contour properties of each character/symbol, we can use Python iterator magic to correctly build the bounding boxes for each bank check control character.

Digit and symbol extraction results

Now that we have coded our MICR E-13B digit and symbol extractor, let’s give it a try.

Be sure to use the “Downloads” section of this blog post to download the source code + example images.

From there, execute the following script:

As the GIF below demonstrates, we have correctly extracted each of the characters:

Figure 6: Extracting each individual bank check digit and symbol using OpenCV and Python.

In Part II of this blog series, we’ll learn how to OCR each of these bank check characters using Python and OpenCV.


As today’s blog post demonstrated, OCR’ing a bank check is more difficult than OCR’ing a credit card — this is mainly due to the fact that bank check symbols consist of multiple parts.

We cannot assume that each contour in our reference font image maps to an individual character.

Instead, we need to insert extra logic that examines the dimensions of each contour and determines if we are examining a digit or a symbol.

In the case that we have found a symbol, we need to grab the next two contours to build our bounding box (since a bank check control character consists of three distinct parts).

Now that we know how to extract bank check digits and symbols from an image, we can move on to actually OCR’ing the bank check using Python and OpenCV.

To be notified when the next bank check OCR post goes live, just enter your email address in the form below!

See you next week.




40 Responses to Bank check OCR with OpenCV and Python (Part I)

  1. Lucian July 24, 2017 at 12:00 pm #


    It is possible to do same thing (more or less) with handwritten numbers/letters ?
    Most, the extraction of them from the image than the recognition..

    Thank you

    • Adrian Rosebrock July 24, 2017 at 3:28 pm #

      Yes, absolutely. I cover the basics of handwritten digit recognition inside Practical Python and OpenCV. More advanced methods are starting to use deep learning to recognize character sequences as well.

  2. Saurabh July 24, 2017 at 1:30 pm #

    Hi Adrian,

    Great article. I am working on a OCR problem where i have images of container where alphanumeric numbers are mentioned (container number) , i need to create an OCR engine to extract these numbers in the text.

    Please advise if i should follow the similar steps as mentioned in your last couple of blogs or need to try a separate method.


    • Adrian Rosebrock July 24, 2017 at 3:28 pm #

      Hi Saurabh — without seeing the images you are working with, it’s hard for me to give you any concrete suggestions.

  3. Zara July 27, 2017 at 7:01 am #

    Reading your blog on regular basis and I learned so many things from your blog. For given blog, how can we find the subjective area of bank check image? Is it possible to detect character’s bounding box dynamically if minW and minH value are not given?

    • Adrian Rosebrock July 28, 2017 at 9:51 am #

      I’m glad to hear you are enjoying the PyImageSearch blog, Zara! As for the account/routing number of the bank check (and how to automatically detect it), I’ll be covering that in Part II of this blog post which will be published on Monday, July 31st.

  4. Sachin Tiwari July 31, 2017 at 5:23 pm #

    Informative post. I got a question though, how is below for loop helping in figuring out the single bounded rectangle around the symbols.

    for p in parts:
        # compute the bounding box for the part, then
        # update our bookkeeping variables
        (pX, pY, pW, pH) = cv2.boundingRect(p)
        sXA = min(sXA, pX)
        sYA = min(sYA, pY)
        sXB = max(sXB, pX + pW)
        sYB = max(sYB, pY + pH)

    • Adrian Rosebrock August 1, 2017 at 9:37 am #

      Each symbol consists of three parts. We loop over the three parts and compute their respective bounding boxes. The top-left corner of the bounding box will have the smallest (x, y)-coordinates. The bottom-right corner of the bounding box will have the largest (x, y)-coordinates. That is what the code is doing.

  5. shruthi August 21, 2017 at 8:54 am #

    sir your blog is very much useful for us.. can i know what do u going to teach to next session.
    thanks for this blog

    • Adrian Rosebrock August 21, 2017 at 3:38 pm #

      Hi Shruthi — the best way to keep up to date with what I am teaching is to signup for the PyImageSearch Newsletter.

  6. Richard June 17, 2018 at 5:46 pm #

    Hi Adrian,

    This blog is very nice and explained everything in clarity. I have signed up for the newsletter as well.

    My requirement is to read the printed characters on the check like bank name, payer name and amount. Could you please let me know if you have any other blog to extract those chars or help me with the sample code.

    • Adrian Rosebrock June 19, 2018 at 8:48 am #

      Hey Richard — try taking a look at the Google Vision API as well as Tesseract.

  7. Koustav Dutta June 30, 2018 at 8:21 am #

    Thanks a lot Sir
    Can you please give a tutorial on Handwritten digits and alphabets recognition using CV and Deep Learning ?

  8. Santiago de Pena July 20, 2018 at 4:35 pm #

    Hi, Arian.
    Is it possible to do the same with invoices? Get structural data?

    • Adrian Rosebrock July 21, 2018 at 9:14 am #

      Yes, but it’s significantly harder. You’ll want to look up “image registration” algorithms to help fit your invoice to a template/form.

  9. Ainavilli Venkat August 21, 2018 at 9:38 am #

    Hello sir ,when i am executing this code my code is giving this output …can you solve me ..

    usage: [-h] -i IMAGE -r REFERENCE error: the following arguments are required: -i/--image, -r/--reference
    Press any key to continue . . .

    • Adrian Rosebrock August 22, 2018 at 9:31 am #

      Read this guide on command line arguments and you’ll be up and running in no time 🙂

      • Ainavilli Venkat August 23, 2018 at 9:53 am #

        thank you i got the what it means….you blogs are very usefull and fantastic.

  10. supriya November 19, 2018 at 2:37 am #

    error: the following arguments are required: -i/--image, -r/--reference this error are coming when i run this file in pycharm

    • Adrian Rosebrock November 19, 2018 at 12:26 pm #

      Read this tutorial on command line arguments, including how to set them in PyCharm. It will help you resolve your error.

      • supriya November 20, 2018 at 4:36 am #

        Thank u so much sir 🙂 u giving very fast reply and i resolve that issue

        • Adrian Rosebrock November 20, 2018 at 6:05 am #

          Congrats on resolving the issue!

  11. Vikram December 24, 2018 at 5:30 am #

    Hi Adrian Rosebrock.

    Every one is suggesting your blog for characteristic recognition. This blog on check id detection is awesome. But here, if we have hand written, then how to detect that either name or number and in particular place? waiting for your rply tq.

    • Adrian Rosebrock December 27, 2018 at 10:40 am #

      Thanks Vikram, I’m glad you are enjoying the blog. Handwriting recognition is much more challenging. What specifically on the check are you trying to OCR? The signature or the amount on the check that is written?

  12. Gaurav January 8, 2019 at 3:37 am #

    Hi Adrian,

    Thanks for this awesome tutorial, but can you please let us know on how to identify if MICR code is genuine or not ? How to detect the fakeness ?

    • Adrian Rosebrock January 8, 2019 at 6:37 am #

      What constitutes a fake code? I assume it’s somehow related to the numbers/digits themselves? In that case it’s probably easier to just OCR the code and then run whatever “fakeness” checker algorithm on top of the OCR results.

  13. Mohd Anas Khan January 16, 2019 at 6:01 am #

    Hello Adrian!

    I am trying to extract information(Address, Product ID, COmpany Name) from the image(e. g -, I have used pytesseract, but it’s giving me the whole text in which it’s hard to locate address, name etc. Can you tell me how can I locate complete address from the image.
    Please do suggest some way if possible.
    Thank you

    • Adrian Rosebrock January 16, 2019 at 9:30 am #

      Which Tesseract tutorial are you following? Try following my Tesseract v4 tutorial and see if that helps.

  14. Yash February 20, 2019 at 12:24 pm #

    Hi Adrian,

    Really enjoyed this blog. That would be great if you could write a blog for invoice, I need to pull only few records from multiple invoices and without any template system. Each and every invoice has different format and Tesseract pulls out all of the information. I want only Invoice number, contact number, company name and total amount with tax details. I could not find any solution for this issue.If you can not write any solution for this issue then please recommend me any tutorial as I am very new to design this kind of solution. Thanks!

    • Adrian Rosebrock February 20, 2019 at 12:25 pm #

      I’ve been considering writing a document understanding tutorial but I would be making the assumption there would be a template to use. Is there a reason you don’t have a template in this use case? Could you create a template? It’s far, far easier to work with templates.

      • Yash February 21, 2019 at 3:18 am #

        We have a ready made solution or library available for template based OCR of Invoice and that’s called Invoice2data library. It’s just a one minute task to install and pull all of the records from an invoice using this template. But I don’t need template based system because in a large organization you receive thousands of invoices daily and all of them with different formats.

        I want to analyze only few details from each of the invoice. So lets say even if I pull all of the records from invoice so my code should be capable enough to extract all of the information and save the out come in an excel or database for the specific content like all invoice numbers should be save in one column for all the invoices, Contact details in another column and same for other content.

        As I am new to computer vision but still trying to solve this problem, that would be great if you could help or may be guide to implement the solution.

        • Adrian Rosebrock February 22, 2019 at 6:39 am #

          Thanks for mentioning invoice2data, I had never heard of that library before. It looks pretty neat.

          The project you’re trying to implement is possible but it will be challenging. You’ll really need to invest in your skills first (it’s not something a computer vision novice will be able to easily build).

          Since you’re new to computer vision I would recommend first reading through Practical Python and OpenCV to help you learn the basics and fundamentals. If you have the budget and time, the PyImageSearch Gurus course would be a better option since it’s far more in-depth.

          I would certainly recommend one of them to help you get your start in computer vision.

  15. Mohammad Bhuiyan June 21, 2019 at 8:58 pm #

    Hi Adrian,

    Your blog is excellent. Can you please write a blog for doing the same thing in java. Basically how to read MICR code from the cheque image and display it with other information likes account number, amount , name . I dont know python. Thanks in advance again.

    • Adrian Rosebrock June 26, 2019 at 1:42 pm #

      Sorry, I only provide Python code here. Good luck!

  16. Farnaz June 25, 2019 at 6:36 am #

    Thank you!! It was a great tutorial. I really enjoyed working through this and I learnt a lot. 🙂

    • Adrian Rosebrock June 26, 2019 at 1:06 pm #

      Thanks Farnaz, I’m glad you enjoyed it!

  17. Prashant July 3, 2019 at 6:02 am #

    Hi Adrian

    You have really helped me learn through your blogs.
    Right now I am trying to make a software to edit pdfs. I could not find any references or solution for that. Could you help me with the code for that. I am kinda new to this.

  18. Rajni Arora August 21, 2019 at 1:31 am #

    Hello Adrian,

    Thank You so much for your blogs. I have used Barcode extraction from videos that one is awesome.


  1. Bank check OCR with OpenCV and Python (Part II) - PyImageSearch - July 31, 2017

    […] Last week we learned how to extract MICR E-13B digits and symbols from input images. Today we are going to take this knowledge and use it to actually recognize each of the characters, thereby allowing us to OCR the actual bank check and routing number. […]
