OpenCV OCR and text recognition with Tesseract

In this tutorial, you will learn how to apply OpenCV OCR (Optical Character Recognition). We will perform both (1) text detection and (2) text recognition using OpenCV, Python, and Tesseract.

A few weeks ago I showed you how to perform text detection using OpenCV’s EAST deep learning model. Using this model we were able to detect and localize the bounding box coordinates of text contained in an image.

The next step is to take each of these areas containing text and actually recognize and OCR the text using OpenCV and Tesseract.

To learn how to build your own OpenCV OCR and text recognition system, just keep reading!

In order to perform OpenCV OCR text recognition, we’ll first need to install Tesseract v4 which includes a highly accurate deep learning-based model for text recognition.

From there, I’ll show you how to write a Python script that:

  1. Performs text detection using OpenCV’s EAST text detector, a highly accurate deep learning text detector used to detect text in natural scene images.
  2. Once we have detected the text regions with OpenCV, we’ll then extract each of the text ROIs and pass them into Tesseract, enabling us to build an entire OpenCV OCR pipeline!

Finally, I’ll wrap up today’s tutorial by showing you some sample results of applying text recognition with OpenCV, as well as discussing some of the limitations and drawbacks of the method.

Let’s go ahead and get started with OpenCV OCR!

How to install Tesseract 4

Figure 1: The Tesseract OCR engine has been around since the 1980s. As of 2018, it now includes built-in deep learning capability making it a robust OCR tool (just keep in mind that no OCR system is perfect). Using Tesseract with OpenCV’s EAST detector makes for a great combination.

Tesseract, a highly popular OCR engine, was originally developed by Hewlett Packard in the 1980s and was then open-sourced in 2005. Google adopted the project in 2006 and has been sponsoring it ever since.

If you’ve read my previous post on Using Tesseract OCR with Python, you know that Tesseract can work very well under controlled conditions…

…but will perform quite poorly if there is a significant amount of noise or your image is not properly preprocessed and cleaned before applying Tesseract.

Just as deep learning has impacted nearly every facet of computer vision, the same is true for character recognition and handwriting recognition.

Deep learning-based models have managed to obtain unprecedented text recognition accuracy, far beyond traditional feature extraction and machine learning approaches.

It was only a matter of time until Tesseract incorporated a deep learning model to further boost OCR accuracy — and in fact, that time has come.

The latest release of Tesseract (v4) supports deep learning-based OCR that is significantly more accurate.

The underlying OCR engine itself utilizes a Long Short-Term Memory (LSTM) network, a kind of Recurrent Neural Network (RNN).

In the remainder of this section, you will learn how to install Tesseract v4 on your machine.

Later in this blog post, you’ll learn how to combine OpenCV’s EAST text detection algorithm with Tesseract v4 in a single Python script to automatically perform OpenCV OCR.

Let’s get started configuring your machine!

Install OpenCV

To run today’s script you’ll need OpenCV installed. Version 3.4.2 or better is required.

To install OpenCV on your system, just follow one of my OpenCV installation guides, ensuring that you download the correct/desired version of OpenCV and OpenCV-contrib in the process.

Install Tesseract 4 on Ubuntu

The exact commands used to install Tesseract 4 on Ubuntu will be different depending on whether you are using Ubuntu 18.04 or Ubuntu 17.04 and earlier.

To check your Ubuntu version you can use the lsb_release  command:
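For example, on my machine (your output will differ):

    $ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 18.04.1 LTS
    Release:        18.04
    Codename:       bionic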

As you can see, I am running Ubuntu 18.04 but you should check your Ubuntu version before continuing.

For Ubuntu 18.04 users, Tesseract 4 is part of the main apt-get repository, making it super easy to install Tesseract via the following command:
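A single apt-get line should do it:

    $ sudo apt-get install tesseract-ocr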

If you’re using Ubuntu 14, 16, or 17 though, you’ll need a few extra commands due to dependency requirements.

The good news is that Alexander Pozdnyakov has created an Ubuntu PPA (Personal Package Archive) for Tesseract, which makes it super easy to install Tesseract 4 on older versions of Ubuntu.

Just add the alex-p/tesseract-ocr  PPA repository to your system, update your package definitions, and then install Tesseract:
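That boils down to three commands:

    $ sudo add-apt-repository ppa:alex-p/tesseract-ocr
    $ sudo apt-get update
    $ sudo apt-get install tesseract-ocr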

Assuming there are no errors, you should now have Tesseract 4 installed on your machine.

Install Tesseract 4 on macOS

Installing Tesseract on macOS is straightforward provided you have Homebrew, macOS’ “unofficial” package manager, installed on your system.

Just run the following command, making sure to specify the --HEAD  switch, and Tesseract v4 will be installed on your Mac:
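That one-liner is:

    $ brew install tesseract --HEAD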

If you already have Tesseract installed on your Mac (if you followed my previous Tesseract install tutorial, for example), you’ll first want to unlink the original install:
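The unlink is likewise a one-liner:

    $ brew unlink tesseract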

And from there you can run the install command.

Verify your Tesseract version

Figure 2: Screenshot of my system terminal where I have entered the tesseract -v command to query for the version. I have verified that I have Tesseract 4 installed.

Once you have Tesseract installed on your machine you should execute the following command to verify your Tesseract version:
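For example (your exact version string may differ):

    $ tesseract -v
    tesseract 4.0.0-beta.3
    ...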

As long as you see tesseract 4  somewhere in the output you know that you have the latest version of Tesseract installed on your system.

Install your Tesseract + Python bindings

Now that we have the Tesseract binary installed, we need to install the Tesseract + Python bindings so our Python scripts can communicate with Tesseract and perform OCR on images processed by OpenCV.

If you are using a Python virtual environment (which I highly recommend so you can have separate, independent Python environments) use the workon  command to access your virtual environment:
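For example:

    $ workon cv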

In this case, I am accessing a Python virtual environment named cv  (short for “computer vision”) — you can replace cv  with whatever you have named your virtual environment.

From there, we’ll use pip to install Pillow, a more Python-friendly version of PIL, followed by pytesseract and imutils:
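All three are a quick pip install away:

    $ pip install pillow
    $ pip install pytesseract
    $ pip install imutils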

Now open up a Python shell and confirm that you can import both OpenCV and pytesseract :
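A quick sanity check along these lines will do:

    $ python
    >>> import cv2
    >>> import pytesseract
    >>>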

Congratulations!

If you don’t see any import errors, your machine is now configured to perform OCR and text recognition with OpenCV.

Let’s move on to the next section (skipping the Pi instructions) where we’ll learn how to actually implement a Python script to perform OpenCV OCR.

Install Tesseract 4 and supporting software on Raspberry Pi and Raspbian

Note: You may skip this section if you aren’t on a Raspberry Pi.

Inevitably, I’ll be asked how to install Tesseract 4 on the Raspberry Pi.

The following instructions aren’t for the faint of heart — you may run into problems. They are tested, but mileage may vary on your own Raspberry Pi.

First, uninstall your OpenCV bindings from system site packages:
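On my Pi that looked roughly like this (the path is an assumption, so adjust it for your Python version and install location):

    $ sudo rm /usr/local/lib/python3.5/site-packages/cv2.so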

Here I used the rm command since my cv2.so file in site-packages is just a symlink. If the cv2.so bindings are your real OpenCV bindings then you may want to move the file out of site-packages for safekeeping.

Now install two QT packages on your system:
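If you’ve followed my other Raspbian guides these are the usual GUI-related suspects; the exact package names here are my assumption, so double-check them against your Raspbian version:

    $ sudo apt-get install libqtgui4 libqt4-test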

Then, install tesseract via Thortex’s GitHub:
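The gist is to clone the thortex/rpi3-tesseract repository and run the install scripts it ships with (the script names below are as I recall them, so verify against the repo’s README):

    $ git clone https://github.com/thortex/rpi3-tesseract.git
    $ cd rpi3-tesseract/release
    $ ./install_requires_related2leptonica.sh
    $ ./install_leptonica.sh
    $ ./install_tesseract.sh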

For whatever reason, the trained English language data file was missing from the install so I needed to download and move it into the proper directory:
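Something along these lines, where both the tessdata URL and the destination directory are assumptions you should verify against your own build:

    $ wget https://github.com/tesseract-ocr/tessdata/raw/4.00/eng.traineddata
    $ sudo mv -v eng.traineddata /usr/local/share/tessdata/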

From there, create a new Python virtual environment:
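I named mine cv_tesseract:

    $ mkvirtualenv cv_tesseract -p python3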

And install the necessary packages:
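A reasonable set, given what the rest of this tutorial uses (the pip-installable OpenCV here is a convenience, not a from-source build):

    $ workon cv_tesseract
    $ pip install opencv-contrib-python imutils pytesseract pillow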

You’re done! Just keep in mind that your experience may vary.

Understanding OpenCV OCR and Tesseract text recognition

Figure 3: The OpenCV OCR pipeline.

Now that we have OpenCV and Tesseract successfully installed on our system we need to briefly review our pipeline and the associated commands.

To start, we’ll apply OpenCV’s EAST text detector to detect the presence of text in an image. The EAST text detector will give us the bounding box (x, y)-coordinates of text ROIs.

We’ll extract each of these ROIs and then pass them into Tesseract v4’s LSTM deep learning text recognition algorithm.

The output of the LSTM will give us our actual OCR results.

Finally, we’ll draw the OpenCV OCR results on our output image.

But before we actually get to our project, let’s briefly review the Tesseract command (which will be called under the hood by the pytesseract  library).

When calling the tesseract binary we need to supply a number of flags. The three most important ones are -l, --oem, and --psm.

The -l  flag controls the language of the input text. We’ll be using eng  (English) for this example but you can see all the languages Tesseract supports here.
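As a quick illustration of the flags in action, a typical end-to-end invocation looks like this (image.png is just a placeholder filename):

    $ tesseract image.png stdout -l eng --oem 1 --psm 7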

The --oem  argument, or OCR Engine Mode, controls the type of algorithm used by Tesseract.

You can see the available OCR Engine Modes by executing the following command:
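The output from Tesseract v4 looks like this (later releases may word the modes differently):

    $ tesseract --help-oem
    OCR Engine modes:
      0    Legacy engine only.
      1    Neural nets LSTM engine only.
      2    Legacy + LSTM engines.
      3    Default, based on what is available.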

We’ll be using --oem 1  to indicate that we wish to use the deep learning LSTM engine only.

The final important flag, --psm, controls the automatic Page Segmentation Mode used by Tesseract:
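Here is the output from Tesseract v4 (again, later releases may word these differently):

    $ tesseract --help-psm
    Page segmentation modes:
      0    Orientation and script detection (OSD) only.
      1    Automatic page segmentation with OSD.
      2    Automatic page segmentation, but no OSD, or OCR.
      3    Fully automatic page segmentation, but no OSD. (Default)
      4    Assume a single column of text of variable sizes.
      5    Assume a single uniform block of vertically aligned text.
      6    Assume a single uniform block of text.
      7    Treat the image as a single text line.
      8    Treat the image as a single word.
      9    Treat the image as a single word in a circle.
     10    Treat the image as a single character.
     11    Sparse text. Find as much text as possible in no particular order.
     12    Sparse text with OSD.
     13    Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.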

For OCR’ing text ROIs I’ve found that modes 6  and 7  work well, but if you’re OCR’ing large blocks of text then you may want to try 3 , the default mode.

Whenever you find yourself obtaining incorrect OCR results I highly recommend adjusting the --psm, as it can have a dramatic influence on your output.

Project structure

Be sure to grab the zip from the “Downloads” section of the blog post.

From there unzip the file and navigate into the directory. The tree  command allows us to see the directory structure in our terminal:
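Here’s the gist of the layout (the example image filenames are illustrative):

    $ tree --dirsfirst
    .
    ├── images
    │   ├── example_01.jpg
    │   ├── example_02.jpg
    │   ├── example_03.jpg
    │   ├── example_04.jpg
    │   ├── example_05.jpg
    │   └── example_06.jpg
    ├── frozen_east_text_detection.pb
    └── text_recognition.py

    1 directory, 8 files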

Our project contains one directory and two notable files:

  • images/ : A directory containing six test images containing scene text. We will attempt OpenCV OCR with each of these images.
  • frozen_east_text_detection.pb : The EAST text detector. This CNN  is pre-trained for text detection and ready to go. I did not train this model — it is provided with OpenCV; I’ve also included it in the “Downloads” for your convenience.
  • text_recognition.py : Our script for OCR — we’ll review this script line by line. The script utilizes the EAST text detector to find regions of text in the image and then takes advantage of Tesseract v4 for recognition.

Implementing our OpenCV OCR algorithm

We are now ready to perform text recognition with OpenCV!

Open up the text_recognition.py  file and insert the following code:
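A sketch of the top of the script (the exact file ships in the “Downloads”):

    # import the necessary packages
    from imutils.object_detection import non_max_suppression
    import numpy as np
    import pytesseract
    import argparse
    import cv2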

Today’s OCR script requires five imports, one of which is built into OpenCV.

Most notably, we’ll be using pytesseract  and OpenCV. My imutils  package will be used for non-maxima suppression as OpenCV’s NMSBoxes  function doesn’t seem to be working with the Python API. I’ll also note that NumPy is a dependency for OpenCV.

The argparse  package is included with Python and handles command line arguments — there is nothing to install.

Now that our imports are taken care of, let’s implement the decode_predictions  function:
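In condensed form, the function looks along these lines (it mirrors the EAST post; treat it as a sketch rather than the exact shipped code):

    def decode_predictions(scores, geometry):
        # grab the number of rows and columns from the scores volume, then
        # initialize our bounding boxes and corresponding confidences
        (numRows, numCols) = scores.shape[2:4]
        rects = []
        confidences = []

        # loop over the rows and columns of the score map
        for y in range(0, numRows):
            scoresData = scores[0, 0, y]
            xData0 = geometry[0, 0, y]
            xData1 = geometry[0, 1, y]
            xData2 = geometry[0, 2, y]
            xData3 = geometry[0, 3, y]
            anglesData = geometry[0, 4, y]

            for x in range(0, numCols):
                # ignore low-probability regions (NOTE: args is the command
                # line argument dictionary parsed later in the script)
                if scoresData[x] < args["min_confidence"]:
                    continue

                # the output maps are 4x smaller than the input image
                (offsetX, offsetY) = (x * 4.0, y * 4.0)

                # extract the rotation angle, then compute its sine/cosine
                angle = anglesData[x]
                cos = np.cos(angle)
                sin = np.sin(angle)

                # derive the width/height of the bounding box, then use the
                # offset and geometry to compute its (x, y)-coordinates
                h = xData0[x] + xData2[x]
                w = xData1[x] + xData3[x]
                endX = int(offsetX + (cos * xData1[x]) + (sin * xData2[x]))
                endY = int(offsetY - (sin * xData1[x]) + (cos * xData2[x]))
                startX = int(endX - w)
                startY = int(endY - h)

                # collect the horizontal bounding box and its score
                rects.append((startX, startY, endX, endY))
                confidences.append(scoresData[x])

        # return a tuple of the bounding boxes and associated confidences
        return (rects, confidences)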

The decode_predictions  function begins on Line 8 and is explained in detail inside the EAST text detection post. The function:

  1. Uses a deep learning-based text detector to detect (not recognize) regions of text in an image.
  2. The text detector produces two arrays, one containing the probability of a given area containing text, and another that maps the score to a bounding box location in the input image.

As we’ll see in our OpenCV OCR pipeline, the EAST text detector model will produce two variables:

  • scores : Probabilities for positive text regions.
  • geometry : The bounding boxes of the text regions.

…each of which is a parameter to the decode_predictions  function.

The function processes this input data, resulting in a tuple containing (1) the bounding box locations of the text and (2) the corresponding probability of that region containing text:

  • rects : This value is based on geometry  and is in a more compact form so we can later apply NMS.
  • confidences : The confidence values in this list correspond to each rectangle in rects .

Both of these values are returned by the function.

Note: Ideally, a rotated bounding box would be included in rects , but it isn’t exactly straightforward to extract a rotated bounding box for today’s proof of concept. Instead, I’ve computed the horizontal bounding rectangle which does take angle  into account. The angle  is made available on Line 41 if you would like to extract a rotated bounding box of a word to pass into Tesseract.

For further details on the code block above, please see this blog post.

From there let’s parse our command line arguments:
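A sketch of the argument parser (the flag names match the lists below; the default values are reasonable starting points):

    # construct the argument parser and parse the arguments
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--image", type=str,
        help="path to input image")
    ap.add_argument("-east", "--east", type=str,
        help="path to input EAST text detector")
    ap.add_argument("-c", "--min-confidence", type=float, default=0.5,
        help="minimum probability required to inspect a region")
    ap.add_argument("-w", "--width", type=int, default=320,
        help="nearest multiple of 32 for resized width")
    ap.add_argument("-e", "--height", type=int, default=320,
        help="nearest multiple of 32 for resized height")
    ap.add_argument("-p", "--padding", type=float, default=0.0,
        help="amount of padding to add to each border of ROI")
    args = vars(ap.parse_args())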

Our script requires two command line arguments:

  • --image : The path to the input image.
  • --east : The path to the pre-trained EAST text detector.

Optionally, the following command line arguments may be provided:

  • --min-confidence : The minimum probability of a detected text region.
  • --width : The width our image will be resized to prior to being passed through the EAST text detector. Our detector requires multiples of 32.
  • --height : Same as the width, but for the height. Again, our detector requires multiples of 32 for the resized height.
  • --padding : The (optional) amount of padding to add to each ROI border. You might try values of 0.05  for 5% or 0.10  for 10% (and so on) if you find that your OCR result is incorrect.

From there, we will load + preprocess our image and initialize key variables:
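Roughly, with variable names following the walkthrough below:

    # load the input image, keep a copy for drawing, and grab its dimensions
    image = cv2.imread(args["image"])
    orig = image.copy()
    (origH, origW) = image.shape[:2]

    # set the new width and height, then compute the ratios between the
    # original and new dimensions for rescaling boxes later
    (newW, newH) = (args["width"], args["height"])
    rW = origW / float(newW)
    rH = origH / float(newH)

    # resize the image, ignoring aspect ratio
    image = cv2.resize(image, (newW, newH))
    (H, W) = image.shape[:2]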

Our image  is loaded into memory and copied (so we can later draw our output results on it) on Lines 82 and 83.

We grab the original width and height (Line 84) and then extract the new width and height from the args  dictionary (Line 88).

Using both the original and new dimensions, we calculate ratios used to scale our bounding box coordinates later in the script (Lines 89 and 90).

Our image  is then resized, ignoring aspect ratio (Line 93).

Next, let’s work with the EAST text detector:
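In sketch form:

    # define the two output layer names for the EAST model: the first is
    # the sigmoid probabilities, the second the bounding box geometry
    layerNames = [
        "feature_fusion/Conv_7/Sigmoid",
        "feature_fusion/concat_3"]

    # load the pre-trained EAST text detector
    print("[INFO] loading EAST text detector...")
    net = cv2.dnn.readNet(args["east"])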

Our two output layer names are put into list form on Lines 99-101. To learn why these two output names are important, you’ll want to refer to my original EAST text detection tutorial.

Then, our pre-trained EAST neural network is loaded into memory (Line 105).

I cannot emphasize this enough: you need OpenCV 3.4.2 at a minimum to have the  cv2.dnn.readNet  implementation.

The first bit of “magic” occurs next:
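Along these lines:

    # construct a blob from the image and perform a forward pass of the
    # network to obtain the scores and geometry maps
    blob = cv2.dnn.blobFromImage(image, 1.0, (W, H),
        (123.68, 116.78, 103.94), swapRB=True, crop=False)
    net.setInput(blob)
    (scores, geometry) = net.forward(layerNames)

    # decode the predictions, then apply non-maxima suppression to
    # suppress weak, overlapping bounding boxes
    (rects, confidences) = decode_predictions(scores, geometry)
    boxes = non_max_suppression(np.array(rects), probs=confidences)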

To determine text locations we:

  • Construct a blob  on Lines 109 and 110. Read more about the process here.
  • Pass the blob  through the neural network, obtaining scores  and geometry  (Lines 111 and 112).
  • Decode the predictions with the previously defined decode_predictions  function (Line 116).
  • Apply non-maxima suppression via my imutils method (Line 117). NMS effectively takes the most likely text regions, eliminating other overlapping regions.

Now that we know where the text regions are, we need to take steps to recognize the text! We begin to loop over the bounding boxes and process the results, preparing the stage for actual text recognition:
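A sketch of that loop:

    # initialize the list of results
    results = []

    # loop over the bounding boxes
    for (startX, startY, endX, endY) in boxes:
        # scale the coordinates back to the original image size
        startX = int(startX * rW)
        startY = int(startY * rH)
        endX = int(endX * rW)
        endY = int(endY * rH)

        # compute the padding deltas and apply them, clamping to the image
        dX = int((endX - startX) * args["padding"])
        dY = int((endY - startY) * args["padding"])
        startX = max(0, startX - dX)
        startY = max(0, startY - dY)
        endX = min(origW, endX + dX)
        endY = min(origH, endY + dY)

        # extract the (padded) ROI from the original image
        roi = orig[startY:endY, startX:endX]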

We initialize the results  list to contain our OCR bounding boxes and text on Line 120.

Then we begin looping over the boxes  (Line 123) where we:

  • Scale the bounding boxes based on the previously computed ratios (Lines 126-129).
  • Pad the bounding boxes (Lines 134-141).
  • And finally, extract the padded roi  (Line 144).

Our OpenCV OCR pipeline can be completed by using a bit of Tesseract v4 “magic”:
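Continuing inside the loop body sketched above:

        # set the Tesseract config: English, LSTM engine, single line of text
        config = ("-l eng --oem 1 --psm 7")
        text = pytesseract.image_to_string(roi, config=config)

        # add the bounding box coordinates and OCR'd text to our results
        results.append(((startX, startY, endX, endY), text))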

Taking note of the comment in the code block, we set our Tesseract config  parameters on Line 151 (English language, LSTM neural network, and single-line of text).

Note: You may need to configure the --psm  value using my instructions at the top of this tutorial if you find yourself obtaining incorrect OCR results.

The pytesseract  library takes care of the rest on Line 152 where we call pytesseract.image_to_string , passing our roi  and config string .

Boom! In two lines of code, you have used Tesseract v4 to recognize a text ROI in an image. Just remember, there is a lot happening under the hood.

Our result (the bounding box values and the actual text string) is appended to the results list (Line 156).

Then we continue this process for other ROIs at the top of the loop.

Now let’s display/print the results to see if it actually worked:
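A sketch of the final block:

    # sort the results from top to bottom by their bounding box y-coordinate
    results = sorted(results, key=lambda r: r[0][1])

    # loop over the results
    for ((startX, startY, endX, endY), text) in results:
        # display the OCR'd text in the terminal
        print("OCR TEXT")
        print("========")
        print("{}\n".format(text))

        # strip out non-ASCII characters so we can draw the text with
        # cv2.putText, then draw the box and text on a copy of the original
        text = "".join([c if ord(c) < 128 else "" for c in text]).strip()
        output = orig.copy()
        cv2.rectangle(output, (startX, startY), (endX, endY),
            (0, 0, 255), 2)
        cv2.putText(output, text, (startX, startY - 20),
            cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 0, 255), 3)

        # show the output image and wait for a keypress
        cv2.imshow("Text Detection", output)
        cv2.waitKey(0)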

Our results are sorted  from top to bottom on Line 159 based on the y-coordinate of the bounding box (though you may wish to sort them differently).

From there, looping over the results , we:

  • Print the OCR’d text  to the terminal (Lines 164-166).
  • Strip out non-ASCII characters from text, as OpenCV does not support non-ASCII characters in the cv2.putText function (Line 171).
  • Draw (1) a bounding box surrounding the ROI and (2) the result text  above the ROI (Lines 173-176).
  • Display the output and wait for any key to be pressed (Lines 179 and 180).

OpenCV text recognition results

Now that we’ve implemented our OpenCV OCR pipeline, let’s see it in action.

Be sure to use the “Downloads” section of this blog post to download the source code, OpenCV EAST text detector model, and the example images.

From there, open up a command line, navigate to where you downloaded + extracted the zip, and execute the following command:
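Something like the following (swap in any of the example images):

    $ python text_recognition.py --east frozen_east_text_detection.pb \
        --image images/example_01.jpg
    [INFO] loading EAST text detector...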

Figure 4: Our first trial of OpenCV OCR is a success.

We’re starting with a simple example.

Notice how our OpenCV OCR system was able to correctly (1) detect the text in the image and then (2) recognize the text as well.

The next example is more representative of text we would see in a real-world image:

Figure 5: A more complicated picture of a sign with white background is OCR’d with OpenCV and Tesseract 4.

Again, notice how our OpenCV OCR pipeline was able to correctly localize and recognize the text; however, in our terminal output we see a registered trademark Unicode symbol — Tesseract was likely confused here as the bounding box reported by OpenCV’s EAST text detector bled into the grassy shrubs/plants behind the sign.

Let’s look at another OpenCV OCR and text recognition example:

Figure 6: A large sign containing three words is properly OCR’d using OpenCV, Python, and Tesseract.

In this case, there are three separate text regions.

OpenCV’s text detector is able to localize each of them — we then apply OCR to correctly recognize each text region as well.

Our next example shows the importance of adding padding in certain circumstances:

Figure 7: Our OpenCV OCR pipeline has trouble with the text regions identified by OpenCV’s EAST detector in this scene of a bake shop. Keep in mind that no OCR system is perfect in all cases. Can we do better by changing some parameters, though?

In the first attempt of OCR’ing this bake shop storefront, we see that “SHOP” is correctly OCR’d, but:

  1. The “U” in “CAPUTO” is incorrectly recognized as “TI”.
  2. The apostrophe and “S” are missing from “CAPUTO’S”.
  3. And finally, “BAKE” is incorrectly recognized as a vertical bar/pipe (“|”) with a period (“.”).

By adding a bit of padding we can expand the bounding box coordinates of the ROI and correctly recognize the text:
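The command is the same as before with the --padding flag added; the exact image filename here is an assumption:

    $ python text_recognition.py --east frozen_east_text_detection.pb \
        --image images/example_04.jpg --padding 0.05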

Figure 8: By adding additional padding around the text regions identified by EAST text detector, we are able to properly OCR the three words in this bake shop sign with OpenCV and Tesseract. See the previous figure for the first, failed attempt.

Just by adding 5% of padding surrounding each corner of the bounding box we’re not only able to correctly OCR the “BAKE” text but we’re also able to recognize the “U” and “’S” in “CAPUTO’S”.

Of course, there are examples where OpenCV flat out fails:

Figure 9: With a padding of 25%, we are able to recognize “Designer” in this sign, but our OpenCV OCR system fails for the smaller words due to the color being similar to the background. We aren’t even able to detect the word “SUIT” and while “FACTORY” is detected, we are unable to recognize the text with Tesseract. Our OCR system is far from perfect.

I increased the padding to 25% to accommodate the angle/perspective of the words in this sign. This allowed for “Designer” to be properly OCR’d with EAST and Tesseract v4. But the smaller words are a lost cause likely due to the similar color of the letters to the background.

In these situations there’s not much we can do, but I would suggest referring to the limitations and drawbacks section below for suggestions on how to improve your OpenCV text recognition pipeline when confronted with incorrect OCR results.

Limitations and Drawbacks

It’s important to understand that no OCR system is perfect!

There is no such thing as a perfect OCR engine, especially in real-world conditions.

And furthermore, expecting 100% accurate Optical Character Recognition is simply unrealistic.

As we found out, our OpenCV OCR system worked well in some images and failed in others.

There are two primary reasons we will see our text recognition pipeline fail:

  1. The text is skewed/rotated.
  2. The font of the text itself is not similar to what the Tesseract model was trained on.

Even though Tesseract v4 is significantly more powerful and accurate than Tesseract v3, the deep learning model is still limited by the data it was trained on — if your text contains embellished fonts or fonts that Tesseract was not trained on, it’s unlikely that Tesseract will be able to OCR the text.

Secondly, keep in mind that Tesseract still assumes that your input image/ROI is relatively clean.

Since we are performing text detection in natural scene images, this assumption does not always hold.

In general, you will find that our OpenCV OCR pipeline works best on text that is (1) captured at a 90-degree viewing angle (i.e., a top-down, bird’s-eye view) and (2) relatively easy to segment from the background.

If this is not the case, you may be able to apply a perspective transform to correct the view, but keep in mind that the Python + EAST text detector reviewed today does not provide rotated bounding boxes (as discussed in my previous post), so you will still likely be a bit limited.

Tesseract will always work best with clean, preprocessed images, so keep that in mind whenever you are building an OpenCV OCR pipeline.

If you have a need for higher accuracy and your system will have an internet connection, I suggest you try one of the “big 3” computer vision API services:

  • Google Cloud Vision API
  • Microsoft Azure Computer Vision API
  • Amazon Rekognition

…each of which uses even more advanced OCR approaches running on powerful machines in the cloud.

Summary

In today’s tutorial you learned how to apply OpenCV OCR to perform both:

  1. Text detection
  2. Text recognition

To accomplish this task we:

  1. Utilized OpenCV’s EAST text detector, enabling us to apply deep learning to localize regions of text in an image
  2. From there, we extracted each of the text ROIs and then applied text recognition using OpenCV and Tesseract v4.

We also looked at Python code to perform both text detection and text recognition in a single script.

Our OpenCV OCR pipeline worked well in some cases but also failed in others. For the best OpenCV text recognition results I would suggest you ensure:

  1. Your input ROIs are cleaned and preprocessed as much as possible. In an ideal world your text would be perfectly segmented from the rest of the image, but in reality, that won’t always be possible.
  2. Your text has been captured at a 90-degree angle from the camera, similar to a top-down, bird’s-eye view. If this is not the case, a perspective transform may help you obtain better results.

I hope you enjoyed today’s blog post on OpenCV OCR and text recognition!

98 Responses to OpenCV OCR and text recognition with Tesseract

  1. YoungCrCy September 17, 2018 at 11:07 am #

    Hello, Adrian, thanks for your amazing work. Can this work in real time?

    • Adrian Rosebrock September 17, 2018 at 2:05 pm #

      Technically you could use it in a live stream application but I wouldn’t recommend applying it to every frame of the video stream. Instead, find frames that are stable, where you would believe the OCR to be most accurate. Secondly, running OCR on every single frame is also computationally wasteful.

      • Haqkiem October 6, 2018 at 2:11 am #

        Ouhh really? But can you explain why it is “computationally wasteful”? The concept is just the same as your previous face recognition, right? But OCR is much simpler since we don’t need to train datasets. Correct me if I’m wrong.

        • Adrian Rosebrock October 8, 2018 at 9:48 am #

          No, you still need to run the forward pass of the network which is still a computationally expensive operation. It is certainly faster than trying to train the network from scratch but it will still be slow. I would suggest you give it a try yourself 🙂

      • Shreyans Sharma November 12, 2018 at 4:59 am #

        Hi Adrian, I would really appreciate it if you could suggest some way to distinguish handwritten text from printed text in a scanned document.
        I have tried using MXNet paragraph and line segmentation but that does not distinguish both the classes.
        Your help would be really appreciated.
        Thanks

        • Adrian Rosebrock November 13, 2018 at 4:44 pm #

          A few ideas come to mind:

          1. Local Binary Patterns on each individual character
          2. Train a simple, shallow CNN on lines of handwritten text vs. scanned typed text

          • Lucas Guimarães December 11, 2018 at 12:53 pm #

            Hi Adrian, this is a great post! Thanks for sharing!

            I have the same trouble. I am working on a project where I am OCRizing documents that are scanned, but they have handwritten dates which are very important to me. What I did first was define the text region, then apply line segmentation and send each line to the Tesseract network to extract the text. The problem is that these dates are in the middle of a specific line that has other important information, and the neural net is getting really confused when trying to predict the dates and sometimes the rest of the text.

            I think your suggestion of training a simple CNN would work, but I’m still kind of a newbie. How could I do that? Would it be retraining the Tesseract NN? Do I have to find these lines in each document I run, or would the neural net recognize them by itself?

            I also would like to know if my approach is good:
            1-Define text region and crop the image;
            2-Apply line segmentation
            3-Send each line to Tesseract

            Thank you again!

            Lucas from Brazil 😊

          • Adrian Rosebrock December 11, 2018 at 1:11 pm #

            Training your own NN for OCR can be a huge pain. Most of the time I recommend against it. Have you tried Google’s Vision API yet? It works really well as an off-the-shelf OCR system.

  2. david zhang September 17, 2018 at 11:14 am #

    Your blog is great!

    • Adrian Rosebrock September 17, 2018 at 2:04 pm #

      Thanks so much, David!

  3. Jorge Paredes September 17, 2018 at 11:27 am #

    Great post following OpenCV EAST Text Detector…..

    Also, you read our minds:

    “Inevitably, I’ll be asked how to install Tesseract 4 on the Raspberry Pi…”

    😉

    Thanks!!

    • Adrian Rosebrock September 17, 2018 at 2:03 pm #

      Thanks Jorge 🙂

  4. Abdulmalik Mustapha September 17, 2018 at 11:29 am #

    Nice post. I really could use this for my project; thanks for posting this article. But could you please do a tutorial post on how to do handwritten recognition with OpenCV and deep learning using the MNIST dataset? That could help a lot!

  5. ygreq September 17, 2018 at 11:41 am #

    Man oh man! I gotta start learning this. You have so many gems here.

    May I ask if you also did a tutorial on correcting the perspective, skew, and so on of a document? In the end the script would take many pics made with a phone, for example, and correct them accordingly.

    Something similar to how the mobile app Office Lens works.

    Something about what I am thinking is here: https://blogs.dropbox.com/tech/2016/08/fast-and-accurate-document-detection-for-scanning/

    Thank you for all your effort!
    ygreq

    • ygreq September 17, 2018 at 11:42 am #

      This is a presentation of the mobile app I was referring to: https://www.youtube.com/watch?v=qbobZ43II38

      • Adrian Rosebrock September 17, 2018 at 2:02 pm #

        The primary perspective transform tutorial I refer readers to is this one. I’m not sure if that will help you, but wanted to link you to it just in case.

        • ygreq September 17, 2018 at 3:46 pm #

          My, my! This could be it. Let’s see if my zero knowledge takes me anywhere. ;))

          Thank you so much!

  6. Anthony The Koala September 17, 2018 at 12:34 pm #

    Dear Dr Adrian,
    The above examples work for fonts with serifs, e.g., Times Roman, and without serifs, e.g., Arial.

    Can OCR software be applied to detecting characters of more elaborate fonts, such as the Old English font used in the masthead of the Washington Post (https://www.washingtonpost.com/)? There are other examples of Old English fonts at https://www.creativebloq.com/features/old-english-fonts-10-of-the-best .

    To put it another way, do you need to train or have a dataset for fancy fonts such as Old English in order to have recognition of fonts of that type?

    Thank you,
    Anthony of Sydney

    • Adrian Rosebrock September 17, 2018 at 2:01 pm #

      For the best accuracy, yes, you would want to train on a dataset that is representative of what you expect your OCR system to recognize. It’s unrealistic to expect any OCR system to perform well on data it wasn’t trained on.

  7. Walid September 17, 2018 at 2:05 pm #

    Hi Adrian
    Thanks a lot, I am having this error
    AttributeError: module ‘cv2.dnn’ has no attribute ‘readNet’

    Python 3.5.5 + OpenCV 3.3.0 + Ubuntu 16
    I tried net = cv2.dnn.readNetFromTorch(args[“east”])
    but still could not run the code
    Can you please help ?

    Walid

    • Adrian Rosebrock September 17, 2018 at 2:18 pm #

      Hey Walid — you need at least OpenCV 3.4.2 for this blog post. OpenCV 4-pre will also work.

      • Walid September 17, 2018 at 3:00 pm #

        Thanks, now it works 🙂

        • Adrian Rosebrock September 17, 2018 at 3:04 pm #

          Awesome, I’m glad to hear it, Walid! 🙂

          • Dany September 18, 2018 at 3:51 pm #

            Hi Adrian, I have the same error because I am running OpenCV 3.4.1. I followed your install guide for Ubuntu 18.04 step by step. Is it possible to upgrade or do I need to recompile?

          • Adrian Rosebrock September 18, 2018 at 4:04 pm #

            You will need to re-compile and re-install although stay tuned for tomorrow’s blog post where I’ll be discussing a super easy way to install OpenCV 😉

          • Dany September 18, 2018 at 4:15 pm #

            Using virtualenv, is it possible to create a new environment and recompile OpenCV 3.4.3 inside it?
            Thanks for your work.

          • Adrian Rosebrock October 8, 2018 at 1:37 pm #

            Yes. Create a new Python virtual environment and then follow one of my OpenCV install guides.

  8. Fred September 17, 2018 at 3:02 pm #

    Hey Adrian,
    Great post!! Have you ever attempted to train Tesseract v4 with a custom font? I’ve had poor results with my dataset..
    Cheers
    Fred

    • Adrian Rosebrock September 17, 2018 at 3:04 pm #

      Hey Fred — sorry, I have not trained Tesseract v4 with a custom font.

  9. Walid September 17, 2018 at 3:12 pm #

    Hi Adrian
    I am having different (less accurate results)

    /example_02.jpg --padding 0.05
    [INFO] loading EAST text detector…
    OCR TEXT
    ========
    l NuDDLEBOROUGha

    Any clue?
    Thanks a lot

    • Adrian Rosebrock September 17, 2018 at 3:20 pm #

      It could be a slightly different Tesseract version. OpenCV itself wouldn’t be the root cause. Unfortunately as I said in the “Limitations and Drawbacks” section, OCR systems can be a bit temperamental!

  10. mohamed September 17, 2018 at 3:52 pm #

    I expected this to be your next step
    I really did not know that the development of the project “Tesseract” has become so advanced.
    Thank you Adrian!
    {Really a wonderful glimpse}

    • Adrian Rosebrock September 17, 2018 at 4:06 pm #

      Thanks Mohamed 🙂

  11. DanB September 17, 2018 at 6:45 pm #

    Awesome write up!

    I ran into an issue where Tesseract 4.0.0 does not support digits-only whitelisting. Is there a separate trained network for numerical digits only?

    • Adrian Rosebrock September 17, 2018 at 7:24 pm #

      Hey Dan — where did you run into the “no digits only” issue?

      • DanB September 18, 2018 at 10:18 am #

        There seemed to be a feature of prior versions of tesseract that allowed you to whitelist specific characters.

        I was testing the pretrained OCR network on number signs, but the code was unable to recognize anything 🙁 I’m guessing I will need to train my own network?

        • Adrian Rosebrock September 18, 2018 at 4:06 pm #

          Thanks for the clarification. I recall a similar functionality as well, but unfortunately I cannot recall the exact command to whitelist only specific characters.

        • DanB September 21, 2018 at 11:21 am #

          A follow-up to this with a GitHub issue ticket on the tesseract repo explaining more: https://github.com/tesseract-ocr/tesseract/issues/751

          • Adrian Rosebrock October 8, 2018 at 1:07 pm #

            Thank you for the followup Dan!

  12. papy September 17, 2018 at 6:52 pm #

    Good work Adrian. I’m currently working on the recognition of license plates using Python + Tesseract OCR, but I’m having issues training the .traineddata file to correctly recognize my country’s license plates. Any advice, links, or videos to help me train this dataset would be of great help.

    Thanks

    • Adrian Rosebrock September 17, 2018 at 7:22 pm #

      I wouldn’t recommend using Tesseract for Automatic License Plate Recognition. It would be better to build your own custom pipeline. In fact, I demonstrate how to build such an ANPR system inside the PyImageSearch Gurus course.

  13. Jari September 17, 2018 at 7:34 pm #

    Hi Adrian,

    Thank you for this. I’ve messed with Tesseract in the past but have struggled to get good results out of it (and I _think_ I was using the LSTM version, but I’m unsure) on data for work. Our data is under varying lighting conditions and can have significant blur. We use GCP’s OCR solution at the moment, which works really well on this data but of course can get costly.

    One thing I’ve repeatedly tried to do and failed is figure out how to train tesseract on my own data (both real and synthetic). So much so that I gave up and (for the one part of our pipeline that Google doesn’t work well on) built my own deep learning based OCR system which works quite well (but incurs significant RnD overhead). If you know how to train tesseract and would be willing to write that down, I would deeply appreciate that.

    • Adrian Rosebrock September 18, 2018 at 5:56 am #

      Tesseract does assume reasonable lighting conditions, and if your images are blurry it can get much worse for sure. I’m glad to hear GCP’s solution is working for you though! I personally have never trained a Tesseract model from scratch so I unfortunately do not have any guidance there.

  14. Andrews September 17, 2018 at 7:47 pm #

    Hi Adrian, thanks for your tutorials, they are helping me a lot. I’m working on a project and I don’t know where to start; if you have any tips, I would appreciate them a lot. Here is the Stack Overflow link:

    https://stackoverflow.com/questions/52377025/how-can-i-use-opencv-to-process-a-market-leaflet-to-extract-product-and-promotio

    • Adrian Rosebrock September 18, 2018 at 5:55 am #

      Your project is very challenging to say the least. It sounds like you may be new to the world of computer vision and OpenCV. I would suggest first working through Practical Python and OpenCV to help you learn the fundamentals. Walk before you run, otherwise you’ll trip yourself up. You’ll also want to further study object detection. This guide will help you get up to speed.

  15. Trami September 17, 2018 at 9:56 pm #

    Hi Adrian. I just wonder how I can use your method to recognize the digits in a meter with acceptable accuracy.

    • Adrian Rosebrock September 18, 2018 at 5:58 am #

      Recognizing water meters is an entirely different beast since the numbers may be partially obscured, there may be dirt/dust on the meter itself, and there can be any number of possible lighting problems. You could try using Tesseract here but I wouldn’t expect too high of accuracy. I’ll try to do a water meter recognition post in the future or include it in a new book.

      • Trami September 18, 2018 at 9:31 pm #

        Thanks so much. Could you give me some advice about the problems of recognizing the meter?

  16. Sanda September 17, 2018 at 10:27 pm #

    Thank you so much

    Really appreciated

    • Adrian Rosebrock September 18, 2018 at 5:53 am #

      Thanks Sanda, I’m glad you enjoyed the post!

  17. Dilshat September 17, 2018 at 10:33 pm #

    I get an error when running “text_recognition.py”, as follows:
    Traceback (most recent call last):
    pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it’s not in your path

    How can I fix this?
    Thanks.

    EDIT:

    I fixed the above problem by changing ‘pytesseract.py’ as follows:

    tesseract_cmd = ‘tesseract’
    to
    tesseract_cmd = ‘C:\\Program Files (x86)\\Tesseract-OCR\\tesseract’

    Thanks for the great code!

  18. Chen September 18, 2018 at 1:26 am #

    Hi Adrian,
    I have downloaded the source code on my Windows computer and installed the relevant libraries.
    I tried to execute your source code:

    python text_recognition.py --east frozen_east_text_detection.pb \
    --image images/example_01.jpg
    [INFO] loading EAST text detector…
    OCR TEXT

    • Chen September 18, 2018 at 1:29 am #

      but it shows the error: unrecognized arguments: \

      • Adrian Rosebrock September 18, 2018 at 5:58 am #

        I assume you are referring to command line arguments? If so, refer to this tutorial to help you get up to speed with command line arguments.

  19. Aveshin Naidoo September 18, 2018 at 2:48 pm #

    Good day. Great blog post as per usual. Question: would it be possible to run two virtual environments on a Raspberry Pi 3 with a 16 GB card and Raspbian OS? The current virtual environment has a previous version of OpenCV and Python + Tesseract, as followed from one of your previous tutorials. I’m worried about space limitations and don’t want the long OpenCV installation to fail midway. Thanks.

    • Aveshin Naidoo September 18, 2018 at 2:50 pm #

      I forgot to add what I want the second virtual environment for. The new one will hold the EAST text detector and a new version of OpenCV, plus Python and Tesseract 4.

      • Adrian Rosebrock September 18, 2018 at 4:05 pm #

        Keep in mind that Tesseract is a binary, it’s not a Python package — I think you’re confusing the tesseract command with the pytesseract Python package. You can create two Python virtual environments if you want, but you’ll only have one version of the actual Tesseract binary itself, which shouldn’t be an issue since Tesseract v4 also includes the v3 engine.

  20. Alex September 18, 2018 at 3:54 pm #

    Hello Adrian, another very good tutorial thanks! Would you recommend it for a license plate reader or in this case is it better to stick with normal segmentation and a KNN?

    • Adrian Rosebrock September 18, 2018 at 4:03 pm #

      Hey Alex, I wouldn’t recommend using Tesseract for Automatic License Plate Recognition. It would be better to build your own custom pipeline. In fact, I demonstrate how to build such an ANPR system inside the PyImageSearch Gurus course.

  21. Niklas Wilke September 19, 2018 at 5:58 pm #

    Hi Adrian, even though this is not related to this post, I had thought about NN/AI security.
    I’m not currently working on CV myself, so I’m unsure if I’m up to date, but you would probably know.

    There were methods (like pixel attacks) that allowed someone who was familiar with the architecture of a CNN to create images or modify images to get a desired output.
    => change x, let the model classify an airplane as a fish.

    The big “let down” here is that I could only do that with my own NN, so it’s pretty pointless and the security risk pretty low. But now that I think about how CV is implemented by semi-experts and without clear rules and standards, I would imagine a lot of the CV software solutions out there, and those that are about to be built, will make use of the state-of-the-art nets of the big researchers and will base their nets on them. They probably tweak and modify them, but the core structure might remain the same.

    Now my question:
    Would those slightly modified implementations still be a valid target for pixel manipulation attacks or other attack forms, given I base them on the 5-6 biggest nets out there? Or will the net be safe from those attacks as soon as any modification (for example, adding a label class to the main pool) has been made?

    I’m not concerned about the “sure, but you can easily avoid this by…” solution; I’m concerned about semi-experts who implement stuff in small businesses or in areas where nobody can really judge their work as long as it seems to be working in the desired business case.

    Thanks for reading through this,
    best regards
    Niklas

  22. Daniel September 20, 2018 at 5:23 am #

    Thank you so much for this post!

    • Adrian Rosebrock October 8, 2018 at 1:16 pm #

      Thanks Daniel, I’m glad you enjoyed it!

  23. loch September 22, 2018 at 9:35 pm #

    HI adrian

    Your code works perfectly. Earlier I had OpenCV 3.2.0, where the camera release function worked perfectly,
    but after upgrading to OpenCV 3.4.2 to run the program, the camera release (capture.release()) function is not working. Can you give me a solution to release the camera? Thank you.

    • Adrian Rosebrock October 8, 2018 at 1:00 pm #

      I’m not sure why your camera may have stopped working in between OpenCV 3.2 and OpenCV 3.4.2. That is likely a great question for the OpenCV GitHub Issues page.

  24. Tran September 22, 2018 at 11:53 pm #

    Hi, just an idea. We can next use a translator to translate the text and print it to the image in place of the OCR text.

    • Adrian Rosebrock October 8, 2018 at 1:00 pm #

      You’re absolutely right Tran 🙂

  25. seventheefs September 24, 2018 at 11:30 am #

    Hi Adrian, nice work!!!
    Could you please indicate the steps I should take to make it work on Arabic text?

    • Adrian Rosebrock October 8, 2018 at 12:50 pm #

      You would want to take a look at Tesseract’s language packs.

  26. vinay September 24, 2018 at 11:32 am #

    How do I install the Tesseract + Python bindings? I am getting “workon: command not found”. Please help me out.

    • Adrian Rosebrock October 8, 2018 at 12:50 pm #

      Hey Vinay, do you have virtualenv and virtualenvwrapper installed on your system? Did you install OpenCV using Python virtual environments? If not, you can skip the “workon” command.

  27. liu September 28, 2018 at 12:14 am #

    Hi, I’ve got a problem. The code can detect some text like “AB” or “CD”, but it can’t recognize a single character like ‘A’ or ‘B’. Does anyone know how to recognize a single character, or can anyone provide another detection model (.pb) like EAST? Great thanks.

  28. keertika September 28, 2018 at 2:28 am #

    Hey Adrian, I am running this code in a Jupyter notebook (Python 3.6 + conda 4.5.11 + OpenCV 3.4). I get an “unrecognized arguments” error.

    • keertika September 28, 2018 at 2:32 am #

      I got it fixed !!

      • Adrian Rosebrock October 8, 2018 at 12:24 pm #

        Congrats on resolving the issue!

  29. K September 28, 2018 at 3:04 am #

    How do I run this program in the Anaconda prompt?

  30. K September 28, 2018 at 3:19 am #

    Hey Adrian,

    I get the following error

    AttributeError: module ‘cv2.dnn’ has no attribute ‘readNet’

    • Adrian Rosebrock October 8, 2018 at 12:24 pm #

      Make sure you’re using OpenCV 3.4.2 or greater.

  31. Oyekanmi Oyetunji September 30, 2018 at 9:58 am #

    Hi Adrian
    Thanks for the tutorial..
    I really like what you’re doing up here…

    I need your help

    I have Raspbian with OpenCV pre-compiled… which I got when I bought a bundle from you…

    Can I install Tesseract straight up on it… or do I have to uninstall OpenCV?

    I’d appreciate a quick response please…

    Thanks..

    • Adrian Rosebrock October 8, 2018 at 10:54 am #

      No need to uninstall OpenCV! You can simply install Tesseract as I recommend in this guide.

  32. Vittorio October 10, 2018 at 12:25 pm #

    Hi Adrian!

    Thank for the very useful tutorial (as always:))

    In my project, I would need to recognize single RANDOMIC characters from a car chassis.

    Do you think I should try a different solution or it should be good the one explained by this post?

    Thx

    • Adrian Rosebrock October 12, 2018 at 9:13 am #

      Hey Vittorio, do you have any examples of RANDOMIC characters? I’m not sure what they look like off the top of my head.

  33. Royce Ang October 11, 2018 at 12:00 am #

    Hi, I am a beginner in this field and I would like to know how to detect the letters and numbers of a license plate with this. Is it possible?

    sorry if i asked wrong question.

    • Adrian Rosebrock October 12, 2018 at 9:08 am #

      Hey Royce, I would actually recommend working through the PyImageSearch Gurus course where I cover automatic license plate recognition in detail (including code).

  34. Steven October 15, 2018 at 2:44 pm #

    Hi Adrian,

    Great post. I do have to ask: How did you decide on the “Saxon’s Estate Agents” image? Of the many billions of images to choose from online, this is a rather peculiar one. This image was shot in the same town where I am doing my PhD. 🙂

    • Adrian Rosebrock October 16, 2018 at 8:25 am #

      Hah! That’s so cool! I found the image when I searched for storefronts — that was one of the images that popped up!

  35. ranjeet singh October 21, 2018 at 11:25 am #

    It’s not working on this image where I want to detect an IMEI number:
    Pic: https://starofmysore.com/wp-content/uploads/2017/07/news-9-imei.jpg
    Even when I align the image correctly, it detects the word “IMEI” but does not capture the IMEI number.
    What should I do?

    • Adrian Rosebrock October 22, 2018 at 7:59 am #

      Hey Ranjeet, make sure you read the “Limitations and Drawbacks” section of this tutorial. OCR systems will fail in certain situations. You may want to try creating your own custom digit detector for the actual number.

  36. jim421616 October 25, 2018 at 7:42 pm #

    Hi, Adrian. I got the installation working on my RPi the first time (!) but when I issue tesseract --help-oem, --help-psm, or -l, I get the following error:

    tesseract: error while loading shared libraries: libtesseract.so.4: cannot open shared object file: No such file or directory.

    I’m in the virtual env cv_tesseract when I issue the command, but I get the same error message when I’m not in it too.

    Any suggestions?

    • Adrian Rosebrock October 29, 2018 at 1:48 pm #

      Hey Jim — have you tried posting on the official Tesseract GitHub Issues page? They would be able to provide more targeted advice to your specific system.

    • juancruzgassoloncan@gmail.com October 30, 2018 at 6:35 pm #

      Hi Jim

      try

      $ sudo ldconfig

      and then test with
      $ tesseract --version

      That work for me on my Raspbian

  37. Gary Chris November 14, 2018 at 1:46 am #

    Hello Adrian, I’m having this issue when I’m running the code:


    AttributeError: module ‘cv2.dnn’ has no attribute ‘readNet’

    How to resolve this? Hope you can help me 🙁

    • Adrian Rosebrock November 15, 2018 at 12:10 pm #

      Make sure you are using OpenCV 3.4.2 or greater.

  38. Sangam November 15, 2018 at 4:37 am #

    Hello Adrian – I have run into an issue that I am not able to get past. I am getting “AttributeError: ‘module’ object has no attribute ‘readNet’” with the line “net = cv2.dnn.readNet(args[“east”])”. This is line 109 in the code that I have downloaded. My OpenCV version is 4.0.0-alpha.

    Will you be able to help me out with it?

    Thanks

    • Adrian Rosebrock November 15, 2018 at 11:52 am #

      I would suggest trying with OpenCV 3.4.2 and see if that resolves the issue.

  39. Vagner December 9, 2018 at 8:58 pm #

    Congratulations on the article.

    Is there anything about comparing signatures, to find possible scams, using opencv and algorithms like gsurf, harrison or something?

    • Adrian Rosebrock December 11, 2018 at 12:48 pm #

      Sorry, I do not have much experience with signature verification or recognition so I unfortunately cannot recommend any resources.

  40. Dorra December 13, 2018 at 8:36 am #

    Hi Doctor Adrian
    Both the “OpenCV Text Detection” and “OpenCV OCR and text recognition with Tesseract” scripts make use of the serialized EAST model (frozen_east_text_detection.pb). Can you send me the source code used to produce it? I want to understand how it works.
    Thanks for your help
