Optical Character Recognition (OCR) using (Py)Tesseract: Part 2

In Part 1, we saw how to extract text from a textbook page image, and from a noisy image, using Tesseract.


In this part, we will see how to extract text from photographs.

Let's try a new example and bring together some of the things we have learned.

Here's an image of a storefront. Let's load it and try to get the name of the store out of the image.


from PIL import Image
import pytesseract
# Let's read in the storefront image I've loaded into the course and display it
image = Image.open('../input/OCR1/storefront.png')
display(image)
# Finally, let's run tesseract on that image and see what the results are
pytesseract.image_to_string(image)

(Output: the storefront image is displayed.)

'fa | INTERNATIONAL\n\nEe oat\n\n \n\nae\n\n| bile\n\n-_\nS =\nE “ee —\n.\n\n| pe 1 800 GO DRAKE PTV Cheol i i\n\noes\n\n \n\nK iM he ie'


At the very bottom we see a string that we cannot easily interpret. Tesseract is unable to pull the store name out of the full photograph, but if we crop the image down to the region containing the text, it should be able to identify it. So let's help Tesseract by cropping out certain pieces.


First, let's set the bounding box. In this image, the store name is in a box.

bounding_box = (470, 150, 1020, 320)

# Now let's crop the image
title_image = image.crop(bounding_box)

# Now let's display it and pull out the text
display(title_image)
pytesseract.image_to_string(title_image)


(Output: the cropped title image is displayed.)

'DRAKE\n\nINTERNATIONAL'


Great: with a bit of problem reduction we can make this work. We have now been able to take an image, preprocess the region where we expect to see text, and turn that text into a string that Python can understand.
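To make this reusable, we could wrap the crop-then-OCR step in a small helper. This is just a sketch, not part of the original notebook; the function name `ocr_region` is my own.

```python
from PIL import Image

def ocr_region(image, bounding_box):
    """Crop `image` to `bounding_box` (left, upper, right, lower) and OCR the crop.
    A hypothetical helper, not part of the original notebook."""
    # imported lazily; calling this requires pytesseract and the tesseract binary
    import pytesseract
    region = image.crop(bounding_box)
    return pytesseract.image_to_string(region)

# Usage with the storefront image from above:
# print(ocr_region(image, (470, 150, 1020, 320)))
```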


If you look back up at the image, though, you'll see there is a small sign outside of the shop that has the shop's website on it. I wonder if we're able to recognize the text on that sign? Let's give it a try.


First, we need a bounding box for that sign. Let's just use the one I decided on.

# Now, let's crop the image
little_sign = image.crop((1000, 548, 1215, 690))
display(little_sign)

(Output: the cropped website sign is displayed.)

All right, that is a little sign! OCR works better on higher-resolution images, so let's increase the size of this image using Pillow's resize() function.

Let's set the width and height equal to five times their current values in a (w, h) tuple.

new_size = (little_sign.width * 5, little_sign.height * 5)
display(little_sign.resize(new_size, Image.NEAREST))

(Output: the resized website sign is displayed.)

pytesseract.image_to_string(little_sign.resize(new_size, Image.NEAREST))

'DRAKEINTL.COM'

With the increased size, we are able to extract the text.
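The resize-then-OCR trick can be captured in a small helper as well. This is a sketch under my own naming; the default factor of 5 simply mirrors what we used above.

```python
from PIL import Image

def upscale(img, factor=5, resample=Image.NEAREST):
    """Return `img` enlarged by `factor` in each dimension.
    A hypothetical helper; factor=5 mirrors the example above."""
    new_size = (img.width * factor, img.height * factor)
    return img.resize(new_size, resample)

# Usage (feeding the enlarged sign straight into Tesseract):
# pytesseract.image_to_string(upscale(little_sign))
```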


If you look back up at the image once more, you'll see there is another small sign outside of the shop, this one with the slogan on it. I wonder if we're able to recognize the text on that sign too? Let's give it a try.


We will crop that image again.

little_sign = image.crop((570, 490, 690, 720))
display(little_sign)

(Output: the cropped slogan sign is displayed.)

We will resize this image for a better view.

new_size = (little_sign.width * 5, little_sign.height * 5)
display(little_sign.resize(new_size, Image.NEAREST))

(Output: the resized slogan sign is displayed.)

pytesseract.image_to_string(little_sign.resize(new_size, Image.NEAREST))

'eel\ner D\n\nLe ee de\nWITH ONE OF\n\nOUR CONSULTANTS\nTODAY\n\nRring'


I think we should be able to do better. I can read it, but it looks really pixelated. Let's see what the different resize options look like.


options = [Image.NEAREST, Image.BOX, Image.BILINEAR, Image.HAMMING, Image.BICUBIC, Image.LANCZOS]
for option in options:
    # let's print the option (these constants print as their integer values, e.g. NEAREST is 0)
    print(option)
    # let's display what this option looks like on our little sign
    display(little_sign.resize(new_size, option))

0
(the sign resized with Image.NEAREST)

4
(the sign resized with Image.BOX)

2
(the sign resized with Image.BILINEAR)

5
(the sign resized with Image.HAMMING)

3
(the sign resized with Image.BICUBIC)

1
(the sign resized with Image.LANCZOS)
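Once we can see the filters side by side, a natural next step is to OCR each resized variant and compare the results. The sketch below wraps the loop above in a function (the name `ocr_with_filters` is my own) so we can run Tesseract on every resampled copy.

```python
from PIL import Image

def ocr_with_filters(img, new_size):
    """Resize `img` with each Pillow resampling filter and OCR the result.
    Returns a dict mapping the filter's integer value to the extracted text.
    A hypothetical helper, not part of the original notebook."""
    # imported lazily; calling this requires pytesseract and the tesseract binary
    import pytesseract
    options = [Image.NEAREST, Image.BOX, Image.BILINEAR,
               Image.HAMMING, Image.BICUBIC, Image.LANCZOS]
    return {option: pytesseract.image_to_string(img.resize(new_size, option))
            for option in options}

# Usage with the slogan sign from above:
# for option, text in ocr_with_filters(little_sign, new_size).items():
#     print(option, repr(text))
```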