
I'm currently trying to detect numbers from small screenshots, but I've found the accuracy to be quite poor. I've been using OpenCV: the image is captured in RGB, converted to greyscale, and then thresholded with a global value (I found adaptive thresholding didn't work so well).

Here is an example greyscale of one of the numbers, followed by an example of the image post-thresholding (the numbers can range from 1-99). Note that the initial screenshot of the image is quite small and is thus enlarged.

[greyscale example image]

[thresholded example image]

Any suggestions on how to improve accuracy using OpenCV or a different system altogether are much appreciated. Some code included below, the function is passed a screenshot in RGB of the number.

import cv2
import PIL.Image
import pytesseract

def getNumber(image):
    image = cv2.resize(image, (0, 0), fx=3, fy=3)
    img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    thresh, image_bin = cv2.threshold(img, 125, 255, cv2.THRESH_BINARY)

    image_final = PIL.Image.fromarray(image_bin)

    txt = pytesseract.image_to_string(
        image_final, config='--psm 13 --oem 3 -c tessedit_char_whitelist=0123456789')
    return txt
  • What have you tried? What does not work? Show your code. Please read this forum's help section on how to ask a good question. Commented Sep 20, 2019 at 16:12
  • Apologies @fmw42. Included the current function at the bottom. Commented Sep 20, 2019 at 16:17
  • You might try adaptive thresholding or you might try using some morphology to try to close up the white letters. Commented Sep 20, 2019 at 19:51

1 Answer


Here's what I could improve: using Otsu thresholding is more efficient at separating text from background than giving an arbitrary value. Tesseract works better with black text on a white background, and I also added padding, as Tesseract struggles to recognize characters that are too close to the border.

This is the final image: [final_image][1], and pytesseract manages to read "46".

import cv2, numpy, pytesseract

def getNumber(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Otsu thresholding automatically finds the best threshold value
    _, binary_image = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU)

    # invert the image if the text is white and the background is black
    count_white = numpy.sum(binary_image > 0)
    count_black = numpy.sum(binary_image == 0)
    if count_black > count_white:
        binary_image = 255 - binary_image

    # padding (note: pad the thresholded image, not the original BGR one)
    final_image = cv2.copyMakeBorder(binary_image, 10, 10, 10, 10,
                                     cv2.BORDER_CONSTANT, value=255)
    txt = pytesseract.image_to_string(
        final_image, config='--psm 13 --oem 3 -c tessedit_char_whitelist=0123456789')

    return txt

The function is executed as:

>>> getNumber(cv2.imread(img_path))

EDIT: note that you do not need this line:

image_final = PIL.Image.fromarray(image_bin)

as you can pass pytesseract an image in numpy array format (which cv2 uses), and Tesseract accuracy only drops for characters under 35 pixels in height (and also for much bigger ones; 35 px is actually the optimal height), so I did not resize it.

  [1]: https://i.sstatic.net/OaJgQ.png
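If your characters end up far from that 35 px sweet spot, a small helper can rescale the binarized image before OCR. This is only a sketch of the idea above; the `scale_to_height` name is mine, not from the answer, and the optimal height can vary by font:

    import cv2

    def scale_to_height(img, target_height=35):
        """Rescale an image so characters land near Tesseract's ~35 px sweet spot."""
        h, w = img.shape[:2]
        scale = target_height / h
        return cv2.resize(img, (max(1, round(w * scale)), target_height),
                          interpolation=cv2.INTER_CUBIC)

    # pytesseract can then consume the numpy array directly, no PIL conversion:
    # txt = pytesseract.image_to_string(scale_to_height(binary_image),
    #     config='--psm 13 -c tessedit_char_whitelist=0123456789')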


3 Comments

Thank you! It successfully read 46 with your code but failed to read 47 on testing. From the produced final image for 47, I am surprised it failed to read it. Are there any other further steps I can take to push accuracy up?
You could try resampling to a larger size. That might give you some wiggle room to do some smoothing or morphology operations. As a last resort, if your images have the same font at the same size, you could try template matching individual numerals.
Well, your particular font/size doesn't make it easy. As the other comment said, you can use OpenCV's eroding/dilating morphology operations, and also apply a median blur on the greyscale image before thresholding it.
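A minimal sketch of the blur-then-morphology suggestion from the comments, assuming a greyscale input as in the answer; the `binarize_and_clean` name and the 3x3 kernel size are my own guesses and would need tuning for the actual font:

    import cv2
    import numpy as np

    def binarize_and_clean(gray, kernel_size=3):
        """Median-blur the greyscale image, Otsu-threshold it, then close small gaps."""
        blurred = cv2.medianBlur(gray, 3)  # smooth salt-and-pepper noise pre-threshold
        _, binary = cv2.threshold(blurred, 0, 255,
                                  cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        kernel = np.ones((kernel_size, kernel_size), np.uint8)
        # closing (dilation then erosion) fills small holes inside the white strokes
        return cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

The result can then be inverted/padded and fed to pytesseract exactly as in the answer above.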
