Best Python/Ruby lib for reading text inside images [closed]

Question

Closed. This question is off-topic. It is not currently accepting answers.

Closed 12 years ago.

Does anyone know a Python/Ruby library that analyzes images and extracts the text inside them?

Or a book about image processing, etc.?

PS: The text is in various fonts and formats but clearly legible. TL;DR: no CAPTCHAs or anything similar.

What does the last line you have written convey? Or was it written by mistake?
– Rndm, Commented Jul 15, 2012 at 7:16
possible duplicate of OCR for recognising handwriting in .NET
– Adam Mihalcin, Commented Jul 15, 2012 at 7:17
@Angelbit I pointed out one particular duplicate, but this question is really a duplicate of almost any OCR question on StackOverflow.
– Adam Mihalcin, Commented Jul 15, 2012 at 7:18
Sorry, my English is very poor. The text inside the images is written in various sizes and formats (bold, italic, etc.).
– byterussian, Commented Jul 15, 2012 at 7:20
@AdamMihalcin I have edited the question. I couldn't find any Ruby/Python-specific question.
– byterussian, Commented Jul 15, 2012 at 7:26

Community · Accepted Answer · 2017-05-23 12:31:33Z

You can use OpenCV, an open-source computer vision library that has a Python API. It is widely considered an industry-standard library.

OpenCV official site : http://opencv.org/

If you need some tutorials on OpenCV-Python, visit : opencvpython.blogspot.com

In addition, the OpenCV samples include some OCR implementations.

However, I would recommend using Tesseract for the OCR itself. It is often regarded as one of the best open-source OCR engines; it was originally developed by HP, and its development was later sponsored by Google.

Tesseract site : https://github.com/tesseract-ocr/tesseract

Python API of tesseract, Pytesser : https://github.com/RobinDavid/Pytesser

So you can use OpenCV to preprocess the image and use Tesseract for OCR.
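
As a rough illustration of that two-step approach, here is a minimal sketch, not a definitive implementation. It assumes the `opencv-python` and `pytesseract` packages plus a locally installed Tesseract binary; note that `pytesseract` is a different wrapper from the Pytesser project linked above, swapped in here only for the example. The function names `tidy` and `ocr_image` are made up for illustration, and Otsu thresholding is just one possible preprocessing choice.

```python
def tidy(raw: str) -> str:
    """Strip surrounding whitespace from each line and drop empty lines."""
    lines = (line.strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


def ocr_image(path: str, lang: str = "eng") -> str:
    """Preprocess an image with OpenCV, then run Tesseract on the result."""
    # Imported lazily so the module still loads where these optional
    # dependencies are missing.
    # Assumed packages (not named in the answer): opencv-python, pytesseract.
    import cv2
    import pytesseract

    image = cv2.imread(path)
    if image is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError(f"could not read image: {path}")

    # Preprocessing: convert to grayscale, then binarize with Otsu's method
    # so the text stands out from the background.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

    # OCR: pytesseract accepts the NumPy array produced by OpenCV.
    return tidy(pytesseract.image_to_string(binary, lang=lang))
```

Calling `print(ocr_image("scan.png"))` would then print the recognized text, where `scan.png` is a placeholder file name. For clean, clearly printed text like the question describes, simple thresholding may be enough; noisier images would likely need more preprocessing.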
