Software for Optical Character Recognition (OCR)

I implemented a commercial program that extracts text from catalogs. I used OpenCV to preprocess the images and extract regions of interest. First, I applied adaptive thresholding to separate the foreground objects from the background. Next, I applied a series of morphological transformations, such as opening, to detect horizontal and vertical lines. Finally, I used Tesseract for OCR.

Application GUI

Image preprocessing to extract the region of interest.

Generating output using Tesseract.