OpenCV: SET Document Processing

Document image analysis is the process of recognizing the text and graphics components in an image and extracting the intended information as a human would. It falls into two categories: textual processing and graphics processing. Textual processing deals with recognizing text by optical character recognition (OCR), determining the skew (any tilt at which the document may have been scanned into the computer), and finding columns, paragraphs, text lines, and words. Graphics processing, on the other hand, deals with the non-textual line and symbol components that make up line diagrams, delimiting straight lines between text sections, company logos, etc.

Below are the steps for processing a Student Evaluation for Teacher (SET) document using OpenCV and C++. The goal is to identify the shaded, unshaded, and crossed-out circles for certain items in the image.

1. Reading the Image

This is the basic step in all image processing. Without reading an image first, how will you be able to analyze an image?

This is one example of the SET forms. Say this is our original image.

CODE:

2. Converting the Image into Grayscale

The first step in document analysis is to preprocess the image to prepare it for further analysis. Here the preprocessing consists of converting the image to grayscale.

CODE:
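A minimal sketch of this step, not the post's original listing. In OpenCV the conversion is normally a single call to cv::cvtColor. The plain C++ below only illustrates the weighted sum behind it, using the standard luma weights 0.299 R + 0.587 G + 0.114 B. The names Bgr, toGray, and toGrayImage are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// One color pixel in blue-green-red order, the channel order OpenCV uses by default.
// (Bgr is a hypothetical type for this sketch, not an OpenCV type.)
struct Bgr {
    std::uint8_t b, g, r;
};

// Weighted luma sum 0.299 R + 0.587 G + 0.114 B, rounded to the nearest integer.
std::uint8_t toGray(const Bgr& p) {
    const double y = 0.299 * p.r + 0.587 * p.g + 0.114 * p.b;
    assert(y >= 0.0 && y < 255.5);  // the weights sum to 1, so y stays in 8-bit range
    return static_cast<std::uint8_t>(std::lround(y));
}

// Convert a whole image stored as a flat, row-major array of pixels.
std::vector<std::uint8_t> toGrayImage(const std::vector<Bgr>& image) {
    std::vector<std::uint8_t> gray;
    gray.reserve(image.size());
    for (const Bgr& p : image) gray.push_back(toGray(p));
    return gray;
}
```

Green contributes most to the result and blue least, which is why a pure green pixel comes out much brighter than a pure blue one.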

3. Thresholding the Image to get a better result

Thresholding functions are used mainly for two purposes: (1) masking out pixels that do not belong to a certain range, for example to extract blobs of a certain brightness or color from the image; (2) converting a grayscale image to a bi-level (black-and-white) image.

Global thresholding: The most straightforward way to automatically select a global threshold is to use a histogram of the pixel intensities in the image. The intensity histogram plots the number of pixels at each intensity level.

CODE:
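A minimal sketch of this step, not the post's original listing. It counts the pixels at each of the 256 intensity levels of an 8-bit grayscale image, then applies one global threshold. OpenCV provides cv::calcHist and cv::threshold for the same purpose. The function names here are made up, and the image is assumed to be a flat array of 8-bit values.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count how many pixels fall on each of the 256 possible 8-bit intensity levels.
std::array<std::size_t, 256> intensityHistogram(const std::vector<std::uint8_t>& gray) {
    std::array<std::size_t, 256> hist{};  // value-initialized to all zeros
    for (std::uint8_t v : gray) ++hist[v];
    return hist;
}

// Global binary threshold: pixels brighter than t become 255, all others become 0.
std::vector<std::uint8_t> applyThreshold(const std::vector<std::uint8_t>& gray,
                                         std::uint8_t t) {
    std::vector<std::uint8_t> binary;
    binary.reserve(gray.size());
    for (std::uint8_t v : gray) {
        binary.push_back(v > t ? std::uint8_t{255} : std::uint8_t{0});
    }
    return binary;
}
```

For a scanned form the histogram usually shows one peak for paper and one for ink. A threshold in the valley between them separates the two.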

The histogram graph and values are shown below:

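Otsu algorithm: a nonparametric approach in which the ratio of between-class variance to within-class variance is first calculated for each potential threshold value, the classes being the foreground and background pixels. The threshold that maximizes this ratio is then chosen.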

Code:
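A minimal sketch of the Otsu search, not the post's original listing. In OpenCV the usual route is the cv::THRESH_OTSU flag of cv::threshold. Because the total variance of an image is fixed, maximizing the ratio of between-class to within-class variance is equivalent to maximizing the between-class variance alone, which is what this loop does. The function name otsuThreshold and the 256-bin histogram input are assumptions of the example.

```cpp
#include <array>
#include <cstddef>

// Return the threshold t in [0, 255] that maximizes the between-class variance
// when pixels with intensity <= t form one class and the rest form the other.
int otsuThreshold(const std::array<std::size_t, 256>& hist) {
    std::size_t total = 0;
    double sumAll = 0.0;
    for (int i = 0; i < 256; ++i) {
        total += hist[i];
        sumAll += static_cast<double>(i) * static_cast<double>(hist[i]);
    }

    std::size_t countLow = 0;   // pixels at or below the candidate threshold
    double sumLow = 0.0;        // sum of their intensities
    double bestScore = -1.0;
    int bestThreshold = 0;

    for (int t = 0; t < 256; ++t) {
        countLow += hist[t];
        sumLow += static_cast<double>(t) * static_cast<double>(hist[t]);
        if (countLow == 0) continue;            // dark class still empty
        const std::size_t countHigh = total - countLow;
        if (countHigh == 0) break;              // bright class is empty from here on

        const double meanLow = sumLow / static_cast<double>(countLow);
        const double meanHigh = (sumAll - sumLow) / static_cast<double>(countHigh);
        const double diff = meanLow - meanHigh;
        // Proportional to the between-class variance; the constant factor
        // 1 / total^2 is dropped because it does not change the maximum.
        const double score = static_cast<double>(countLow) *
                             static_cast<double>(countHigh) * diff * diff;
        if (score > bestScore) {
            bestScore = score;
            bestThreshold = t;
        }
    }
    return bestThreshold;
}
```

When several thresholds tie, as happens across an empty gap between two peaks, this version keeps the lowest one.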

Below is a binary image:

4. Finding the contour of the Image

In the legacy C API, the function cvFindContours retrieves contours from the binary image, stores a pointer to the first contour in its first_contour output argument, and returns the total number of contours retrieved. The other contours can be reached through the h_next and v_next fields of that structure. In the C++ API, cv::findContours instead fills a std::vector of point vectors, one per contour.

CODE:

5. Identifying all the circles of interest among the contours

After finding the contours, we have to detect all the circles.

CODE:
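A minimal sketch of one common way to decide whether a contour is a circle, not the post's original listing. The post does not say which test it used, so the circularity measure 4πA/P² is an assumption here. It equals 1 for a perfect circle and about 0.785 for a square. In OpenCV the area and perimeter would come from cv::contourArea and cv::arcLength. Point2, the function names, and the 0.85 cut-off are made up for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2-D point type for this sketch.
struct Point2 {
    double x, y;
};

// Area of a closed polygon by the shoelace formula.
double polygonArea(const std::vector<Point2>& c) {
    double twiceArea = 0.0;
    for (std::size_t i = 0; i < c.size(); ++i) {
        const Point2& a = c[i];
        const Point2& b = c[(i + 1) % c.size()];
        twiceArea += a.x * b.y - b.x * a.y;
    }
    return std::abs(twiceArea) / 2.0;
}

// Length of the closed polygon, including the edge from the last point to the first.
double polygonPerimeter(const std::vector<Point2>& c) {
    double length = 0.0;
    for (std::size_t i = 0; i < c.size(); ++i) {
        const Point2& a = c[i];
        const Point2& b = c[(i + 1) % c.size()];
        length += std::hypot(b.x - a.x, b.y - a.y);
    }
    return length;
}

// Circularity 4*pi*A / P^2; returns 0 for degenerate contours.
double circularity(const std::vector<Point2>& c) {
    const double kPi = 3.14159265358979323846;
    const double perimeter = polygonPerimeter(c);
    if (c.size() < 3 || perimeter == 0.0) return 0.0;
    return 4.0 * kPi * polygonArea(c) / (perimeter * perimeter);
}

// The 0.85 cut-off is an assumed value that would need tuning on real scans.
bool looksLikeCircle(const std::vector<Point2>& c, double minCircularity = 0.85) {
    return circularity(c) >= minCircularity;
}
```

Pixel contours have jagged edges, which lengthen the perimeter, so real circles may score noticeably below 1. Filtering by contour area as well helps reject small specks.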

6. Identifying the unshaded circles and drawing a green square around them

After detecting all the circles, we have to identify the circles of interest. Each circle of interest will be enclosed by a square. To do that, we need the point coordinates of these circles, which are stored in "fields39.csv". We have to read all the points from that file.

CODE:
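A minimal sketch of reading the points, not the post's original listing. The layout of "fields39.csv" is not shown in the post, so this assumes each row holds one point as "x,y" and skips any row that does not parse, such as a header. FieldPoint and readFieldPoints are made-up names.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one circle centre listed in the CSV file.
struct FieldPoint {
    int x;
    int y;
};

// Parse rows of the assumed form "x,y"; rows that do not match are skipped.
std::vector<FieldPoint> readFieldPoints(std::istream& in) {
    std::vector<FieldPoint> points;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream row(line);
        FieldPoint p{};
        char comma = '\0';
        if (row >> p.x >> comma >> p.y && comma == ',') {
            points.push_back(p);
        }
    }
    return points;
}
```

To read the real file, open it with std::ifstream file("fields39.csv") and pass file to readFieldPoints. If the file has more columns per row, the parsing line would need to change to match.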

Identify the coordinates (x,y) for the centroid of the circle of interest. Draw a green box for unshaded circles.

CODE:
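A minimal sketch of one way to test whether a circle is unshaded, not the post's original listing. It measures the fraction of black pixels in a square window around the centroid of the thresholded image and treats a low fraction as unshaded. The BinaryImage type, the function names, and the 0.25 cut-off are assumptions of this example. In OpenCV the green box itself would typically be drawn with cv::rectangle and the color cv::Scalar(0, 255, 0).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A binary image stored row-major: 0 = black (ink), 255 = white (paper).
// (Hypothetical container for this sketch, standing in for a cv::Mat.)
struct BinaryImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
};

// Fraction of black pixels inside the square of side 2*half+1 centered on
// (cx, cy), clipped to the image bounds.
double darkFraction(const BinaryImage& img, int cx, int cy, int half) {
    assert(img.pixels.size() ==
           static_cast<std::size_t>(img.width) * static_cast<std::size_t>(img.height));
    const int x0 = std::max(cx - half, 0);
    const int x1 = std::min(cx + half, img.width - 1);
    const int y0 = std::max(cy - half, 0);
    const int y1 = std::min(cy + half, img.height - 1);
    if (x0 > x1 || y0 > y1) return 0.0;  // window lies entirely outside the image

    std::size_t dark = 0;
    std::size_t total = 0;
    for (int y = y0; y <= y1; ++y) {
        for (int x = x0; x <= x1; ++x) {
            const std::size_t index = static_cast<std::size_t>(y) *
                                      static_cast<std::size_t>(img.width) +
                                      static_cast<std::size_t>(x);
            if (img.pixels[index] == 0) ++dark;
            ++total;
        }
    }
    return static_cast<double>(dark) / static_cast<double>(total);
}

// The 0.25 cut-off is an assumed value; the printed outline of an empty circle
// already contributes some black pixels, so the cut-off cannot be zero.
bool isUnshaded(const BinaryImage& img, int cx, int cy, int half,
                double maxDark = 0.25) {
    return darkFraction(img, cx, cy, half) < maxDark;
}
```

The window half-size should roughly match the circle radius on the scanned form, so that neighboring circles and text do not leak into the count.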

Below is the sample output.

7. Identifying the shaded circles and drawing a red square around them

Identify the coordinates (x,y) for the centroid of the shaded circles. Draw a red box for shaded circles.

CODE:

Below is the sample output.

8. Identifying the crossed-out circles and drawing a blue square around them

Identify the coordinates (x,y) for the centroid of the crossed-out circles. Draw a blue box for crossed-out circles.

CODE:
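A minimal sketch of a three-way decision, not the post's original listing. It assumes each circle has been reduced to a single number, the fraction of black pixels in the box around its centroid, and that a crossed-out circle falls between an empty one and a fully shaded one. That ordering and both cut-off values are assumptions that would need checking against real forms. The colors follow the post (green, red, blue) and are given in blue-green-red order, as OpenCV's cv::Scalar expects for color images.

```cpp
#include <array>
#include <cassert>

// The three states the post distinguishes.
enum class Mark { Unshaded, CrossedOut, Shaded };

// Classify a circle from the fraction of black pixels around its centroid.
// Both cut-offs are hypothetical defaults, not values from the post.
Mark classifyMark(double darkFraction, double unshadedMax = 0.25,
                  double shadedMin = 0.70) {
    assert(darkFraction >= 0.0 && darkFraction <= 1.0);
    assert(unshadedMax < shadedMin);
    if (darkFraction < unshadedMax) return Mark::Unshaded;
    if (darkFraction >= shadedMin) return Mark::Shaded;
    return Mark::CrossedOut;
}

// Box color in blue-green-red order: green for unshaded, blue for crossed-out,
// red for shaded.
std::array<int, 3> boxColorBgr(Mark mark) {
    switch (mark) {
        case Mark::Unshaded:   return {0, 255, 0};
        case Mark::CrossedOut: return {255, 0, 0};
        case Mark::Shaded:     return {0, 0, 255};
    }
    return {0, 0, 0};  // unreachable for valid enum values
}
```

If a respondent shades a circle and then crosses it out, the dark fraction alone will not separate the two cases. Some extra cue would be needed, such as ink extending beyond the circle outline.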

Below is the sample output.

SUMMARY:

After completing all the steps, you will obtain the final output. The basic steps in document analysis involve thresholding, segmentation, and representation. To obtain a good threshold, we can use the histogram or the Otsu algorithm. A single threshold value is not enough for all images, since the color values vary from image to image. After thresholding, we can do noise reduction and segmentation through methods such as Canny edge detection, erosion, dilation, and many others. We can also find the contours in our image with the function findContours() and detect the shapes we are interested in. In representation, we identify ways to represent a binary image, such as through its area, perimeter, or centroid. In the steps above, we made use of the centroid of each circle of interest, since it is already given in the .csv file.
The steps try to solve the problem of detecting shaded, unshaded, and crossed-out shapes as far as possible, but there are many better OpenCV methods for solving it. We just have to explore OpenCV further. It has a lot to offer in image processing.

Mar 2, 2016
