Skew Detection and Correction

Our recognition algorithm relies on the characters of a page being oriented correctly, for this reason detection and correction of any skew the page may have been scanned at is vital. Fateman has done some work in this area.

Skew Detection

There are several commonly used methods for detecting skew in a page, some rely on detecting connected components (for many purposes, they are roughly equivalent to characters) and finding the average angles connecting their centroids. The method we employed (after observing it in Fateman's program) was to project the page at several angles, and determine the variance in the number of black pixels per projected line.

The projetion parallel to the true alignment of the lines will likely have the the maximum variance, since when parallel, each given ray projected through the image will hit either almost no black pixels (as it passes between text lines) or many black pixels (while passing through many characters in sequence).

Sample image and a projection parallel to its text.

Oblique projections will commonly pass both through lines of text, and spaces between lines, the variance in the number of pixels hit by the individual rays will thus be smaller than in the parallel case.

Sample image and an oblique projection. Notice the uniform nature of the distribution

After detecting the skew, it is necessary to correct for it.

Skew Correction

After the skew angle of the page has been detected, our recognition algorithm demands that the page must be rotated to correct for this skew. Our rotation algorithm had to be both fairly fast and fairly accurate. We looked into two strategies for correcting slight skews in scanned images. The first was a pure coordinate transformation, which takes a little bit of time on large images, but gets the rotation exact. The next was somewhat less accurate, but faster (with the proper data structures) and works pretty well with slight skews that are common in scanned images. The second method uses two shears to achieve its result. Our final choice was the coordinate rotation, since we were focusing more on accuracy than speed and the this method was not horrendously slow.

An almost working rotation: