There are several commonly used methods for detecting skew in a page. Some rely on detecting connected components (which, for many purposes, are roughly equivalent to characters) and finding the average angle of the lines connecting their centroids. The method we employed (after observing it in Fateman's program) was to project the page at several angles and determine the variance in the number of black pixels per projected line.
The projection parallel to the true alignment of the lines will likely have the maximum variance, since, when parallel, each ray projected through the image will hit either almost no black pixels (as it passes between text lines) or many black pixels (as it passes through many characters in sequence).
Sample image and a projection parallel to its text.
Oblique projections will commonly pass through both lines of text and the spaces between them. The variance in the number of pixels hit by the individual rays will thus be smaller than in the parallel case.
Sample image and an oblique projection. Notice the uniform nature of the distribution.
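The projection test described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in our program or in Fateman's. The names profile_variance and estimate_skew, the image representation (a list of rows with 1 for a black pixel), the default candidate angles, and the sign convention (a positive angle means text lines run downward to the right, with y increasing downward) are all assumptions made for the example.

```python
import math

def profile_variance(image, angle_deg):
    """Variance of the black-pixel count per projected line at angle_deg.

    Assumes image is a list of equal-length rows (1 = black, 0 = white)
    and that angle_deg lies within -90..90 degrees.
    """
    height, width = len(image), len(image[0])
    theta = math.radians(angle_deg)
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    # Use the same bin range for every angle so the variances are comparable.
    offset = width
    counts = [0] * (height + 2 * width + 1)
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if pixel:
                # Pixels on the same ray share the value y*cos - x*sin.
                counts[int(round(y * cos_t - x * sin_t)) + offset] += 1
    mean = sum(counts) / len(counts)
    return sum((c - mean) ** 2 for c in counts) / len(counts)

def estimate_skew(image, candidates=None):
    """Return the candidate angle (degrees) whose projection has maximum variance."""
    if candidates is None:
        # Hypothetical default: -5 to 5 degrees in half-degree steps.
        candidates = [a / 2.0 for a in range(-10, 11)]
    return max(candidates, key=lambda a: profile_variance(image, a))
```

Calling estimate_skew(page) on a binarized page returns the best of the candidate angles, so the resolution of the estimate is limited by the spacing of the candidates. A coarse sweep followed by a finer one around the best coarse angle is one way to keep the cost down.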
After detecting the skew, it is necessary to correct for it.
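One common way to perform such a correction is inverse mapping: for every pixel of the output image, compute where it came from in the skewed image and copy the nearest source pixel. The sketch below illustrates that general technique and is not necessarily the rotation we used. The function name deskew, the list-of-rows image representation, the rotation about the image centre, and the sign convention (a positive skew means text lines run downward to the right, with y increasing downward) are assumptions made for the example.

```python
import math

def deskew(image, skew_deg, background=0):
    """Return a copy of image rotated so that lines at skew_deg become horizontal.

    Assumes image is a list of equal-length rows (1 = black, 0 = white).
    Uses inverse mapping with nearest-neighbour sampling, so every output
    pixel receives exactly one value and no holes appear.
    """
    height, width = len(image), len(image[0])
    theta = math.radians(skew_deg)
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    out = [[background] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            # A step along the output row follows the skewed text direction
            # (cos, sin) in the source image.
            sx = int(round(cx + dx * cos_t - dy * sin_t))
            sy = int(round(cy + dx * sin_t + dy * cos_t))
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = image[sy][sx]
    return out
```

Pixels whose source position falls outside the original image are filled with the background value, so the corners of a corrected page come out blank. Nearest-neighbour sampling keeps the image strictly black and white, at the cost of slightly jagged character edges.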
An almost working rotation: