Connected Components





con-pict(p) produces a Lisp list of the connected components of the picture p. In many cases these components correspond to the characters on the page image, although they can also be character fragments or artificially merged characters. The current version of this program finds about 2000 components per second (including Lisp garbage collection time) on the file 6.tif mentioned earlier. This is probably not an entirely typical example, since many of its components are small pieces of the binding and page edges. Another, probably more typical, sample: an 8.5 by 11 inch typed page in Courier typeface was found to have 2134 connected components in 1.6 seconds, a rate of about 1334 components per second.
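
To make the notion of a connected component concrete, here is a minimal sketch in Python. It is not the Lisp implementation of con-pict, whose algorithm and data structures are not described here. The picture representation (a list of rows of 0/1 values, with 1 meaning black), the use of 8-connectivity, and the names connected_components and bounding_box are all assumptions made for illustration.

```python
from collections import deque


def connected_components(picture):
    """Return a list of components, each a list of (row, col) black pixels.

    Assumption: `picture` is a list of equal-length rows of 0/1 values,
    where 1 is a black pixel. Pixels are joined under 8-connectivity.
    """
    rows = len(picture)
    cols = len(picture[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if picture[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited black pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            component = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and picture[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(component)
    return components


def bounding_box(component):
    """Return (top, left, bottom, right) of a component, inclusive."""
    ys = [y for y, _ in component]
    xs = [x for _, x in component]
    return (min(ys), min(xs), max(ys), max(xs))
```

On a page image, each returned component would ideally be one character. A broken stroke yields two components for one character, and touching characters yield a single component for two, which is how the fragments and merged characters mentioned above arise.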

We believe this program is quite fast, although this speed could be improved substantially by using lower resolution, and hence smaller, images. Halving the linear resolution should speed the processing by a factor of two.
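
As an illustration of what halving the linear resolution means for the image, the following Python sketch reduces a binary picture by OR-ing each 2 by 2 block of pixels into one. The reduction rule, the list-of-rows representation, and the name halve_resolution are assumptions; the text does not say how a lower-resolution image would be obtained. The sketch shows only the reduction in image size and makes no claim about the resulting running time.

```python
def halve_resolution(picture):
    """Halve the linear resolution of a binary picture.

    Assumption: `picture` is a list of equal-length rows of 0/1 values.
    Each output pixel is black (1) if any pixel in the corresponding
    2x2 input block is black. A trailing odd row or column is dropped.
    """
    rows = len(picture) // 2
    cols = len(picture[0]) // 2 if picture else 0
    return [
        [
            1 if (picture[2 * r][2 * c] or picture[2 * r][2 * c + 1]
                  or picture[2 * r + 1][2 * c]
                  or picture[2 * r + 1][2 * c + 1]) else 0
            for c in range(cols)
        ]
        for r in range(rows)
    ]
```

The result has half as many rows and half as many columns as the input. Note that a reduction of this kind can join strokes that were one pixel apart, so it may increase the number of artificially merged characters.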



Class Account
Fri Dec 1 14:31:16 PST 1995