This project was submitted as a senior undergraduate honors project by
Kathey Marsden, a Spring 1996 Computer Science graduate from the
University of California at Berkeley.
This Optical Character Recognition software package began with work
in CS169 (Software Engineering) in Fall 1995,
under the direction of
Professor Richard J. Fateman. The original OCR package could learn
from a TIFF file and its ASCII translation, then recognize a document in the
same font. This semester we added interactive learning, interactive
segmentation of mathematics, page zoning (the ability to automatically or manually
zone columns or regions of text), and interactive read-order specification.
The original team members were Archie Russell, James Hopkin, and Cynthia Tian,
who contributed significantly to the original design.
Improved/Cut Down Version
(March, 2001)
Keith Davies (kjdavies@telus.net)
has substantially revised this project and made it work for his
application, which appears to require reading numeric digits. He
also removed the dependence on Tcl/Tk, which had become a
troublesome issue: newer Tcl/Tk versions had changed to the
extent that OCRchie needed changes to continue to run.
In any case, Keith was kind enough to send this material back
to us, and we have posted it in this directory. If you make use of it,
perhaps you should keep Keith informed as well.
Start by reading
Keith's email about the changes and re-organization.
There is a compressed tarball in that directory that can be
downloaded at your convenience.
-- Richard Fateman
Project Documents
OCR Reference Links
If the links below have moved, you will undoubtedly be able to
find many others with a search engine. Here are a few current sites
(last updated July 2000).
Various non-technical industry reports and press-release
reviews on OCR. At least some were written by non-experts and,
judging from the errors, may have been scanned in! How else to explain
misspelling ASCII as ASCH?
An older (1997) but more respectably academic collection of links on OCR
TIFF Reference Links
Some info on TIFF
Comments? Mail
fateman@cs.berkeley.edu