We discuss some sample material, "ideally typeset", in the PostScript file
"OCR: what we would like to understand". We have programs that can parse
the main equations, labelled 18 and 19.
We also provide some raw material as a large TIFF file, a page that you can
either view or download to try out. Careful: this file is an image of
2948 by 3884 one-bit pixels, 1433294 bytes.
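As a quick sanity check on those figures, the arithmetic below assumes the page is stored uncompressed at one bit per pixel, with each row padded to a whole byte. That storage layout is an assumption and was not read from the file itself. Under it, the quoted dimensions account for all but 98 bytes of the quoted file size, which would be a plausible amount of header and tag data.

```python
def bilevel_raster_bytes(width_px, height_px):
    """Bytes of uncompressed 1-bit-per-pixel data, each row padded to a whole byte."""
    bytes_per_row = (width_px + 7) // 8   # round up to the next full byte
    return bytes_per_row * height_px

# Figures quoted in the text.
WIDTH, HEIGHT = 2948, 3884
FILE_BYTES = 1433294

pixel_bytes = bilevel_raster_bytes(WIDTH, HEIGHT)   # 369 bytes per row * 3884 rows
overhead = FILE_BYTES - pixel_bytes                 # what is left for header and tags

print(pixel_bytes, overhead)   # prints: 1433196 98
```

Expect roughly 1.4 MB of memory per copy of the page if you keep it packed, or about 11.5 MB if you unpack it to one byte per pixel.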
We have already deskewed the page and cropped off the margins.
You might try to confirm that it is deskewed, filter out noise, repair
the broken or merged characters, recognize the symbols, and parse the
results. Ideally you would produce output in a form that could easily be
(a) fed into TeX or, alternatively, (b) fed into a computer algebra system.
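One way to serve both targets is to keep a single parse tree and write it out twice. The sketch below is only an illustration of that idea. The tuple node shapes, the operator names, and the sample integral are all hypothetical; none of them is taken from equations 18 or 19 or from the parsing programs mentioned above. The prefix form stands in for whatever input syntax a particular computer algebra system expects.

```python
# Hypothetical tree shape: a str or int is a leaf; a tuple is (operator, *operands).
TEX_LEAVES = {"inf": r"\infty"}

def to_tex(node):
    """Render a tree as TeX. Handles only the few operators used in the example."""
    if not isinstance(node, tuple):
        return TEX_LEAVES.get(node, str(node))
    op, *args = node
    a = [to_tex(x) for x in args]
    if op == "neg":
        return "-" + a[0]
    if op == "mul":
        return " ".join(a)               # juxtaposition, as in "a x^{2}"
    if op == "pow":
        return a[0] + "^{" + a[1] + "}"  # assumes the base is a single symbol
    if op == "exp":
        return "e^{" + a[0] + "}"
    if op == "integral":
        body, var, lo, hi = a
        return r"\int_{%s}^{%s} %s \, d%s" % (lo, hi, body, var)
    raise ValueError("unknown operator: %r" % (op,))

def to_prefix(node):
    """Render the same tree as a fully parenthesized prefix expression."""
    if not isinstance(node, tuple):
        return str(node)
    op, *args = node
    return "(" + " ".join([op] + [to_prefix(x) for x in args]) + ")"

# Made-up sample: the integral of exp(-a x^2) for x from 0 to infinity.
example = ("integral",
           ("exp", ("neg", ("mul", "a", ("pow", "x", 2)))),
           "x", 0, "inf")

tex_form = to_tex(example)
prefix_form = to_prefix(example)
print(tex_form)      # \int_{0}^{\infty} e^{-a x^{2}} \, dx
print(prefix_form)   # (integral (exp (neg (mul a (pow x 2)))) x 0 inf)
```

Keeping the tree as the only source means a recognition error is corrected once and both outputs follow. A real version would also need operator precedence and parenthesization, which this sketch ignores.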
We would like to be able to handle pages of this difficulty or worse,
since we have a few thousand of them. We would like to use the resulting
formulas with a related computer-algebra project that provides rapid
table lookup of integrals. This paper appeared in ISSAC-95 (International
Symposium on Symbolic and Algebraic Computation, (c) ACM).