We discuss some sample material, "ideally typeset," in the PostScript file "OCR: what we would like to understand." We have programs that can parse the main equations, labelled 18 and 19. We also provide some raw material as a large TIFF file that you can either view or download as a page to try out. Be careful: this file is 2948 by 3884 pixels of image, 1,433,294 bytes. We have already deskewed the page and cropped off the margins. You might try to confirm that it is deskewed, filter out noise, repair the broken or merged characters, recognize the symbols, and parse the results. Ideally you would provide output in a form that could easily be (a) fed into TeX or, alternatively, (b) fed into a computer algebra system.
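To make the two target forms concrete, here is a minimal sketch in Python of how one parsed formula might be emitted both ways. The nested-tuple parse tree, the operator names, and the functions `to_tex` and `to_prefix` are hypothetical choices made for this illustration; they are not the format our programs use, and the integral shown is not one of the equations on the sample page.

```python
# Hypothetical parse-tree format (an assumption for this sketch):
# nested tuples (operator, operand, ...); leaves are symbols or numbers.

def to_tex(e):
    """Render a parse tree as a TeX math string."""
    if not isinstance(e, tuple):
        return str(e)
    op, *args = e
    if op == "+":
        return " + ".join(to_tex(a) for a in args)
    if op == "*":
        return " ".join(paren_sum(a) for a in args)
    if op == "/":
        return r"\frac{%s}{%s}" % (to_tex(args[0]), to_tex(args[1]))
    if op == "^":
        base, exponent = args
        b = to_tex(base)
        if isinstance(base, tuple):  # parenthesize any compound base
            b = "(" + b + ")"
        return "%s^{%s}" % (b, to_tex(exponent))
    if op == "integral":  # (integral, integrand, variable, lower, upper)
        f, x, lo, hi = args
        return r"\int_{%s}^{%s} %s \, d%s" % (
            to_tex(lo), to_tex(hi), paren_sum(f), to_tex(x))
    raise ValueError("unknown operator: %r" % (op,))

def paren_sum(e):
    """Parenthesize sums so they bind correctly inside a product or integrand."""
    s = to_tex(e)
    return "(" + s + ")" if isinstance(e, tuple) and e[0] == "+" else s

def to_prefix(e):
    """Render a parse tree as a Lisp-style prefix form for an algebra system."""
    if not isinstance(e, tuple):
        return str(e)
    return "(" + " ".join([e[0]] + [to_prefix(a) for a in e[1:]]) + ")"

# Illustrative formula: the integral of 1/(x^2 + 1) from 0 to 1.
tree = ("integral", ("/", 1, ("+", ("^", "x", 2), 1)), "x", 0, 1)

print(to_tex(tree))     # \int_{0}^{1} \frac{1}{x^{2} + 1} \, dx
print(to_prefix(tree))  # (integral (/ 1 (+ (^ x 2) 1)) x 0 1)
```

Keeping a single tree and deriving both outputs from it means a recognition error needs to be corrected in only one place. The prefix form is just one plausible input syntax; a particular computer algebra system may well expect something different.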
We would like to be able to handle pages of this difficulty or worse, since we have a few thousand of them. We would like to use the resulting formulas in a related computer-algebra project that provides rapid table lookup of integrals. This paper appeared in ISSAC-95 (International Symposium on Symbolic and Algebraic Computation, (c) ACM).