Our primary class for OCRchie is Page, declared in Page.h. Page contains all of the high-level functions for accessing the OCR package. We keep both a BitMap and an RLE representation of a page of scanned data within the class. Throughout this document, a row refers to a row of pixels and a line refers to a line of text.
Class data members:
Page.h classes
Page - Represents a page of scanned text
int fnumLines; // number of lines of text in document
LineMarker * flineinfo; // for each line - start and end row in map
Components * fLearnedChars; // A list of learned characters
Components ** fLineComponents; // A list of components for each line
BitMap * fBitMap; // Pointer to BitMap representation
RLEMap * fRLEMap; // Pointer to an RLEMap representation
ZonedPage - Inherits from Page. Represents a zoned page of text. It has one additional data member, which holds the zone boundaries and subpages.
Zones * fzones;
Zone.h
Zone - Represents one zone within a zoned page
Point ful; // upper left point
Point flr; // lower right point
Page * fpage; // Page representation of this zone
Page * fparent; // page of which this is a part
Zones - a list of type Zone.
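LineMarker.h
LineMarker - Marks the start and end row for a line of text
int fstartrow; // start row of the line in the map
int fendrow; // end row of the line in the map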
Component.h
Component - Represents a character area on the page
Point ful; // upper left point
Point flr; // lower right point
int fnumBits; // number of bits in character (area)
Property* fproperty[numProperties]; // property vector
Character* fasciiId; // ascii Identification
Confidence fconfid; // Confidence
BitMap.h
BitMap - A bitmap representation of a scanned page
int fImageWidth;
int fImageLength;
MapStatus fStatus; // MapStatus enum in system.h (see below)
uchar ** fMapData; // a two dimensional array of unsigned chars
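A bitmap stored as a two-dimensional array of unsigned chars gives direct access to any pixel. The sketch below assumes one bit per pixel, packed eight to a byte with the leftmost pixel in the high bit. That packing is an assumption, not something stated here, and PackedBitMap, setBlack and isBlack are hypothetical names rather than members of the real BitMap class.

```cpp
#include <cassert>
#include <vector>

typedef unsigned char uchar;

// Hypothetical stand-in for BitMap: one packed array of bytes per pixel row.
// ASSUMPTION: 1 bit per pixel, bit 7 of each byte is the leftmost pixel.
struct PackedBitMap {
    int width;                              // pixels per row (cf. fImageWidth)
    int length;                             // number of rows (cf. fImageLength)
    std::vector<std::vector<uchar> > rows;  // cf. fMapData

    PackedBitMap(int w, int l)
        : width(w), length(l),
          rows(l, std::vector<uchar>((w + 7) / 8, 0)) {}

    // Mark pixel (x, y) as black.
    void setBlack(int x, int y) {
        assert(x >= 0 && x < width && y >= 0 && y < length);
        rows[y][x / 8] |= static_cast<uchar>(0x80 >> (x % 8));
    }

    // Test pixel (x, y): one index computation and one mask, no searching.
    bool isBlack(int x, int y) const {
        assert(x >= 0 && x < width && y >= 0 && y < length);
        return (rows[y][x / 8] & (0x80 >> (x % 8))) != 0;
    }
};
```

Under this layout a 20-pixel-wide row occupies 3 bytes, and reading a pixel costs the same no matter where it sits in the row.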
RLEMap.h
RLEMap - A run-length-encoded representation of a scanned page. This structure keeps track of black pixel "runs": stretches of adjacent black pixels. If pixels 10 through 200 on a given row are black, this run is stored as (10,200), a considerable saving over the 191 bits required in the bitmap representation. This structure is also very fast to display and rotate, but it can be significantly slower than a bitmap at accessing individual pixels.
RLEPairs ** fMapData; // Array of length fImageLength (one list for each row)
int fImageLength;
int fImageWidth;
MapStatus fStatus;
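To make the run representation concrete, here is a minimal sketch of encoding one row of pixels into inclusive (start, end) runs and then looking up a single pixel. Run, encodeRow and isBlack are hypothetical names used only for illustration. The real RLEMap stores RLEPairs lists, and its conversion routine may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for RLEPair: an inclusive range of black pixels.
struct Run {
    int start;
    int end;
};

// Convert one row of pixels (true = black) into a left-to-right list of runs.
std::vector<Run> encodeRow(const std::vector<bool>& row) {
    std::vector<Run> runs;
    int runStart = -1;  // -1 means we are not currently inside a run
    for (std::size_t x = 0; x < row.size(); ++x) {
        if (row[x] && runStart < 0) {
            runStart = static_cast<int>(x);          // a run begins here
        } else if (!row[x] && runStart >= 0) {
            Run r = { runStart, static_cast<int>(x) - 1 };
            runs.push_back(r);                       // the run ended at x - 1
            runStart = -1;
        }
    }
    if (runStart >= 0) {                             // run reaches the row's end
        Run r = { runStart, static_cast<int>(row.size()) - 1 };
        runs.push_back(r);
    }
    return runs;
}

// Looking up one pixel means searching the runs, which is why single-pixel
// access is slower here than in a bitmap.
bool isBlack(const std::vector<Run>& runs, int x) {
    assert(x >= 0);
    for (std::size_t i = 0; i < runs.size(); ++i) {
        if (x < runs[i].start) return false;  // runs are sorted, so stop early
        if (x <= runs[i].end) return true;
    }
    return false;
}
```

A row whose pixels 10 through 200 are black encodes to the single run (10,200), however wide the row is.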
Word.h
Word - a list of components and ascii translation that make up a word
char * characters; // text string
int charCount; // Number of characters in the string
int compCount; // Number of components
short mispelled; // spelled correctly? only valid if the word has been spellchecked
Confidence confid; // The minimum confidence over all chars in the word
Point ul; // upper left point of word
Point lr; // lower right point of word
Component ** character; // an array of components that make up the word
Words - a list of type Word
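Since a word's confidence is the minimum over its characters, a single doubtful character lowers the confidence of the whole word. The sketch below shows that rule in isolation. It assumes Confidence is an integer type, which is not stated here, and wordConfidence is a hypothetical helper rather than a member of Word.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef int Confidence;  // ASSUMPTION: the real typedef lives in system.h

// Return the lowest per-character confidence; this is the word's confidence.
Confidence wordConfidence(const std::vector<Confidence>& charConfidences) {
    assert(!charConfidences.empty());  // a word has at least one character
    Confidence lowest = charConfidences[0];
    for (std::size_t i = 1; i < charConfidences.size(); ++i) {
        if (charConfidences[i] < lowest) {
            lowest = charConfidences[i];
        }
    }
    return lowest;
}
```

For character confidences of 200, 90 and 240, the word's confidence is 90.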
RLEPair.h
RLEPair - Marks start and end of a black pixel range
int start;
int end;
RLEPairs - A list of type RLEPair. Inherits class List and adds:
int numPixels; // the number of pixels in the entire list
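A running pixel count is cheap to maintain if it is updated whenever a pair is added. The sketch below illustrates the idea with std::list standing in for the project's own List class. RLEPairSketch, RLEPairsSketch and append are hypothetical names, and treating each range as inclusive at both ends is an assumption based on the (10,200) example.

```cpp
#include <cassert>
#include <list>

// Simplified stand-in for RLEPair.
struct RLEPairSketch {
    int start;
    int end;
};

// Stand-in for RLEPairs: a list of pairs plus a running black-pixel total.
class RLEPairsSketch {
public:
    RLEPairsSketch() : numPixels(0) {}

    // Add a run and keep numPixels in step with the list contents.
    void append(int start, int end) {
        assert(start <= end);
        RLEPairSketch p = { start, end };
        pairs.push_back(p);
        numPixels += end - start + 1;  // both endpoints are black
    }

    int numPixels;                     // pixels in the entire list
    std::list<RLEPairSketch> pairs;
};
```

Appending the run (10,200) adds 191 pixels to the count, and a single-pixel run such as (250,250) adds 1.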
See system.h for some global vars, typedefs, and enumerated types
Header files
system.h - Global typedefs, enumerated types and include files. Also declaration of global variables.
Page.h - Primary class - access to OCR functions
Zone.h - Zone representation
BitMap.h - BitMap representation
RLEMap.h - RLEMap representation
RLEPair.h - Starting and ending point of a range of black pixels. Also contains RLEPairs.
Component.h - Components. Each component represents a character area on the page. Also contains Components.
Word.h - A collection of components and their ascii translation.
LineMarker.h - LineMarkers mark the start/end row for each line of text.
list.h - The list class we are using
learn.h - Function headers for learning. Learn functions are in learn.cc