Our primary class for OCRchie is Page, declared in Page.h. Page contains all of the high-level functions needed to access the OCR package. The class keeps both a BitMap and an RLE representation of a page of scanned data. Throughout this document, a row refers to a row of pixels and a line refers to a line of text.
Class data members:
Page.h classes
Page - Represents a page of scanned text
int fnumLines;  // number of lines of text in document
LineMarker * flineinfo;  // for each line - start and end row in map
Components * fLearnedChars;  // a list of learned characters
Components ** fLineComponents;  // a list of components for each line
BitMap * fBitMap;  // pointer to BitMap representation
RLEMap * fRLEMap;  // pointer to an RLEMap representation
ZonedPage - Inherits from Page. Represents a zoned page of text. It has one additional data member, which holds the zone boundaries and subpages.
Zones * fzones;
Zone.h
Zone - Represents one zone within a zoned page
Point ful; // upper left point
Point flr; // lower right point
Page * fpage; // Page representation of this zone
Page * fparent; // page of which this is a part
Zones - a list of type Zone.
LineMarker.h
LineMarker - Marks the start and end row of a line of text
int fstartrow;
int fendrow;
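The start and end rows above tie pixel rows to lines of text (the header list describes LineMarker in these terms). As a rough illustration of the row versus line distinction, the sketch below looks up which text line a given pixel row belongs to. LineSpan and lineForRow are hypothetical names, not part of OCRchie, and treating the end row as inclusive is an assumption.

```cpp
#include <vector>

// Hypothetical stand-in for LineMarker: the pixel rows spanned by one line of text.
struct LineSpan {
    int startRow;
    int endRow;  // assumption: the end row is inclusive
};

// Return the index of the text line containing pixel row `row`,
// or -1 if the row falls in the white space between lines.
int lineForRow(const std::vector<LineSpan>& lines, int row) {
    for (int i = 0; i < static_cast<int>(lines.size()); ++i) {
        if (row >= lines[i].startRow && row <= lines[i].endRow) {
            return i;
        }
    }
    return -1;
}
```

With two lines spanning rows 12 to 40 and 55 to 83, row 60 maps to line 1 and row 47 maps to no line.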
Component.h
Component - Represents a character area on the page
Point ful; // upper left point
Point flr; // lower right point
int fnumBits; // number of bits in character (area)
Property* fproperty[numProperties]; // property vector
Character* fasciiId; // ascii identification
Confidence fconfid; // confidence
BitMap.h
BitMap - A bitmap representation of a scanned page
int fImageWidth;
int fImageLength;
MapStatus fStatus; // MapStatus enum in system.h (see below)
uchar ** fMapData; // a two dimensional array of unsigned chars
RLEMap.h
RLEMap - A run-length-encoded representation of a scanned page. This structure keeps track of black pixel "runs": stretches of adjacent black pixels. If pixels 10 through 200 on a given row are black, this run is stored as (10,200), a considerable saving over the 191 bits required in the bitmap representation. This structure is also very fast to display and rotate, but it can be significantly slower than a bitmap at accessing individual pixels.
RLEPairs ** fMapData; // array length = fImageLength (one list for each row)
int fImageLength;
int fImageWidth;
MapStatus fStatus;
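To make the run idea concrete, here is a minimal sketch of encoding a single row into black runs and then testing one pixel against those runs. It is not the OCRchie implementation: Run, encodeRow and isBlack are hypothetical names, and the row is assumed to hold one unsigned char per pixel with nonzero meaning black.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for RLEPair: one run of black pixels, both ends inclusive.
struct Run {
    int start;
    int end;
};

// Encode one row of pixels (nonzero = black) as a list of black runs.
std::vector<Run> encodeRow(const std::vector<unsigned char>& row) {
    std::vector<Run> runs;
    int runStart = -1;  // -1 means "not currently inside a run"
    const int width = static_cast<int>(row.size());
    for (int col = 0; col < width; ++col) {
        if (row[col] != 0) {
            if (runStart < 0) runStart = col;     // a new run begins here
        } else if (runStart >= 0) {
            runs.push_back({runStart, col - 1});  // the run ended at the previous pixel
            runStart = -1;
        }
    }
    if (runStart >= 0) runs.push_back({runStart, width - 1});  // run reaches the right edge
    return runs;
}

// Test a single pixel. Unlike a bitmap lookup, this has to search the runs.
bool isBlack(const std::vector<Run>& runs, int col) {
    assert(col >= 0);
    for (const Run& r : runs) {
        if (col < r.start) return false;  // runs are sorted, so no later run can match
        if (col <= r.end) return true;
    }
    return false;
}
```

The linear search in isBlack shows why single-pixel access can be slower here than in a bitmap, where a pixel is found by direct indexing.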
Word.h
Word - a list of components and ascii translation that make up a word
char * characters; // text string
int charCount; // number of characters in the string
int compCount; // number of components
short mispelled; // spelled correctly? only valid if the word has been spellchecked
Confidence confid; // the minimum confidence of each char in word
Point ul; // upper left point of word
Point lr; // lower right point of word
Component ** character; // an array of components that make up the word
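The word-level confidence is described above as the minimum over the word's characters. The sketch below shows one way that value and the word's corner points could be derived from its components. Glyph, WordSummary and summarize are hypothetical names. The sketch also assumes that Confidence behaves like an integer score, that y grows downward as in image coordinates, and that the word's box is the union of its component boxes.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Point {
    int x;
    int y;
};

// Hypothetical, trimmed-down component: bounding box plus recognition confidence.
struct Glyph {
    Point ul;        // upper left corner
    Point lr;        // lower right corner
    int confidence;  // assumption: an integer score, higher is better
};

struct WordSummary {
    Point ul;
    Point lr;
    int confidence;
};

// Combine the components of one word: union of the boxes, minimum of the confidences.
WordSummary summarize(const std::vector<Glyph>& glyphs) {
    assert(!glyphs.empty());
    WordSummary w{glyphs[0].ul, glyphs[0].lr, glyphs[0].confidence};
    for (const Glyph& g : glyphs) {
        w.ul.x = std::min(w.ul.x, g.ul.x);
        w.ul.y = std::min(w.ul.y, g.ul.y);  // y grows downward, so "upper" is the smaller y
        w.lr.x = std::max(w.lr.x, g.lr.x);
        w.lr.y = std::max(w.lr.y, g.lr.y);
        w.confidence = std::min(w.confidence, g.confidence);
    }
    return w;
}
```

Taking the minimum means one poorly recognized character is enough to lower the confidence of the whole word.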
Words - a list of type Word
RLEPair.h
RLEPair - Marks start and end of a black pixel range
int start; int end;
RLEPairs - A list of type RLEPair. Inherits class List and adds:
int numPixels; // the number of pixels in the entire list
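As a small illustration of the numPixels idea, the sketch below keeps a running total of black pixels as runs are appended. Run and RunList are hypothetical stand-ins for RLEPair and RLEPairs, std::vector is used in place of the project's own List class, and updating the total on insertion is an assumption about how the count is maintained.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for RLEPair: start and end are both inclusive.
struct Run {
    int start;
    int end;
};

// Hypothetical stand-in for RLEPairs: the runs of one row plus a running pixel total.
class RunList {
public:
    void add(int start, int end) {
        runs_.push_back({start, end});
        numPixels_ += end - start + 1;  // inclusive range, so (10,200) counts 191 pixels
    }
    int numPixels() const { return numPixels_; }
    std::size_t size() const { return runs_.size(); }

private:
    std::vector<Run> runs_;
    int numPixels_ = 0;
};
```

Keeping the total up to date on insertion lets a caller read the black-pixel count of a row without walking the list.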
See system.h for some global vars, typedefs, and enumerated types
Header files
system.h - Global typedefs, enumerated types and include files. Also declares global variables.
Page.h - Primary class; access to OCR functions
Zone.h - Zone representation
BitMap.h - BitMap representation
RLEMap.h - RLEMap representation
RLEPair.h - Starting and ending point of a range of black pixels. Also contains RLEPairs.
Component.h - Components. Each component represents a character area on the page. Also contains Components.
Word.h - A collection of components and their ascii translation
LineMarker.h - LineMarkers mark the start/end row for each line of text
list.h - The list class we are using
learn.h - Function headers for learning. Learn functions are in learn.cc