OCRchie Data Representation and Classes

Our primary class for OCRchie is Page, declared in Page.h. Page contains all of the high-level functions for accessing the OCR package. We keep both a BitMap and an RLEMap representation of a page of scanned data within the class. Throughout this document, a row refers to a row of pixels and a line refers to a line of text.

Class data members:

Page.h classes

Page - Represents a page of scanned text

        int fnumLines;            // number of lines of text in document
        LineMarker * flineinfo;   // for each line - start and end row in map
        Components * fLearnedChars;      // A list of learned characters
        Components ** fLineComponents;   // A list of components for each line
        BitMap * fBitMap;                // Pointer to BitMap representation
        RLEMap * fRLEMap;                // Pointer to an RLEMap representation

ZonedPage - Inherits from Page. Represents a zoned page of text. It has one additional data member, which holds the zone boundaries and subpages.

        Zones * fzones;


Zone.h

Zone - Represents one zone within a zoned page

        Point ful;        // upper left point
        Point flr;        // lower right point
        Page * fpage;     // Page representation of this zone
        Page * fparent;   // page of which this is a part

Zones - A list of type Zone.


LineMarker.h

LineMarker - Records the starting and ending row of a line of text

         int fstartrow;
         int fendrow;


Component.h

Component - Represents one character. The boundary of the character is determined by a vertical projection followed by a horizontal shrink down, which together create a bounding box for the character.

        Point ful;   // upper left point
        Point flr;   // Lower right point
        int fnumBits; // number of bits in character (area)
        Property*   fproperty[numProperties]; // property vector
        Character* fasciiId;       // ascii Identification
        Confidence fconfid;       // Confidence
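
To make the bounding box idea concrete, here is a minimal sketch of one way to do it, not the actual Component code. It assumes the line of text is held with one unsigned char per pixel (nonzero = black) and that every row has the same width. The names Bitmap, Box and findBoxes are hypothetical and do not appear in the OCRchie headers.

```cpp
#include <vector>

// Assumed layout: line[row][col], nonzero = black pixel.
using Bitmap = std::vector<std::vector<unsigned char>>;

// Hypothetical bounding box with inclusive edges
// (left/top play the role of ful, right/bottom the role of flr).
struct Box {
    int left, top, right, bottom;
};

std::vector<Box> findBoxes(const Bitmap& line) {
    std::vector<Box> boxes;
    if (line.empty()) return boxes;
    const int rows = static_cast<int>(line.size());
    const int cols = static_cast<int>(line[0].size());

    // Vertical projection: count the black pixels in each column.
    std::vector<int> colCount(cols, 0);
    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < cols; ++x)
            if (line[y][x] != 0) ++colCount[x];

    int x = 0;
    while (x < cols) {
        if (colCount[x] == 0) { ++x; continue; }  // blank column between characters

        // A maximal stretch of non-blank columns is one character candidate.
        const int left = x;
        while (x < cols && colCount[x] > 0) ++x;
        const int right = x - 1;

        // Shrink down: keep only the rows that contain black pixels
        // somewhere between left and right.
        int top = -1, bottom = -1;
        for (int y = 0; y < rows; ++y) {
            bool hasBlack = false;
            for (int c = left; c <= right && !hasBlack; ++c)
                hasBlack = (line[y][c] != 0);
            if (hasBlack) {
                if (top < 0) top = y;
                bottom = y;
            }
        }
        boxes.push_back({left, top, right, bottom});
    }
    return boxes;
}
```

Calling findBoxes on the rows of one text line returns one box per character candidate, ordered left to right. Characters that touch each other would come out as a single box under this simple scheme.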


Components - A list of components - Inherits from List class in list.h


BitMap.h

BitMap - A BitMap representation of a scanned page

        int fImageWidth;
        int fImageLength;
        MapStatus fStatus;  // MapStatus enum in system.h (see below)
        uchar ** fMapData;  // a two dimensional array of unsigned chars


RLEMap.h

RLEMap - A run-length-encoded representation of a scanned page. This structure keeps track of black pixel "runs": stretches of adjacent black pixels. If pixels 10 through 200 on a given row are black, the run is stored as the pair (10,200), a considerable saving over the 191 bits required in the bitmap representation. This structure is also very fast to display and rotate, but it can be significantly slower than a bitmap at accessing individual pixels.

        RLEPairs ** fMapData;    // Array length = fImageLength (one list for each row)
        int fImageLength;
        int fImageWidth;
        MapStatus fStatus;
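
As a rough illustration of the run idea (not the actual RLEMap code), the sketch below encodes one row of pixels into runs, counts the black pixels, and looks a single pixel up again. The Run struct and the functions encodeRow, countPixels and isBlack are hypothetical names, and the row is assumed to hold one unsigned char per pixel with nonzero meaning black.

```cpp
#include <vector>

// Hypothetical stand-in for an RLEPair: inclusive start and end
// columns of one stretch of black pixels.
struct Run {
    int start;
    int end;
};

// Encode one row of pixels as a left-to-right list of black runs.
std::vector<Run> encodeRow(const std::vector<unsigned char>& row) {
    std::vector<Run> runs;
    const int width = static_cast<int>(row.size());
    int runStart = -1;  // -1 means "not currently inside a run"
    for (int x = 0; x < width; ++x) {
        if (row[x] != 0) {
            if (runStart < 0) runStart = x;      // a new run begins here
        } else if (runStart >= 0) {
            runs.push_back({runStart, x - 1});   // the run ended on the previous pixel
            runStart = -1;
        }
    }
    if (runStart >= 0) runs.push_back({runStart, width - 1});  // run reaches the row's end
    return runs;
}

// Total number of black pixels in a row. Both ends of a run are inclusive.
int countPixels(const std::vector<Run>& runs) {
    int total = 0;
    for (const Run& r : runs) total += r.end - r.start + 1;
    return total;
}

// Looking up a single pixel means walking the run list,
// where a bitmap would index the pixel directly.
bool isBlack(const std::vector<Run>& runs, int x) {
    for (const Run& r : runs) {
        if (x < r.start) return false;  // runs are sorted, so no later run can match
        if (x <= r.end) return true;
    }
    return false;
}
```

The linear walk in isBlack shows why individual pixel access can be slower than with a bitmap, while whole-row operations only need to touch a handful of pairs.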

Word.h

Word - A list of components and their ascii translation that make up a word

        char * characters;        // text string
        int charCount;            // Number of characters in the string
        int compCount;            // Number of components

        short mispelled;          // spelled correctly? only valid if the word
                                  // has been spellchecked

        Confidence confid;        // The minimum confidence over all chars in the word
        Point ul;                 // upper left point of word
        Point lr;                 // lower right point of word

        Component ** character;   // an array of components that make up the word
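
Since the word confidence is defined as the minimum over its characters, a word is only as trustworthy as its weakest character. A minimal sketch of that rule follows. The definition of Confidence lives in the OCRchie headers and is not shown in this document, so the sketch assumes it is a plain integer, and wordConfidence is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumption: Confidence is an integer score where larger means more certain.
using Confidence = int;

// Hypothetical helper: the confidence of a word is the smallest
// confidence among the characters that make it up.
Confidence wordConfidence(const std::vector<Confidence>& charConfidences) {
    assert(!charConfidences.empty());  // a word has at least one character
    return *std::min_element(charConfidences.begin(), charConfidences.end());
}
```

With character confidences of 200, 35 and 180, the word would get a confidence of 35, so one doubtful character is enough to flag the whole word.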


Words - A list of type Word

RLEPair.h

RLEPair - Marks start and end of a black pixel range

        int start;
        int end;

RLEPairs - A list of type RLEPair - Inherits from class List and adds one data member:

        int numPixels;            // the number of pixels in the entire list

See system.h for some global vars, typedefs, and enumerated types.

Header files

 system.h      Global typedefs, enumerated types and include files.
               Also declarations of global variables.
 Page.h        Primary class - access to OCR functions
 Zone.h        Zone representation
 BitMap.h      BitMap representation
 RLEMap.h      RLEMap representation
 RLEPair.h     Starting and ending point of a range of black pixels.
               Also contains RLEPairs.
 Component.h   Component. Each component represents a character area on the page.
               Also contains Components.
 Word.h        A collection of components and their ascii translation.
 LineMarker.h  LineMarkers mark the start/end row for each line of text.
 list.h        The list class we are using
 learn.h       Function headers for learning. Learn functions are in learn.cc