Connected Components on Distributed Memory Machines

Arvind Krishnamurthy, Steve Lumetta, David Culler,
and Katherine Yelick

In this paper, we describe an implementation of the connected
components algorithm on a distributed memory machine.  A direct
implementation of the PRAM algorithm results in an inefficient
implementation due to the huge number of remote accesses
generated by the algorithm.  Instead, we use a hybrid algorithm
that invokes the sequential algorithm as a local preprocessing
phase before entering a global phase, which is a modified
version of the PRAM algorithm.  We use the Split-C language,
which provides the abstraction of a global address space, for
building the distributed graph data structure.  We obtain
speedups in the order of 20 on a 32 processor CM5 for certain
kinds of graphs.