Effects of Communication Latency, Overhead, and Bandwidth in a Cluster Architecture

David E. Culler
Computer Science Division
University of California, Berkeley

This talk describes a systematic study of how communication performance affects parallel applications on a high-performance cluster of workstations. We have developed an experimental system in which communication latency, overhead, and bandwidth can be varied independently, so that their effects can be observed on applications spanning a wide range of architectural requirements. Our results indicate that efforts to bring cluster communication performance in line with that of more tightly integrated parallel machines have significantly improved application performance. Applications are strongly sensitive to overhead and message bandwidth, slowing down by up to a factor of 30 on sixteen processors when overhead is increased by 100 $\mu$s. Surprisingly, many of our benchmark applications tolerate increased latency and lower bulk message bandwidth. Finally, applications show a linear dependence on both overhead and gap (the minimum interval between consecutive message transmissions, i.e. the reciprocal of message bandwidth), indicating that further improvements in communication performance will continue to improve application performance.
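
The linear dependence on overhead described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each message pays the added overhead a fixed number of times (defaulting to twice: once at the sender and once at the receiver) and that none of this cost is overlapped with computation. The function names, the default factor, and the sample numbers are hypothetical and are not taken from the talk.

```python
def predicted_runtime(base_runtime_s, messages_per_proc, added_overhead_us,
                      overhead_events_per_message=2):
    """Predict runtime under a simple linear overhead model.

    Assumption (not from the talk): every message incurs the added overhead
    `overhead_events_per_message` times, and none of it is hidden by overlap.
    """
    added_overhead_s = added_overhead_us * 1e-6  # microseconds to seconds
    extra_s = overhead_events_per_message * messages_per_proc * added_overhead_s
    return base_runtime_s + extra_s


def slowdown(base_runtime_s, messages_per_proc, added_overhead_us,
             overhead_events_per_message=2):
    """Ratio of predicted runtime to the baseline runtime."""
    predicted = predicted_runtime(base_runtime_s, messages_per_proc,
                                  added_overhead_us, overhead_events_per_message)
    return predicted / base_runtime_s


if __name__ == "__main__":
    # Hypothetical workload: 1 s baseline, 100,000 messages on the busiest processor.
    for delta_us in (0, 25, 50, 100):
        print(delta_us, round(slowdown(1.0, 100_000, delta_us), 2))
```

Under these assumptions the predicted slowdown grows in a straight line with the added overhead, and its slope is set by how many messages the busiest processor handles, which is one way to read why communication-intensive applications are the most sensitive.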