Hardware/Software Performance Tradeoffs (plus Msg Passing Finish)

2/3/99



Table of Contents

Hardware/Software Performance Tradeoffs (plus Msg Passing Finish)

SAS Recap

Message Passing Grid Solver

Data Layout and Orchestration

Notes on Message Passing Program

Send and Receive Alternatives

Orchestration: Summary

Correctness in Grid Solver Program

Performance Goal => Speedup

Analysis Framework

Load Balance and Synchronization

Improving Load Balance

Example: Barnes-Hut

Dynamic Scheduling with Task Queues

Impact of Dynamic Assignment

Self-Scheduling

Reducing Serialization

Impact of Efforts to Balance Load

Arch. Implications of Load Balance

Reducing Extra Work

Reducing Inherent Communication

Domain Decomposition

Domain Decomposition (contd)

Relation to load balance

Implications of Comm-to-Comp Ratio

Structuring Communication

Reducing Overhead

Reducing Network Delay

Reducing Contention

Types of Contention

Overlapping Communication

Communication Scaling (NPB2)

Communication Scaling: Volume

What is a Multiprocessor?

Memory-oriented View

Uniprocessor

Extended Hierarchy

Artifactual Communication

Communication and Replication

Working Set Perspective

Orchestration for Performance

Reducing Artifactual Communication

Exploiting Temporal Locality

Exploiting Spatial Locality

Spatial Locality Example

Architectural Implications of Locality

Tradeoffs with Inherent Communication

Example Performance Impact

Working Sets Change with P

Where the Time Goes: LU-a

Summary of Tradeoffs

Author: David E. Culler

Home Page: http://www.cs.berkeley.edu/~culler/cs258-s99/

Other information:
David E. Culler, UC Berkeley, CS258 Parallel Computer Architecture, Lecture 5