Simple Example
Component performs an operation in 100 ns
Simple bandwidth: 1 op / 100 ns = 10 Mops
Internally pipelined with depth 10 => bandwidth 100 Mops
- Rate determined by slowest stage of pipeline, not overall latency
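A minimal sketch of the arithmetic above, assuming the 100 ns operation is split into 10 equal stages of 10 ns each (the slide gives only the depth, so the equal split is an assumption). The helper name `pipelined_rate_mops` is hypothetical.

```python
def pipelined_rate_mops(stage_latencies_ns):
    """Peak issue rate in Mops: one op completes per slowest-stage time."""
    # 1 op per N ns equals 1e3 / N million ops per second.
    return 1e3 / max(stage_latencies_ns)

# Assumption: 10 equal stages of 10 ns, totalling the 100 ns latency.
stages_ns = [10.0] * 10
total_latency_ns = sum(stages_ns)

unpipelined_mops = 1e3 / total_latency_ns        # one op at a time
pipelined_mops = pipelined_rate_mops(stages_ns)  # limited by slowest stage

print(unpipelined_mops, pipelined_mops)
```

If the stages were unbalanced, say one stage took 20 ns, the same function would report 50 Mops even though total latency is unchanged.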
Delivered bandwidth to the application depends on initiation frequency
Suppose the application performs 100 M operations. What is the cost?
- op count * op latency gives 10 sec (upper bound)
- op count / peak op rate gives 1 sec (lower bound)
- lower bound assumes full overlap of latency with useful work, so only the issue cost (10 ns per op) remains
- if the application can do 50 ns of useful work before depending on the result of an op, the cost to the application is the remaining 50 ns of latency
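The bounds above can be sketched as a small cost model. This is an illustrative assumption, not a model stated on the slide: each op costs the application the larger of its issue time and whatever latency is not hidden by useful work. The function name and parameters are hypothetical.

```python
def app_cost_seconds(op_count, latency_ns, issue_ns, overlap_ns):
    """Estimated total cost, assuming per-op cost = max(issue, exposed latency)."""
    exposed_ns = max(latency_ns - overlap_ns, 0)  # latency not hidden by useful work
    per_op_ns = max(issue_ns, exposed_ns)         # the issue cost is always paid
    return op_count * per_op_ns / 1e9

OPS = 100_000_000   # 100 M operations
LATENCY_NS = 100    # full operation latency
ISSUE_NS = 10       # 1 / (100 Mops peak rate)

upper = app_cost_seconds(OPS, LATENCY_NS, ISSUE_NS, overlap_ns=0)     # no overlap
lower = app_cost_seconds(OPS, LATENCY_NS, ISSUE_NS, overlap_ns=100)   # full overlap
partial = app_cost_seconds(OPS, LATENCY_NS, ISSUE_NS, overlap_ns=50)  # 50 ns hidden

print(upper, lower, partial)
```

Under this model the three cases come to 10 sec, 1 sec, and 5 sec: the first two match the upper and lower bounds, and the third is the 50 ns of exposed latency per op times 100 M ops.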