Simple Example

Internal pipeline depth 10 => bandwidth 100 Mops (one op completes every 10 ns)
- Rate determined by slowest stage of pipeline, not overall latency
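A minimal sketch of the point above, assuming the 100 ns operation is split into 10 stages. The function name `rate_mops` and the unbalanced stage times are made up for illustration; only the 100 ns latency, depth 10, and 100 Mops figures come from the example.

```python
def rate_mops(stage_ns):
    """Peak rate in Mops for a pipeline with the given per-stage times (ns).

    One op completes per slowest-stage time, so rate = 1 / max(stage).
    1 op/ns equals 1000 Mops, hence the 1e3 factor.
    """
    return 1e3 / max(stage_ns)

# Balanced pipeline: 10 stages of 10 ns each, total latency 100 ns.
balanced = [10] * 10
# Hypothetical unbalanced pipeline with the same 100 ns total latency.
unbalanced = [5] * 9 + [55]

print(sum(balanced), rate_mops(balanced))      # same latency ...
print(sum(unbalanced), rate_mops(unbalanced))  # ... but a lower rate
```

Both pipelines have the same overall latency, yet the unbalanced one is limited by its 55 ns stage.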

Suppose application performs 100 M operations. What is cost?
- op count * op latency = 100 M * 100 ns = 10 sec (upper bound)
- op count / peak op rate = 100 M / 100 Mops = 1 sec (lower bound)
  - assumes full overlap of latency with useful work, so just issue cost
- if application can do 50 ns of useful work before depending on result of an op, cost to application is the other 50 ns of latency per op
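The bounds above can be checked with a few lines of arithmetic. This is a sketch, not part of the original example: the helper `cost_seconds` is hypothetical, and the 5 sec figure for the partial-overlap case assumes every op exposes the same 50 ns and that those exposed latencies do not overlap with each other.

```python
OP_COUNT = 100_000_000   # 100 M operations
OP_LATENCY_NS = 100      # latency of one operation
PEAK_RATE_MOPS = 100     # peak rate, i.e. 100 ops per microsecond

def cost_seconds(op_count, exposed_ns_per_op):
    """Total cost when each op exposes the given time (ns) to the application."""
    return op_count * exposed_ns_per_op / 1e9

# Upper bound: no overlap, the full latency is paid on every op.
upper = cost_seconds(OP_COUNT, OP_LATENCY_NS)

# Lower bound: full overlap, only the issue cost (1 / peak rate) is paid.
issue_ns = 1e3 / PEAK_RATE_MOPS
lower = cost_seconds(OP_COUNT, issue_ns)

# Partial overlap (assumption: 50 ns of useful work hides half the latency).
partial = cost_seconds(OP_COUNT, OP_LATENCY_NS - 50)

print(upper, lower, partial)
```

The actual cost of a real application falls somewhere between `lower` and `upper`, depending on how much latency it can hide.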

Component performs an operation in 100 ns