Implementing Atomic Operations

In cache or in memory?
- cacheable
  - better latency and bandwidth on self-reacquisition
  - allows spinning in cache without generating traffic while waiting
- at-memory
  - lower transfer time
  - used to be implemented with a “locked” read-write pair of bus transactions
  - not viable with modern, pipelined buses
- usually traffic and latency considerations dominate, so use cacheable
  - what is the implementation strategy?
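
To make "spinning in cache" concrete, here is a minimal sketch of a test-and-test-and-set spinlock built on a cacheable atomic. It is one illustrative possibility, not necessarily the strategy this lecture goes on to present. The names `TTASLock` and `run_counter_demo` are hypothetical, and the memory orderings are an assumption about what a reasonable lock would use.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical test-and-test-and-set spinlock (illustrative names only).
class TTASLock {
    std::atomic<bool> locked{false};
public:
    void lock() {
        for (;;) {
            // Atomic read-modify-write: requires exclusive ownership of the
            // cache line, so it generates coherence traffic.
            if (!locked.exchange(true, std::memory_order_acquire)) return;
            // Wait on plain loads: these can hit in the local cache until
            // the holder's releasing store invalidates the line.
            while (locked.load(std::memory_order_relaxed)) { }
        }
    }
    void unlock() { locked.store(false, std::memory_order_release); }
};

// Each thread increments a shared counter `iters` times under the lock.
long run_counter_demo(int num_threads, int iters) {
    TTASLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;  // protected by the lock
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

If the lock provides mutual exclusion, `run_counter_demo(4, 10000)` should return 40000. A thread that releases and immediately re-acquires the lock typically still holds the line in its cache, which is the self-reacquisition benefit listed above.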
