Cache Block Size
Trade-offs in uniprocessors with increasing block size
- reduced cold misses (due to spatial locality)
- increased transfer time
- increased conflict misses (fewer sets for a fixed cache size)
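The "fewer sets" point is simple arithmetic: for a fixed capacity and associativity, the number of sets is capacity / (block size x associativity). A minimal sketch follows; the 32 KiB, 2-way cache and the name num_sets are illustrative assumptions, not taken from the notes.

```python
def num_sets(cache_bytes: int, block_bytes: int, ways: int) -> int:
    """Number of sets in a set-associative cache of fixed capacity."""
    return cache_bytes // (block_bytes * ways)


# Hypothetical cache parameters, chosen only for illustration.
CACHE_BYTES = 32 * 1024
WAYS = 2

if __name__ == "__main__":
    # Each doubling of the block size halves the number of sets,
    # so more addresses compete for each set.
    for block_bytes in (16, 32, 64, 128):
        print(block_bytes, num_sets(CACHE_BYTES, block_bytes, WAYS))
```

With these assumed parameters, going from 16-byte to 128-byte blocks cuts the set count from 1024 to 128.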
Additional concerns in multiprocessors
- parallel programs have less spatial locality
- parallel programs have sharing
- false sharing
- bus contention
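One way to see why larger blocks make false sharing more likely is to check which block each processor's private data falls into. The sketch below assumes a hypothetical layout of 8-byte per-processor counters packed contiguously from address 0; block_of and same_block are illustrative helper names.

```python
def block_of(addr: int, block_bytes: int) -> int:
    """Index of the cache block that contains byte address addr."""
    return addr // block_bytes


def same_block(addr_a: int, addr_b: int, block_bytes: int) -> bool:
    """True if the two addresses map to the same cache block."""
    return block_of(addr_a, block_bytes) == block_of(addr_b, block_bytes)


# Assumed layout: counter i lives at address i * COUNTER_BYTES.
COUNTER_BYTES = 8
ADDR_P0 = 0 * COUNTER_BYTES  # counter written only by processor 0
ADDR_P3 = 3 * COUNTER_BYTES  # counter written only by processor 3

if __name__ == "__main__":
    for block_bytes in (16, 32, 64):
        print(block_bytes, same_block(ADDR_P0, ADDR_P3, block_bytes))
```

Under this layout the two counters sit in different blocks at 16 bytes but share a block at 32 bytes and above, so writes by one processor invalidate the other's copy even though the two never touch the same data.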
Need to classify misses to understand impact
- cold misses
- capacity / conflict misses
- true sharing misses
  - one processor writes a word in a block, invalidating the copy in another processor’s cache; that other processor later reads the word that was written
- false sharing misses
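The classification above can be sketched as a small trace-driven classifier. This is a simplified model under stated assumptions, not a definitive method: caches are infinite (so capacity and conflict misses never occur), the protocol is write-invalidate, words are 4 bytes, and a sharing miss is labelled by checking only whether the missing word was written by another processor since the block was lost. The class and function names are hypothetical.

```python
from collections import defaultdict

WORD_BYTES = 4  # assumed word size


class SharingMissClassifier:
    """Labels each access as hit, cold, true sharing, or false sharing."""

    def __init__(self, block_bytes: int):
        self.block_bytes = block_bytes
        # proc -> set of blocks currently valid in that processor's cache
        self.valid = defaultdict(set)
        # proc -> {block: words written by others since proc last fetched it}
        # A block appears here once proc has fetched it at least once.
        self.stale = defaultdict(dict)

    def access(self, proc: int, addr: int, is_write: bool) -> str:
        block = addr // self.block_bytes
        word = addr // WORD_BYTES
        if block in self.valid[proc]:
            kind = "hit"
        elif block not in self.stale[proc]:
            kind = "cold"  # first reference by this processor
        elif word in self.stale[proc][block]:
            kind = "true sharing"  # the missing word itself was written
        else:
            kind = "false sharing"  # only other words in the block were written
        # Fetch the block and forget the old invalidation history.
        self.valid[proc].add(block)
        self.stale[proc][block] = set()
        if is_write:
            # Write-invalidate: every other cached copy is invalidated, and
            # the written word is recorded for each processor that lost it.
            for other in self.stale:
                if other != proc and block in self.stale[other]:
                    self.valid[other].discard(block)
                    self.stale[other][block].add(word)
        return kind


def classify_trace(trace, block_bytes: int):
    """trace is a list of (proc, addr, is_write) tuples."""
    classifier = SharingMissClassifier(block_bytes)
    return [classifier.access(p, a, w) for (p, a, w) in trace]


# Hypothetical trace: P0 and P1 touch adjacent words at addresses 0 and 4.
TRACE = [
    (0, 0, False),  # P0 reads word 0
    (1, 4, False),  # P1 reads word 1
    (0, 0, True),   # P0 writes word 0
    (1, 4, False),  # P1 reads word 1 again
    (0, 4, True),   # P0 writes word 1
    (1, 4, False),  # P1 reads word 1 again
]

if __name__ == "__main__":
    for block_bytes in (4, 16):
        print(block_bytes, classify_trace(TRACE, block_bytes))
```

With 16-byte blocks the fourth access is a false sharing miss, because P0 wrote a different word in the same block. With 4-byte (one-word) blocks the same access is a hit, and only the final read, of a word P0 actually wrote, misses due to sharing.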