Computer Science 252. Graduate Computer Architecture. (4 units)
A binhex'd self-extracting archive of this semester's PowerPoint slides can be found here. (3.2 MBytes)
Three hours of lecture and one hour of discussion per week.
Prerequisites: CS 152.
Graduate survey of contemporary computer organizations covering: early systems, CPU
design, instruction sets, control, processors, busses, ALU, memory, pipelined computers,
multiprocessors, and case studies. Term paper or project required.
This course focuses on the techniques of quantitative analysis and evaluation of modern
computing systems, such as the selection of appropriate benchmarks to reveal and compare
the performance of alternative design choices. The emphasis is on the
major component subsystems of high performance computers: pipelining, instruction level
parallelism, memory hierarchies, input/output, and network-oriented interconnections.
Students will undertake a major computing system analysis and design
project of their own choosing.
Homeworks (two person teams): 30%
Exams (two in-class midterms): 30%
Project (two person teams): 30%
Class Participation: 10%
Instructors, Spring 1996
Lecturer: Randy H. Katz, Professor
- 637 Soda Hall, 642-8778, randy@cs.Berkeley.edu
- Office Hours W 2 - 3 PM, Th 11 AM - noon.
Teaching Assistant: Daniel Jiang
Lecture: MWF 3 - 4 PM, 241 Cory Hall
Discussion: MW 10 - 11 AM, 320 Soda Hall
J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative
Approach, 2nd Edition, Morgan Kaufmann Publishing Co., Menlo
Enrolled Student Snapshots
Course Projects From Last Semester
An Evaluation of Correlation Based Branch Prediction on the
Alpha Architecture, Alok Agrawal and Michael Chu
Power and Performance Tradeoffs in Microprocessor
Cache Design, Jennie Chen and Bruce McGaughy
Optimizing the QR Eigensolver for the IBM Power2
Architecture, Tzu-yi Chen and Andrey Zege
Benchmarks for Graphics/Video Applications, Stephen Chenney and Alok Mittal
Latency Hiding in Uniprocessors using Multithreading, Brent Chun and Franklin Cho
Branching on Superscalar Machines: Speculative
Execution of Multiple Branches, Richard Fromm and Bassam Tabbara
Accelerating the RISC Processor Using Programmable
Logic, Sriram Rajamani and Pramod Viswanath
Reducing Power Consumption for the Next
Generation of PDAs: It's the Network Interface!,
Mark Stemm, Paul Gauthier, Daishi Harada
Hardware/Software Architectures for TCP/IP Acceleration of UNIX Workstations,
Roy Sutton and Sameer Jalnapurkar
A Comparative Analysis of Branch Prediction Schemes, Zhendong Su and Min Zhou
Slow Fourier Transforms on Fast Microprocessors, Taku Tokuyasu and Bernt Pfromm
Compressed Reduced Instruction Set COmputing (CRISCO), Patrick Warner and Geroncio Galicia
Instruction Level Power Analysis of the ARM60, Anna Reznik
Course Projects from this Semester
Aaron Antonowich, Addison Chen, Branch Prediction Generator.
Yatin Chawathe, Amar Chaudhary, Tina Wong,
Java Architecture Study and Improvement.
Sanjoy Dasgupta, Edouard Servan-Schreiber,
An Updated Analysis of Cache Behavior.
Naji Ghazal, Wing Leung,
Characterization of the "Network Terminal" (the Hollow Client) in a Distributed CAD Environment.
Steve Gribble, Marcel Kornacker,
Memory Hierarchies for Real-World Applications.
Rajeev Ranjan, Shaz Qadeer, Amit Mehrotra,
Benchmarking Architectures for CAD.
Angie Schuett, Marylou Orayani,
Durability and Reliability Issues for Magnetic Tape and Their Impact on Digital Library Workloads.
Andrew Swan, Randi Thomas, Dave Simpson,
Architectural Influences on DCT Based Video Decoders.
Marlene Wan, Varghese George,
Dependency of Cache Performance on 1st and 2nd Level Cache Designs.
Victim Caching for Large Caches and Modern Workloads.
Tao Ye, Steve Fink, Computing on Wheels.
David Wagner, Ian Goldberg,
Performance Issues When Implementing Cryptography on FPGAs.
Instruction Scheduling for a Parameterized VLIW Machine.
Tentative Course Lecture Plan
Week 1 (17 January - 19 January)*
Week 2 (22 January - 26 January)
- Lecture 3: Amdahl's Law, Rules of Thumb, Performance and CPI Formula;
- Lecture 4: Benchmarks and Performance Metrics;
- Lecture 5: Cost, Price, and Price for Performance;
Week 3 (29 January - 2 February)
Special class meeting times:
29 January (Lecture 6) @ 5:30 - 6:30 PM in 405 Soda
2 February (Lectures 8, 9) @ 2 - 4 PM in 241 Cory
- Lecture 6: Instruction Set Architecture and the 80x86;
- Lecture 7: Introduction to Pipelining, Structural Hazards, Forwarding;
- Lecture 8: Pipelining, Control Hazards, Static Branch Prediction, Interrupts, MIPS
Week 4 (5 February - 7 February)
No class on 5, 7 February; Randy on the East Coast.
Special double lecture on 9 February (Lectures 10, 11), 2 - 4 PM in 241 Cory.
- Lecture 9: Instruction Level Parallelism and Advanced Pipelining (Scoreboards);
- Lecture 10: Case Study: CDC 6600 Scoreboard;
- Lecture 11: Case Study: Tomasulo Algorithm;
Week 5 (12 February - 16 February)
- Lecture 12: Dynamic Branch Prediction, Superscalar, VLIW, Software Pipelining;
- Lecture 13: Midterm Review and Catch-Up
- FIRST MIDTERM EXAMINATION
Week 6 (21 February - 23 February*)
Oops! 19 February is a holiday!
- Lecture 14: Trace Scheduling, Conditional Execution, Speculation, Limits of ILP;
- Lecture 15: Memory Hierarchy, Motivation, Definitions, Four Questions about
Week 7 (26 February - 1 March)
- Lecture 16: Memory Hierarchy Misses, 3 Cs and 7 Ways to Reduce Misses;
- Lecture 17: Memory Hierarchy: 5 Ways to Reduce Miss Penalty (including Second Level Cache);
- Lecture 18: Memory Hierarchy: Main Memory and Enhancing its Performance;
Week 8 (4 March - 8 March)
- Lecture 19: DRAM-Specific Memory Organizations, Virtual Memory, Alpha 21064
Memory Hierarchy and Performance, Fallacies;
- Lecture 20: I/O, Storage Devices, Metrics, and Productivity;
- Lecture 21: I/O, A Little Queuing Theory, and I/O Interfaces;
Week 9 (11 March - 15 March)
Week 10 (18 March - 22 March)
- Lecture 25: I/O, UNIX File System Performance on Workstations and Mainframes;
- Lecture 26: I/O, Tertiary Storage Systems (not covered);
- Lecture 27: Networks, Intro, Implementation and Performance Issues (not covered);
Week 11 (25 March - 29 March)
- Lecture 28: Networks, Architectural Issues including Topologies, Practical Issues and Protocols (not covered);
- Lecture 29: Multiprocessors, Flynn Categories, Small vs. Large Scales, Cache Coherency;
- Lecture 30: Multiprocessors, Snoopy Caches and Directory Schemes;
Week 12 (8 April - 12 April)
- Lecture 31: Multiprocessors, Performance of Snoopy Caches vs. Directories;
- Lecture 32: Multiprocessors, Synchronization and Consistency Models;
- Lecture 33: Project Status Reports.
Week 13 (15 April - 19 April)
- Lecture 35: Portable Computers, Technology Underpinnings;
postscript, pdf (Draft).
- Lecture 36: Midterm Review and Catch-up
- SECOND MIDTERM EXAM
Week 14 (22 April - 26 April)
- Lecture 37: Portable Computers, Low Power Processor Issues;
postscript, pdf (Draft).
- Lecture 38: Portable Computers, Processor Case Studies (AT&T Hobbit, Acorn RISC Machine, Low-Power PowerPC);
postscript, pdf (Draft).
- Lecture 39: Special Topics;
Week 15 (29 April - 3 May)
- Lecture 40: Special Topics;
- Lecture 41: The Future According to Katz;
- Lecture 42: Course Summary and Review;
Week 16 (6 May)
Handout 0: Background Questionnaire
Handout 1: First Problem Set, due 2 February 96.
Handout 2: Second Problem Set, due 9 February 96.
Handout 3: First Project Survey, due 21 February 96.
Handout 4: Third Problem Set, due 1 March 96.
Handout 5: Second Project Survey, due 6 March 96.
Handout 6: Fourth Problem Set, due 22 March 96.
Handout 7: First Project Checkpoint, due 18 March 96.
Handout 8: Second Project Checkpoint, due 8 April 96.
Handout 9: Project Report Write-Up Specification, due 8 May 96.
Other Useful Links
Tom Burd's CPU Central
Susan Eggers's Course at the University of Washington
Hennessy & Patterson's WWW Computer Architecture Home Page
Randy H. Katz, ed., randy@cs.Berkeley.edu, Last Edited: 22 April 96