U.C. Berkeley CS267/EngC233 Home Page
Applications of Parallel Computers
Spring 2010
T Th 9:30-11:00, 250 Sutardja Dai Hall
Instructors:
Jim Demmel
Teaching Assistants:
Razvan Carbunescu
Office: ParLab, 5th floor, Soda Hall, cell: (225) 747-0405
Office Hours: W 3:30 - 5pm and Th 1:30 - 3pm, 576 Soda (Euclid, in the ParLab)
(send email)
Andrew Gearhart
Office: ParLab, 5th floor, Soda Hall, cell: (410) 259-1410
Office Hours: W 4 - 5pm and Th 1:30 - 3pm, 576 Soda (Euclid, in the ParLab)
(send email)
Administrative Assistants:
Tammy Johnson
Office: 565 Soda Hall
Phone: (510)643-4816
(send email)
Link to webcasting of lectures
(Active during lectures only; see below under "Lecture Notes" for archived video).
(Jan 19) Due to technical difficulties, we will not be webcasting today.
We will try to fix it next time.
To ask questions during live lectures, please email them to
this address,
which the teaching assistants will be monitoring during lecture.
Syllabus and Motivation
CS267 was originally designed to teach students how to
program parallel computers to efficiently solve
challenging problems in science and engineering,
where very fast computers are required
either to perform complex simulations or to analyze enormous datasets.
CS267 is intended to be useful for students from many departments
and with different backgrounds, although we will assume reasonable
programming skills in a conventional (non-parallel) language,
as well as enough mathematical skills to understand the
problems and algorithmic solutions presented.
CS267 satisfies part of the course requirements for a new
Designated Emphasis ("graduate minor") in
Computational Science and Engineering.
While this general outline remains, a major change
in the computing world has begun in the last few years:
not only are the fastest computers
parallel, but nearly all computers will soon be parallel,
because the physics of semiconductor manufacturing
will no longer let conventional sequential processors
get faster year after year, as they have for so long
(roughly doubling in speed every 18 months for
many years). So all programs that need to
run faster will have to become parallel programs.
(It is considered very unlikely that compilers will be
able to automatically find enough parallelism in most
sequential programs to solve this problem.)
For background on this trend toward parallelism, click
here.
This will be a huge change not just for science
and engineering but for the entire computing industry,
which has depended on selling new computers by running
their users' programs faster without the users
having to reprogram them. Large research activities
to address this issue are underway at many computer
companies and universities, including
Berkeley's ParLab,
whose research agenda is outlined
here.
While the ultimate solutions to the parallel programming
problem are far from determined, students in CS267 will
get the skills to use some of the best existing parallel programming
tools, and be exposed to a number of open research questions.
Tentative Detailed Syllabus
Grading
There will be several programming assignments to acquaint students
with basic issues in memory locality and parallelism needed for
high performance. Most of the grade will be based on a final project
(in which students are encouraged to work in small interdisciplinary teams),
which could involve parallelizing an interesting application, or
developing or evaluating a novel parallel computing tool. Students
are expected to have identified a likely project by mid-semester,
so that they can begin working on it. We will provide many suggestions
of possible projects as the class proceeds.
Homeworks should be submitted by emailing them to
cs267.spring2010.submissions@gmail.com.
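The early assignments deal with memory locality (see, for example, Lecture 2 on tuning matrix multiply). As a rough illustration only, and not code from any actual assignment, the sketch below shows loop blocking (tiling) of matrix multiply in C; the function name, the BLOCK size, and the row-major layout are arbitrary choices for this example.

    /* Illustrative sketch only: loop blocking ("tiling") to improve cache reuse,
       the kind of memory-locality issue the early assignments explore.
       BLOCK, the function name, and the row-major layout are arbitrary choices. */
    #include <stddef.h>

    #define BLOCK 64   /* tile size; tune to the cache of the target machine */

    /* C := C + A*B for n-by-n matrices stored in row-major order. */
    void square_dgemm_blocked(size_t n, const double *A, const double *B, double *C)
    {
        for (size_t ii = 0; ii < n; ii += BLOCK)
            for (size_t jj = 0; jj < n; jj += BLOCK)
                for (size_t kk = 0; kk < n; kk += BLOCK)
                    /* multiply one tile; with a good BLOCK the three tiles stay in cache */
                    for (size_t i = ii; i < ii + BLOCK && i < n; i++)
                        for (size_t j = jj; j < jj + BLOCK && j < n; j++) {
                            double cij = C[i*n + j];
                            for (size_t k = kk; k < kk + BLOCK && k < n; k++)
                                cij += A[i*n + k] * B[k*n + j];
                            C[i*n + j] = cij;
                        }
    }

Choosing BLOCK so that three BLOCK-by-BLOCK tiles fit in fast memory is the usual starting point; the lectures cover how to reason about this more carefully.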
Class Projects
You are welcome to suggest your own class project, but you may also look at
the ParLab webpage for ideas, the
Computational Research Division and
NERSC webpages at
LBL, or at the
class posters and their
brief oral presentations from
CS267 in Spring 2009.
Announcements
(May 5) The poster session will be in the main hallway of the 5th floor of Soda Hall.
(Apr 17) The poster session will be on Thursday May 6 instead of Tuesday May 4. Final reports will be due Wednesday, May 12, by noon.
(Mar 9) Material at the end of Lecture 14 was updated to discuss
some possible class projects.
(Mar 7) Starting March 10, Prof. Demmel's Wednesday office hours
will be 1-2pm instead of 2-3pm.
(Feb 5) The topics of the lectures scheduled for Feb 11 and 16 have
been swapped; see the syllabus for details.
(Jan 28) There are two seminars of interest to CS267 students today:
At 11am in the Wozniak Lounge, Soda Hall, Laurent Visconti of Microsoft will
talk about "New abstractions for parallel linear algebra libraries."
At 4pm in 306 Soda Hall, Phil Colella of LBL will talk about
"Models, Algorithms, and Software: Tradeoffs in the Design of High-Performance Computational Simulations in Science and Engineering."
(Jan 28) Andrew Gearhart and Razvan Carbunescu will not have office hours today
(Thursday, Jan 28).
(Jan 27) Andrew Gearhart has changed his office hours (see above).
(Jan 26) Prof. Demmel has to cancel his office hours on Thursday, Jan 28, 1:30-2:30pm.
(Jan 21) Note the change in Prof. Demmel's office hours.
(Jan 19) Due to technical difficulties, there will be no webcasting today.
We will try to fix this next time.
(Jan 17) Homework Assignment 0 has been posted
here, due Feb 2
by midnight.
(Jan 17) Homeworks should be submitted by emailing them to
cs267.spring2010.submissions@gmail.com.
(Jan 17) Please fill out the following
class survey.
(Jan 17) This course satisfies part of the course requirements
for a new Designated Emphasis ("graduate minor") in
Computational Science and Engineering.
(Jan 17) NERSC will host a
workshop on programming their new supercomputer, the Cray XT5, from Feb 1-3.
Students interested in attending should send email to
Richard Gerber and say that they are
CS267 students. This workshop is suitable for more experienced students.
(Jan 17) For students who want to try some on-line self-paced
courses to improve basic programming skills, click
here.
You can use this material without having to register.
In particular, courses like CS 9C (for programming in C) might
be useful.
(Jan 17) This course will have students attending from two
CITRIS campuses: UC Berkeley and
UC Davis.
CITRIS is generously providing the webcasting facilities and other
resources to help run the course.
Lectures will be webcast
here
(active during lectures only).
Class Resources and Homework Assignments
This page will include, among other things,
class handouts, homework assignments,
the class roster, information about class accounts, pointers
to documentation for machines and software tools we will use,
reports and books on supercomputing,
pointers to old CS267 class webpages (including old class projects),
and pointers to other useful websites.
Lecture Notes
Notes from previous offerings of CS267 are posted on old
class webpages available under
Class Resources.
In particular, the web page from the
1996 offering
has detailed, textbook-style notes available on-line that are still
largely up-to-date in their presentations of parallel algorithms
(the slides to be posted during this semester will contain some more
recently invented algorithms as well).
Lecture materials (PowerPoint slides and archived video) from Spring 2010
will be posted here.
Lecture 1 - Jan 19 - Introduction
(in powerpoint),
(not webcast)
Lecture 2 - Jan 21 - Single Processor Machines: Memory Hierarchies
and Processor Features; Case Study: Tuning Matrix Multiply
(in powerpoint),
(video archive)
Lecture 3 - Jan 26 - Introduction to Parallel Machines
and Programming Models
(in powerpoint),
(video archive)
Lecture 4 - Jan 28 - Finish Parallel Machines
and Programming Models; Shared Memory Programming
with Threads and OpenMP
(in powerpoint),
(video archive)
Lecture 5 - Feb 2 - Distributed memory machines and programming
(in powerpoint),
(video archive)
Lecture 6 - Feb 4 - Sources of Parallelism and Locality in Simulation - Part 1
(in powerpoint),
(video archive)
Lecture 7 - Feb 9
(video archive)
Sources of Parallelism and Locality in Simulation - Part 2
(in powerpoint),
Tricks with Trees
(in powerpoint),
Notes on Homework 1
(in powerpoint),
Lecture 8 - Feb 11 - Graph Partitioning
(in powerpoint),
(video archive)
Lecture 9 - Feb 16 -
(video archive)
Complete Graph Partitioning (same slides as last lecture)
Real-time Knowledge Extraction from Massive Time-Series Datastreams,
by Josh Bloom,
(in pdf),
Lecture 10 - Feb 18 - An Introduction to CUDA/OpenCL and Manycore Graphics Processors,
by Bryan Catanzaro,
(in powerpoint-x),
(video archive)
Lecture 11 - Feb 23 - Architecting Parallel Software with Patterns,
by Kurt Keutzer,
(in powerpoint-x),
(video archive)
Lecture 12 - Feb 25 - Parallel Programming in UPC (Unified Parallel C)
by Kathy Yelick,
(in powerpoint),
(video archive)
Lecture 13 - Mar 2 - Dense Linear Algebra, Part 1
(in powerpoint),
(video archive)
Lecture 14 - Mar 4 - Dense Linear Algebra, Part 2
(in powerpoint)
(updated March 9),
(video archive)
Lecture 15 - Mar 9 - Automatic Performance Tuning and
Sparse-Matrix-Vector-Multiplication (SpMV)
(in powerpoint)
(We will also discuss class projects using slides at the end of
Lecture 14.)
(video archive)
Lecture 16 - Mar 11 - Evolution of Processor Architecture, and
the Implications for Performance Optimization
(in powerpoint-x),
by Sam Williams,
(video archive)
Lecture 17 - Mar 16 - Sparse Matrix Methods on High Performance Computers
(in powerpoint),
by Xiaoye Sherry Li,
(video archive)
Lecture 18 - Mar 18 - Structured Grids
(in powerpoint),
(video archive)
Lecture 19 - Mar 30 - Performance Analysis Tools
by Karl Fuerlinger,
(in powerpoint),
(video archive)
Lecture 20 - Apr 1 - Fast Fourier Transform
(in powerpoint),
(video archive)
See also talk on Spiral
Lecture 21 - Apr 6 - Future Trends in Computing
(in powerpoint),
(video archive)
Lecture 22 - Apr 8 - Cloud Computing
by Matei Zaharia,
(in powerpoint),
(video archive)
Lecture 23 - Apr 13
(video archive)
Dynamic Load Balancing
(in powerpoint),
Parallel Sorting
(in powerpoint),
Lecture 24 - Apr 15 - Parallel Graph Algorithms
by Kamesh Madduri,
(in powerpoint),
(video archive)
Lecture 25 - Apr 20 - Big Bang, Big Iron: High Performance Computing
and the Cosmic Microwave Background,
by Julian Borrill,
(in powerpoint),
(video archive)
Lecture 26 - Apr 22 - Parallelism in Music and Audio Applications,
by David Wessel
(video archive)
"Advances in the Parallelization of Music and Audio Applications,"
(in ppt)
"Advances in the Parallelization of Music and Audio Applications,"
(in pdf, conference paper)
Lecture 27 - Apr 27 - Simulating the Brain, by
Rajagopal Ananthanarayanan
(video archive)
"Anatomy of a Cortical Simulator"
(in pdf)
Supercomputing 07, by R. Ananthanarayanan and D. Modha
"The cat is out of the bag: Cortical Simulations with 10^9 Neurons, 10^{13} Synapses",
(in pdf)
Supercomputing 09, by R. Ananthanarayanan, S. Esser, H. Simon and D. Modha
(winner, Gordon Bell Prize, 2009)
Lecture 28 - Apr 29 - Frameworks in Complex Multiphysics HPC Applications
(in powerpoint)
(video archive)
Sharks and Fish
"Sharks and Fish" are a collection of simplified simulation programs
that illustrate a number of common parallel programming techniques
in various programming languages (some current ones, and some
old ones no longer in use).
Basic problem description, and (partial) code from 1999 class,
written in Matlab, CMMD, CMF, Split-C, Sun Threads, and pSather,
available
here.
Code (partial) from 2004 class, written in MPI, pthreads, and OpenMP,
available
here.
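For readers who have not seen these codes, here is a minimal sketch, not taken from any of the Sharks and Fish programs, of one technique they illustrate: a shared-memory parallel loop in OpenMP (one of the languages used in the 2004 versions). The fish setup, constants, and array sizes are invented for the example.

    /* Minimal illustrative sketch (not one of the Sharks and Fish codes):
       an OpenMP parallel loop advancing invented "fish" positions by one
       Euler time step.  Compile with OpenMP enabled, e.g. gcc -fopenmp. */
    #include <stdio.h>
    #include <omp.h>

    #define NFISH 1000000

    static double x[NFISH], v[NFISH];   /* positions and velocities */

    int main(void)
    {
        const double dt = 0.01;          /* time step (arbitrary) */

        /* invented initial conditions */
        for (int i = 0; i < NFISH; i++) { x[i] = i * 1e-6; v[i] = 1.0; }

        /* each thread updates a disjoint chunk of the array in parallel */
        #pragma omp parallel for
        for (int i = 0; i < NFISH; i++)
            x[i] += dt * v[i];

        printf("x[0] = %g (using up to %d threads)\n", x[0], omp_get_max_threads());
        return 0;
    }

The real Sharks and Fish codes add force computation, time-step control, and boundary conditions; this sketch only shows the loop-level parallelism pattern that the shared-memory versions rely on.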