U.C. Berkeley CS267/EngC233 Home Page

Applications of Parallel Computers

Spring 2011

T Th 9:30-11:00, 250 Sutardja Dai Hall

Instructors:

Jim Demmel

Offices:
564 Soda Hall ("Virginia", in ParLab), (510)643-5386
831 Evans Hall, same phone number

Office Hours: W 12-1 and F 11-12, in 564 Soda Hall (subject to change)

(send email)

Katherine Yelick

Offices:
Lawrence Berkeley National Lab (LBNL), Building 50B, room 4230, (510)495-2431
581 Soda Hall (ParLab)

Office Hours: by appointment (in the ParLab)

(send email)

Teaching Assistants:

Michael Anderson

Office: ParLab, 5th floor, Soda Hall, cell phone: (608)334-0828

Office Hours: W 2-3:30 and Th 12:30-2, 576 Soda (Euclid, in the ParLab) (note change, as of Feb 3)

(send email)

Grey Ballard

Office: ParLab, 5th floor, Soda Hall, cell phone: (336)244-2366

Office Hours: W 2-3:30 and Th 12:30-2, 576 Soda (Euclid, in the ParLab) (note change, as of Feb 3)

(send email)

Administrative Assistants:

Tammy Johnson

Office: 565 Soda Hall

Phone: (510)643-4816

(send email)

Roxana Infante

Office: 563 Soda Hall

Phone: (510)643-1455

(send email)

Link to webcasting of lectures (Active during lectures only; see here for archived video).

To ask questions during live lectures, you have two options:

You can email them to this address, which the teaching assistants will be monitoring during lecture.

You can use the chat box at the bottom of the webpage of Class Resources and Homework Assignments.

Unfortunately, Lecture 11 on Feb 24 was not recorded. Instead, we have posted a link to the lecture on the same topic given during Spring 2010. Most of the material is the same, although some new results were presented in Spring 2011. Click here for the PowerPoint slides for the lecture from Spring 2010.

Syllabus and Motivation

CS267 was originally designed to teach students how to program parallel computers to efficiently solve challenging problems in science and engineering, where very fast computers are required either to perform complex simulations or to analyze enormous datasets. CS267 is intended to be useful for students from many departments and with different backgrounds, although we will assume reasonable programming skills in a conventional (non-parallel) language, as well as enough mathematical skills to understand the problems and algorithmic solutions presented. CS267 satisfies part of the course requirements for a new Designated Emphasis ("graduate minor") in Computational Science and Engineering.

While this general outline remains, a large change in the computing world has started in the last few years: not only are the fastest computers parallel, but nearly all computers will soon be parallel, because the physics of semiconductor manufacturing will no longer let conventional sequential processors get faster year after year, as they have for so long (roughly doubling in speed every 18 months for many years). So all programs that need to run faster will have to become parallel programs. (It is considered very unlikely that compilers will be able to automatically find enough parallelism in most sequential programs to solve this problem.) For background on this trend toward parallelism, click here.

This will be a huge change not just for science and engineering but for the entire computing industry, which has depended on selling new computers by running their users' programs faster without the users having to reprogram them. Large research activities to address this issue are underway at many computer companies and universities, including Berkeley's ParLab, whose research agenda is outlined here.

While the ultimate solutions to the parallel programming problem are far from determined, students in CS267 will get the skills to use some of the best existing parallel programming tools, and be exposed to a number of open research questions.

Tentative Detailed Syllabus

Grading

There will be several programming assignments to acquaint students with basic issues in memory locality and parallelism needed for high performance. Most of the grade will be based on a final project (in which students are encouraged to work in small interdisciplinary teams), which could involve parallelizing an interesting application, or developing or evaluating a novel parallel computing tool. Students are expected to have identified a likely project by mid semester, so that they can begin working on it. We will provide many suggestions of possible projects as the class proceeds.

Homeworks should be submitted by emailing them to cs267.spring2011.submissions@gmail.com.

Class Projects

You are welcome to suggest your own class project, but you may also look at the following sites for ideas:

the ParLab webpage,

the Computational Research Division and NERSC webpages at LBL,

class posters from CS267 in Spring 2010

class posters and their brief oral presentations from CS267 in Spring 2009.

Announcements

(Apr 1) Office hours today are postponed one hour until 12-1pm.

(Mar 16) Class will be held in the usual room (250 SDH), not the auditorium (as previously announced).

(Mar 9) Prof. Demmel's office hours today are moved to 2-3pm.

(Feb 24) Unfortunately, Lecture 11 on Feb 24 was not recorded. Instead, we have posted a link to the Spring 2010 version of the lecture.

(Jan 27) There is a local workshop on GPU programming that you may want to attend.

(Jan 19) The GSI office hours on W have moved from 1-2:30 to 2-3:30.

(Jan 19) Remote participants can now submit questions either by email or by chat room, as described above under "Link to webcasting of lectures."

(Jan 18) On Tuesday, Jan 25, CS267 will meet in 630 Sutardja Dai Hall. On Thursday, March 17, CS267 will meet in the main auditorium on the 3rd floor of Sutardja Dai Hall.

(Jan 17) Homework Assignment 0 has been posted here, due Feb 2 by midnight.

(Jan 17) Homeworks should be submitted by emailing them to cs267.spring2011.submissions@gmail.com.

(Jan 17) Please fill out the following class survey.

(Jan 17) This course satisfies part of the course requirements for a new Designated Emphasis ("graduate minor") in Computational Science and Engineering.

(Jan 17) For students who want to try some on-line self-paced courses to improve basic programming skills, click here. You can use this material without having to register. In particular, courses like CS 9C (for programming in C) might be useful.

(Jan 17) This course will have students attending from several CITRIS campuses: these include UC Berkeley, UC Davis, UC Merced and UC Santa Cruz. CITRIS is generously providing the webcasting facilities and other resources to help run the course. Lectures will be webcast here (active during lectures only).

Class Resources and Homework Assignments

This will include, among other things, class handouts, homework assignments, the class roster, information about class accounts, pointers to documentation for machines and software tools we will use, reports and books on supercomputing, pointers to old CS267 class webpages (including old class projects), and pointers to other useful websites.

Lecture Notes and Video

Live video of the lectures may be seen here (only while the lecture is being given).

Archived video, posted after the lectures, may be found here.

Notes from previous offerings of CS267 are posted on old class webpages available under Class Resources.

In particular, the web page from the 1996 offering has detailed, textbook-style notes available on-line that are still largely up-to-date in their presentations of parallel algorithms (the slides to be posted during this semester will contain some more recently invented algorithms as well).

Slides (PowerPoint) for lectures from Spr 2010 will be posted here.

Lecture 1, Jan 18, Introduction (ppt)

Lecture 2, Jan 20, Single Processor Machines: Memory Hierarchies and Processor Features (ppt)

Lecture 3, Jan 25, Introduction to Parallel Machines and Programming Models (ppt)

Lecture 4, Jan 27, Sources of Parallelism and Locality in Simulation, Part 1 (ppt)

Lecture 5, Feb 1,

Sources of Parallelism and Locality in Simulation, Part 2 (ppt) (updated Jan 31, 3:40pm)

Tricks with Trees (ppt)

Lecture 6, Feb 3, Shared Memory Programming: Threads and OpenMP (ppt)

Lecture 7, Feb 8, Distributed Memory Machines and Programming (ppt)

Lecture 8, Feb 10, An Introduction to CUDA/OpenCL and Manycore Graphics Processors (pdf)

Lecture 9, Feb 15,

We will complete Lecture 7 on Distributed Memory Machines and Programming (ppt) (updated slides)

Debugging and Optimization Tools (pptx), presented by Richard Gerber

Performance Engineering and Debugging HPC Applications (pptx), presented by David Skinner

Lecture 10, Feb 17, Programming in Unified Parallel C - UPC (ppt)

Lecture 11, Feb 22, Dense Linear Algebra - Part 1 (ppt)

Lecture 12, Feb 24, Dense Linear Algebra - Part 2 (ppt)

Lecture 13, Mar 1, Graph Partitioning (ppt)

Lecture 14, Mar 3. We will finish Graph Partitioning (ppt) from last time (updated slides!), and then begin Automatic Performance Tuning and Sparse-Matrix-Vector-Multiplication (SpMV) Partitioning (ppt)

Lecture 15, Mar 8. We will finish Automatic Performance Tuning and Sparse-Matrix-Vector-Multiplication (SpMV) Partitioning (ppt) from last time (updated slides!)

Lecture 16, Mar 10, Hierarchical Methods for the N-body problem (ppt)

Lecture 17, Mar 15, Structured Grids (ppt)

Lecture 18, Mar 17, Fast Fourier Transform (ppt)

Lecture 19, Mar 29, Architecting Parallel Software with Patterns (in pptx) or (in pdf), by Kurt Keutzer

Lecture 20, Mar 31, Engineering Parallel Software with Our Pattern Language (in pptx) or (in pdf), by Kurt Keutzer

Lecture 21, Apr 5, Cloud Computing with MapReduce and Hadoop, by Matei Zaharia

Lecture 22, Apr 7, Parallel Graph Algorithms, by Kamesh Madduri

Lecture 23, Apr 12, Frameworks for Complex Multiphysics HPC Applications, in pptx (18 MB) or pdf (36 MB), by John Shalf

Lecture 24, Apr 14, Modeling and Predicting Climate Change (ppt), by Michael Wehner

Lecture 25, Apr 19, Dynamic Load Balancing (ppt)

Lecture 26, Apr 21, Blood Flow Simulation at the Petascale and Beyond (pdf), by Richard Vuduc, co-winner of the 2010 Gordon Bell Prize

Lecture 27, Apr 26, Big Bang, Big Iron: High Performance Computing and the Cosmic Microwave Background (pptx), by Julian Borrill

Lecture 28, Apr 28, Software and Algorithms for Exascale: Ten Ways to Waste an Exascale Computer (ppt)

Sharks and Fish

"Sharks and Fish" is a collection of simplified simulation programs that illustrate a number of common parallel programming techniques in various programming languages (some current ones, and some old ones no longer in use).

Basic problem description, and (partial) code from 1999 class, written in Matlab, CMMD, CMF, Split-C, Sun Threads, and pSather, available here.

Code (partial) from 2004 class, written in MPI, pthreads, OpenMP, available here.