Tentative Lecture Schedule


Lecture Date Topic  Reading Comments
1 Aug 29 Overview, Logistics, Goals
(Notes: .ppt, .pdf)
2 Aug 31 Datacenter Architectures
  1. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines (chapters 1 and 2) [Patrick Wendell, slides]
  2. Above the Clouds: A Berkeley View of Cloud Computing [Haoyuan Li]
Sep 5 Labor Day
3 Sep 7 Cloudera's Software Stack [Invited Lecture: Aaron Myers, Cloudera] [slides]
  1. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines (chapters 3, 4, 7)
  2. Warehouse-Scale Computing: Entering the Teenage Decade (video presentation, Luiz Andre Barroso, Google)
4 Sep 12 Technology Trends [slides]
  1. Graphics Processing Units (GPUs): Computer Architecture: A Quantitative Approach, 5th edition, Chapter 4 (hard copies will be distributed in class) [Aurojit Panda, slides]
  2. Multicore CPUs: Amdahl's Law in the Multicore Era [Andrew Wang, notes]
  3. Solid State Devices (SSDs): Performance Modeling and Analysis of Flash-based Storage Devices [Nitesh Mor, slides]
5 Sep 14 Project Suggestions [slides]
  1. The Datacenter Needs an Operating System
6 Sep 19 Consistency, Availability, Partitions [slides]
  1. Cluster-Based Scalable Network Services [Shivaram Venkataraman, slides]
  2. Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services (Brewer's CAP Theorem - Julian Browne) [Peter Bailis, slides]
7 Sep 21 Paxos [slides]
  1. Paxos Made Simple [Gene Pang, slides]
  2. Paxos Made Practical [Gautam Kumar, slides]
  3. The Chubby Lock Service for Loosely-Coupled Distributed Systems [Mosharaf Chowdhury, slides]
8 Sep 26 Cluster File Systems
  1. The Google File System [Nitesh Mor, slides]
  2. Megastore: Providing Scalable, Highly Available Storage for Interactive Services [Sameer Agarwal, slides]
9 Sep 28 Data-flow Computation Frameworks
  1. MapReduce: Simplified Data Processing on Large Clusters [Angel Rodrigues, slides]
  2. Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks [slides]
10 Oct 3 Relational Storage
  1. HIVE: Data Warehousing & Analytics on Hadoop [Cliff Engle, slides]
  2. Pig Latin: A Not-So-Foreign Language for Data Processing [Gene Pang]
  3. SCADS: Scale-independent storage for social computing applications [Reynold Xin, slides]
11 Oct 5 Column-Oriented Storage Systems [Invited Lecture on HBase: Dhruba Borthakur, Facebook, slides]
  1. Bigtable: A Distributed Storage System for Structured Data [Ye Yuan]
  2. HBase
12 Oct 10 Key-Value Store and Interactive Query Systems
  1. Dynamo: Amazon's Highly Available Key-Value Store [Kay Ousterhout, slides]
  2. Dremel: Interactive Analysis of Web-Scale Datasets [Sameer Agarwal, slides]
13 Oct 12 Big Data in the Clouds
  1. DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language [Tathagata Das, slides]
  2. FlumeJava: easy, efficient data-parallel pipelines [Mosharaf Chowdhury, slides]
14 Oct 17 Geographically Distributed Storage [Invited Lecture: Raghu Ramakrishnan, Yahoo! Research]
  1. PNUTS: Yahoo!'s Hosted Data Serving Platform
  2. Don't Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS
15 Oct 19 Programming Languages for the Cloud
  1. BOOM Analytics: Exploring Data-Centric, Declarative Programming for the Cloud [Ye Yuan, slides]
  2. Erlang - A survey of the language and its industrial applications [Aurojit Panda, slides]
16 Oct 24 DBases in the Cloud
  1. Relational Cloud: A Database-as-a-Service for the Cloud [Arka Bhattacharya, slides]
  2. Database Scalability, Elasticity, and Autonomy in the Cloud [Andrew Wang, slides]
17 Oct 26 In-Memory Frameworks
  1. Piccolo: Building Fast, Distributed Programs with Partitioned Tables [Shivaram Venkataraman]
  2. Spark [Antonio Lupher, slides]
18 Oct 31 Multiprogramming for Datacenters [Invited Lecture on Hadoop NextGen: Arun Murthy, Hortonworks]
  1. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
  2. Hadoop NextGen
19 Nov 2 OSes and Clouds
  1. An Operating System for Multicore and Clouds: Mechanisms and Implementation [Gautam Kumar, slides]
  2. Akaros (a more recent version of the paper is available) [Albert Kim]
20 Nov 7 Networking: topologies [Invited Lecture: Amin Vahdat, UCSD/Google]
  1. VL2: A Scalable and Flexible Data Center Network
  2. PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric
  3. c-Through: Part-time Optics in Data Centers
21 Nov 9 Networking: Traffic Management
  1. Hedera: Dynamic Flow Scheduling for Data Center Networks [Tathagata Das, slides]
  2. Managing Data Transfers in Computer Clusters with Orchestra [Justine Sherry, slides]
22 Nov 14 Networking: Transport Protocol Improvements [slides]
  1. Data Center TCP (DCTCP) [Shaddi Hasan, slides]
  2. Improving Datacenter Performance and Robustness with Multipath TCP [Anand Iyer, slides]
  3. ICTCP: Incast Congestion Control for TCP in Data Center Networks [Hilfi Madari Alkaff, slides]
23 Nov 16 Frameworks for Graph Computations [slides]
  1. Pregel: a system for large-scale graph processing [Patrick Wendell, slides]
  2. The GraphLab Abstraction [Mosharaf Chowdhury, slides]
24 Nov 21 Security
  1. CryptDB: A Practical Encrypted Relational DBMS [Paul Pearce]
  2. Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds [Edward Wu, slides]
25 Nov 23 Memory Management
  1. The Case for RAMClouds: Scalable High-Performance Storage Entirely in DRAM [Sangjin Han, slides]
  2. PACMan: Coordinated Memory Caching for Parallel Jobs [Reynold Xin, slides]
26 Nov 28 Scheduling and Resource Management [Invited Lecture on resource management challenges at Google: John Wilkes, Google, slides]
  1. Dominant Resource Fairness (DRF)
  2. Modeling and Synthesizing Task Placement Constraints in Google Compute Clusters
27 Nov 30 "What is Good Research?"
  1. Hamming's "You and Your Research" talk
  2. Allen Newell's research style
  3. Patterson's "How to Have a Bad Career in Research/Academia" talk
No reviews required for this lecture's readings!
Dec 7 Poster Session (9:30am-11:30pm, Wozniak Lounge)
Dec 9 Final project report due (11:59pm)