CS294-87: Datacenter Networking and the New Converged Internet

TuTh 11-12:30, Room 310 Soda Hall

Prof. Randy H. Katz, Instructor

Course Overview

Datacenters, in which organizations like Google and Facebook place tens or hundreds of thousands of computers inside a single co-located facility, present new challenges and opportunities for data networking. This has become a leading topic of interest within the research community, with new proposals for topologies, routing, scheduling, and protocols appearing in virtually every network-related research conference. The community has rediscovered some of the earlier literature that informed high-performance interconnects for multicomputers in the 1980s, revisited today at much larger scales and constructed from commodity technologies like switched Ethernet or supercomputer-based InfiniBand.

The networked datacenter must also be considered in the context of how it is embedded in the wide-area Internet. Modern datacenters are no longer single nodes, but rather are organized into geographically distributed "fleets" of datacenters interconnected by wide-area fiber. New access networks, like Fourth Generation cellular, will offer new data capabilities to connect large communities of users to the applications running in the datacenters. New approaches will be needed for effectively managing end-to-end connections in this highly converged, virtualized, and software-enabled network architecture.

In this course, we will read and discuss recent papers, interspersed with industrial speakers who have unique insights into the ways in which the real Internet is rapidly evolving. In addition to actively participating in the discussion of papers, students will undertake a substantial research project to investigate dimensions of the new converged Internet.

Topics to be Covered

Expected Background

Graduate-level courses in networking (e.g., CS 268) or operating systems (e.g., CS 262), or familiarity with recent research literature in these areas.

Draft Syllabus

  CS 294-87 Datacenter Networking   Difane    
Week Date Topics Authors Titles Venue  
1 22-Jan-13 Overview/"Future"  Internet B. M. Leiner, et al. "The Past and Future History of the Internet"
(Randy)
Comm. ACM, V.40, N.2, Feb 1997, pp. 102-108.
J. Pan, et al. "A Survey of the Research on Future Internet Architectures"
(Randy)
IEEE Communications Magazine, (July 2011), pp. 26-36.
24-Jan-13 Cloud Computing K. Kant "Data Center Evolution: A Tutorial on State of the Art, Issues, and Challenges"
(Randy)
Computer Networks, 53 (2009), pp. 2939-2965.
A. Greenberg, et al. "The Cost of a Cloud: Research Problems in Data Center Networks"
(Randy)
ACM Computer Communication Review, V. 39, N. 1, (January 2009)
2 29-Jan-13 High Performance Interconnect D. Abts, B. Felderman "A Guided Tour Through Datacenter Networking"
(Randy)
ACM Queue, 2012  
    Basics D. Abts, J. Kim High Performance Datacenter Networks
(Randy)
Synthesis Lectures on Computer Architecture #14 Ch. 1-3
  31-Jan-13 High Performance Interconnect D. Abts, J. Kim High Performance Datacenter Networks
(Randy)
Synthesis Lectures on Computer Architecture #14 Ch. 4-6
    Topologies        
3 5-Feb-13 High Performance Interconnect D. Abts, J. Kim High Performance Datacenter Networks
(Randy)
Synthesis Lectures on Computer Architecture #14 Ch. 7-9
Switch Architecture/Case Studies
7-Feb-13 Datacenter Traffic S. Kandula, et al. "The Nature of Datacenter Traffic: Measurements & Analysis"
(Yuan)
IMC'09, Chicago, IL (Nov 2009), pp. 202-208.
T. Benson, et al. "Network Traffic Characteristics of Data Centers in the Wild"
(Stephan)
IMC'10, Melbourne, Australia (Nov 2010).
4 12-Feb-13 Datacenter Topologies M. Al-Fares, et al. "A Scalable, Commodity Data Center Network Architecture"
(Kaifei)
ACM SIGCOMM, Seattle, WA, (Aug 2008).  
      R. Niranjan Mysore, et al. "PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric"
(Pete)
ACM SIGCOMM, Barcelona, Spain, (Aug 2009).  
  14-Feb-13   C. Guo, et al. "DCell: A Scalable and Fault-Tolerant Network Structure for Data Centers"
(Kevin)
ACM SIGCOMM, Seattle, WA, (Aug 2008).  
      C. Guo, et al. "BCube: A High Performance, Server-centric Network Architecture for Modular Datacenters"
(HY)
ACM SIGCOMM, Barcelona, Spain, (Aug 2009).  
5 19-Feb-13 Datacenter Topologies A. Greenberg, et al. "VL2: A Scalable and Flexible Data Center Network"
(Ben)
ACM SIGCOMM, Barcelona, Spain, (Aug 2009).
J. Mudigonda, et al. "SPAIN: COTS Data-Center Ethernet for Multipathing over Arbitrary Topologies"
(James)
NSDI'10, San Jose, CA, (Apr 2010).
21-Feb-13 A. Singla, et al. "Jellyfish: Networking Data Centers Randomly"
(Rashmi)
HotCloud'11, Portland, OR, (June 2011).
J-Y Shin, et al. "Small World Datacenters"
(Zack)
SOCC'11, Cascais, Portugal, (Oct 2011).
6 26-Feb-13 Project Proposals/Presentations        
             
  28-Feb-13 Alternative Technologies G. Wang, et al. "c-Through: Part-time Optics in Data Centers"
(Randy)
ACM SIGCOMM, New Delhi, India, (Aug 2010).  
      D. Halperin, et al. "Augmenting Data Center Networks with Multi-Gigabit Wireless Links"
(Kang)
ACM SIGCOMM, Toronto, Canada, (Aug 2011).  
7 5-Mar-13 Addressing, Routing, Scheduling K. Chen, et al. "Generic and Automatic Address Configuration for Data Center Networks"
(Kaifei)
ACM SIGCOMM, New Delhi, India, (Aug 2010).
H. Abu-Libdeh, et al. "Symbiotic Routing in Future Data Centers"
(Ben)
ACM SIGCOMM, New Delhi, India, (Aug 2010).
7-Mar-13 Addressing, Routing, Scheduling M. Al-Fares, et al. "Hedera: Dynamic Flow Scheduling for Data Center Networks"
(Kang)
NSDI'10, San Jose, CA, (Apr 2010).
C. Wilson, et al. "Better Never than Late: Meeting Deadlines in Datacenter Networks"
(Stephan)
ACM SIGCOMM, Toronto, Canada, (Aug 2011).
8 12-Mar-13 Datacenter Transport V. Vasudevan, et al. "Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communications"
(Yuan)
ACM SIGCOMM, Barcelona, Spain, (Aug 2009).  
      H. Wu, et al. "ICTCP: Incast Congestion Control for TCP in Data Center Networks"
(Kevin)
ACM CoNEXT 2010, Philadelphia, PA, (Dec 2010).  
  14-Mar-13 Datacenter Transport M. Alizadeh, et al. "Data Center TCP (DCTCP)"
(Randy)
ACM SIGCOMM, New Delhi, India, (Aug 2010).  
      C. Raiciu, et al. "Improving Datacenter Performance and Robustness with Multipath TCP"
(Randy)
ACM SIGCOMM, Toronto, Canada, (Aug 2011).  
9 19-Mar-13 Datacenter Network Performance H. Ballani, et al. "Towards Predictable Datacenter Networks"
(Rashmi)
ACM SIGCOMM, Toronto, Canada, (Aug 2011).
D. Zats, et al. "DeTail: Reigning in the Long Latency Tail"
(Randy)
ACM SIGCOMM, Helsinki, Finland, (Aug 2012).
21-Mar-13 Project Presentations/Updates
10 26-Mar-13 Spring Break        
             
  28-Mar-13 Spring Break        
             
11 2-Apr-13 Software-Defined Networking M. Dobrescu, et al. "RouteBricks: Exploiting Parallelism To Scale Software Routers"
(HY)
SOSP'09, (Oct 2009).
N. Gude, et al. "NOX: Towards an Operating System for Networks"
(James)
ACM Computer Communication Review, V. 38, N. 3 (July 2008).
4-Apr-13 Software-Defined Networking T. Koponen, et al. "Onix: A Distributed Control Platform for Large-scale Production Networks"
(Kaifei)
OSDI'10, Vancouver, Canada, (Oct 2010).
R. Sherwood, et al. "Can the Production Network Be the Testbed?"
(Ben)
OSDI'10, Vancouver, Canada, (Oct 2010).
12 9-Apr-13 Storage Networking D. Sacks Demystifying Storage Networking
(Rashmi)
IBM White Paper, 2001.  
      P. Sarkar, K. Voruganti IP Storage: The Challenge Ahead
(Kang)
IBM Almaden  
  11-Apr-13 Storage Networking G. Gibson, R. Van Meter "Network Attached Storage Architecture"
(HY)
Comm. ACM, V. 43, N. 11, (Nov 2000), pp. 37-45.  
      G. Gibson, et al. "A Cost-Effective, High-Bandwidth Storage Architecture"
(Kevin)
ACM ASPLOS 1998.  
13 16-Apr-13 Cellular Data Networking I. Akyildiz, et al. "The Evolution to 4G Cellular Systems: LTE-Advanced"
(Zack)
Physical Communication (31 August 2010)
F. Mlinarsky, B. Turner "4G Wireless Evolution Tutorial" (pptx)
(Stephan)
4G Wireless Evolution Conference, (2-4 Feb 2009), Miami, FL
18-Apr-13 EECS Faculty Retreat Class Cancelled
14 23-Apr-13 Cellular Data Networking J. Hamalainen "Towards Heterogeneous Networks" (pptx)
(Zack)
Ericsson, (Feb 2012)
      4G Americas "Developing and Integrating a High Performance Het-Net"
(Randy)
(Oct 2012)  
  25-Apr-13 Network Convergence J. Choi, et al. "A Survey on Content-Oriented Networking for Efficient Content Delivery"
(James)
IEEE Communications Magazine, (March 2011), pp. 121-127.  
      V. Jacobson, et al.  "Networking Named Content" "Introduction to Content Centric Networking" (slides)
(Randy)
CoNEXT '09, Rome, Italy, pp. 1-12.  
15 30-Apr-13 Wrap-Up
2-May-13 Final Project Presentations
Miscellaneous References
Y. Chen, et al. "Understanding TCP Incast and Its Implications for Big Data Workloads" USENIX ;login: V. 37, N. 3, (June 2012), pp. 24-38.
A. Shpiner, I. Keslassy "A Switch-based Approach to Throughput Collapse and Starvation in Data Centers" IWQOS'10
D. Crisan, et al. "Short and Fat: TCP Performance in CEE Datacenter Networks" IEEE Symp. High Performance Interconnects, 2011
S. Kulkarni, P. Agrawal "A Probabilistic Approach to Address TCP Incast in Data Center Networks" IEEE Intl. Conf. on Distributed Computing Systems, 2011.
M. Yu, et al. "Scalable Flow-Based Networking with DIFANE" SIGCOMM'10, New Delhi, (August, 2010), pp. 351-362.
M. Casado, et al. "Virtualizing the Network Forwarding Plane" ACM Presto'10, Philadelphia, PA, (Nov 2010).
P. Gill, et al. "Understanding Network Failures in Data Centers: Measurement, Analysis, Implications" ACM SIGCOMM'11, Toronto, Canada, (August 2011).

Grading

50% -- Seminar Participation including leading paper discussions;
50% -- Course Research Project, including proposal and project presentations, and end-of-semester poster session;
No Exams!

Last updated: 20 April 2013 randy@cs.Berkeley.edu