I am a PhD student in the UC Berkeley NetSys Lab, advised by Sylvia Ratnasamy. I am broadly interested in networking, computer systems, and cloud computing. My PhD thesis work focuses on understanding performance in data analytics frameworks.
I am also a committer and PMC member for Apache Spark. My work on Spark has focused on improving scheduler performance, and I currently help maintain and review pull requests for the scheduler code.
I am currently supported by a Google PhD Fellowship. In the past, I was supported by a Hertz Foundation Graduate Fellowship, a UC Berkeley Chancellor's Fellowship, and a Google Anita Borg Memorial Scholarship.
The first component of my thesis work focused on characterizing the performance of large-scale data analytics frameworks like Spark. As part of that project, I added instrumentation to Spark to measure how much time is spent doing network and disk I/O. Most of that instrumentation is now part of Spark, and can be visualized in the Spark UI by clicking the "Event Timeline" link on the stage detail page. More information about that project is available here; that page includes links to some detailed traces we collected.
One takeaway from my work measuring performance in current systems is that today's systems make it difficult to reason about performance. In Spark, for example, pervasive pipelining and parallelism make it difficult (even with extensive instrumentation and metrics) for users to model performance and understand how changing the software or hardware configuration would affect it. Today's users have many choices in how to configure their workloads (e.g., what type of EC2 instance should they use to run their job?); without the ability to reason about performance, they cannot choose the configuration that performs best. My current research focuses on a new system, Monotasks, that we've designed with the singular goal of making it easy for users to reason about performance. Monotasks is a replacement for the execution layer of Apache Spark, and is fully API-compatible with Spark. For more information about Monotasks, refer to my talk at Spark Summit 2016 (linked below).
Making Sense of Performance in Data Analytics Frameworks. Kay Ousterhout, Ryan Rasti, Sylvia Ratnasamy, Scott Shenker, Byung-Gon Chun. NSDI 2015.
Sparrow: Distributed, Low Latency Scheduling. Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica. SOSP 2013.
The Case for Tiny Tasks in Compute Clusters. Kay Ousterhout, Aurojit Panda, Joshua Rosen, Shivaram Venkataraman, Reynold Xin, Sylvia Ratnasamy, Scott Shenker, Ion Stoica. HotOS 2013.
Sparrow: Scalable Scheduling for Sub-Second Parallel Jobs. Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica. UC Berkeley EECS 2013.