Thanks for the photographer, Microsoft

Romil Bhardwaj


I'm a Ph.D. candidate at UC Berkeley working with Prof. Ion Stoica. I like to work on scheduling and resource management for machine learning workloads.

We are building SkyPilot, an open-source framework to run machine learning jobs on any cloud easily and cost-effectively. If you run ML jobs or would like to contribute to SkyPilot, please reach out!

Previously, I was a Research Fellow at Microsoft Research India, where I had a great time working on problems in modern systems and networks with the systems group. I obtained my B.Tech (with Honors) in Computer Science from IIIT-Delhi in 2015, where I worked on context-aware biometrics systems and open-source energy monitoring systems.


Publications

SkyPilot

Cilantro: Performance-Aware Resource Allocation for General Objectives via Online Feedback

Romil Bhardwaj, Kirthevasan Kandasamy, Asim Biswal, Wenshuo Guo, Benjamin Hindman, Joseph Gonzalez, Michael Jordan, Ion Stoica

OSDI 2023

Paper / Code

SkyPilot

SkyPilot: An Intercloud Broker for Sky Computing

Zongheng Yang, Zhanghao Wu, Michael Luo, Wei-Lin Chiang, Romil Bhardwaj, Woosuk Kwon, Siyuan Zhuang, Frank Sifei Luan, Gautam Mittal, Scott Shenker, Ion Stoica

NSDI 2023

Paper / GitHub / Docs / Tutorial / Press

ESCHER

ESCHER: Expressive Scheduling with Ephemeral Resources

Romil Bhardwaj, Alexey Tumanov, Stephanie Wang, Richard Liaw, Philipp Moritz, Robert Nishihara, Ion Stoica

SoCC 2022

Paper / Slides / Poster / Blog

scheduling for continuous learning with Ekya

Ekya: Continuous Learning of Video Analytics Models on Edge Compute Servers

Romil Bhardwaj, Zhengxu Xia, Ganesh Ananthanarayanan, Junchen Jiang, Yuanchao Shu, Nikolaos Karianakis, Kevin Hsieh, Victor Bahl, Ion Stoica

NSDI 2022

Paper / Code + Datasets / Slides / Poster / Blog / Press / Patent

dynamic cluster resizing with rubberband

RubberBand: Cloud-based Hyperparameter Tuning

Ujval Misra, Richard Liaw, Lisa Dunlap, Romil Bhardwaj, Kirthevasan Kandasamy, Joseph E. Gonzalez, Ion Stoica, Alexey Tumanov

EuroSys 2021

Paper / Video

wow hypersched is smart

HyperSched: Dynamic Resource Reallocation for Model Development on a Deadline

Richard Liaw, Romil Bhardwaj, Lisa Dunlap, Yitian Zou, Joseph E. Gonzalez, Ion Stoica, Alexey Tumanov

SoCC 2019

Paper / Slides / Code

GPU Scheduling meets the Mahabharat.

Gandiva: Introspective Cluster Scheduling for Deep Learning

Wencong Xiao*, Romil Bhardwaj*, Ramachandran Ramjee, Muthian Sivathanu, Nipun Kwatra, Zhenhua Han, Pratyush Patel, Xuan Peng, Hanyu Zhao, Quanlu Zhang, Fan Yang, Lidong Zhou

OSDI 2018

Paper / Poster

Are you a municipality with hundreds of traffic cameras but don't know what to do? AutoCalib is for you!

AutoCalib: Automatic Traffic Camera Calibration at Scale

Romil Bhardwaj, Gopi Krishna Tummala, Ganesan Ramalingam, Ramachandran Ramjee and Prasun Sinha

BuildSys 2017 [Best Paper and Best Demo]

Paper / Slides / Demo Video / Patent

Skip Correlation is the best correlation.

Skip-Correlation for Multi-Power Wireless Carrier Sensing

Romil Bhardwaj, Krishna Chintalapudi, and Ramachandran Ramjee

NSDI 2017

Paper / Slides / Talk / Patent

You seem to hang out with this dude quite often - you'll most likely be with him in your next photograph too!

Harnessing Social Context for Improved Face Recognition

Romil Bhardwaj, Gaurav Goswami, Richa Singh and Mayank Vatsa

IEEE International Conference on Biometrics (ICB) 2015

Paper / Visualization

Deep Learning for Face Recognition in 2013 - before deep learning was cool!

MDLFace: Memorability Augmented Deep Learning for Video Face Recognition

Gaurav Goswami, Romil Bhardwaj, Richa Singh and Mayank Vatsa

IEEE International Joint Conference on Biometrics (IJCB) 2014

Paper

* Co-Primary Authors

Miscellaneous

K8s tutorials

Kubernetes for Grad Students

I designed and open-sourced a tutorial on Kubernetes for CS systems grad students. I subject the members of RISELab to variants of this tutorial every semester.

Code / Slides

This project had raspberry pis. What's not to like?

Energy Monitoring @ IIIT-Delhi

We developed and deployed systems to collect energy consumption data from instrumentation across the campus. I wrote fault-tolerant Raspberry Pi drivers to collect, cache and relay data over diverse protocols.

Code

CarDashian - In-Car Navigation Sans Distraction

Phone-based navigation systems can be distracting to use. To let the driver keep her eyes on the road, we built an app to project navigation instructions directly onto the windshield, without needing any dedicated hardware.

Teaching

CS182

GSI - CS182: Deep Neural Networks

Designed programming assignments and exams on ML Systems, ran office hours, and led discussion sections for the 250+ student class with Prof. Anant Sahai. Spring 2023 @ UC Berkeley.

I taught kids probability!

Head TA - Intro to Probability and Statistics

Designed programming assignments and quizzes, and conducted tutorials for the incoming class of 180 freshmen. Winter 2015 @ IIIT-Delhi.