I am primarily interested in scalable systems and algorithms for machine learning (ML). With the ever-increasing size of data and machine learning models, I find myself in a unique position to make an impact on the world by developing scalable machine learning systems. I pursue this goal at the RISE and BAIR labs at Berkeley, advised by Michael I. Jordan and Ion Stoica. I began my time at Berkeley with various contributions to the Ray project, including a rewrite of Ray's communication protocol to accelerate distributed optimization algorithms. Since then, I have worked on optimization algorithms and on automatic serial-to-parallel translation, which has led to my current project: NumS, a distributed systems library that seamlessly scales NumPy and provides tools for optimal parallelization of NumPy programs. In collaboration with the Berkeley NLP group, I am also working on tractable agents for StarCraft II, a real-time strategy game whose challenges resemble various problems in NLP.

Prior to joining Berkeley, I had the honor of working with Jennifer Listgarten, Nicolo Fusi, and Adam Kalai as a research software engineer in computational biology and machine learning at Microsoft Research, New England. After graduating from high school, I worked full-time as a self-taught software engineer, and later began my undergraduate education through Harvard Extension School. I completed my undergraduate degree in mathematics cum laude through Harvard GSAS and MIT, where I had the honor of working with Finale Doshi-Velez on machine learning research problems in healthcare. Please see my curriculum vitae for a detailed description of my background.

## Projects

**NumS** is a **Num**erical computing library for Python that **S**cales. It is a distributed-memory array abstraction that matches much of the syntax of the NumPy API, extending NumPy to scale horizontally and to provide inter-operation parallelism (e.g. automatic parallelization of Python loops). The project is based on a simple yet powerful data structure, the N-dimensional future, which uses concurrently executing futures to solve many of the problems that arise when parallelizing basic linear algebra operations on distributed-memory systems. One notable emergent property of NumS is its automation of data and model parallelism for machine learning applications, eliminating the need for researchers and machine learning practitioners to manually implement distributed-memory algorithms, such as the distributed stochastic gradient method.
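As a minimal sketch of what this means in practice: because NumS matches much of the NumPy API, an ordinary NumPy program like the one below is the kind of code NumS can scale, with the arrays becoming distributed N-dimensional futures once the NumPy-mirroring NumS module is imported in place of `numpy` (the exact import path is an installation detail not covered here).

```python
import numpy as np
# With NumS installed, this same program would run against its
# NumPy-mirroring module instead of numpy; each array below would
# then be a distributed N-dimensional future evaluated concurrently.

X = np.arange(12.0).reshape(4, 3)   # 4 samples, 3 features
w = np.array([0.5, -1.0, 2.0])      # weight vector
y = X @ w                           # matrix-vector product (parallelizable blockwise)
print(y.sum())                      # a reduction over the distributed result
```

The point of the abstraction is that none of the lines above need to change to obtain data parallelism; the library decides how to partition `X` and schedule the blockwise products.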

## Publications

- Variance Reduction with Sparse Gradients. **Melih Elibol**, Lihua Lei, Michael I. Jordan. ICLR 2020.
- Probabilistic Matrix Factorization for Automated Machine Learning. Nicolo Fusi, Rishit Sheth, **Melih Elibol**. NIPS 2018.
- Ray: A Distributed Framework for Emerging AI Applications. Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, **Melih Elibol**, Zongheng Yang, William Paul, Michael I. Jordan, Ion Stoica. OSDI 2018.
- Flexible Primitives for Distributed Deep Learning in Ray. Yaroslav Bulatov, Robert Nishihara, Philipp Moritz, **Melih Elibol**, Ion Stoica, Michael Jordan. SysML 2018.
- Predicting off-target effects for end-to-end CRISPR guide design. Jennifer Listgarten, Michael Weinstein, **Huseyin Melih Elibol**, Luong Hoang, John Doench, and Nicolo Fusi. Nature Biomedical Engineering 2018.
- Cross-Corpora Unsupervised Learning of Trajectories in Autism Spectrum Disorders. **Huseyin Melih Elibol**, Vincent Nguyen, Scott Linderman, Matthew Johnson, Amna Hashmi, and Finale Doshi-Velez. Journal of Machine Learning Research 2016.
- Supervised topic models for clinical interpretability. Michael Hughes, **Huseyin Melih Elibol**, Thomas McCoy, Roy Perlis, and Finale Doshi-Velez. NIPS 2016 Workshop on Machine Learning for Health.

## Teaching

- Spring 2018 (GSI): CS 189, Introduction to Machine Learning
- Spring 2021 (GSI): CS 267, Applications of Parallel Computers