Chelsea Finn
cbfinn at cs dot stanford dot edu

I am interested in how algorithms can enable machines to acquire more general notions of intelligence through learning and interaction, allowing them to autonomously learn a variety of complex sensorimotor skills in real-world settings. This includes learning deep representations of complex skills from raw sensory inputs, enabling machines to learn through interaction without human supervision, and allowing systems to build upon what they've learned previously to acquire new capabilities with small amounts of experience.

I am currently a research scientist at Google Brain, a post-doc at Berkeley AI Research Lab (BAIR), and an acting assistant professor at Stanford. I will join the Stanford Computer Science faculty full time, starting in Fall 2019.

I recently completed my PhD in CS at UC Berkeley, working on machine learning and its intersection with robotic perception and control. I was a part of Berkeley AI Research Lab (BAIR), advised by Pieter Abbeel and Sergey Levine. Before graduate school, I received a Bachelors in EECS at MIT, where I worked on several research projects, including an assistive technology project in CSAIL under Seth Teller and an animal biometrics project under Sai Ravela. I have also spent time at Counsyl, Google, and Sandia National Labs.

Prospective students, please read this before contacting me.

CV  /  Google Scholar  /  GitHub  /  Twitter

  • We wrote a blog post describing our latest work on learning a single model from unsupervised interaction that can be used to accomplish many different tasks. Our approach enables robots to use visual foresight to plan to achieve goals.
  • I am honored to have received the MIT TR35 pioneer award.
  • Some of my research was featured in a TechCrunch article.
  • At NIPS 2017, we showcased our research on meta-imitation learning and visual foresight in a live robot demo! For more information and a video, see this page.
  • I wrote a blog post describing recent approaches to meta-learning and our recent paper on model-agnostic meta-learning. It was also translated into Chinese.
  • In summer 2017, I co-organized BAIR camp, a 2-day summer camp on human-centered AI for high-school students from low income backgrounds. We are organizing a second camp in August 2018.
  • In Spring 2017, I helped develop and co-taught a course on deep reinforcement learning.
Recent Talk (October 2018)

Invited Talks and Lectures
Recent Preprints

One-Shot Hierarchical Imitation Learning of Compound Visuomotor Tasks
Tianhe Yu, Pieter Abbeel, Sergey Levine, Chelsea Finn
arXiv / project page

We aim to learn multi-stage vision-based tasks on a real robot from a single video of a human performing the task. We propose a method that learns both how to learn primitive behaviors from video demonstrations and how to dynamically compose these behaviors to perform multi-stage tasks by "watching" a human demonstrator.

Time Reversal as Self-Supervision
Suraj Nair, Mohammad Babaeizadeh, Chelsea Finn, Sergey Levine, Vikash Kumar
arXiv / project page

We propose a technique that uses time-reversal to learn goals and provide a high level plan to reach them. In particular, our approach explores outward from a set of goal states and learns to predict these trajectories in reverse, which provides a high-level plan towards goals.

Unsupervised Meta-Learning for Reinforcement Learning
Abhishek Gupta, Ben Eysenbach, Chelsea Finn, Sergey Levine

While meta-learning enables fast learning of new tasks, it requires a human to specify a distribution over tasks for meta-training. In effect, meta-learning offloads the design burden from algorithm design to task design. We propose to automate the design of tasks for meta-learning, describing a family of unsupervised meta-reinforcement learning algorithms that are truly automated.

Learning a Prior over Intent via Meta-Inverse Reinforcement Learning
Kelvin Xu, Ellis Ratner, Anca Dragan, Sergey Levine, Chelsea Finn

Learning the objective underlying example behavior is a challenging, under-defined problem, particularly when only a few demonstrations are available. However, there is structure among the type of behaviors that we might want agents to learn. We learn this structure from demonstrations across many tasks, acquiring a prior over intentions, and use this learned prior to infer reward functions for new tasks from only a few demonstrations.

Stochastic Adversarial Video Prediction
Alex Lee, Richard Zhang, Frederik Ebert, Pieter Abbeel, Chelsea Finn, Sergey Levine
arXiv / videos / code

We combine latent variable models with adversarial training to build a video prediction model that produces predictions that look more realistic to human raters and better cover the range of possible futures.

All Papers

Learning to Adapt in Dynamic, Real-World Environments Through Meta-Reinforcement Learning
Anusha Nagabandi*, Ignasi Clavera*, Simin Liu, Ron Fearing, Pieter Abbeel, Sergey Levine, Chelsea Finn
International Conference on Learning Representations (ICLR), 2019
arXiv / videos / code (coming soon)

We propose a method that learns how to adapt online to new situations and perturbations through meta-reinforcement learning. Unlike prior meta-RL methods, our approach is model-based, making it sample-efficient during meta-training and thus practical for real-world problems.

Unsupervised Learning via Meta-Learning
Kyle Hsu, Sergey Levine, Chelsea Finn
International Conference on Learning Representations (ICLR), 2019
arXiv / project page / code

We propose CACTUs, an unsupervised learning algorithm that learns to learn tasks constructed from unlabeled data. CACTUs leads to significantly more effective downstream learning and enables few-shot learning without requiring labeled meta-learning datasets.
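The task-construction idea can be sketched as follows. This is a toy illustration, not the paper's code: the embeddings are synthetic 2-D points, the tiny `kmeans` routine is a stand-in for the clustering step, and `sample_task` is a hypothetical helper that builds an n-way, k-shot classification task from the resulting pseudo-labels.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Tiny k-means stand-in for the clustering step: pseudo-label unlabeled data."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Assign each point to its nearest center, then recompute the centers.
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def sample_task(X, labels, n_way, k_shot, rng):
    """Build an n_way, k_shot classification task from cluster pseudo-labels."""
    classes = rng.choice(np.unique(labels), n_way, replace=False)
    task = {}
    for new_label, c in enumerate(classes):
        members = X[labels == c]
        idx = rng.choice(len(members), k_shot, replace=False)
        task[new_label] = members[idx]
    return task

# Two well-separated synthetic "embedding" clusters.
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(0.0, 0.1, (10, 2)),
                    rng.normal(10.0, 0.1, (10, 2))])
labels = kmeans(X, 2)
task = sample_task(X, labels, n_way=2, k_shot=3, rng=rng)
```

A meta-learner such as MAML can then be trained on a stream of such constructed tasks in place of a hand-designed, labeled task distribution.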

Probabilistic Model-Agnostic Meta-Learning
Chelsea Finn*, Kelvin Xu*, Sergey Levine
Neural Information Processing Systems (NIPS), 2018
arXiv / supplementary website

Few-shot learning problems can be ambiguous. We propose a modification of the MAML algorithm that can handle ambiguity by sampling multiple plausible classifiers. Our approach uses a Bayesian formulation of meta-learning, building upon prior work on hierarchical Bayesian models and variational inference.

Learning to Learn with Gradients
Chelsea Finn
PhD Dissertation, 2018

We develop a clear and formal definition of the meta-learning problem, its terminology, and desirable properties of meta-learning algorithms. Building upon these foundations, we present a class of model-agnostic meta-learning methods that embed gradient-based optimization into the learner. Finally, we show how these methods can be extended for applications in motor control by combining elements of meta-learning with techniques for deep model-based reinforcement learning, imitation learning, and inverse reinforcement learning.

Few-Shot Goal Inference for Visuomotor Learning and Planning
Annie Xie, Avi Singh, Sergey Levine, Chelsea Finn
Conference on Robot Learning (CoRL), 2018
arXiv / videos / code

Specifying a reward or objective in the real world is hard. We propose a method that enables a robot to learn an objective from a few images of success by leveraging a dataset of positive and negative examples of previous tasks. We show how the objectives learned with our method can be used for both planning in the real world and reinforcement learning in simulation.

Robustness via Retrying: Closed-Loop Robotic Manipulation via Self-Supervised Learning
Frederik Ebert, Sudeep Dasari, Alex Lee, Sergey Levine, Chelsea Finn
Conference on Robot Learning (CoRL), 2018
arXiv / video

Planning with video prediction models trained on self-supervised data allows robots to learn diverse manipulation skills. However, to recover from disturbances and inaccurate predictions, we need to track pixels continuously to evaluate the planning objective at each timestep. We propose a self-supervised image-to-image registration model that enables robust behavior.

Universal Planning Networks
Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn
International Conference on Machine Learning (ICML), 2018
arXiv / videos / code

We propose to embed differentiable planning within a goal-directed policy, integrating planning and representation learning. Our approach optimizes for representations that lead to effective goal-based planning for visual tasks. Our results show that the learned representations not only allow for effective goal-based planning through imitation, but also transfer to more complex robot morphologies and action spaces.

One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning
Tianhe Yu*, Chelsea Finn*, Annie Xie, Sudeep Dasari, Pieter Abbeel, Sergey Levine
Robotics: Science and Systems (RSS), 2018
arXiv / video / code

We develop a domain-adaptive meta-learning method that allows for one-shot learning under domain shift. We show that our method can enable a robot to learn to maneuver a new object after seeing just one video of a human performing the task with that object.

Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm
Chelsea Finn, Sergey Levine
International Conference on Learning Representations (ICLR), 2018

We show that model-agnostic meta-learning (MAML), which embeds gradient descent into the meta-learning algorithm, can be as expressive as black-box meta-learners: both can approximate any learning algorithm. Furthermore, we empirically show that MAML consistently finds learning strategies that generalize to new tasks better than recurrent meta-learners.

Recasting Gradient-Based Meta-Learning as Hierarchical Bayes
Erin Grant, Chelsea Finn, Sergey Levine, Trevor Darrell, Tom Griffiths
International Conference on Learning Representations (ICLR), 2018

We reformulate the model-agnostic meta-learning algorithm (MAML) as a method for probabilistic inference in a hierarchical Bayesian model. Unlike prior methods for meta-learning via hierarchical Bayes, MAML is naturally applicable to large function approximators, like neural networks. Our interpretation sheds light on the meta-learning procedure and allows us to derive an improved version of the MAML algorithm.

Stochastic Variational Video Prediction
Mohammad Babaeizadeh, Chelsea Finn, Dumitru Erhan, Roy Campbell, Sergey Levine
International Conference on Learning Representations (ICLR), 2018
arXiv / code (coming soon) / video results

We present a stochastic video prediction method, SV2P, that builds upon the conditional variational autoencoder to make stochastic predictions of future video. We find that pretraining is crucial for enabling stochasticity. Our experiments demonstrate stochastic multi-frame predictions on three real world video datasets.

Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods
Deirdre Quillen*, Eric Jang*, Ofir Nachum*, Chelsea Finn, Julian Ibarz, Sergey Levine
International Conference on Robotics and Automation (ICRA), 2018
arXiv / project page / benchmark code

We propose a simulated benchmark for robotic grasping that emphasizes off-policy learning and generalization to unseen objects. Our results indicate that several simple methods are surprisingly strong competitors to popular deep RL algorithms such as double Q-learning, and our analysis sheds light on the relative tradeoffs between the methods.

One-Shot Visual Imitation Learning via Meta-Learning
Chelsea Finn*, Tianhe Yu*, Tianhao Zhang, Pieter Abbeel, Sergey Levine
Conference on Robot Learning (CoRL), 2017 (Long Talk)
Oral presentation at the NIPS 2017 Deep Reinforcement Learning Symposium
arXiv / code / result video / talk video

Using demonstration data from a variety of tasks, our method enables a real robot to learn a new related skill, trained end-to-end, using a single visual demonstration of the skill. Our approach also allows for the provided demonstration to be a raw video, without access to the joint trajectory or controls applied to the robot arm.

Self-Supervised Visual Planning with Temporal Skip Connections
Frederik Ebert, Chelsea Finn, Alex Lee, Sergey Levine
Conference on Robot Learning (CoRL), 2017 (Long Talk)
arXiv / code / video results and data

We present three simple improvements to our prior work on self-supervised visual foresight that lead to substantially better visual planning capabilities. Our method can perform tasks that require longer-term planning and involve multiple objects.

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn, Pieter Abbeel, Sergey Levine
International Conference on Machine Learning (ICML), 2017
arXiv / blog post / code / video results

We propose a model-agnostic algorithm for meta-learning, where a model's parameters are trained such that a small number of gradient updates with a small amount of training data from a new task will produce good generalization performance on that task. Our method learns a classifier that can recognize images of new characters using only a few examples, and a policy that can rapidly adapt its behavior in simulated locomotion tasks.
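As a toy illustration of the idea (not the paper's code), the sketch below applies a first-order variant of MAML to a family of 1-D linear-regression tasks y = a·x, where each task is defined by its slope a and the meta-learned initialization is the single scalar `theta`. All names here are invented for the example.

```python
import numpy as np

def grad(theta, x, y):
    # Gradient of the MSE loss 0.5 * mean((theta*x - y)^2) w.r.t. scalar theta.
    return np.mean((theta * x - y) * x)

def maml_step(theta, slopes, x, inner_lr=0.5, meta_lr=0.1):
    """One first-order meta-update over a batch of tasks (each task: y = a*x)."""
    meta_grad = 0.0
    for a in slopes:
        # Inner loop: adapt to the task with a single gradient step.
        adapted = theta - inner_lr * grad(theta, x, a * x)
        # Outer loop: accumulate the post-adaptation gradient (first-order MAML
        # ignores the second derivatives that full MAML backpropagates through).
        meta_grad += grad(adapted, x, a * x)
    return theta - meta_lr * meta_grad / len(slopes)

x = np.linspace(-1.0, 1.0, 5)   # shared few-shot inputs for each task
slopes = [1.0, 2.0, 3.0]        # meta-training task distribution
theta = 0.0
for _ in range(300):
    theta = maml_step(theta, slopes, x)
# theta is now an initialization from which one inner gradient step adapts
# quickly, even to a held-out task such as a = 5.
```

In this quadratic toy problem the meta-learned initialization converges to the mean of the task slopes; the same inner/outer-loop structure applies unchanged to neural network parameters.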

Generalizing Skills with Semi-Supervised Reinforcement Learning
Chelsea Finn, Tianhe Yu, Justin Fu, Pieter Abbeel, Sergey Levine
International Conference on Learning Representations (ICLR), 2017
arXiv / video results / code

We formalize the problem of semi-supervised reinforcement learning (SSRL), motivated by real-world settings where reward information is available only in limited circumstances, such as when a human supervisor is present or in a controlled laboratory setting. We develop a simple algorithm for SSRL based on inverse reinforcement learning and show that it can improve performance by using 'unlabeled' experience.

Deep Visual Foresight for Planning Robot Motion
Chelsea Finn, Sergey Levine
International Conference on Robotics and Automation (ICRA), 2017
Best Cognitive Robotics Paper Finalist
arXiv / video

We combine an action-conditioned predictive model of images, "visual foresight," with model-predictive control for planning how to push objects. The method is entirely self-supervised, requiring minimal human involvement.
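The planning loop can be sketched as random-shooting model-predictive control. This is a minimal illustration under invented names: the 1-D point-mass `dynamics` is a hypothetical stand-in for the learned video-prediction model, and the cost is distance to a goal rather than a pixel-space objective.

```python
import numpy as np

def plan(state, goal, dynamics, rng, horizon=5, n_candidates=256):
    """Random-shooting MPC: sample action sequences, score each rollout with
    the predictive model, and return the first action of the best sequence."""
    candidates = rng.uniform(-1.0, 1.0, (n_candidates, horizon))
    costs = np.empty(n_candidates)
    for i, seq in enumerate(candidates):
        s, cost = state, 0.0
        for a in seq:
            s = dynamics(s, a)
            cost += abs(s - goal)   # running cost: stay close to the goal
        costs[i] = cost
    return candidates[np.argmin(costs)][0]

# Toy stand-in for the learned predictive model: a 1-D point mass.
dynamics = lambda s, a: s + a
rng = np.random.default_rng(0)
state, goal = 0.0, 3.0
for _ in range(15):                 # closed loop: replan at every step
    state = dynamics(state, plan(state, goal, dynamics, rng))
```

Because only the first action of the best plan is executed before replanning, the controller can recover from model errors, which is the same property that makes the approach self-supervised and robust in the robot setting.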

Reset-Free Guided Policy Search: Efficient Deep Reinforcement Learning with Stochastic Initial States
William Montgomery*, Anurag Ajay*, Chelsea Finn, Pieter Abbeel, Sergey Levine
International Conference on Robotics and Automation (ICRA), 2017
arXiv / video / code

We present a new guided policy search algorithm that can be used in domains where the initial conditions are stochastic, making the method more applicable to general reinforcement learning problems and improving generalization performance in our robotic manipulation experiments.


A Connection Between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models
Chelsea Finn*, Paul Christiano*, Pieter Abbeel, Sergey Levine
NIPS Workshop on Adversarial Training, 2016

We show that a sample-based algorithm for maximum entropy inverse reinforcement learning (MaxEnt IRL) corresponds to a generative adversarial network (GAN) with a particular choice of discriminator. Since MaxEnt IRL is simply an energy-based model (EBM) for behavior, we further show that GANs optimize EBMs with the corresponding discriminator, pointing to a simple and scalable EBM training procedure using GANs.
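Concretely, the correspondence uses a discriminator of the particular form

    D(τ) = [exp(−c(τ)) / Z] / ( [exp(−c(τ)) / Z] + π(τ) ),

where c is the learned cost, Z is an estimate of the partition function, and π is the density of the sampling (generator) policy. Training D with the standard GAN objective then corresponds to optimizing the MaxEnt IRL objective for c, while the generator update trains π toward the energy-based model exp(−c(τ))/Z.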


Active One-Shot Learning
Mark Woodward, Chelsea Finn
NIPS Deep Reinforcement Learning Workshop, 2016
arXiv / video description / poster

We propose a technique for learning an active learning strategy by combining one-shot learning and reinforcement learning, and allowing the model to decide, during classification, which examples are worth labeling. Our experiments demonstrate that our model can trade off accuracy and label requests based on the reward function provided.

Unsupervised Learning for Physical Interaction through Video Prediction
Chelsea Finn, Ian Goodfellow, Sergey Levine
Neural Information Processing Systems (NIPS), 2016
arXiv / videos / data / code

Our video prediction method predicts a transformation to apply to the previous image, rather than pixel values directly, leading to significantly improved multi-frame video prediction. We also introduce a dataset of 50,000 robotic pushing sequences, consisting of over 1 million frames.
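The core idea of predicting a transformation of the previous frame, rather than raw pixels, can be sketched as follows. This is an illustration with a single fixed kernel under invented names; in the method itself, the network outputs the transformation parameters at every time step.

```python
import numpy as np

def transform_frame(prev_frame, kernel):
    """Form the next frame by applying a normalized transformation kernel to
    the previous frame, instead of regressing pixel values directly."""
    k = kernel / kernel.sum()                      # conserve total pixel mass
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(prev_frame, ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros_like(prev_frame, dtype=float)
    H, W = prev_frame.shape
    for i in range(H):
        for j in range(W):
            # Each output pixel is a convex combination of nearby input pixels.
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * k)
    return out

frame = np.zeros((4, 4))
frame[1, 1] = 1.0                                  # a single bright pixel
shift_right = np.zeros((3, 3))
shift_right[1, 0] = 1.0                            # kernel that moves content one pixel right
nxt = transform_frame(frame, shift_right)
```

Because the output is a reweighting of existing pixels, predicted motion moves image content around instead of hallucinating it, which is what makes the multi-frame predictions more stable.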

Adapting Deep Visuomotor Representations with Weak Pairwise Constraints
Eric Tzeng, Coline Devin, Judy Hoffman, Chelsea Finn, Pieter Abbeel, Sergey Levine, Kate Saenko, Trevor Darrell
Workshop on the Algorithmic Foundations of Robotics (WAFR), 2016

Collecting real-world robotic experience for learning an initial visual representation can be expensive. Instead, we show that it is possible to learn a suitably good initial representation using data collected largely in simulation.

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization
Chelsea Finn, Sergey Levine, Pieter Abbeel
International Conference on Machine Learning (ICML), 2016
Oral presentation at the NIPS 2016 Deep Learning Symposium
arXiv / video results / code / talk video

We propose a method for Inverse Reinforcement Learning (IRL) that can handle unknown dynamics and scale to flexible, nonlinear cost functions. We evaluate our algorithm on a series of simulated tasks and real-world robotic manipulation problems, including pouring and inserting dishes into a rack.

End-to-End Training of Deep Visuomotor Policies
Sergey Levine*, Chelsea Finn*, Trevor Darrell, Pieter Abbeel
CCC Blue Sky Ideas Award
Journal of Machine Learning Research (JMLR), 2016
arXiv / video / project page / code

We demonstrate a deep neural network trained end-to-end, from perception to controls, for robotic manipulation tasks.

Deep Spatial Autoencoders for Visuomotor Learning
Chelsea Finn, Xin Yu Tan, Yan Duan, Trevor Darrell, Sergey Levine, Pieter Abbeel
International Conference on Robotics and Automation (ICRA), 2016
arXiv / video

We learn a lower-dimensional visual state space without supervision using deep spatial autoencoders, and use it to learn nonprehensile manipulation tasks, such as pushing a Lego block and scooping a bag into a bowl.


Learning Deep Neural Network Policies with Continuous Memory States
Marvin Zhang, Zoe McCarthy, Chelsea Finn, Sergey Levine, Pieter Abbeel
International Conference on Robotics and Automation (ICRA), 2016
arXiv / video

We propose a method for learning recurrent neural network policies using continuous memory states. The method learns to store information in and use the memory states using trajectory optimization. Our method outperforms vanilla RNN and LSTM baselines.

Bridging Text Spotting and SLAM with Junction Features
Hsueh-Cheng Wang, Chelsea Finn, Liam Paull, Michael Kaess, Ruth Rosenholtz, Seth Teller, John Leonard
International Conference on Intelligent Robots and Systems (IROS), 2015

We develop a method that integrates text spotting with simultaneous localization and mapping (SLAM), determining loop closures using text in the environment.


Beyond Lowest-Warping Cost Action Selection in Trajectory Transfer
Dylan Hadfield-Menell, Alex X. Lee, Chelsea Finn, Eric Tzeng, Sandy Huang, Pieter Abbeel
International Conference on Robotics and Automation (ICRA), 2015

We consider the problem of selecting which demonstration to transfer to the current test scenario. We frame the problem as an options Markov decision process (MDP) and develop an approach to learn a Q-function from expert demonstrations. Our results show significant improvement over nearest-neighbor selection.


CS294-112: Deep Reinforcement Learning - Spring 2017
Co-Instructor

CS188: Introduction to Artificial Intelligence - Spring 2015
Graduate Student Instructor (GSI)

6.S080: Introduction to Inference - Spring 2014
Teaching Assistant (TA)

6.141: Robotics: Science and Systems - Spring 2013
Lab Assistant (LA)

6.02: Digital Communication Systems - Spring 2012
Lab Assistant (LA)

This guy makes a nice webpage.