News

Invited Talks and Lectures
 At NIPS 2017, I gave an invited talk in the Workshop on Meta-Learning (slides here) and contributed talks at the Deep Reinforcement Learning Symposium and the workshop on Deep Learning: Bridging Theory and Practice.
 In August 2017, I gave guest lectures on model-based reinforcement learning and inverse reinforcement learning at the Deep RL Bootcamp (slides here and here, videos here and here).
 At ICML 2017, I gave a tutorial with Sergey Levine on Deep Reinforcement Learning, Decision Making, and Control (slides here, video here).
 At ICML 2017, I gave an invited talk in the Reinforcement Learning Workshop (slides here) and a contributed talk in the Lifelong Learning Workshop (slides here).
 At RSS 2017, I gave an invited talk at the workshop on New Frontiers for Deep Learning in Robotics (slides here).
 In May 2017, I gave a talk at the Symposium on Robot Learning at UC Berkeley (slides here, video here).
 In March 2017, I gave a talk on the guided policy search codebase at the Open Source Software for Decision Making Workshop at Stanford (video here).
 In January 2017, I gave a talk at the Rework Deep Learning Summit in SF (video here).
 At NIPS 2016, I gave invited talks in the Deep Learning Symposium, Deep RL Workshop, and Neurorobotics Workshop. I will also be giving a contributed talk in the Intuitive Physics Workshop.

Research
I am interested in how learning algorithms can enable machines to acquire common sense, allowing them to autonomously learn a variety of complex sensorimotor skills in real-world settings. This includes learning deep representations of complex skills from raw sensory inputs, enabling machines to learn on their own without human supervision, and allowing systems to build upon what they've learned previously to quickly acquire new skills with a small amount of experience.

Recent Preprints

Unsupervised Meta-Learning for Reinforcement Learning
Abhishek Gupta,
Ben Eysenbach,
Chelsea Finn,
Sergey Levine
arXiv
While meta-learning enables fast learning of new tasks, it requires a human to specify a distribution over tasks for meta-training. In effect, meta-learning offloads the design burden from algorithm design to task design. We propose to automate the design of tasks for meta-learning, describing a family of unsupervised meta-reinforcement learning algorithms that are truly automated.


Probabilistic Model-Agnostic Meta-Learning
Chelsea Finn*,
Kelvin Xu*,
Sergey Levine
arXiv / supplementary website
Few-shot learning problems can be ambiguous. We propose a modification of the MAML algorithm that can handle ambiguity by sampling multiple different classifiers. Our approach uses a Bayesian formulation of meta-learning, building upon prior work on hierarchical Bayesian models and variational inference.


Learning a Prior over Intent via Meta-Inverse Reinforcement Learning
Kelvin Xu,
Ellis Ratner,
Anca Dragan,
Sergey Levine,
Chelsea Finn
arXiv
Learning the objective underlying example behavior is a challenging, under-defined problem, particularly when only a few demonstrations are available. However, there is structure among the types of behaviors that we might want agents to learn. We learn this structure from demonstrations across many tasks, acquiring a prior over intentions, and use this learned prior to infer reward functions for new tasks from only a few demonstrations.


Learning to Adapt: Meta-Learning for Model-Based Control
Ignasi Clavera*,
Anusha Nagabandi*,
Ron Fearing,
Pieter Abbeel,
Sergey Levine,
Chelsea Finn
arXiv / videos / code (coming soon)
We propose a method that learns how to adapt online to new situations and perturbations through meta-reinforcement learning. Unlike prior meta-RL methods, our approach is model-based, making it sample-efficient during meta-training and thus practical for real-world problems.


Stochastic Adversarial Video Prediction
Alex Lee,
Richard Zhang,
Frederik Ebert,
Pieter Abbeel,
Chelsea Finn,
Sergey Levine
arXiv / videos / code
We combine latent variable models with adversarial training to build a video prediction model that produces predictions that look more realistic to human raters and better cover the range of possible futures.

All Papers

Universal Planning Networks
Aravind Srinivas,
Allan Jabri,
Pieter Abbeel,
Sergey Levine,
Chelsea Finn
International Conference on Machine Learning (ICML), 2018
arXiv / videos / code (coming soon)
We propose to embed differentiable planning within a goal-directed policy, integrating planning and representation learning. Our approach optimizes for representations that lead to effective goal-based planning for visual tasks. Our results show that the representation not only allows for effective goal-based planning through imitation, but also transfers to more complex robot morphologies and action spaces.


One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning
Tianhe Yu*,
Chelsea Finn*,
Annie Xie, Sudeep Dasari,
Pieter Abbeel,
Sergey Levine
Robotics: Science and Systems (RSS), 2018
arXiv / video
We develop a domain-adaptive meta-learning method that allows for one-shot learning under domain shift. We show that our method can enable a robot to learn to maneuver a new object after seeing just one video of a human performing the task with that object.


Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm
Chelsea Finn,
Sergey Levine
International Conference on Learning Representations (ICLR), 2018
arXiv
We show that model-agnostic meta-learning (MAML), which embeds gradient descent into the meta-learning algorithm, can be as expressive as black-box meta-learners: both can approximate any learning algorithm.
Furthermore, we empirically show that MAML consistently finds learning strategies that generalize to new tasks better than recurrent meta-learners.


Recasting Gradient-Based Meta-Learning as Hierarchical Bayes
Erin Grant,
Chelsea Finn,
Sergey Levine,
Trevor Darrell,
Tom Griffiths
International Conference on Learning Representations (ICLR), 2018
arXiv
We reformulate the model-agnostic meta-learning algorithm (MAML) as a method for probabilistic inference in a hierarchical Bayesian model.
Unlike prior methods for meta-learning via hierarchical Bayes, MAML is naturally applicable to large function approximators, like neural networks.
Our interpretation sheds light on the meta-learning procedure and allows us to derive an improved version of the MAML algorithm.


Stochastic Variational Video Prediction
Mohammad Babaeizadeh,
Chelsea Finn,
Dumitru Erhan,
Roy Campbell,
Sergey Levine
International Conference on Learning Representations (ICLR), 2018
arXiv / code (coming soon) / video results
We present a stochastic video prediction method, SV2P, that builds upon the conditional variational autoencoder to make stochastic predictions of future video.
We find that pretraining is crucial for enabling stochasticity. Our experiments demonstrate stochastic multi-frame predictions on three real-world video datasets.


Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods
Deirdre Quillen*, Eric Jang*,
Ofir Nachum*,
Chelsea Finn,
Julian Ibarz,
Sergey Levine
International Conference on Robotics and Automation (ICRA), 2018
arXiv / project page / benchmark code
We propose a simulated benchmark for robotic grasping that emphasizes off-policy learning and generalization to unseen objects.
Our results indicate that several simple methods are surprisingly strong competitors to popular deep RL algorithms such as double Q-learning, and our analysis sheds light on the relative tradeoffs between the methods.


One-Shot Visual Imitation Learning via Meta-Learning
Chelsea Finn*,
Tianhe Yu*,
Tianhao Zhang,
Pieter Abbeel,
Sergey Levine
Conference on Robot Learning (CoRL), 2017 (Long Talk)
Oral presentation at the NIPS 2017 Deep Reinforcement Learning Symposium
arXiv / code / result video / talk video
Using demonstration data from a variety of tasks, our method enables a real robot to learn a new related skill, trained end-to-end, using a single visual demonstration of the skill. Our approach also allows for the provided demonstration to be a raw video, without access to the joint trajectory or controls applied to the robot arm.


Self-Supervised Visual Planning with Temporal Skip Connections
Frederik Ebert, Chelsea Finn, Alex Lee,
Sergey Levine
Conference on Robot Learning (CoRL), 2017 (Long Talk)
arXiv / code / video results and data
We present three simple improvements to our prior work on self-supervised visual foresight that lead to substantially better visual planning capabilities. Our method can perform tasks that require longer-term planning and involve multiple objects.


Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn,
Pieter Abbeel,
Sergey Levine
International Conference on Machine Learning (ICML), 2017
arXiv / blog post / code / video results
We propose a model-agnostic algorithm for meta-learning, where a model's parameters are trained such that a small number of gradient updates with a small amount of training data from a new task will produce good generalization performance on that task. Our method learns a classifier that can recognize images of new characters using only a few examples, and a policy that can rapidly adapt its behavior in simulated locomotion tasks.


Generalizing Skills with Semi-Supervised Reinforcement Learning
Chelsea Finn,
Tianhe Yu,
Justin Fu,
Pieter Abbeel,
Sergey Levine
International Conference on Learning Representations (ICLR), 2017
arXiv / video results / code
We formalize the problem of semi-supervised reinforcement learning (SSRL), motivated by real-world scenarios where reward information is only available in a limited set of scenarios, such as when a human supervisor is present or in a controlled laboratory setting. We develop a simple algorithm for SSRL based on inverse reinforcement learning and show that it can improve performance by using 'unlabeled' experience.


Deep Visual Foresight for Planning Robot Motion
Chelsea Finn, Sergey Levine
International Conference on Robotics and Automation (ICRA), 2017
Best Cognitive Robotics Paper Finalist
arXiv / video
We combine an action-conditioned predictive model of images, "visual foresight," with model-predictive control for planning how to push objects. The method is entirely self-supervised, requiring minimal human involvement.


Reset-Free Guided Policy Search: Efficient Deep Reinforcement Learning with Stochastic Initial States
William Montgomery*,
Anurag Ajay*,
Chelsea Finn,
Pieter Abbeel,
Sergey Levine
International Conference on Robotics and Automation (ICRA), 2017
arXiv / video / code
We present a new guided policy search algorithm that allows the method to be used in domains where the initial conditions are stochastic, which makes the method
more applicable to general reinforcement learning problems and improves generalization performance in our robotic manipulation experiments.


A Connection Between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models
Chelsea Finn*, Paul Christiano*,
Pieter Abbeel,
Sergey Levine
NIPS Workshop on Adversarial Training, 2016
arXiv
We show that a sample-based algorithm for maximum entropy inverse reinforcement learning (MaxEnt IRL) corresponds to a generative adversarial network (GAN) with a particular choice of discriminator.
Since MaxEnt IRL is simply an energy-based model (EBM) for behavior, we further show that GANs optimize EBMs with the corresponding discriminator, pointing to a simple and scalable EBM training procedure using GANs.


Active One-Shot Learning
Mark Woodward, Chelsea Finn
NIPS Deep Reinforcement Learning Workshop, 2016
arXiv / video description / poster
We propose a technique for learning an active learning strategy by combining one-shot learning and reinforcement learning, and allowing the model to decide, during classification, which examples are worth labeling. Our experiments demonstrate that our model can trade off accuracy and label requests based on the reward function provided.


Unsupervised Learning for Physical Interaction through Video Prediction
Chelsea Finn, Ian Goodfellow, Sergey Levine
Neural Information Processing Systems (NIPS), 2016
arXiv / videos / data / code
Our video prediction method predicts a transformation to apply to the previous image, rather than pixel values directly, leading to significantly improved multi-frame video prediction. We also introduce a dataset of 50,000 robotic pushing sequences, consisting of over 1 million frames.


Adapting Deep Visuomotor Representations with Weak Pairwise Constraints
Eric Tzeng,
Coline Devin,
Judy Hoffman,
Chelsea Finn,
Pieter Abbeel,
Sergey Levine,
Kate Saenko,
Trevor Darrell
Workshop on the Algorithmic Foundations of Robotics (WAFR), 2016
arXiv
Collecting real-world robotic experience for learning an initial visual representation can be expensive. Instead, we show that it is possible to learn a suitably good initial representation using data collected largely in simulation.


Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization
Chelsea Finn, Sergey Levine, Pieter Abbeel
International Conference on Machine Learning (ICML), 2016
Oral presentation at the NIPS 2016 Deep Learning Symposium
arXiv / video results / code / talk video
We propose a method for Inverse Reinforcement Learning (IRL) that can handle unknown dynamics and scale to flexible, nonlinear cost functions. We evaluate our algorithm on a series of simulated tasks and real-world robotic manipulation problems, including pouring and inserting dishes into a rack.


End-to-End Training of Deep Visuomotor Policies
Sergey Levine*,
Chelsea Finn*, Trevor Darrell,
Pieter Abbeel
CCC Blue Sky Ideas Award
Journal of Machine Learning Research (JMLR), 2016
arXiv / video / project page / code
We demonstrate a deep neural network trained end-to-end, from perception to controls, for robotic manipulation tasks.


Deep Spatial Autoencoders for Visuomotor Learning
Chelsea Finn, Xin Yu Tan,
Yan Duan, Trevor Darrell, Sergey Levine,
Pieter Abbeel
International Conference on Robotics and Automation (ICRA), 2016
arXiv / video
We learn a lower-dimensional visual state space without supervision using deep spatial autoencoders, and use it to learn nonprehensile manipulation tasks, such as pushing a LEGO block and scooping a bag into a bowl.


Learning Deep Neural Network Policies with Continuous Memory States
Marvin Zhang, Zoe McCarthy,
Chelsea Finn, Sergey Levine,
Pieter Abbeel
International Conference on Robotics and Automation (ICRA), 2016
arXiv / video
We propose a method for learning recurrent neural network policies using continuous memory states. The method learns to store information in and use the memory states
using trajectory optimization. Our method outperforms vanilla RNN and LSTM baselines.


Bridging text spotting and SLAM with junction features.
Hsueh-Cheng Wang,
Chelsea Finn,
Liam Paull,
Michael Kaess,
Ruth Rosenholtz,
Seth Teller,
John Leonard
International Conference on Intelligent Robots and Systems (IROS), 2015
We develop a method that integrates text spotting with simultaneous localization and mapping (SLAM) and determines loop closures using text in the environment.


Beyond Lowest-Warping Cost Action Selection in Trajectory Transfer
Dylan Hadfield-Menell,
Alex X. Lee,
Chelsea Finn,
Eric Tzeng,
Sandy Huang,
Pieter Abbeel
International Conference on Robotics and Automation (ICRA), 2015
We consider the problem of selecting which demonstration to transfer to the current test scenario.
We frame the problem as an options Markov decision process (MDP) and develop an approach to learn a Q-function from expert demonstrations.
Our results show significant improvement over nearest-neighbor selection.

