Nick Rhinehart, Postdoc, UC Berkeley


Nick Rhinehart

Postdoctoral Researcher
Berkeley Artificial Intelligence Research Laboratory
University of California, Berkeley

Email: nrhinehart@berkeley.edu
CV | Bio | Google Scholar | Twitter | Github

About me

Welcome to my academic website! I'm a Postdoctoral Scholar working with Sergey Levine and others within the UC Berkeley Artificial Intelligence Research lab. I received a Ph.D. in Robotics working with Kris Kitani at Carnegie Mellon University. I've also worked with Paul Vernaza at NEC Labs America, and Drew Bagnell at Uber ATG and Carnegie Mellon. I studied CS and Engineering at Swarthmore College. See this page for a more formal bio.


My research: One of my main goals is to create useful and general learning agents that make complex decisions by forecasting the long-term consequences of their actions. Towards this goal, I often work on reinforcement learning, imitation learning, and probabilistic modeling methods at the interface of machine learning and computer vision.

News (Last modified: 2021-07-23)
Conference, Journal, and arXiv Publications (Last modified: 2021-07-23)
Explore and Control with Adversarial Surprise

A. Fickinger*, N. Jaques*, S. Parajuli, M. Chang, N. Rhinehart, G. Berseth, S. Russell, S. Levine

arXiv 2021 | pdf | project page

We propose an unsupervised RL technique based on an adversarial game that pits two policies against each other to compete over the amount of surprise an RL agent experiences. The method leads to the emergence of complex skills, exhibiting clear phase transitions, and we show theoretically and empirically that it can be applied to the exploration of stochastic, partially observed environments.
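
A minimal sketch of the adversarial game in Python. All interfaces here (obs_density, explore_policy, control_policy, and the gym-style env) are illustrative assumptions, not the paper's implementation:

def play_episode(env, obs_density, explore_policy, control_policy, turn_len=32):
    # Two policies alternate control and compete over the agent's surprise,
    # measured as negative log-likelihood of observations under a learned density.
    obs, done = env.reset(), False
    acting = explore_policy                      # Explore moves first
    while not done:
        for _ in range(turn_len):
            obs, _, done, _ = env.step(acting.act(obs))
            surprise = -obs_density.log_prob(obs)
            # zero-sum: Explore is rewarded for high surprise, Control for low
            sign = 1.0 if acting is explore_policy else -1.0
            acting.store_reward(sign * surprise)
            obs_density.update(obs)              # density tracks what has been seen
            if done:
                break
        acting = control_policy if acting is explore_policy else explore_policy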



RECON: Rapid Exploration for Open-World Navigation with Latent Goal Models

D. Shah, B. Eysenbach, N. Rhinehart, S. Levine

arXiv 2021 | pdf | project page

We developed a learning-based robotic system that efficiently explores large open-world environments without constructing geometric maps. The key is a latent goal model that forecasts actions and transit times to goals, is robust to variations in the input images, and enables 'imagining' relative goals. The latent goal model is used to continually construct topological maps that the robot can use to quickly travel to specified goals.
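
As a rough illustration of how such a model can 'imagine' goals, here is a hedged Python sketch; goal_model and its predict_time interface are assumptions for exposition, not the actual RECON code:

import torch

def propose_frontier_goal(goal_model, obs, horizon=20.0, n_samples=64):
    # Sample latent goals from the prior: each z is an 'imagined' relative goal.
    z = torch.randn(n_samples, goal_model.z_dim)
    t = goal_model.predict_time(obs, z)          # forecast transit time to each goal
    reachable = t < horizon
    if not reachable.any():                      # nothing looks reachable; aim nearby
        return z[t.argmin()]
    # Among plausibly reachable imagined goals, head for the farthest one.
    return z[reachable][t[reachable].argmax()]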



[Image: decision tree representing the CfO method]
Contingencies from Observations: Tractable Contingency Planning with Learned Behavior Models

N. Rhinehart*, J. He*, C. Packer, M. A. Wright, R. McAllister, J. E. Gonzalez, S. Levine

ICRA 2021 | pdf | project page

We developed an approach for deep contingency planning by learning from observations. Given a context, the approach plans a policy that achieves high expected return under the uncertainty of the forecasted behavior of other agents. We evaluate our method's closed-loop performance in common driving scenarios constructed in the CARLA simulator, show that our contingency planner solves these scenarios, and show that noncontingent planning approaches cannot.
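
In sketch form, contingency planning scores a reactive policy (rather than one fixed trajectory) by its average return across sampled futures of the other agents. All names below, including behavior_model and rollout_return, are illustrative assumptions:

def expected_return(policy, behavior_model, rollout_return, context, n_futures=32):
    # Average the policy's return across many forecasted behaviors of other agents.
    futures = [behavior_model.sample(context) for _ in range(n_futures)]
    return sum(rollout_return(policy, f, context) for f in futures) / n_futures

def plan_contingent(candidate_policies, behavior_model, rollout_return, context):
    # Pick the policy with the best expected return under forecast uncertainty.
    return max(candidate_policies,
               key=lambda pi: expected_return(pi, behavior_model,
                                              rollout_return, context))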



ViNG: Learning Open-World Navigation with Visual Goals

D. Shah, B. Eysenbach, G. Kahn, N. Rhinehart, S. Levine

ICRA 2021 | pdf | project page

We developed a graph-based RL approach to enable a robot to navigate real-world environments given diverse, visually-indicated goals. We instantiate our method on a real outdoor ground robot and show that our system, which we call ViNG, outperforms previously-proposed methods for goal-conditioned reinforcement learning.
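
A hedged sketch of the graph-based mechanism, using networkx and an assumed learned distance predictor dist_fn between images (illustrative, not the ViNG code):

import networkx as nx

def build_topological_graph(images, dist_fn, max_dist=5.0):
    # Connect image i to image j when the learned model predicts j is
    # reachable from i within max_dist steps; the prediction is the edge cost.
    g = nx.DiGraph()
    for i, a in enumerate(images):
        for j, b in enumerate(images):
            if i != j:
                d = dist_fn(a, b)
                if d < max_dist:
                    g.add_edge(i, j, weight=d)
    return g

def next_subgoal(g, current, goal):
    # Follow the shortest predicted path of subgoal images toward the goal.
    path = nx.shortest_path(g, current, goal, weight="weight")
    return path[1] if len(path) > 1 else goal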



Parrot: Data-Driven Behavioral Priors for Reinforcement Learning

A. Singh*, H. Liu*, G. Zhou, A. Yu, N. Rhinehart, S. Levine

Oral Presentation (1.8% of submissions)
ICLR 2021 | pdf | project page

Whereas RL agents usually explore randomly when faced with a new task, humans tend to explore with structured behavior. We demonstrate a method for learning a behavioral prior that can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
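
A minimal sketch of acting through a learned behavioral prior. Here prior_flow and task_policy are assumed interfaces; in the paper, the prior is an observation-conditioned invertible mapping trained on diverse demonstration data:

def act_through_prior(prior_flow, task_policy, obs):
    # The RL policy outputs a latent z; the prior maps (z, obs) to an action
    # that resembles useful behavior from the prior's training data.
    z = task_policy(obs)
    return prior_flow(z, obs)

Because the mapping is invertible for every observation, some latent reaches any desired action, so the prior structures exploration without limiting what the agent can eventually learn.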



SMiRL: Surprise Minimizing RL in Dynamic Environments

G. Berseth, D. Geng, C. Devin, N. Rhinehart, C. Finn, D. Jayaraman, S. Levine

Oral Presentation (1.8% of submissions)
ICLR 2021 | pdf | project page

We propose that a search for order amidst chaos might offer a unifying principle for the emergence of useful behaviors in artificial agents, and formalize this idea into an unsupervised reinforcement learning method called surprise minimizing RL (SMiRL). The resulting agents acquire several proactive behaviors for seeking and maintaining stable states, including successfully playing Tetris and Doom and controlling a humanoid to avoid falls, all without any task-specific reward supervision.
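
A minimal sketch of the surprise-minimizing reward, assuming a simple independent-Gaussian density over visited states (illustrative; richer density models can be substituted):

import numpy as np

class GaussianStateDensity:
    # Running Gaussian estimate (Welford's algorithm) of visited states.
    def __init__(self, dim):
        self.mean, self.var, self.n = np.zeros(dim), np.ones(dim), 0

    def update(self, s):
        self.n += 1
        delta = s - self.mean
        self.mean += delta / self.n
        self.var += (delta * (s - self.mean) - self.var) / self.n

    def log_prob(self, s):
        var = np.maximum(self.var, 1e-6)
        return float(np.sum(-0.5 * ((s - self.mean) ** 2 / var
                                    + np.log(2.0 * np.pi * var))))

def smirl_reward(density, state):
    # Reward the agent for reaching states its own history makes likely.
    r = density.log_prob(state)
    density.update(state)
    return r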



Conservative Safety Critics for Safe Exploration

H. Bharadhwaj, A. Kumar, N. Rhinehart, S. Levine, F. Shkurti, A. Garg

ICLR 2021 | pdf | project page

The key idea of our algorithm is to train a conservative safety critic that overestimates how unsafe a particular state is (i.e., overestimates the probability of failure), and to modify the exploration strategy to account for this conservative estimate. Empirically, we show that the proposed approach achieves competitive performance on challenging navigation, manipulation, and locomotion tasks while incurring significantly lower catastrophic-failure rates during training than prior methods.
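
A hedged sketch of safety-gated exploration. Here q_fail stands in for the learned conservative critic, and the candidate-sampling scheme is an illustrative simplification:

import numpy as np

def safe_action(policy, q_fail, state, eps, n_candidates=32):
    # Sample exploratory actions, but only keep those whose (over)estimated
    # failure probability is below the safety threshold eps.
    candidates = [policy.sample(state) for _ in range(n_candidates)]
    risks = np.array([q_fail(state, a) for a in candidates])
    allowed = [a for a, r in zip(candidates, risks) if r <= eps]
    if allowed:
        return allowed[0]
    # If nothing clears the threshold, fall back to the least risky candidate.
    return candidates[int(risks.argmin())]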



Inverting the Pose Forecasting Pipeline with SPF2: Sequential Pointcloud Forecasting for Sequential Pose Forecasting

X. Weng, J. Wang, S. Levine, K. Kitani, N. Rhinehart

CoRL 2020 | pdf | project page

Instead of the standard trajectory-forecasting pipeline that first (1) detects objects with LiDAR and then (2) forecasts object pose trajectories, we "inverted" it to create a new pipeline that (1) forecasts future LiDAR sweeps and then (2) detects object pose trajectories in the forecasts. We found that our proposed pipeline is competitive with the standard one on vehicle forecasting and robotic manipulation forecasting, and that its performance scales with the addition of unlabeled LiDAR data.
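
The inversion is easiest to see as two swapped stages. This is a schematic sketch; pointcloud_forecaster, trajectory_forecaster, and detector are assumed interfaces:

def standard_pipeline(detector, trajectory_forecaster, past_sweeps):
    # Detect first, then forecast: training requires labeled trajectories.
    past_poses = [detector(sweep) for sweep in past_sweeps]
    return trajectory_forecaster(past_poses)

def inverted_pipeline(pointcloud_forecaster, detector, past_sweeps, horizon):
    # Forecast raw LiDAR first, then detect: the forecaster trains on
    # unlabeled sweeps alone, so it can scale with unannotated data.
    future_sweeps = pointcloud_forecaster(past_sweeps, horizon)
    return [detector(sweep) for sweep in future_sweeps]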



Can Autonomous Vehicles Identify, Recover from, and Adapt to Distribution Shifts?

A. Filos*, P. Tigas*, R. McAllister, N. Rhinehart, S. Levine, Y. Gal

ICML 2020 | pdf | code | blog post | project page

We used recent techniques to estimate the epistemic uncertainty of a Deep Imitative Model used for planning vehicle trajectories and found that we could use this epistemic uncertainty to reliably detect out-of-distribution situations, plan more effectively in them, and adapt the model online with expert feedback.
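
A minimal sketch of the ensemble-based uncertainty signal (illustrative; each ensemble member is assumed to expose a log_prob over candidate plans):

import torch

def epistemic_uncertainty(ensemble, plan, context):
    # Disagreement between members' likelihoods flags out-of-distribution scenes.
    logps = torch.stack([m.log_prob(plan, context) for m in ensemble])
    return logps.var()

def robust_score(ensemble, plan, context):
    # Score a candidate plan by its worst-case ensemble member.
    logps = torch.stack([m.log_prob(plan, context) for m in ensemble])
    return logps.min()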



Generative Hybrid Representations for Activity Forecasting with No-Regret Learning

J. Guan, Y. Yuan, K. M. Kitani, N. Rhinehart

Oral Presentation (4.6% of submissions)
CVPR 2020 | pdf | supp

Some activities are best represented discretely, others continuously. We learn a deep likelihood-based generative model to jointly forecast discrete and continuous activities, and show how to tweak the model to learn efficiently online.



Deep Imitative Models for Flexible Inference, Planning, and Control

N. Rhinehart, R. McAllister, S. Levine

ICLR 2020 | pdf | code (tf, official) | code (pytorch, reimplementation) | project page | talk video

We learn a deep conditional distribution of human driving behavior to guide planning and control of an autonomous car in simulation, without any trial-and-error data. We show that the approach can be adapted to execute tasks that were never demonstrated, including safely avoiding potholes, that it is robust to misspecified goals that would otherwise cause it to violate its model of the rules of the road, and that it achieves state-of-the-art performance on the CARLA benchmark.
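
A hedged sketch of the imitative planning objective: choose the plan that is both likely under the learned model of expert driving and compatible with the goal. The interfaces q_model and goal_logprob are assumptions for exposition, and direct gradient ascent on the plan is a simplification:

import torch

def imitative_plan(q_model, goal_logprob, context,
                   horizon=20, dim=2, n_iters=100, lr=0.1):
    plan = torch.zeros(horizon, dim, requires_grad=True)
    opt = torch.optim.Adam([plan], lr=lr)
    for _ in range(n_iters):
        # Maximize: log q(plan | context) + log p(goal | plan)
        loss = -(q_model.log_prob(plan, context) + goal_logprob(plan))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return plan.detach()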



PRECOG: PREdiction Conditioned On Goals in Visual Multi-Agent Settings

N. Rhinehart, R. McAllister, K. M. Kitani, S. Levine

Best Paper, ICML 2019 Workshop on AI for Autonomous Driving
ICCV 2019 | pdf | project page | code | visualization code | iccv pdf | iccv talk slides (pdf) | Baylearn talk (youtube)

We perform deep conditional forecasting with multiple interacting agents: when you control one of them, you can use its goals to better predict what nearby agents will do. The model also outperforms state-of-the-art methods on the more standard task of unconditional forecasting.
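
A rough sketch of goal-conditioned joint forecasting with a latent-variable model. The joint_model, its decode method, and the latent shapes are assumptions, not the PRECOG code:

import torch

def forecast_given_robot_intent(joint_model, context, robot_z, n_samples=16):
    # Fix the controlled agent's latent (chosen to realize its goal) and
    # sample the other agents' latents: the decoded joint trajectories show
    # how nearby agents are predicted to react to the robot's intention.
    samples = []
    for _ in range(n_samples):
        z_others = torch.randn(joint_model.n_agents - 1, joint_model.z_dim)
        samples.append(joint_model.decode(robot_z, z_others, context))
    return samples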




Directed-Info GAIL: Learning Hierarchical Policies from Unsegmented Demonstrations using Directed Information

M. Sharma*, A. Sharma*, N. Rhinehart, K. M. Kitani

ICLR 2019 | pdf | project page

Many behaviors are naturally composed of sub-tasks. Our approach learns to imitate such behaviors by discovering latent topics of behavior that guide its imitation.



First-Person Activity Forecasting from Video with Online Inverse Reinforcement Learning

N. Rhinehart, K. Kitani

TPAMI 2018 | pdf | project page

We continuously model and forecast long-term goals of a first-person camera wearer with our Online Inverse RL algorithm. We show, both in theory and in practice, that our approach learns efficiently in this continual, online setting.



R2P2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting

N. Rhinehart, K. M. Kitani, P. Vernaza

ECCV 2018 | pdf | project page | supplement | blog post (third-party)

We designed an objective to jointly maximize diversity and precision for generative models, and designed a deep autoregressive flow to efficiently optimize this objective for the task of motion forecasting. Unlike many popular generative models, ours can exactly evaluate its probability density function for arbitrary points.
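
In sketch form, the objective combines two cross-entropies: H(p, q) rewards covering all observed behavior (diversity), and H(q, p̃) penalizes samples that an approximate data density p̃ deems implausible (precision). The interfaces below (log_prob, rsample, p_tilde_logprob) are illustrative assumptions:

import torch

def r2p2_style_loss(model_q, p_tilde_logprob, expert_batch, beta=0.1, n_samples=16):
    # Forward cross-entropy H(p, q): fit the expert data (diversity/coverage).
    nll = -model_q.log_prob(expert_batch).mean()
    # Reverse cross-entropy H(q, p_tilde): keep samples plausible (precision).
    samples = model_q.rsample((n_samples,))      # reparameterized for gradients
    implausibility = -p_tilde_logprob(samples).mean()
    return nll + beta * implausibility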



Learning Neural Parsers with Deterministic Differentiable Imitation Learning

T. Shankar, N. Rhinehart, K. Muelling, K. M. Kitani

CoRL 2018 | pdf | code

We developed and applied a new imitation learning approach for the task of sequential visual parsing. The approach learns to imitate an expert parsing oracle.



[Image: depiction of the HIRL method]
Human-Interactive Subgoal Supervision for Efficient Inverse Reinforcement Learning

X. Pan, E. Ohn-Bar, N. Rhinehart, Y. Xu, Y. Shen, K. M. Kitani

AAMAS 2018 | pdf

We analyze the benefit of incorporating a notion of subgoals into Inverse Reinforcement Learning (IRL) with a Human-In-The-Loop (HITL) framework, and find that our approach requires less demonstration data than a baseline IRL approach.




N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning

A. Ashok, N. Rhinehart, F. Beainy, K. Kitani

ICLR 2018 | pdf | code

We designed a principled method for neural model compression: we trained a compression agent via RL on the sequential task of compressing large networks while maintaining high performance. The agent was able to generalize to compress previously unseen networks.



Predictive-State Decoders: Encoding the Future Into Recurrent Neural Networks

A. Venkataraman*, N. Rhinehart*, W. Sun, L. Pinto, M. Hebert, B. Boots, K. Kitani, J. A. Bagnell

NIPS 2017 | pdf

We use the idea of Predictive State Representations to guide the learning of RNNs: by encouraging the hidden state of the RNN to be predictive of future observations, we improve RNN performance on various tasks in probabilistic filtering, imitation learning, and reinforcement learning.
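
A minimal PyTorch sketch of the auxiliary loss (illustrative; the linear decoder and weighting are assumptions):

import torch
import torch.nn as nn

class PredictiveStateDecoder(nn.Module):
    # Auxiliary head: predict the next k observations from the RNN hidden state,
    # encouraging the hidden state to act like a predictive state representation.
    def __init__(self, hidden_dim, obs_dim, k, weight=0.1):
        super().__init__()
        self.decoder = nn.Linear(hidden_dim, k * obs_dim)
        self.weight = weight

    def loss(self, hidden, future_obs):
        # hidden: (batch, hidden_dim); future_obs: (batch, k, obs_dim)
        pred = self.decoder(hidden).view(future_obs.shape)
        return self.weight * torch.mean((pred - future_obs) ** 2)

During training, this auxiliary loss is simply added to the task's original loss.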



First-Person Activity Forecasting with Online Inverse Reinforcement Learning

N. Rhinehart, K. Kitani

Best Paper Honorable Mention, ICCV 2017 (3 of 2,143 submissions)
ICCV 2017 | pdf | project page | code

We continuously model and forecast long-term goals of a first-person camera wearer through our Online Inverse RL algorithm. In contrast to motion forecasting, our approach reasons about semantic states and future goals that are potentially far away in space and time.



Learning Action Maps of Large Environments Via First-Person Vision

N. Rhinehart, K. Kitani

CVPR 2016 | pdf

We developed an approach that learns to associate visual cues with sparsely observed behaviors in order to make dense predictions of functionality in both seen and unseen environments.



Visual Chunking: A List Prediction Framework for Region-Based Object Detection

N. Rhinehart, J. Zhou, M. Hebert, J. A. Bagnell

ICRA 2015 | pdf

We developed a principled imitation learning approach for the task of object detection, which is best described as a sequence prediction problem. Our approach reasons sequentially about objects and, unlike common object detection frameworks, requires no heuristics such as non-maximum suppression to filter its predictions.



© 2015-2021 Nick Rhinehart