Planning for Autonomous Cars that Leverages Effects on Human Actions

Dorsa Sadigh, Shankar Sastry, Sanjit A. Seshia, and Anca D. Dragan. Planning for Autonomous Cars that Leverages Effects on Human Actions. In Proceedings of the Robotics: Science and Systems Conference (RSS), June 2016.

Abstract

Traditionally, autonomous cars make predictions about other drivers' future trajectories, and plan to stay out of their way. This tends to result in defensive and opaque behaviors. Our key insight is that an autonomous car's actions will actually affect what other cars will do in response, whether the car is aware of it or not. Our thesis is that we can leverage these responses to plan more efficient and communicative behaviors. We model the interaction between an autonomous car and a human driver as a dynamical system, in which the robot's actions have immediate consequences on the state of the car, but also on human actions. We model these consequences by approximating the human as an optimal planner, with a reward function that we acquire through Inverse Reinforcement Learning. When the robot plans with this reward function in this dynamical system, it comes up with actions that purposefully change human state: it merges in front of a human to get them to slow down or to reach its own goal faster; it blocks two lanes to get them to switch to a third lane; or it backs up slightly at an intersection to get them to proceed first. Such behaviors arise from the optimization, without relying on hand-coded signaling strategies and without ever explicitly modeling communication. Our user study results suggest that the robot is indeed capable of eliciting desired changes in human state by planning using this dynamical system.
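
The contrast drawn in the abstract, between planning around a fixed prediction and planning with a human who responds to the robot, can be sketched as a one-step toy problem. This is only an illustration under stated assumptions, not the paper's planner. The action sets, the numeric constants, and both reward functions are hypothetical. In particular, `human_reward` is a hand-written stand-in for the reward the authors acquire through Inverse Reinforcement Learning.

```python
# Toy one-step sketch: the robot picks an action while anticipating the
# human's best response. All names and numbers below are hypothetical.

ROBOT_ACTIONS = (0, 1)                   # 0 = stay in lane, 1 = merge in front of the human
HUMAN_ACTIONS = (-2.0, -1.0, 0.0, 1.0)   # candidate human speed changes (m/s)
SAFE_SPEED = 8.0                         # assumed safe human speed behind a merged robot


def human_reward(v_h, u_r, u_h):
    """Hand-written stand-in for an IRL-learned human reward."""
    comfort = -(u_h ** 2)                            # prefers not to change speed
    overspeed = max(0.0, v_h + u_h - SAFE_SPEED)     # too fast for the gap ahead
    return comfort - 10.0 * u_r * overspeed ** 2     # only matters if the robot merged


def robot_reward(v_h, u_r, u_h):
    """Robot wants the human near SAFE_SPEED and pays a small cost to merge."""
    return -((v_h + u_h - SAFE_SPEED) ** 2) - 0.5 * u_r


def human_best_response(v_h, u_r):
    """Approximate the human as an optimal planner given the robot's action."""
    return max(HUMAN_ACTIONS, key=lambda u_h: human_reward(v_h, u_r, u_h))


def plan_with_influence(v_h):
    """Robot optimizes its reward through the human's response to its action."""
    return max(
        ROBOT_ACTIONS,
        key=lambda u_r: robot_reward(v_h, u_r, human_best_response(v_h, u_r)),
    )


def plan_with_fixed_prediction(v_h, predicted_u_h=0.0):
    """Baseline: the human's action is predicted independently of the robot."""
    return max(ROBOT_ACTIONS, key=lambda u_r: robot_reward(v_h, u_r, predicted_u_h))


if __name__ == "__main__":
    v_h = 10.0
    print("influence-aware plan:", plan_with_influence(v_h))
    print("fixed-prediction plan:", plan_with_fixed_prediction(v_h))
```

With these made-up numbers, the influence-aware planner chooses to merge because it expects the human to slow down in response, whereas the fixed-prediction baseline sees no benefit in merging and stays in its lane.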

BibTeX

@inproceedings{sadigh-rss16,
  author    = {Dorsa Sadigh and Shankar Sastry and Sanjit A. Seshia and Anca D. Dragan},
  title     = {Planning for Autonomous Cars that Leverages Effects on Human Actions},
  booktitle = {Proceedings of the Robotics: Science and Systems Conference (RSS)},
  month     = {June},
  year      = {2016},
  OPTpages  = {66--73},
  abstract  = {Traditionally, autonomous cars make predictions
about other drivers' future trajectories, and plan to 
stay out of their way. This tends to result in defensive and 
opaque behaviors. Our key insight is that an autonomous 
car's actions will actually affect what other cars will do in 
response, whether the car is aware of it or not. Our thesis is 
that we can leverage these responses to plan more efficient 
and communicative behaviors. We model the interaction 
between an autonomous car and a human driver as a dynamical 
system, in which the robot's actions have immediate 
consequences on the state of the car, but also on human 
actions. We model these consequences by approximating the 
human as an optimal planner, with a reward function that 
we acquire through Inverse Reinforcement Learning. When 
the robot plans with this reward function in this dynamical 
system, it comes up with actions that purposefully change 
human state: it merges in front of a human to get them to 
slow down or to reach its own goal faster; it blocks two 
lanes to get them to switch to a third lane; or it backs up 
slightly at an intersection to get them to proceed first. Such 
behaviors arise from the optimization, without relying on 
hand-coded signaling strategies and without ever explicitly 
modeling communication. Our user study results suggest that 
the robot is indeed capable of eliciting desired changes in 
human state by planning using this dynamical system.},
}
