CS 294-125, Spring 2016: Human-compatible AI
This list is still under construction. An empty bullet item indicates more readings to come for that week.
Week 1 (1/26): Markov decision processes
- R+N, Ch. 2, 17.1-17.4
Ch. 2 explores the range of environment types, agent types, and agent-environment relationships. Ch. 17 deals with MDPs and solution algorithms.
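Among the solution algorithms covered in Ch. 17 is value iteration. A minimal sketch on a hypothetical two-state MDP (the states, transition model, rewards, and discount factor below are illustrative choices, not taken from the text):

```python
# A minimal value-iteration sketch on a hypothetical two-state MDP.
# Transition model: P[s][a] = list of (probability, next_state, reward).
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 1.0)], "go": [(1.0, 0, 0.0)]},
}
gamma = 0.9  # discount factor

# Repeated Bellman backups: V(s) <- max_a sum_{s'} P(s'|s,a) [r + gamma V(s')]
V = {s: 0.0 for s in P}
for _ in range(200):
    V = {s: max(sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in P[s].values())
         for s in P}

print(V)  # state 1's value approaches 1 / (1 - gamma) = 10
```

After enough backups the values converge geometrically (error shrinks by a factor of gamma per sweep), which is why a fixed iteration count suffices here.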
Week 2 (2/2): Reinforcement learning, multi-attribute utility theory, preference elicitation
- R+N 16.1-16.5
(Multi-attribute) Utility Theory and Influence Diagrams
- R+N 21.1-21.5
Background on reinforcement learning
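As a companion to the RL background reading, a minimal tabular Q-learning sketch on the same kind of toy two-state MDP (the MDP, learning rate, and exploration rate are hypothetical illustrative choices):

```python
import random

random.seed(0)
# Hypothetical toy MDP: P[s][a] = list of (probability, next_state, reward).
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 1.0)], "go": [(1.0, 0, 0.0)]},
}
gamma, alpha, eps = 0.9, 0.1, 0.1  # discount, learning rate, exploration rate

def step(s, a):
    """Sample one transition from the model."""
    u = random.random()
    for p, s2, rew in P[s][a]:
        if u < p:
            return s2, rew
        u -= p
    return s2, rew  # numerical fallback: last outcome

Q = {(s, a): 0.0 for s in P for a in P[s]}
s = 0
for _ in range(50000):
    # epsilon-greedy action selection
    if random.random() < eps:
        a = random.choice(list(P[s]))
    else:
        a = max(P[s], key=lambda act: Q[(s, act)])
    s2, rew = step(s, a)
    # TD update toward the one-step bootstrapped target
    Q[(s, a)] += alpha * (rew + gamma * max(Q[(s2, a2)] for a2 in P[s2])
                          - Q[(s, a)])
    s = s2
```

Unlike value iteration, the agent here never consults the transition probabilities directly; it learns action values purely from sampled transitions, which is the essential point of the model-free RL material in Ch. 21.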
Week 3 (2/9): Goal inference
Week 4 (2/16): Human preferences
- Kushnir, Tamar, Fei Xu, and Henry M. Wellman. "Young children use statistical sampling to infer the preferences of other people." Psychological Science 21.8 (2010): 1134-1140.
- Lucas, C. G., Griffiths, T. L., Xu, F., Fawcett, C., Gopnik, A., Kushnir, T., Markson, L., & Hu, J. (2014). "The child as econometrician: A rational model of preference understanding in children." PLOS One, 9(3), e92160.
- Daniel Kahneman and Amos Tversky (1979). "Prospect Theory: An Analysis of Decision under Risk." Econometrica, 47(2), pp. 263-291.
Week 5 (2/23): Collaborative systems
- Fern, A., Natarajan, S., Tadepalli, P., and Judah, K. (2014). "A Decision-Theoretic Model of Assistance." Journal of Artificial Intelligence Research, 50.
- Dragan, Anca D., and Siddhartha S. Srinivasa. "Formalizing assistive teleoperation." Robotics: Science and Systems. MIT Press, July 2012.
- Liu, C., et al. "A Framework for Autonomous Vehicles With Goal Inference and Task Allocation Capabilities to Support Peer Collaboration With Human Agents." ASME 2014 Dynamic Systems and Control Conference. American Society of Mechanical Engineers, 2014.
Week 6 (3/1): Psychology of moral decisions
- Cushman, F. "Action, outcome, and value: A dual-system framework for morality." Personality and Social Psychology Review 17.3 (2013): 273-292.
- Greene, Joshua D. "Beyond point-and-shoot morality: Why Cognitive (Neuro)Science Matters for Ethics." Ethics 124(4), 695-726, 2014.
- [Optional extra: Greene, Joshua D., et al. "An fMRI investigation of emotional engagement in moral judgment." Science 293.5537 (2001): 2105-2108.]
- [Optional extra: Lieder, F., et al. "Algorithm selection by rational metareasoning as a model of human strategy selection." Advances in Neural Information Processing Systems, 2014.]
Week 7 (3/8): Inverse reinforcement learning
Week 8 (3/15): Inverse reinforcement learning (cont'd)
Week 9 (3/22):
Week 10 (3/29): Multiagent sequential decision making
- Oliehoek, Frans A. and Amato, Christopher. "Dec-POMDPs as Non-Observable MDPs." IAS Technical Report, 2014.
- Boutilier, Craig "Planning, Learning and Coordination in Multiagent Decision Processes". Theoretical Aspects of Rationality and Knowledge, 1996.
- (optional) Boutilier, Craig "Sequential Optimality and Coordination in Multiagent Systems". IJCAI, 1999.
This goes into more detail on solution algorithms for MMDPs that track the coordination state. This is related to the Dec-POMDP solution algorithms.
- (optional) Dibangoye, Jilles S., et al. "Optimally Solving Dec-POMDPs as Continuous-State MDPs". JAIR, 2016.
In depth writeup of state-of-the-art Dec-POMDP algorithms. Long, but quite thorough.
Week 11 (4/5): Game theory
Week 12 (4/12): Inverse games
Week 13 (4/19): Embedded reinforcement learning, Baldwinian evolution
- Mark Ring and Laurent Orseau, "Delusion, Survival, and Intelligent Agents." In Proc. AGI, 2011.
Describes a possible difficulty with reward-based agents, wherein the agent builds a delusion box that produces fake rewards that make it happy.
- (optional) Daniel Dewey, "Learning What to Value." In Proc. AGI, 2011.
Argues that wireheading arises from RL formulations and proposes instead an approach based on learning an initially unknown utility function.
- (optional) Bill Hibbard, "Model-based Utility Functions." JAGI, 3(1), 1-24, 2012.
Proposes and analyzes a solution to the wireheading problem based on utility functions that depend on unobserved state variables whose values the agent must infer.
- (optional) Laurent Orseau and Mark Ring, "Space-Time Embedded Intelligence." Proc. AGI, 2012.
Defines a very general notion of rationality for agents whose computational substrate is part of the environment they inhabit.
- David Ackley and Michael Littman, "Interactions between learning and evolution." In Proc. Artificial Life II, 1991.
Discusses the origin of reward functions and how learning speeds up evolution, clarifying the Baldwin effect first proposed in 1896.
Week 14 (4/26): Corrigibility
- Soares, Nate, et al. "Corrigibility." Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015.