CS 294-7, Spring 2003: Reinforcement Learning
Reading list
Books
Week 1 (1/22): Agents, environments, Markov decision processes
- S+B 1, 3
- B+T 1, 2.1, 2.4
- Background: Russell and Norvig, 2nd edition, Chapter 2 and Section 17.1.
Week 2 (1/29): Dynamic programming
Week 3 (2/5): Dynamic programming contd.
Week 4 (2/12): Monte Carlo methods
Week 5 (2/19): Samuel's checker player
- Samuel, Arthur L. (1959), "Some studies in machine learning using the game of checkers." IBM Journal of Research and Development 3(3), 210-229. (Hard copy handout.)
Week 6 (2/26): Running averages, temporal difference learning
- S+B 6.1-6.5
- B+T 4.1, 5.3, 5.6
Week 7 (3/5): Q-decomposition, convergence of Q-learning
Week 8 (3/12): Shaping, lambda, function approximation
Week 9 (3/19): Function approximation contd.: update, examples, convergence
Week 10 (3/26): SPRING BREAK
Week 11 (4/2): Partially observable MDPs
Project proposals due.
Week 12 (4/9): Policy search methods
Week 13 (4/16): Multiagent reinforcement learning
Week 14 (4/23): Multiagent reinforcement learning contd.; hierarchical reinforcement learning
Week 15 (4/30): Hierarchical reinforcement learning
Week 16 (5/7): Exploration; evolution
Week 17 (5/14): Project presentations