CS287 Home Page

University of California at Berkeley
Dept of Electrical Engineering & Computer Sciences

CS 287: Advanced Robotics, Fall 2009

Instructor: Pieter Abbeel
Lectures: Tuesdays and Thursdays, 12:30pm-2:00pm, 405 Soda Hall
Office Hours: Thursdays 2:00-3:00pm (and by email arrangement) in 746 Sutardja Dai Hall

Announcements

PS3 is out: ps3.pdf, due Friday, December 4.
In-class presentations:
Guidelines:
- We will start at 12:30 sharp.
- Time: you have 8 minutes, and for those 8 minutes it's entirely your show. Going a bit shorter is fine, but given the time constraints we cannot go longer, and you will be cut off at the 8-minute mark. It is up to you how many questions to take during the presentation.
- Presentation carries 10 out of 45 points of the final project.
- Feel free to sync up with me if you would like to discuss the contents of your presentation beforehand.
Schedule:
- Tuesday December 1: Barron, Smith, Liu, Kolev, Lin, Strausser, Chang-Siu, Moldovan, Soerensen
- Thursday December 3: Hoburg, Weekly, Song, Swift, Hunter, Javdani, Berg Kirkpatrick, Singh+Tang, Maitin-Shepard
On your presentation and the delivery:
- Jonathan Shewchuk has a nice webpage on giving an academic talk.
- While this is not a course on public speaking, you might as well take advantage of the opportunity and practice your delivery skills. In case you have never taken a public speaking class before, here are a few (of course somewhat subjective) pointers from a first lecture in such a class: pdf. It can be fun to pay attention to a small selection of them.
Looking forward to your presentations!
Latest update: 2009/11/24 Problem set 2 faq
2009/11/07 Problem set 2 is out: ps2.pdf, starter code: .tar.gz, .zip. Due Friday, November 20, 23:59.
2009/10/25 PS 1: Q&A page. Includes installer+instructions for helicopter graphics!
2009/10/07 Project due-dates: Oct 9: abstract, Nov 9: milestone, Dec 3: presentation, Dec 11: final paper
2009/10/07 Project guidelines pdf
2009/10/07 PS 1 is out: PS 1, starter code: .tar.gz, .zip. Due: Monday, October 26, 23:59.
If you don't have access to MATLAB, let me know so we can find a solution.

Assignments

ps3.pdf
ps2.pdf, starter code: .tar.gz, .zip
ps1.pdf, starter code: .tar.gz, .zip

Course description

A tentative list of topics includes (subject to substantial change!):

Control: underactuation, controllability, Lyapunov, dynamic programming, LQR, feedback linearization, MPC
Estimation: Bayes filters, KF, EKF, UKF, particle filter, occupancy grid mapping, EKF SLAM, GraphSLAM, SEIF, FastSLAM
Manipulation and grasping: force closure, grasp point selection, visual servoing, more sub-topics TBD
Reinforcement learning: value iteration, policy iteration, linear programming, Q-learning, TD, value function approximation, Sarsa, LSTD, LSPI, policy gradient, inverse reinforcement learning, reward shaping, hierarchical reinforcement learning, inference-based methods, exploration vs. exploitation
Brief coverage of: system identification, simulation, POMDPs, k-armed bandits, separation principle
Case studies: autonomous helicopter, DARPA Grand/Urban Challenge, walking, mobile manipulation.

Prerequisites

Familiarity with mathematical proofs, probability, algorithms, linear algebra; ability to implement algorithmic ideas in code.

Consent of instructor required for undergraduate students.

Grading

Open-ended final project (45%)
Assignments (55%): 25% each for the first two assignments, 5% for the last assignment

Assignment policy

Collaboration: Students may discuss assignments. However, each student must code up their solutions independently and write down their answers independently.
Late assignments: Recognizing that students may face unusual circumstances and require some flexibility in the course of the semester, each student will have a total of seven free late (calendar) days to use as s/he sees fit. Late days are counted at the granularity of days: e.g., 3 hours late is one late day. If an assignment is submitted beyond the late-day budget, you will lose 20 (out of 100) points per day over budget (but you cannot go below zero).
Late days cannot be used for the final project.
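
As a worked illustration of the late-assignment arithmetic above, here is a minimal Python sketch. The function and constant names are hypothetical, and the sketch assumes two things the policy does not spell out: any started 24-hour period counts as a full late day, and free late days are consumed before any penalty is applied.

```python
import math

FREE_LATE_DAYS = 7      # total free late days per student (from the policy)
PENALTY_PER_DAY = 20    # points lost (out of 100) per day over budget


def late_days(hours_late: float) -> int:
    """Round lateness up to whole calendar days (e.g. 3 hours late -> 1 day)."""
    return max(0, math.ceil(hours_late / 24))


def apply_late_policy(raw_score: float, hours_late: float, budget_left: int):
    """Return (adjusted score, remaining free late days) for one assignment.

    Assumption: free late days are used first; only days beyond the
    remaining budget are penalized, and the score never drops below zero.
    """
    days = late_days(hours_late)
    free_used = min(days, budget_left)
    over_budget = days - free_used
    score = max(0, raw_score - PENALTY_PER_DAY * over_budget)
    return score, budget_left - free_used


if __name__ == "__main__":
    # 3 hours late with a full budget: one free day used, no penalty.
    print(apply_late_policy(90, 3, FREE_LATE_DAYS))
    # 3 days late with 1 free day left: 2 days over budget.
    print(apply_late_policy(90, 72, 1))
```

For example, under these assumptions a submission scored 90 that is three days late with one free day remaining loses 40 points.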

Syllabus and materials

AM: Astrom and Murray, Feedback Systems, pdf
T: Tedrake, Underactuated Robotics, Course Notes for MIT 6.832. August 2009 snapshot (referenced on this page and in the slides) [Here is a link to the current working draft]
SB: Sutton and Barto, Reinforcement Learning, html
BT: Bertsekas and Tsitsiklis, Neuro-dynamic programming
TBF: Thrun, Burgard, Fox, Probabilistic Robotics
[optional readings] Slotine and Li, Applied Nonlinear Control --- great read on the topic

Lecture	Topic	Notes	Readings	Optional/Additional Readings	Videos
Th Aug 27	Course introduction.	2pp 6pp
Tu Sep 1	Feedforward, Feedback, PID, Control of fully actuated systems	2pp 6pp code	AM 10.3, 10.4; T 1.2
Th Sep 3	Lyapunov, (Energy pumping)	2pp 6pp		T Ch. 3; SL, Example 3.21
Tu Sep 8	Optimal Control, HJB, Discretization	2pp 6pp	T Ch. 6; SB Ch. 4; Munos and Moore, MLJ 2001 pp.1-9 pdf; Chow and Tsitsiklis, 1991 pdf	Kushner and Dupuis, 1992/2001
Th Sep 10	Dynamic programming with function approximation	2pp 6pp	Gordon, 1995 pdf; Tsitsiklis and Van Roy, 1996 pdf
Tu Sep 15	No lecture.
Th Sep 17	Dynamic programming with function approximation; speed-ups/tweaks	2pp 6pp	Gordon, 1995 pdf; Tsitsiklis and Van Roy, 1996 pdf	BT 6.5, 6.1; Moore and Atkeson, Prioritized sweeping pdf
Tu Sep 22	LQR + variations	2pp 6pp		Tedrake, LQR trees pdf, Atkeson and Stephens pdf, Todorov, 2005 pdf, Anderson and Moore, Optimal Control: Linear Quadratic Methods
Th Sep 24	MPC, feedback linearization	draft slides: 2pp 6pp	Tedrake Ch. 9 and App. A	Diehl +al., MPC overview pdf; John T. Betts, "Practical Methods for Optimal Control Using Nonlinear Programming," 2001; Slotine and Li Chapter 6; Isidori, "Nonlinear control systems," 1989.
Tu Sep 29	Bandits	draft notes: pdf, draft slides: 2pp 6pp		Regret-based approaches: Lai and Robbins, 1985; Auer +al, UCB algorithm 1998 pdf; papers on Bayesian exploration in MDPs: Poupart+al, Asmuth+al, Kolter+Ng
Thu Oct 1	Policy iteration	If you want to read ahead for this and the next few lectures: see chapters 4, 6, 7, 8, 11.1 of SB	SB Chapters 4, 5, 6, 7, 8, 11.1 html
Tue Oct 6	Example MDPs, Recap some exact methods for MDPs	2pp 6pp	SB Chapter 4
Thu Oct 8	Linear programming; Model-free methods (TD)	2pp 6pp LP notes	SB Chapter 6
Tu Oct 13	TD, Sarsa, Q, TD(\lambda)	2pp 6pp	SB Chapters 6, 7
Thu Oct 15	TD with function approximation, TD Gammon, other examples	2pp 6pp	SB Chapters 8, 11.1	Tsitsiklis and Van Roy, 1997 pdf
Tu Oct 20	LSTD, LSPI, RLSTD, behavioral cloning	2pp 6pp		LSPI pdf, Bradtke and Barto, 1996, LSTD pdf, Kolter and Ng, Feature selection in LSTD pdf
Thu Oct 22	Behavioral cloning, Inverse RL	2pp 6pp	More complete slides on Inverse RL from Robot Learning Summer School, 2009 pdf
Tu Oct 27	Inverse RL wrap-up, Policy search	2pp 6pp
Thu Oct 29	Policy search & Actor-Critic	2pp 6pp		Peters and Schaal, IROS 2006, Policy gradient methods for robotics; Ng and Jordan UAI 2000, an analysis of fixing the random seed in policy search
Tu Nov 3	An application of likelihood ratio methods: Learning to walk	2pp 6pp		Lecture by Russ Tedrake on learning to walk; Tedrake, Zhang and Seung, Learning to walk in 20 minutes	toddler, tinker-toy, Cornell kneed-walker
Thu Nov 5	Natural gradient; brief coverage of various topics incl. approximate LP, POMDPs, reward shaping, exploration vs. exploitation, hierarchical methods	2pp 6pp		Kakade, NIPS 2002 Natural policy gradient; Peters and Schaal natural actor critic; Calafiore and Campi constraint sampling; de Farias and Van Roy constraint sampling ALP; Kearns and Singh E3; Ng, Harada and Russell reward shaping; Wiewiora reward shaping equivalence with V, Q initialization; Marthi, Russell and Andre hierarchical Q decomposition; Kober, Peters ball in a cup (NIPS 2008)	ball-in-a-cup-video
Tue Nov 10	State estimation: HMM, KF	2pp 6pp	Probabilistic Robotics Chapters 1, 2, 3	From Gauss to Kalman
Thu Nov 12	State estimation: KF, EKF, UKF	2pp 6pp	Probabilistic Robotics Chapter 3 (Gaussian filters), 10 (EKF SLAM), Julier and Uhlmann, the UKF
Tu Nov 17	State estimation: UKF, particle filter, mapping, localization, SLAM	2pp 6pp	Probabilistic Robotics Chapter 4 (particle filters), 6 (robot perception), 8 (localization); particle filter tutorial
Thu Nov 19	State estimation: mapping, SLAM	2pp 6pp	Probabilistic Robotics Chapters 9 (occupancy grid mapping), 13 (fastSLAM), 11 (graphSLAM)	Bailey and Durrant-Whyte SLAM tutorial: part 1; part 2
Tue Nov 24	Quadruped locomotion --- Guest Lecturer: J. Zico Kolter	speaker bio 2009 slides 2008 slides
Thu Nov 26	Happy Thanksgiving!
Tue Dec 1	Project presentations.	Barron, Smith, Liu, Kolev, Lin, Strausser, Chang-Siu, Moldovan, Soerensen
Thu Dec 3	Project presentations.	Hoburg, Weekly, Song, Swift, Hunter, Javdani, Berg Kirkpatrick, Singh+Tang, Maitin-Shepard