CS287 Fall 2015

University of California at Berkeley
Dept of Electrical Engineering & Computer Sciences

CS 287: Advanced Robotics, Fall 2015

Fall 2013 offering (reasonably similar to current year's offering)
Fall 2012 offering (reasonably similar to current year's offering)
Fall 2011 offering (fairly similar to current year's offering)
Fall 2009 offering (not particularly closely matched to current year's offering)

Instructors:
Professor: Pieter Abbeel
TAs: Sandy Huang and Zoe McCarthy
Lectures: Tuesdays and Thursdays, Session 1: 9:30-11am in 405 Soda Hall; Session 2: 4-5:30pm in 310 Soda Hall
Office Hours:
Pieter: Mondays 1:15-2:15pm (and by email arrangement) in 746 Sutardja Dai Hall
Sandy: Wednesdays 4-5pm in 730 Sutardja Dai Hall
Zoe: Fridays noon-1pm in 411 Soda Hall
Communication: Piazza is intended for general questions about the course, clarifications about assignments, student questions to each other, discussions about material, and so on. To sign up, go to the Piazza website and sign up with "UC Berkeley" and "CS287" for your school and class.

Announcements
Assignments
Prerequisites
Class goals
Grading
Assignment policy
Syllabus and materials
Related materials

Announcements

Please sign up on Piazza for CS287 Advanced Robotics for all future announcements.
Welcome to the Fall 2015 edition of CS287!

Assignments

Due: Wed 9/9, Problem Set 1: Markov Decision Processes, Value Iteration, Linear Programming ps1.pdf, starter code v2
Due: Wed 9/23, Problem Set 2: Function Approximation and LQR ps2.pdf, starter code
Due: Wed 10/7, Problem Set 3: Convex Optimization, Sequential Convex Programming, Optimization-based Motion Planning and Control ps3.pdf, starter code
Due: Wed 11/4, Problem Set 4: Multivariate Gaussians, Kalman Filtering, Maximum Likelihood, EM ps4.pdf, starter code
Due: Fri 12/11, Optional Extra Credit Problem Set 1: Extended Kalman Filtering and Gaussian Belief Space Planning through Sequential Convex Programming EC1-EKF-GBSP.pdf, starter code
Due: Fri 12/11, Optional Extra Credit Problem Set 2: Learning from Demonstrations through Trajectory Generalization with Thin-Plate Splines EC2-LfD.pdf, starter code
Due: Fri 12/11, Optional Extra Credit Problem Set 3: Guided Policy Search EC3-GPS.pdf, starter code
Due: Wed 12/16, Optional Extra Credit Problem Set 4: Policy Search and Locomotion EC4-PS-v2.pdf, diff-v1-v2.pdf, starter code v2

Assignment policy

Collaboration: Students may discuss assignments. However, each student must code up and write up their solutions independently.
Late assignments: Recognizing that students may face unusual circumstances and require some flexibility in the course of the semester, each student will have a total of seven free late (calendar) days to use as s/he sees fit. Late days are counted at the granularity of days: e.g., 3 hours late is one late day. If an assignment is submitted beyond the late-day budget, you will lose 20 (out of 100) points per day over budget (but you cannot go below zero).
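As a rough illustration of the late-day arithmetic above, here is a minimal Python sketch. It is not an official grading script. The function names are hypothetical, and it assumes that lateness is measured in 24-hour blocks from the deadline and that free late days are consumed before any penalty applies; the policy text does not spell out either detail.

```python
import math

PENALTY_PER_DAY = 20  # points (out of 100) lost per day over budget, from the policy


def late_days(hours_late: float) -> int:
    """Round lateness up to whole days (e.g., 3 hours late counts as 1 day).

    Assumption: days are counted as 24-hour blocks after the deadline.
    """
    return max(0, math.ceil(hours_late / 24))


def apply_late_policy(raw_score: float, hours_late: float, budget_left: int):
    """Return (final_score, budget_left_after) for a single assignment.

    Assumption: remaining free late days are used first; only the days
    beyond the remaining budget are penalized.
    """
    days = late_days(hours_late)
    free = min(days, budget_left)   # days covered by the free budget
    over = days - free              # days over budget
    final = max(0.0, raw_score - PENALTY_PER_DAY * over)  # cannot go below zero
    return final, budget_left - free
```

For example, a student with one free late day left who submits 72 hours late on an assignment scored 90 would use that day, be two days over budget, and end up with 50 points under these assumptions.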

Final Project

The final project could be either of the following, where in each case the topic should be closely related to the course:

An algorithmic or theoretical contribution that extends the current state of the art.
An implementation of a state-of-the-art algorithm using real-world data and/or robots.

Ideally, the project covers interesting new ground and might be the basis for a future conference paper submission or product. You are encouraged to come up with your own project ideas, but make sure to run them by Pieter before you submit your abstract.

Logistics and Timeline

1 or 2 students per project.
Oct 30th: Instructor-approved abstracts due: 1-page description of project + goals for milestone. Make sure to meet with Pieter before then!
Nov 13th: 3 page milestone due. You are not graded on the milestone. Think of it as a sanity check for yourself that you indeed have started to make progress on the project and an opportunity to get feedback on your progress thus far, as well as on any revisions you might have made to your project goals.
December 7th: Project presentations.
December 14th: Final paper due. This should be a 6-page paper, structured like a conference paper, i.e., focus on the problem setting, why it matters and what's interesting/novel about it, your approach, your results, analysis of results, limitations, and future directions. Cite and briefly survey prior work as appropriate, but don't rewrite prior work that is not directly relevant to understanding your approach.
Late days cannot be used for the final project.

Prerequisites

Familiarity with mathematical proofs, probability, algorithms, linear algebra; ability to implement algorithmic ideas in code.
If your probability is rusty, you might want to handpick some homework/section exercises from past CS188 offerings, located here and at similar URLs, replacing sp12 with fa11, sp11, fa10, sp10, fa09, etc.
Consent of instructor required for undergraduate students. Come see instructor after lecture or during office hours.

Class Goals

Learn the math and algorithms underneath state-of-the-art robotic systems. The majority of these techniques are heavily based on probabilistic reasoning and optimization---two areas with wide applicability in modern Artificial Intelligence. An intended side-effect of the course is to generally strengthen your expertise in these two areas.
Implement, and experiment with, these algorithms.
Be able to understand research papers in the field of robotics:
- Main conferences: ICRA, IROS, RSS, ISER, ISRR.
- Main journals: IJRR, T-RO, Autonomous Robots.
Try out some ideas/extensions of your own.
Note: the focus of the course is on math and algorithms. We will not study mechanical or electrical design of robots.

Grading

Open-ended final project (30%)

1/5 quality of presentation, 1/5 quality of writing of the final paper, 3/5 quality of the results themselves.

Assignments (70%)

Each regular assignment is rescaled to a score out of 100 and contributes equally: up to 70/4 = 17.5 points each.
Each extra credit assignment is rescaled to a score out of 50, and each can contribute up to 0.5 * 70/4 = 8.75 points.
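The weights above can be put together in a short Python sketch. This is only an illustration, not the official grade computation. The function names are hypothetical, and it assumes that scores scale linearly, that the three project components are each rated as a fraction between 0 and 1, and that extra credit is simply added on top of the regular total.

```python
REGULAR_WEIGHT = 70 / 4      # max points per regular assignment (17.5)
EXTRA_WEIGHT = 0.5 * 70 / 4  # max points per extra credit assignment (8.75)
PROJECT_WEIGHT = 30          # max points for the final project


def project_points(presentation: float, writing: float, results: float) -> float:
    """Combine project components, each a fraction in [0, 1].

    Weights from the grading section: 1/5 presentation, 1/5 writing, 3/5 results.
    """
    return PROJECT_WEIGHT * (presentation / 5 + writing / 5 + 3 * results / 5)


def course_points(regular_scores, extra_scores, project: float) -> float:
    """Total course points.

    regular_scores: scores out of 100 for the regular assignments.
    extra_scores: scores out of 50 for the extra credit assignments.
    Assumption: extra credit is added on top, so the total may exceed 100.
    """
    regular = sum(s / 100 * REGULAR_WEIGHT for s in regular_scores)
    extra = sum(s / 50 * EXTRA_WEIGHT for s in extra_scores)
    return regular + extra + project
```

For example, `course_points([100, 100, 100, 100], [], project_points(1, 1, 1))` corresponds to full marks on the four regular assignments and the project, with no extra credit.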

Syllabus and materials

Tentative schedule (edits in progress):

Lecture	Topic	Readings	Optional/Additional Readings
Th Aug 27	Course Introduction
Tu Sep 1	MDPs, Exact Methods: Value Iteration, Policy Iteration, Linear Programming, LP notes	Sutton and Barto, Reinforcement Learning, Chapters 3 and 4
Th Sep 3	Discretization of Continuous State Space MDPs (v2), Code for Discretization Examples		Moore and Atkeson, 1993; Munos and Moore, MLJ 2001
Tu Sep 8	Function Approximation / Feature-based Representations		Chow and Tsitsiklis, 1991; Gordon, 1995; Tsitsiklis and Van Roy, 1996; Kushner and Dupuis, 1992/2001
Th Sep 10	LQR, iterative LQR / Differential Dynamic Programming
Tu Sep 15	Convex Optimization	cvx_example.m	Boyd and Vandenberghe, Chapters 9-11
Th Sep 17	Convex Optimization (part II) (same slides as previous lecture)
Tu Sep 22	Non-Convex Optimization through Sequential Convex Programming (SCP), Locally Optimal Control through Optimization: Collocation, Shooting, Model Predictive Control (MPC), Trajectory Optimization for Motion Planning	code examples	Nocedal and Wright, Chapter 18
Th Sep 24	Inverse Optimal Control
Tu Sep 29 (PM only)	Guest Lecture: Adam Bry (skyd.io)	Adam Bry (skydio) Lecture Video
Th Oct 1	Motion Planning: PRM, RRT + variants	Steven M. Lavalle, Motion Planning, Chapters 5, 14, RRT*, Karaman and Frazzoli, LQR trees, Tedrake, code example
Tu Oct 6	Inverse Optimal Control (part II) (same slides as previous lecture)
Th Oct 8	Probability Review, Bayes Filters	Intro: PR 1; Probability Review and Bayes Filters: PR 2
Tu Oct 13	Multivariate Gaussians	PR 3
Th Oct 15	Kalman Filtering	PR 3	From Gauss to Kalman
Tu Oct 20	EKF, UKF	PR 3	Julier and Uhlmann, the UKF
Th Oct 22	Smoother, MAP
Fri Oct 23	Bay Area Robotics Symposium	Chevron Auditorium at the International House
Tu Oct 27	Maximum Likelihood, EM
Th Oct 29	POMDPs
Tu Nov 3 (PM only)	Guest Lecturers: Buddy Michini (Airware), Brad Neumann (Anki), and Liz Murphy (Savioke)	Buddy Michini (Airware) Lecture Video, Brad Neumann (Anki) Lecture Video, Liz Murphy (Savioke) Lecture Video
Th Nov 5	Policy Gradients
Tu Nov 10	Guided Policy Search (Guest Lecturer: Sergey Levine)
Th Nov 12	Hierarchical Planning (Guest Lecturer: Dylan Hadfield-Menell)
Tu Nov 17	Learning from Demonstrations (Guest Lecturer: Sandy Huang)
Th Nov 19	Particle Filters
Tu Nov 24	Projects speed-dating
Th Nov 26	Happy Thanksgiving!
Tu Dec 1	EKF-SLAM, Graph-SLAM, Beam Sensor Model, Current Directions
Th Dec 3	Autonomous Helicopters and Course Wrap-Up		Abbeel, Coates, Ng, IJRR 2010, videos and data
Fri Dec 4, 1:30pm, Cory 540A/B	Project Presentations Session 1
Mon Dec 7, 1:15pm, 405 Soda	Project Presentations Session 2
Mon Dec 7, 3:30pm, 242 SDH	Project Presentations Session 3

Related materials

Thrun, Burgard, Fox, Probabilistic Robotics

If you want to brush up on your linear algebra background, I suggest working through this course (video lectures and homeworks available online) at your own pace: Stephen Boyd's EE263: Introduction to Linear Dynamical Systems.
If you want to learn more about the linear systems aspects (Kalman filtering, LQR), I recommend Stephen Boyd's EE363: Linear Dynamical Systems.
If you want to go deeper into the theory of linear systems, I recommend Claire Tomlin's EE221a: Linear System Theory.
If you want to learn more about convex optimization, I recommend: Stephen Boyd's EE364a: Convex Optimization I and Stephen Boyd's EE364b: Convex Optimization II. Both of them have all course materials, including lecture videos, available online.
For more about optimal control and motion planning, Russ Tedrake's class (whose materials are still in draft status), Underactuated Robotics: Learning, Planning, and Control for Efficient and Agile Machines, could give you a somewhat different angle, some complementary ideas, and more examples.
A more traditional book on control theory: Astrom and Murray, Feedback Systems
A more traditional book on nonlinear control: Slotine and Li, Applied Nonlinear Control.
A great introductory text on reinforcement learning: Sutton and Barto, Reinforcement Learning
A more mathematically oriented text on reinforcement learning: Bertsekas and Tsitsiklis, Neuro-dynamic programming
Earlier offerings of my graduate class had more emphasis on reinforcement learning / approximate dynamic programming than the current offering: CS294-40, Fall 2008 and CS287, Fall 2009