CS 194-10, Fall 2011
Introduction to Machine Learning
Overview
CS 194-10 is a new undergraduate machine learning course designed to complement CS 188, which covers all areas of AI.
Eventually it will become CS 189. The main prerequisite is CS 188 or consent of the instructor; students are assumed to have
lower-division mathematical preparation including CS 70 and Math 54. The course will be a mixture of theory, algorithms,
and hands-on projects with real data. The goal is to enable students to
understand and use machine learning methods across a wide range of
settings. As this is a new course, the ride may be a little bumpy, so enrollment is limited (see below).
Announcements
10/21/11 | Assignment 7 posted, due 11/20.
10/21/11 | Assignment 6 posted, due 11/9.
10/28/11 | Corrected Assignment 5 posted (removed requirement for fixed weights in 1(c) and (d)).
10/21/11 | Assignment 5 posted, due 10/30.
10/14/11 | Assignment 4 due date extended to 10/23.
10/14/11 | Revised Assignment 4 posted (including testing code, new submission instructions), due 10/21.
10/14/11 | Assignment 4 posted (including training data), due 10/21.
10/6/11 | Midterm solutions posted.
9/9/11 | Assignment 0 solutions and Assignment 2 solutions posted.
9/9/11 | Assignment 3 posted, due 10/3.
9/11/11 | Corrected version of Assignment 2 posted (fixes typos in Q2 and Q4(c)).
9/9/11 | Assignment 2 posted (including training data), due 9/19.
9/1ish/11 | Assignment 1 posted, due 9/9.
9/1/11 | submit is not working, because 194-10 is just a section of 194; for the time being, email your solutions as firstname.lastname.tar.gz or firstname.lastname.zip to Avital at cs194-tc@imail.eecs.berkeley.edu.
8/25/11 | Corrected version of Assignment 0 posted (fixes typos in Q.3).
8/22/11 | Assignment 0 posted, due 9/2.
8/16/11 | Discussion sections WILL be held in Week 1, i.e., on Aug 24 before the first class; they will be in 310 Soda instead of the usual rooms.
Instructor Stuart Russell
748 Sutardja Dai Hall,
russell AT cs.berkeley.edu;
(510) 642 4964
Office hours Mon 10-12 and Wed 9-10 in 748 Sutardja Dai Hall.
GSIs
- Mert Pilanci, mert AT eecs.berkeley.edu
Office Hours Tue 2-3pm and Wed 4-5pm, both in 751 Soda.
- Avital Steinitz, steinitz AT eecs.berkeley.edu
Office Hours Wed 11-12 in 751 Soda.
Lecture TuTh 3:30-5, 390 Hearst Mining
Discussion sections
101, Wed 10-11am, 75 Evans (Avital)
102, Wed 2-3pm, 3109 Etcheverry (Mert)
103, Wed 3-4pm, 87 Evans (Mert)
Final Exam Friday Dec 16th, 7:00-10:00pm, location TBD.
Prerequisites
Enrollment in the Course
Course Requirements and Grading
Reading List
Syllabus, Lecture Slides/Notes, Readings, Due Dates
Assignments
Computer Accounts and Course Software
Prerequisites
The prerequisite for the course is CS188 or consent of the instructor.
I will assume familiarity with logic, elementary probability theory,
elementary linear algebra, and multivariable calculus.
The overall technical level will be similar to that in 188.
It will help to know Python; if you don't, it can be picked up quickly.
As usual, ask me if you're not sure about whether to take the
course.
Enrollment in the Course
Enrollment is limited to 60 for this offering; the primary criterion
for enrollment will be performance in 188. At present, it is likely the cutoff will be around B+. (This will apply to those who are already enrolled as well as to those on the wait list.)
Since we are not much above 60 at present, it is likely that students on the wait list with a B+ or better will be able to enroll.
If you are not yet enrolled as of the first day of lecture and still
want to be considered for participation in the course,
you should attend class and file an appeal form by the end of the first week of classes.
Appeal forms are available from the CS Office on the 3rd floor of Soda Hall.
Course Requirements and Grading (Preliminary Draft)
- Midterm (18%)
- Final (28%)
- Assignments (54% total) -- 9 assignments (A0 through A8, worth 2, 6, 6, 6, 6, 6, 6, 6, and 10% respectively).
Late policy: NO LATE HOMEWORKS WILL BE ACCEPTED, with the following exception:
Over the semester, you have a total of 5 FREE
LATE DAYS to cover for emergencies.
If you wish to use one or more of these, indicate it clearly on your homework.
Assignments are to be turned in by midnight on the due date.
Grading policy: the class is not graded on a curve.
Grade is based on total percentage as follows:
Grade | Percentage
A+ | [90 -- 100]%
A | [85 -- 90)%
A- | [80 -- 85)%
B+ | [75 -- 80)%
B | [70 -- 75)%
B- | [65 -- 70)%
C+ | [60 -- 65)%
C | [55 -- 60)%
C- | [50 -- 55)%
D+ | [45 -- 50)%
D | [40 -- 45)%
D- | [35 -- 40)%
F | [0 -- 35)%
These boundaries are sharp, i.e., no rounding up.
Some assignments and exam questions may offer extra credit;
good performance on extra credit questions may result
in an improved grade, at the instructor's discretion.
A course grade of F will be assigned if the midterm or final is skipped.
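As a rough illustration of how the weights and sharp boundaries above combine, here is a minimal Python sketch. It is not course software: the names `ASSIGNMENT_WEIGHTS`, `total_percentage`, and `letter_grade` are hypothetical, scores are assumed to be given as fractions between 0 and 1, and the discretionary extra-credit adjustment and the skipped-exam rule are deliberately not modeled.

```python
# Illustrative sketch only; all names here are assumptions, not course software.

# Weights in percent for A0 through A8, plus the two exams.
ASSIGNMENT_WEIGHTS = [2, 6, 6, 6, 6, 6, 6, 6, 10]
MIDTERM_WEIGHT = 18
FINAL_WEIGHT = 28

# Inclusive lower bound of each letter-grade band, highest band first.
GRADE_FLOORS = [
    (90, "A+"), (85, "A"), (80, "A-"),
    (75, "B+"), (70, "B"), (65, "B-"),
    (60, "C+"), (55, "C"), (50, "C-"),
    (45, "D+"), (40, "D"), (35, "D-"),
]


def total_percentage(assignment_scores, midterm, final):
    """Combine fractional scores (each in [0, 1]) into a course percentage."""
    if len(assignment_scores) != len(ASSIGNMENT_WEIGHTS):
        raise ValueError("expected one score per assignment, A0 through A8")
    weighted = sum(w * s for w, s in zip(ASSIGNMENT_WEIGHTS, assignment_scores))
    return weighted + MIDTERM_WEIGHT * midterm + FINAL_WEIGHT * final


def letter_grade(percentage):
    """Map a percentage to a letter grade with no rounding up."""
    for floor, letter in GRADE_FLOORS:
        if percentage >= floor:
            return letter
    return "F"
```

Because the comparison is made against the unrounded percentage, `letter_grade(89.99)` yields "A" rather than "A+", which matches the no-rounding rule.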
Reading List (Preliminary Draft)
The first two books are very helpful, and are available online, so those (in addition to AIMA) will be the primary sources.
Bishop has a wide range of solid mathematical derivations, while Witten and Frank
focus much more on the practical side of applied machine learning and on the Weka package (a Java library and interface for machine learning).
- Trevor Hastie, Rob Tibshirani, and Jerry Friedman, Elements of Statistical Learning, Second Edition, Springer, 2009. (Full pdf available for download.)
- Kevin P. Murphy, Machine Learning: A Probabilistic Perspective. Unpublished. Access information will be provided.
- Stuart Russell and Peter Norvig, Artificial Intelligence: A
Modern Approach, Third Edition, Prentice Hall, 2010.
The machine learning chapters were substantially revised in the third edition; previous editions are not usable for this course.
- Christopher Bishop,
Pattern Recognition and Machine Learning, Springer, 2006.
- Ian Witten and Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques,
Third Edition, Morgan Kaufmann, 2011.
Reading assignments for each week (to prepare for lecture, or review for assignments) appear here.
Assignments
- Assignment 0, due 9/2, 2% of grade. Assignment 0 solutions.
- Assignment 1, due 9/9, 6% of grade. Assignment 1 solutions.
- Assignment 2, due 9/19, 6% of grade. Assignment 2 solutions.
- Assignment 3, due 10/3, 6% of grade. Assignment 3 solutions.
- Midterm, midterm solutions.
- Assignment 4, due 10/23, 6% of grade. Assignment 4 solutions.
- Assignment 5, due 10/30, 6% of grade. Assignment 5 solutions.
- Assignment 6, due 11/9, 6% of grade. Assignment 6 solutions.
- Assignment 7, due 11/20, 6% of grade. Assignment 7 solutions.
Assignments are due by midnight on the day indicated.
Each assignment will include a combination of problems to solve and programs to write and test.
Assignments should be turned in using the submit program
from an instructional (named or class) account, as described here.
If necessary, solutions to the homework problems can be turned in on paper
in the homework box in 283 Soda, or they may be turned
in online (e.g., as pdfs produced from LaTeX) using submit, as part of your overall submission.
Except for Assignment 0, which must be done individually, assignments can be done individually or in pairs. (This goes for both problem-solving and programming parts.)
If done in pairs, each partner should be involved in all the work!!
The usual rule about free-riding applies: the more you free-ride, the lower will be your score on the midterm and final.
Discussion of assignments among students is permitted and encouraged, but solutions
and programs may not be copied. I would recommend NOT mixing inter-group discussion
with writing up of solutions or code. See the
EECS Department Policy on Academic Dishonesty
and Kris Pister's policy for further explanation and examples.
Finding solutions on the web: It is becoming increasingly
difficult to give homework problems whose solutions are not already
available in some form on the web. This does not mean that your first
response to any homework should be to type the question into Google. The EECS
policy begins "Copying all or part of another person's work, or using
reference material not specifically allowed, are forms of cheating."
For the purposes of this course, the allowed reference materials are
the reading materials listed on the course web page and any additional materials specified in the homework;
in addition, you may use Wikipedia for background reference.
It is a good idea to start your programming assignments as soon as you can; computers
have a tendency to go down the night before an assignment is due. There
is evidence from past courses that students who start working well
before the due date take about one third the
time to complete their work compared to students who wait until the last
minute. In general, it will be worth your while to spend more time away
from the screen thinking about programs than struggling with them
on-line.
Computing Facilities
You will have access to department UNIX workstations for this course.
If you already have a "named" account or are enrolled in another
EECS course this semester that provides named accounts, you can use
that account for this class. Otherwise, you will get a "class" account
specifically for CS 194-10 -- see
Information
for New Instructional Users as well as the departmental
policies.
Course Software
The primary programming language for the course will be Python, including the numpy, scipy, and matplotlib packages.
In addition we may be using special-purpose machine learning software packages.
Class newsgroup
The class newsgroup will be on Piazza; students who are enrolled or on
the waitlist will receive an email inviting them to access the
newsgroup.
The class newsgroup is suitable for asking general questions about
what the homework questions mean, how the course software works,
etc. Do not ask or answer specific questions about homework solutions,
e.g., "What's the right answer for number 2?" One of the course GSIs
will be checking the newsgroup fairly regularly, but for "official"
answers to important questions you might want to email your own GSI
directly, AFTER you have checked to see if the question has already
been answered on the newsgroup!