CS 289A: Machine Learning
(Spring 2016)
Project
20% of final grade.
The project should be done in teams of 2–3 students.
Please find a partner.
Teaching Assistant Tuomas Haarnoja,
haarnoja@berkeley.edu,
is in charge of project supervision.
Please discuss your ideas with Tuomas, Professor Shewchuk, or other TAs
before submitting your initial proposal.
Deliverables
- Initial proposal, due Monday, April 4
- Project video (maximum 3 minutes), due Friday, May 6, 50% of score
- Final report (maximum 8 pages), due Friday, May 6, 50% of score
Overview
The project theme may be anything related to machine learning techniques
discussed in class, including
- critically revisiting a published paper (including reproducing
the experiments, generating new graphs and visual representations, and
discussing the results);
- writing a literature review in a specific domain
(e.g., transfer learning, active learning, meta-learning) and
making a critical comparison (ideally with common datasets);
- conducting original theoretical research (e.g., attack one of the COLT
open problems; scroll to the bottom of the
2014
or 2015 open problems pages);
- conducting original practical research by applying
machine learning methods to a public or private dataset
(see project ideas below).
You are welcome and encouraged to design a project that
is related to your research outside this course.
However, please be honorable and don't suggest a project that
you've already implemented as part of your research.
Initial proposal
The initial proposal is just that: a proposal. It need not be long.
Write a few paragraphs describing what you have decided to do.
You may have any number of figures and references.
- Include background information: What is the application domain or
field of research? Why is the problem important?
What specific questions will you try to answer?
- Talk about what data sources you plan to use, including
the number(s) of samples and number(s) of features.
If it's visual data, include an illustration if possible.
- Explain what methods you are planning to use and why.
- If you have done preliminary work, explain what you have already done
(e.g., downloaded and played with the data, tried k-nearest neighbors,
made a mock-up of the user interface, etc.).
- The initial proposal is not graded. Its purpose is to make sure
you start early and to let us give you feedback on your idea.
- Please submit the initial proposal (like the final) through bCourses.
Video
- The video should be clear and understandable, describing everything
you think is important about your project
(motivation, description, techniques, results, etc.).
- The video needs to be self-contained: any CS 289A student should
be able to understand what you did (at least at a high level)
without consulting any other materials.
- You can make the video as simple as slides with a voice overlay, or
as fancy as you want.
- As long as it is clear and understandable, you will not be graded on
the fanciness of the video. Content is what will matter.
(Fanciness might be fun, though.)
- You must upload the video on YouTube and provide us with the link.
You may choose to keep the video private (i.e., only those with
the link can view it), in which case only the instructors will view it.
You can make the video public if you want to.
- Important:
The video can be at most 3 minutes long.
This is a very strict requirement;
a video of length 3 minutes and 1 second does not count.
The length is counted as whatever YouTube says it is.
In case you are worried about how you will fit an entire class project
into three minutes, take a look at
these
videos, which fit an entire PhD thesis into three minutes.
Final report
- We encourage you to use a template from your favorite
machine learning conference (e.g.,
NIPS
or ICML).
- There is no minimum length requirement. The maximum length is 8 pages.
- The submission must be made through bCourses.
Any one member of the group can submit the report.
Please include the full names, student IDs, and email addresses for
all the members of the group.
- Also, for inspiration, here are some of
the final projects from
a Neural Networks class at Stanford.
It is a truth universally acknowledged that
your work will be better than Stanford's.
Grading criteria
The video and the final report will each be graded on five criteria.
- Relevance: should be related to machine learning techniques.
- Usefulness: should answer good questions or
solve problems worth solving.
(The questions should be clearly stated in the proposal.)
- Soundness: choose data sets with enough examples to get
statistically significant results;
conduct sound numerical experiments
(split the data into training/validation/test sets);
make comparative result tables using validation or cross-validation;
use the test set only for final assessment;
include error bars if appropriate;
add graphs and other good means of visualization
(e.g., projections onto principal components);
provide sound proofs;
if you choose a literature review, mention the most important papers in
the area and give proper credit.
- Clarity/presentation: good paper organization, good bibliography,
enough graphs and visual support, length should not exceed 8 pages.
- Novelty/originality: we do not require novelty/originality, but
it could add a few points.
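The Soundness criterion above can be illustrated with a minimal sketch of a sound experimental protocol. Everything here is invented for illustration (synthetic two-class data and a plain-Python k-nearest-neighbors classifier); the point is the discipline: hold out the test set first, select a model by cross-validation on the training set only, and touch the test set exactly once for the final assessment.

```python
import random
from collections import Counter

random.seed(0)

# Synthetic 2-class data: class 1 clusters near (1, 1), class 0 near (0, 0).
def make_point(label):
    c = float(label)
    return ([c + random.gauss(0, 0.4), c + random.gauss(0, 0.4)], label)

data = [make_point(i % 2) for i in range(200)]
random.shuffle(data)

# Hold out a test set FIRST; it is used only for the final assessment.
test, train = data[:40], data[40:]

def knn_predict(fit_set, x, k):
    # Vote among the k nearest training points (squared Euclidean distance).
    nearest = sorted(fit_set,
                     key=lambda p: (p[0][0] - x[0])**2 + (p[0][1] - x[1])**2)
    labels = [lab for _, lab in nearest[:k]]
    return Counter(labels).most_common(1)[0][0]

def accuracy(k, fit_set, eval_set):
    correct = sum(knn_predict(fit_set, x, k) == y for x, y in eval_set)
    return correct / len(eval_set)

def cross_val_score(train, k, folds=5):
    # Average validation accuracy over `folds` contiguous folds.
    n, scores = len(train), []
    for f in range(folds):
        val = train[f * n // folds:(f + 1) * n // folds]
        fit = train[:f * n // folds] + train[(f + 1) * n // folds:]
        scores.append(accuracy(k, fit, val))
    return sum(scores) / folds

# Model selection: compare hyperparameters by cross-validation on training data only.
cv = {k: cross_val_score(train, k) for k in (1, 5, 15)}
best_k = max(cv, key=cv.get)

# Final assessment: touch the test set exactly once, with the chosen model.
test_acc = accuracy(best_k, train, test)
print("best k:", best_k, " test accuracy:", round(test_acc, 2))
```

In a real project the same skeleton applies with your actual dataset and models; the part that matters is that the test set never influences the choice of model.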
Project ideas
The ideas in this list fall mainly under the fourth category,
practical research.
If you prefer to revisit an important paper, simply pick a paper.
If you prefer to conduct a literature review,
simply pick a machine learning topic that interests you.
If you prefer to conduct theoretical research, you'd better already know
what you're doing.
- Prof. Shewchuk has an idea for clustering he wants to test empirically.
Specifically, it's an algorithm for graph clustering based on
a relatively recent idea called isoperimetric graph partitioning.
The goal is to see whether it is competitive with traditional
spectral graph clustering methods on large, sparse graphs
while enjoying faster speed and greater simplicity
(because it relies on solving a sparse linear system rather than
solving a sparse eigenvector problem).
Ask him if you're interested.
- Predict the price of Bitcoin. Data is available
here.
- Wikipedia has a
large
corpus of categorized documents
(category indexes with categories like “Lists of actors”).
Can you predict document categories from document text?
- The 3D Shape
Retrieval Contest (SHREC) has various challenges and datasets
available.
- Businesses love to promote themselves on Yelp,
sometimes by posting fake positive reviews of themselves.
With this dataset,
predict which reviews are likely to have been written by
the business itself.
- Browsers typically download files to the Downloads folder
(or another fixed, set folder).
Develop a method for automatically placing files in
the appropriate folder.
You can constrain yourself to text documents, or to images.
- Cross-validation seems to be the consensus method for model selection.
But which method should be used for performance prediction?
Cross-validation won the performance prediction challenge in 2006.
Using the data of this
performance prediction challenge, study various methods of
performance prediction, including the bootstrap and cross-validation.
Which value of k in k-fold cross-validation is best? Why?
Useful hints may be found in the book
Hands-On
Pattern Recognition.
- Automatic Machine Learning.
The ChaLearn AutoML challenge
asks you to build the perfect black box for machine learning.
With the tools you've learned in class, put together your own black box
and enter the challenge.
- Google's Deep Dream,
a platform for neural-network-based art, went viral on the internet.
You could propose to add to its capabilities or duplicate
some of its functionality.
You can also look at capturing
artistic styles.
- A recent paper shows
how to train a convolutional neural network to play Go surprisingly well,
by doing pattern matching instead of brute-force search.
Think of another domain where patterns of past memories might do better
than brute-force computation.
- There is an interesting paper
on fooling neural networks for classification
(see also this
blog
post).
But these methods examine the neural network they want to fool before
adding noise to the images.
It would be interesting to study the effect of adding noise to images
independently of the classifier, and
see how classifier accuracy varies with the noise.
Time permitting, a second phase would be
to make a classifier robust to such noise.
Motivation: suppose you want to fool the image classifier of
Google or flickr, but you don't have access to those classifiers,
so you would like to add the most effective general-purpose noise
to fool them.
Conversely, if you work at Google, you would like to make
your classifier resilient to that.
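The classifier-independent-noise experiment in the last idea can be sketched in a few lines. Everything here is a toy stand-in (a nearest-centroid classifier on synthetic 2-D data, invented for illustration, in place of an image CNN); the point is only the shape of the experimental loop: sweep the noise magnitude and record how accuracy degrades without ever examining the classifier's internals.

```python
import random

random.seed(1)

# Toy stand-in for an image classifier: nearest centroid on 2-D synthetic data.
# Class c clusters around (c, c) for c in {0, 1, 2}.
centroids = {c: (float(c), float(c)) for c in range(3)}

def classify(x):
    return min(centroids,
               key=lambda c: (x[0] - centroids[c][0])**2
                           + (x[1] - centroids[c][1])**2)

# 100 clean samples per class.
data = [((c + random.gauss(0, 0.2), c + random.gauss(0, 0.2)), c)
        for c in range(3) for _ in range(100)]

def accuracy_with_noise(sigma):
    """Accuracy after adding classifier-independent Gaussian noise of scale sigma."""
    correct = 0
    for (x0, x1), y in data:
        noisy = (x0 + random.gauss(0, sigma), x1 + random.gauss(0, sigma))
        correct += classify(noisy) == y
    return correct / len(data)

# Sweep the noise magnitude; accuracy should fall toward chance (1/3) as sigma grows.
for sigma in (0.0, 0.25, 0.5, 1.0, 2.0):
    print("sigma =", sigma, " accuracy =", round(accuracy_with_noise(sigma), 2))
```

For the actual project, the classifier would be a trained image network and the noise would be added to test images; the second phase (robustness) amounts to retraining with noisy copies of the training images and rerunning the same sweep.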
Other useful data sources: