16-721: Advanced Perception

ROBOTICS INSTITUTE
CARNEGIE MELLON UNIVERSITY

16-721: Advanced Machine Perception

Spring 2006

OVERVIEW

16-721 is a graduate seminar devoted to recent research on computer vision. We will be reading an eclectic mix of vision papers on topics such as perception, object and scene recognition, segmentation, and tracking, as well as "best papers of all time".

We will meet on Mondays and Wednesdays from 10:30am-11:50am in NSH 3002. The first meeting will be on Monday January 16th, and the final meeting will be on Wednesday May 3, 2006.

Instructor: Alexei (Alyosha) Efros, Assistant Professor, 4207 Newell-Simon Hall.

Office Hours: Monday 12:-12:30 p.m. and Friday 2:30-3:30 p.m.

TA: David Bradley, 2216 Newell-Simon Hall.

Office Hours: Tuesday 1:00-2:00 p.m. or by appointment.

Feel free to send email to efros (at) cs or dbradley (at) cs with any questions.

PROJECTS

Check out this list of data sources for some ideas on where to get images to work with.

NEW: 20-minute project meetings will be held with each group every other week, at Craig Street Coffee on Mondays and on campus on Wednesdays.

Time	Monday (A)	Wednesday (A)	Monday (B)	Wednesday (B)
12:10 - 12:30	N/A	Zickler	N/A	Vallespi
12:30 - 12:50	Thompson & Dunlop		Batra & Kim
12:50 - 1:10	Ramnath		Chan & Barnum
1:10 - 1:30	Djugash
1:30 - 1:50	Melchior

MEETING SCHEDULE

A list of suggested papers to present is available here.

For some journal-length papers, shorter conference versions have been posted. Feel free to read either paper.

The discussion board for signing up for papers is now available here.

If you want to change your presentation date, please arrange a swap with another student and notify the instructor at least two weeks in advance.

Date	Presenter	Paper title	Author(s)	Discussion board	Slides
Jan. 16	Alyosha Efros	Introduction, Vision: Measurement vs. Perception Administrative stuff, overview of the course, datasets			ppt
Jan. 18	Alyosha Efros	Overview lecture on the physiology of vision Suggested reading: The Plenoptic Function and the Elements of Early Vision (1991)	Adelson & Bergen	here	ppt
Jan. 23	Alyosha Efros Dave Thompson	Overview lecture on theories of Visual Perception Vision is getting easier every day (1995) What's up in top-down processing? (1991) Pictorial art and vision (1991)	Patrick Cavanagh	here	Perception ppt Cavanagh ppt
Part I: Low-level Vision (images as texture)
Jan. 25	Peter Barnum Heather Dunlop	Presenting: The Earth Mover's Distance as a Metric for Image Retrieval. (conference version) Optional Reading: Empirical Evaluation of Dissimilarity Measures for Color and Texture Presenting: Learning to Detect Natural Image Boundaries Using Local Brightness, Color, and Texture Cues (conference version)	Rubner, Tomasi, & Guibas Rubner, Puzicha, Tomasi, & Buhmann Martin, Fowlkes, Malik	Rubner Martin	Rubner ppt Martin ppt
Jan. 30	Jonathan Huang	Statistics of Natural Image Categories Optional Reading: Depth estimation from image structure Optional Reading: Modeling the shape of the scene: a holistic representation of the spatial envelope	Torralba & Oliva Torralba & Oliva Oliva & Torralba	here	ppt
Feb. 1	Alyosha Efros	Presenting an overview of bag-of-words approaches: Optional: When is scene recognition just texture recognition? Optional: Visual categorization with bags of keypoints Optional: Object Categorization by Learned Universal Visual Dictionary	Renninger, L.W. & Malik, J G. Csurka, C. Bray, C. Dance, and L. Fan Winn, A. Criminisi and T. Minka	Bag-o-words	ppt
Feb. 6	David Bradley	Presenting: Object Recognition with Informative Features and Linear Classification Optional Reading: Visual features of intermediate complexity and their use in classification	Ullman, S., Vidal-Naquet, M., and Sali, E. Michel Vidal-Naquet, Shimon Ullman	Ullman	ppt
Feb. 8	Tomasz Malisiewicz (P) Alyosha Efros (O)	A Bayesian hierarchical model for learning natural scene categories. Discovering Objects and their Location in Images	Fei-Fei and P. Perona Josef Sivic, Bryan Russell, Alexei A. Efros, Andrew Zisserman, Bill Freeman	here	ppt
Part II: Mid-level Vision (Image Segmentation)
Feb. 13-15	Carlos Vallespi (P) Joseph Djugash (D) Gunhee Kim (O)	Normalized cuts and image segmentation Segmentation using eigenvectors: a unifying view	Jianbo Shi; Malik, J. Weiss, Y.	here	ppt
Feb. 20	Mohit Gupta (P)	Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in ND Images Optional: Lazy Snapping Optional: Video Object Cut and Paste (cool SIGGRAPH video)	Boykov & Jolly Yin Li, Jian Sun, Chi-Keung Tang, Heung-Yeung Shum Yin Li, Jian Sun, Heung-Yeung Shum	here	ppt
Feb. 22		Project Proposals
Feb. 27	Derek Hoiem	Geometric Context from a Single Image	Derek Hoiem, Alexei Efros, Martial Hebert	here
Mar. 1	Tomasz Malisiewicz (P)	Image Segmentation by Data-Driven Markov Chain Monte Carlo	Tu and Zhu	here	ppt
Part III: 2D Recognition
Mar. 6 (A)	Nicolas Chan (P) Tomasz Malisiewicz (O) Pete Barnum (D)	Object Detection Using the Statistics of Parts Robust Real-time Object Detection	H. Schneiderman and T. Kanade Viola, Jones	here	Main ppt Opp. ppt Demo ppt
Mar. 8 (A)	Pete Barnum (P)	Histograms of Oriented Gradients for Human Detection	Dalal, Triggs	here	ppt
Mar. 13		Spring Break
Mar. 15		Spring Break
Mar. 20 (B)	David Lee (P) Heather Dunlop (D) David Thompson (O)	Object Recognition from Local Scale-Invariant Features	David G. Lowe	here	Main ppt Demo ppt
Mar. 22 (B)	Stephan Zickler (P)	Real-time Object Detection for Smart Vehicles Optional: Automatic Target Recognition by Matching Oriented Edge Pixels	Gavrila & Philomin Olson & Huttenlocher	here	Main ppt
Mar. 27 (A)	Gunhee Kim (P) Joseph Djugash (O) ? (D)	Shape Matching and Object Recognition Using Shape Contexts Shape Matching and Object Recognition using Low Distortion Correspondences	Belongie, Malik, and Puzicha A Berg, T Berg, J Malik	here	Main ppt
Mar. 29 (A)	Dhruv Batra (P) Krishnan Ramnath (D)	*NEW: paper changed to a more readable version* Active Appearance Models Optional: Active Appearance Models Revisited Optional: Manipulating Facial Appearance Through Shape and Color Optional: A Morphable Model for the Synthesis of 3D Faces	T. F. Cootes, G. J. Edwards, C. J. Taylor Matthews & Baker Rowland & Perrett Blanz & Vetter	here	Main ppt Demo (quicktime)
Recognition with Segmentation
Apr. 3 (B)	Joseph Djugash (P) Heather Dunlop (O)	Class-Specific, Top-Down Segmentation Learning to Segment Combining Top-Down and Bottom-Up Segmentation	Eran Borenstein, Shimon Ullman Eran Borenstein, Shimon Ullman E. Borenstein, E. Sharon, S. Ullman	here	Opp ppt
Apr. 5 (B)	Dhruv Batra (P)	Pedestrian Detection in Crowded Scenes	B Leibe, E Seemann, B Schiele	Here	Main ppt
Apr. 10 (A)	Nik Melchior (P) David Lee (O) Stephan Zickler (D)	LOCUS: Learning Object Classes with Unsupervised Segmentation	J. Winn and N. Jojic	Here	Main (openoffice) Opp ppt Demo ppt
Apr. 12 (A)	David Lee (P) Carlos Vallespi (O) Gunhee Kim (D)	Context-based vision system for place and object recognition Optional: Contextual Models for Object Detection using Boosted Random Fields	A. Torralba, K. P. Murphy, W. T. Freeman and M. A. Rubin	Here	Main ppt Opp ppt
Machine Translation Approaches
Apr. 17 (B)	Heather Dunlop (P)	Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary Matching Words and Pictures	Pinar Duygulu, Kobus Barnard, Nando de Freitas, and David Forsyth Kobus Barnard, Pinar Duygulu, Nando de Freitas, David Forsyth, David Blei, and Michael I. Jordan	Here	Main ppt
Apr. 19 (B)	Krishnan Ramnath (P) Nicolas Chan (D) Nicolas Chan (O)	Names and Faces	Tamara L. Berg, Alexander C. Berg, Jaety Edwards, Michael Maire, Ryan White, Yee Whye Teh, Erik Learned-Miller, David A. Forsyth	Here	Main ppt
Intrinsic Images
Friday Apr. 21 (A) NSH 4201	Mohit Gupta (P) Mohit Gupta (D) Malola Prasath (P) David Lee (D) Nik Melchior (O)	Deriving intrinsic images from image sequences Recovering Intrinsic Images from a Single Image	Yair Weiss Marshall F Tappen, William T Freeman, Edward H Adelson	Here Here	Main and Demo ppt Demo ppt Opp (pdf)
Apr. 26 (A)	Stephan Zickler (P)	The Perception of Shading and Reflectance	Adelson & Pentland	Here	Main ppt
Manifold Learning
May 1	Dave Thompson (P) Jonathan Huang (O) Nik Melchior (D)	A global geometric framework for nonlinear dimensionality reduction Nonlinear dimensionality reduction by locally linear embedding	J. B. Tenenbaum, V. De Silva, and J. C. Langford Sam Roweis & Lawrence Saul	Here	Main pdf
May 3	Ramnath & Gupta Batra & Kim Huang & Malisiewicz Melchior & Lee Stefan Zickler	Project Presentations – Part I
May 8	Chan & Barnum Djugash Thompson & Dunlop Carlos Vallespi	Project Presentations – Part II

RELEVANT TEXTS

Vision Science: Photons to Phenomenology by Stephen E. Palmer
Computer Vision: A Modern Approach, Forsyth and Ponce
Introductory Techniques for 3-D Computer Vision, Trucco and Verri
An Invitation to 3D Vision: From Images to Geometric Models, Y. Ma, S. Soatto, J. Kosecka, S. Sastry
Multiple View Geometry in Computer Vision by Hartley & Zisserman
The Geometry of Multiple Images by Faugeras, Luong, and Papadopoulo
Neural Networks for Pattern Recognition, Bishop.

Most recently updated on January 27, 2006 by David Bradley.

Site design courtesy of Serge Belongie.
