ROBOTICS INSTITUTE
CARNEGIE MELLON UNIVERSITY

16-721: Advanced Machine Perception

Spring 2006

 

OVERVIEW

16-721 is a graduate seminar devoted to recent research in computer vision. We will read an eclectic mix of vision papers on topics such as perception, object and scene recognition, segmentation, and tracking, as well as "best papers of all time".

We will meet on Mondays and Wednesdays from 10:30 a.m. to 11:50 a.m. in NSH 3002. The first meeting will be on Monday, January 16, and the final meeting will be on Wednesday, May 3, 2006.

Instructor: Alexei (Alyosha) Efros, Assistant Professor, 4207 Newell-Simon Hall.

Office Hours:    Monday 12:-12:30 p.m.

Friday 2:30-3:30 p.m.

TA: David Bradley, 2216 Newell-Simon Hall. 

Office Hours: Tuesday 1:00-2:00 p.m. or by appointment.

Feel free to send email to efros (at) cs or dbradley (at) cs with any questions.

PROJECTS

Check out this list of data sources for some ideas on where to get images to work with.

*NEW* 20-minute project meetings will be held with each group every other week: at Craig Street Coffee on Mondays and on campus on Wednesdays.

Time

Monday (A)

Wednesday (A)

Monday (B)

Wednesday (B)

12:10 - 12:30

N/A

Zickler

N/A

Vallespi

12:30 - 12:50

Thompson & Dunlop

 

Batra & Kim

 

12:50 - 1:10

Ramnath

Chan & Barnum

1:10 - 1:30

Djugash

 

1:30 - 1:50

Melchior

 

 

MEETING SCHEDULE

A list of suggested papers to present is available here.

For some journal-length papers, shorter conference versions have been posted. Feel free to read either version.

The discussion board for signing up for papers is now available here.

Sign up for at least 2 papers, demo 1 and oppose 1.

If you want to change your presentation date, please arrange a swap with another student and notify the instructor at least two weeks in advance.

date

Presenter

paper title

author(s)

discussion board

slides

Jan. 16

Alyosha Efros

Introduction, Vision: Measurement vs. Perception

Administrative stuff, overview of the course, datasets

 

 

ppt

Jan. 18

Alyosha Efros

Overview lecture on the physiology of vision

Suggested reading: The Plenoptic Function and the Elements of Early Vision (1991)

Adelson & Bergen

here

ppt

Jan. 23

Alyosha Efros

Dave Thompson

Overview lecture on theories of Visual Perception

Vision is getting easier every day (1995)

What's up in top-down processing? (1991)

Pictorial art and vision (1991)

Patrick Cavanagh

here

Perception ppt

Cavanagh ppt

Part I: Low-level Vision (images as texture)

Jan. 25

Peter Barnum

 

Heather Dunlop

Presenting: The Earth Mover's Distance as a Metric for Image Retrieval. (conference version)

Optional Reading: Empirical Evaluation of Dissimilarity Measures for Color and Texture

 

Presenting: Learning to Detect Natural Image Boundaries Using Local Brightness, Color, and Texture Cues (conference version)

Rubner, Tomasi, & Guibas

Rubner, Puzicha, Tomasi, & Buhmann

 

Martin, Fowlkes, Malik

Rubner

Martin

Rubner ppt

Martin ppt

Jan. 30

Jonathan Huang

Statistics of Natural Image Categories

Optional Reading: Depth estimation from image structure

Optional Reading: Modeling the shape of the scene: a holistic representation of the spatial envelope

Torralba & Oliva

Torralba & Oliva

Oliva & Torralba

here

ppt

Feb. 1

Alyosha Efros

 

 

Presenting an overview of bag-of-words approaches:

      Optional: When is scene recognition just texture recognition?

      Optional: Visual categorization with bags of keypoints

      Optional: Object Categorization by Learned Universal Visual Dictionary

 

 

Renninger, L.W. & Malik, J.

G. Csurka, C. Bray, C. Dance, and L. Fan

J. Winn, A. Criminisi and T. Minka

Bag-o-words

ppt

Feb. 6

David Bradley

Presenting: Object Recognition with Informative Features and Linear Classification

Optional Reading: Visual features of intermediate complexity and their use in classification

Michel Vidal-Naquet, Shimon Ullman

Ullman, S., Vidal-Naquet, M., and Sali, E.

Ullman

ppt

Feb. 8

Tomasz Malisiewicz (P)

Alyosha Efros (O)

A Bayesian hierarchical model for learning natural scene categories

Discovering Objects and their Location in Images

L. Fei-Fei and P. Perona

Josef Sivic, Bryan Russell, Alexei A. Efros, Andrew Zisserman, Bill Freeman

here

ppt

Part II: Mid-level Vision (Image Segmentation)

Feb. 13-15

Carlos Vallespi (P)

Joseph Djugash (D)

Gunhee Kim (O)

Normalized cuts and image segmentation

Segmentation using eigenvectors: a unifying view

Shi, J. and Malik, J.

Weiss, Y.

here

ppt

Feb. 20

Mohit Gupta (P)

Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images

Optional: Lazy Snapping

 

Optional: Video Object Cut and Paste (cool SIGGRAPH video)

Boykov & Jolly

Yin Li, Jian Sun, Chi-Keung Tang, Heung-Yeung Shum

Yin Li, Jian Sun, Heung-Yeung Shum

here

ppt

Feb. 22

 

 

Project Proposals

 

 

 

 

Feb. 27

Derek Hoiem

Geometric Context from a Single Image

Derek Hoiem, Alexei Efros, Martial Hebert

here

 

Mar. 1

Tomasz Malisiewicz (P)

Image Segmentation by Data-Driven Markov Chain Monte Carlo

Tu and Zhu

here

ppt

Part III: 2D Recognition

Mar. 6 (A)

Nicolas Chan (P)

Tomasz Malisiewicz (O)

Pete Barnum (D)

Object Detection Using the Statistics of Parts

Robust Real-time Object Detection

H. Schneiderman and T. Kanade

Viola, Jones

here

Main ppt

Opp. ppt

Demo ppt

Mar. 8 (A)

Pete Barnum (P)

Histograms of Oriented Gradients for Human Detection

Dalal, Triggs

here

ppt

Mar. 13

 

Spring Break

 

 

 

Mar. 15

 

Spring Break

 

 

 

Mar. 20 (B)

David Lee (P)

Heather Dunlop (D)

David Thompson (O)

Object Recognition from Local Scale-Invariant Features

David G. Lowe

here

Main ppt

Demo ppt

Mar. 22 (B)

Stephan Zickler (P)

Real-time Object Detection for Smart Vehicles

Optional: Automatic Target Recognition by Matching Oriented Edge Pixels

Gavrila & Philomin

Olson & Huttenlocher

here

Main ppt

Mar. 27 (A)

Gunhee Kim (P)

Joseph Djugash (O)

? (D)

Shape Matching and Object Recognition Using Shape Contexts

Shape Matching and Object Recognition using Low Distortion Correspondences

Belongie, Malik, and Puzicha

A Berg, T Berg, J Malik

here

Main ppt

Mar. 29 (A)

Dhruv Batra (P)

Krishnan Ramnath (D)

*NEW: paper changed to a more readable version* Active Appearance Models

Optional: Active Appearance Models Revisited

Optional: Manipulating Facial Appearance Through Shape and Color

Optional: A Morphable Model for the Synthesis of 3D Faces

T. F. Cootes, G. J. Edwards, C. J. Taylor

Matthews & Baker

Rowland & Perrett

Blanz & Vetter

here

Main ppt

Demo (quicktime)

Recognition with Segmentation

Apr. 3 (B)

Joseph Djugash (P)

Heather Dunlop (O)

Class-Specific, Top-Down Segmentation

Learning to Segment

Combining Top-Down and Bottom-Up Segmentation

Eran Borenstein, Shimon Ullman

Eran Borenstein, Shimon Ullman

E. Borenstein, E. Sharon, S. Ullman

here

Opp ppt

Apr. 5 (B)

Dhruv Batra (P)

Pedestrian Detection in Crowded Scenes

B Leibe, E Seemann, B Schiele

Here

Main ppt

Apr. 10 (A)

Nik Melchior (P)

David Lee (O)

Stephan Zickler (D)

LOCUS: Learning Object Classes with Unsupervised Segmentation

J. Winn and N. Jojic

Here

Main (openoffice)

Opp ppt

Demo ppt

Apr. 12 (A)

David Lee (P)

Carlos Vallespi (O)

Gunhee Kim (D)

Context-based vision system for place and object recognition

Optional: Contextual Models for Object Detection using Boosted Random Fields

A. Torralba, K. P. Murphy, W. T. Freeman and M. A. Rubin

Here

Main ppt

Opp ppt

Machine Translation Approaches

Apr. 17 (B)

Heather Dunlop (P)

Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary

 

Matching Words and Pictures

Pinar Duygulu, Kobus Barnard, Nando de Freitas, and David Forsyth

Kobus Barnard, Pinar Duygulu, Nando de Freitas, David Forsyth, David Blei, and Michael I. Jordan

Here

Main ppt

Apr. 19 (B)

Krishnan Ramnath (P)

Nicolas Chan (D)

Nicolas Chan (O)

Names and Faces

Tamara L. Berg, Alexander C. Berg, Jaety Edwards, Michael Maire, Ryan White, Yee Whye Teh, Erik Learned-Miller, David A. Forsyth

Here

Main ppt

Intrinsic Images

Friday

Apr. 21 (A)

NSH 4201

Mohit Gupta (P)

Mohit Gupta (D)

 

Malola Prasath (P)

David Lee (D)

Nik Melchior (O)

Deriving intrinsic images from image sequences

 

 

Recovering Intrinsic Images from a Single Image

Yair Weiss

 

 

Marshall F Tappen, William T Freeman, Edward H Adelson

Here

 

Here

Main and Demo ppt

Demo ppt

Opp (pdf)

Apr. 26 (A)

Stephan Zickler (P)

The Perception of Shading and Reflectance

Adelson & Pentland

Here

Main ppt

Manifold Learning

May 1

Dave Thompson (P)

Jonathan Huang (O)

Nik Melchior (D)

A global geometric framework for nonlinear dimensionality reduction

 

Nonlinear dimensionality reduction by locally linear embedding

J. B. Tenenbaum, V. De Silva, and J. C. Langford

Sam Roweis & Lawrence Saul

Here

Main pdf

May 3

Ramnath & Gupta

Batra & Kim

Huang & Malisiewicz

Melchior & Lee

Stefan Zickler

 

Project Presentations – Part I

 

 

 

 

May 8

Chan & Barnum

Djugash

Thompson & Dunlop

Carlos Vallespi

Project Presentations – Part II

 

 

 

 

RELEVANT TEXTS

Vision Science: Photons to Phenomenology by Stephen E. Palmer
Computer Vision: A Modern Approach by Forsyth and Ponce
Introductory Techniques for 3-D Computer Vision by Trucco and Verri
An Invitation to 3D Vision: From Images to Geometric Models by Y. Ma, S. Soatto, J. Kosecka, and S. Sastry
Multiple View Geometry in Computer Vision by Hartley and Zisserman
The Geometry of Multiple Images by Faugeras, Luong, and Papadopoulo
Neural Networks for Pattern Recognition by Bishop


Most recently updated on January 27, 2006 by David Bradley

Site design courtesy of Serge Belongie.