16-721: Advanced Perception

ROBOTICS INSTITUTE
CARNEGIE MELLON UNIVERSITY

Spring 2006

*NEW* The discussion board for signing up for papers is now available here *NEW*

PAPERS CURRENTLY CHOSEN:

Adelson & Bergen, The Plenoptic Function and the Elements of Early Vision

Cavanagh, P. (1996). Vision is getting easier every day. Perception, 24, 1227-1232.

Cavanagh, P. (1991). What's up in top-down processing? In A. Gorea (ed.) Representations of Vision: Trends and Tacit Assumptions in Vision Research, Cambridge, UK: Cambridge University Press, 295-304.

Cavanagh, P. (1999). Pictorial art and vision. In Robert A. Wilson and Frank C. Keil (Eds.), MIT Encyclopedia of Cognitive Science, (pp. 648-651) Cambridge, MA: MIT Press.

SUGGESTED PAPERS

Part I: Low-level Vision (images as texture)

Olshausen & Field, Wavelet-like receptive fields emerge from a network that learns sparse codes for natural images. (1996) Nature, 381: 607-609. (code available)

Y. Rubner, C. Tomasi, and L. J. Guibas. The Earth Mover's Distance as a Metric for Image Retrieval. International Journal of Computer Vision, 40(2) November 2000, pages 99-121. (code available)

Y. Rubner, J. Puzicha, C. Tomasi, and J. M. Buhmann. Empirical Evaluation of Dissimilarity Measures for Color and Texture. Computer Vision and Image Understanding Journal, 84(1):25-43, October 2001.

Advocate: Peter Barnum

Martin, Fowlkes, Malik, Learning to Detect Natural Image Boundaries Using Local Brightness, Color, and Texture Cues. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(5):530-549, May 2004. (short version) (code and data available)

Advocate: Heather Dunlop

Scene Models

A. Torralba and A. Oliva. Statistics of Natural Image Categories (2003) Network: Computation in Neural Systems. Vol. 14, 391-412.

A. Torralba, A. Oliva. Depth estimation from image structure (2002) IEEE Transactions on Pattern Analysis and Machine Intelligence. 24(9): 1226-1238.

A. Oliva, A. Torralba. Modeling the shape of the scene: a holistic representation of the spatial envelope. (2001) International Journal of Computer Vision, Vol. 42(3): 145-175.

Advocate: Jonathan Huang

"Bag of Words" Models

Renninger, L.W. & Malik, J. (2004). When is scene recognition just texture recognition? Vision Research, 44, 2301-2311 (data available)

G. Csurka, C. Bray, C. Dance, and L. Fan. Visual categorization with bags of keypoints. In Workshop on Statistical Learning in Computer Vision, ECCV, pages 1-22, 2004.

J. Winn, A. Criminisi and T. Minka. Object Categorization by Learned Universal Visual Dictionary. Proc. IEEE Intl. Conf. on Computer Vision (ICCV), Beijing 2005.

To be briefly covered by Alyosha Efros

Ullman, S., Vidal-Naquet, M., and Sali, E. Visual features of intermediate complexity and their use in classification. (2002) Nature Neuroscience, 5(7), 1-6

Michel Vidal-Naquet, Shimon Ullman. Object Recognition with Informative Features and Linear Classification. ICCV 2003

Advocate: David Bradley

Fei-Fei and P. Perona. A Bayesian hierarchical model for learning natural scene categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Diego, CA, volume 2, pages 524-531, June 2005. (code available)

Advocate: Tomasz Malisiewicz

Demo: Ellie Lin

Josef Sivic, Bryan Russell, Alexei A. Efros, Andrew Zisserman, Bill Freeman, Discovering Objects and their Location in Images, ICCV 2005 (code available)

Advocate: Tomasz Malisiewicz

Part II: Mid-level Vision (Image Segmentation)

Max Wertheimer, Laws of Organization in Perceptual Forms (1923)

Jianbo Shi; Malik, J. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, Aug. 2000, vol.22, (no.8):888-905. (code available)

Advocate: Carlos Vallespi

Demo: Joseph Djugash

Meila, M. and Shi, J. Learning Segmentation with Random Walks. Advances in Neural Information Processing Systems 13 (NIPS 2000).

Weiss, Y. Segmentation using eigenvectors: a unifying view. Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece, 20-27 Sept. 1999.

Advocate: Carlos Vallespi

Andrew Y. Ng, Michael I. Jordan, Yair Weiss, On Spectral Clustering: Analysis and an algorithm (2001) NIPS

Xiaofeng Ren and Jitendra Malik, Learning a Classification Model for Segmentation. in ICCV '03 (superpixel code available)

Tu and Zhu, Image Segmentation by Data-Driven Markov Chain Monte Carlo, PAMI (2002)

Advocate: Tomasz Malisiewicz

D. Comaniciu, P. Meer. Mean Shift: A Robust Approach toward Feature Space Analysis, IEEE Trans. Pattern Analysis Machine Intell., Vol. 24, No. 5, 603-619, 2002

Boykov & Jolly, Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images. ICCV 01

Yin Li; Jian Sun; Chi-Keung Tang; Heung-Yeung Shum, Lazy Snapping, SIGGRAPH 04

Advocate: Mohit Gupta

Part III: 2D Recognition

Window Scanning Approaches

H. Schneiderman and T. Kanade. Object Detection Using the Statistics of Parts. International Journal of Computer Vision, 2004 (demo available)

Viola, Jones, Robust Real-time Object Detection (2001) Second International Workshop on Statistical and Computational Theories of Vision (short version)

Advocate: Nicolas Chan

Opposition: Tomasz Malisiewicz

Demo: Pete Barnum

Dalal, Triggs, Histograms of Oriented Gradients for Human Detection, CVPR 2005 (data available)

Advocate: Pete Barnum

Correspondence Matching Approaches

Gavrila & Philomin, Real-time Object Detection for Smart Vehicles, ICCV 1999

Advocate: Stefan Zickler

Oppose: ?

Olson & Huttenlocher. Automatic Target Recognition by Matching Oriented Edge Pixels, IEEE Transactions on Image Processing 1997

Erik Learned-Miller, (2005) Data Driven Image Models through Continuous Joint Alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI). (code available)

Belongie, Malik, and Puzicha. Shape Matching and Object Recognition Using Shape Contexts (2002)

Advocate: Carlos Vallespi

Demo: Carlos Vallespi

A Berg, T Berg, J Malik, Shape Matching and Object Recognition using Low Distortion Correspondences, CVPR 2005

Advocate: Gunhee Kim

Opposition: Joseph Djugash

M. Leordeanu and M. Hebert, A Spectral Technique for Correspondence Problems using Pairwise Constraints, ICCV 2005

Demo: Dhruv Batra

David G. Lowe, Object Recognition from Local Scale-Invariant Features, ICCV 1999

Advocate: David Lee

Demo: Heather Dunlop

Fitzgibbon, A. W. and Zisserman, A. On Affine Invariant Clustering and Automatic Cast Listing in Movies, ECCV 2002

T. F. Cootes, G. J. Edwards, C. J. Taylor, Active Appearance Models, PAMI 2001

Recognition with Segmentation

Eran Borenstein, Shimon Ullman. Class-Specific, Top-Down Segmentation. ECCV 2002

Eran Borenstein, Shimon Ullman: Learning to Segment. ECCV 2004

E. Borenstein, E. Sharon, S. Ullman, Combining Top-Down and Bottom-Up Segmentation, Proceedings IEEE workshop on Perceptual Organization in Computer Vision, IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, June 2004.

Advocate: Joseph Djugash

Opposition: Heather Dunlop

Xiaofeng Ren, Charless Fowlkes and Jitendra Malik, Cue Integration for Figure/Ground Labeling. (2005) NIPS

Stella X. Yu and Jianbo Shi, Object-Specific Figure-Ground Segregation, CVPR 2003

B Leibe, E Seemann, B Schiele, Pedestrian Detection in Crowded Scenes, CVPR 2005

J. Winn and N. Jojic. LOCUS: Learning Object Classes with Unsupervised Segmentation, Proc. IEEE Intl. Conf. on Computer Vision (ICCV), Beijing 2005.

Advocate: Nik Melchior

Opponent: David Lee

Z Tu, X Chen, AL Yuille, SC Zhu. Image Parsing: Unifying Segmentation, Detection, and Recognition. International Journal of Computer Vision, 2005

A. Torralba, K. P. Murphy, W. T. Freeman and M. A. Rubin, Context-based vision system for place and object recognition, ICCV 2003

Advocate: David Lee

Demo: Gunhee Kim

A. Torralba, K. P. Murphy and W. T. Freeman (2004), Contextual Models for Object Detection using Boosted Random Fields. To appear in Adv. in Neural Information Processing Systems (NIPS)

Words and Pictures

Tamara L. Berg, Alexander C. Berg, Jaety Edwards, Michael Maire, Ryan White, Yee Whye Teh, Erik Learned-Miller, David A. Forsyth. Names and Faces. in submission

Advocate: Krishnan Ramnath

Pinar Duygulu, Kobus Barnard, Nando de Freitas, and David Forsyth. Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. ECCV 2002.

Advocate: Heather Dunlop

Kobus Barnard, Pinar Duygulu, Nando de Freitas, David Forsyth, David Blei, and Michael I. Jordan, Matching Words and Pictures. Journal of Machine Learning Research, 2003.

Part IV: Intrinsic Images

HG Barrow, JM Tenenbaum, Recovering Intrinsic Scene Characteristics from Images, 1978 (classic paper!)

Adelson & Pentland, The Perception of Shading and Reflectance, 1996

Advocate: Seth Koterba

Opponent: Stefan Zickler

Sinha & Adelson: Recovering Reflectance in a World of Painted Polyhedra, ICCV 1993

Yair Weiss, Deriving intrinsic images from image sequences, ICCV 2001 (code available)

Advocate: Mohit Gupta

Demo: Mohit Gupta

GD Finlayson, MS Drew, C Lu, Intrinsic Images by Entropy Minimization, ECCV 04

Marshall F Tappen, William T Freeman, Edward H Adelson, Recovering Intrinsic Images from a Single Image. NIPS 2002. (there is also a longer version that was published in the September 2005 issue of IEEE Transactions on Pattern Analysis and Machine Intelligence)

Advocate: Malola Prasath

Hoiem, Efros, Hebert, Geometric Context from a Single Image, ICCV 2005 (code available)

Advocate: Stefan Zickler + demo too?

Ashutosh Saxena, Sung Chung, and Andrew Y. Ng. Learning Depth from Single Monocular Images. NIPS 2005.

Advocate: Malola Prasath

Tenenbaum & Freeman. Separating Style and Content with Bilinear Models. Neural Computation, 2000.

Part V: Dealing with Data

J. B. Tenenbaum, V. De Silva, and J. C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science 290 (5500): 22 December 2000. (code available)

Sam Roweis & Lawrence Saul. Nonlinear dimensionality reduction by locally linear embedding. Science v.290 no.5500, Dec.22, 2000. (code available)

Advocate: Dave Thompson

Opponent: Jonathan Huang

D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature 401, 788-791 (1999). (code available)

Part VI: Tracking & Motion Segmentation

Isard & Blake, CONDENSATION - conditional density propagation for visual tracking. IJCV, 1998

Toyama & Blake, Probabilistic Tracking with Exemplars in a Metric Space. IJCV, 2002

C. L. Zitnick, N. Jojic, S. B. Kang. Consistent segmentation for optical flow estimation. IEEE Int'l Conf. on Computer Vision, 2005.

Ramanan, Forsyth, Zisserman. Strike a Pose: Tracking People by Finding Stylized Poses, CVPR 2005 (video examples)

MP Kumar, PHS Torr, A Zisserman, Learning Layered Motion Segmentations of Video, ICCV '05

Most recently updated on January 16, 2006 by Alyosha Efros
