Clustering
Lecturer: Sriram Sankararaman
Date: Sep 17
[Lecture slides in PDF]
General survey
Jain, A., M. Murty, and P. Flynn (1999).
Data clustering: A review.
Comparing clustering
Rand index
Hubert, L., & Arabie, P. (1985).
Comparing partitions.
Meila, M. (2005).
Comparing clusterings: an axiomatic view
Dissimilarity function
Read chapter 8.2 of Data Mining: Concepts and Techniques by J. Han and M. Kamber on dealing with different types of variables.
Eric P. Xing, Andrew Y. Ng, Michael I. Jordan, Stuart Russell (2002).
Distance Metric Learning, with Application to Clustering with Side-information
Estimating the number of clusters
Trevor Hastie, Robert Tibshirani and Guenther Walther (2000)
Estimating the number of data clusters via the Gap Statistic
KD-tree for K-Means Clustering
T. Kanungo, D. M. Mount, N. Netanyahu, C. Piatko, R. Silverman, and A. Y. Wu (2002).
An efficient k-means clustering algorithm: Analysis and implementation
Spectral Clustering
Jianbo Shi and Jitendra Malik (2000).
Normalized Cuts and Image Segmentation
Software
Marina Meila and Jianbo Shi (2001)
Learning Segmentation by Random Walks
A. Y. Ng, M. I. Jordan, and Y. Weiss (2002)
On spectral clustering: Analysis and an algorithm.
Ulrike von Luxburg (2006)
A tutorial on Spectral Clustering