Lecture: Tuesdays and Thursdays, 11:00am-12:30pm, 310 Soda
Instructor: Prasad Raghavendra
Office hours: Tuesday 1:30-2:30pm
In this course, we will explore the tractability of computational problems arising in high-dimensional statistics. Specifically, we will study:
General techniques for designing efficient (i.e., polynomial-time) algorithms for these problems.
Conjectured computational barriers that arise among these problems and broad approaches to provide evidence of computational intractability.
A tentative list of topics includes the following:
Problems:
Robust and heavy-tailed statistics: mean estimation and linear regression.
Mixture models, learning mixtures of Gaussians.
Stochastic block models, community detection.
Sparse PCA.
Matrix and tensor completion.
Algorithmic Tools:
Spectral methods & filtering.
Sum-of-squares semidefinite programming.
Belief propagation.
Approximate message passing.
Evidence of intractability
Computational-statistical gaps.
Low-degree hardness.
Overlap gap property.
Statistical query lower bounds.
Ideas from statistical physics:
Cavity method.
(Aug 25) Heavy-tailed mean estimation in one dimension.
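For concreteness, here is a minimal Python sketch of the median-of-means estimator, the classic remedy in this setting; the Pareto example and the choice of k = 20 blocks are illustrative, not taken from the lecture.

    import numpy as np

    def median_of_means(x, k):
        """Median-of-means: randomly split the sample into k blocks,
        average within each block, and output the median of the block
        means. For heavy-tailed data with finite variance this gives
        sub-Gaussian-style deviation bounds, which the empirical mean
        does not."""
        x = np.random.permutation(np.asarray(x))  # random assignment to blocks
        blocks = np.array_split(x, k)
        return np.median([b.mean() for b in blocks])

    # Illustration on heavy-tailed (Pareto) data with true mean 2.5/1.5.
    rng = np.random.default_rng(0)
    sample = rng.pareto(2.5, size=10_000) + 1.0
    print(median_of_means(sample, k=20), sample.mean())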
(Aug 30) Heavy-tailed mean estimation in high dimensions: Lugosi-Mendelson estimator, Rademacher complexity.
Section 4.1 in “High-Dimensional Statistics: A Non-Asymptotic Viewpoint” by Martin Wainwright
(Sept 1) Heavy-tailed mean estimation in high dimensions: an efficient algorithm via the ellipsoid algorithm.
(Sept 6) Robust Mean Estimation
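As a taste of the filtering approach (cf. "spectral methods & filtering" in the topics above), here is a heavily simplified sketch; the spectral threshold and the fraction removed per round are illustrative choices, not the algorithm from the lecture.

    import numpy as np

    def filter_mean(X, eps, spectral_thresh=5.0):
        """While the empirical covariance has an unusually large top
        eigenvalue (evidence of outliers), discard the points with the
        largest projections onto the top eigenvector, then re-estimate."""
        X = np.asarray(X, dtype=float)
        while True:
            mu = X.mean(axis=0)
            eigvals, eigvecs = np.linalg.eigh(np.cov(X.T))
            if eigvals[-1] <= spectral_thresh:  # covariance looks clean: stop
                return mu
            scores = ((X - mu) @ eigvecs[:, -1]) ** 2
            keep = scores <= np.quantile(scores, 1.0 - eps / 2)
            if keep.all():
                return mu
            X = X[keep]

    # Illustration: 5% of the points are planted far from the true mean 0.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.standard_normal((950, 10)),
                   rng.standard_normal((50, 10)) + 8.0])
    print(np.linalg.norm(X.mean(axis=0)), np.linalg.norm(filter_mean(X, eps=0.05)))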
(Sept 8) Sum of squares SDP: Introduction to semidefinite programming, sum of squares SDPs, univariate polynomial optimization via SoS SDPs
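To see what these SDPs look like in code, here is a sketch of the univariate case using cvxpy (an assumed dependency): to lower-bound a polynomial p, maximize gamma subject to p - gamma having a sum-of-squares certificate, i.e. a PSD Gram matrix in the monomial basis (1, x, x^2).

    import cvxpy as cp

    # p(x) = x^4 - 3x^2 + 2x + 5; write p(x) - gamma = m(x)^T Q m(x),
    # m(x) = (1, x, x^2), Q PSD, and match coefficients degree by degree.
    c = {0: 5.0, 1: 2.0, 2: -3.0, 3: 0.0, 4: 1.0}
    Q = cp.Variable((3, 3), PSD=True)
    gamma = cp.Variable()
    constraints = [
        Q[0, 0] == c[0] - gamma,        # x^0
        2 * Q[0, 1] == c[1],            # x^1
        2 * Q[0, 2] + Q[1, 1] == c[2],  # x^2
        2 * Q[1, 2] == c[3],            # x^3
        Q[2, 2] == c[4],                # x^4
    ]
    cp.Problem(cp.Maximize(gamma), constraints).solve()
    # For univariate polynomials nonnegativity coincides with SoS,
    # so this lower bound is exactly min_x p(x).
    print(gamma.value)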
(Sept 13) Sum of squares proofs: Positivstellensatz, SoS SDP relaxation, the multivariate case.
(Sept 15) Sum of squares proofs: robust linear regression
Scribe notes from an earlier class: link.
(Sept 20) Sum of squares proofs: robust linear regression
Scribe notes from an earlier class: link.
(Sept 22) List-decodable linear regression
(Sept 27) Wrap up of list-decodable linear regression, SVD.
(Sept 29) Tensors, tensor decomposition and its applications
Chapter 3 in Moitra's book
(Oct 4) Jennrich's algorithm, SoS version (a short numpy sketch follows the readings below).
Chapter 3 in Moitra's book
Lecture notes by Barak and Steurer
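A minimal numpy sketch of the simultaneous-diagonalization idea behind Jennrich's algorithm; the dimensions, rank, and random contractions are illustrative.

    import numpy as np

    def jennrich(T, r, seed=0):
        """Given T = sum_i a_i (x) b_i (x) c_i with linearly independent
        factors, contract along the third mode with random x, y:
            M_x = A diag(<c_i, x>) B^T,  M_y = A diag(<c_i, y>) B^T.
        Then M_x pinv(M_y) = A diag(<c_i,x>/<c_i,y>) pinv(A), so the a_i
        are (generically) the eigenvectors with the r largest |eigenvalues|."""
        rng = np.random.default_rng(seed)
        x = rng.standard_normal(T.shape[2])
        y = rng.standard_normal(T.shape[2])
        Mx = np.einsum('ijk,k->ij', T, x)
        My = np.einsum('ijk,k->ij', T, y)
        vals, vecs = np.linalg.eig(Mx @ np.linalg.pinv(My))
        order = np.argsort(-np.abs(vals))
        return np.real(vecs[:, order[:r]])  # columns = a_i up to order/scale

    # Illustration: recover the first factor of a random rank-3 tensor.
    rng = np.random.default_rng(1)
    A, B, C = (rng.standard_normal((6, 3)) for _ in range(3))
    T = np.einsum('ir,jr,kr->ijk', A, B, C)
    A_hat = jennrich(T, r=3)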
(Oct 6) Tensor decomposition via SoS.
Sections 2.2 and 5 in Barak-Kelner-Steurer
Lecture notes by Barak and Steurer
Section 2.4 in survey
Video of a lecture by Pravesh Kothari on the topic
(Oct 11) High-dimensional inference problems, factor graphs and belief propagation (a small sum-product sketch follows the readings below).
Chapter 2 in lecture notes by Montanari
Chapter 4 in monograph by Krzakala and Zdeborova
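To make the message-passing concrete, here is a minimal sum-product sketch for a pairwise model over binary variables; the conventions and the toy chain are illustrative, not the lecture's notation. On a tree, the resulting beliefs are the exact marginals.

    import numpy as np

    def belief_propagation(n, edges, psi, phi, iters=50):
        """Sum-product for p(x) proportional to
        prod_i phi[i][x_i] * prod_{(i,j)} psi[(i,j)][x_i, x_j].
        Messages m_{i->j}(x_j) are iterated to a fixed point."""
        nbrs = {i: [] for i in range(n)}
        for i, j in edges:
            nbrs[i].append(j); nbrs[j].append(i)
        msgs = {(i, j): np.ones(2) / 2 for i in range(n) for j in nbrs[i]}
        for _ in range(iters):
            new = {}
            for i, j in msgs:
                # Collect evidence at i from all neighbours except j ...
                pre = phi[i] * np.prod([msgs[k, i] for k in nbrs[i] if k != j], axis=0)
                # ... then pass it through the pairwise potential, summing over x_i.
                pair = psi[i, j] if (i, j) in psi else psi[j, i].T
                m = pair.T @ pre
                new[i, j] = m / m.sum()
            msgs = new
        beliefs = {i: phi[i] * np.prod([msgs[k, i] for k in nbrs[i]], axis=0)
                   for i in range(n)}
        return {i: b / b.sum() for i, b in beliefs.items()}

    # Illustration: 3-node chain, agreement-favouring potentials, biased node 0.
    J = np.array([[1.2, 0.8], [0.8, 1.2]])
    phi = {0: np.array([0.9, 0.1]), 1: np.ones(2), 2: np.ones(2)}
    print(belief_propagation(3, [(0, 1), (1, 2)], {(0, 1): J, (1, 2): J}, phi))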
(Oct 13) Free energy, Bethe free energy, Cavity method.
Section 1.2, Appendix 1.A in monograph by Krzakala and Zdeborova
Chapter 2 in lecture notes by Montanari
(Oct 18) Free energy, Bethe free energy, Cavity method (the Bethe free energy is written out after the readings below).
Section 1.2, Appendix 1.A in monograph by Krzakala and Zdeborova
Chapter 2 in lecture notes by Montanari
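For reference, the standard variational statement (in the Yedidia-Freeman-Weiss form; notation may differ from the readings): for a factor graph with factor potentials f_a and variable degrees d_i, the Bethe free energy of a collection of beliefs {b_a, b_i} is

\[
F_{\mathrm{Bethe}}(b) \;=\; \sum_a \sum_{x_{\partial a}} b_a(x_{\partial a}) \ln \frac{b_a(x_{\partial a})}{f_a(x_{\partial a})} \;-\; \sum_i (d_i - 1) \sum_{x_i} b_i(x_i) \ln b_i(x_i).
\]

Stationary points of F_Bethe over locally consistent beliefs correspond to BP fixed points, and on trees its minimum equals the true free energy -ln Z.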
(Oct 20) Cavity Method
(Oct 25) Stability of BP fixed points, non-backtracking walk operator.
(Oct 27) Stability of BP fixed points, non-backtracking walk operator, computational intractability at KS threshold.
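A small numpy sketch of the operator itself; the example graph is arbitrary. Rows and columns are indexed by directed edges ("darts"), and eigenvalues escaping the bulk signal community structure down to the Kesten-Stigum threshold.

    import numpy as np

    def nonbacktracking_matrix(edges):
        """B[(u,v),(v,w)] = 1 whenever w != u, for an undirected graph
        given as a list of edges; all other entries are 0."""
        darts = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
        idx = {d: i for i, d in enumerate(darts)}
        B = np.zeros((len(darts), len(darts)))
        for u, v in darts:
            for a, b in darts:
                if a == v and b != u:
                    B[idx[u, v], idx[a, b]] = 1.0
        return B, darts

    # Illustration: leading eigenvalues of B on a small graph. For sparse
    # random graphs, the bulk has radius roughly sqrt(average degree).
    edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 5), (5, 3)]
    B, _ = nonbacktracking_matrix(edges)
    print(np.sort(np.abs(np.linalg.eigvals(B)))[::-1][:4])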
(Nov 1) Statistical query lower bounds (the SQ oracle model is recalled after the reading below).
Chapter 8 in book by Diakonikolas and Kane.
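For orientation, the model in which these lower bounds are proved (a standard definition; notation may differ from the book): a statistical query algorithm accesses the unknown distribution D only through an oracle STAT(tau) that, given a query \(\phi : X \to [-1, 1]\), returns an arbitrary value v with

\[ |v - \mathbb{E}_{x \sim D}[\phi(x)]| \le \tau. \]

Lower bounds then show that any SQ algorithm needs either many queries or a tolerance tau so small that simulating even a single query from samples is expensive.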
(Nov 3) Statistical query lower bounds.
Chapter 8 in book by Diakonikolas and Kane.
(Nov 8) Low-degree likelihood ratio (the key definition is recalled after the reading below).
Survey by Kunisky, Wein, and Bandeira.
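In brief, the object behind this lecture (stated informally, following that survey): for a planted distribution P_n and a null Q_n, let L_n = dP_n/dQ_n and let \(L_n^{\le D}\) be its projection onto polynomials of degree at most D in \(L^2(Q_n)\), so that

\[ \| L_n^{\le D} \| \;=\; \max_{f :\, \deg f \le D} \frac{\mathbb{E}_{P_n}[f]}{\sqrt{\mathbb{E}_{Q_n}[f^2]}}. \]

The low-degree heuristic predicts that if this norm stays bounded as n grows, for D on the order of log n, then no polynomial-time test distinguishes P_n from Q_n.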
(Nov 10) Overlap gap property
Paper by Wein.
(Nov 15) SoS lower bounds
Notes on planted clique by Barak and Steurer.
(Nov 17) Reductions
(Nov 22) Project presentations
(Nov 29) Project presentations
(Dec 1) Project presentations
The class is aimed at graduate students specializing in theoretical computer science, statistics, or related areas. A solid background in probability and linear algebra is an absolute must.
Scribing (one or two lectures): 40%
Project (in groups of 2 or 3, write a survey on a topic of your choice): 40%
Class participation: 20%
If you are interested in taking the class, please drop me an email with the subject line 'CS294 Enrollment', along with details of which graduate program and research group you belong to, and any relevant course/research experience.
Working in groups of two (or fewer), you are expected to find a potential direction for future research or an open problem, and write a short note (4-6 pages) on it. At the end of the semester, you will give a 15-minute whiteboard presentation of your question to the rest of the class. Please sign up for your presentation time at the following spreadsheet:
We hope this gives you an opportunity to explore a sub-area, assimilate the broad contours of the existing literature, and, most importantly, exercise one of the most important skills of a researcher: asking questions. There are many ways to land on a question:
Perhaps you read a result in the area and wonder whether the techniques extend to another problem.
You ask why the techniques stop where they do: what if we used a different technique?
You find a computational problem that seems important, but for which no one has designed an efficient algorithm.
You notice a pattern and wonder whether there is an underlying phenomenon.
There are no wrong questions, and your questions don't need to be new either. But we hope that you are able to make a case to your peers that the question is interesting. So explore broadly and exercise your taste to find a question that you think is interesting.
Here are some topics that can serve as starting points for you:
Private mean estimation
Heavy-tailed statistics
Property testing of distributions
Phylogeny
Trace reconstruction
SoS lower bounds
Gaussian graphical models
Learning graphical models
Reductions between statistical problems
Overlap gap property
Multi-reference alignment, cryo-EM
Balancing covariates in randomized experiments (paper)
Robust learning from batches (paper)
Tensor decomposition
Phase retrieval
more to be added
(The links are meant to be starting points for your research, and may not be exhaustive or up to date.)