CS271 RANDOMNESS & COMPUTATION: Spring 2020
INSTRUCTOR: Alistair Sinclair
(sinclair@cs; 677 Soda)
LECTURES: Tuesday, Thursday 9:30-11:00 in 310 Soda
OFFICE HOURS: Monday 1:00-2:00, Thursday 11:00-12:00, on Zoom (formerly 677 Soda; URL on Piazza)
TA: Kush Bhatia
(kush@cs; 8th Floor Berkeley Way West)
OFFICE HOURS: Tuesday 2:00-3:00, Wednesday 10:00-11:00, on Zoom (formerly Alcove 347 Soda; URL on Piazza)
Recent Announcements
(4/2) The notes for Lecture 20 are posted below; please be sure to read them today (or by next Monday at the latest). Feel free to post questions/clarifications/typos on Piazza. Please also remember to submit your answers to the Quiz questions on Lectures 19 and 20 by Monday.
Takeaways from today's lecture:
- Tight concentration bound for the running time of (the randomized version of) Quicksort. Note that concentration bounds had been proved previously, but the one in this lecture (due to McDiarmid) is much tighter.
- This lecture is included not so much for the result itself as for its illustration of the power of Azuma's inequality in analyzing algorithms. The use of Azuma is captured in Lemma 20.3. Note also that the analysis gets a bit technical at the end, when the various parameters are chosen; however, the conceptual framework that derives the bound in terms of those parameters (inequality (20.2)) is quite clean, so focus on that.
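For readers who want to see the concentration concretely, here is a minimal simulation sketch (not part of the notes; parameters chosen for speed). It counts comparisons made by randomized Quicksort on n = 1000 elements across repeated runs; the counts cluster tightly around the exact expectation 2(n+1)H_n - 4n, which is asymptotically 2n ln n.

```python
import random

def quicksort_comparisons(items, rng):
    """Sort `items` (assumed distinct) with uniformly random pivots,
    returning the number of comparisons performed (standard accounting:
    each non-pivot element is compared with the pivot once per level)."""
    if len(items) <= 1:
        return 0
    pivot = items[rng.randrange(len(items))]
    smaller = [x for x in items if x < pivot]
    larger = [x for x in items if x > pivot]
    return (len(items) - 1
            + quicksort_comparisons(smaller, rng)
            + quicksort_comparisons(larger, rng))

rng = random.Random(0)
n = 1000
counts = [quicksort_comparisons(list(range(n)), rng) for _ in range(20)]
mean = sum(counts) / len(counts)
# Exact expectation: 2(n+1)H_n - 4n, which is ~ 2n ln n asymptotically.
h_n = sum(1 / i for i in range(1, n + 1))
print(mean, 2 * (n + 1) * h_n - 4 * n)
```

The lecture's point is that individual runs deviate from this mean by far less than a naive bound would suggest; the simulation's small spread across runs is the phenomenon Azuma's inequality quantifies.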
(3/31) Welcome back! I hope you are all staying healthy and were able to get some rest over Spring Break.
The notes for Lecture 19 are posted below; please be sure to read them today (or by the end of tomorrow at the latest). Feel free to post questions/clarifications/typos on Piazza. Also, as previously announced, we've posted a Quiz (see new section after the Lecture Notes below): this Quiz is due on Monday 4/6, and is designed to encourage you to read the notes in real time. If you've read and understood the notes, the Quiz should take you very little extra time (unlike the homeworks, these questions do not require problem-solving).
Takeaways from today's lecture:
- Use of the edge exposure martingale and Azuma's inequality to prove concentration of the chromatic number of random graphs. Note (i) the reduction to proving the existence of independent sets (or, equivalently, cliques) in G with very high probability; and (ii) the use of Azuma's inequality to prove this fact. Note also the crucial use of edge-disjoint cliques to define the martingale here. (See Quiz question for the role of this.)
- The expected length of an optimal Traveling Salesman tour for n random points in the unit cube in d dimensions (not proved in this class).
- Concentration of the optimal tour length about its expectation, proved using Azuma's inequality. Note in particular the tight bound obtained on the ith martingale difference (as a function of i), which makes the result go through: using a crude constant bound on the differences would give a much weaker result.
(3/24) Please see today's Piazza post for an important announcement about class grading policy. Enjoy the rest of your Spring Break and see you all online again next week.
(3/19) The notes for Lecture 18 are posted below; please be sure to read them today (or by the end of Spring Break at the latest). Feel free to post questions/clarifications/typos on Piazza. Takeaways from today's lecture:
- Definition of a martingale (a stochastic process whose increments always have mean zero, regardless of the history). For the next few lectures we are going to discuss some rather magical properties of martingales that have various applications in computer science and combinatorics. In this first lecture, we focus on Azuma's inequality, which provides a Chernoff-like concentration bound for any martingale whose increments are bounded. Note that martingale increments need not be independent, so this takes us beyond the classical Chernoff-Hoeffding setting of the past few lectures.
- Standard ways of constructing martingales, especially the Doob martingale (and particular examples including edge exposure and vertex exposure martingales for analyzing random graphs).
- Proof of Azuma's inequality (Theorem 18.3). Note that this follows very similar lines to that of the Chernoff bound, with the key difference being the replacement of the independence assumption by a decomposition as a sum of increments, each of which is bounded and has mean zero: see top of page 18-4. The quantitative bounding of these factors is handled by Lemma 18.4, which is just an application of convexity (essentially as in Chernoff).
- Various simple applications, such as independent Bernoulli trials (binomial distribution), gambling with bounded stakes, balls and bins, and concentration of the chromatic number of a random graph (without even knowing the mean!).
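As a quick sanity check on Azuma's inequality (my own illustration, not from the notes): the fortune S_n of a gambler making fair +/-1 bets is a martingale with increments bounded by 1, so Azuma gives P(|S_n| >= lambda*sqrt(n)) <= 2*exp(-lambda^2/2). A simulation stays comfortably inside the bound:

```python
import math
import random

rng = random.Random(1)
n, trials, lam = 400, 2000, 2.0

def final_fortune():
    # Martingale with bounded increments: n independent fair +/-1 bets.
    return sum(rng.choice((-1, 1)) for _ in range(n))

threshold = lam * math.sqrt(n)
exceed = sum(abs(final_fortune()) >= threshold
             for _ in range(trials)) / trials
azuma = 2 * math.exp(-lam ** 2 / 2)   # Azuma's upper bound for this event
print(exceed, azuma)
```

Note that the independence of the bets is not needed for Azuma to apply; any bounded-stakes strategy that is fair conditional on the history gives the same bound.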
(3/17) I won't be able to hold my office hour this Thursday (3/19) 11-12 due to our (virtual) faculty retreat. Instead, I'll hold it on Friday 1-2pm. See the Piazza forum for zoom coordinates for both Kush and me.
(3/17) The notes for Lecture 17 are posted below; please be sure to read them today (or by the end of tomorrow at the latest). Feel free to post questions/clarifications/typos on Piazza. Takeaways from today's lecture:
- The notion of embeddings of metrics as an algorithm design tool, with (at least) two distinct flavors: (1) reducing the dimension of the data; and (2) simplifying its structure by mapping it to a simpler metric. In both cases, for the embedding to be useful, the distortion of distances introduced by the embedding must be small.
- The Johnson-Lindenstrauss Lemma for dimension reduction (Theorem 17.1): any set of n points in d-dimensional real space can be mapped to k-dimensional space, where k=O(ln n/eps^2), such that the distortion is only eps. Note that the dimension is independent of d! The proof in the notes uses a custom version of the Chernoff bound, tailored to ratios of sums of independent Gaussians (Claim 17.2).
- The Bourgain embedding of an arbitrary metric on n points (which you can think of as any weighted graph on n vertices) into a much simpler l_1 metric (of dimension O(log^2 n)) with distortion O(log n) (Theorem 17.3). This is a nice example of a randomized construction (the coordinates of a point in the embedding are just the distances of the point from a sequence of random subsets of the points), in which the distortion is analyzed by a Chernoff bound (Lemma 17.5).
Both of these fundamental results, and their relatives, have been highly influential in the design of algorithms for high-dimensional data and for weighted graph problems.
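The Johnson-Lindenstrauss map is simple enough to try directly. The sketch below (my own illustration; the dimensions and the distortion threshold in the final check are arbitrary choices, not values from the notes) projects 50 points from dimension 1000 onto k random Gaussian directions and measures the worst pairwise distance distortion:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 50, 1000, 300   # k ~ O(log n / eps^2); chosen generously here

points = rng.normal(size=(n, d))
# JL map: project onto k random Gaussian directions, scaled by 1/sqrt(k)
# so that squared distances are preserved in expectation.
projection = rng.normal(size=(d, k)) / np.sqrt(k)
embedded = points @ projection

worst = 0.0
for i in range(n):
    for j in range(i + 1, n):
        orig = np.linalg.norm(points[i] - points[j])
        new = np.linalg.norm(embedded[i] - embedded[j])
        worst = max(worst, abs(new / orig - 1))
print(worst)   # maximum relative distortion over all pairs
```

The key point from Theorem 17.1 is visible here: the quality of the embedding depends on n and k but not on the original dimension d.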
(3/14) Effective immediately, we'll be moving to Zoom for office hours, in recognition of the fact that many students are not coming onto campus regularly at the moment. Zoom URLs for Kush and myself are posted on Piazza. Feel free to connect during any of our regularly scheduled office hours.
(3/13) Problem Set 2 has been graded and you can view your graded solutions on Gradescope; sample solutions are posted below, and include notes on the most common errors. I recommend that you take a few minutes to work through the sample solutions, even for problems on which you got full points. If you have any questions about the grading, please either post on Piazza or ask Kush by email or at office hours. As with the first set, these problems were non-trivial and most students did pretty well: don't worry too much about missing a few points here and there.
(3/13) Problem Set 3 is posted below; it is due at 5pm on Thursday April 9th.
This set is a little longer than the last one, but you have plenty of time for it and you already have all the material you need. As always, you are advised to start early, and to work with others (but please be sure to write up your solutions on your own). Please remember to start each problem on a new page.
(3/12) The notes for Lecture 16 are posted below; please be sure to read them today (or before next Tuesday at the latest). Feel free to post questions/clarifications/typos on Piazza. Takeaways from today's lecture:
- The emergence of the "giant component" in a random graph G_{n,p}, which happens at the threshold value p=1/n. Below this threshold the largest connected component has size only O(log n), while above it the largest component makes up a constant fraction of the graph. (See Theorem 16.1.)
- The Galton-Watson branching process, used to model the spread of infectious diseases and other phenomena in which each current individual produces a random number of new individuals, these random variables being iid with mean m. Essentially, if m<=1 then the process dies out, while if m>1 then it continues indefinitely with positive probability. (See Theorem 16.2.) Note that this theorem is very relevant to current concerns over Covid-19!
- The application of the GW branching process to prove the above claim about the giant component in random graphs: model the component containing vertex v as a branching process starting from v, with the branching given (roughly) by a Bin(n-1,p) random variable. When p<1/n the process dies out before visiting more than O(log n) vertices; when p>1/n it either dies out at size O(log n) or, with constant probability, continues until it has visited a constant fraction of the vertices. (Section 16.2.2.)
- The use of Chernoff bounds to translate the branching process intuition to the random graph setting, where the "population" n is finite and the branching random variables are not quite uniform. (Section 16.2.)
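The threshold in Theorem 16.1 is easy to observe experimentally. The following sketch (my own, not from the notes; n and the edge densities are arbitrary choices) samples G_{n,p} just below and just above p = 1/n and reports the largest component size in each case:

```python
import random

def largest_component(n, p, rng):
    """Sample G(n,p) and return the size of its largest connected component."""
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    seen, best = [False] * n, 0
    for s in range(n):          # DFS from each unvisited vertex
        if seen[s]:
            continue
        stack, size = [s], 0
        seen[s] = True
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    stack.append(v)
        best = max(best, size)
    return best

rng = random.Random(2)
n = 1000
sub = largest_component(n, 0.5 / n, rng)   # below threshold: O(log n)
sup = largest_component(n, 3.0 / n, rng)   # above: a constant fraction of n
print(sub, sup)
```

With mean degree 0.5 the largest component is tiny; with mean degree 3 a single component swallows most of the graph, exactly as the branching-process heuristic predicts.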
(3/10) The notes for Lecture 15 are posted below; please be sure to read them today (or before the end of tomorrow at the latest). Feel free to post questions/clarifications/typos on Piazza. Takeaways from today's lecture:
- The "Poisson paradigm", which allows one to get bounds on probabilities in the balls-and-bins setting in terms of independent Poisson random variables (which is much simpler). (See Theorem 15.1.)
- Proof that the maximum bin load in balls-and-bins with n balls and n bins is asymptotically (1+o(1))(ln n/ln ln n). (See Theorem 15.2.)
- The "power of two choices" phenomenon, whereby very limited coordination (checking the loads on only two bins) leads to an exponential reduction in the maximum load (which becomes O(ln ln n)). (See Section 15.2.) Example: for n = 1,000,000, you can observe by a simple experiment that the maximum load drops from around 8-10 in the basic model to around 4 in the two-choices model. You may be interested to know that this same idea leads to analogous improvements in a host of other settings, such as various queuing models.
- Proof of the above power-of-two-choices result using a sequence of binomial distributions that successively bound the number of bins containing i or more balls, for i = 1,2,... At each stage, a Chernoff bound controls the probability that the binomial estimate is violated, so that the sum of these error probabilities remains small.
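The experiment mentioned above is easy to reproduce; here is a sketch (my own, run at n = 100,000 rather than 1,000,000 purely for speed) comparing the maximum load under one random choice versus the least-loaded of two random choices:

```python
import random

def max_load(n, choices, rng):
    """Throw n balls into n bins; each ball goes to the least loaded of
    `choices` bins sampled uniformly at random (with replacement)."""
    bins = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(choices)]
        target = min(candidates, key=lambda b: bins[b])
        bins[target] += 1
    return max(bins)

rng = random.Random(3)
n = 100_000
one = max_load(n, 1, rng)   # ~ ln n / ln ln n
two = max_load(n, 2, rng)   # ~ ln ln n / ln 2 + O(1): exponentially smaller
print(one, two)
```

The drop from the one-choice to the two-choice maximum is the "power of two choices" in action: a second sample per ball, not global coordination, is what produces the exponential improvement.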
(3/9) The campus announced earlier today that all in-person lectures are cancelled through the end of Spring Break, March 29. (A further decision will be taken later about how to proceed after that.) As announced in class last week, this will impact CS271 as follows:
- Do not come to class tomorrow! Effective immediately, CS271 will move to a "guided reading" model, in which live lectures are replaced by scheduled reading of the lecture notes, with online support via Piazza. Full lecture notes will be posted on this web page at or before the corresponding scheduled lecture time. Students should aim to read the notes in detail during the day of the lecture (e.g., you may want to use the original lecture time for this), and in any event before the next scheduled lecture time. This will help to ensure that the class deviates as little as possible from live lectures, that you can follow online discussions, and that you don't fall behind.
- The reading of the notes will be supported by a Piazza forum, which has been set up today. (Please notify Kush if you did not receive a notification from Piazza, or if you have any difficulty accessing the forum.) Students are encouraged to ask questions about any aspect of the notes; we prefer public questions so that others can benefit from the answers, but you may post privately if you have some reason to do so. You are of course welcome to make your posts anonymous.
- We will continue to hold office hours as scheduled as far as possible: there may be some timing changes if conditions necessitate, but we will announce those on this web page.
- The schedule for homeworks will continue as planned, since these are already fully online. The Piazza forum can also be used to ask questions about homeworks (but please avoid public posts that give away solutions!). There will be two more homeworks, one due in early April (after Spring Break) and one due at the end of the semester.
- Students should be sure to monitor this web page daily for updates, as things may change quickly.
Thank you for your understanding during this difficult time.
Lecture Notes
Quizzes
Problem Sets
Course Description
One of the most remarkable developments in Computer Science over the
past 50 years has been the realization that allowing computers to toss
coins can lead to algorithms that are more efficient, conceptually
simpler and more elegant than their best known deterministic counterparts.
Randomization has since become such a ubiquitous tool in algorithm design
that any kind of encyclopedic treatment in one course is impossible.
Instead, I will attempt to survey several of the most widely used
techniques, illustrating them with examples taken from both algorithms
and random structures. A tentative and very rough course outline,
just to give you a flavor of the course, is the following:
- Elementary examples: e.g., checking identities, fingerprinting and
pattern matching, primality testing.
- Moments and deviations: e.g., linearity of expectation, universal hash
functions, second moment method, unbiased estimators, approximate counting.
- The probabilistic method: e.g., threshold phenomena in random graphs
and random k-SAT formulas; Lovász Local Lemma.
- Chernoff/Hoeffding tail bounds: e.g., Hamilton cycles in a random
graph, randomized routing, occupancy problems and load balancing,
the Poisson approximation.
- Martingales and bounded differences: e.g., Azuma's inequality,
chromatic number of a random graph, sharp concentration of Quicksort,
optional stopping theorem and hitting times.
- Random spatial data: e.g., subadditivity, Talagrand's inequality,
the TSP and longest increasing subsequences.
- Random walks and Markov chains: e.g., hitting and cover times,
probability amplification by random walks on expanders, Markov chain
Monte Carlo algorithms.
- Miscellaneous additional topics as time permits: e.g., statistical
physics, reconstruction problems, rigorous analysis of black-box optimization
heuristics,...
Prerequisites
Mathematical maturity, and a solid grasp of undergraduate material
on Algorithms and Data Structures, Discrete Probability and
Combinatorics. If you are unsure about the suitability of your
background, please talk to me before committing to the class.
As this is a graduate class, students are responsible for filling
in any gaps in their knowledge as needed.
Registration
Following department policy, all students - including auditors -
are required to register for the class. Auditors should register
S/U; an S grade will be awarded for regular class participation.
Since the class is already over-subscribed, there may not be space for
auditors in lectures: if you are auditing the class, or on the waitlist,
please be prepared to give up your seat to an enrolled student.
If you decide to drop the class, please do so as early as possible so
that another student may take your place.
Suggested References
There is no required text for the class, and no text that covers
more than about one third of the topics. However, the following
books cover significant portions of the material, and
are useful background references.
- Noga Alon and Joel Spencer, The Probabilistic Method (3rd ed.),
Wiley, 2008.
- Svante Janson, Tomasz Łuczak and Andrzej Ruciński, Random Graphs,
Wiley, 2000.
- Geoffrey Grimmett and David Stirzaker, Probability and Random
Processes (3rd ed.), Oxford Univ Press, 2001.
- Michael Mitzenmacher and Eli Upfal, Probability and Computing:
Randomized Algorithms and Probabilistic Analysis (2nd ed.), Cambridge Univ Press, 2017.
- Rajeev Motwani and Prabhakar Raghavan, Randomized Algorithms,
Cambridge Univ Press, 1995.
Lecture Notes
Notes for most or all lectures will be posted on this web page shortly
after each lecture. The notes will cover most, but not necessarily all
of the material presented in class.
Assessment etc.
The assessment mechanism will depend on the final composition of the class
and will be announced later. A major (and possibly the only) component will
be a small number of sets of homework problems distributed through the
semester. You are encouraged to read the lecture notes and do the exercises
sprinkled through them as we go along, as these will ensure that you absorb
the material in real time and should make the homeworks more manageable.
If the class is not too large, students may also be asked to present a
paper at the end of the semester.
You are strongly encouraged to solve the homework problems on your own,
or in consultation with classmates. If you find it necessary to consult
other sources, these must be acknowledged. In all cases, it is essential
that you write up your homework solutions on your own!