LECTURES: Tuesday, Thursday 9:30-11:00 in 310 Soda

OFFICE HOURS: Monday 1:00-2:00, Thursday 11:00-12:00

TA: Kush Bhatia (kush@cs; 8th Floor Berkeley Way West)

OFFICE HOURS: Tuesday 2:00-3:00, Wednesday 10:00-11:00

- The mixing time of a Markov chain measures the number of steps until the chain (almost) reaches equilibrium, starting from any initial state. Applications of mixing time include: algorithms for randomly sampling objects from a desired distribution (usually using the Metropolis process if the distribution is non-uniform); randomized algorithms for approximate counting (which can be reduced in a simple way to random sampling); time to equilibrium of natural processes on systems of interacting particles (where the state of a particle gets updated randomly according to the current states of its neighbors)---such processes are widely studied in statistical physics, where they are known as "Glauber dynamics".
- The method of coupling to bound the mixing time of Markov chains. For two copies of a Markov chain, starting from arbitrary initial states and using the same source of randomness (so that their evolutions are not necessarily independent, but coupled), the expected time for them to collide gives an upper bound on the mixing time. Thus the goal is to construct the best possible coupling.
- Simple application of coupling to get an upper bound on the mixing time of the random transposition shuffle.
- More sophisticated application to the natural local Markov chain on graph colorings, where the number of colors is sufficiently larger than the maximum degree of the graph. This is among the most important applications of the Markov chain Monte Carlo method, because of its connections to approximately counting colorings and the study of phase transitions in the so-called "anti-ferromagnetic Potts model" of statistical physics.
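To make the coupling idea concrete, here is a minimal simulation sketch on a toy chain (the lazy random walk on the n-cycle, not one of the chains above; all names are illustrative). Two copies share randomness: a fair coin picks which copy moves, and that copy takes a +/-1 step, so marginally each copy is the lazy walk while their difference is a simple random walk on Z_n.

```python
import random

def coupling_collision_time(n, x0, y0, rng, max_steps=10**6):
    """Couple two lazy random walks on the n-cycle: at each step a fair
    coin picks which chain moves, and that chain takes a +/-1 step.
    Marginally each chain is the lazy walk (hold prob 1/2); their
    difference is a simple random walk on Z_n, so the chains must
    eventually meet.  Returns the collision time."""
    x, y = x0, y0
    for t in range(max_steps):
        if x == y:
            return t
        step = rng.choice([-1, 1])
        if rng.random() < 0.5:
            x = (x + step) % n
        else:
            y = (y + step) % n
    return max_steps

rng = random.Random(0)
n = 16
times = [coupling_collision_time(n, 0, n // 2, rng) for _ in range(2000)]
avg = sum(times) / len(times)   # difference walk hits 0 from n/2 in about (n/2)^2 = 64 steps on average
```

By the coupling lemma, the tail of the collision time (from the worst pair of starts) upper-bounds the distance to stationarity, and hence the mixing time.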

- Definition of Markov chains (a generalization of random walk on a graph, to allow arbitrary transition probabilities and directed edges).
- Definition of stationary distribution, and the Fundamental Theorem, which says that any Markov chain that is aperiodic and irreducible has a unique stationary distribution to which it converges, starting from any initial state.
- Examples: random walk on a graph; card shuffling; graph coloring; Metropolis process. Importance of reversibility in determining the stationary distribution explicitly. (Non-reversible Markov chains can be very complicated to deal with.)
- Definition of the mixing time (= time to reach stationarity from worst initial state). Analysis of top-in-at-random shuffle and riffle shuffle using strong stationary times. (These analyses are very elegant, but their simplicity is due to the high degree of symmetry in the processes. In the next lecture we'll see an example of how to deal with much more complicated chains.)
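As a small numerical sketch of the Fundamental Theorem and of reversibility (an illustrative example, not from the notes): for random walk on a connected, non-bipartite undirected graph, repeated application of the transition matrix converges to the stationary distribution, which reversibility pins down as proportional to the vertex degrees.

```python
# stationary distribution of random walk on a small undirected graph,
# checked against the reversibility formula pi(v) proportional to deg(v)
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
n = len(adj)

# transition probabilities: from v, move to a uniformly random neighbour
P = [[0.0] * n for _ in range(n)]
for v, nbrs in adj.items():
    for w in nbrs:
        P[v][w] = 1.0 / len(nbrs)

# the graph is connected and contains an odd cycle (0-1-2), so the chain
# is irreducible and aperiodic, and power iteration converges
pi = [1.0, 0.0, 0.0, 0.0]
for _ in range(1000):
    pi = [sum(pi[u] * P[u][w] for u in range(n)) for w in range(n)]

total_deg = sum(len(nbrs) for nbrs in adj.values())
expected = [len(adj[v]) / total_deg for v in range(n)]
```

Note that the limit is the same whatever initial distribution replaces `[1.0, 0.0, 0.0, 0.0]`, as the Fundamental Theorem promises.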

- Definition of random walk on a graph, an important process that shows up in many applications. Definitions of hitting time and cover time. Simple application of cover time to a randomized algorithm for graph searching that uses no memory.
- The electrical network analogy for analyzing random walks, including the key concept of "effective resistance". For this, you may need to dust off your high school physics notebooks. The results in this lecture can be proved without the electrical analogy, but they are much more intuitive with it.
- Proof of the beautiful connection between commute time and resistance (Lemma 24.7). Unlike most other results in this area, this is an exact formula relating the two, not a bound.
- Two upper bounds on cover time: Theorems 24.4 and 24.8.
- Applications of all the above results to various standard extremal graphs: the line, the complete graph, the lollipop, etc.

- Introduction to the program of making the Lovász Local Lemma algorithmic. The basic Lemma guarantees the existence of an object avoiding a collection of bad events, but gives no clue as to how to find one. For many years people tried to design algorithms for this, culminating in the breakthrough by Moser, and Moser/Tardos, about ten years ago. Since then there has been a flurry of activity in this area.
- Statement of the basic algorithmic LLL due to Moser and Tardos, which holds in the slightly restricted "variable setting", where there is an underlying family of independent random variables that describe the events. This theorem shows that, under the same condition on probabilities/dependencies as in the (general) LLL, the simple randomized local search algorithm on page 1 of the note finds a good object in expected linear time.
- This lecture is mainly devoted to a proof of the above theorem, which follows a beautiful strategy of coupling the execution of the algorithm to certain witness trees, and then relating the witness trees to a Galton-Watson process. You will likely need to read over the proof a couple of times, and work through small examples of the tree construction by hand, before it all falls into place.
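The algorithm itself is very short. Here is a hedged sketch of the Moser/Tardos resampling algorithm specialized to k-SAT (the names and the clause encoding below are our own illustrative choices; the theorem guarantees fast termination only when the LLL condition on probabilities/dependencies holds).

```python
import random

def moser_tardos_sat(n_vars, clauses, rng, max_resamples=10**5):
    """Sketch of the Moser/Tardos local search for k-SAT in the
    "variable setting": bad events are unsatisfied clauses, and fixing
    one means resampling exactly the variables it mentions.  Clauses are
    lists of signed literals: +v means variable v true, -v means v false."""
    assign = [rng.random() < 0.5 for _ in range(n_vars)]

    def satisfied(clause):
        return any((assign[l - 1] if l > 0 else not assign[-l - 1])
                   for l in clause)

    for _ in range(max_resamples):
        bad = [c for c in clauses if not satisfied(c)]
        if not bad:
            return assign
        for l in rng.choice(bad):         # resample the chosen clause's variables
            assign[abs(l) - 1] = rng.random() < 0.5
    return None

# tiny illustrative instance (easily satisfiable; not a tight LLL example)
clauses = [[1, 2, 3], [-1, 4, 5], [-2, -4, 6], [3, -5, -6]]
result = moser_tardos_sat(6, clauses, random.Random(7))
```

The witness-tree argument in the notes shows that, under the LLL condition, the expected total number of resamplings is linear in the number of clauses.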

- The Lovasz Local Lemma (LLL), a powerful tool of the probabilistic method that is able to prove the existence of objects with desired properties even in cases where this probability (in some underlying space) is very small. The key to applications of the LLL is that the "desired properties" can be expressed as the absence of certain bad events, which in turn have limited dependence. (If they were mutually independent, the result would be trivial; the power of the LLL lies in the ability to relax the independence assumption.) The standard proof of the LLL is inductive and gives little insight; more recent constructive versions (see the next lecture) are more technical but arguably more intuitive.
- A canonical application, to the k-SAT problem in which clauses share a limited number of variables.
- Two more sophisticated applications: packet routing in networks, and frugal graph coloring. The latter uses an asymmetric form of the LLL that is often useful.
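For reference, the symmetric form of the Lemma used in the k-SAT application can be written as follows (standard notation; this is a summary, not the proof in the notes):

```latex
% Symmetric Lovász Local Lemma: if each bad event A_i satisfies
% Pr[A_i] <= p and depends on at most d of the others, then
\[
  e\,p\,(d+1) \le 1
  \;\Longrightarrow\;
  \Pr\Bigl[\,\bigcap_i \overline{A_i}\,\Bigr] > 0 .
\]
% For k-SAT, each clause fails with probability p = 2^{-k} under a
% uniform assignment, so a satisfying assignment exists whenever every
% clause shares variables with at most d <= 2^k/e - 1 other clauses.
```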

- The Optional Stopping Theorem (OST), which allows one to conclude that E[X_T] = E[X_0] for a martingale (X_t) and random stopping time T, under mild conditions. This seemingly simple theorem is a surprisingly powerful tool with numerous applications in computer science and mathematics, a small sample of which are covered in this lecture. You should be sure to add it to your toolbox. If you are interested in a proof of the OST, nice treatments can be found in the book by Grimmett and Stirzaker (see below), or the wonderful monograph "Probability with Martingales" by David Williams.
- Basic application to estimating the exit probabilities for random walk on an interval, and the duration of the walk. Note (i) the use of sub- and super-martingales to generalize to arbitrary jumps (with bounds on mean and variance); (ii) the trick of introducing a quadratic function of the martingale to bound the duration.
- Less obvious applications: randomized search algorithm for 2-SAT; ballot theorem; Wald's equation; percolation on d-regular graphs. In each of these cases, the OST either provides the only known proof, or is much simpler than other proofs.
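The interval (gambler's ruin) application is easy to sanity-check numerically. Below is a minimal simulation sketch (illustrative code) of the +/-1 walk on {0,...,n}: the OST applied to the martingale X_t predicts exit probability k/n, and applied to X_t^2 - t predicts expected duration k(n-k).

```python
import random

def gamblers_ruin(k, n, rng):
    """Simple +/-1 random walk from k, stopped on hitting 0 or n.
    Returns (reached_n, duration)."""
    x, t = k, 0
    while 0 < x < n:
        x += rng.choice([-1, 1])
        t += 1
    return x == n, t

rng = random.Random(2)
k, n, trials = 3, 10, 5000
wins = durations = 0
for _ in range(trials):
    w, t = gamblers_ruin(k, n, rng)
    wins += w
    durations += t
p_hat = wins / trials        # OST on X_t predicts k/n = 0.3
d_hat = durations / trials   # OST on X_t^2 - t predicts k(n - k) = 21
```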

- Tight concentration bound for the running time of (the randomized version of) Quicksort. Note that concentration bounds had been proved previously, but the one in this lecture (due to McDiarmid) is much tighter.
- Rather than the result itself, this lecture is included as an illustration of the power of Azuma's inequality in analyzing algorithms. The use of Azuma is captured in Lemma 20.3. Note also that the analysis gets a bit technical at the end, when the various parameters are chosen. However, the conceptual framework that derives the bound including those parameters (inequality (20.2)) is pretty clean, so you should focus on that.
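The quantity being concentrated is easy to simulate. Here is a short sketch (our own illustrative code, not McDiarmid's analysis) counting the comparisons made by randomized Quicksort; the samples cluster tightly around the expectation 2(n+1)H_n - 4n.

```python
import random

def quicksort_comparisons(arr, rng):
    """Count element-pivot comparisons made by randomized Quicksort
    (pivot chosen uniformly at random; assumes distinct elements)."""
    if len(arr) <= 1:
        return 0
    pivot = arr[rng.randrange(len(arr))]
    less = [x for x in arr if x < pivot]
    greater = [x for x in arr if x > pivot]
    return (len(arr) - 1) + quicksort_comparisons(less, rng) \
                          + quicksort_comparisons(greater, rng)

rng = random.Random(5)
n, trials = 500, 100
samples = [quicksort_comparisons(list(range(n)), rng) for _ in range(trials)]
mean = sum(samples) / trials   # expectation 2(n+1)H_n - 4n, about 4800 for n = 500
```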

- Use of the edge exposure martingale and Azuma's inequality to prove concentration of the chromatic number of random graphs. Note (i) the reduction to proving the existence of independent sets (or, equivalently, cliques) in G with very high probability; and (ii) the use of Azuma's inequality to prove this fact. Note also the crucial use of *edge-disjoint* cliques to define the martingale here. (See Quiz question for the role of this.)
- The expected length of an optimal Traveling Salesman tour for n random points in the unit cube in d dimensions (not proved in this class).
- Concentration of the optimal tour length about its expectation, proved using Azuma's inequality. Note in particular the tight bound obtained on the ith martingale difference (as a function of i), which makes the result go through: using a crude constant bound on the differences would give a much weaker result.

- Definition of a martingale (a stochastic process whose increments always have mean zero, regardless of the history). For the next few lectures we are going to discuss some rather magical properties of martingales that have various applications in computer science and combinatorics. In this first lecture, we focus on Azuma's inequality, which provides a Chernoff-like concentration bound for any martingale whose increments are bounded. Note that martingale increments need not be independent, so this takes us beyond the classical Chernoff-Hoeffding setting of the past few lectures.
- Standard ways of constructing martingales, especially the Doob martingale (and particular examples including edge exposure and vertex exposure martingales for analyzing random graphs).
- Proof of Azuma's inequality (Theorem 18.3). Note that this follows very similar lines to that of the Chernoff bound, with the key difference being the replacement of the independence assumption by a decomposition as a sum of increments, each of which is bounded and has mean zero: see top of page 18-4. The quantitative bounding of these factors is handled by Lemma 18.4, which is just an application of convexity (essentially as in Chernoff).
- Various simple applications, such as independent Bernoulli trials (binomial distribution), gambling with bounded stakes, balls and bins, and concentration of the chromatic number of a random graph (without even knowing the mean!).
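The gambling-with-bounded-stakes example makes a good numerical check of the inequality. Below is a minimal sketch (illustrative code): for a fair +/-1 game, Azuma gives Pr[|X_n| >= t] <= 2 exp(-t^2 / 2n), and simulation confirms the empirical tail sits below the bound.

```python
import math
import random

def azuma_bound(n, t, c=1.0):
    """Azuma tail bound for a martingale with |increments| <= c:
    Pr[|X_n - X_0| >= t] <= 2 exp(-t^2 / (2 n c^2))."""
    return 2.0 * math.exp(-t * t / (2.0 * n * c * c))

rng = random.Random(6)
n, t, trials = 400, 50, 5000
exceed = sum(abs(sum(rng.choice([-1, 1]) for _ in range(n))) >= t
             for _ in range(trials)) / trials
bound = azuma_bound(n, t)   # about 0.088; the empirical tail here is smaller
```

Note that the increments in this toy example happen to be independent; the point of Azuma is that the same bound holds without independence, as long as each increment has mean zero given the history and is bounded.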

- Lecture 1 (1/21)
- Lecture 2 (1/23)
- Lecture 3 (1/28)
- Lecture 4 (1/30)
- Lecture 5 (2/4)
- Lecture 6 (2/6)
- Lecture 7 (2/11)
- Lecture 8 (2/13)
- Lecture 9 (2/18)
- Lecture 10 (2/20)
- Lecture 11 (2/25)
- Lecture 12 (2/27)
- Lecture 13 (3/3)
- Lecture 14 (3/5)
- Lecture 15 (3/10)
- Lecture 16 (3/12)
- Lecture 17 (3/17)
- Lecture 18 (3/19)
- Lecture 19 (3/31)
- Lecture 20 (4/2)
- Lecture 21 (4/7)
- Lecture 22 (4/9)
- Lecture 23 (4/14)
- Lecture 24 (4/16)
- Lecture 25 (4/21)
- Lecture 26 (4/23)

- Quiz 1 (Out 3/31; Due 4/6)
- Quiz 1 solutions
- Quiz 2 (Out 4/7; Due 4/13)
- Quiz 2 solutions
- Quiz 3 (Out 4/14; Due 4/20)
- Quiz 3 solutions
- Quiz 4 (Out 4/21; Due 4/27)
- Quiz 4 solutions

- Problem Set 1 (Out 1/31; Due 2/13)
- Problem Set 1 solutions
- Problem Set 2 (Out 2/21; Due 3/5)
- Problem Set 2 solutions
- Problem Set 3 (Out 3/13; Due 4/9)
- Problem Set 3 solutions
- Problem Set 4 (Out 4/16; Due 4/30)
- Problem Set 4 solutions

- Elementary examples: e.g., checking identities, fingerprinting and pattern matching, primality testing.
- Moments and deviations: e.g., linearity of expectation, universal hash functions, second moment method, unbiased estimators, approximate counting.
- The probabilistic method: e.g., threshold phenomena in random graphs and random k-SAT formulas; Lovász Local Lemma.
- Chernoff/Hoeffding tail bounds: e.g., Hamilton cycles in a random graph, randomized routing, occupancy problems and load balancing, the Poisson approximation.
- Martingales and bounded differences: e.g., Azuma's inequality, chromatic number of a random graph, sharp concentration of Quicksort, optional stopping theorem and hitting times.
- Random spatial data: e.g., subadditivity, Talagrand's inequality, the TSP and longest increasing subsequences.
- Random walks and Markov chains: e.g., hitting and cover times, probability amplification by random walks on expanders, Markov chain Monte Carlo algorithms.
- Miscellaneous additional topics as time permits: e.g., statistical physics, reconstruction problems, rigorous analysis of black-box optimization heuristics,...

- Noga Alon and Joel Spencer, *The Probabilistic Method* (3rd ed.), Wiley, 2008.
- Svante Janson, Tomasz Łuczak and Andrzej Ruciński, *Random Graphs*, Wiley, 2000.
- Geoffrey Grimmett and David Stirzaker, *Probability and Random Processes* (3rd ed.), Oxford Univ Press, 2001.
- Michael Mitzenmacher and Eli Upfal, *Probability and Computing: Randomized Algorithms and Probabilistic Analysis* (2nd ed.), Cambridge Univ Press, 2017.
- Rajeev Motwani and Prabhakar Raghavan, *Randomized Algorithms*, Cambridge Univ Press, 1995.