CS174: COMBINATORICS & DISCRETE PROBABILITY

INSTRUCTOR: Alistair Sinclair (sinclair at eecs, 677 Soda)
LECTURES: Tuesday, Thursday 9:30-11:00 in 310 Soda
OFFICE HOURS: Tuesday 2:30-3:30, Thursday 11:00-12:00 in 677 Soda

TEACHING ASSISTANT: Siva Tanikonda (siva_tanikonda at berkeley)
DISCUSSION SECTIONS: Wednesday 10:00-11:00 in 20 Wheeler, Wednesday 12:00-1:00 in 320 Soda
OFFICE HOURS: Wednesday 3:00-4:00 in 504 Cory, Thursday 1:00-2:00 in 504 Cory

ED DISCUSSION: https://edstem.org/us/courses/73993/discussion/

RECENT ANNOUNCEMENTS


LECTURE SCHEDULE

Lecture, Date | Topic | Reading | Other
Lecture 1, Tue. Jan. 21 | Admin; course overview; testing polynomial identities | This web page; MU Section 1.1 |
Lecture 2, Thu. Jan. 23 | Basic probability; checking matrix multiplication | MU Sections 1.2, 1.3 | Fri. Jan. 24: HW 1 out
Lecture 3, Tue. Jan. 28 | Karger's MinCut algorithm; random variables & expectation | MU Sections 1.5, 2.1 |
Lecture 4, Thu. Jan. 30 | Random variables & expectation; binomial & geometric distributions; Quicksort; Markov's inequality | MU Sections 2.2, 2.4, 2.5, 3.1 | Fri. Jan. 31: HW 1 due; HW 2 out
Lecture 5, Tue. Feb. 4 | Variance; Chebyshev's inequality; randomized algorithm for median finding | MU Sections 3.2, 3.3, 3.5 |
Lecture 6, Thu. Feb. 6 | Randomized algorithm for median finding (cont.); moment generating functions | MU Sections 3.5, 4.1 | Fri. Feb. 7: HW 2 due; HW 3 out
Lecture 7, Tue. Feb. 11 | Chernoff bounds; Hoeffding bounds | MU Sections 4.2, 4.5 |
Lecture 8, Thu. Feb. 13 | Randomized routing on the hypercube | Supplementary Note 1 (see below) | Fri. Feb. 14: HW 3 due; HW 4 out
Lecture 9, Tue. Feb. 18 | Randomized routing cont.; Birthday problem; balls & bins; Poisson distribution | MU Sections 5.1, 5.2 |
Lecture 10, Thu. Feb. 20 | Poisson approximation | MU Sections 5.3, 5.4, 5.5 | Fri. Feb. 21: HW 4 due; HW 5 out
Lecture 11, Tue. Feb. 25 | Random graphs; Hamilton cycles | MU Section 5.6 |
Lecture 12, Thu. Feb. 27 | No lecture: work on HW5 | | Fri. Feb. 28: HW 5 due; HW 6 out
Lecture 13, Tue. Mar. 4 | Hamilton cycles cont.; intro to the probabilistic method | MU Sections 5.6, 6.1 |
Lecture 14, Thu. Mar. 6 | The probabilistic method | MU Sections 6.2, 6.3, 6.4 | Fri. Mar. 7: HW 6 due; no HW next week
Lecture 15, Tue. Mar. 11 | No lecture: midterm this evening | |
Lecture 16, Thu. Mar. 13 | Second moment method: thresholds in random graphs | MU Section 6.5 | Fri. Mar. 14: HW 7 out
Lecture 17, Tue. Mar. 18 | Pairwise independent random variables: construction and motivation | MU Sections 15.1, 15.2 |
Lecture 18, Thu. Mar. 20 | Universal hash functions: perfect hashing, heavy hitters in data streams | MU Sections 15.3, 15.4 | Fri. Mar. 21: HW 7 due; HW 8 out
Tue. Mar. 25 | Spring Break | |
Thu. Mar. 27 | Spring Break | |
Lecture 19, Tue. Apr. 1 | [Luke Gretta] Markov chains: basic properties, stationary distributions | MU Sections 7.1, 7.2, 7.3 |
Lecture 20, Thu. Apr. 3 | Markov chains continued: convergence, random walks on graphs, hitting and cover times | MU Sections 7.3, 7.4 | Fri. Apr. 4: HW 8 due; HW 9 out
Lecture 21, Tue. Apr. 8 | Mixing times; coupling; card shuffling; random walk on hypercube; graph colorings | MU Sections 12.1, 12.2, 12.3, 12.5 |
Lecture 22, Thu. Apr. 10 | Graph colorings (cont.); applications of Markov chain Monte Carlo | MU Sections 12.5, 11.1, 11.4 | Fri. Apr. 11: HW 9 due; HW 10 out
Lecture 23, Tue. Apr. 15 | Approximate counting | MU Sections 11.1, 11.3 |
Lecture 24, Thu. Apr. 17 | Martingales; optional stopping theorem and applications | MU Sections 13.1, 13.2, 13.3 | Fri. Apr. 18: HW 10 due; HW 11 out
Lecture 25, Tue. Apr. 22 | Martingales: optional stopping continued; Azuma's inequality | MU Sections 13.2, 13.3 |
Lecture 26, Thu. Apr. 24 | Azuma's inequality & applications | MU Sections 13.4, 13.5 |
Lecture 27, Tue. Apr. 29 | Fingerprinting and pattern matching | Supplementary Note 2 (see below) |
Lecture 28, Thu. May 1 | Primality testing | Supplementary Note 3 (see below) | Fri. May 2: HW 11 due

HOMEWORKS

All homeworks are due by Friday at 5:00pm, unless otherwise stated. You should submit your homework on Gradescope before this deadline. Please be sure to correctly match each of your answers with the template. Your solutions can be either typeset in LaTeX or written legibly by hand: no credit will be given for illegible or poorly written homeworks. The lowest two homework scores will be dropped. No late homeworks will be accepted.


DISCUSSIONS

Posted below are notes for discussion sections, provided by Siva.


OTHER HANDOUTS

Here you will find supplementary notes or other materials, when needed.

COURSE INFORMATION & POLICIES

Course Overview

One of the most remarkable developments in Computer Science over the past 50 years has been the realization that allowing computers to toss coins can lead to algorithms and data structures that are more efficient, conceptually simpler and more elegant than their best known deterministic counterparts. This course will cover numerous examples of this phenomenon. Typically, the algorithms we cover will be extremely simple: the challenge lies in analyzing them to show that they have the properties that we want. We will also cover the related topic of properties of random data (e.g., random graphs), and see that even very hard problems (such as the Traveling Salesman Problem) can be easy to solve on random inputs. Throughout the course we will make use of many elementary probabilistic concepts introduced in earlier courses such as CS70, Math 55 or EECS126. The course will follow selected topics from the textbook by Mitzenmacher and Upfal (see below), supplemented occasionally by additional notes. A very tentative course outline (not in chronological order), just to give you a flavor of the course, is the following:

Activities

There are two lectures each week (TuTh 9:30-11:00). The lectures will not be recorded, and attendance is expected. No time conflicts are allowed. There will also be one discussion section each week (W 10-11 or W 12-1), run by TA Siva Tanikonda: you are strongly advised to attend these discussions every week, as they are designed to be complementary to the lectures; in particular, they will focus on developing your problem-solving skills rather than introducing new lecture material. There will also be regular weekly homeworks (see "Assessment" below for more details).

Materials

The following book is required, and we shall use it extensively: Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis (2nd ed.), by Michael Mitzenmacher and Eli Upfal, Cambridge University Press, 2017. It is essential that all students have regular access to this book. Pointers to the relevant sections of the book will be provided as we go along. Note: The first edition of this book (from 2005) is quite similar, but omits some topics. You should also be aware that its page numbers, section numbers, etc. differ from those of the 2nd edition that we will use. Use at your own risk!

It is highly recommended that you use the book as follows: Read the material for the next lecture ahead of time at a high level (don't worry about all the details); then re-read this material in detail as soon as possible after class. Clarify any misunderstandings early in discussion sections or office hours. A pointer to the reading material for the next lecture will be posted on the Lecture Schedule section of the web page shortly after each lecture.

I will occasionally supply lecture notes for additional topics that are not covered in the book, or that are covered there differently from how they are presented in lectures. These will be posted on the web page and linked from the Lecture Schedule, and are required reading.

Prerequisites

You should have taken an upper division course in algorithms (CS170 or equivalent) and a course on discrete mathematics and basic probability (CS70, Math 55, EECS126 or equivalent). If you struggled in any of those classes, then you will likely find this class harder and you should consider carefully whether it is the right class for you. If you have any doubts about your background you should come and talk to me as soon as possible. Note: Some students ask about potential overlap between this class and EECS126. While both classes make heavy use of probability, the emphasis in CS174 is much more on the application of discrete probability to algorithms and data structures, in contrast to EECS126 which teaches a variety of (discrete and continuous) probabilistic concepts with a focus on applications in signal processing, artificial intelligence, control theory, etc.

Assessment

There will be weekly homeworks, one midterm and one final. The overall grade will be based 20% on the homeworks, 30% on the midterm, and 50% on the final. The midterm will take place on Tuesday March 11, 7-9pm in Room 60, Evans Hall. The final exam (subject to confirmation) is on Wednesday May 14th, 11:30-2:30. Please note that no alternate final will be offered, and an alternate midterm will be offered only in documented cases of academic conflicts.

Students are advised that, although the homeworks do not carry a very large numerical weight, they are an essential aid to understanding the material as the course proceeds, and to performing well on the midterm and the final. Typically, homeworks will be circulated no later than Friday of each week, and will be due by 5pm the following Friday. All submissions will be on Gradescope, and must be either typeset in LaTeX or written legibly and scanned: illegible solutions will receive zero credit as we don't have time to decipher them. We will make every effort to return graded homeworks within one week. The lowest two homework scores will be automatically dropped. Late homework will not be accepted for any reason (this is what the two dropped homeworks are for!).
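
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It is not the official grading procedure. It assumes that every score is expressed as a fraction between 0 and 1, that a missed homework counts as 0, and that the homeworks remaining after the two lowest are dropped are averaged with equal weight; none of these details is specified on this page. The function and constant names are hypothetical.

```python
from typing import Sequence

# Weights and drop policy as stated in the Assessment section.
HOMEWORK_WEIGHT = 0.20
MIDTERM_WEIGHT = 0.30
FINAL_WEIGHT = 0.50
DROPPED_HOMEWORKS = 2  # the lowest two homework scores are dropped


def homework_average(scores: Sequence[float],
                     dropped: int = DROPPED_HOMEWORKS) -> float:
    """Mean of the homework scores after dropping the lowest `dropped`.

    ASSUMPTION: scores are fractions in [0, 1] and the remaining
    homeworks are weighted equally.
    """
    kept = sorted(scores)[dropped:]  # ascending sort, discard the lowest
    if not kept:
        raise ValueError("need more homework scores than the number dropped")
    return sum(kept) / len(kept)


def overall_score(homeworks: Sequence[float],
                  midterm: float,
                  final: float) -> float:
    """Weighted overall score: 20% homeworks, 30% midterm, 50% final."""
    return (HOMEWORK_WEIGHT * homework_average(homeworks)
            + MIDTERM_WEIGHT * midterm
            + FINAL_WEIGHT * final)


if __name__ == "__main__":
    # Hypothetical student: 11 homeworks, two of them missed (score 0).
    homeworks = [0.9] * 9 + [0.0, 0.0]
    print(round(overall_score(homeworks, midterm=0.8, final=0.7), 4))
```

In this hypothetical example the two missed homeworks are exactly the ones dropped, so the homework average is 0.9 and the overall score is 0.2 × 0.9 + 0.3 × 0.8 + 0.5 × 0.7 = 0.77. How an overall score maps to a letter grade is not stated here, so the sketch does not attempt it.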

Grading

Regrading of homeworks or exams will only be undertaken in cases where you believe there has been a genuine error or misunderstanding. Bear in mind that our primary aim in grading is consistency, so that all students are treated the same; for this reason, we will not adjust the score of one student on an issue of partial credit unless the score allocated clearly deviates from the grading policy we adopted for that problem. We will open each assignment for regrade requests for a short window after it is returned. If you don't understand the solution to a problem, you should address this in an office hour or on Ed rather than through a regrade request.

Website and Discussion Forum

The instructor and TA will post announcements, clarifications, hints, etc. to this website and/or to the Ed Discussion site. Hence you must check both this website and Ed frequently throughout the semester. You are encouraged to use the Ed forum to discuss administrative or technical topics directly related to the class. However, please refrain from the following:

Please also be aware that Ed is primarily a forum for student discussions, and we hope that you will contribute by answering questions posted by your classmates. We will monitor the forum and post responses where needed, but you should not expect rapid responses to all questions.

If you have a more personal question that is not of interest to other students, you should instead send email to the instructor or TA. If your question requires more extensive discussion, please come to office hours or make an appointment with one of us over email. Please reserve email for questions you can't get answered in office hours, in discussion sections, or through the Ed forum.

In any class, it can be challenging for the instructor to gauge how smoothly the class is going. We always welcome any feedback on what we could be doing better. If you would like to send anonymous comments or criticisms, please feel free to use an anonymous remailer like this one to avoid revealing your identity.

Course Climate

UC Berkeley is committed to creating a learning environment welcoming of all students that supports a diversity of thoughts, perspectives and experiences, and respects the identities and backgrounds of all. Should you feel unsafe, disrespected or unwelcome in any aspect of this course (or elsewhere on campus), you should feel free to approach the instructor or TA, or alternatively to fill out this Incident Reporting Form provided by the CS department.

Collaboration and Academic (Dis)Honesty

You are strongly encouraged to work together in small study groups (2-4 people) to help each other to understand the material. For homeworks, you are encouraged to discuss the meaning of a problem and ways to solve it; however, you must always write up the solutions on your own. Similarly, you may use books and other sources to help solve homework problems, but you must write up the solution on your own and cite any sources you use.

A succinct way of saying the above is the following: Exchanging ideas is fine, but any such ideas must lead to a good enough understanding on your part that you can fully and completely write up the solution on your own.

Copying another person's work or presenting it as your own constitutes cheating, whether that person is a friend, a student in this class or a previous semester's class, ChatGPT, or an anonymous person on the Internet who happens to have solved the problem you've been asked to solve. Everything you turn in must be your own work. Your attention is drawn to the Department's Policy on Academic Dishonesty. Note in particular that cheating on a homework or a midterm will result in a grade of zero for that assignment, and possibly a report to the Office of Student Conduct. Cheating on the final, or repeated offenses, will be reported to the Office of Student Conduct and will result in an F on the course.

A Note on ChatGPT

This class does not pretend that ChatGPT does not exist, nor that it is not capable of solving some (or many) of the homework problems (though sometimes erroneously). Rather than banning the use of ChatGPT (or other similar models), we allow you to use it under the following conditions:


You are also advised that you will not have access to ChatGPT in the exams, so if you have relied on it for the homeworks, you will have missed the valuable opportunity to develop the problem-solving skills you need for those exams.

SOME HELPFUL HINTS

The following tips are offered based on our experience with Upper Division classes in CS Theory. If you follow these guidelines, you will make life much easier for yourself in this class.

1. Don't fall behind! In a conceptual class such as this, it is particularly important to maintain a steady effort throughout the semester, rather than hope to cram just before homework deadlines or exams. This is because it takes time and practice for the ideas to sink in. Make sure you allocate a sufficient number of hours every week to the class, including enough time for reading and understanding the material as well as for doing assignments. (As a rough guide, you should expect to do at least one hour of reading and two hours of problem solving for each hour of lecture.) Even though this class does not have any major projects, you should plan to spend as much time on it as on any of your other Upper Division technical classes.

2. Take the homeworks seriously! The homeworks are explicitly designed to help you to learn the material as you go along. Although the numerical weight of the homeworks is not huge, there is usually a strong correlation between homework scores and final grades in the class. Also, regardless of how well you did on the homework, read the sample solutions, even for the problems you got right. You may well learn a different way of looking at the problem, and you may also benefit from emulating the style of the solutions. (In science people learn a lot from emulating the approach of more experienced scientists.)

3. Make use of office hours! The instructor and TA hold office hours expressly to help you. It is often surprising how many students do not take advantage of this service. You are free to attend as many office hours as you wish. You will also likely get more out of an office hour if you have spent a little time in advance thinking about the questions you have, and formulating them precisely. (In fact, this process can often lead you to a solution yourself!)

4. Take part in discussion sections! Discussion sections are not auxiliary lectures. They are an opportunity for interactive learning, through guided group problem solving and other activities. The success of a discussion section depends largely on the willingness of students to participate actively in it. As with office hours, the better prepared you are for the discussion, the more you are likely to get out of it.

5. Form study groups! As stated above, you are encouraged to form small groups (two to four people) to work together on homeworks and on understanding the class material on a regular basis. In addition to being fun, this can save you a lot of time by generating ideas quickly and preventing you from getting hung up on some point or other. Of course, it is your responsibility to ensure that you contribute actively to the group; passive listening will likely not help you much. And recall the caveat above that you must write up your solutions on your own.