CS174: COMBINATORICS & DISCRETE PROBABILITY

INSTRUCTOR: Alistair Sinclair (sinclair at eecs, 677 Soda)
LECTURES: Tuesday, Thursday 9:30-11:00 in 310 Soda
OFFICE HOURS: Monday 10:00-11:00, Thursday 11:00-12:00 in 677 Soda

TEACHING ASSISTANT: Milind Jagota (milind_jagota at berkeley, 378B Stanley)
DISCUSSION SECTION: W 10:00-11:00 in 3111 Etcheverry
OFFICE HOURS: Tuesday 2:00-3:00 in 411 Soda, Wednesday 1:00-2:00 in 347 Soda

READER: Jonathan Guo (jonguo6 at berkeley)

ED DISCUSSION: https://edstem.org/us/courses/33144/discussion/

RECENT ANNOUNCEMENTS


LECTURE SCHEDULE

Lecture, Date | Topic | Reading | Other
Lecture 1, Tue. Jan. 17 | Admin; course overview; testing polynomial identities | This web page; MU Section 1.1
Lecture 2, Thu. Jan. 19 | Basic probability | MU Section 1.2 | Fri. Jan. 20: HW 1 out
Lecture 3, Tue. Jan. 24 | Testing matrix multiplication; Karger's min cut algorithm; random variables & expectation | MU Sections 1.3, 1.5, 2.1
Lecture 4, Thu. Jan. 26 | Random variables & expectation; binomial & geometric distributions; Quicksort | MU Sections 2.1, 2.2, 2.4, 2.5 | Fri. Jan. 27: HW 1 due; HW 2 out
Lecture 5, Tue. Jan. 31 | Jensen's inequality; moments; Markov and Chebyshev inequalities | MU Sections 2.1.2, 3.1, 3.2, 3.3
Lecture 6, Thu. Feb. 2 | Randomized algorithm for computing the median | MU Sections 3.4, 3.5 | Fri. Feb. 3: HW 2 due; HW 3 out
Lecture 7, Tue. Feb. 7 | Moment generating functions; Chernoff bounds | MU Sections 4.1, 4.2
Lecture 8, Thu. Feb. 9 | Discrepancy/Set balancing; randomized routing on the hypercube | MU Section 4.4, Supplementary Note 1 | Fri. Feb. 10: HW 3 due; HW 4 out
Lecture 9, Tue. Feb. 14 | Finish up randomized routing; birthday problem; balls & bins | Supp. Note 1; MU Sections 5.1, 5.2
Lecture 10, Thu. Feb. 16 | Poisson distribution; Poisson approximation | MU Sections 5.3, 5.4 | Fri. Feb. 17: HW 4 due; HW 5 out
Lecture 11, Tue. Feb. 21 | Random graphs; Hamilton cycles | MU Section 5.6
Lecture 12, Thu. Feb. 23 | The Probabilistic Method: Ramsey numbers, crossing number | MU Section 6.1 | Fri. Feb. 24: HW5 due. [No HW next week.]
Lecture 13, Tue. Feb. 28 | Review session (Milind) | Sample midterm
Lecture 14, Thu. Mar. 2 | The Probabilistic Method: MaxCut, MaxSAT, Derandomization; Thresholds in random graphs | MU Sections 6.2, 6.3, 6.4, 6.5 | [No HW next week.]
Lecture 15, Tue. Mar. 7 | Cancelled due to midterm
Lecture 16, Thu. Mar. 9 | Pairwise independent random variables: constructions and applications | MU Sections 15.1, 15.2 | Fri. Mar. 10: HW6 out
Lecture 17, Tue. Mar. 14 | Universal hash functions; perfect hashing; heavy hitters | MU Sections 15.3, 15.4
Lecture 18, Thu. Mar. 16 | Finish up heavy hitters; Markov chain basics | MU Sections 7.1, 7.2 | Fri. Mar. 17: HW6 due; HW7 out
Lecture 19, Tue. Mar. 21 | Lecture by Milind on Sampling Functional Proteins | See weekly bulletin
Lecture 20, Thu. Mar. 23 | Markov chains: stationary distributions, hitting and cover times | MU Sections 7.3, 7.4 | Fri. Mar. 24: HW7 due
Tue. Mar. 28 | Spring Break
Thu. Mar. 30 | Spring Break
Lecture 21, Tue. Apr. 4 | Conclude hitting & cover times; preview of the MCMC method | MU Section 7.4
Lecture 22, Thu. Apr. 6 | Mixing times; coupling; card shuffling; random walk on hypercube; graph colorings | MU Sections 12.1, 12.2, 12.3, 12.5 | Fri. Apr. 7: HW8 out
Lecture 23, Tue. Apr. 11 | The Monte Carlo method | MU Chapter 11; also Q3 on HW4
Lecture 24, Thu. Apr. 13 | Martingales: Stopping times; Optional Stopping Theorem | MU Sections 13.1, 13.2, 13.3 | Fri. Apr. 14: HW8 due; HW9 out
Lecture 25, Tue. Apr. 18 | Martingales: Azuma's inequality and large deviations | MU Section 13.4
Lecture 26, Thu. Apr. 20 | Azuma's inequality: Applications | MU Section 13.5 | Fri. Apr. 21: HW9 due; HW10 out
Lecture 27, Tue. Apr. 25 | Fingerprinting | Supplementary Note 2
Lecture 28, Thu. Apr. 27 | Primality testing | Supplementary Note 3 | Fri. Apr. 28: HW10 due

HOMEWORKS

All homeworks are due by Friday at 5:00pm, unless otherwise stated. You should submit your homework on Gradescope before this deadline. Please be sure to correctly match each of your answers with the template. Your solutions can either be typeset in LaTeX, or written legibly by hand: no credit will be given for illegible or poorly written homeworks. The lowest two homework scores will be dropped. No late homeworks will be accepted.

DISCUSSIONS

Posted below are notes for discussion sections, provided by Milind.

OTHER HANDOUTS

Here you will find supplementary notes or other materials, when needed.

COURSE INFORMATION & POLICIES

Course Overview

One of the most remarkable developments in Computer Science over the past 50 years has been the realization that allowing computers to toss coins can lead to algorithms and data structures that are more efficient, conceptually simpler and more elegant than their best known deterministic counterparts. This course will cover numerous examples of this phenomenon. Typically, the algorithms we cover will be extremely simple: the challenge lies in analyzing them to show that they have the properties that we want. We will also cover the related topic of properties of random data (e.g., random graphs), and see that even very hard problems (such as the Traveling Salesman Problem) can be easy to solve on random inputs. Throughout the course we will make use of many elementary probabilistic concepts introduced in earlier courses such as CS70, Math 55 or EECS126. The course will follow selected topics from the textbook by Mitzenmacher and Upfal (see below), supplemented occasionally by additional notes. A very tentative course outline (not in chronological order), just to give you a flavor of the course, is the following:

Activities

There are two lectures each week (TuTh 9:30-11:00). The lectures will not be recorded, and attendance is expected. No time conflicts are allowed. While masks are not required, you are requested to wear a mask in class for everybody's safety and comfort (including your own). There will also be one discussion section each week (W 10-11), run by GSI Milind Jagota: you are strongly advised to attend these discussions, as they are designed to be complementary to the lectures; in particular, they will focus on developing your problem-solving skills rather than introducing new lecture material. There will also be regular weekly homeworks (see "Assessment" below for more details).

Materials

The following book is required, and we shall use it extensively: Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis, (2nd ed.) by Michael Mitzenmacher and Eli Upfal, Cambridge University Press, 2017. It is essential that all students have regular access to this book. Pointers to the relevant sections of the book will be provided as we go along. Note: The first edition of this book (from 2005) is quite similar, but omits some topics, and its page and section numbering differs from that of the 2nd edition that we will use. Use it at your own risk!

It is highly recommended that you use the book as follows: Read the material for the next lecture ahead of time at a high level (don't worry about all the details); then re-read this material in detail as soon as possible after class. Clarify any misunderstandings early in discussion sections or office hours. A pointer to the reading material for the next lecture will be posted on the Lecture Schedule section of the web page shortly after each lecture.

I will occasionally supply lecture notes for additional topics not covered in the book, or covered differently from lectures. These will be posted on the web page and linked from the Lecture Schedule, and are required reading.

Prerequisites

You should have taken an upper division course in algorithms (CS170 or equivalent) and a course on discrete mathematics and basic probability (CS70, Math 55, EECS126 or equivalent). If you struggled in any of those classes, then you will likely find this class harder and you should consider carefully whether it is the right class for you. If you have any doubts about your background you should come and talk to me as soon as possible. Note: Some students ask about potential overlap between this class and EECS126. While both classes make heavy use of probability, the emphasis in CS174 is much more on the application of discrete probability to algorithms and data structures, in contrast to EECS126 which teaches a variety of (discrete and continuous) probabilistic concepts with a focus on applications in signal processing, artificial intelligence, etc.

Assessment

There will be weekly homeworks, one midterm and one final. The overall grade will be based 20% on the homeworks, 30% on the midterm, and 50% on the final. The midterm will take place on Tuesday March 7th, 7-9pm. The final exam (subject to confirmation) is on Wednesday May 10th, 11:30-2:30. Please note that no alternate final will be offered, and an alternate midterm will be offered only in documented cases of academic conflicts.

Students are advised that, although the homeworks do not carry a very large numerical weight, they are an essential aid to understanding the material as the course proceeds, and to performing well on the midterm and the final. Typically, homeworks will be circulated no later than Friday of each week, and will be due by 5pm the following Friday. All submissions will be on Gradescope, and must be either typeset in LaTeX or written legibly and scanned: illegible solutions will receive zero credit as we don't have time to decipher them. We will make every effort to return graded homeworks within one week. The lowest two homework scores will be automatically dropped. Late homework will not be accepted for any reason (this is what the two dropped homeworks are for!).
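
For a rough sense of how the stated weights and the drop policy combine, here is a minimal sketch in Python. It is an illustration only, not the official grading procedure: the function name `overall_grade`, the assumption that every score is a fraction between 0 and 1, and the assumption that the remaining homeworks count equally are all hypothetical.

```python
def overall_grade(homeworks, midterm, final, dropped=2):
    """Combine scores using the stated 20% / 30% / 50% weights.

    Hypothetical helper: every score is assumed to be a fraction in [0, 1].
    """
    # Drop the lowest `dropped` homework scores, as the policy describes.
    kept = sorted(homeworks)[dropped:]
    # Assumption: the homeworks that remain are averaged with equal weight.
    hw_avg = sum(kept) / len(kept) if kept else 0.0
    return 0.20 * hw_avg + 0.30 * midterm + 0.50 * final


# Ten homeworks, two of them weak; the two lowest scores are dropped.
example = overall_grade([1.0] * 8 + [0.0, 0.5], midterm=0.8, final=0.9)
print(round(example, 2))
```

In this example the two weak homeworks have no effect on the result, which is the point of the drop policy.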

Grading

Regrading of homeworks or exams will only be undertaken in cases where you believe there has been a genuine error or misunderstanding. Bear in mind that our primary aim in grading is consistency, so that all students are treated the same; for this reason, we will not adjust the score of one student on an issue of partial credit unless the score allocated clearly deviates from the grading policy we adopted for that problem. We will open each assignment for regrade requests for a short window after it is returned. If you don't understand the solution to a problem, you should address this in an office hour or on Ed rather than through a regrade request.

Website and Discussion Forum

The instructor and TA will post announcements, clarifications, hints, etc. to this website and/or to the Ed Discussion site. Hence you must check both this website and Ed frequently throughout the semester. You are encouraged to use the Ed forum to discuss administrative or technical topics directly related to the class. However, please refrain from the following: Please also be aware that Ed is primarily a forum for student discussions, and we hope that you will contribute by answering questions posted by your classmates. We will monitor the forum and post responses where needed, but you should not expect rapid responses to all questions.

If you have a more personal question that is not of interest to other students, you should instead send email to the instructor or TA. If your question requires more extensive discussion, please come to office hours or make an appointment with one of us over email. Please reserve email for questions you can't get answered in office hours, in discussion sections, or through the Ed forum.

In any class, it can be challenging for the instructor to gauge how smoothly the class is going. We always welcome any feedback on what we could be doing better. If you would like to send anonymous comments or criticisms, please feel free to use an anonymous remailer like this one to avoid revealing your identity.

Course Climate

UC Berkeley is committed to creating a learning environment welcoming of all students that supports a diversity of thoughts, perspectives and experiences, and respects the identities and backgrounds of all. Should you feel unsafe, disrespected or unwelcome in any aspect of this course (or elsewhere on campus), you should feel free to approach the instructor or TA, or alternatively to fill out this Incident Reporting Form provided by the CS department.

Collaboration and Academic (Dis)Honesty

You are strongly encouraged to work together in small study groups (2-4 people) to help each other to understand the material. For homeworks, you are encouraged to discuss the meaning of a problem and ways to solve it; however, you must always write up the solutions on your own. Similarly, you may use books and other sources to help solve homework problems, but you must write up the solution on your own and cite any sources you use.

A succinct way of saying the above is the following: Exchanging ideas is fine, but any such ideas must lead to a good enough understanding on your part that you can fully and completely write up the solution on your own.

Copying another person's work or presenting it as your own constitutes cheating, whether that person is a friend, a student in this class or a previous semester's class, or an anonymous person on the Internet who happens to have solved the problem you've been asked to solve. Everything you turn in must be your own work. Your attention is drawn to the Department's Policy on Academic Dishonesty. Note in particular that cheating on a homework or a midterm will result in a grade of zero for that assignment, and possibly a report to the Office of Student Conduct. Cheating on the final, or repeated offenses, will be reported to the Office of Student Conduct and will result in an F on the course.

SOME HELPFUL HINTS

The following tips are offered based on our experience with Upper Division classes in CS Theory. If you follow these guidelines, you will make life much easier for yourself in this class.

1. Don't fall behind! In a conceptual class such as this, it is particularly important to maintain a steady effort throughout the semester, rather than hope to cram just before homework deadlines or exams. This is because it takes time and practice for the ideas to sink in. Make sure you allocate a sufficient number of hours every week to the class, including enough time for reading and understanding the material as well as for doing assignments. (As a rough guide, you should expect to do at least one hour of reading and two hours of problem solving for each hour of lecture.) Even though this class does not have any major projects, you should plan to spend as much time on it as on any of your other Upper Division technical classes.

2. Take the homeworks seriously! The homeworks are explicitly designed to help you to learn the material as you go along. Although the numerical weight of the homeworks is not huge, there is usually a strong correlation between homework scores and final grades in the class. Also, regardless of how well you did on the homework, read the sample solutions, even for the problems you got right. You may well learn a different way of looking at the problem, and you may also benefit from emulating the style of the solutions. (In science people learn a lot from emulating the approach of more experienced scientists.)

3. Make use of office hours! The instructor and TA hold office hours expressly to help you. It is often surprising how many students do not take advantage of this service. You are free to attend as many office hours as you wish. You will also likely get more out of an office hour if you have spent a little time in advance thinking about the questions you have, and formulating them precisely. (In fact, this process can often lead you to a solution yourself!)

4. Take part in discussion sections! Discussion sections are not auxiliary lectures. They are an opportunity for interactive learning, through guided group problem solving and other activities. The success of a discussion section depends largely on the willingness of students to participate actively in it. As with office hours, the better prepared you are for the discussion, the more you are likely to get out of it.

5. Form study groups! As stated above, you are encouraged to form small groups (two to four people) to work together on homeworks and on understanding the class material on a regular basis. In addition to being fun, this can save you a lot of time by generating ideas quickly and preventing you from getting hung up on some point or other. Of course, it is your responsibility to ensure that you contribute actively to the group; passive listening will likely not help you much. And recall the caveat above that you must write up your solutions on your own.