CSE 521: Design and Analysis of Algorithms I Spring 2025 Lecture 1: The Probabilistic Method Lecturer: Shayan Oveis Gharan 04-01-2025 Scribe:
Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications.
1.1 Introduction to the Probabilistic Method
An old math puzzle goes: Suppose there are six people in a room; some of them shake hands. Prove that there are at least three people who all shook each others’ hands or three people such that no pair of them shook hands. Generalized a bit, this is the classic Ramsey problem. The diagonal Ramsey numbers are defined as follows: $R(k,k)$ is the smallest integer $n$ such that in every two-coloring of the edges of the complete graph $K_n$ by red and blue, there is a monochromatic copy of $K_k$, i.e. there are $k$ nodes such that all of the edges between them are red or all of the edges are blue. A solution to the puzzle above asserts that $R(3,3) \le 6$ (and it is easy to check that, in fact, $R(3,3) = 6$).
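Both claims are small enough to verify by brute force. The following sketch (an illustration, not part of the original notes) checks every one of the $2^{15}$ two-colorings of $K_6$ for a monochromatic triangle, and exhibits the standard pentagon/pentagram coloring of $K_5$ that avoids one:

```python
from itertools import combinations, product

def has_mono_triangle(n, coloring):
    """coloring maps each edge (i, j) with i < j to 0 (red) or 1 (blue)."""
    return any(
        coloring[(a, b)] == coloring[(b, c)] == coloring[(a, c)]
        for a, b, c in combinations(range(n), 3)
    )

# Every 2-coloring of the 15 edges of K_6 contains a monochromatic triangle.
edges6 = list(combinations(range(6), 2))
assert all(
    has_mono_triangle(6, dict(zip(edges6, colors)))
    for colors in product([0, 1], repeat=len(edges6))
)

# K_5 admits a coloring with no monochromatic triangle: color the 5-cycle
# edges red (0) and the diagonals blue (1); both color classes are 5-cycles.
edges5 = list(combinations(range(5), 2))
pentagon = {(i, j): 0 if (j - i) % 5 in (1, 4) else 1 for i, j in edges5}
assert not has_mono_triangle(5, pentagon)
```

Together the two checks confirm $R(3,3) = 6$.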
In 1929, Ramsey proved that $R(k,k)$ is finite for every $k$. We want to show that $R(k,k)$ must grow pretty fast; in fact, we’ll prove that for $k \ge 3$, we have $R(k,k) > \lfloor 2^{k/2} \rfloor$. This requires finding a coloring of $K_{\lfloor 2^{k/2} \rfloor}$ that doesn’t contain any monochromatic $K_k$. To do this, we’ll use the probabilistic method: we’ll give a random coloring of $K_{\lfloor 2^{k/2} \rfloor}$ and show that it satisfies our desired property with positive probability. This proof appeared in a paper of Erdős from 1947, and it is the example that starts Alon and Spencer’s famous book devoted to the probabilistic method, which will be one of the main resources for this course.
Lemma 1.1.
If $\binom{n}{k} 2^{1-\binom{k}{2}} < 1$, then $R(k,k) > n$. In particular, $R(k,k) > \lfloor 2^{k/2} \rfloor$ for $k \ge 3$.
Proof.
Consider a uniformly random 2-coloring of the edges of $K_n$: every edge is colored red or blue independently, with probability half each. For any fixed set $S$ of $k$ vertices, let $A_S$ denote the event that the induced subgraph on $S$ is monochromatic. An easy calculation yields
$$\Pr[A_S] = 2 \cdot 2^{-\binom{k}{2}} = 2^{1-\binom{k}{2}},$$
since all $\binom{k}{2}$ edges inside $S$ must receive the same color, and there are two choices for that color.
Since there are $\binom{n}{k}$ possible choices for $S$, we can use the union bound:
$$\Pr\Big[\bigcup_{S} A_S\Big] \le \sum_{S} \Pr[A_S] = \binom{n}{k} 2^{1-\binom{k}{2}}.$$
Thus if $\binom{n}{k} 2^{1-\binom{k}{2}} < 1$, then with positive probability, no event $A_S$ occurs. Thus there must exist at least one coloring with no monochromatic $K_k$. One can check that if $k \ge 3$ and $n = \lfloor 2^{k/2} \rfloor$, then this is satisfied:
$$\binom{n}{k} 2^{1-\binom{k}{2}} \le \frac{n^k}{k!} \cdot 2^{1-\binom{k}{2}} \le \frac{2^{k^2/2}}{k!} \cdot 2^{1+k/2-k^2/2} = \frac{2^{1+k/2}}{k!} < 1.$$ ∎
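The lemma’s condition can also be checked numerically. Below is a small sketch (not part of the notes) that rewrites $\binom{n}{k}2^{1-\binom{k}{2}} < 1$ as the exact integer inequality $2\binom{n}{k} < 2^{\binom{k}{2}}$, avoiding floating-point issues, and verifies it for $n = \lfloor 2^{k/2} \rfloor$ over a range of $k$:

```python
from math import comb, floor

# Lemma 1.1's condition binom(n,k) * 2^(1 - binom(k,2)) < 1, rewritten in
# exact integer arithmetic as 2 * binom(n,k) < 2^binom(k,2).
def condition_holds(n, k):
    return 2 * comb(n, k) < 2 ** comb(k, 2)

# For n = floor(2^(k/2)) the condition holds for every k >= 3, so a random
# coloring of K_n avoids a monochromatic K_k with positive probability.
for k in range(3, 60):
    assert condition_holds(floor(2 ** (k / 2)), k)
```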
In the proof, we employed the following fundamental tool:
Fact 1.2 (Union Bound).
If $A_1, \dots, A_n$ are arbitrary events, then $\Pr[A_1 \cup \dots \cup A_n] \le \Pr[A_1] + \dots + \Pr[A_n]$.
1.2 Linearity of Expectations
Fact 1.3 (Linearity of Expectation).
If $X_1, \dots, X_n$ are real-valued random variables, then
$$\mathbb{E}[X_1 + \dots + X_n] = \mathbb{E}[X_1] + \dots + \mathbb{E}[X_n].$$
The great fact about this identity is that we don’t need to know anything about the relationships between the random variables; linearity of expectation holds no matter what the dependence structure is.
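A classic illustration of this point (not from the lecture): the number of fixed points of a uniform random permutation of $\{0,\dots,n-1\}$ is a sum of $n$ heavily dependent indicator variables, each with expectation $1/n$, so its expectation is exactly 1 for every $n$. The sketch below confirms this by exact enumeration for small $n$:

```python
from itertools import permutations
from math import factorial
from fractions import Fraction

# The indicators "position i is a fixed point" are far from independent,
# yet linearity gives E[# fixed points] = n * (1/n) = 1 for every n.
for n in range(1, 8):
    total_fixed_points = sum(
        sum(1 for i, v in enumerate(p) if i == v)
        for p in permutations(range(n))
    )
    assert Fraction(total_fixed_points, factorial(n)) == 1
```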
Let’s consider a 3-CNF formula $\phi$ over the variables $x_1, \dots, x_n$. Such a formula has the form $\phi = C_1 \wedge C_2 \wedge \dots \wedge C_m$, where each clause $C_i$ is an OR of three literals involving distinct variables: $C_i = \ell_{i,1} \vee \ell_{i,2} \vee \ell_{i,3}$. A literal is a variable or its negation. For instance, $(x_1 \vee \bar{x}_2 \vee x_3) \wedge (\bar{x}_1 \vee x_2 \vee \bar{x}_4)$ is a 3-CNF formula.
Lemma 1.4.
If $\phi$ is a 3-CNF formula with $m$ clauses, then there exists an assignment that makes at least $\frac{7m}{8}$ clauses evaluate to true.
Proof.
We will prove this using the probabilistic method. For every variable independently, we choose a uniformly random truth assignment: true or false, each with probability $1/2$. Let $Y_i$ equal 1 if clause $C_i$ is satisfied by our random assignment, and equal 0 otherwise. Then $\mathbb{E}[Y_i] = 7/8$, because there are 7 ways to satisfy a clause out of the 8 possible truth values for its literals. Let $Y = Y_1 + \dots + Y_m$ denote the total number of satisfied clauses. By linearity of expectation,
$$\mathbb{E}[Y] = \sum_{i=1}^{m} \mathbb{E}[Y_i] = \frac{7m}{8}.$$
Since $Y$ must take a value at least its expectation with positive probability, there must be an assignment that satisfies at least $\frac{7m}{8}$ clauses. ∎
1.3 Method of Conditional Expectations
The above lemma asserts that there exists an assignment satisfying at least $\frac{7m}{8}$ clauses, but what if we wish to actually find one? One way is to randomly sample from the underlying distribution and then check the resulting assignment. Analyzing the probability of success requires tail bounds, which we will discuss in future lectures.
In this section, we will discuss a generic method that can turn many probabilistic method proofs into even deterministic algorithms. Let $E(a_1, \dots, a_n)$ denote the expected number of satisfied clauses given a partial truth assignment $x_1 = a_1, \dots, x_n = a_n$ to the input variables, where we choose the unassigned variables uniformly at random. We will use T to denote true, F to denote false, and * to denote that no assignment has been chosen for that variable. For instance, $E(*, *, \dots, *)$ denotes the expected number of satisfied clauses in a fully random assignment, and we have already seen that $E(*, *, \dots, *) = \frac{7m}{8}$.
Note that a simple linear-time algorithm can compute $E(a_1, \dots, a_n)$ for any partial assignment by simply going through the clauses one by one: a clause that is already satisfied contributes 1, and a clause that is not yet satisfied and has $j$ undecided literals contributes $1 - 2^{-j}$.
As an example, consider the clause $x_1 \vee \bar{x}_2 \vee x_3$. The probability that a random assignment satisfies it is $7/8$. If we assign $x_1 = \mathrm{F}$, then the probability becomes $3/4$, and if we set $x_1 = \mathrm{T}$, then the probability becomes 1. Observe that
$$E(*, *, \dots, *) = \tfrac{1}{2} E(\mathrm{T}, *, \dots, *) + \tfrac{1}{2} E(\mathrm{F}, *, \dots, *).$$
Consequently, it must hold that
$$\max\{E(\mathrm{T}, *, \dots, *),\ E(\mathrm{F}, *, \dots, *)\} \ge E(*, *, \dots, *).$$
As we have just argued, it’s possible to compute both of these quantities and figure out which is larger. We can then set $x_1$ to the corresponding value and keep assigning truth values recursively. Since the value of $E(\cdot)$ never goes down and it starts at $\frac{7m}{8}$, when the algorithm finishes we must satisfy at least $\frac{7m}{8}$ clauses. Note that the algorithm may indeed satisfy more than a $7/8$ fraction of the clauses.
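The greedy procedure above can be sketched in a few lines. This is an illustrative implementation, not the official one from the course; the literal encoding ($+i$ for $x_i$, $-i$ for $\bar{x}_i$) and the example formula are my own choices:

```python
from fractions import Fraction

def cond_expectation(clauses, assignment):
    """E[# satisfied clauses] when every unset variable is uniform at random.

    A clause is a list of literals: +i stands for x_i, -i for its negation.
    `assignment` maps a variable index to a bool for variables fixed so far.
    """
    total = Fraction(0)
    for clause in clauses:
        undecided, satisfied = 0, False
        for lit in clause:
            var, want = abs(lit), lit > 0
            if var in assignment:
                satisfied = satisfied or assignment[var] == want
            else:
                undecided += 1
        # A satisfied clause contributes 1; otherwise the clause fails only
        # when all undecided literals come out wrong: probability 2^-undecided.
        total += 1 if satisfied else 1 - Fraction(1, 2 ** undecided)
    return total

def max3sat_greedy(clauses, nvars):
    """Fix variables one by one, never letting E[# satisfied] decrease."""
    assignment = {}
    for var in range(1, nvars + 1):
        e_true = cond_expectation(clauses, {**assignment, var: True})
        e_false = cond_expectation(clauses, {**assignment, var: False})
        assignment[var] = e_true >= e_false
    return assignment

clauses = [[1, -2, 3], [-1, 2, -4], [2, 3, 4], [-1, -3, -4]]
a = max3sat_greedy(clauses, 4)
satisfied = sum(any(a[abs(l)] == (l > 0) for l in c) for c in clauses)
assert satisfied >= Fraction(7, 8) * len(clauses)
```

The `Fraction` arithmetic keeps the conditional expectations exact, so the invariant $E$ never drops is maintained without rounding worries.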
1.4 Choosing the Right Distribution
Here is a more complicated example in which the choice of distribution requires a preliminary lemma. Let $\Omega = \Omega_1 \cup \dots \cup \Omega_k$, where the $\Omega_i$’s are disjoint sets, each of size $n$. Let $h : \binom{\Omega}{k} \to \{-1, +1\}$ be a two-coloring of the $k$-sets. A $k$-set $E$ is crossing if it contains precisely one point from each $\Omega_i$. For $S \subseteq \Omega$ set

$$h(S) = \sum_{E \in \binom{S}{k}} h(E). \tag{1.1}$$
Theorem 1.5.
Suppose $h(E) = +1$ for all crossing $k$-sets $E$. Then there is an $S \subseteq \Omega$ for which
$$|h(S)| \ge c_k n^k.$$
Here $c_k$ is a positive constant, which is independent of $n$.
Perhaps the first attempt is to include each element of $\Omega$ in $S$, independently, with probability $1/2$. It turns out that for such a distribution $\mathbb{E}[h(S)]$ can even be negative: e.g., assume $h(E) = -1$ for every non-crossing $k$-set; since each $k$-set satisfies $\Pr[E \subseteq S] = 2^{-k}$ and fewer than half of all $k$-sets are crossing, the sum $\mathbb{E}[h(S)] = 2^{-k} \sum_E h(E)$ is negative. If you think about it deeply, you would wonder: why $1/2$? As we will see, choosing the elements of $S$ independently is right, but we need to be careful about the marginals; we want to choose the marginals based on the function $h$ given to us.
But how? Let $p_1, \dots, p_k \in [0,1]$ be the marginals of the elements of $\Omega$, to be determined; i.e., we sample elements of $\Omega$ independently, but elements from the same part $\Omega_i$ are chosen with the same marginal $p_i$. Given $p_1, \dots, p_k$, we define a random set $S$ where, for every element $a \in \Omega_i$, we add $a$ to $S$ with probability $p_i$, independently of every other element.
Define a random variable

$$X = h(S). \tag{1.2}$$

It turns out that we can write $\mathbb{E}[X]$ as a $k$-homogeneous polynomial in $p_1, \dots, p_k$:
$$\mathbb{E}[X] = \sum_{E \in \binom{\Omega}{k}} h(E) \Pr[E \subseteq S] = \sum_{E \in \binom{\Omega}{k}} h(E) \prod_{i=1}^{k} p_i^{|E \cap \Omega_i|} = \sum_{\substack{a_1, \dots, a_k \ge 0 \\ a_1 + \dots + a_k = k}} \Big( \sum_{E : |E \cap \Omega_i| = a_i \ \forall i} h(E) \Big) \, p_1^{a_1} \cdots p_k^{a_k}.$$
In the third equality, we classify all $k$-sets by their “type”, namely the sizes of their intersections with $\Omega_1, \dots, \Omega_k$.
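This expansion can be sanity-checked in the smallest interesting case. The sketch below (my own example, not from the notes) takes $k = n = 2$ with $h = +1$ on crossing pairs and $h = -1$ otherwise, and verifies by direct enumeration that $\mathbb{E}[h(S)] = 4 p_1 p_2 - p_1^2 - p_2^2$, matching the type-by-type expansion:

```python
from itertools import combinations
from fractions import Fraction

O1, O2 = {0, 1}, {2, 3}              # k = 2 parts, each of size n = 2
def h(E):                            # +1 on crossing pairs, -1 otherwise
    return 1 if len(E & O1) == 1 else -1

def expected_hS(p1, p2):
    """E[h(S)] = sum over 2-sets E of h(E) * Pr[E is a subset of S]."""
    total = Fraction(0)
    for E in combinations(O1 | O2, 2):
        E = set(E)
        pr = Fraction(1)
        for a in E:
            pr *= p1 if a in O1 else p2
        total += h(E) * pr
    return total

# Grouping 2-sets by type: the n^2 = 4 crossing pairs with h = +1 give the
# p1*p2 term; the single pair inside each part with h = -1 gives -p1^2, -p2^2.
for p1 in (Fraction(0), Fraction(1, 3), Fraction(1)):
    for p2 in (Fraction(0), Fraction(1, 4), Fraction(1)):
        assert expected_hS(p1, p2) == 4*p1*p2 - p1**2 - p2**2
```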
To prove the theorem, we need to show that there is a choice of $p_1, \dots, p_k$ such that $|\mathbb{E}[X]| \ge c_k n^k$: by the probabilistic method (some outcome of $X = h(S)$ is at least $\mathbb{E}[X]$, and some outcome is at most it), this yields an $S$ with $|h(S)| \ge c_k n^k$. Dividing the expansion above by $n^k$, we obtain a multivariate polynomial
$$P(p_1, \dots, p_k) = \frac{\mathbb{E}[X]}{n^k}$$
in terms of $p_1, \dots, p_k$, and it is enough to show that there is a choice of $p_1, \dots, p_k \in [0,1]$ such that $|P(p_1, \dots, p_k)| \ge c_k$. That is what we show in the rest of the proof. The following properties of $P$ are immediate:
• $P$ is $k$-homogeneous; i.e., every monomial of $P$ has degree $k$.

• Since $h(E) = +1$ for all $n^k$ crossing sets, the coefficient of $p_1 p_2 \cdots p_k$ in $P$ is exactly $n^k / n^k = 1$.

• For every type $(a_1, \dots, a_k)$ with $a_1 + \dots + a_k = k$, there are at most $\binom{n}{a_1} \cdots \binom{n}{a_k} \le n^k$ sets of that type, each with $h(E) \in \{-1, +1\}$; hence every coefficient of $P$ has absolute value at most 1.
Finally, by the following fact, there exists a choice of $(p_1, \dots, p_k) \in [0,1]^k$ such that $|P(p_1, \dots, p_k)| \ge c_k$, as desired.
Fact 1.6.
Let $\mathcal{P}_k$ denote the set of all $k$-homogeneous polynomials $P(p_1, \dots, p_k)$ such that all coefficients of $P$ have absolute value at most one and the monomial $p_1 p_2 \cdots p_k$ has coefficient exactly one. Then, for all $P \in \mathcal{P}_k$,
$$\max_{p \in [0,1]^k} |P(p)| \ge c_k,$$
where $c_k > 0$ is an absolute constant depending only on $k$.
Proof.
Set
$$c_k = \min_{P \in \mathcal{P}_k} \ \max_{p \in [0,1]^k} |P(p)|.$$
The main observation is that for any $P \in \mathcal{P}_k$, $\max_{p \in [0,1]^k} |P(p)| > 0$. This is simply because $P$ is not the identically zero polynomial (it has one nonzero monomial), and a nonzero polynomial cannot vanish everywhere on $[0,1]^k$, since each variable ranges over an infinite set (in particular, one larger than the degree of $P$). Lastly, we observe that $\mathcal{P}_k$ is compact (as a subset of the finite-dimensional space of coefficient vectors) and $P \mapsto \max_{p \in [0,1]^k} |P(p)|$ is a continuous map. So it must attain a minimum value that is nonzero, i.e., $c_k > 0$. ∎
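A quick illustration (my own, not from the notes) of how small $c_k$ can be while staying positive: for $k = 2$, the polynomial $P(p_1, p_2) = -\tfrac{1}{2}p_1^2 + p_1 p_2 - \tfrac{1}{2}p_2^2 = -\tfrac{1}{2}(p_1 - p_2)^2$ belongs to the family ($p_1 p_2$ has coefficient exactly 1, the others have absolute value $1/2$), and its maximum absolute value on $[0,1]^2$ is only $1/2$, so $c_2 \le 1/2$:

```python
# P(p1, p2) = -(p1 - p2)^2 / 2 is in the family of Fact 1.6; its maximum
# absolute value on [0,1]^2 is 1/2, attained at (1,0) and (0,1). A grid
# search confirms the maximum: small, but bounded away from zero.
grid = [i / 100 for i in range(101)]
max_abs = max(
    abs(-0.5 * p1**2 + p1 * p2 - 0.5 * p2**2)
    for p1 in grid for p2 in grid
)
assert abs(max_abs - 0.5) < 1e-9
```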
1.5 The Alteration Method
Sometimes in a probabilistic method proof, we may not directly obtain the object of interest. Instead, we may try to sample a “good enough” object and then show that a small number of tweaks can turn it into a feasible object.
Recall that $R(k,k)$ is the smallest integer $n$ such that in every two-coloring of the edges of the complete graph $K_n$ by red and blue there is a monochromatic copy of $K_k$. The following is a stronger variant of Lemma 1.1.
Theorem 1.7.
For any integers $n$ and $k$, we have $R(k,k) > n - \binom{n}{k} 2^{1-\binom{k}{2}}$.
Proof.
As in Lemma 1.1, consider a uniformly random 2-coloring of the edges of $K_n$. Let $A_S$ be the event that the subgraph induced on a $k$-set $S$ is monochromatic. Let
$$X = \sum_{S \in \binom{[n]}{k}} \mathbb{1}[A_S]$$
be the number of monochromatic copies of $K_k$ in our two-colored graph. By linearity of expectation,
$$\mathbb{E}[X] = \binom{n}{k} 2^{1-\binom{k}{2}}.$$
Now, it follows that there must exist a two-coloring such that the number of monochromatic copies of $K_k$ is at most $\binom{n}{k} 2^{1-\binom{k}{2}}$. Consider such a coloring.
Now, we discuss the alteration part: we know that we have (at most) $\binom{n}{k} 2^{1-\binom{k}{2}}$ monochromatic copies of $K_k$. We are going to delete one (arbitrary) vertex from each of these copies. Note that in principle these copies may share vertices, so we may be able to destroy all of them by removing just a few vertices, but in the worst case these copies are disjoint. So, we can destroy all of them by removing at most $\binom{n}{k} 2^{1-\binom{k}{2}}$ vertices of $K_n$. The resulting graph has at least $n - \binom{n}{k} 2^{1-\binom{k}{2}}$ vertices and no monochromatic copies of $K_k$. ∎
Now, we are left with the “calculus” problem of choosing $n$ to optimize the bound $n - \binom{n}{k} 2^{1-\binom{k}{2}}$. It turns out, with a bit of calculation, that the optimal choice gives
$$R(k,k) \ge (1 + o(1)) \frac{k}{e} \, 2^{k/2}.$$
This is slightly better than what we can show with Lemma 1.1, namely $R(k,k) \ge (1 + o(1)) \frac{k}{e\sqrt{2}} \, 2^{k/2}$.
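The comparison can be checked concretely. Below is a sketch (the function names and the choice $k = 10$ are my own) that scans over $n$ to optimize the alteration bound of Theorem 1.7 and compares it with the largest $n$ allowed by Lemma 1.1, using exact arithmetic:

```python
from math import comb
from fractions import Fraction

def alteration_bound(n, k):
    """Theorem 1.7: R(k,k) > n - binom(n,k) * 2^(1 - binom(k,2))."""
    return n - Fraction(2 * comb(n, k), 2 ** comb(k, 2))

def union_bound_n(k):
    """Largest n with binom(n,k) * 2^(1 - binom(k,2)) < 1, so that
    Lemma 1.1 gives R(k,k) > n."""
    n = k
    while 2 * comb(n + 1, k) < 2 ** comb(k, 2):
        n += 1
    return n

# For k = 10, the alteration bound optimized over n beats the plain
# union-bound threshold from Lemma 1.1.
k = 10
best = max(alteration_bound(n, k) for n in range(k, 400))
assert best > union_bound_n(k)
```

Asymptotically the gain is roughly a $\sqrt{2}$ factor, matching the two displayed estimates above.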
In future lectures, we will see how to use a more sophisticated technique, called the Lovász Local Lemma, to get a slightly better bound.