CSE 525: Randomized Algorithms Fall 2026 Lecture 3: DNF Counting and Unreliability Lecturer: Shayan Oveis Gharan 04/02/2026 Scribe:

Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications.

3.1 Recap from CSE 521: Unbiased Estimators

We say a random variable X is an unbiased estimator of μ if

𝔼[X]=μ.

It turns out that the number of samples needed to estimate μ is proportional to the relative variance of X.

Definition 3.1 (Relative Variance).

Suppose X is an unbiased estimator of μ. Then the relative variance of X is defined as

σ²(X)/μ², (3.1)

where σ²(X) = 𝔼[X²] − (𝔼[X])² is the variance of X. We typically use t to denote the relative variance.

The following theorem is the main result of this section.

Theorem 3.2.

Given ϵ,δ>0 and an unbiased estimator X of μ with relative variance t, we can approximate μ within a 1±ϵ multiplicative factor, with probability at least 1−δ, using only O(t/ϵ² · log(1/δ)) independent samples of X.
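One standard way to realize this bound is the median-of-means estimator: average enough samples to get a constant-probability estimate via Chebyshev's inequality, then take a median of independent averages to boost the success probability. A minimal Python sketch (the constants are illustrative rather than optimized, and `sampler` is a hypothetical thunk producing one draw of X):

```python
import math
import random
import statistics

def estimate_mean(sampler, t, eps, delta):
    """Approximate mu = E[X] within a 1 +/- eps factor with probability
    at least 1 - delta, assuming the relative variance of X is at most t."""
    # Averaging O(t/eps^2) samples reduces the relative variance enough
    # that Chebyshev's inequality bounds each block's failure probability by 1/4.
    block_size = math.ceil(4 * t / eps ** 2)
    # The median of O(log(1/delta)) independent block averages boosts the
    # success probability to 1 - delta (by a Chernoff bound).
    num_blocks = 2 * math.ceil(12 * math.log(1 / delta)) + 1
    block_means = [
        sum(sampler() for _ in range(block_size)) / block_size
        for _ in range(num_blocks)
    ]
    return statistics.median(block_means)

# Toy usage: X is Bernoulli(1/2), whose relative variance is (1-p)/p = 1.
random.seed(0)
est = estimate_mean(lambda: random.randrange(2), t=1.0, eps=0.2, delta=0.1)
```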

Monte Carlo Simulation.

Suppose we want to estimate the size of a set S. There is a generic approach called Monte Carlo simulation. First, we find a universe U ⊇ S that has the following two properties:

  • We know the size of U.

  • We can efficiently sample from U.

We sample an element uniformly at random from U and test whether it belongs to S. Let X be the corresponding indicator random variable. It follows that Y = X·|U| is an unbiased estimator of |S| with relative variance at most |U|/|S|. So, we can estimate |S| within a 1±ϵ multiplicative factor by generating O(|U|/(ϵ²|S|) · log(1/δ)) samples from U.
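For concreteness, here is this recipe on a toy instance (my own example, not from the notes): take U = {0,…,N−1} with N known and easy to sample, and let S be the set of multiples of 7 in U.

```python
import random

def monte_carlo_set_size(universe_size, in_set, num_samples, rng=random):
    """Estimate |S| by sampling U = {0, ..., universe_size - 1} uniformly
    and testing membership in S; X * |U| is an unbiased estimator of |S|."""
    hits = sum(in_set(rng.randrange(universe_size)) for _ in range(num_samples))
    return hits * universe_size / num_samples

# Toy usage: count multiples of 7 below one million (exact answer: 142858).
random.seed(1)
est = monte_carlo_set_size(1_000_000, lambda x: x % 7 == 0, num_samples=20000)
```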

3.2 FPRAS for DNF Counting

A DNF formula is a disjunction of clauses, each of which is a conjunction of literals, e.g., (x₁∧x₂∧¬x₄)∨(x₃∧¬x₂). Consider a DNF formula with n variables. Obviously, the problem of finding a satisfying assignment of a DNF formula is in P. We want to count the number of satisfying assignments of a DNF formula. Here we prove the following theorem:

Theorem 3.3 ([KLM89]).

There is an FPRAS for the DNF counting problem.

Following the Monte Carlo sampling method, we can let S be the set of satisfying assignments of the given DNF formula. If U is the set of all 2ⁿ truth assignments to the n variables, then it has the desired properties. So, by the discussion in the previous section, we would need to generate O(2ⁿ/(ϵ²|S|)) samples. This works only if |S| is within a polynomial factor of 2ⁿ; but what if it is exponentially smaller?

The idea of Karp, Luby, and Madras [KLM89] is to choose the universe U carefully so that |S| is within a polynomial factor of |U|. In fact, they consider a more general problem. Suppose we have subsets S₁,S₂,…,Sₘ of a ground set of elements and we want to estimate |⋃ᵢ Sᵢ|. First of all, observe that we always have

1/m ≤ |⋃ᵢ Sᵢ| / ∑ᵢ |Sᵢ| ≤ 1.

So, the idea is to construct an artificial universe U of size |U| = ∑ᵢ |Sᵢ|. How can we do that? It is enough that, for each ground element e and every set Sᵢ with e∈Sᵢ, we put a distinct copy of e in U:

U:={(e,i):eSi}.

By definition, |U| = ∑ᵢ |Sᵢ|.

Now suppose we know the sizes of all the Sᵢ's and we can sample a uniformly random element from each efficiently. Then, we claim that U is an ideal universe:

  • We can compute the size |U| = ∑ᵢ |Sᵢ| exactly.

  • To sample an element uniformly at random from U, we first sample an index i with probability |Sᵢ|/|U|. Then, we sample a uniformly random element from Sᵢ.

Now, let S = ⋃ᵢ Sᵢ and notice that |U| ≥ |S|. So, we need to identify the elements of S with a subset of U: for each element e∈S, let i be the smallest index such that e∈Sᵢ, and identify e∈S with (e,i)∈U. So, to run the Monte Carlo method, we first sample an element (e,i) from U and then check whether (e,i) is the canonical copy of e by simply checking whether i is the smallest index for which Sᵢ contains e.

In our DNF counting problem, each Sᵢ corresponds to the set of assignments that satisfy the i-th clause. Obviously, if the i-th clause has k literals, then |Sᵢ| = 2ⁿ⁻ᵏ, and we can sample efficiently from Sᵢ simply by fixing every literal that appears in the i-th clause to be true and choosing the rest of the variables uniformly at random.
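Putting this together, the whole estimator fits in a few lines. The sketch below is my own simplified rendering of the [KLM89] scheme (the clause encoding and function names are assumptions, not from the notes): a clause is a list of (variable index, required bit) pairs, Sᵢ is sampled by fixing the i-th clause's literals and randomizing the remaining variables, and the canonical-copy test checks that i is the smallest index of a satisfied clause.

```python
import random

def satisfies(assignment, clause):
    """True iff the 0/1 assignment satisfies every literal of the clause."""
    return all(assignment[v] == bit for v, bit in clause)

def klm_dnf_count(clauses, n, num_samples, rng=random):
    """Estimate the number of satisfying assignments of a DNF formula
    over n variables, using the Karp-Luby-Madras universe U."""
    sizes = [2 ** (n - len(c)) for c in clauses]   # |S_i| = 2^(n - k_i)
    total = sum(sizes)                             # |U| = sum_i |S_i|
    hits = 0
    for _ in range(num_samples):
        # Sample index i with probability |S_i| / |U| ...
        i = rng.choices(range(len(clauses)), weights=sizes)[0]
        # ... then a uniform element of S_i: fix clause i, randomize the rest.
        assignment = [rng.randrange(2) for _ in range(n)]
        for v, bit in clauses[i]:
            assignment[v] = bit
        # (e, i) is the canonical copy of e iff i is the smallest index
        # of a clause that the assignment e satisfies.
        smallest = min(j for j, c in enumerate(clauses) if satisfies(assignment, c))
        hits += (smallest == i)
    return hits * total / num_samples

# Toy usage: x0 OR x1 over n = 2 variables has 3 satisfying assignments.
random.seed(2)
est = klm_dnf_count([[(0, 1)], [(1, 1)]], n=2, num_samples=5000)
```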

3.3 Network Unreliability

As an application, we discuss an algorithm for the network unreliability problem. Given a network on n vertices where each edge e disappears independently with probability pₑ, determine the probability that the surviving network is disconnected.

In this lecture, for simplicity, we assume that all the pₑ's are equal to p, and we write Fail(p) for the failure probability of the whole network. We prove the following theorem of Karger [Kar95]. Also, see [Kar16] for a much faster algorithm.

Theorem 3.4 ([Kar95]).

There is an FPRAS for the network unreliability problem. That is, given G, p, and ϵ, the algorithm returns a 1±ϵ multiplicative approximation to Fail(p) with probability at least 3/4.

Note that the success probability can easily be boosted to 1−δ by running O(log(1/δ)) independent copies of the algorithm and returning the median of the estimates.

First, let us discuss how the two problems are related. Suppose our graph has r cuts, C₁,…,C_r. Then, we can use the above algorithm to estimate the probability that all of the edges in at least one of these cuts fail. In particular, we write the DNF formula

⋁_{1≤i≤r} ⋀_{e∈Cᵢ} Xₑ,

where Xₑ is the indicator (random variable) of the event that edge e fails. Then, every realization of failures that makes G disconnected corresponds to a satisfying assignment of the above DNF formula. So, we can use our FPRAS to get a 1±ϵ approximation to the probability that at least one of the cuts C₁,…,C_r fails.

Remark 3.5.

Note that there is a technical problem here: each edge fails with probability p, whereas in the proof of Theorem 3.3 we assumed that every variable is true/false with probability 1/2. We leave fixing this as an exercise.

By the above discussion, if G has polynomially many cuts then we are already done. The problem is that G may have exponentially many cuts; i.e., we would need to find the size of the union of exponentially many sets. Karger's idea is as follows. Let k denote the size of the minimum cut of G. Then, obviously,

Fail(p) ≥ pᵏ =: q. (3.2)

Now, we consider two cases.

Case 1: q>1/poly(n)

In this case, we can simply use the Monte Carlo method to obtain a 1±ϵ approximation of Fail(p) using only O(1/(ϵ²q) · log(1/δ)) samples. In other words, we let U be the universe of all realizations of G; since Fail(p) ≥ q, the relative variance is at most 1/q.
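For Case 1, plain simulation already works. A sketch (my own illustration, with hypothetical function names): sample a realization of edge failures, check connectivity, and average the indicator of disconnection.

```python
import random

def is_connected(n, edges):
    """DFS connectivity check on vertices 0..n-1."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def estimate_fail(n, edges, p, num_samples, rng=random):
    """Monte Carlo estimate of Fail(p): delete each edge independently
    with probability p and count how often the rest is disconnected."""
    disconnections = 0
    for _ in range(num_samples):
        surviving = [e for e in edges if rng.random() >= p]
        disconnections += not is_connected(n, surviving)
    return disconnections / num_samples

# Toy usage: for a triangle, Fail(1/2) = 1/2 exactly (the graph stays
# connected iff at least two of the three edges survive).
random.seed(3)
est = estimate_fail(3, [(0, 1), (1, 2), (0, 2)], p=0.5, num_samples=20000)
```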

Case 2: q<1/poly(n)

As will become clear, we need to let poly(n) be n⁴. Here the idea is to divide all cuts of G into two groups: (i) near-minimum cuts and (ii) large cuts. We use the following theorem of Karger to prove that there are only polynomially many near-minimum cuts in any graph G. So, we can use Theorem 3.3 to estimate the probability that at least one of these cuts fails. To deal with large cuts, we use the same theorem to argue that it is very unlikely that any large cut fails.

Theorem 3.6 (Karger [Kar95]).

For any graph G with n vertices and minimum cut size k, and for any α ≥ 1, the number of cuts of size at most αk in G is at most n^{2α}.

The above theorem can be proved using Karger's contraction algorithm; see the 521 notes. Also, note that the statement of the above theorem is tight: in a cycle of length n there are Θ(n^{2α}) cuts with 2α edges.
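The tightness claim can be checked by brute force for small n (my own verification, not in the notes): in the n-cycle, deleting any of the C(n,2) pairs of edges splits the cycle into two arcs, and these are exactly the cuts of size 2.

```python
from itertools import combinations

def count_cuts_of_size(n, size):
    """Brute force over all bipartitions of the n-cycle's vertices,
    counting those whose cut has exactly `size` crossing edges."""
    edges = [(i, (i + 1) % n) for i in range(n)]
    count = 0
    # Keep vertex 0 on a fixed side so each bipartition is seen once.
    for r in range(1, n):
        for subset in combinations(range(1, n), r):
            side = set(subset)
            crossing = sum((u in side) != (v in side) for u, v in edges)
            count += (crossing == size)
    return count

# For n = 6 the count of size-2 cuts is C(6,2) = 15.
```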

Now, let α be a parameter that we fix later. We want to show that the probability that any cut of size at least αk fails is at most ϵq.

Lemma 3.7.

Let C₁,…,C_r be all cuts of G and let us sort them in increasing order of size,

|C₁| ≤ |C₂| ≤ ⋯ ≤ |C_r|.

For any α ≥ 1 and q = n^{−β}, we have

ℙ[∃ i ≥ n^{2α} : Cᵢ fails] ≤ n^{2α(1−β/2)}/(β/2 − 1).
Proof.

Firstly, by Theorem 3.6, for any i ≥ n^{2x} we have |Cᵢ| ≥ xk. In other words,

|Cᵢ| ≥ (log i / (2 log n)) · k. (3.3)

By the union bound, we can write

ℙ[∃ i ≥ n^{2α} : Cᵢ fails] ≤ ∑_{i ≥ n^{2α}} ℙ[Cᵢ fails]
= ∑_{i ≥ n^{2α}} p^{|Cᵢ|}
≤_{(3.3)} ∑_{i ≥ n^{2α}} p^{k·log(i)/(2 log n)}
=_{(3.2)} ∑_{i ≥ n^{2α}} q^{log(i)/(2 log n)}.

Say q = n^{−β}. We can write q^{log(i)/(2 log n)} = e^{−β log(i)/2} = i^{−β/2}. Therefore,

ℙ[∃ i ≥ n^{2α} : Cᵢ fails] ≤ ∫_{n^{2α}}^∞ x^{−β/2} dx = x^{1−β/2}/(1−β/2) |_{n^{2α}}^∞ = n^{2α(1−β/2)}/(β/2 − 1).

So, to make sure that with high enough probability no large cut fails, it suffices to choose α such that the above probability is at most ϵq. Observe that if we choose α = 2 and make sure that q ≤ ϵn^{−4}, then β ≥ 4, so

ℙ[∃ i ≥ n^{2α} : Cᵢ fails] ≤ n^{−αβ+2α} =_{α=2} n^{−2β+4} =_{q=n^{−β}} q²n⁴ ≤_{q≤ϵn^{−4}} ϵq.

So, it is enough to solve the DNF counting problem for the n^{2α} = n⁴ smallest cuts. These cuts can be found in polynomial time using Karger's contraction algorithm.
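For intuition, here is a sketch of a single run of Karger's contraction algorithm in Python (my own illustration; actually enumerating all near-minimum cuts requires a more careful recursive version). Each run returns the size of a random cut, and it returns a minimum cut with probability at least 2/(n(n−1)), so O(n² log n) independent runs succeed with high probability.

```python
import random

def contract_once(n, edges, rng=random):
    """One run of Karger's contraction algorithm: repeatedly merge the
    endpoints of a uniformly random edge until two super-vertices remain,
    then return the number of edges crossing between them."""
    parent = list(range(n))

    def find(x):  # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    while components > 2:
        u, v = rng.choice(edges)
        ru, rv = find(u), find(v)
        if ru != rv:          # skip self-loops created by earlier merges
            parent[ru] = rv
            components -= 1
    return sum(find(u) != find(v) for u, v in edges)

def min_cut_estimate(n, edges, trials, rng=random):
    """Minimum cut size over many independent contraction runs."""
    return min(contract_once(n, edges, rng) for _ in range(trials))

# Toy usage: two triangles joined by a single bridge; the minimum cut is 1.
random.seed(4)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
cut = min_cut_estimate(6, edges, trials=200)
```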