CSE 525: Randomized Algorithms Spring 2026 Lecture 4: DNF Counting and Unreliability Lecturer: Shayan Oveis Gharan 04/06/2026 Scribe:
Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications.
4.1 Recap from CSE 521: Unbiased Estimators
We say a random variable $X$ is an unbiased estimator of a quantity $Q$ if $\mathbb{E}[X] = Q$.
It turns out that the number of samples needed to estimate $Q$ is proportional to the relative variance of $X$.
Definition 4.1 (Relative Variance).
Say $X$ is an unbiased estimator of $Q$. Then, the relative variance of $X$ is defined as
$$\eta[X] = \frac{\mathrm{Var}(X)}{\mathbb{E}[X]^2}, \qquad (4.1)$$
where $\mathrm{Var}(X) = \mathbb{E}[X^2] - \mathbb{E}[X]^2$ is the variance of $X$. We typically use $\eta$ to denote the relative variance.
The following theorem is the main result of this section.
Theorem 4.2.
Given $\epsilon, \delta > 0$ and an unbiased estimator $X$ of $Q$ with relative variance $\eta$, we can approximate $Q$ within a multiplicative factor of $1 \pm \epsilon$ using only $O\!\left(\frac{\eta}{\epsilon^2}\log\frac{1}{\delta}\right)$ independent samples of $X$, with probability at least $1 - \delta$.
Monte Carlo Simulation.
Suppose we want to estimate the size of a set $S$. There is a generic approach called Monte Carlo simulation. First, we find a universe $U \supseteq S$ that has the following two properties:

- We know the size of $U$.
- We can efficiently sample a uniformly random element from $U$.

We sample an element uniformly at random from $U$ and test whether it belongs to $S$. Let $X$ be the corresponding indicator random variable. It follows that $|U| \cdot X$ is an unbiased estimator of $|S|$ with relative variance $\eta = \frac{1-p}{p} \le \frac{|U|}{|S|}$, where $p = \frac{|S|}{|U|}$. So, by Theorem 4.2, we can estimate $|S|$ within a multiplicative factor of $1 \pm \epsilon$ by generating $O\!\left(\frac{|U|}{|S|\,\epsilon^2}\log\frac{1}{\delta}\right)$ many samples from $U$.
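As an illustration, the generic Monte Carlo size estimator can be sketched as follows (the helper names `monte_carlo_size`, `sample_universe`, and `in_target` are illustrative placeholders, not from the lecture):

```python
import random

def monte_carlo_size(universe_size, sample_universe, in_target, num_samples, rng=random):
    """Estimate |S| for a set S inside a universe U of known size:
    |U| times the fraction of uniform samples from U that land in S
    is an unbiased estimator of |S|."""
    hits = sum(in_target(sample_universe(rng)) for _ in range(num_samples))
    return universe_size * hits / num_samples

# Toy check: count the multiples of 3 in {0, ..., 9999} (true answer: 3334).
est = monte_carlo_size(
    10_000,
    lambda r: r.randrange(10_000),
    lambda x: x % 3 == 0,
    num_samples=20_000,
    rng=random.Random(0),
)
```

Here $p \approx 1/3$, so a few thousand samples already give a good relative error; the sample bound above degrades as $|S|/|U|$ shrinks.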
4.2 FPRAS for DNF Counting
A DNF formula is a disjunction of clauses, where each clause is a conjunction of literals, e.g., $(x_1 \wedge \bar{x}_2) \vee (x_2 \wedge x_3 \wedge \bar{x}_4)$. Consider a DNF formula with $n$ variables and $m$ clauses. Obviously, the problem of finding a satisfying assignment of a DNF formula is in P: it suffices to satisfy any single clause. We want to count the number of satisfying assignments of a DNF formula, a problem that is #P-hard. Here we prove the following theorem:
Theorem 4.3 ([KLM89]).
There is an FPRAS for the DNF counting problem.
Following the Monte Carlo sampling method, we can let $S$ be the set of satisfying assignments of the given DNF formula. If $U = \{0,1\}^n$ is the set of all truth assignments to the $n$ variables, then it has the two desired properties. So, by the discussion in the previous section, we would need to generate $O\!\left(\frac{|U|}{|S|\,\epsilon^2}\log\frac{1}{\delta}\right)$ many samples. This would work only if $|S|$ is within a polynomial factor of $|U| = 2^n$; but what if it is exponentially smaller?
The idea of Karp, Luby and Madras [KLM89] is to choose the universe carefully such that $|U|$ is within a polynomial factor of $|S|$. In fact, they consider a more general problem. Suppose we have sets $S_1, \dots, S_m$ of a ground set of elements and we want to estimate $|S_1 \cup \dots \cup S_m|$. First of all, observe that we always have
$$\max_{i} |S_i| \;\le\; |S_1 \cup \dots \cup S_m| \;\le\; \sum_{i=1}^{m} |S_i|.$$
So, the idea is to construct an artificial universe $U$ of size $\sum_i |S_i|$. How can we do that? It is enough that for each ground element $a$, and for every set $S_i$ where $a \in S_i$, we put a distinct copy $(a, i)$ of $a$ in $U$:
$$U = \{(a, i) : a \in S_i\}.$$
By definition, $|U| = \sum_{i=1}^{m} |S_i|$.
Now suppose we know the sizes of all $S_i$'s and we can sample a uniformly random element from each $S_i$ efficiently. Then, we claim that $U$ is an ideal universe:

- We can compute the size of $U$ exactly: $|U| = \sum_{i=1}^{m} |S_i|$.
- To sample an element uniformly at random from $U$, we first sample an index $i$ with probability $\frac{|S_i|}{|U|}$. Then, we sample a uniformly random element $a$ from $S_i$ and output $(a, i)$.
Now, let $S = S_1 \cup \dots \cup S_m$ and notice that $|U| \le m \cdot |S|$. So, we need to identify some of the elements of $U$ with $S$; in particular, for each element $a \in S$, let $i(a)$ be the smallest index such that $a \in S_{i(a)}$; we identify $a$ with $(a, i(a))$. So, to run the Monte Carlo method, we first sample an element $(a, i)$ from $U$ and then we check whether $(a, i)$ is identified with an element of $S$ by simply checking whether $i$ is the smallest index such that $S_i$ contains $a$.
In our DNF counting problem, each $S_i$ corresponds to the set of assignments that satisfy the $i$-th clause. Obviously, if the $i$-th clause has $k_i$ literals, then $|S_i| = 2^{n - k_i}$, and we can sample efficiently from $S_i$ simply by fixing every literal that appears in the $i$-th clause to be true and choosing the rest of the variables uniformly at random.
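Putting the pieces together, a minimal sketch of the Karp–Luby–Madras estimator; the clause representation (a dict mapping a variable index to the sign its literal requires, so each clause mentions a variable at most once) is an illustrative choice, not from the lecture:

```python
import random

def klm_dnf_count(clauses, n, num_samples, rng=random):
    """Estimate the number of satisfying assignments of a DNF over n
    variables, where each clause is a dict {var_index: required_value}."""
    # |S_i| = 2^(n - k_i) for a clause with k_i literals; |U| = sum_i |S_i|.
    sizes = [2 ** (n - len(c)) for c in clauses]
    total = sum(sizes)
    hits = 0
    for _ in range(num_samples):
        # Sample clause i with probability |S_i| / |U|.
        i = rng.choices(range(len(clauses)), weights=sizes)[0]
        # Sample a uniformly random assignment satisfying clause i.
        a = [rng.random() < 0.5 for _ in range(n)]
        for var, val in clauses[i].items():
            a[var] = val
        # (a, i) counts iff i is the smallest index of a clause a satisfies.
        first = next(j for j, c in enumerate(clauses)
                     if all(a[v] == s for v, s in c.items()))
        hits += (first == i)
    return total * hits / num_samples

# Toy check: (x0) or (x1) over 2 variables has 3 satisfying assignments.
est = klm_dnf_count([{0: True}, {1: True}], n=2,
                    num_samples=20_000, rng=random.Random(1))
```

Since every sample lands in a copy of some satisfying assignment, the hit probability is $|S|/|U| \ge 1/m$, which is why polynomially many samples suffice.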
4.3 Network Unreliability
As an application, we discuss an algorithm for the network unreliability problem. Given a network $G$ of $n$ vertices where each edge $e$ disappears independently with probability $p_e$, determine the probability that the surviving network is disconnected.

In this lecture, for simplicity, we assume that all $p_e$'s are equal to a common value $p$, and we write $u_G(p)$ to denote the failure probability of the whole network, i.e., the probability that the surviving network is disconnected. We prove the following theorem of Karger [Kar95]. Also, see [Kar16] for a much faster algorithm.
Theorem 4.4 ([Kar95]).
There is an FPRAS for the network unreliability problem. That is, given $G$, $p$, and $\epsilon > 0$, the algorithm returns a $1 \pm \epsilon$ multiplicative approximation to $u_G(p)$ with probability at least $3/4$.
Note that the success probability can easily be boosted up to $1 - \delta$ by running $O(\log\frac{1}{\delta})$ independent copies of the algorithm and returning the median of the estimates.
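The median-boosting trick can be sketched as follows (`boost_median` and `estimate_once` are illustrative placeholders for the booster and one run of the base estimator):

```python
import random
from statistics import median

def boost_median(estimate_once, num_runs, rng=random):
    """Boost a constant-success-probability estimator: run it num_runs
    times independently and return the median of the estimates.  By a
    Chernoff bound, success probability 1 - delta needs only
    O(log(1/delta)) runs."""
    return median(estimate_once(rng) for _ in range(num_runs))

# Toy base estimator: returns the right answer (10) with probability 0.9,
# and a wildly wrong one (1000) otherwise.
val = boost_median(lambda r: 10 if r.random() < 0.9 else 1000,
                   num_runs=15, rng=random.Random(2))
```

The median is robust here because it is wrong only if at least half the runs fail, which is exponentially unlikely in the number of runs.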
First, let us discuss how the two problems are related. Suppose our graph has cuts $C_1, \dots, C_N$. Then, we can use the above algorithm to estimate the probability that all of the edges in one of these cuts fail. In particular, we write
$$\bigvee_{i=1}^{N} \;\bigwedge_{e \in C_i} x_e,$$
where $x_e$ is the indicator (random variable) that edge $e$ fails. Then, every realization of edge failures that makes $G$ disconnected corresponds to a satisfying assignment of the above DNF formula. So, we can use our FPRAS to get a $1 \pm \epsilon$ approximation to the probability that at least one of the cuts fails.
Remark 4.5.
Note that there is a technical problem here, because each edge fails with probability $p$, whereas in the proof of Theorem 4.3 we assumed that every variable is true/false with probability 1/2. We leave fixing this as an exercise.
By the above discussion, if $G$ has polynomially many cuts then we are already done. The problem is that $G$ may have exponentially many cuts; i.e., we need to find the union of exponentially many sets. Karger's idea is as follows: Let $c$ denote the size of the minimum cut of $G$. Then, obviously,
$$u_G(p) \;\ge\; p^c. \qquad (4.2)$$
Now, we consider two cases:

Case 1: $p^c \ge n^{-4}$.
In this case, $u_G(p) \ge p^c \ge n^{-4}$, so we can simply use the Monte Carlo method to obtain a $1 \pm \epsilon$ approximation of $u_G(p)$ using only $O\!\left(\frac{n^4}{\epsilon^2}\right)$ many samples. In other words, we let the universe $U$ be the set of all realizations of edge failures of $G$.
Case 2: $p^c < n^{-4}$.
As will become clear, it is convenient to write $p^c = n^{-(2+\delta)}$ for some $\delta > 2$. Here the idea is to divide all cuts of $G$ into two groups: (i) near-minimum cuts, of size less than $\alpha c$, and (ii) large cuts, of size at least $\alpha c$, for a parameter $\alpha \ge 1$ chosen later. We use the following theorem of Karger to prove that there are only polynomially many near-minimum cuts in any graph $G$. So, we can use Theorem 4.3 to estimate the probability that at least one of these cuts fails. To deal with large cuts, we also use the following theorem to argue that it is very unlikely that any of the large cuts fails.
Theorem 4.6 (Karger [Kar95]).
For any graph $G$ with $n$ vertices and minimum cut size $c$, and for any $\alpha \ge 1$, the number of cuts of size at most $\alpha c$ in $G$ is at most $n^{2\alpha}$.
The above theorem can be proved by Karger's contraction algorithm; see the CSE 521 notes. Also, note that the statement of the above theorem is tight: in a cycle of length $n$ the minimum cut size is $c = 2$, and there are $\binom{n}{2}$ cuts with $2$ edges.
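The tightness claim for the cycle is easy to verify by brute force; the following illustrative sketch (not part of the lecture) enumerates all cuts of an $n$-cycle and counts those with exactly $k$ edges:

```python
from itertools import combinations

def cut_size(n, subset):
    """Number of edges of the n-cycle (edges (i, i+1 mod n)) with
    exactly one endpoint in `subset`."""
    s = set(subset)
    return sum((i in s) != ((i + 1) % n in s) for i in range(n))

def count_cuts_of_size(n, k):
    """Brute force: count distinct cuts {S, V \\ S} of the n-cycle
    with exactly k crossing edges."""
    seen = set()
    for r in range(1, n):
        for subset in combinations(range(n), r):
            if cut_size(n, subset) == k:
                # Canonicalize the bipartition by the side containing 0.
                side = set(subset) if 0 in subset else set(range(n)) - set(subset)
                seen.add(frozenset(side))
    return len(seen)
```

For example, `count_cuts_of_size(7, 2)` returns $\binom{7}{2} = 21$: every pair of cycle edges determines one cut into two arcs.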
Now, let $\alpha \ge 1$ be a parameter that we fix later. We want to show that, except with small probability, all cuts of size at least $\alpha c$ survive.
Lemma 4.7.
Let $C_1, \dots, C_N$ be all cuts of $G$ and let us sort them in increasing order of their sizes,
$$|C_1| \le |C_2| \le \dots \le |C_N|.$$
For any $\alpha \ge 1$, and $p^c = n^{-(2+\delta)}$ with $\delta > 0$, we have
$$\Pr\left[\exists\, k > n^{2\alpha} : C_k \text{ fails}\right] \;\le\; \frac{2}{\delta}\, n^{-\alpha\delta}.$$
Proof.
Firstly, by Theorem 4.6, for any $\alpha \ge 1$, at most $n^{2\alpha}$ cuts have size at most $\alpha c$; so, for any $k > n^{2\alpha}$, we have $|C_k| > \alpha c$. In other words, for all $k$,
$$|C_k| \;\ge\; \frac{c \ln k}{2 \ln n}. \qquad (4.3)$$
By the union bound, we can write
$$\Pr\left[\exists\, k > n^{2\alpha} : C_k \text{ fails}\right] \;\le\; \sum_{k > n^{2\alpha}} p^{|C_k|} \;\le\; \sum_{k > n^{2\alpha}} p^{\frac{c \ln k}{2 \ln n}}.$$
Say $p^c = n^{-(2+\delta)}$. We can write
$$p^{\frac{c \ln k}{2 \ln n}} \;=\; \exp\left(\frac{\ln k}{2 \ln n}\cdot c \ln p\right) \;=\; k^{\frac{c \ln p}{2 \ln n}} \;=\; k^{-(1 + \delta/2)}.$$
Therefore,
$$\Pr\left[\exists\, k > n^{2\alpha} : C_k \text{ fails}\right] \;\le\; \sum_{k > n^{2\alpha}} k^{-(1+\delta/2)} \;\le\; \int_{n^{2\alpha}}^{\infty} x^{-(1+\delta/2)}\, dx \;=\; \frac{2}{\delta}\, n^{-\alpha\delta}.$$
∎
So, to make sure that with high probability no large cut fails, it is enough to choose $\alpha$ such that the above probability is at most $\epsilon \cdot p^c \le \epsilon \cdot u_G(p)$. Observe that if we choose $\alpha = 2 + \frac{1}{2}\log_n \frac{1}{\epsilon}$, and use that $\delta \ge 2$ (since $p^c < n^{-4}$), we get
$$\frac{2}{\delta}\, n^{-\alpha\delta} \;\le\; n^{-\alpha\delta} \;\le\; \epsilon\, n^{-(2+\delta)} \;=\; \epsilon\, p^c \;\le\; \epsilon \cdot u_G(p).$$
So, it is enough to solve the DNF counting problem for the $n^{2\alpha} = n^4/\epsilon$ smallest cuts. These cuts can be found in polynomial time using Karger's contraction algorithm.
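Karger's contraction algorithm itself can be sketched as follows; this minimal illustrative version only estimates the minimum cut value (enumerating the near-minimum cuts repeats the same contraction idea):

```python
import random

def karger_min_cut(edges, n, trials, rng=random):
    """Karger's contraction algorithm: repeatedly contract a uniformly
    random (non-self-loop) edge until two super-vertices remain; the
    surviving crossing edges form a cut.  Each trial finds a minimum
    cut with probability at least 2/(n(n-1)), so O(n^2 log n) trials
    succeed with high probability."""
    best = len(edges)
    for _ in range(trials):
        parent = list(range(n))

        def find(v):  # union-find with path halving
            while parent[v] != v:
                parent[v] = parent[parent[v]]
                v = parent[v]
            return v

        components = n
        while components > 2:
            # Rejection sampling: a pick whose endpoints are already
            # merged is a self-loop and is simply skipped.
            u, v = edges[rng.randrange(len(edges))]
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                components -= 1
        cut = sum(find(u) != find(v) for u, v in edges)
        best = min(best, cut)
    return best

# Two triangles joined by a single bridge edge: the minimum cut is 1.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
best = karger_min_cut(edges, n=6, trials=100, rng=random.Random(3))
```

Keeping parallel (merged) edges while discarding self-loops is exactly what biases each contraction away from small cuts, which is the heart of the $n^{2\alpha}$ counting bound above.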
4.4 Congestion Minimization Problem
A classical technique in the field of approximation algorithms is to write down a linear programming relaxation of a combinatorial problem. The linear program (LP) is then solved in polynomial time, and one rounds the fractional solution to an integral solution that is, hopefully, not too much worse than the optimal solution.
A classical example goes back to Raghavan and Thompson [RT87]. Let $G = (V, E)$ be a directed network, and suppose that we are given a sequence of terminal pairs $(s_1, t_1), \dots, (s_k, t_k)$, where $s_i, t_i \in V$. The goal is to choose, for every $1 \le i \le k$, a directed $s_i$-$t_i$ path $P_i$ in $G$ so as to minimize the maximum congestion of an arc:
$$\max_{e \in E} \; \left|\{i : e \in P_i\}\right|.$$

This problem is NP-hard. Our goal will be to design an approximation algorithm that outputs a solution so that the congestion of every edge is at most $\alpha \cdot \mathrm{opt}$, for $\alpha$ as small as possible. The number $\alpha$ is called the approximation factor of our algorithm. We will see that for this problem we will be able to achieve $\alpha = O\!\left(\frac{\log n}{\log\log n}\right)$.
We start by writing a linear programming relaxation for this problem. Let $\mathcal{P}_i$ be the set of (directed) paths from $s_i$ to $t_i$ and let $\mathcal{P} = \bigcup_i \mathcal{P}_i$. For every path $P \in \mathcal{P}$, we have a variable $x_P$ to denote the amount of flow that we route along $P$.
$$\begin{aligned}
\min \quad & W \\
\text{s.t.} \quad & \sum_{P \in \mathcal{P}_i} x_P = 1 \qquad && \forall\, 1 \le i \le k \\
& \sum_{P \in \mathcal{P} : e \in P} x_P \le W \qquad && \forall\, e \in E \\
& x_P \ge 0 \qquad && \forall\, P \in \mathcal{P}
\end{aligned} \qquad (4.4)$$
A few observations are in order:
- $W^* \le \mathrm{opt}$, where $W^*$ denotes the optimum value of (4.4). This is simply because the optimum integral solution is a feasible solution of the above program. Note that the optimum integral solution satisfies all of the above constraints with the additional property that $x_P \in \{0, 1\}$ for all paths $P$.
- Although the above program has exponentially many variables, one for every directed path connecting $s_i$ to $t_i$ (for all $i$), its optimum solution can be computed in polynomial time. To do that, we need two observations:
  i) We can write a linear program, with flow conservation constraints, to find a flow of value 1 from $s_i$ to $t_i$. We have a variable $f_i(e)$ to denote the flow of commodity $i$ on every edge $e$. Having that, the congestion of $e$ due to the flow routed between the $i$-th pair is $f_i(e)$; so the total congestion of $e$ is $\sum_{i=1}^{k} f_i(e)$.
  ii) A (fractional) flow of value 1 from $s_i$ to $t_i$ can be decomposed into a distribution over paths from $s_i$ to $t_i$. To see that, given the solution $f_i$, greedily find a path $P$ from $s_i$ to $t_i$ on edges with positive flow; let $x_P$ be $\min_{e \in P} f_i(e)$. Then, subtract $x_P$ from the flow of all edges along $P$. We will obtain a new flow of value $1 - x_P$ from $s_i$ to $t_i$ in which at least one more edge has zero flow. So we repeat this procedure until we exhaust the flow; this takes at most $|E|$ iterations.
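The greedy decomposition in observation (ii) can be sketched as follows, assuming a valid acyclic flow (so every walk from $s$ along positive-flow edges reaches $t$); the function name and input format are illustrative:

```python
def decompose_flow(flow, s, t):
    """Greedily decompose an acyclic s-t flow, given as a dict
    {(u, v): value}, into a list of (path, weight) pairs.  Each round
    zeroes out at least one edge, so at most |E| paths are produced."""
    flow = {e: f for e, f in flow.items() if f > 1e-12}
    paths = []
    while any(u == s for (u, _) in flow):
        # Walk from s to t along edges that still carry flow.
        path, u = [s], s
        while u != t:
            _, v = next(e for e in flow if e[0] == u)
            path.append(v)
            u = v
        # Route the bottleneck amount along this path and subtract it.
        w = min(flow[(path[i], path[i + 1])] for i in range(len(path) - 1))
        for i in range(len(path) - 1):
            e = (path[i], path[i + 1])
            flow[e] -= w
            if flow[e] <= 1e-12:
                del flow[e]
        paths.append((path, w))
    return paths

# A unit flow from 0 to 3 split evenly over two internally disjoint paths.
paths = decompose_flow({(0, 1): 0.5, (1, 3): 0.5, (0, 2): 0.5, (2, 3): 0.5}, 0, 3)
```

The returned weights sum to the flow value, so they form exactly the probability distribution over paths used by the rounding step below.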
4.5 Independent Rounding
Given a solution $x$ of (4.4), we want to round it to an integral solution. Namely, we want to choose exactly one path $P_i \in \mathcal{P}_i$ for each $i$ such that the union of the chosen paths has small congestion, at most $O\!\left(\frac{\log n}{\log\log n}\right) \cdot (W^* + 1)$.

We follow the independent rounding method. Recall that, by feasibility of $x$, for every $i$, we know that $\sum_{P \in \mathcal{P}_i} x_P = 1$. So, we can think of $x$ restricted to $\mathcal{P}_i$ as providing a probability distribution over $s_i$-$t_i$ paths. For every $i$, independently, we choose one of the paths $P \in \mathcal{P}_i$ with probability $x_P$. This procedure, by definition, gives a feasible set of directed paths from $s_i$ to $t_i$ for all $i$. So, it remains to bound the maximum congestion. We prove the following theorem.
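The rounding step can be sketched as follows (the input format, a map from each commodity to its list of (path, $x_P$) pairs, is an illustrative choice):

```python
import random
from collections import Counter

def independent_rounding(fractional, rng=random):
    """Independent rounding: `fractional` maps commodity i to a list of
    (path, x_P) pairs with the x_P summing to 1; for each i we draw one
    path with probability x_P, independently across commodities."""
    chosen = {}
    for i, dist in fractional.items():
        paths, weights = zip(*dist)
        chosen[i] = rng.choices(paths, weights=weights)[0]
    return chosen

def max_congestion(chosen):
    """Maximum number of chosen paths sharing an edge; each path is a
    vertex tuple, so its edges are the consecutive pairs."""
    load = Counter(e for p in chosen.values() for e in zip(p, p[1:]))
    return max(load.values())

# Toy instance: commodity 0 splits its flow over two paths, commodity 1
# routes all of its flow on one path.
frac = {
    0: [(("s0", "a", "t0"), 0.5), (("s0", "b", "t0"), 0.5)],
    1: [(("s1", "a", "t1"), 1.0)],
}
chosen = independent_rounding(frac, rng=random.Random(4))
```

The independence across commodities is exactly what the Chernoff-bound analysis below relies on.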
Theorem 4.8.
With probability at least $1 - \frac{1}{n^2}$, the above algorithm produces an integral set of paths connecting all terminal pairs with maximum congestion at most $O\!\left(\frac{\log n}{\log\log n}\right) \cdot (W^* + 1)$.
Let $X_{i,P}$ be the indicator random variable that $P \in \mathcal{P}_i$ is chosen. For an edge $e$, let $C_e$ be the random variable that is the congestion of edge $e$. So,
$$C_e = \sum_{i=1}^{k} \; \sum_{P \in \mathcal{P}_i : e \in P} X_{i,P}.$$
By linearity of expectation,
$$\mathbb{E}[C_e] = \sum_{i=1}^{k} \; \sum_{P \in \mathcal{P}_i : e \in P} x_P \;\le\; W^*.$$
So, the expectations are small. We just need a Chernoff bound/union bound argument. Unfortunately, the random variables $X_{i,P}$ are not independent: for a fixed $i$, exactly one of them equals 1. So, we need to use slightly different random variables that are truly independent.
The idea is to note that we always choose exactly one path from $s_i$ to $t_i$. So, let $Y_{i,e}$ be the indicator random variable that the unique chosen path from $s_i$ to $t_i$ uses edge $e$. So, $C_e = \sum_{i=1}^{k} Y_{i,e}$. It follows that, for a fixed $e$, the $Y_{i,e}$'s are independent, since the paths are chosen independently across commodities. Let $\mu = \mathbb{E}[C_e]$, and note that $\mu \le W^*$ by the above. So, by the Chernoff bound, for any $\alpha > e$,
$$\Pr\left[C_e > \alpha (W^* + 1)\right] \;\le\; \left(\frac{e\mu}{\alpha (W^* + 1)}\right)^{\alpha (W^* + 1)} \;\le\; \left(\frac{e}{\alpha}\right)^{\alpha},$$
where in the last inequality we simply use that $\mu \le W^* + 1$. Now, to get the strong concentration bound, we need to choose $\alpha$ large enough such that the RHS is at most $n^{-4}$. It turns out that for that purpose it is enough to let $\alpha = \frac{c \log n}{\log\log n}$ for a large enough constant $c$.
Since $G$ has at most $n^2$ edges, by the union bound,
$$\Pr\left[\max_{e \in E} C_e > \alpha (W^* + 1)\right] \;\le\; n^2 \cdot n^{-4} \;=\; \frac{1}{n^2}.$$
So, the algorithm succeeds with probability at least $1 - \frac{1}{n^2}$.
4.6 Future Works and Open Problems
Chuzhoy, Guruswami, Khanna and Talwar showed that the congestion minimization problem is NP-hard to approximate within any factor better than $\Omega\!\left(\frac{\log n}{\log\log n}\right)$ when the underlying graph is directed. Note that the same algorithm that we discussed here works if the underlying graph is undirected (we can just put two copies of every edge, one in each direction). However, for undirected graphs, the $O\!\left(\frac{\log n}{\log\log n}\right)$ bound of Raghavan and Thompson is still the best known approximation factor, while the best hardness result, by Andrews and Zhang, rules out only factors of $\Omega\!\left(\frac{\log\log n}{\log\log\log n}\right)$. It is a fundamental open problem in the field of network routing to beat Raghavan and Thompson's classical algorithm.