CSE 525: Randomized Algorithms Spring 2026 Lecture 18: Hypergraph Sparsification Lecturer: Shayan Oveis Gharan 06/02/26
Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications.
18.1 Max Eigenvalue of Random Matrices
We will now use generic chaining to show that the largest eigenvalue of a random symmetric $n \times n$ matrix with Rademacher entries is $O(\sqrt{n})$ in expectation. This is certainly not the simplest way of proving such a result, but it will give a sense of how these techniques can be applied.
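Our “sub-Gaussian” random process is the following: pick independent Rademacher random variables $\epsilon_{ij}$ for each $1 \le i < j \le n$, define the symmetric matrix $A$ by $A_{ij} = A_{ji} = \epsilon_{ij}$, and let
$$X_x = \langle x, Ax \rangle = 2 \sum_{i<j} \epsilon_{ij} x_i x_j$$
for every $x \in \mathbb{R}^n$ (we assume the diagonal of $A$ is 0 for simplicity). It thus follows that
$$\lambda_{\max}(A) = \max_{\|x\|_2 = 1} X_x.$$
So, we let $T$ be the set of vectors
$$T = \{x \in \mathbb{R}^n : \|x\|_2 = 1\}.$$
Let $\epsilon \in \{-1,+1\}^m$ be a vector of independent Rademacher random variables. By Hoeffding's inequality, for any vector $a \in \mathbb{R}^m$ and any $t \ge 0$ we can write
$$\mathbb{P}\big[|\langle a, \epsilon \rangle| \ge t\big] \le 2 \exp\left(-\frac{t^2}{2\|a\|_2^2}\right),$$
so this distribution is $\|a\|_2$-subgaussian.
It follows that for $x, y \in T$, with $d(x,y) = \|xx^\top - yy^\top\|_F$,
$$\mathbb{P}\big[|X_x - X_y| \ge t\big] \le 2 \exp\left(-\frac{t^2}{4\, d(x,y)^2}\right),$$
because $X_x - X_y = 2\sum_{i<j} \epsilon_{ij}(x_i x_j - y_i y_j)$ is a Rademacher sum whose coefficient vector has squared norm $4\sum_{i<j}(x_i x_j - y_i y_j)^2 \le 2\, d(x,y)^2$.
Having this we can write
$$d(x,y) = \|x(x-y)^\top + (x-y)y^\top\|_F \le \|x\|_2 \|x-y\|_2 + \|x-y\|_2 \|y\|_2 = 2\|x-y\|_2.$$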
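Now we need to apply generic chaining to the metric space $(T, d)$, where $d(x,y) = \|xx^\top - yy^\top\|_F \le 2\|x-y\|_2$. We can conclude that a $\delta$-net over the unit Euclidean sphere is also a $2\delta$-net for the metric space $(T, d)$. For the unit Euclidean sphere there is a $\delta$-net of size at most $(3/\delta)^n$ for every $0 < \delta \le 1$. To apply generic chaining, let $T_k$ be an arbitrary subset of $T$ of cardinality $1$ if $2^k < 2n$, and a $\delta_k$-net of the sphere with $\delta_k = 3 \cdot 2^{-2^k/n}$ otherwise, so that $|T_k| \le 2^{2^k}$ for all $k$. Since $d(x,y) \le 2$ for all $x, y \in T$, applying the generic chaining inequality,
$$\mathbb{E}\,\lambda_{\max}(A) = \mathbb{E} \sup_{x \in T} X_x \lesssim \sup_{x \in T} \sum_{k \ge 0} 2^{k/2}\, d(x, T_k) \le \sum_{k:\, 2^k < 2n} 2 \cdot 2^{k/2} + \sum_{k:\, 2^k \ge 2n} 6 \cdot 2^{k/2}\, 2^{-2^k/n} \lesssim \sqrt{n}.$$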
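As a numerical sanity check of the $\sqrt{n}$ scaling (not part of the chaining argument), the following minimal sketch samples one such matrix and estimates its largest eigenvalue. The shifted power iteration, the helper names, the seed, and the choice $n = 30$ are all illustrative assumptions; the returned Rayleigh quotient is only a lower bound on the largest eigenvalue.

```python
import math
import random

def rademacher_matrix(n, rng):
    """Symmetric n x n matrix with zero diagonal and independent +/-1 entries above it."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = a[j][i] = 1.0 if rng.random() < 0.5 else -1.0
    return a

def quad_form(a, x):
    """Return <x, A x>."""
    n = len(x)
    return sum(x[i] * sum(a[i][j] * x[j] for j in range(n)) for i in range(n))

def lambda_max_estimate(a, rng, iters=500):
    """Estimate lambda_max(A) by power iteration on A + n*I.

    Each row of A has absolute sum n - 1, so A + n*I is positive definite and
    shares its top eigenvector with A. The Rayleigh quotient returned here is
    always a lower bound on lambda_max(A).
    """
    n = len(a)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        y = [n * x[i] + sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]  # x is a unit vector after every iteration
    return quad_form(a, x)

rng = random.Random(525)  # arbitrary seed, for reproducibility only
n = 30
A = rademacher_matrix(n, rng)
ratio = lambda_max_estimate(A, rng) / math.sqrt(n)
print(f"n = {n}: lambda_max / sqrt(n) is roughly {ratio:.2f}")
```

For a single draw the ratio should be a moderate constant; repeating the experiment for several values of $n$ is a quick way to see that it does not grow with $n$.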
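Consider a weighted hypergraph $H = (V, E, w)$ where $w \in \mathbb{R}_{\ge 0}^{E}$ are nonnegative edge weights. We associate to $H$ the quadratic expression
$$Q_H(x) = \sum_{e \in E} w_e \max_{u,v \in e} (x_u - x_v)^2, \qquad x \in \mathbb{R}^V.$$
The main observation is that if $H$ were a graph, i.e., if we had $|e| = 2$ for every edge $e \in E$, this would correspond to the quadratic form of the graph Laplacian.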
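The quadratic expression of a hypergraph (each hyperedge contributes its weight times the largest squared difference of $x$ across its vertices) can be sketched as follows. The function names and the toy data are assumptions made for illustration; the last lines check that on an ordinary graph the expression agrees with the Laplacian quadratic form.

```python
def hyperedge_energy(edge, x):
    """max over u, v in the hyperedge of (x_u - x_v)^2."""
    vals = [x[u] for u in edge]
    return (max(vals) - min(vals)) ** 2

def hypergraph_energy(edges, weights, x):
    """Sum over hyperedges of weight times hyperedge_energy."""
    return sum(w * hyperedge_energy(e, x) for e, w in zip(edges, weights))

def laplacian_quadratic_form(edges, weights, x):
    """x^T L x for an ordinary graph given as a list of 2-element edges."""
    n = len(x)
    L = [[0.0] * n for _ in range(n)]
    for (u, v), w in zip(edges, weights):
        L[u][u] += w
        L[v][v] += w
        L[u][v] -= w
        L[v][u] -= w
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))

x = [0.3, -1.0, 2.0]

# Graph case: every edge has exactly two vertices.
graph_edges = [(0, 1), (1, 2), (0, 2)]
graph_weights = [1.0, 2.0, 0.5]
q_graph = hypergraph_energy(graph_edges, graph_weights, x)
q_laplacian = laplacian_quadratic_form(graph_edges, graph_weights, x)

# A single hyperedge on all three vertices only sees the extreme values of x.
q_hyper = hypergraph_energy([(0, 1, 2)], [1.0], x)
print(q_graph, q_laplacian, q_hyper)
```

Computing the maximum as (max minus min) squared avoids enumerating all pairs inside a hyperedge.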
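As our main application of chaining we will explain algorithms to sparsify hypergraphs: that is, we want to construct another weighted hypergraph $\tilde H = (V, \tilde E, \tilde w)$ such that $\tilde E \subseteq E$ and such that
$$|Q_H(x) - Q_{\tilde H}(x)| \le \epsilon\, Q_H(x) \qquad \forall x \in \mathbb{R}^V, \tag{18.1}$$
where as usual $\epsilon$ is the accuracy parameter of our sparsifier. Furthermore, similar to the graph case we would like to make $|\tilde E|$ as small as possible, ideally near-linear in $n = |V|$ (while $|E|$ could be as large as $2^n$ in this case).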
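The following theorem is proved in a paper by James Lee.
Theorem 18.1.
For any $n$-vertex weighted hypergraph $H = (V, E, w)$ and $\epsilon > 0$, there is a spectral $\epsilon$-sparsifier $\tilde H$ for $H$ such that
$$|E(\tilde H)| \le O\left(\epsilon^{-2} \log(D) \cdot n \log n\right),$$
where $D = \max_{e \in E} |e|$.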
18.2 Independent random sampling
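For any edge $e \in E$ let $Q_e$ be defined as
$$Q_e(x) = \max_{u,v \in e} (x_u - x_v)^2.$$
Therefore, $Q_H(x) = \sum_{e \in E} w_e Q_e(x)$.
Suppose we have a probability distribution $\mu$ on $E$, i.e., $\mu_e \ge 0$ for all $e \in E$ and $\sum_{e \in E} \mu_e = 1$. Similar to the graph case, we let $\tilde Q$ be an unbiased estimator: namely, we let $\tilde Q = \frac{w_e}{\mu_e} Q_e$ with probability $\mu_e$. Then, it follows that
$$\mathbb{E}\big[\tilde Q(x)\big] = \sum_{e \in E} \mu_e \cdot \frac{w_e}{\mu_e}\, Q_e(x) = Q_H(x).$$
As usual the difficulty would be in choosing the probabilities $\mu_e$.
Then, we form $\tilde H$ by sampling $\tilde Q$ independently $M$ times, obtaining $\tilde Q_1, \dots, \tilde Q_M$ with $\tilde Q_i = \frac{w_{e_i}}{\mu_{e_i}} Q_{e_i}$, and taking the empirical mean of the samples. In particular, we can write
$$Q_{\tilde H}(x) = \frac{1}{M} \sum_{i=1}^{M} \frac{w_{e_i}}{\mu_{e_i}}\, Q_{e_i}(x).$$
Observe that
$$\mathbb{E}\big[Q_{\tilde H}(x)\big] = Q_H(x)$$
for all $x \in \mathbb{R}^V$. So, as before, the main question is how to choose the probabilities $\mu_e$.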
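The sampling scheme above (draw $M$ hyperedges independently from a distribution $\mu$, and give each sampled copy of $e$ the weight $w_e / (\mu_e M)$) can be sketched as follows. The toy hypergraph, the distribution `mu`, the seed, and all function names are hypothetical choices for illustration; in particular `mu` here is arbitrary and is not the effective-resistance distribution.

```python
import random

def hyperedge_energy(edge, x):
    """Q_e(x): largest squared difference of x over pairs of vertices in the hyperedge."""
    vals = [x[u] for u in edge]
    return (max(vals) - min(vals)) ** 2

def hypergraph_energy(edges, weights, x):
    """Q_H(x) = sum_e w_e * Q_e(x)."""
    return sum(w * hyperedge_energy(e, x) for e, w in zip(edges, weights))

def sample_sparsifier(edges, weights, mu, M, rng):
    """Draw M hyperedges i.i.d. from mu and average the rescaled samples.

    Returns {edge index: new weight}; an edge sampled k times receives
    weight k * w_e / (mu_e * M).
    """
    new_weights = {}
    for i in rng.choices(range(len(edges)), weights=mu, k=M):
        new_weights[i] = new_weights.get(i, 0.0) + weights[i] / (mu[i] * M)
    return new_weights

def sparsifier_energy(edges, new_weights, x):
    """Quadratic expression of the sampled hypergraph at x."""
    return sum(w * hyperedge_energy(edges[i], x) for i, w in new_weights.items())

# Toy instance (hypothetical data).
edges = [(0, 1, 2), (1, 3), (0, 2, 3, 4), (2, 4)]
weights = [1.0, 2.0, 0.5, 1.5]
mu = [0.4, 0.3, 0.2, 0.1]
x = [0.3, -1.0, 2.0, 0.5, -0.7]

rng = random.Random(18)  # arbitrary seed
sparse = sample_sparsifier(edges, weights, mu, M=20000, rng=rng)
exact = hypergraph_energy(edges, weights, x)
estimate = sparsifier_energy(edges, sparse, x)
print(f"Q_H(x) = {exact:.3f}, sampled estimate = {estimate:.3f}")
```

Unbiasedness holds for every fixed $x$ and every $\mu$ with full support; what the choice of $\mu$ controls is how large $M$ must be for the estimate to be accurate simultaneously for all $x$.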
18.3 Auxiliary Graph
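Define the edge set
$$F = \bigcup_{e \in E} \binom{e}{2}$$
and let $G = (V, F, c)$ be a weighted graph, where we will choose the edge conductances $c = (c_{uv})_{\{u,v\} \in F}$ later. Let
$$L_G = \sum_{\{u,v\} \in F} c_{uv} (\mathbf{1}_u - \mathbf{1}_v)(\mathbf{1}_u - \mathbf{1}_v)^\top$$
be the Laplacian of $G$.
Let $R_{uv} = (\mathbf{1}_u - \mathbf{1}_v)^\top L_G^{+} (\mathbf{1}_u - \mathbf{1}_v)$ denote the effective resistance between $u, v$ in $G$. For a hyperedge $e \in E$, we let
$$R_e = \max_{u,v \in e} R_{uv}.$$
Having this we define
$$\mu_e = \frac{w_e R_e}{Z},$$
where $Z = \sum_{e \in E} w_e R_e$ is the normalizing constant. Note that in the special case that $H$ is a connected graph (with $G = H$ and $c = w$), $Z = n - 1$. Now, the question is how to choose the conductances of the edges of $G$.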
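Given conductances on the auxiliary graph, the sampling probabilities (weight of the hyperedge times the largest effective resistance between two of its vertices, divided by the normalizing constant $Z$) can be computed as in the sketch below. It assumes the auxiliary graph is connected and small enough for dense Gaussian elimination; the helper names and both toy instances are illustrative assumptions.

```python
from itertools import combinations

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def effective_resistance(n, cond, u, v):
    """Effective resistance between u and v; cond maps vertex pairs to conductances.

    Assumes the graph is connected. Vertex n - 1 is grounded, one unit of
    current enters at u and leaves at v, and the potential difference is returned.
    """
    L = [[0.0] * n for _ in range(n)]
    for (a, b), c in cond.items():
        L[a][a] += c
        L[b][b] += c
        L[a][b] -= c
        L[b][a] -= c
    rhs = [0.0] * n
    rhs[u] += 1.0
    rhs[v] -= 1.0
    phi = solve([row[:n - 1] for row in L[:n - 1]], rhs[:n - 1]) + [0.0]
    return phi[u] - phi[v]

def sampling_distribution(n, edges, weights, cond):
    """Return (mu, Z) with mu_e = w_e * R_e / Z and R_e = max over pairs in e of R_uv."""
    R = [max(effective_resistance(n, cond, u, v)
             for u, v in combinations(sorted(e), 2)) for e in edges]
    Z = sum(w * r for w, r in zip(weights, R))
    return [w * r / Z for w, r in zip(weights, R)], Z

# Graph case: unit-weight triangle with conductances equal to the weights.
tri = [(0, 1), (1, 2), (0, 2)]
mu_tri, Z_tri = sampling_distribution(3, tri, [1.0, 1.0, 1.0], {p: 1.0 for p in tri})

# Hypergraph case: the weight of {0, 1, 2} is split evenly over its three pairs.
cond = {(0, 1): 1 / 3, (1, 2): 1 / 3, (0, 2): 1 / 3, (2, 3): 1.0}
mu_h, Z_h = sampling_distribution(4, [(0, 1, 2), (2, 3)], [1.0, 1.0], cond)
print(Z_tri, mu_tri, Z_h, mu_h)
```

In both toy instances $Z$ comes out as $n - 1$; for general hypergraphs and conductances $Z$ can be larger.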
The following is the main lemma:
Lemma 18.2.
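Suppose it holds that
$$\langle x, L_G x \rangle \le Q_H(x) \qquad \forall x \in \mathbb{R}^V, \tag{18.2}$$
then for any $\epsilon > 0$ and $M \ge C \epsilon^{-2} \log(D)\, Z \log n$, where $C$ is a universal constant, with constant probability $\tilde H$ is a spectral $\epsilon$-sparsifier of $H$.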
The proof of this lemma uses the chaining machinery. But let us first discuss how to satisfy the assumptions of this lemma.
Roughly speaking, this auxiliary graph $G$ puts $H$ into isotropic position. Of course, this step is very straightforward for matrices, but as you will see it is considerably more complicated for these non-linear operators.
18.4 Choosing Conductances
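We are therefore left to find edge conductances in the graph $G$ so that (18.2) holds and $Z$ is small. To this end, let us choose nonnegative numbers
$$\left\{ c^e_{uv} \ge 0 \;:\; e \in E,\ \{u,v\} \in \tbinom{e}{2} \right\}$$
such that
$$\sum_{\{u,v\} \in \binom{e}{2}} c^e_{uv} = w_e \qquad \forall e \in E. \tag{3.11}$$
For $\{u,v\} \in F$, we then define our edge conductance
$$c_{uv} = \sum_{e \in E:\ u,v \in e} c^e_{uv}. \tag{18.3}$$
In this case,
$$\langle x, L_G x \rangle = \sum_{\{u,v\} \in F} c_{uv} (x_u - x_v)^2 = \sum_{e \in E} \sum_{\{u,v\} \in \binom{e}{2}} c^e_{uv} (x_u - x_v)^2 \le \sum_{e \in E} w_e \max_{u,v \in e} (x_u - x_v)^2 = Q_H(x),$$
so (18.2) holds.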
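The comparison between the graph and hypergraph energies can be checked numerically. The sketch below uses the simplest valid splitting, spreading each hyperedge weight evenly over its pairs; this uniform choice, the function names, and the random test data are assumptions for illustration, and any nonnegative splitting that sums to $w_e$ on each hyperedge would pass the same check.

```python
import random
from itertools import combinations

def uniform_split_conductances(edges, weights):
    """Split each w_e evenly over the pairs of e and add up the contributions per pair."""
    cond = {}
    for e, w in zip(edges, weights):
        pairs = list(combinations(sorted(e), 2))
        for p in pairs:
            cond[p] = cond.get(p, 0.0) + w / len(pairs)
    return cond

def graph_energy(cond, x):
    """<x, L_G x> = sum over pairs of c_uv * (x_u - x_v)^2."""
    return sum(c * (x[u] - x[v]) ** 2 for (u, v), c in cond.items())

def hypergraph_energy(edges, weights, x):
    """Q_H(x) = sum_e w_e * max over u, v in e of (x_u - x_v)^2."""
    return sum(w * (max(x[u] for u in e) - min(x[u] for u in e)) ** 2
               for e, w in zip(edges, weights))

edges = [(0, 1, 2), (1, 3), (0, 2, 3, 4), (2, 4)]
weights = [1.0, 2.0, 0.5, 1.5]
cond = uniform_split_conductances(edges, weights)

rng = random.Random(3)  # arbitrary seed
trials = [[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(200)]
dominated = all(graph_energy(cond, x) <= hypergraph_energy(edges, weights, x) + 1e-9
                for x in trials)
print("graph energy never exceeds hypergraph energy:", dominated)
```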
Lemma 18.3 (Foster’s Network Theorem).
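If $G$ is connected, it holds that
$$\sum_{\{u,v\} \in F} c_{uv} R_{uv} = n - 1.$$
Proof.
The observation is that
$$\sum_{\{u,v\} \in F} c_{uv} R_{uv} = \sum_{\{u,v\} \in F} c_{uv} (\mathbf{1}_u - \mathbf{1}_v)^\top L_G^{+} (\mathbf{1}_u - \mathbf{1}_v) = \operatorname{Tr}\Big( L_G^{+} \sum_{\{u,v\} \in F} c_{uv} (\mathbf{1}_u - \mathbf{1}_v)(\mathbf{1}_u - \mathbf{1}_v)^\top \Big) = \operatorname{Tr}\big(L_G^{+} L_G\big) = n - 1,$$
where the last equality holds because $L_G^{+} L_G$ is the projection onto the $(n-1)$-dimensional space orthogonal to the all-ones vector.
∎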
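As a small worked instance of Foster's theorem (the statement that the conductance-weighted sum of effective resistances over the edges of a connected $n$-vertex graph equals $n - 1$), consider a triangle in which every edge has the same conductance $c$; this example is an illustration chosen here, not part of the lecture.

```latex
% Triangle on vertices {1,2,3}, every edge with conductance c > 0 (so n = 3).
% Between any two vertices, the direct edge (conductance c) is in parallel
% with the two-edge path (conductance c/2):
R_{uv} = \left( c + \frac{c}{2} \right)^{-1} = \frac{2}{3c}
\qquad \Longrightarrow \qquad
\sum_{\{u,v\} \in F} c_{uv} R_{uv} = 3 \cdot c \cdot \frac{2}{3c} = 2 = n - 1 .
```

The sum does not depend on $c$, as the theorem predicts.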
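Now, define
$$K = \max_{e \in E} \frac{w_e R_e}{\sum_{\{u,v\} \in \binom{e}{2}} c^e_{uv} R_{uv}}.$$
Then, by Foster's Network Theorem,
$$Z = \sum_{e \in E} w_e R_e \le K \sum_{e \in E} \sum_{\{u,v\} \in \binom{e}{2}} c^e_{uv} R_{uv} = K \sum_{\{u,v\} \in F} c_{uv} R_{uv} = K (n - 1).$$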
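To make the ratio just defined concrete (for each hyperedge, its weight times its largest pairwise effective resistance, divided by the conductance-weighted sum of the effective resistances of its pairs), here is a worked instance with a single hyperedge. The instance and the even splitting of the weight are assumptions chosen for illustration.

```latex
% One hyperedge e = {1,2,3} with w_e = 1, split evenly: c^e_{uv} = 1/3 for each
% of the three pairs. The auxiliary graph is a triangle with conductance 1/3 on
% every edge, so for every pair {u,v}:
R_{uv} = \left( \frac{1}{3} + \frac{1}{6} \right)^{-1} = 2,
\qquad
R_e = \max_{u,v \in e} R_{uv} = 2 .
% Numerator and denominator of the ratio for e:
w_e R_e = 2,
\qquad
\sum_{\{u,v\} \in \binom{e}{2}} c^e_{uv} R_{uv} = 3 \cdot \frac{1}{3} \cdot 2 = 2 ,
% hence the ratio equals 1, and the normalizing constant is
Z = w_e R_e = 2 = n - 1 .
```

Since the maximum of the $R_{uv}$ is at least their weighted average, the ratio is never below 1; it equals 1 exactly when all the conductance of a hyperedge sits on pairs attaining the maximum resistance, as in this symmetric example.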
Lemma 18.4.
We can choose conductances such that (18.3) is satisfied and $K \le 1$.
Proof.
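The conductances can be computed by solving the following convex program:
$$\begin{aligned} \max \quad & \log \det\Big( L_G + \tfrac{1}{n} \mathbf{1}\mathbf{1}^\top \Big) \\ \text{s.t.} \quad & \sum_{\{u,v\} \in \binom{e}{2}} c^e_{uv} = w_e \qquad \forall e \in E, \\ & c^e_{uv} \ge 0 \qquad \forall e \in E,\ \{u,v\} \in \tbinom{e}{2}, \end{aligned} \tag{18.4}$$
where $L_G$ is the Laplacian of $G$ with the conductances $c_{uv}$ defined in (18.3).
Note that this program is convex as log of the determinant is a concave function. Equivalently (up to an additive constant), the objective can be written as log of the generating polynomial of all spanning trees of $G$: $\log \sum_{S} \prod_{\{u,v\} \in S} c_{uv}$, where the sum is over all spanning trees $S$ of $G$. The concavity of the objective then follows from the fact that any real stable polynomial with nonnegative coefficients is log-concave on the positive orthant.
We don’t go into the details here. The proof writes the KKT conditions for the optimality of the convex program and deduces the bound on $K$ from them. ∎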
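18.5 Notes on the Proof of Lemma 18.2
The proof of Lemma 18.2 is technical, but at a high level it uses the generic chaining machinery. The index set is $T = \{x \in \mathbb{R}^V : Q_H(x) \le 1\}$, and the proof uses (18.2), which says that $T$ is a subset of the unit ball $\{x \in \mathbb{R}^V : \langle x, L_G x \rangle \le 1\}$.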
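We just explain the first few steps. We let $\tilde Q'_1, \dots, \tilde Q'_M$ be independent copies of the samples $\tilde Q_1, \dots, \tilde Q_M$. Then
$$\mathbb{E} \sup_{x \in T} \Big| Q_H(x) - \frac{1}{M} \sum_{i=1}^{M} \tilde Q_i(x) \Big| = \mathbb{E} \sup_{x \in T} \Big| \mathbb{E}' \Big[ \frac{1}{M} \sum_{i=1}^{M} \big( \tilde Q'_i(x) - \tilde Q_i(x) \big) \Big] \Big| \le \mathbb{E}\, \mathbb{E}' \sup_{x \in T} \Big| \frac{1}{M} \sum_{i=1}^{M} \big( \tilde Q'_i(x) - \tilde Q_i(x) \big) \Big|,$$
where $\mathbb{E}'$ denotes the expectation over the independent copies only, and we have used that $\mathbb{E}'[\tilde Q'_i(x)] = Q_H(x)$ and that $\sup_x |\mathbb{E}'[\,\cdot\,]| \le \mathbb{E}' \sup_x |\cdot|$.
The point of these first few steps is to make the process “centered”. The second step is to avoid binary 0/1 random variables, which are very annoying to run a chaining argument for.
The idea is that the distribution of $\tilde Q_i - \tilde Q'_i$ is symmetric around the origin, i.e., centered. So, $\tilde Q_i - \tilde Q'_i$ and $\epsilon_i (\tilde Q_i - \tilde Q'_i)$ have the same distribution, where $\epsilon_i$ is an independent Rademacher random variable.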
First, notice
where and . If we let
Then,
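Similarly, the object we need to study is
$$\mathbb{E} \sup_{x \in T} \Big| \frac{1}{M} \sum_{i=1}^{M} \epsilon_i\, \tilde Q_i(x) \Big|$$
for Rademacher random variables $\epsilon_1, \dots, \epsilon_M$. So, we just need to bound the expected value of this quantity over the space $T$. This is where the chaining is used, but the details are beyond the scope of this course.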