CSE 525: Randomized Algorithms Spring 2026 Lecture 19: Chaining for Norms Lecturer: Shayan Oveis Gharan 06/04/25

Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications.

The content of these notes is based on https://homes.cs.washington.edu/~jrl/cse599wi23/notes/lec4.html.

Entropy-number convention.

For a metric space $(S,\rho)$, write $e_h(S,\rho)$ for the smallest radius $r$ such that $S$ can be covered by at most $2^{2^h}$ balls of radius $r$ in the metric $\rho$.
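To make the convention concrete, the following sketch upper-bounds $e_h$ for a finite point set. It uses a farthest-point greedy heuristic, which is an assumption of this sketch and not something the notes rely on; the names `max_balls` and `greedy_cover_radius` are hypothetical. Any cover it finds only gives an upper bound on the entropy number.

```python
def max_balls(h):
    """Number of balls allowed at scale h: 2^(2^h)."""
    return 2 ** (2 ** h)


def greedy_cover_radius(points, dist, h):
    """Radius of a greedy cover of `points` by at most 2^(2^h) balls.

    Farthest-point heuristic (not from the notes): repeatedly add the point
    that is farthest from the centers chosen so far. The returned radius is
    an upper bound on e_h of the finite point set.
    """
    k = min(max_balls(h), len(points))
    centers = [points[0]]
    # nearest[i] = distance from points[i] to its closest chosen center
    nearest = [dist(p, centers[0]) for p in points]
    while len(centers) < k:
        i = max(range(len(points)), key=lambda j: nearest[j])
        centers.append(points[i])
        nearest = [min(old, dist(p, points[i])) for old, p in zip(nearest, points)]
    return max(nearest)


def abs_dist(a, b):
    return abs(a - b)


line = list(range(8))                              # the points 0, 1, ..., 7 on the real line
radius_h0 = greedy_cover_radius(line, abs_dist, 0)  # at most 2 balls
radius_h1 = greedy_cover_radius(line, abs_dist, 1)  # at most 4 balls
```

The radius can only shrink as $h$ grows, since more balls are allowed at each scale.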

19.1 Norms and the main estimate

Definition 19.1 (Norms and seminorms).

A map $N:\mathbb{R}^n\to\mathbb{R}_+$ is a norm when, for all $x,y\in\mathbb{R}^n$ and $\lambda\in\mathbb{R}$,

  1. $N(\lambda x)=|\lambda|\,N(x)$,

  2. $N(x+y)\le N(x)+N(y)$,

  3. $N(x)=0$ if and only if $x=0$.

When only the first two properties are used, N is a seminorm. The arguments below use the word “norm” in this broad sense.
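The first two properties can be spot-checked numerically. The sketch below is only a randomized sanity test, not a proof; the helper names (`check_seminorm`, `first_coord`, and so on) are hypothetical. It also illustrates a seminorm that fails the third property.

```python
import math
import random


def l2(x):
    return math.sqrt(sum(t * t for t in x))


def linf(x):
    return max(abs(t) for t in x)


def first_coord(x):
    # A seminorm that is not a norm: it vanishes on some nonzero vectors.
    return abs(x[0])


def squared_l2(x):
    # Not a seminorm: it is homogeneous of degree two, not one.
    return sum(t * t for t in x)


def check_seminorm(N, n, trials=200, seed=0, tol=1e-9):
    """Spot-check homogeneity and the triangle inequality on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        lam = rng.uniform(-3.0, 3.0)
        scaled = [lam * t for t in x]
        summed = [a + b for a, b in zip(x, y)]
        if abs(N(scaled) - abs(lam) * N(x)) > tol:
            return False
        if N(summed) > N(x) + N(y) + tol:
            return False
    return True
```

Here `first_coord` passes the check even though `first_coord([0.0, 1.0])` is zero, which is exactly the gap between a seminorm and a norm.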

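Let $N_1,\ldots,N_m$ be norms on $\mathbb{R}^n$, and let

\[ T\subseteq B_2^n, \qquad B_2^n := \{x\in\mathbb{R}^n : \|x\|_2\le 1\}. \]

For a standard Gaussian $g\sim N(0,I_n)$, define

\[ \kappa := \mathbb{E}\max_{k=1,\ldots,m} N_k(g). \]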
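The quantity $\kappa$ is easy to estimate by simulation. The sketch below is a plain Monte Carlo estimate for a hypothetical family $N_k(x)=|x_k|$ (seminorms, in the broad sense used here), for which $\max_k N_k$ is the sup norm; the function names and sample sizes are assumptions of the example.

```python
import math
import random


def estimate_kappa(norms, n, samples=20000, seed=0):
    """Monte Carlo estimate of kappa = E max_k N_k(g) for g ~ N(0, I_n)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += max(N(g) for N in norms)
    return total / samples


def coordinate_norms(n):
    """Hypothetical family N_k(x) = |x_k|; max_k N_k is then the sup norm."""
    return [lambda x, k=k: abs(x[k]) for k in range(n)]


# For n = 1 the exact value is E|g| = sqrt(2 / pi).
kappa_1 = estimate_kappa(coordinate_norms(1), 1)
kappa_64 = estimate_kappa(coordinate_norms(64), 64, samples=2000)
```

For this family $\kappa$ grows like $\sqrt{\log n}$, so it stays small even when the number of seminorms is large.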
Theorem 19.2.

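If $\epsilon_1,\ldots,\epsilon_m$ are independent random signs, then

\[ \mathbb{E}\max_{x\in T}\sum_{k=1}^m \epsilon_k N_k(x)^2 \;\lesssim\; \kappa\,\log(n)\,\max_{x\in T}\sqrt{\sum_{k=1}^m N_k(x)^2}. \tag{19.1} \]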

19.1.1 Example: sums of random matrices

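Let

\[ A=\sum_{k=1}^m \epsilon_k A_k^{T}A_k, \]

with each $A_k^{T}A_k$ positive semidefinite. Then

\[
\begin{aligned}
\|A\|_{\mathrm{op}} &= \max_{\|x\|_2\le 1}\langle x,Ax\rangle \\
&= \max_{\|x\|_2\le 1}\sum_{k=1}^m \epsilon_k\,\langle x,A_k^{T}A_kx\rangle \\
&= \max_{\|x\|_2\le 1}\sum_{k=1}^m \epsilon_k\,\|A_kx\|_2^2 .
\end{aligned}
\]

This is the preceding setting with $T=B_2^n$ and $N_k(x)=\|A_kx\|_2$.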

Translator note.

The source text appears to phrase the final identification in squared form. The normalization above is the one for which $\sum_k \epsilon_k N_k(x)^2=\sum_k \epsilon_k\|A_kx\|_2^2$.
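The identity behind this example, $\langle x,Ax\rangle=\sum_k \epsilon_k\|A_kx\|_2^2$, can be checked on random data. The sketch below uses plain nested lists for matrices; the dimensions, seed, and helper names are assumptions chosen only for illustration.

```python
import random


def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def quadratic_form_via_matrix(As, eps, x):
    """Build A = sum_k eps_k A_k^T A_k explicitly and return <x, A x>."""
    n = len(x)
    A = [[0.0] * n for _ in range(n)]
    for e, Ak in zip(eps, As):
        for row in Ak:                      # A_k^T A_k is the sum of row^T row over rows of A_k
            for i in range(n):
                for j in range(n):
                    A[i][j] += e * row[i] * row[j]
    return sum(a * b for a, b in zip(x, matvec(A, x)))


def quadratic_form_via_norms(As, eps, x):
    """Return sum_k eps_k ||A_k x||_2^2 without forming A."""
    return sum(e * sum(t * t for t in matvec(Ak, x)) for e, Ak in zip(eps, As))


rng = random.Random(1)
n, m = 4, 5
As = [[[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(3)] for _ in range(m)]
eps = [rng.choice((-1, 1)) for _ in range(m)]
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
lhs = quadratic_form_via_matrix(As, eps, x)
rhs = quadratic_form_via_norms(As, eps, x)
```

The two values agree up to floating-point rounding, which is all the identification $N_k(x)=\|A_kx\|_2$ requires.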

19.2 Dudley’s inequality and metric reduction

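The process

\[ \Big\{\sum_{k=1}^m \epsilon_k N_k(x)^2 : x\in T\Big\} \]

is subgaussian with respect to

\[ d(x,y) := \Big(\sum_{k=1}^m \big(N_k(x)^2-N_k(y)^2\big)^2\Big)^{1/2}. \]

Dudley’s entropy inequality therefore gives

\[ \mathbb{E}\max_{x\in T}\sum_{k=1}^m \epsilon_k N_k(x)^2 \;\lesssim\; \sum_{h\ge 0} 2^{h/2}\, e_h(T,d). \tag{19.2} \]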

Both sides of (19.1) are homogeneous of degree two in the family $(N_k)_{k=1}^m$. Thus one may rescale and assume

\[ \max_{x\in T}\sum_{k=1}^m N_k(x)^2=1. \tag{19.3} \]

Define

\[ \|x\|_N := \max_{k=1,\ldots,m} N_k(x). \]

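For $x,y\in T$, use $a^2-b^2=(a-b)(a+b)$ to obtain

\[
\begin{aligned}
d(x,y) &= \Big(\sum_{k=1}^m \big(N_k(x)-N_k(y)\big)^2\big(N_k(x)+N_k(y)\big)^2\Big)^{1/2} \\
&\le \Big(\sum_{k=1}^m N_k(x-y)^2\,\big(N_k(x)+N_k(y)\big)^2\Big)^{1/2} \\
&\le \Big(\sum_{k=1}^m \max_{j} N_j(x-y)^2\,\big(N_k(x)+N_k(y)\big)^2\Big)^{1/2} \\
&= \|x-y\|_N\,\Big(\sum_{k=1}^m \big(N_k(x)+N_k(y)\big)^2\Big)^{1/2} \\
&\le \|x-y\|_N\,\Big(\sum_{k=1}^m 2N_k(x)^2+2N_k(y)^2\Big)^{1/2} \\
&\le 2\,\|x-y\|_N .
\end{aligned}
\]

The first inequality uses $|N_k(x)-N_k(y)|\le N_k(x-y)$, the penultimate one uses $(a+b)^2\le 2a^2+2b^2$, and the last one follows from (19.3), since $\sum_k N_k(x)^2\le 1$ for every $x\in T$. Consequently, $e_h(T,d)\le 2\,e_h(T,\|\cdot\|_N)$, and (19.2) implies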
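The comparison $d(x,y)\le 2\max_k N_k(x-y)$ can be sanity-checked on random points that satisfy the normalization. The sketch below assumes the hypothetical family $N_k(x)=|\langle a_k,x\rangle|$ with Gaussian vectors $a_k$; it is a randomized check, not a proof.

```python
import math
import random


def norm_values(rows, x):
    """Values N_k(x) = |<a_k, x>| for the hypothetical family given by `rows`."""
    return [abs(sum(a * t for a, t in zip(row, x))) for row in rows]


def d(rows, x, y):
    """The metric d(x, y) = (sum_k (N_k(x)^2 - N_k(y)^2)^2)^(1/2)."""
    nx, ny = norm_values(rows, x), norm_values(rows, y)
    return math.sqrt(sum((p * p - q * q) ** 2 for p, q in zip(nx, ny)))


def max_norm(rows, x):
    """The norm max_k N_k(x)."""
    return max(norm_values(rows, x))


def normalized_point(rows, n, rng):
    """Random x with sum_k N_k(x)^2 <= 1, matching the normalization (19.3)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = math.sqrt(sum(v * v for v in norm_values(rows, x)))
    r = rng.random()
    return [r * t / scale for t in x]


def check_metric_reduction(n=5, m=12, trials=500, seed=0, tol=1e-12):
    """Return True if d(x, y) <= 2 * max_k N_k(x - y) held in every trial."""
    rng = random.Random(seed)
    rows = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    for _ in range(trials):
        x = normalized_point(rows, n, rng)
        y = normalized_point(rows, n, rng)
        diff = [a - b for a, b in zip(x, y)]
        if d(rows, x, y) > 2.0 * max_norm(rows, diff) + tol:
            return False
    return True
```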

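\[ \mathbb{E}\max_{x\in T}\sum_{k=1}^m \epsilon_k N_k(x)^2 \;\lesssim\; \sum_{h\ge 0} 2^{h/2}\, e_h(T,\|\cdot\|_N). \tag{19.4} \]

We now split the right-hand side into the ranges $h\le 4\log n$ and $h>4\log n$.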

19.3 The large-entropy tail

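Let

\[ B_N := \{x\in\mathbb{R}^n : \|x\|_N\le 1\}. \]

For $x\in T$, the normalization (19.3) gives

\[ \|x\|_N \le \Big(\sum_{k=1}^m N_k(x)^2\Big)^{1/2} \le 1, \]

so $T\subseteq B_N$, hence $e_h(T,\|\cdot\|_N)\le e_h(B_N,\|\cdot\|_N)$.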

Claim 19.3.

For any norm $\|\cdot\|_N$ on $\mathbb{R}^n$ with unit ball $B_N$, and any $h\ge 1$,

\[ e_h(B_N,\|\cdot\|_N)\le 4\cdot 2^{-2^h/n}. \]
Proof.

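Fix $\delta\in(0,1)$, and choose a maximal collection $x_1,\ldots,x_s\in B_N$ with pairwise distances at least $2\delta$ in $\|\cdot\|_N$. Maximality gives the cover

\[ B_N\subseteq\bigcup_{j=1}^s \big(x_j+2\delta B_N\big). \]

The sets $x_j+\delta B_N$ have pairwise disjoint interiors and are contained in $2B_N$, so

\[ \mathrm{vol}_n(2B_N)\ \ge\ s\,\mathrm{vol}_n(\delta B_N)\ =\ s\,(\delta/2)^n\,\mathrm{vol}_n(2B_N). \]

Therefore $s\le(2/\delta)^n$. Taking $\delta=2\cdot 2^{-2^h/n}$ yields $s\le 2^{2^h}$ and gives a cover of $B_N$ by at most $2^{2^h}$ balls of radius $2\delta=4\cdot 2^{-2^h/n}$. (If this value of $\delta$ is at least $1$, the claim is immediate, because the single ball $B_N$ has radius $1<2\delta$.) ∎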
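The arithmetic in the choice of $\delta$ is easy to verify: with $\delta=2\cdot 2^{-2^h/n}$, the packing bound $(2/\delta)^n$ equals $2^{2^h}$ exactly. The sketch below checks this in logarithmic form to avoid huge numbers; the function name is hypothetical.

```python
import math


def packing_parameters(h, n):
    """Return (delta, log2 of the bound (2/delta)^n, cover radius 2*delta)."""
    delta = 2.0 * 2.0 ** (-(2 ** h) / n)
    log2_count_bound = n * math.log2(2.0 / delta)   # log2 of (2/delta)^n
    return delta, log2_count_bound, 2.0 * delta


delta, log2_bound, radius = packing_parameters(h=3, n=4)
```

For `h=3, n=4` this gives `delta = 0.5`, a count bound of $2^{8}$ balls, and radius $1$, in line with the statement of the claim.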

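Using Claim 19.3, the large-$h$ part of (19.4) obeys

\[ \sum_{h>4\log n} 2^{h/2}\, e_h(T,\|\cdot\|_N)\ \le\ 4\sum_{h>4\log n} 2^{h/2}\, 2^{-2^h/n}\ \le\ O(1). \]

Thus

\[ \mathbb{E}\max_{x\in T}\sum_{k=1}^m \epsilon_k N_k(x)^2 \;\lesssim\; O(1)+\sum_{0\le h\le 4\log n} 2^{h/2}\, e_h(T,\|\cdot\|_N). \]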
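The tail sum $\sum_{h>4\log n}2^{h/2}\,2^{-2^h/n}$ decays doubly exponentially, which a short computation makes visible. The sketch below assumes the logarithm is base 2 and truncates the sum after a fixed number of terms; both choices, and the function name, are assumptions of the example.

```python
import math


def tail_sum(n, extra_terms=60):
    """Sum of 2^(h/2) * 2^(-2^h / n) over integers h > 4 * log2(n), truncated."""
    start = math.floor(4 * math.log2(n)) + 1   # smallest integer h with h > 4 log2(n)
    total = 0.0
    for h in range(start, start + extra_terms):
        exponent = h / 2 - (2 ** h) / n
        # Terms below 2^-1000 are treated as zero to stay clear of float underflow.
        if exponent > -1000:
            total += 2.0 ** exponent
    return total


tails = {n: tail_sum(n) for n in (1, 2, 3, 16, 1024)}
```

For every tested $n$ the truncated sum stays below $1$, consistent with the $O(1)$ bound.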

19.4 The relevant entropy range and dual Sudakov

Since $T\subseteq B_2^n$,

\[ e_h(T,\|\cdot\|_N)\le e_h(B_2^n,\|\cdot\|_N). \]

The required ingredient is the following dual Sudakov bound.

Lemma 19.4 (Dual Sudakov).

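For any norm $\|\cdot\|$ on $\mathbb{R}^n$ and every $h\ge 0$,

\[ e_h(B_2^n,\|\cdot\|)\ \lesssim\ 2^{-h/2}\,\mathbb{E}\|g\|, \]

where $g\sim N(0,I_n)$.

Applying Lemma 19.4 with $\|\cdot\|=\|\cdot\|_N$ gives

\[ \mathbb{E}\|g\|_N=\mathbb{E}\max_{k=1,\ldots,m}N_k(g)=\kappa. \]

Each term with $0\le h\le 4\log n$ is therefore at most a constant times $2^{h/2}\cdot 2^{-h/2}\kappa=\kappa$, and there are $O(\log n)$ such terms, so

\[ \sum_{0\le h\le 4\log n} 2^{h/2}\, e_h(T,\|\cdot\|_N)\ \lesssim\ \kappa\log n. \]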

Under the normalization (19.3), this proves (19.1); undoing the rescaling gives the stated form.

19.5 Gaussian shift lemma

Lemma 19.5 (Gaussian shift).

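Let $K\subseteq\mathbb{R}^n$ be symmetric and convex, and let $\gamma_n$ denote the standard Gaussian measure on $\mathbb{R}^n$. For every $x\in\mathbb{R}^n$,

\[ \gamma_n(K+x)\ \ge\ \exp\!\Big(-\frac{\|x\|_2^2}{2}\Big)\,\gamma_n(K). \]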
Proof.

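Using symmetry of $K$ and writing $\sigma$ for a uniform random sign,

\[
\begin{aligned}
\gamma_n(K+x) &= (2\pi)^{-n/2}\int_K \exp\!\Big(-\frac{\|x+z\|_2^2}{2}\Big)\,dz \\
&= (2\pi)^{-n/2}\int_K \mathbb{E}_{\sigma\in\{-1,1\}}\exp\!\Big(-\frac{\|\sigma x+z\|_2^2}{2}\Big)\,dz .
\end{aligned}
\]

Since

\[ \mathbb{E}_{\sigma}\|\sigma x+z\|_2^2=\|x\|_2^2+\|z\|_2^2, \]

Jensen’s inequality yields

\[
\begin{aligned}
\gamma_n(K+x) &\ge (2\pi)^{-n/2}\int_K \exp\!\Big(-\frac{\mathbb{E}_{\sigma}\|\sigma x+z\|_2^2}{2}\Big)\,dz \\
&= (2\pi)^{-n/2}\int_K \exp\!\Big(-\frac{\|x\|_2^2+\|z\|_2^2}{2}\Big)\,dz \\
&= \exp\!\Big(-\frac{\|x\|_2^2}{2}\Big)\,\gamma_n(K).
\end{aligned}
\]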

Translator note.

The displayed conclusion above matches the Gaussian-shift bound used in the proof of dual Sudakov; constants are immaterial for the subsequent estimate.
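In one dimension the Gaussian shift bound can be checked directly, because the measure of an interval has a closed form through the error function. The sketch below takes $K=[-a,a]$ and verifies $\gamma_1(K+x)\ge e^{-x^2/2}\gamma_1(K)$ on a grid; the grid and the function names are assumptions of the example.

```python
import math


def Phi(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def shifted_interval_measure(a, x):
    """gamma_1([-a, a] + x) for the standard Gaussian measure on the line."""
    return Phi(x + a) - Phi(x - a)


def shift_lower_bound(a, x):
    """The right-hand side exp(-x^2 / 2) * gamma_1([-a, a])."""
    return math.exp(-x * x / 2.0) * shifted_interval_measure(a, 0.0)


def check_shift(tol=1e-12):
    """Return True if the shift bound holds at every grid point."""
    for a in (0.1, 0.5, 1.0, 2.0, 3.0):
        for i in range(-30, 31):
            x = i / 10.0
            if shifted_interval_measure(a, x) + tol < shift_lower_bound(a, x):
                return False
    return True
```

The bound is tight at $x=0$ and nearly tight for very short intervals, where the Gaussian density is almost constant across $K+x$.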

19.6 Proof of the dual Sudakov lemma

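Let

\[ \mathcal{B} := \{x\in\mathbb{R}^n : \|x\|\le 1\} \]

be the unit ball of the norm $\|\cdot\|$. Choose $x_1,\ldots,x_s\in B_2^n$ maximally so that the translated sets $x_j+\delta\mathcal{B}$ are pairwise disjoint. Then

\[ B_2^n\subseteq\bigcup_{j=1}^s \big(x_j+2\delta\mathcal{B}\big), \tag{19.5} \]

so $B_2^n$ is covered by $s$ balls of radius $2\delta$ in the norm $\|\cdot\|$.

For any $\lambda>0$, the scaled sets $\lambda(x_j+\delta\mathcal{B})$ are pairwise disjoint. Therefore

\[
\begin{aligned}
1 &\ge \gamma_n\Big(\bigcup_{j=1}^s \lambda(x_j+\delta\mathcal{B})\Big) \\
&= \sum_{j=1}^s \gamma_n\big(\lambda x_j+\lambda\delta\mathcal{B}\big) \\
&\ge \sum_{j=1}^s \exp\!\Big(-\frac{\lambda^2\|x_j\|_2^2}{2}\Big)\,\gamma_n(\lambda\delta\mathcal{B}) \\
&\ge s\,\exp\!\Big(-\frac{\lambda^2}{2}\Big)\,\gamma_n(\lambda\delta\mathcal{B}),
\end{aligned}
\]

where Lemma 19.5 is used in the third line and $x_j\in B_2^n$ in the final line.

Choose

\[ \lambda := \frac{2}{\delta}\,\mathbb{E}\|g\|. \]

Then, by Markov’s inequality,

\[ \gamma_n(\lambda\delta\mathcal{B})=\mathbb{P}\big(\|g\|\le\lambda\delta\big)=\mathbb{P}\big(\|g\|\le 2\,\mathbb{E}\|g\|\big)\ \ge\ \frac12. \]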
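The Markov step says that a nonnegative random variable is at most twice its mean with probability at least one half. The sketch below illustrates this empirically for the sup norm of a Gaussian vector; the choice of norm, the dimension, and the function names are assumptions. Because the empirical mean is used, Markov's inequality applies to the sample itself.

```python
import random


def sup_norm(x):
    return max(abs(t) for t in x)


def markov_check(norm, n, samples=5000, seed=0):
    """Return (empirical mean of ||g||, fraction of samples with ||g|| <= 2 * mean)."""
    rng = random.Random(seed)
    values = [norm([rng.gauss(0.0, 1.0) for _ in range(n)]) for _ in range(samples)]
    mean = sum(values) / samples
    inside = sum(1 for v in values if v <= 2.0 * mean)
    return mean, inside / samples


mean_norm, fraction_inside = markov_check(sup_norm, 8)
```

In practice the fraction is far above one half, since norms of Gaussian vectors concentrate; the proof only needs the crude bound.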

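Combining the previous inequalities gives

\[ 1\ \ge\ \frac{s}{2}\,\exp\!\Big(-\frac12\Big(\frac{2\,\mathbb{E}\|g\|}{\delta}\Big)^2\Big). \]

Equivalently, up to universal constants,

\[ \delta\ \lesssim\ \frac{\mathbb{E}\|g\|}{\sqrt{\log(s/2)}}. \]

With $s=2^{2^h}$, the cover in (19.5) has radius $2\delta\lesssim 2^{-h/2}\,\mathbb{E}\|g\|$. Hence

\[ e_h(B_2^n,\|\cdot\|)\ \lesssim\ 2^{-h/2}\,\mathbb{E}\|g\|, \]

which proves Lemma 19.4.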
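The last step can be made quantitative. Solving $1\ge \tfrac{s}{2}\exp\!\big(-2(\mathbb{E}\|g\|/\delta)^2\big)$ for $\delta$ gives $\delta\le\sqrt{2}\,\mathbb{E}\|g\|/\sqrt{\log(s/2)}$, and with $s=2^{2^h}$ one has $\log(s/2)=(2^h-1)\log 2$. The sketch below, with hypothetical function names and restricted to $h\ge 1$, checks that the resulting radius $2\delta$ stays within a constant factor of $2^{-h/2}\,\mathbb{E}\|g\|$.

```python
import math


def delta_bound(h, mean_norm):
    """Largest delta allowed by 1 >= (s/2) exp(-2 (E||g|| / delta)^2), s = 2^(2^h), h >= 1."""
    log_half_s = (2 ** h - 1) * math.log(2.0)      # log(s / 2)
    return math.sqrt(2.0) * mean_norm / math.sqrt(log_half_s)


def normalized_radius(h, mean_norm=1.0):
    """The ratio 2 * delta / (2^(-h/2) * E||g||), which should stay bounded in h."""
    return 2.0 * delta_bound(h, mean_norm) * 2.0 ** (h / 2) / mean_norm


ratios = [normalized_radius(h) for h in range(1, 40)]
```

The ratio decreases from about $4.8$ at $h=1$ toward $2\sqrt{2/\log 2}\approx 3.4$, so the hidden constant in the lemma is modest in this derivation.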