\documentclass[12pt]{article}
\usepackage{scribe}
\Scribe{Ryan O'Donnell}
\Lecturer{Ryan O'Donnell}
\LectureNumber{1}
\LectureDate{Sep.~28, 2005}
\LectureTitle{Introduction}
\begin{document}
\MakeScribeTop
\section{Proof, and The PCP Theorem}
The PCP Theorem is concerned with the notion of ``proofs'' for
mathematical statements. We begin with a somewhat informal
definition:
\begin{definition} A \emph{traditional proof system} works as follows:
\begin{itemize}
\item A statement is given. (e.g., ``this graph $G = \cdots$ is 3-colorable'' or ``this CNF formula
$F = (x_1 \vee \overline{x}_2 \vee \overline{x}_{10}) \wedge \cdots$
is satisfiable'').
\item A \emph{prover} writes down a proof, in some agreed-upon format.
\item A \emph{verifier} checks the statement and the proof, and
accepts or rejects.
\end{itemize}
\end{definition}
From the perspective of theoretical computer science, we usually fix a
constant-size alphabet and assume the statement and proof are
strings over this alphabet; we also usually write $n$ for the length
of the statement and measure lengths of strings and running times in
terms of $n$. The familiar complexity class NP can be cast in this
setup:
\begin{remark} Language $L$ is in NP iff there is a \emph{polynomial time} deterministic verifier
$V$ (a Turing Machine, say) and an arbitrarily powerful prover $P$,
with the following properties:
\begin{itemize}
\item ``Completeness'': For every $x \in L$, $P$ can write a proof
of length $\poly(|x|)$ that $V$ accepts.
\item ``Soundness'': For every $x \not \in L$, no matter what
$\poly(|x|)$-length proof $P$ writes, $V$ rejects.
\end{itemize}
\end{remark}
To equate the notion of the verifier being efficient with it being a
deterministic polynomial time algorithm is nowadays a bit quaint;
ever since the late '70s we have been quite happy to consider
randomized polynomial time algorithms to be efficient. As it turns
out, when proof systems are allowed to have randomized verifiers,
some very surprising things can happen. This line of research was
begun in the early-to-mid '80s by Goldwasser, Micali, and
Rackoff~\cite{GMR89} and also independently by
Babai~\cite{Bab85,BM88}. See the accompanying notes on the history
of the PCP theorem. One pinnacle of research in this area is
\emph{The PCP Theorem}:
\begin{theorem} \label{thm:pcp} (due to Arora-Safra~(AS)~\cite{AS98} and Arora-Lund-Motwani-Sudan-Szegedy~(ALMSS)~\cite{ALMSS98}) \emph{``The PCP (Probabilistically Checkable Proof)
Theorem''}:\\
All languages $L \in \NP$ have a P.C.P.~system wherein on input
$x \in \bits^n$:
\begin{itemize}
\item Prover $P$ writes down a $\poly(n)$-bit-length proof.
\item Verifier $V$ looks at $x$ and does polynomial-time deterministic
computation. Then $V$ uses $O(\log n)$ bits of randomness to choose
$C$ random locations in the proof. Here $C$ is an absolute universal
constant; say, 100. $V$ also uses these random bits to produce a
deterministic test (predicate) $\phi$ on $C$ bits.
\item $V$ reads the bits in the $C$ randomly chosen locations from the proof
and does the test $\phi$ on them, accepting or rejecting.
\item Completeness: If $x \in L$ then $P$ can write a proof
that $V$ accepts with probability $1$.
\item Soundness: For every $x \not \in L$, no matter what
proof $P$ writes, $V$ accepts with probability at most $1/2$.
\end{itemize}
\end{theorem}
\begin{remark} This P.C.P.~system has ``one-sided error'': true statements are
always accepted, but there is a chance a verifier might accept a
bogus proof. Note that this chance can be made an arbitrarily small
constant by naive repetition; for example, $V$ can repeat its same
spot-check 100 times independently, thus reading $100C$ bits and
accepting false proofs with probability at most $2^{-100}$.
\end{remark}
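Spelling out the repetition calculation: the $100$ runs are independent, and each one accepts a fixed bogus proof with probability at most $1/2$, so
\[
\Pr[\text{all $100$ runs accept}] \;\leq\; (1/2)^{100} \;=\; 2^{-100}.
\]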
The first time one sees this theorem, it seems a little hard to
conceive how it can be true. It's even more striking when one
learns that essentially $C$ may be taken to be 3. (See
Theorem~\ref{thm:hastad} below.) How could you possibly be
convinced a proof is true just by spot-checking it in 3
bits?
\begin{remark}
By the classical theory of NP-completeness, it suffices to prove the
PCP Theorem for one particular NP-complete language --- say,
3-COLORING or 3-SAT --- since poly-time reductions can be built into
the verifier's initial step (and into the prover's plans).
\end{remark}
\begin{remark}
The PCP that the prover needs to write down can be obtained in
deterministic polynomial time from the ``standard'' proofs for $x
\in L$ (i.e., the coloring for 3-COLORING, the assignment for
3-SAT).
\end{remark}
\begin{remark}
Sometimes enthusiastic descriptions of the PCP Theorem make it seem
like it greatly reduces the \emph{time} a verifier needs to spend to
check a proof. This is not accurate, since the verifier still does
polynomial-time deterministic pre-computations; these may already
take more time than it would have taken to simply check a classical
proof. What the PCP Theorem saves is \emph{proof accesses}. There
is other work on developing proof systems that let the verifier save
on time or space (see the accompanying notes on the history of the
PCP Theorem); however, that work seems to have had fewer interesting
applications.
\end{remark}
Our first task in this course will be to prove Theorem~\ref{thm:pcp}
completely. The fact that this will be possible is only due to a
very recent development. The original proof of the PCP Theorem was
very intricate and difficult; it might have been up to 100 pages,
with subsequent simplifications bringing it down to a very densely
packed 30 pages or so. However in April 2005, Irit Dinur gave a new
proof~\cite{Din05} which is elegant and clear and only a dozen pages
or so long. This is the proof we will see in the course.\\
Subsequent to the PCP Theorem were many more ``PCP Theorems'' that
strengthened certain parameters or extended the result in different
directions. What follows are a few of these:
\begin{theorem} \label{thm:raz} (Feige-Kilian~\cite{FK00}, Raz~\cite{Raz98}) (Raz's strong
version of this result is sometimes called ``the Raz Verifier'' or
``hardness of Label Cover''): For every constant $\eps > 0$, there
is a poly-size PCP for NP that reads \emph{two} random proof entries
written with $O(1)$-size alphabet and has completeness 1, soundness
$\eps$.
\end{theorem}
\begin{remark}
Note that in this theorem, the poly-size of the PCP and the alphabet
size both depend on $\eps$; with Raz's version, the proof has length
$n^{O(\log 1/\eps)}$ and the alphabet has size $\poly(1/\eps)$.
\end{remark}
\begin{remark}
The result proven is actually stronger in a technically subtle but
important way: One can additionally have the verifier use a
predicate $\phi(x,y)$ with the ``projection property'', namely, that
for every choice of $x$ there is exactly one choice of $y$ that
makes $\phi(x,y)$ true.
\end{remark}
\begin{remark} Comparing this result to the basic PCP Theorem, we
see that it uses a constant-size alphabet and two queries to get
arbitrarily small constant soundness, whereas the basic PCP Theorem
uses constantly many queries to a size-two alphabet. It might not
be immediately clear which is better, but it is indeed the former.
There are several ways to look at this: For example, with fewer
queries you have fewer opportunities to ``cross-check''; as an
extreme, it's clear that a verifier that made only one query (to a
constant size alphabet) could always be fooled. Or suppose that you
tried to encode every triple of bits in a proof with a single
character from an alphabet of size 8 --- although you could now read
three bits with just one query, the prover can cheat you by encoding
a single bit in different ways in different triples.
\end{remark}
We hope to prove Theorem~\ref{thm:raz} in this course --- at least,
the Feige-Kilian version without the projection property.\\
The following result essentially shows that we can take $C = 3$ in
the original PCP Theorem:
\begin{theorem} \label{thm:hastad} (H{\aa}stad~\cite{Has01}) ``3-LIN hardness'':
For every constant $\eps > 0$, there is a poly-size PCP for NP that
reads just \emph{three} random \emph{bits} and tests their XOR. Its
completeness is $1-\eps$ and its soundness is $1/2 + \eps$.
\end{theorem}
\begin{remark}
This result has ``imperfect completeness''. However, if one is
willing to allow an \emph{adaptive} three-bit-querying verifier
(i.e., the verifier does not have to pick the three bits in advance
but can base what bit it reads next on what it's seen so far) then
one can get completeness $1$. This is due to Guruswami, Lewin,
Sudan, and Trevisan~\cite{GLST98}.
\end{remark}
This result, which we will prove in the course, requires
Theorem~\ref{thm:raz}.\\
Finally, here is one more PCP Theorem which we \emph{won't} prove:
\begin{theorem} (due to Dinur~\cite{Din05}, based heavily on a result of Ben-Sasson and
Sudan~\cite{BS05}): In the basic PCP Theorem, the proof length can
be made $n \cdot \polylog(n)$ rather than $\poly(n)$.
\end{theorem}
\subsection{Hardness of approximation}
Perhaps the most important consequence of the PCP theorems, and the
most active direction of research in the area, is the study of
``hardness of approximation''. These will be the major focus of the
second half of this course. To be able to state hardness of
approximation results, we need to understand the notion of
\emph{(NP) combinatorial optimization problems}. Instead of making a
formal definition we will just give some examples. Briefly, these
are ``find the best solution'' versions of classic NP-complete
problems.
\begin{definition}
MAX-E3SAT: Given an E3CNF formula --- i.e., a conjunction of
``clauses'' over boolean variables $x_1, \dots, x_n$, where a clause
is an OR of exactly 3 literals, $x_i$ or $\overline{x_i}$ --- find
an assignment to the variables satisfying as many clauses as
possible.
\end{definition}
\begin{definition}
SET-COVER: Given a bunch of sets $S_1, \dots, S_m \subseteq \{1,
\dots, n\}$, find the fewest number of them whose union covers all
of $\{1, \dots, n\}$. (We assume that every ground element $i$ is
in at least one set $S_j$.)
\end{definition}
\begin{definition}
MAX-CLIQUE: Given an undirected graph, find the largest clique in
it, where a clique is a subset of vertices any two of which are
joined by an edge.
\end{definition}
\begin{definition}
KNAPSACK: Given are ``weights'' $w_1, \dots, w_n \geq 0$ of $n$
items and also ``values'' $v_1, \dots, v_n \geq 0$. Also given is a
``capacity'' $C$. Find a set of items $S$ such that $\sum_{i \in S}
w_i \leq C$ while maximizing $\sum_{i \in S} v_i$.
\end{definition}
\begin{remark} Each of these is associated to a classic NP-complete
decision problem; e.g., ``CLIQUE: Given $G$ and $k$, does $G$ have a
clique of size at least $k$?'' Notice that frequently the NP
decision problem is a contrived version of the more natural
optimization problem.
\end{remark}
\begin{remark} Combinatorial optimization problems can be divided into
two categories: Maximization problems (like MAX-E3SAT, MAX-CLIQUE,
KNAPSACK) and minimization problems (like SET-COVER).
\end{remark}
It is well-known that these problems are all NP-hard. However,
suppose that for, say, MAX-E3SAT, there was a polynomial time
algorithm with the following guarantee: Whenever the input instance
has optimum OPT --- i.e., there is an assignment satisfying OPT many
clauses --- the algorithm returns a solution satisfying $99.9\%
\times OPT$ many clauses. Such an algorithm would be highly useful,
and would tend to refute the classical notion that the NP-hardness
of MAX-E3SAT means there is no good algorithm for it.
Indeed, such results are known for the KNAPSACK problem. As early
as 1975, Ibarra and Kim~\cite{IK75} showed that for every $\eps > 0$
there is an algorithm for KNAPSACK that runs in time $\poly(n/\eps)$
and always returns a solution which is within a $(1-\eps)$ factor of
the optimal solution. So, although KNAPSACK is NP-complete, in some
sense it's very easy. Let us make some definitions to capture these
notions:
\begin{definition} Given a combinatorial optimization \emph{maximization} problem, we
say algorithm $A$ is an \emph{$\alpha$-approximation algorithm} (for
$0 < \alpha \leq 1$) if whenever the optimal solution to an instance
has value $\OPT$, $A$ is guaranteed to return a solution with value
at \emph{least} $\alpha \cdot \OPT$. We make the analogous
definition for \emph{minimization} problems, with $\alpha \geq 1$
and $A$ returning a solution with value at \emph{most} $\alpha \cdot
\OPT$. Unless otherwise specified, we will also insist that $A$
runs in polynomial time.
\end{definition}
\begin{remark}
Our definition for maximization problems is sometimes considered
unconventional; some like to always have $\alpha \geq 1$, in which
case their notion is that the algorithm $A$ returns a solution with
value at least $\OPT/\alpha$.
\end{remark}
\begin{definition} A maximization (resp., minimization)
combinatorial optimization problem is said to have a \emph{PTAS}
(Polynomial Time Approximation Scheme) if it has a
$(1-\eps)$-approximation algorithm (resp., $(1+\eps)$-approximation
algorithm) for every constant $\eps > 0$.
\end{definition}
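To make the PTAS notion concrete, here is a minimal Python sketch of the standard scale-and-round dynamic program for KNAPSACK (in the spirit of Ibarra--Kim, not their exact algorithm); the toy instance at the end is hypothetical, and for the $(1-\eps)$ guarantee we implicitly assume the maximum-value item fits by itself:

```python
def knapsack_fptas(weights, values, capacity, eps):
    """Scale-and-round FPTAS sketch: round values down to multiples of
    K = eps * vmax / n, then run an exact dynamic program indexed by
    total scaled value.  Returns a list of item indices whose value is
    at least (1 - eps) * OPT, in time poly(n / eps)."""
    n = len(values)
    vmax = max(values)
    K = eps * vmax / n                      # rounding granularity
    scaled = [int(v // K) for v in values]  # each item loses < K in value
    vbound = sum(scaled)
    INF = float('inf')
    # dp[t] = (minimum weight, item list) over subsets of scaled value exactly t.
    dp = [(0, [])] + [(INF, None)] * vbound
    for i in range(n):
        for t in range(vbound, scaled[i] - 1, -1):
            w_prev, sel = dp[t - scaled[i]]
            if w_prev + weights[i] < dp[t][0]:
                dp[t] = (w_prev + weights[i], sel + [i])
    # Largest scaled value achievable within the capacity.
    best = max(t for t in range(vbound + 1) if dp[t][0] <= capacity)
    return dp[best][1]

# Hypothetical toy instance: the optimum is value 7 (items 0 and 1).
sel = knapsack_fptas([2, 3, 4], [3, 4, 5], capacity=5, eps=0.1)
```

The point of the rounding is that the DP table has only $\mathrm{poly}(n/\eps)$ entries, while each item's value drops by less than $K$, so the total loss is below $nK = \eps \cdot v_{\max} \leq \eps \cdot \OPT$.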
As mentioned, the KNAPSACK problem has a PTAS, and this is true of
certain other combinatorial optimization problems, mostly related to
scheduling and packing. But what about, say, MAX-E3SAT? It is a
remarkable consequence of the PCP Theorem that MAX-E3SAT has no PTAS
unless $\P = \NP$. In fact, the two statements are basically
equivalent!
\begin{theorem} \label{thm:no-ptas} (credited to an unpublished 1992 result of Arora-Motwani-Safra-Sudan-Szegedy): Speaking roughly, the PCP Theorem is equivalent to the statement,
``MAX-E3SAT has no PTAS assuming $\P \neq \NP$''.
\end{theorem}
We will precisely formulate and prove this theorem in the next
lecture.\\
Indeed, most work in the PCP area these days is centered on proving
``hardness of approximation'' results like
Theorem~\ref{thm:no-ptas}. We will state here a few striking
``optimal hardness-of-approximation results'' that have followed
from work on PCP theorems.
\begin{theorem} (follows from H{\aa}stad's Theorem~\ref{thm:hastad}):
MAX-E3SAT has no $(7/8 + \eps)$-approximation algorithm for any
constant $\eps > 0$ unless $\P = \NP$.
\end{theorem}
We will see the proof of this in the course.
\begin{remark}
There is a very easy $7/8$-approximation algorithm for MAX-E3SAT:
Just picking a random assignment gives a $7/8$-approximation in
expectation, and this algorithm is easily derandomized.
\end{remark}
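The expectation claim can be checked directly: each clause over 3 distinct variables is falsified by exactly $1/8$ of all assignments, so by linearity of expectation a uniformly random assignment satisfies $(7/8)m \geq (7/8)\cdot\OPT$ clauses on average (the derandomization alluded to is typically by the method of conditional expectations). A small Python sketch, on a hypothetical toy formula:

```python
from itertools import product

# Hypothetical toy E3CNF formula over x_1, ..., x_4: literal +i stands for
# x_i and -i for its negation; each clause has 3 distinct variables.
clauses = [(+1, -2, +3), (-1, +2, -4), (+2, +3, +4), (-2, -3, +1)]
n = 4

def satisfied(clause, assignment):
    # A clause is an OR of its literals.
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

# Average the number of satisfied clauses over ALL 2^n assignments.
total = 0
for bits in product([False, True], repeat=n):
    assignment = {i + 1: b for i, b in enumerate(bits)}
    total += sum(satisfied(c, assignment) for c in clauses)
avg = total / 2 ** n   # equals (7/8) * 4 = 3.5 for this formula
```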
Regarding SET-COVER:
\begin{theorem} (Feige~\cite{Fei98}): SET-COVER has no $(1-\eps) \ln n$ approximation algorithm
for any constant $\eps > 0$ unless $\NP \subseteq \DTIME(n^{\log
\log n})$.
\end{theorem}
\begin{remark} The greedy algorithm is a $(\ln n +
1)$-approximation algorithm (Johnson~\cite{Joh74}).
\end{remark}
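The greedy algorithm of the remark is easy to state in code; here is a minimal Python sketch on a hypothetical toy instance (as in the definition of SET-COVER, we assume every ground element lies in some set, so the loop terminates):

```python
def greedy_set_cover(universe, sets):
    """Repeatedly pick the set covering the most still-uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Index of the set covering the most uncovered elements.
        best = max(range(len(sets)), key=lambda j: len(sets[j] & uncovered))
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

# Hypothetical toy instance over ground set {1, ..., 5}.
sets = [{1, 2, 3}, {3, 4}, {4, 5}, {1, 5}]
cover = greedy_set_cover({1, 2, 3, 4, 5}, sets)
```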
\begin{remark} The fact that we have the conclusion ``unless $\NP \subseteq \DTIME(n^{\log \log
n})$'' is due to technical difficulties; however since the
conclusion is almost as unlikely as $\NP = \P$, we don't really mind
much.
\end{remark}
We won't prove Feige's theorem about SET-COVER in this course, but we
will prove a weaker result, due to Lund and Yannakakis~\cite{LY94},
that shows hardness of giving an $\Omega(\log n)$-approximation.\\
The situation for MAX-CLIQUE is the direst of all:
\begin{theorem} (due to H{\aa}stad~\cite{Has99}, with a significant simplification by Samorodnitsky-Trevisan~\cite{ST00}, another simplification by H{\aa}stad and Wigderson~\cite{HW03},
and a slight improvement by Zuckerman in September
2005~\cite{Zuc05}): MAX-CLIQUE has no $(1/n^{1-\eps})$-approximation
for any constant $\eps > 0$ unless $\P = \NP$.
\end{theorem}
\begin{remark} There is a trivial $1/n$-approximation: Output a
single vertex.
\end{remark}
We won't prove this theorem in the course, but the weaker result
that $(1/n^{\Omega(1)})$-approximating is hard follows relatively
easily from the main PCP Theorem, via a reduction given by
Feige-Goldwasser-Lovasz-Sudan-Szegedy~\cite{FGLSS96}. We will give
this reduction later in the course.
\bibliographystyle{abbrv}
\bibliography{lecture1}
\end{document}