MODULE 8:   P vs NP
Recitation 6
1
Definitions
• We say a language $$L$$ is in $$\mathsf{NP}$$ if there exists a polynomial time verifier TM $$V$$ and a constant $$k > 0$$ such that for all $$x \in \Sigma^*$$: If $$x \in L$$, then there exists a certificate $$u$$ with $$\lvert u \rvert \leq \lvert x \rvert^k$$ such that $$V(x, u)$$ accepts. If $$x \notin L$$, then for all $$u \in \Sigma^*$$, $$V(x, u)$$ rejects.

• We say there is a polynomial-time mapping reduction from $$A$$ to $$B$$ if there is a polynomial-time computable function $$f : \Sigma^* \rightarrow \Sigma^*$$ such that $$x \in A$$ if and only if $$f(x) \in B$$. We write this as $$A \le^P_m B$$. (We also refer to these reductions as Karp reductions.)

• A language $$Y$$ is $$\mathsf{NP}$$-hard if for every language $$X \in \mathsf{NP}$$, $$X$$ polynomial-time reduces to $$Y$$.

• One can define hardness using either a Cook reduction (polynomial-time Turing reduction) or a Karp reduction. These lead to different definitions for $$\mathsf{NP}$$-hardness. In this course, we use Karp reductions to define $$\mathsf{NP}$$-hardness. • A language is $$\mathsf{NP}$$-complete if it is in $$\mathsf{NP}$$ and is $$\mathsf{NP}$$-hard.

2
New Point of View

Imagine there existing an untrustworthy, omnipotent (computationally unbounded) Prover who likes to make claims about membership in a language $$L$$. On the other hand, you are a Verifier who can merely compute things that run in polynomial time. You are interested in verifying if a string is in $$L$$.

The Prover claims to you that a certain $$x \in L$$. In order to convince you, the Prover uses its unlimited computational power to provide a polynomial length (with respect to $$x$$) certificate/proof to you. You then use the certificate to verify whether $$x$$ is truly in $$L$$. If $$L \in \mathsf{NP}$$ then:

1. Can the Prover convince you for every $$x \in L$$ that $$x$$ truly is a member of $$L$$?

2. Can the Prover ever fool you into thinking some $$x \in L$$ when really $$x \not \in L$$?

Conversely, if $$L$$ is such a language so that Prover can always provide you with polynomial length proofs for $$x \in L$$, and is never able to deceive you for $$x \not \in L$$ then is $$L \in \mathsf{NP}$$?

Solution

If $$L \in \mathsf{NP}$$ there exists some poly-time verifier TM $$V$$ that the Verifier can use to verify whether a string $$x \in L$$. If $$x \in L$$, then there exists some poly-length certificate such that the verifier TM accepts, so Prover should be able to convince Verifier by providing this certificate/proof. On the other hand, if $$x \not\in L$$, the verifier TM $$V$$ is such that $$\forall u \in \Sigma^{*}$$, $$V(x,u)$$ rejects, so if the Verifier uses $$V$$ then there is nothing $$\textbf{Prover}$$ can do to produce a proof which convinces $$\textbf{Verifier}$$ that $$x \in L$$ when really $$x \not \in L$$.

For the converse note that by definition, the algorithm used by Verifier constitutes a poly-time verifier for $$L$$. If $$x \in L$$ then there exists a poly-length certificate, namely one that the Prover could use to convince the Verifier. If $$x \not \in L$$ then regardless of what proof Prover sends over, the algorithm used by Verifier will not accept.

3
Network Pairs

Let $$G = (V, E)$$ be a graph. A clique in $$G$$ is a set $$S \subseteq V$$ such that for any $$u, v \in S$$, $$\{u,v\} \in E$$ (so a clique is a set of vertices such that any two vertices in the set is connected by an edge). We are interested in the following problem that we call $$\text{DCLIQUE}$$ (double clique): Given a graph $$G=(V,E)$$ and $$k \in \mathbb{N}^+$$, does $$G$$ contain two vertex-disjoint cliques of size $$k$$ each? As a language, this corresponds to: $\text{DCLIQUE} = \{ \langle G, k \rangle \text{ : G is a graph, k \in \mathbb{N}^+, G contains two vertex-disjoint cliques of size k} \}$

Show $$\text{DCLIQUE}$$ is $$\mathsf{NP}$$-complete.

Solution

This problem is in $$\mathsf{NP}$$. Here is the description of a polynomial-time verifier TM.

\begin{aligned} &\textbf{def}\;\; A(\langle \text{graph } G = (V,E), \text{positive natural } k \rangle, \; u ):\\ &~~~~\texttt{If u does not correspond to two sets S, T \subseteq V with |S| = |T| = k, reject.}\\ &~~~~\texttt{For each pair of vertices u, v \in S:}\\ &~~~~~~~~\texttt{If \{u,v\} \not \in E, reject.}\\ &~~~~\texttt{For each pair of vertices u, v \in T:}\\ &~~~~~~~~\texttt{If \{u,v\} \not \in E, reject.}\\ &~~~~\texttt{For each vertex u \in S:}\\ &~~~~~~~~\texttt{If u \in T, reject.}\\ &~~~~\texttt{Accept.}\\\end{aligned}

To prove its correctness, we have to argue the following.

1. If $$x \in \text{DCLIQUE}$$, there is some proof string $$u$$ of polynomial-length that makes $$A$$ accept.

If $$x \in \text{DCLIQUE}$$, then $$x = \langle G, k \rangle$$ is a valid encoding and $$G$$ contains 2 disjoint cliques of size $$k$$. When $$u$$ is a valid encoding of a list of the two disjoint cliques, the verifier will accept (as in this case, none of the reject instructions will be reached).

2. If $$x \notin \text{DCLIQUE}$$, no matter what $$u$$ is, $$A$$ REJECTS.

If $$x \notin \text{DCLIQUE}$$, then there are 2 options: $$x$$ is not a valid encoding $$\langle G, k \rangle$$, or $$x$$ is valid but $$G$$ does not contain 2 disjoint cliques of size $$k$$. In the first case, $$A$$ rejects (this is the implicit “type checker”). In the second case, $$A$$ also rejects, because by design, the only way the verifier can get to the last line and accept is if the graph contains two disjoint cliques of size $$k$$ each.

3. $$A$$ is polynomial-time.

It is relatively easy to see that each line of the algorithm can be carried out in polynomial time with respect to the number of vertices in the input graph.

To show $$\text{DCLIQUE}$$ is $$\mathsf{NP}$$-hard, we will Karp-reduce $$\text{CLIQUE}$$ to $$\text{DCLIQUE}$$. So given an instance of $$\text{CLIQUE}$$, $$\langle G, k \rangle$$, we want to transform it to an instance of $$\text{DCLIQUE}$$, $$\langle G', k' \rangle$$, so that $$G$$ has a clique of size $$k$$ if and only if $$G'$$ has two disjoint cliques of size $$k'$$. The main idea behind the reduction is to define $$G'$$ as two disjoint copies of $$G$$. This way for any clique in $$G$$, there will be two disjoint copies of it in $$G'$$. We set $$k' = k$$.

Here is the definition of our Karp reduction $$f: \Sigma^* \rightarrow \Sigma^*$$.

\begin{aligned} &\textbf{def}\;\; f(\langle \text{graph } G = (V,E), \; \text{positive natural } k\rangle):\\ &~~~~\texttt{V^1 = \{v^1 : v \in V\}.}\\ &~~~~\texttt{E^1 = \{\{u^1,v^1\} : \{u,v\} \in E\}.}\\ &~~~~\texttt{V^2 = \{v^2 : v \in V\}.}\\ &~~~~\texttt{E^2 = \{\{u^2,v^2\} : \{u,v\} \in E\}.}\\ &~~~~\texttt{Return \langle G' = (V^1 \cup V^2, E^1 \cup E^2), k \rangle.}\\\end{aligned}

We now prove the correctness of the reduction.

• If $$x \in \text{CLIQUE}$$, then $$x = \langle G, k \rangle$$ where $$G$$ contains a clique of size $$k$$. Since $$G'$$ contains two disjoint copies of $$G$$ and each of those copies will contain a clique of size $$k$$, $$G'$$ contains two disjoint cliques of size $$k$$. This means that $$f(x) \in \text{DCLIQUE}$$.

• If $$f(x) \in \text{DCLIQUE}$$, then $$f(x) = \langle G', k \rangle$$ and $$G'$$ contains two disjoint cliques of size $$k$$ each. We know by construction of $$f$$ that $$x = \langle G, k \rangle$$ and $$G'$$ consists of two disjoint copies $$G^1 = (V^1, E^1)$$ and $$G^2 = (V^2, E^2)$$ of $$G$$. Since the copies are disjoint, a clique in $$G'$$ of size $$k$$ is either completely contained in $$V^1$$ or is completely contained in $$V^2$$. This means the original graph $$G$$ must have a clique of size $$k$$. That is, $$x = \langle G, k \rangle \in \text{CLIQUE}$$.

• The function $$f$$ is polynomial time computable since given a graph $$G$$, we can create two copies of it in polynomial time.

Since $$\text{DCLIQUE}$$ is both in $$\mathsf{NP}$$ and $$\mathsf{NP}$$-hard, it is $$\mathsf{NP}$$-complete.

4
No Peeking

Let $$G = (V, E)$$ be a graph. A vertex cover in $$G$$ is a set $$C \subseteq V$$ such that for every edge $$\{x,y\} \in E$$, either $$x \in C$$ or $$y \in C$$ (a set of vertices such that every edge is incident to at least one vertex in the set). An independent set in $$G$$ is a set $$S \subseteq V$$ such that for any $$u, v \in S$$, $$\{u,v\} \not\in E$$ (a set of vertices such that no edge connects two vertices in the set). Define the following languages:

VERTEX-COVER $$= \{ \langle G, k \rangle \text{ : G is a graph, k \in \mathbb{N}, G contains a vertex cover of size k} \}$$

IND-SET $$= \{ \langle G, k \rangle \text{ : G is a graph, k \in \mathbb{N}, G contains an independent set of size k} \}$$

Show that $$\text{VERTEX-COVER} \le_m^P \text{IND-SET}$$.

Solution

We will define a polynomial-time computable function $$f: \Sigma^* \rightarrow \Sigma^*$$ such that $x \in \text{VERTEX-COVER} \leftrightarrow f(x) \in \text{IND-SET}.$

Here is the definition of our reduction: \begin{aligned} &\textbf{def}\;\; f(\langle \text{graph } G = (V,E), \; \text{positive natural } k\rangle):\\ & ~~~~\texttt{Return \langle G, |V| - k \rangle.}\end{aligned}

To prove the correctness of this reduction, we need to argue that (i) if $$x \in \text{VERTEX-COVER}$$, then $$f(x) \in \text{IND-SET}$$; (ii) if $$f(x) \in \text{IND-SET}$$, then $$x \in \text{VERTEX-COVER}$$; (iii) $$f$$ is polynomial-time computable. The first two points will follow from the observation that in a graph $$G = (V,E)$$, a set of vertices $$S\subseteq V$$ is a vertex cover if and only if $$V \backslash S$$ is an independent set.

• (i) If $$x \in \text{VERTEX-COVER}$$ then $$x$$ is a valid encoding $$\langle G = (V, E), k \rangle$$. In addition, there exists a vertex cover of size $$k$$. Let $$S$$ be the subset of vertices in the vertex cover and consider $$V \setminus S$$. Observe that $$V \setminus S$$ is an independent set. To see why, consider any pair of vertices $$u, v \in V \setminus S$$. Because neither $$u$$ nor $$v$$ are in $$S$$ but $$S$$ is a vertex cover, we know that $$(u, v) \notin E$$. In addition, $$\lvert V \setminus S \rvert = |V| - k$$, so there exists a independent set of size $$|V| - k$$. Hence $$f(x) \in \text{IND-SET}$$

• (ii) If $$f(x) \in \text{IND-SET}$$, then $$G$$ must has an independent set of size $$|V| - k$$. Let $$S$$ be the subset of vertices in the independent set. Similar to the previous direction, $$V \setminus S$$ is a vertex cover. To show this, consider an arbitrary edge $$(u, v) \in E$$. Because $$S$$ is an independent set, we know one of $$u, v \notin S$$. Equivalently, one of $$u, v \in V \setminus S$$. Thus, all edges are covered by some vertex in $$V \setminus S$$ so $$G$$ has a vertex cover of size $$|V| - (|V| - k) = k$$. So $$x \in \text{VERTEX-COVER}$$.

• (iii) $$f$$ is poly-time since we just need verify that the encoding is valid, count the number of vertices, and then compute $$|V| - k$$, which can all be done in poly-time.

5
(Extra) Never Pausing

Prove that HALTS is $$\mathsf{NP}$$-hard.

Solution

Recall that $$\text{HALTS} = \{\langle M,x \rangle: \text{M is a TM, x \in \Sigma^*, and M(x) halts}\}$$.

To show HALTS is $$\mathsf{NP}$$-hard, we will reduce 3SAT to HALTS. First note that 3SAT is decidable since it is in $$\mathsf{NP}$$. In particular, we can do a brute-force search: Given a 3CNF formula with $$n$$ variables, try all $$2^n$$ possibilities for the variable assignment, then check if that assignment satisfies all the clauses.

We now present a Karp reduction from 3SAT to HALTS. \begin{aligned} &\textbf{def}\;\; f(\langle \text{3CNF formula } \varphi \rangle):\\ & ~~~~\texttt{def HELP(y):}\\ & ~~~~~~~~\texttt{brute force to check if \varphi is satisfiable}\\ & ~~~~~~~~\texttt{if \varphi is satisfiable, then halt}\\ & ~~~~~~~~\texttt{if \varphi is not satisfiable, then loop}\\ & ~~~~\texttt{return } \langle \texttt{HELP}, \varepsilon \rangle.\end{aligned}

This transformation can be carried out in polynomial time since the description of HELP can be created in polynomial time. This is because creating the description of HELP only requires $$\varphi$$ to be hard-coded; the remainder of the description is just a constant.

Note: Notice that HELP itself is not a polynomial-time algorithm. This does not contradict the fact that $$f$$ is polynomial time computable since $$f$$ does not run HELP. It merely creates the string representation of it.

We now prove that the reduction is correct: \begin{aligned} x \in \text{3SAT} & \Longleftrightarrow x = \langle \varphi \rangle \text{ where \varphi is a satisfiable 3CNF formula} \\ & \Longleftrightarrow \text{HELP} \text{ halts no matter what its input is}\\ & \Longleftrightarrow \langle \text{HELP}, \varepsilon \rangle \in \text{HALTS}.\end{aligned}