We say a language \(L\) is in \(\mathsf{NP}\) if there exists a polynomial time verifier TM \(V\) and a constant \(k > 0\) such that for all \(x \in \Sigma^*\): If \(x \in L\), then there exists a certificate \(u\) with \(\lvert u \rvert \leq \lvert x \rvert^k\) such that \(V(x, u)\) accepts. If \(x \notin L\), then for all \(u \in \Sigma^*\), \(V(x, u)\) rejects.

We say there is a

**polynomial-time mapping reduction**from \(A\) to \(B\) if there is a**polynomial-time**computable function \(f : \Sigma^* \rightarrow \Sigma^*\) such that \(x \in A\) if and only if \(f(x) \in B\). We write this as \(A \le^P_m B\). (We also refer to these reductions as Karp reductions.)A language \(Y\) is \(\mathsf{NP}\)-hard if for every language \(X \in \mathsf{NP}\), \(X\) polynomial-time reduces to \(Y\).

One can define hardness using either a Cook reduction (polynomial-time Turing reduction) or a Karp reduction. These lead to different definitions for \(\mathsf{NP}\)-hardness. In this course, we use Karp reductions to define \(\mathsf{NP}\)-hardness.

A language is \(\mathsf{NP}\)-complete if it is in \(\mathsf{NP}\) and is \(\mathsf{NP}\)-hard.

Imagine there existing an untrustworthy, omnipotent (computationally unbounded) **Prover** who likes to make claims about membership in a language \(L\). On the other hand, you are a **Verifier** who can merely compute things that run in polynomial time. You are interested in verifying if a string is in \(L\).

The **Prover** claims to you that a certain \(x \in L\). In order to convince you, the **Prover** uses its unlimited computational power to provide a polynomial length (with respect to \(x\)) certificate/proof to you. You then use the certificate to verify whether \(x\) is truly in \(L\). If \(L \in \mathsf{NP}\) then:

Can the

**Prover**convince you for every \(x \in L\) that \(x\) truly is a member of \(L\)?Can the

**Prover**ever fool you into thinking some \(x \in L\) when really \(x \not \in L\)?

Conversely, if \(L\) is such a language so that **Prover** can always provide you with polynomial length proofs for \(x \in L\), and is never able to deceive you for \(x \not \in L\) then is \(L \in \mathsf{NP}\)?

If \(L \in \mathsf{NP}\) there exists some poly-time verifier TM \(V\) that the **Verifier** can use to verify whether a string \(x \in L\). If \(x \in L\), then there exists some poly-length certificate such that the verifier TM accepts, so **Prover** should be able to convince **Verifier** by providing this certificate/proof. On the other hand, if \(x \not\in L\), the verifier TM \(V\) is such that \(\forall u \in \Sigma^{*}\), \(V(x,u)\) rejects, so if the **Verifier** uses \(V\) then there is nothing \(\textbf{Prover}\) can do to produce a proof which convinces \(\textbf{Verifier}\) that \(x \in L\) when really \(x \not \in L\).

For the converse note that by definition, the algorithm used by **Verifier** constitutes a poly-time verifier for \(L\). If \(x \in L\) then there exists a poly-length certificate, namely one that the **Prover** could use to convince the **Verifier**. If \(x \not \in L\) then regardless of what proof **Prover** sends over, the algorithm used by **Verifier** will not accept.

Let \(G = (V, E)\) be a graph. A clique in \(G\) is a set \(S \subseteq V\) such that for any \(u, v \in S\), \(\{u,v\} \in E\) (so a clique is a set of vertices such that any two vertices in the set is connected by an edge). We are interested in the following problem that we call \(\text{DCLIQUE}\) (double clique): Given a graph \(G=(V,E)\) and \(k \in \mathbb{N}^+\), does \(G\) contain two vertex-disjoint cliques of size \(k\) each? As a language, this corresponds to: \[\text{DCLIQUE} = \{ \langle G, k \rangle \text{ : $G$ is a graph, $k \in \mathbb{N}^+$, $G$ contains two vertex-disjoint cliques of size $k$} \}\]

Show \(\text{DCLIQUE}\) is \(\mathsf{NP}\)-complete.

This problem is in \(\mathsf{NP}\). Here is the description of a polynomial-time verifier TM.

\[\begin{aligned} &\textbf{def}\;\; A(\langle \text{graph } G = (V,E), \text{positive natural } k \rangle, \; u ):\\ &~~~~\texttt{If $u$ does not correspond to two sets $S, T \subseteq V$ with $|S| = |T| = k$, reject.}\\ &~~~~\texttt{For each pair of vertices $u, v \in S$:}\\ &~~~~~~~~\texttt{If $\{u,v\} \not \in E$, reject.}\\ &~~~~\texttt{For each pair of vertices $u, v \in T$:}\\ &~~~~~~~~\texttt{If $\{u,v\} \not \in E$, reject.}\\ &~~~~\texttt{For each vertex $u \in S$:}\\ &~~~~~~~~\texttt{If $u \in T$, reject.}\\ &~~~~\texttt{Accept.}\\\end{aligned}\]

To prove its correctness, we have to argue the following.

If \(x \in \text{DCLIQUE}\), there is some proof string \(u\) of polynomial-length that makes \(A\) accept.

If \(x \in \text{DCLIQUE}\), then \(x = \langle G, k \rangle\) is a valid encoding and \(G\) contains 2 disjoint cliques of size \(k\). When \(u\) is a valid encoding of a list of the two disjoint cliques, the verifier will accept (as in this case, none of the

`reject`

instructions will be reached).If \(x \notin \text{DCLIQUE}\), no matter what \(u\) is, \(A\) REJECTS.

If \(x \notin \text{DCLIQUE}\), then there are 2 options: \(x\) is not a valid encoding \(\langle G, k \rangle\), or \(x\) is valid but \(G\) does not contain 2 disjoint cliques of size \(k\). In the first case, \(A\) rejects (this is the implicit “type checker”). In the second case, \(A\) also rejects, because by design, the only way the verifier can get to the last line and accept is if the graph contains two disjoint cliques of size \(k\) each.

\(A\) is polynomial-time.

It is relatively easy to see that each line of the algorithm can be carried out in polynomial time with respect to the number of vertices in the input graph.

To show \(\text{DCLIQUE}\) is \(\mathsf{NP}\)-hard, we will Karp-reduce \(\text{CLIQUE}\) to \(\text{DCLIQUE}\). So given an instance of \(\text{CLIQUE}\), \(\langle G, k \rangle\), we want to transform it to an instance of \(\text{DCLIQUE}\), \(\langle G', k' \rangle\), so that \(G\) has a clique of size \(k\) if and only if \(G'\) has two disjoint cliques of size \(k'\). The main idea behind the reduction is to define \(G'\) as two disjoint copies of \(G\). This way for any clique in \(G\), there will be two disjoint copies of it in \(G'\). We set \(k' = k\).

Here is the definition of our Karp reduction \(f: \Sigma^* \rightarrow \Sigma^*\).

\[\begin{aligned} &\textbf{def}\;\; f(\langle \text{graph } G = (V,E), \; \text{positive natural } k\rangle):\\ &~~~~\texttt{$V^1 = \{v^1 : v \in V\}$.}\\ &~~~~\texttt{$E^1 = \{\{u^1,v^1\} : \{u,v\} \in E\}$.}\\ &~~~~\texttt{$V^2 = \{v^2 : v \in V\}$.}\\ &~~~~\texttt{$E^2 = \{\{u^2,v^2\} : \{u,v\} \in E\}$.}\\ &~~~~\texttt{Return $\langle G' = (V^1 \cup V^2, E^1 \cup E^2), k \rangle$.}\\\end{aligned}\]

We now prove the correctness of the reduction.

If \(x \in \text{CLIQUE}\), then \(x = \langle G, k \rangle\) where \(G\) contains a clique of size \(k\). Since \(G'\) contains two disjoint copies of \(G\) and each of those copies will contain a clique of size \(k\), \(G'\) contains two disjoint cliques of size \(k\). This means that \(f(x) \in \text{DCLIQUE}\).

If \(f(x) \in \text{DCLIQUE}\), then \(f(x) = \langle G', k \rangle\) and \(G'\) contains two disjoint cliques of size \(k\) each. We know by construction of \(f\) that \(x = \langle G, k \rangle\) and \(G'\) consists of two disjoint copies \(G^1 = (V^1, E^1)\) and \(G^2 = (V^2, E^2)\) of \(G\). Since the copies are disjoint, a clique in \(G'\) of size \(k\) is either completely contained in \(V^1\) or is completely contained in \(V^2\). This means the original graph \(G\) must have a clique of size \(k\). That is, \(x = \langle G, k \rangle \in \text{CLIQUE}\).

The function \(f\) is polynomial time computable since given a graph \(G\), we can create two copies of it in polynomial time.

Since \(\text{DCLIQUE}\) is both in \(\mathsf{NP}\) and \(\mathsf{NP}\)-hard, it is \(\mathsf{NP}\)-complete.

Let \(G = (V, E)\) be a graph. A vertex cover in \(G\) is a set \(C \subseteq V\) such that for every edge \(\{x,y\} \in E\), either \(x \in C\) or \(y \in C\) (a set of vertices such that every edge is incident to at least one vertex in the set). An independent set in \(G\) is a set \(S \subseteq V\) such that for any \(u, v \in S\), \(\{u,v\} \not\in E\) (a set of vertices such that no edge connects two vertices in the set). Define the following languages:

VERTEX-COVER \(= \{ \langle G, k \rangle \text{ : $G$ is a graph, $k \in \mathbb{N}$, $G$ contains a vertex cover of size $k$} \}\)

IND-SET \(= \{ \langle G, k \rangle \text{ : $G$ is a graph, $k \in \mathbb{N}$, $G$ contains an independent set of size $k$} \}\)

Show that \(\text{VERTEX-COVER} \le_m^P \text{IND-SET}\).

We will define a polynomial-time computable function \(f: \Sigma^* \rightarrow \Sigma^*\) such that \[x \in \text{VERTEX-COVER} \leftrightarrow f(x) \in \text{IND-SET}.\]

Here is the definition of our reduction: \[\begin{aligned} &\textbf{def}\;\; f(\langle \text{graph } G = (V,E), \; \text{positive natural } k\rangle):\\ & ~~~~\texttt{Return $\langle G, |V| - k \rangle$.}\end{aligned}\]

To prove the correctness of this reduction, we need to argue that (i) if \(x \in \text{VERTEX-COVER}\), then \(f(x) \in \text{IND-SET}\); (ii) if \(f(x) \in \text{IND-SET}\), then \(x \in \text{VERTEX-COVER}\); (iii) \(f\) is polynomial-time computable. The first two points will follow from the observation that in a graph \(G = (V,E)\), a set of vertices \(S\subseteq V\) is a vertex cover if and only if \(V \backslash S\) is an independent set.

(i) If \(x \in \text{VERTEX-COVER}\) then \(x\) is a valid encoding \(\langle G = (V, E), k \rangle\). In addition, there exists a vertex cover of size \(k\). Let \(S\) be the subset of vertices in the vertex cover and consider \(V \setminus S\). Observe that \(V \setminus S\) is an independent set. To see why, consider any pair of vertices \(u, v \in V \setminus S\). Because neither \(u\) nor \(v\) are in \(S\) but \(S\) is a vertex cover, we know that \((u, v) \notin E\). In addition, \(\lvert V \setminus S \rvert = |V| - k\), so there exists a independent set of size \(|V| - k\). Hence \(f(x) \in \text{IND-SET}\)

(ii) If \(f(x) \in \text{IND-SET}\), then \(G\) must has an independent set of size \(|V| - k\). Let \(S\) be the subset of vertices in the independent set. Similar to the previous direction, \(V \setminus S\) is a vertex cover. To show this, consider an arbitrary edge \((u, v) \in E\). Because \(S\) is an independent set, we know one of \(u, v \notin S\). Equivalently, one of \(u, v \in V \setminus S\). Thus, all edges are covered by some vertex in \(V \setminus S\) so \(G\) has a vertex cover of size \(|V| - (|V| - k) = k\). So \(x \in \text{VERTEX-COVER}\).

(iii) \(f\) is poly-time since we just need verify that the encoding is valid, count the number of vertices, and then compute \(|V| - k\), which can all be done in poly-time.

Recall that \(\text{HALTS} = \{\langle M,x \rangle: \text{$M$ is a TM, $x \in \Sigma^*$, and $M(x)$ halts}\}\).

To show HALTS is \(\mathsf{NP}\)-hard, we will reduce 3SAT to HALTS. First note that 3SAT is decidable since it is in \(\mathsf{NP}\). In particular, we can do a brute-force search: Given a 3CNF formula with \(n\) variables, try all \(2^n\) possibilities for the variable assignment, then check if that assignment satisfies all the clauses.

We now present a Karp reduction from 3SAT to HALTS. \[\begin{aligned} &\textbf{def}\;\; f(\langle \text{3CNF formula } \varphi \rangle):\\ & ~~~~\texttt{def HELP(y):}\\ & ~~~~~~~~\texttt{brute force to check if $\varphi$ is satisfiable}\\ & ~~~~~~~~\texttt{if $\varphi$ is satisfiable, then halt}\\ & ~~~~~~~~\texttt{if $\varphi$ is not satisfiable, then loop}\\ & ~~~~\texttt{return } \langle \texttt{HELP}, \varepsilon \rangle.\end{aligned}\]

This transformation can be carried out in polynomial time since the description of HELP can be created in polynomial time. This is because creating the description of HELP only requires \(\varphi\) to be hard-coded; the remainder of the description is just a constant.

**Note:** Notice that HELP itself is not a polynomial-time algorithm. This does not contradict the fact that \(f\) is polynomial time computable since \(f\) does not run HELP. It merely creates the string representation of it.

We now prove that the reduction is correct: \[\begin{aligned} x \in \text{3SAT} & \Longleftrightarrow x = \langle \varphi \rangle \text{ where $\varphi$ is a satisfiable 3CNF formula} \\ & \Longleftrightarrow \text{HELP} \text{ halts no matter what its input is}\\ & \Longleftrightarrow \langle \text{HELP}, \varepsilon \rangle \in \text{HALTS}.\end{aligned}\]