
In this chapter, we give the formal definition of the complexity class (which stands for “non-deterministic polynomial time”), along with various examples. We then present examples of -complete languages, which represent the hardest languages in .
Fix some alphabet . We say that a language can be decided in non-deterministic polynomial time if there exists
a polynomial-time decider TM that takes two strings as input, and
a constant ,
such that for all :
if , then there exists with such that accepts,
if , then for all , rejects.
If , a string that makes accept is called a proof (or certificate) of being in . The TM is called a verifier for .
We denote by the set of all languages which can be decided in non-deterministic polynomial time.
To show 3COL is in , we need to show that there is a polynomial-time verifier TM with the properties stated in Definition (Non-deterministic polynomial time, complexity class ) (we are using to denote the verifier and not because we will use to denote the vertex set of a graph). Recall that an instance of the 3COL problem is an undirected graph . The description of is as follows.
We now show that satisfies the two conditions stated in Definition (Non-deterministic polynomial time, complexity class ). If is in the language, that means is a valid encoding of a graph and this graph is -colorable. When is a -coloring of , . And for this and , the verifier accepts. On the other hand, if is not in the language, then either (i) is not a valid encoding of a graph or (ii) it is a valid encoding of a graph which is not -colorable. In case (i), the verifier rejects (which is done implicitly since the input is not of the correct type). In case (ii), any that does not correspond to a -coloring of the vertices makes the verifier reject. Furthermore, any that does correspond to a -coloring of the vertices must be such that there is an edge whose endpoints are colored with the same color. Therefore, in this case, the verifier again rejects, as desired.
Now we show that the machine is polynomial-time. To check whether is a -coloring of the vertices takes polynomial time since you just need to check that you are given colors, each being one of colors. To check that it is indeed a legal -coloring is polynomial time as well since you just need to go through every edge once and check for each that the endpoints are different colors.
This completes the proof that 3COL is in .
Showing that a language is in involves the following steps:
Present a TM (that takes two inputs and ).
Argue that has polynomial running time.
Argue that works correctly, which involves arguing the following for some constant :
for all , there exists with such that accepts;
for all and for all , rejects.
To show CIRCUIT-SAT is in , we need to show that there is a polynomial-time verifier with the properties stated in Definition (Non-deterministic polynomial time, complexity class ). We start by presenting .
We first show that the verifier satisfies the two conditions stated in Definition (Non-deterministic polynomial time, complexity class ). If is in the language, that means that corresponds to a valid encoding of a circuit and there is some -assignment to the input gates that makes the circuit output . When is such a -assignment, then (where is the length of ), and the verifier accepts the input . On the other hand, if is not in the language, then either (i) is not a valid encoding of a circuit or (ii) it is a valid encoding of a circuit which is not satisfiable. In case (i), the verifier rejects (which is done implicitly since the input is not of the correct type). In case (ii), any that does not correspond to a -assignment to the input gates makes the verifier reject. Furthermore, any that does correspond to a -assignment to the input gates must be such that, with this assignment, the circuit evaluates to . Therefore, in this case, the verifier again rejects, as desired.
Now we show the verifier is polynomial-time. To check whether is a valid -assignment to the input gates takes polynomial time since you just need to check that you are given bits, where is the number of input gates. The output of the circuit can be computed in polynomial time since it takes constant number of steps to compute each gate.
This completes the proof of CIRCUIT-SAT is in .
To show CLIQUE is in , we need to show that there is a polynomial-time verifier with the properties stated in Definition (Non-deterministic polynomial time, complexity class ). We start by presenting . (We are using to denote the verifier and not because we will use to denote the vertex set of a graph.)
We first show that the verifier satisfies the two conditions stated in Definition (Non-deterministic polynomial time, complexity class ). If is in the language, that means that corresponds to a valid encoding of a graph together with a number . Furthermore, it must be the case that there is some with such that forms a -clique. When is the encoding of such an , then where is the length of , and the verifier accepts . On the other hand, if is not in the language, then either (i) is not a valid encoding of a graph together with a number or (ii) it is a valid encoding , but there is no , , that forms a -clique. In case (i), the verifier rejects (which is done implicitly since the input is not of the correct type). In case (ii), any that does not correspond to a set with makes the verifier reject. Furthermore, any that does correspond to such an cannot form a -clique. In this case, the for loop will detect this and the verifier again rejects, as desired.
Now we show the verifier is polynomial-time. Checking whether is a valid encoding of a set with is polynomial time. If this check passes, then the for-loop repeats at most many times, and the body of the loop can be carried out in polynomial time. So in total, the work being done is polynomial time.
The argument is very similar to the one above. Here is the verifier:
We skip the arguments for correctness and running time.
To show 3SAT is in , we need to show that there is a polynomial-time verifier with the properties stated in Definition (Non-deterministic polynomial time, complexity class ). We start by presenting .
We first show that the verifier satisfies the two conditions stated in Definition (Non-deterministic polynomial time, complexity class ). If is in the language, that means that corresponds to a valid encoding of a 3CNF formula. Furthermore, there is some -assignment to the variables that makes the formula output . When is this -assignment, then (where is the length of ), and the verifier accepts the input . On the other hand, if is not in the language, then either (i) is not a valid encoding of a CNF formula in which every clause has exactly 3 literals or (ii) it is a valid encoding but the formula is not satisfiable. In case (i), the verifier rejects (which is done implicitly since the input is not of the correct type). In case (ii), any that does not correspond to a -assignment to the variables makes the verifier reject. Furthermore, any that does correspond to a -assignment to the variables must be such that, with this assignment, the formula evaluates to . Therefore, in this case, the verifier again rejects, as desired.
Now we show the verifier is polynomial-time. To check whether is a valid -assignment to the variables takes polynomial time since you just need to check that you are given bits, where is the number of variables. The output of the formula can be computed in polynomial time since it takes constant number of steps to compute each clause (and the number of clauses is bounded by the length of the input).
Given a language , we want to show that . Since is in , we know that there is a polynomial-time decider that decides . To show that , we need to describe a polynomial-time verifier that has the properties described in Definition (Non-deterministic polynomial time, complexity class ). The description of is as follows.
First, note that since is a polynomial time decider, the line running takes polynomial time, and so is polynomial-time. We now check that satisfies the two conditions stated in Definition (Non-deterministic polynomial time, complexity class ). If , then accepts, so for any , accepts. For example, accepts, and clearly . If , then rejects, so no matter what is, rejects, as desired. This shows that .
We denote by the set of all languages that can be decided in at most exponential-time, i.e., in time for some constant .
To show , it suffices to argue that any is also in . So take an arbitrary . By the definition of we know that there is some polynomial-time verifier TM and constant such that for all :
if , then there is some with such that accepts;
if , then for all , rejects.
We construct a decider for as follows.
Correctness: If , then we know that for some with , accepts. Therefore accepts as well. If on the other hand , then for all , rejects. Therefore rejects as well.
Running time: The running time of is for an appropriately chosen constant . We omit the details of this part.
Recall Definition (-hard, -complete). We use this definition in this section with being . However, our notions of -hardness and -completeness will be restricted to Karp reductions. As mentioned in Note (-hardness with respect to Cook and Karp reductions), hardness and completeness is often defined using Karp reductions. We may explore some of the reasons for this in a homework problem.
To show that a language is -hard, you should be using Karp reductions.
is -complete.
Cook and Levin actually proved that is -complete, however, the theorem is often stated using rather than because the version above is easier to prove.
To show that a language is -hard, by the transitivity of polynomial-time reductions, it suffices to show that for some language which is known to be -hard. In particular, using Cook-Levin Theorem, it suffices to show that .
We have already done all the work to prove that is -complete. First of all, in Proposition (3COL is in ), we have shown that . To show that is -hard, by the transitivity of reductions, it suffices to show that , which we have done in Theorem (CIRCUIT-SAT reduces to 3COL).
We sketch the main ideas in the proof. To show that is -complete, we have to show that it is in and it is -hard. We leave the proof of membership in as an exercise.
To show that is -hard, by the transitivity of reductions, it suffices to show that . Given an instance of , i.e. a Boolean circuit , we will construct an instance of , i.e. a Boolean CNF formula in which every clause has exactly 3 literals. The reduction will be polynomial-time and will be such that is a Yes instance of (i.e. is satisfiable) if and only if is a Yes instance of (i.e. is satisfiable).
The construction is as follows. A circuit has three types of gates (excluding the input-gates): NOT, OR, AND.
We will convert each such gate of the circuit into a formula. It is easy to verify that So the behavior of each gate can be represented using a Boolean formula. As an example, consider the circuit below.
In this case, we would let and the Boolean formula equivalent to the circuit would be This is not quite a 3SAT formula since each clause does not necessarily have exactly 3 literals. However, each clause has at most 3 literals, and every clause in the formula can be converted into a clause with exactly 3 literals by duplicating a literal in the clause if necessary.
This completes the description of how we construct a 3SAT formula from a Boolean circuit. We leave it as an exercise to the reader to verify that is satisfiable if and only if is satisfiable, and that the reduction can be carried out in polynomial time.
To show that is -complete, we have to show that it is in and it is -hard. Exercise (CLIQUE is in ) asks you to show that is in , so we will show that is -hard by presenting a reduction from to .
Our reduction will be a Karp reduction. Given an instance of , i.e. a Boolean formula , we will construct an instance of , where is a graph and is a number, such that is a Yes instance of (i.e. is satisfiable) if and only if is a Yes instance of (i.e. contains a -clique). Furthermore, this construction will be done in polynomial time.
Let where each , and is a literal, be an arbitrary 3SAT formula. Notice that is satisfiable if and only if there is a truth assignment to the variables so that each clause has at least one literal set to True. From this formula, we build a graph as follows. For each clause, we create 3 vertices corresponding to the literals of that clause. So in total the graph has vertices. We now specify which vertices are connected to each other with an edge. We do not put an edge between two vertices if they correspond to the same clause. We do not put an edge between and for any . Every other pair of vertices is connected with an edge. This completes the construction of the graph. We still need to specify . We set .
As an example, if , then the corresponding graph is as follows:
We now prove that is satisfiable if and only if has a clique of size . If is satisfiable, then there is an assignment to the variables such that in each clause, there is at least one literal set to True. We claim that the vertices that correspond to these literals form a clique of size in . It is clear that the number of such vertices is . To see that they form a clique, notice that the only way two of these vertices do not share an edge is if they correspond to and for some . But a satisfying assignment cannot assign True to both and .
For the other direction, suppose that the constructed graph has a clique of size . Since there are no edges between two literals if they are in the same clause, there must be exactly one vertex from each clause in . We claim that we can set the literals corresponding to these vertices to True and therefore show that is satisfiable. To see this, notice that the only way we could not simultaneously set these literals to True is if two of them correspond to and for some . But there is no edge between such literals, so they cannot both be in the same clique.
This completes the correctness of the reduction. We still have to argue that it can be done in polynomial time. This is rather straightforward. Creating the vertex set of is clearly polynomial-time since there is just one vertex for each literal of the Boolean formula. Similarly, the edges can be easily added in polynomial time as there are at most many of them.
When we want to show that a language is -hard, we take a known -hard language and show a Karp reduction from to . The diagram of a typical transformation is as shown below.
So typically, the transformation does not cover all the elements of , but only a subset . This means that the Karp reduction not only establishes that is -hard, but also that is -hard.
To show that is -complete, we have to show that it is in and it is -hard. Exercise (IS is in ) asks you to show that is in , so we show that is -hard. By Theorem (CLIQUE is -complete), we know that is -hard, and by Theorem (CLIQUE reduces to IS) we know that . By the transitivity of reductions, we conclude that is also -hard.
The collection of reductions that we have shown can be represented as follows:
The MSAT problem is defined very similarly to SAT (for the definition of SAT, see Definition (Boolean satisfiability problem)). In SAT, we output True if and only if the input CNF formula has at least one satisfying truth assignment to the variables. In MSAT, we want to output True if and only if the input CNF formula has at least two distinct satisfying assignments.
Show that MSAT is -complete.
Showing MSAT is in : To show that MSAT is in , we need to show that there is a polynomial time verifier TM and such that:
if , then there is some , , such that accepts;
if , then for all , rejects.
We present the verifier.
The verifier is polynomial time: Checking whether a valid encoding of two distinct 0/1 assignments to the input variables can be done in polynomial time. Given a 0/1 assignment to the input variables, we can plug those values into the input CNF formula, and evaluate the output of the formula. This can be done in polynomial time. Since every step can be done in polynomial time, the verifier has polynomial time complexity.
Correctness: If , then must a valid encoding of a CNF formula . Furthermore, must be such that there are at least two distinct 0/1 assignments to the input variables that make evaluate to 1. When is the encoding of such two distinct 0/1 assignments, the verifier will accept as desired. Note that clearly for such a , , i.e., the proof is polynomial in the length of . If , then either is not a valid encoding of a CNF formula (in which case the verifier rejects, which is done implicitly since the input is not of the correct type), or is a valid encoding of a CNF formula, but there is at most one 0/1 assignment to the input variables that makes the formula output 1. In the latter case, if is not the encoding of two distinct 0/1 assignments, we reject. And if it is, then the last step detects that at least one of these 0/1 assignments makes the formula output 0, and so the verifier rejects again, as desired.
Showing MSAT is -hard: To show that MSAT is -hard, we show a Karp reduction from SAT (which we know is -hard because we have shown 3SAT is -hard) to MSAT. To show the Karp reduction, we need to show that there is a polynomial-time computable function such that if and only if . Here is the function.
Polynomial time: It is pretty clear that each step above can be carried out in polynomial time (with respect to the length of ).
Correctness: Suppose . Then for some satisfiable CNF formula . Let be the variables in . And suppose is a satisfying assignment for , where for all . Then observe that both and are satisfying assignments for , and so . For the converse, assume . This means is satisfiable, which implies must satisfiable. So .
In the PARTITION problem, we are given non-negative integers , and we want to output True if and only if there is a way to partition the integers into two parts so that the sum of the two parts are equal. In other words, we want to output True if and only if there is a set such that .
In the BIN problem we are given a set of n items with non-negative sizes (not necessarily distinct), a capacity (not necessarily an integer), and an integer . We want to output True if and only if the items can be placed into at most bins such that the total size of items in each bin does not exceed the capacity .
Show that BIN is -complete assuming PARTITION is -hard.
Showing BIN is in : To show that BIN is in , we need to show that there is a polynomial time verifier TM and such that:
if , then there is some , , such that accepts;
if , then for all , rejects.
We now present the verifier. Below, and are non-negative integers and is a non-negative real.
The verifier is polynomial time: (First note that the in the question does not denote the input length. When we say “polynomial time”, it is with respect to the input length, which is .) Checking that is a partition of into parts, we need to check that there are sets, their union is , and that they are pair-wise disjoint. All of these can be easily checked in polynomial time. In the 3rd step, we implicitly have a loop that repeats times. In each iteration, we do at most additions, which is polynomial time in the input length (even if the ’s are super big, note that they are part of the input and contribute to the length; addition is polynomial time).
Correctness: If , then first, it must be a valid encoding. Furthermore, there must be a partition of such that for each , (this is true by the definition of the problem). When represents such a partition, the verifier correctly accepts. Such a has length polynomial in , and therefore polynomial in the input length, as desired.
If , then either is an invalid encoding (in which case the verifier rejects, which is done implicitly since the input is not of the correct type), or is a valid encoding, but there is no way to partition the elements into bins such that for all bins. If does not correspond to a partition , then we reject. And if it does, we successfully reject on the 3rd step. This completes the correctness argument.
Showing BIN is -hard: To show that BIN is -hard, we show a Karp reduction from PARTITION to BIN. To show the Karp reduction, we need to show that there is a polynomial time computable function such that if and only if .
We now present . Below are non-negative integers.
Polynomial time: Addition and division are polynomial time operations and we only do these operations polynomially many times. So the whole function is polynomial time computable.
Correctness: If is in PARTITION, then for non-negative integers , and there is such that . Let be the total sum. Then we must have If we let and , we see that the elements can be partitioned into two bins of capacity . Therefore the output is indeed an element of BIN.
For the other direction, if the output of the reduction is an element of BIN, then there is a way to partition the elements into bins and (, ) of capacity . Since the total sum of the elements is and each element must be in one of the two bins, the two bins must be completely full, i.e., and . If we let , then and so . So is indeed in PARTITION.
Informally, at a high-level, what does it mean for a language to be in ?
Give an example of a language in and at a high level explain why it is in .
What are the things you need to show in order to formally argue that a language is in ?
Why is the complexity class named “Non-deterministic polynomial time”?
True or false: Every language in is decidable.
True or false: Any finite language is in .
True or false: .
True or false: .
Explain why is contained in .
True or false: If there is a polynomial-time algorithm for 3COL, then .
True or false: HALTS is -complete.
Explain why is contained in .
State the Cook-Levin Theorem and explain its significance.
Let , and define its complement . Define . True or false: If , then .
True or false: If every language in is -complete (under Cook reductions) then .
True or false: If then every language in is -complete (under Cook reductions).
True or false: For any two -complete languages and , there is a Cook reduction from to .
True or false: Let . There is a Karp reduction from to 3COL.
Assume . Does 2COL Karp reduce to 3COL?
Assume . Does 3COL Karp reduce to 2COL?
Non-deterministic polynomial time is one of the most important definitions in the course. The formal definition is not easy to understand at first since it has several layers. Take your time to really digest it, and be careful with the order of quantifiers.
When we ask you to show that a language is in , we expect you to use the formal definition of . Your proofs should resemble the proofs presented in this chapter.
It is guaranteed that you will see an -completeness proof in the homework and exam. In this course, your reductions establishing -hardness must be Karp reductions. This chapter contains various such arguments, and we expect your own arguments to follow a similar structure. As usual, do not fall in the trap of memorizing templates. Focus on understanding why the proofs are structured the way they are.