Algorithms and Structural Complexity Theory continued

Recall the one NP-complete problem that we encountered in the last class:
Problem: CIRCUIT-SAT
Instance: An acyclic (i.e., no cycles), directed graph G whose nodes are logic functions (AND, OR, or NOT) or logical variables. The graph represents a combinational logic circuit with n inputs and 1 output.
Question: Is there any assignment to the n input variables that will cause the output to become True?
Since we know that any problem in NP is polynomial-time reducible to CIRCUIT-SAT, we can prove any other problem D is NP-complete by showing:

  1. D is in NP.
  2. CIRCUIT-SAT <p D.

The first part is usually easy; we just have to specify a certificate that can be used to verify "yes" instances of D in polynomial time and space. The second part is the challenge: showing that instances of CIRCUIT-SAT can be transformed in polynomial time and space to instances of D.

Let's look at another problem:

Problem: 3-CNF-SAT
Instance: A set of m clauses of length at most 3 over n Boolean variables. The clauses are sets of literals, i.e., a variable or its complement (negation).
Question: If we take the Boolean OR of the literals in each clause, and then take the Boolean AND of all the clauses, is there any assignment to the n variables such that the result is True (i.e., all the clauses are simultaneously True)?
CNF stands for Conjunctive Normal Form, which simply describes the clause-set representation of Boolean formulas. Here is an example of a formula in 3-CNF-SAT; variables are shown as integers, and the NOT operation is shown as a minus sign:
 2 -3  1
-3 -2  5
-5  3  2
-4 -2  3
 1 -4 -2
Each line is a clause, and the whole clause-set is a formula in 3-CNF-SAT. The interpretation of this clause-set as a Boolean formula is this:
(2 OR NOT 3 OR 1) AND (NOT 3 OR NOT 2 OR 5) AND (NOT 5 OR 3 OR 2)
AND (NOT 4 OR NOT 2 OR 3) AND (1 OR NOT 4 OR NOT 2)
Is this formula a member of 3-CNF-SAT? To show that it is, we need a certificate consisting of a truth assignment that "satisfies" (i.e., makes true) the formula. One assignment to the variables is this:
1 = True, 2 = False, 3 = True, 4 = False, 5 = False
This is a satisfying assignment because all of the clauses are simultaneously true:
(2 OR NOT 3 OR 1)        =  (False OR NOT True OR True) = True
(NOT 3 OR NOT 2 OR 5)    =  (NOT True OR NOT False OR False) = True
(NOT 5 OR 3 OR 2)        =  (NOT False OR True OR False) = True
(NOT 4 OR NOT 2 OR 3)    =  (NOT False OR NOT False OR True) = True
(1 OR NOT 4 OR NOT 2)    =  (True OR NOT False OR NOT False) = True
You can imagine a simple algorithm running in time O(m) (i.e., linear in the size of the formula) that checks whether an assignment satisfies a formula.
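
The checking algorithm described above can be sketched in a few lines of Python. This is only an illustration: the names FORMULA, ASSIGNMENT, and satisfies are made up for this example, and the clause encoding (a list of signed integers per clause) simply mirrors the notation used in these notes.

```python
# Each clause is a list of nonzero integers: k means variable k, -k means NOT k.
FORMULA = [
    [2, -3, 1],
    [-3, -2, 5],
    [-5, 3, 2],
    [-4, -2, 3],
    [1, -4, -2],
]

# The candidate certificate from the notes: a truth value for each variable.
ASSIGNMENT = {1: True, 2: False, 3: True, 4: False, 5: False}


def satisfies(formula, assignment):
    """Return True if every clause contains at least one true literal."""
    def literal_value(lit):
        value = assignment[abs(lit)]
        return value if lit > 0 else not value

    # One pass over the clauses, constant work per literal (at most 3 each).
    return all(any(literal_value(lit) for lit in clause) for clause in formula)


print(satisfies(FORMULA, ASSIGNMENT))
```

Each clause has at most three literals, so the loop does a constant amount of work per clause, which is where the O(m) bound comes from.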

You can see that this is a very restricted instance of CIRCUIT-SAT; just replace the word "formula" with "circuit" and place logic gates everywhere there is a logical operator. It may be somewhat surprising that it goes the other way around; 3-CNF-SAT is just as hard as CIRCUIT-SAT because there is a polynomial reduction such that CIRCUIT-SAT <p 3-CNF-SAT.

Theorem: 3-CNF-SAT is NP-complete.
Proof:
Clearly, 3-CNF-SAT is in NP; we just use a satisfying assignment as the linear-time verifiable certificate. So we just need to show CIRCUIT-SAT <p 3-CNF-SAT.

From a circuit C made up of gates AND, OR, and NOT, and input variables in a set {1 .. n}, we will construct a 3-CNF formula F that is satisfiable if and only if C is satisfiable. The construction goes as follows:

  1. For each variable in C, there is a corresponding variable in F.
  2. For each gate c in C, there is a corresponding variable f in F.
  3. For each NOT gate c modifying a gate or variable d in C, insert logic into F stating "c is equivalent to NOT d." This can be accomplished in CNF with two clauses: (d OR c) and (NOT d OR NOT c). If you don't quite buy this, let's see how it's true with a truth table:
    c  d  (d OR c)  (NOT d OR NOT c)  (d OR c) AND (NOT d OR NOT c)  c == NOT d
    -  -  --------  ----------------  -----------------------------  ----------
    F  F     F             T                       F                     F
    F  T     T             T                       T                     T
    T  F     T             T                       T                     T
    T  T     T             F                       F                     F
    
  4. For each AND gate c modifying two gates or variables d and e in C, insert logic into F stating "c is equivalent to d AND e." We can do this in CNF with the following clauses:
    NOT c OR d
    NOT c OR e
    c OR NOT d OR NOT e
    This can be shown using another (eight line) truth table.
  5. For each OR gate c modifying gates or variables d and e, insert logic into F stating "c is equivalent to d OR e." We can do this in CNF with the following clauses:
    c OR NOT d
    c OR NOT e
    NOT c OR d OR e
    This can also be shown with a truth table.
  6. Note that we can omit either AND or OR, simulating one with a combination of NOT and the other, making the circuit a little bigger. We can also simulate any other logic gate (NAND, NOR, XOR, etc.) as combinations of AND, OR, and NOT in linear space.
  7. For the output gate c of the circuit, insert logic into F stating "c is equivalent to True." The singleton clause (c) states this.
We now have a formula F that defines the operation of the circuit, describing, from the variables through the gates to the output, the precise value on the output of each gate. The formula can be satisfied if and only if there is some assignment to the inputs of the circuit that makes the output True. This transformation is carried out in linear time and space: at most three clauses are needed for each gate, so for a circuit with n gates, the resulting formula has O(n) clauses. Thus, 3-CNF-SAT is NP-hard and, being in NP, is NP-complete. []
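
The construction in steps 1 through 7 can be sketched in Python as follows. This is a minimal sketch under some assumptions that are not part of the notes: variables and gates are numbered with positive integers, each gate is a hypothetical tuple (c, op, operands), and the gate list is given in topological order. The brute-force satisfiability checks are there only to compare the circuit and the formula on tiny examples; they take exponential time and are not part of the reduction.

```python
from itertools import product


def circuit_to_cnf(gates, output):
    """Translate gates (c, op, operands) into clauses of signed integers."""
    clauses = []
    for c, op, operands in gates:
        if op == "NOT":    # c is equivalent to NOT d
            (d,) = operands
            clauses += [[d, c], [-d, -c]]
        elif op == "AND":  # c is equivalent to d AND e
            d, e = operands
            clauses += [[-c, d], [-c, e], [c, -d, -e]]
        elif op == "OR":   # c is equivalent to d OR e
            d, e = operands
            clauses += [[c, -d], [c, -e], [-c, d, e]]
        else:
            raise ValueError(f"unknown gate type: {op}")
    clauses.append([output])  # the output gate must be True
    return clauses


def evaluate_circuit(gates, output, input_values):
    """Evaluate the circuit; gates must be listed in topological order."""
    values = dict(input_values)
    for c, op, operands in gates:
        args = [values[x] for x in operands]
        if op == "NOT":
            values[c] = not args[0]
        elif op == "AND":
            values[c] = args[0] and args[1]
        else:
            values[c] = args[0] or args[1]
    return values[output]


def circuit_satisfiable(num_inputs, gates, output):
    """Brute force over all input assignments (exponential, for checking only)."""
    return any(
        evaluate_circuit(gates, output, dict(enumerate(bits, start=1)))
        for bits in product([False, True], repeat=num_inputs)
    )


def cnf_satisfiable(num_vars, clauses):
    """Brute force over all variable assignments (exponential, for checking only)."""
    return any(
        all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause) for clause in clauses)
        for bits in product([False, True], repeat=num_vars)
    )


# Inputs 1 and 2; gate 3 = NOT 1, gate 4 = 3 AND 2, gate 5 = 4 OR 1 (output).
SAT_CIRCUIT = [(3, "NOT", (1,)), (4, "AND", (3, 2)), (5, "OR", (4, 1))]
# Input 1; gate 3 = NOT 1, gate 4 = 1 AND 3 (output), which can never be True.
UNSAT_CIRCUIT = [(3, "NOT", (1,)), (4, "AND", (1, 3))]

print(circuit_to_cnf(UNSAT_CIRCUIT, 4))
```

Note that circuit_to_cnf makes a single pass over the gates and emits at most three clauses per gate plus one clause for the output, matching the linear bound claimed in the proof.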

We don't seem to have accomplished much with this proof; we've replaced one kind of pointy-headed Boolean logic problem with another. However, note that 3-CNF-SAT is a much restricted version of CIRCUIT-SAT. We can use 3-CNF-SAT to prove other problems are NP-complete in instances when tackling all of CIRCUIT-SAT is infeasible or impossible. That is, 3-CNF-SAT is easier to work with when proving things NP-complete.

(Note: It turns out k-CNF-SAT, where clauses have at most k literals, is NP-complete for k > 2. For k = 2, (i.e. 2-CNF-SAT), there is a polynomial time algorithm: a clause (a OR b) is the same as (NOT a IMPLIES b); make a directed graph with vertices the literals and edges the implications derived from the clauses; if any strongly connected component (computable in linear time, similar to the connected components in undirected graphs we saw with Union/Find) of the graph contains a literal and its complement, that is a contradiction, so the formula is not satisfiable. Otherwise, it is. Unfortunately, this algorithm doesn't extend to k-CNF-SAT for k > 2, so we don't get to prove P=NP this way.)
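
The 2-CNF-SAT idea in the note above can be sketched as follows. This sketch assumes every clause is given as a pair of signed integers (a one-literal clause (a) can be written as the pair (a, a)), and it uses Kosaraju's two-pass algorithm to find strongly connected components, which is one of several linear-time choices. The function name two_sat_satisfiable and the node numbering are made up for this example.

```python
def two_sat_satisfiable(num_vars, clauses):
    """Decide satisfiability of a 2-CNF formula via the implication graph."""
    def node(lit):
        # Variable k maps to node 2(k-1); its complement maps to node 2(k-1)+1.
        return 2 * (abs(lit) - 1) + (1 if lit < 0 else 0)

    n = 2 * num_vars
    graph = [[] for _ in range(n)]
    reverse = [[] for _ in range(n)]
    for a, b in clauses:
        # (a OR b) yields the implications NOT a -> b and NOT b -> a.
        for src, dst in ((node(-a), node(b)), (node(-b), node(a))):
            graph[src].append(dst)
            reverse[dst].append(src)

    # Pass 1: record vertices in order of DFS finishing time (iterative DFS).
    visited = [False] * n
    order = []
    for start in range(n):
        if visited[start]:
            continue
        visited[start] = True
        stack = [(start, iter(graph[start]))]
        while stack:
            v, neighbors = stack[-1]
            for w in neighbors:
                if not visited[w]:
                    visited[w] = True
                    stack.append((w, iter(graph[w])))
                    break
            else:
                order.append(v)  # all neighbors explored: v is finished
                stack.pop()

    # Pass 2: DFS on the reversed graph, in decreasing finishing time,
    # labels each strongly connected component.
    component = [-1] * n
    label = 0
    for start in reversed(order):
        if component[start] != -1:
            continue
        component[start] = label
        stack = [start]
        while stack:
            v = stack.pop()
            for w in reverse[v]:
                if component[w] == -1:
                    component[w] = label
                    stack.append(w)
        label += 1

    # Unsatisfiable exactly when some variable shares a component with its complement.
    return all(component[2 * i] != component[2 * i + 1] for i in range(num_vars))


print(two_sat_satisfiable(2, [(1, 2), (-1, 2)]))
print(two_sat_satisfiable(2, [(1, 2), (-1, 2), (1, -2), (-1, -2)]))
```

The second formula rules out all four assignments to its two variables, so its implication graph puts each variable in the same component as its complement.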

Now let's look at a problem we saw last time:

Problem: HAMILTONIAN-CYCLE
Instance: A graph G = (V, E)
Question: Does G contain a Hamiltonian cycle? That is, is there a path (cycle) going from one vertex of G, through all the other vertices of G exactly once, ending up at the same vertex?
Last time, we saw that HAMILTONIAN-CYCLE is in NP, a certificate being a sequence of vertices satisfying the constraints given above. It turns out, through a complex polynomial transformation from 3-CNF-SAT (given in your book on pp. 954 - 959), that HAMILTONIAN-CYCLE is NP-hard, so it is NP-complete. And since we saw last time that HAMILTONIAN-CYCLE <p TSP, it follows immediately that the Travelling Salesman Problem is also NP-complete.
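
To make the "in NP" half concrete, here is a minimal sketch of a certificate checker for HAMILTONIAN-CYCLE. It assumes an undirected graph given as a vertex list and a list of edge pairs, and a certificate given as a list of vertices in visiting order (without repeating the first vertex at the end); the function name is made up for this example.

```python
def is_hamiltonian_cycle(vertices, edges, cycle):
    """Check that cycle visits every vertex exactly once and follows edges of G."""
    edge_set = {frozenset(e) for e in edges}  # undirected: (u, v) equals (v, u)

    # Every vertex must appear exactly once in the certificate.
    if len(cycle) != len(vertices) or set(cycle) != set(vertices):
        return False

    # Consecutive vertices, including last back to first, must be adjacent.
    return all(
        frozenset((cycle[i], cycle[(i + 1) % len(cycle)])) in edge_set
        for i in range(len(cycle))
    )


# A square: 1 - 2 - 3 - 4 - 1.
SQUARE_VERTICES = [1, 2, 3, 4]
SQUARE_EDGES = [(1, 2), (2, 3), (3, 4), (4, 1)]

print(is_hamiltonian_cycle(SQUARE_VERTICES, SQUARE_EDGES, [1, 2, 3, 4]))
print(is_hamiltonian_cycle(SQUARE_VERTICES, SQUARE_EDGES, [1, 3, 2, 4]))
```

The check runs in time polynomial in the size of the graph, which is all that membership in NP requires; finding such a cycle is the hard part.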

Here are some other NP-complete problems:

Problem: SUBSET-SUM
Instance: A set S of positive integers and an integer n.
Question: Is there any subset of S whose elements add up to n?
It looks easy, and there is a pseudopolynomial-time algorithm (i.e., an algorithm that runs in poly-time if we represent the numbers in unary instead of binary, which we said before was an "unreasonable" way of doing things), but the problem turns out to be NP-complete.
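
The pseudopolynomial-time algorithm mentioned above is usually a dynamic program over the achievable sums. The following is a sketch of that standard idea, not necessarily the version in your book; the function name subset_sum and the table name reachable are made up for this example.

```python
def subset_sum(numbers, target):
    """Return True if some subset of the positive integers sums to target."""
    if target < 0:
        return False
    # reachable[s] is True when some subset of the numbers seen so far sums to s.
    reachable = [True] + [False] * target
    for x in numbers:
        # Go downward so that each number is used at most once.
        for s in range(target, x - 1, -1):
            if reachable[s - x]:
                reachable[s] = True
    return reachable[target]


print(subset_sum([3, 34, 4, 12, 5, 2], 9))
print(subset_sum([3, 34, 4, 12, 5, 2], 30))
```

The running time is proportional to |S| times n, which is polynomial in the value of n but exponential in the number of bits needed to write n in binary.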

Problem: CLIQUE
Instance: An undirected graph G and integer k.
Question: Is there a subgraph of G with k vertices that is complete? That is, is there a clique of k vertices that are all mutually connected by k(k-1)/2 edges?
Your book proves this is NP-complete by a simple reduction from 3-CNF-SAT.
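
As a small illustration of the problem statement (not of the reduction in your book), here is a sketch that checks a proposed clique in polynomial time and, for contrast, a brute-force search that tries every set of k vertices. The function names and the edge-list representation are assumptions made for this example.

```python
from itertools import combinations


def is_clique(edges, subset):
    """Check that every pair of vertices in subset is joined by an edge."""
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset(pair) in edge_set for pair in combinations(subset, 2))


def has_clique(vertices, edges, k):
    """Brute force: try every k-subset of vertices (exponential in general)."""
    return any(is_clique(edges, subset) for subset in combinations(vertices, k))


# A triangle 1-2-3 with an extra vertex 4 hanging off vertex 3.
VERTICES = [1, 2, 3, 4]
EDGES = [(1, 2), (2, 3), (1, 3), (3, 4)]

print(has_clique(VERTICES, EDGES, 3))
print(has_clique(VERTICES, EDGES, 4))
```

Here is_clique plays the role of the certificate verifier, while has_clique shows why the obvious search is expensive: the number of k-subsets grows very quickly with the size of the graph.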

Problem: SUBGRAPH-ISOMORPHISM
Instance: Two undirected graphs G and H.
Question: Is G isomorphic to a subgraph of H? That is, of all the subgraphs (subsets of vertices and edges) of H, can we relabel one so that it is identical to G?
Note that this is a generalization of GRAPH-ISOMORPHISM. Note also that the proof that SUBGRAPH-ISOMORPHISM is NP-complete is trivial if we know that CLIQUE is NP-complete; we simply make G a complete graph of size k to solve CLIQUE.
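
The reduction from CLIQUE described above can be sketched like this: build a complete graph on k vertices and ask whether it is isomorphic to a subgraph of the input graph. The subgraph-isomorphism test below is a naive brute-force search included only so the example runs; all function names are made up for this illustration.

```python
from itertools import combinations, permutations


def complete_graph(k):
    """Return (vertices, edges) of the complete graph on vertices 0 .. k-1."""
    return list(range(k)), list(combinations(range(k), 2))


def is_subgraph_isomorphic(g_vertices, g_edges, h_vertices, h_edges):
    """Brute force: try every injective mapping of G's vertices into H's."""
    h_edge_set = {frozenset(e) for e in h_edges}
    for image in permutations(h_vertices, len(g_vertices)):
        mapping = dict(zip(g_vertices, image))
        # Every edge of G must land on an edge of H.
        if all(frozenset((mapping[u], mapping[v])) in h_edge_set for u, v in g_edges):
            return True
    return False


def has_clique_via_subgraph_isomorphism(h_vertices, h_edges, k):
    """Transform a CLIQUE instance (H, k) into a SUBGRAPH-ISOMORPHISM instance."""
    g_vertices, g_edges = complete_graph(k)
    return is_subgraph_isomorphic(g_vertices, g_edges, h_vertices, h_edges)


# A triangle 1-2-3 with an extra vertex 4 hanging off vertex 3.
H_VERTICES = [1, 2, 3, 4]
H_EDGES = [(1, 2), (2, 3), (1, 3), (3, 4)]

print(has_clique_via_subgraph_isomorphism(H_VERTICES, H_EDGES, 3))
print(has_clique_via_subgraph_isomorphism(H_VERTICES, H_EDGES, 4))
```

Only complete_graph is part of the transformation itself, and it clearly runs in polynomial time; the expensive search is the work that a SUBGRAPH-ISOMORPHISM solver would have to do.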

Problem: REGISTER-ALLOCATION
Instance: A block of computer code, a set of variables, a set of CPU registers, and an integer k.
Question: Is there a schedule of allocation of registers to variables such that at most k registers must be spilled to main memory during the block of code? (Spilling registers to memory is bad.)
This problem is also NP-complete. It comes up in writing compilers. On machines with very few registers, like the Intel x86/Pentium line, it isn't too hard to solve exactly. But for machines with e.g. 32 general purpose registers, solving it exactly with the best known algorithms takes a very long time, much longer than you can expect the user who has just typed make to wait.

More Structural Complexity

The "polynomial hierarchy" is the structure of the problems in PSPACE, the set of problems that can be solved in polynomial space. NP is a subset of PSPACE; we only need polynomial space to, for example, check all possible travelling salesman tours or try all possible assignments to variables for 3-CNF-SAT. There are other complexity classes within PSPACE, for example co-NP, the set of problems whose complements are in NP. It isn't known whether P = NP, or co-NP = NP, or NP-complete = co-NP-complete, or even if PSPACE = P.

Here is a simple map of PSPACE as we know it, assuming P != NP and NP != co-NP.

 ----------------------------------------------
|PSPACE       ______   ______                  |
|            /      \ /      \                 |
|           /        \        \                |
|          /        / \        \               |
|         /        /   \        \              |
|        /    NP  /  P  \ co-NP  \             |
|        \_____   \     /        /             |
|         \NP- \___\   /        /              |
|          \complete\ /        /               |
|           \        \        /                |
|            \______/ \______/                 |
|                                              |
 ----------------------------------------------