MODULE 7: Graph Theory
Recitation 5
1
Bits and Pieces

Determine which of the following problems can be computed in worst-case polynomial time, i.e., $O(n^k)$ time for some constant $k$, where $n$ denotes the number of bits in the binary representation of the input. If you think the problem can be solved in polynomial time, give an algorithm, explain briefly why it gives the correct answer, and argue carefully why the running time is polynomial. If you think the problem cannot be solved in polynomial time, then provide a proof.

  • Given as input a positive integer $N$, output $N!$.

  • Given as input a positive integer $N$, output True iff $N = M!$ for some positive integer $M$.

  • Given as input a positive integer $N$, output True iff $N = M^2$ for some positive integer $M$.

Solution

Problem 1: To represent the value of $N!$, we need $\log_2(N!)$ bits. As shown in the course notes, $\log_2(N!) = \Theta(N \log N)$. It takes at least one step to write down a single bit of the output. Therefore, writing out the whole output takes time exponential in the input length $n$: since $n = \Theta(\log N)$, we have $N = 2^{\Theta(n)}$, so the output has $2^{\Omega(n)}$ bits.
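For reference, the bound cited from the course notes can be recovered with a short calculation (a sketch, not necessarily the notes' exact argument):

```latex
% Sketch of log2(N!) = Theta(N log N): upper bound term-by-term,
% lower bound by keeping only the largest N/2 terms of the sum.
\[
  \log_2(N!) \;=\; \sum_{k=1}^{N} \log_2 k \;\le\; N \log_2 N,
  \qquad
  \log_2(N!) \;\ge\; \sum_{k=\lceil N/2 \rceil}^{N} \log_2 k
             \;\ge\; \frac{N}{2}\,\log_2\frac{N}{2}.
\]
```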

Problem 2: The polynomial-time algorithm is as follows. We iteratively compute the values $1!, 2!, 3!, \ldots$. In particular, at iteration $M$, in order to compute $M!$, we take the result from the previous iteration and multiply it by $M$. In every iteration we check whether $M! < N$. When $M! \geq N$, we exit the loop and check whether $M! = N$ or $M! > N$. If they are equal we return True, and otherwise False.

The analysis of the running time is as follows. First, a crude upper bound on the number of iterations is $O(\log N)$: from the second iteration on, $M!$ grows by a factor of at least 2 in each iteration, so after $\log_2 N$ iterations the value of $M!$ exceeds $N$. In each iteration we do one multiplication. Since we stop as soon as $M! \geq N$, every number involved has $O(\log N)$ bits, so with the quadratic-time multiplication algorithm each multiplication takes at most $O(\log^2 N)$ time. Therefore the total running time of the loop is $O(\log^3 N)$. The comparison between $M!$ and $N$ takes $O(\log N)$ time per iteration. Thus, the whole algorithm takes $O(\log^3 N) = O(n^3)$ time, which is polynomial in the input length $n$.
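The loop described above can be sketched in a few lines of Python (a minimal illustration; the function name `is_factorial` is our own, not from the recitation):

```python
def is_factorial(n: int) -> bool:
    """Return True iff n = m! for some positive integer m."""
    if n < 1:
        return False
    fact, m = 1, 1
    while fact < n:      # O(log n) iterations: fact at least doubles once m >= 2
        m += 1
        fact *= m        # one multiplication per iteration
    return fact == n     # loop exits with fact >= n; equality decides the answer
```

Note that Python's arbitrary-precision integers hide the bit-level multiplication cost that the analysis above accounts for explicitly.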

Problem 3: We will perform binary search to pin down a value $x$ such that $x^2 = N$. At each iteration, we have a guess $x$ for the square root of $N$. We compute $x \cdot x$ to see if our guess is correct, or whether $x^2$ is above or below $N$. Depending on the result, we update our range and calculate a new guess $x$. Here we will not give the details of the binary search implementation (that is not the point of the exercise).

Let’s assume $lo$ and $hi$ are the two variables in the binary search algorithm that keep track of the range we are interested in (the midpoint of $lo$ and $hi$ corresponds to our guess $x$). At the beginning of each iteration of the while loop, the invariant $lo \leq \sqrt{N} \leq hi$ is maintained. Since $lo$ and $hi$ are integers, and $hi - lo$ decreases at each iteration, we will find $\sqrt{N}$ if it is an integer.

To analyze the running time, we first bound the number of iterations, and then the amount of work done in each iteration.

In terms of the number of iterations, observe that $hi - lo$ at least halves at the end of each iteration of the loop. Since we start with a range of length $N$, and we shrink this range by a factor of 2 in every iteration, there are at most $O(\log N)$ iterations.

In terms of the work done in each iteration, note that we compute the product $x \cdot x$, where the value of $x$ is at most $N$. Hence, we can do this computation in $O(\log^2 N)$ time (just using the grade-school multiplication algorithm). We also perform some comparisons and assignments, each of which takes only $O(\log N)$ time. So the work done in each iteration is $O(\log^2 N)$.

We can conclude that the total time is $O(\log^3 N)$, which is polynomial in $n = \Theta(\log N)$.
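The binary search can be sketched as follows (our own illustration; the variable names `lo` and `hi` mirror the write-up):

```python
def is_perfect_square(n: int) -> bool:
    """Return True iff n = m^2 for some positive integer m."""
    lo, hi = 1, n
    # Invariant: if n is a perfect square, then lo <= sqrt(n) <= hi.
    while lo <= hi:
        x = (lo + hi) // 2   # current guess for sqrt(n)
        sq = x * x           # one O(log^2 n)-time multiplication
        if sq == n:
            return True
        elif sq < n:
            lo = x + 1       # guess too small: search upper half
        else:
            hi = x - 1       # guess too large: search lower half
    return False
```

Each iteration halves $hi - lo$, matching the $O(\log N)$ iteration bound above.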

2
“Clearly” Correct

A connected graph with no cycles is a tree. Consider the following claim and its proof.

Claim: Any graph with $n$ vertices and $n - 1$ edges is a tree.
Proof: We prove the claim by induction. The claim is clearly true for $n = 1$ and $n = 2$. Now suppose the claim holds for $n = k$. We’ll prove that it also holds for $n = k+1$. Let $G$ be a graph with $k$ vertices and $k-1$ edges. By the induction hypothesis, $G$ is a tree (and therefore by definition connected). Add a new vertex $v$ to $G$ by connecting it with any other vertex in $G$. So we create a new graph $G'$ with $k+1$ vertices and $k$ edges. The new vertex we added is a leaf, so it does not create a cycle. Also, since $G$ was connected, $G'$ is also connected. A connected graph with no cycles is a tree, so $G'$ is also a tree. The claim follows by induction.

Explain why the given proof is incorrect.

Solution

The proof relies on a specific construction to obtain bigger graphs from smaller ones, so it really only proves the statement for graphs that can be built in this way (taking a smaller graph with the property and connecting a new vertex to an existing vertex). In this case it is easy to see how and why this fails: the claim itself is false. For example, a triangle together with an isolated vertex has $n = 4$ vertices and $n - 1 = 3$ edges, but it is neither connected nor acyclic. Such a counterexample can never be obtained by connecting a new vertex to a smaller tree.
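One concrete counterexample is a triangle plus an isolated vertex. The following sketch (our own sanity check, not part of the solution) verifies that it has $n$ vertices and $n-1$ edges yet is not connected, so it is not a tree:

```python
def connected(n, edges):
    """Test connectivity of an undirected graph on vertices 0..n-1
    by searching outward from vertex 0."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, frontier = {0}, [0]
    while frontier:
        u = frontier.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                frontier.append(w)
    return len(seen) == n

# Triangle {0,1,2} plus isolated vertex 3: n = 4 vertices, 3 edges.
n, edges = 4, [(0, 1), (1, 2), (2, 0)]
assert len(edges) == n - 1     # n vertices and n-1 edges...
assert not connected(n, edges) # ...yet not connected, hence not a tree
```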

The general format for induction on graphs goes something like this. Suppose we want to prove that all graphs satisfying hypothesis $A$ have property $B$.

  1. Assume the statement for all graphs on $n$ vertices or fewer.

  2. Consider an arbitrary graph on $n+1$ vertices that satisfies hypothesis $A$.

  3. Remove some vertices to obtain a smaller graph that still satisfies hypothesis $A$.

  4. Cite the induction hypothesis to conclude that the smaller graph has property $B$.

  5. Reconstruct the original larger graph and argue that property $B$ is maintained.

3
2 3 Proofs 4 You

Give two proofs (one using induction and another using a degree counting argument) for the following claim: the number of leaves in a tree with $n \geq 2$ vertices is $$2 + \sum_{\substack{v\in V\\ \deg(v)\geq3}} (\deg(v)-2).$$

Solution

As a reminder, any graph satisfying two of the following three properties is a tree: (i) the graph is connected; (ii) the graph is acyclic; (iii) the number of edges is one less than the number of vertices.

We prove the statement by induction on $n$, the number of vertices. For the base case, the only tree with $n=2$ vertices is a single edge, which has 2 leaves and no vertices of degree at least 3. Therefore, $$2+\underbrace{\sum_{\substack{v\in V\\ \deg(v)\geq3}} (\deg(v)-2)}_{=0}=2,$$ as desired.

For the inductive step, suppose that for some $n\geq2$, the statement is true for all trees on $n$ vertices. Consider an arbitrary tree $T$ on $n+1$ vertices. Every tree with at least two vertices has at least 2 leaves (this is an exercise problem in the textbook). Let $u$ be a leaf in $T$ and let $w$ be its neighbor. Then $T' := T\setminus\{u\}$ is a tree on $n$ vertices since it is connected and acyclic. Note that we cannot have $\deg_T(w)<2$: $w$ has $u$ as a neighbor and, since $T$ has $n+1\geq3$ vertices and is connected, at least one other neighbor in $T'$. So the following two cases are exhaustive.

Case 1: $\deg_T(w)=2$. Then $\deg_{T'}(w)=1$ and so $w$ is a leaf in $T'$. It follows that $T$ and $T'$ have the same number of leaves. So
$$\begin{aligned} \text{Number of leaves in $T$} &= \text{Number of leaves in $T'$}\\ &=2+\sum_{\substack{v\in V(T')\\ \deg_{T'}(v)\geq3}}(\deg_{T'}(v)-2) \quad \text{(by IH)}\\ &=2+\sum_{\substack{v\in V(T)\\ \deg_{T}(v)\geq3}}(\deg_{T}(v)-2), \end{aligned}$$
where the last equality holds because neither $u$ nor $w$ appears in the summation for $T$ or in the summation for $T'$, and every other vertex has the same degree in both trees.

Case 2: $\deg_T(w)\geq3$. Then $\deg_{T'}(w)=\deg_{T}(w)-1\geq2$, and so
$$\begin{aligned} &~\text{Number of leaves in $T$}\\ =&~\text{Number of leaves in $T'$}+1\\ =&~\Bigg(2+\sum_{\substack{v\in V(T')\\ \deg_{T'}(v)\geq3}} (\deg_{T'}(v)-2)\Bigg)+1 \quad \text{(by IH)}\\ =&~\Bigg(2+\sum_{\substack{v\in V(T)\\ \deg_T(v)\geq3}} (\deg_T(v)-2)\Bigg)+\underbrace{\big(\deg_{T'}(w)-2\big)-\big(\deg_T(w)-2\big)}_{=-1}+1\\ =&~2+\sum_{\substack{v\in V(T)\\ \deg_T(v)\geq3}} (\deg_T(v)-2). \end{aligned}$$

Solution

We now prove the statement via a degree counting argument. Let $T=(V,E)$ be an arbitrary tree with $n$ vertices, $m$ edges and $L$ leaves. Since $T$ is a tree, we know $n = m+1$. By the handshake lemma,
$$\begin{aligned} 2m = \sum_{v\in V}\deg(v) & = \underbrace{\sum_{\substack{v\in V\\ \deg(v)=1}}\deg(v)}_{=L}+\sum_{\substack{v\in V\\ \deg(v)\geq2}}\deg(v)\\ & = L + \left(\sum_{\substack{v \in V\\ \deg(v) \geq 2}} (\deg(v)-2)\right) + 2(n-L)\\ & = L + \left(\sum_{\substack{v \in V\\ \deg(v) \geq 2}} (\deg(v)-2)\right) + 2(m+1) - 2L\\ & = L + \left(\sum_{\substack{v \in V\\ \deg(v) \geq 3}} (\deg(v)-2)\right) + 2(m+1) - 2L,\end{aligned}$$
where the last equality follows because the summation over $\deg(v)\geq2$ is equal to the summation over $\deg(v)\geq3$ (the $\deg(v)=2$ terms are all 0). Rearranging the terms we get $$L = 2 + \sum_{\substack{v\in V\\ \deg(v)\geq3}} (\deg(v)-2).$$
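The leaf formula is easy to check empirically (our own sanity check, not part of either proof): given a tree as an edge list, compare the leaf count with $2 + \sum_{\deg(v)\geq3}(\deg(v)-2)$.

```python
from collections import Counter

def leaves_and_formula(edges):
    """Return (leaf count, formula value) for a tree given as an edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    leaves = sum(1 for d in deg.values() if d == 1)
    formula = 2 + sum(d - 2 for d in deg.values() if d >= 3)
    return leaves, formula

# A path on 4 vertices: 2 leaves, no vertex of degree >= 3.
assert leaves_and_formula([(0, 1), (1, 2), (2, 3)]) == (2, 2)
# A star with center 0 and 4 leaves: deg(0) = 4, so 2 + (4 - 2) = 4.
assert leaves_and_formula([(0, 1), (0, 2), (0, 3), (0, 4)]) == (4, 4)
```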

4
(Extra) How About Strictly Less Than?

We use $O(\cdot)$, $\Omega(\cdot)$ and $\Theta(\cdot)$ notation to compare functions in computer science. In particular, $O(\cdot)$ corresponds to $\leq$, $\Omega(\cdot)$ corresponds to $\geq$, and $\Theta(\cdot)$ corresponds to $=$. We would now like to define what “strictly less than” and “strictly greater than” mean with respect to comparing functions. How should we define them?

Solution

In computer science, the standard notation for “strictly less than” is $o(\cdot)$, which is known as little-o notation. We usually define “strictly less than” to mean “not greater than or equal to”. So we can try to define $f(n) = o(g(n))$ to mean “not $f(n) = \Omega(g(n))$”. In other words, we cannot find constants $c, n_0 > 0$ such that for all $n \geq n_0$, $f(n) \geq c \cdot g(n)$. Unfortunately, this definition does not quite work. This is because two functions can be incomparable, in the sense that neither $f(n) = O(g(n))$ nor $f(n) = \Omega(g(n))$ holds (try to come up with an example). For such functions we have “not $f(n) = \Omega(g(n))$”, but we cannot say $f(n) = o(g(n))$, because $f(n) = o(g(n))$ should imply $f(n) = O(g(n))$.

Another way to define “strictly less than” is “less than or equal to, but not equal to”. So we can try to define $f(n) = o(g(n))$ to mean “$f(n) = O(g(n))$ but not $f(n) = \Theta(g(n))$”. This is the definition we can indeed use. Equivalently, we want “$g(n)$ grows asymptotically faster than $f(n)$”, or in other words, $\lim_{n \to \infty} \frac{g(n)}{f(n)} = \infty$ (equivalently, $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$), which is how little-o notation is usually defined. As a very simple example, we can write $n = o(n \log n)$.
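The simple example can be checked directly from the limit definition:

```latex
% Worked check that n = o(n log n), using the limit definition above.
\[
  \lim_{n \to \infty} \frac{n \log n}{n}
    = \lim_{n \to \infty} \log n
    = \infty,
  \qquad\text{equivalently}\qquad
  \lim_{n \to \infty} \frac{n}{n \log n}
    = \lim_{n \to \infty} \frac{1}{\log n}
    = 0 .
\]
```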

The definition for “strictly greater than” is analogous, and we use little-omega $\omega(\cdot)$ for this. As a simple example, we can write $n \log n = \omega(n)$.