CS 70 - Lecture 39 - Apr 27, 2011 - 10 Evans

  Our goals for the rest of the semester are as follows:
   First Goal (Note 21): 
   Understand countability, i.e. in what sense the following sets
     have the same "number" of members:
       all positive integers
       all integers
       all even integers
       all rational numbers
       all computer programs
   Second Goal (Note 22): Use (roughly) the same proof to show that
   1) There are more real numbers than rational numbers
   2) There are more functions with domain N than programs, 
      (so that not every function can be implemented by a program)
   3) A "super-debugger" or "infinite loop finder" is a
       particular function that cannot be implemented.

Recall some definitions:
   DEF: |A| = cardinality of A = number of elements in A 
        (if A is finite, else say A is infinite, details of infinite case later)
   DEF: P(A) = power set of A = set of all subsets of A
   DEF: If A and B are sets, then A x B = { (a,b) | a in A and b in B }
        is the Cartesian product of A and B. A x B is also called the
        set of ordered pairs (a,b) from A, B
   DEF: If A1, A2, ... , An are sets then 
        A1 x A2 x ... x An = { (a1,a2,...,an) | ai in Ai for i=1,..,n }
        is the Cartesian product of A1,...,An. It is also called the
        set of ordered n-tuples (a1,...,an) from A1,..,An

Let f:A->B be a function with domain A and codomain B
   DEF: f is one-to-one (injective) if f(x)=f(y) -> x=y
   DEF: f is onto (surjective) if all b in B have preimages in A
   DEF: f is a one-to-one correspondence (bijective) 
        if it is one-to-one and onto
ASK&WAIT: is f:Z->Z where f(x)=x+1 bijective?
ASK&WAIT: is f:Z->{even integers} where f(x)=2*x bijective?
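For finite sets the three definitions above can be checked by brute force. The following is a minimal sketch; the helper names is_injective, is_surjective and is_bijective are made up for illustration, and the check only works when the domain and codomain are finite collections of distinct elements.

```python
def is_injective(f, domain):
    """True if no two distinct inputs in the finite domain share an output."""
    images = [f(x) for x in domain]
    return len(set(images)) == len(images)

def is_surjective(f, domain, codomain):
    """True if every element of the finite codomain has a preimage."""
    return set(codomain) <= {f(x) for x in domain}

def is_bijective(f, domain, codomain):
    """One-to-one correspondence: injective and surjective."""
    return is_injective(f, domain) and is_surjective(f, domain, codomain)

A = range(4)                                        # the finite set {0,1,2,3}
print(is_bijective(lambda x: (x + 1) % 4, A, A))    # shift by one, wrapping around
print(is_injective(lambda x: x % 2, A))             # 0 and 2 collide
print(is_surjective(lambda x: x % 2, A, {0, 1}))    # both 0 and 1 are hit
```

For infinite sets such as Z no finite check suffices, which is why the questions above need an argument rather than a computation.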

   First Goal: understand cardinality, countability

   Recall DEF: If A is finite, the cardinality |A| = # members of A
ASK&WAIT: suppose f:A->B is a bijection, A finite. Is B finite? 
          How are |A| and |B| related?
   DEF: We say that A and B have the same cardinality if there is a
        one-to-one correspondence between them, whether finite or not
ASK&WAIT: Do Z and {even integers} have same cardinality? 
ASK&WAIT: Do N and {powers of 2} have same cardinality? 
ASK&WAIT: Do Z and N have same cardinality? (hint: represent bijection by table)
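One possible table for the last question lists Z as 0, 1, -1, 2, -2, ... The sketch below writes that table as a pair of functions, assuming N = {1,2,3,...}; the names n_to_z and z_to_n are illustrative, and other bijections work equally well.

```python
def n_to_z(n):
    """Map n = 1, 2, 3, 4, 5, ... to 0, 1, -1, 2, -2, ..."""
    return n // 2 if n % 2 == 0 else -(n // 2)

def z_to_n(z):
    """Inverse map: positive z goes to an even n, zero and negatives to an odd n."""
    return 2 * z if z > 0 else 1 - 2 * z

print([n_to_z(n) for n in range(1, 8)])
```

Having an explicit inverse is one way to see that the map is both one-to-one and onto.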
   DEF: A set that is either finite or has the same cardinality
        as N (or Z) is called countable; otherwise it is uncountable.
        The intuition is that an uncountable set is much larger than 
        any countable set.
   More examples of countable sets (most sets we have seen are countable):

   Theorem: if A and B are countable, so is S = A union B
      proof: number elements of S by a(1), b(1), a(2), b(2),...
             i.e. f(i) = a((i+1)/2) if i is odd; b(i/2) if i is even,
             skipping any element that has already appeared.
             If A and B are both infinite, this is a one-to-one 
             correspondence between N and S (the finite cases are similar).
      It is enough to illustrate a bijection f:N->S of a set S with N (or Z)
         without writing down a formula for f, i.e. just show how to write down
         all members of S in order, each member of S appearing exactly once.
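The interleaving a(1), b(1), a(2), b(2), ... can be sketched with generators. This is only an illustration under two assumptions: both enumerations are infinite, and an element that has already been listed is skipped so that each member of the union appears exactly once. The function name interleave_union and the two example sets are made up for this sketch.

```python
from itertools import count, islice

def interleave_union(a, b):
    """Yield a(1), b(1), a(2), b(2), ..., skipping anything already listed."""
    seen = set()
    for x, y in zip(a, b):          # assumes both a and b are infinite
        for item in (x, y):
            if item not in seen:
                seen.add(item)
                yield item

evens = (2 * k for k in count(1))        # 2, 4, 6, ...
threes = (3 * k for k in count(1))       # 3, 6, 9, ...
print(list(islice(interleave_union(evens, threes), 8)))
```

Here 6 belongs to both sets and is listed only once, when it first shows up.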

   Theorem: The Cartesian product P =  A x B of all pairs {(a,b)} 
            is countable if A and B are countable
      proof: represent P as "lattice" points in the plane, 
             and number them diagonally.
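The diagonal numbering can be made concrete for the case A = B = N = {1,2,3,...}. The sketch below walks the anti-diagonals i + j = 2, 3, 4, ... of the lattice; the name diagonal_pairs is illustrative, and for general countable A and B one would replace (i, j) by (a(i), b(j)).

```python
from itertools import islice

def diagonal_pairs():
    """Enumerate N x N along the diagonals i + j = 2, 3, 4, ..."""
    d = 2
    while True:
        for i in range(1, d):
            yield (i, d - i)        # every pair on this diagonal has i + j = d
        d += 1

print(list(islice(diagonal_pairs(), 6)))
```

Each diagonal is finite, so every pair (i, j) is reached after finitely many steps, which is what makes this a numbering by N.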

   Theorem: Suppose A1, A2, A3, ... are all infinite countable sets
            Then S = A1 union A2 union A3 union ... is countable
ASK&WAIT: why? 

   Theorem: Suppose A is countable, and B is a subset of A. 
            Then B is countable
ASK&WAIT: why? 
                      
ASK&WAIT: Is Q (rational numbers) countable?
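##    yes: There is an obvious one-to-one correspondence between Q
##         and a subset of Z x Z (write each rational in lowest terms p/q,
##         q > 0, and map it to the pair (p,q)), so Q is countable by the
##         Cartesian product and subset theorems above.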
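One way to list Q explicitly is to walk the pairs (p, q) diagonally and keep only fractions in lowest terms. This is a sketch of one possible enumeration, not the only one; the function name rationals is made up, and it relies on Python's standard fractions and math modules.

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def rationals():
    """List every rational exactly once: 0, then +p/q and -p/q in lowest terms."""
    yield Fraction(0)
    d = 2
    while True:
        for p in range(1, d):
            q = d - p
            if gcd(p, q) == 1:      # skip 2/2, 2/4, ... which repeat earlier values
                yield Fraction(p, q)
                yield Fraction(-p, q)
        d += 1

print([str(x) for x in islice(rationals(), 7)])
```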

ASK&WAIT: Is J (set of syntactically correct programs) countable?

   Second Goal: Use (roughly) the same proof to show that
   1) There are more real numbers than rational numbers
   2) There are more functions with domain N than programs, 
      (so that not every function can be implemented by a program)
   3) A "super-debugger" or "infinite loop finder" is a
       particular function that cannot be implemented.

   Rough plan: Use proof by contradiction, i.e. assume the opposite
     is true (eg there are the same number of reals as rationals,
     or functions as programs) and get a contradiction. 
     More precisely, we will assume that the set of reals (or functions)
     is countable, and get a contradiction.

   Recall Def: A set is countable if either it is finite, or
     can be put in one-to-one correspondence with N = {1,2,3,...}

   Recall last time: Q = {rationals} is countable
                     J = {syntactically correct programs} is countable

   1) Theorem 1: The real numbers are not countable.
      Proof: It is enough to show that the set of 
      reals between 0 and 1
      is not countable, because the set of all reals is even bigger.
      We assume that these real numbers are countable. This
      means that there exists a list r(1),r(2),r(3),... in which each 
      real number appears once (or twice, if it has two binary expansions)
      We will get a contradiction by constructing another real number s
      that can't be in this list.
      Now write each real number as a binary fraction, eg 
       r(1)   = 1/2    = .100...
       r(2)   = 1/3    = .010101...
       r(3)   = pi - 3 = .0010010..
       r(4)   = 2/3    = .101010...
      When there are two ways to write the same number
      (eg 1/4 = .01000... = .00111...) include them both on the list.

      To construct a real number s that is not on this list,
      we will write down a binary fraction that 
      can't be r(1) because we choose s's 1st digit different from r(1)'s
      can't be r(2) because we choose s's 2nd digit different from r(2)'s
      can't be r(3) because we choose s's 3rd digit different from r(3)'s
      ...
      can't be r(i) because we choose s's ith digit different from r(i)'s
      ...
      
      Here is the simple rule for doing this
         "If the i-th digit of r(i) is x (= 0 or 1), 
         let the i-th digit of s be 1-x."
      For example, for our list above, we set s = .0001...
      Since this trick works for any proposed complete list r(1),r(2)...
      of the real numbers, no such complete list can exist.
      This completes the proof.
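The digit-flipping rule can be tried on the first four rows of the example list. The sketch below works on finite prefixes only, so it illustrates the rule rather than the full proof; the name diagonal_flip is made up, and each row is a string holding the leading binary digits of r(i).

```python
def diagonal_flip(rows):
    """Flip the i-th digit of the i-th row; rows[i] needs at least i+1 digits."""
    return "".join(str(1 - int(row[i])) for i, row in enumerate(rows))

rows = [
    "1000",   # r(1) = 1/2
    "0101",   # r(2) = 1/3
    "0010",   # r(3) = pi - 3
    "1010",   # r(4) = 2/3
]
s = diagonal_flip(rows)
print("." + s)
```

By construction s disagrees with row i in position i, so it cannot equal any row.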

   2) Theorem 2: the set of functions is not countable.

      Since the proof of Theorem 2 is almost the same as Theorem 1,
      we will CAPITALIZE the few differences between them below.

      Proof: It is enough to show that the set of 
      FUNCTIONS WITH DOMAIN N AND CODOMAIN D = {0,1}
      is not countable, because the set of all FUNCTIONS is even bigger.
      We assume that these FUNCTIONS are countable. This
      means that there exists a list r(1),r(2),r(3),... in which each 
      FUNCTION appears EXACTLY ONCE. 
      We will get a contradiction by constructing another FUNCTION s
      that can't be in this list.
      Now write each FUNCTION's OUTPUT as a LIST OF BITs, eg 
       r(1)(N)   = (100...)
       r(2)(N)   = (0101...)
       r(3)(N)   = (0010010..)
       r(4)(N)   = ( ...

      To construct a FUNCTION s that is not on this list,
      we will write down a LIST OF BITS that 
      can't be r(1)(N) because we choose s's 1st digit different from r(1)(N)'s
      can't be r(2)(N) because we choose s's 2nd digit different from r(2)(N)'s
      can't be r(3)(N) because we choose s's 3rd digit different from r(3)(N)'s
      ...
      can't be r(i)(N) because we choose s's ith digit different from r(i)(N)'s
      ...

      Here is the simple rule for doing this
        "If the i-th digit of r(i) is x (=0 or 1), 
        let the i-th digit of s be 1-x."
      For example, for our list above, we set s(N) = (000...)
      Since this trick works for any proposed complete list r(1),r(2)...
      of FUNCTIONS, no such complete list can exist.
      This completes the proof.
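For functions, the same rule reads s(i) = 1 - r(i)(i). The sketch below builds s from any proposed enumeration r; the name make_diagonal and the sample enumeration (r(j) is the indicator of "j divides i") are assumptions chosen only for illustration.

```python
def make_diagonal(r):
    """Given an enumeration r(1), r(2), ... of 0/1-valued functions on N,
    return the function s with s(i) = 1 - r(i)(i)."""
    def s(i):
        return 1 - r(i)(i)
    return s

def r(j):
    """Hypothetical enumeration: r(j)(i) = 1 if j divides i, else 0."""
    return lambda i: 1 if i % j == 0 else 0

s = make_diagonal(r)
print([s(i) for i in range(1, 6)])
```

Whatever enumeration is passed in, s differs from r(i) at input i, so s is missing from the list.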

      The idea of building a "counterexample" (real not on the list,
      function not on the list) by going down the diagonal of a table
      (eg one row for each real, one column for each bit)
      is called "diagonalization". We will use the same proof technique
      for the next result.

   3) To show that a "super-debugger" or "infinite loop finder" cannot
      be implemented requires a little more notation.
      A superdebugger is a program that implements the function
         S(program P,input I) = [ 1 if P halts on input I
                                [ 0 if P goes into infinite loop on input I
       In particular the program implementing S should never go into an 
            infinite loop itself. Another name for the question of
            whether a "super-debugger" exists is the "halting problem".

       As before, we restrict ourselves to the case where the input I of P
       is a single integer, since if this can't be solved, then neither
       can the general case. And since the set of all programs is 
       countable, we can write them in a list p(1), p(2),... and
       so think of the function S as a function of two integers:
         S(j,i) = [ 1 if program p(j) halts on input integer i
                  [ 0 if program p(j) goes into an infinite loop on input i
       To get a contradiction, we
       1) Assume that we have a program H(j,i) (for "halting") that can 
          indeed compute S(j,i) without itself going into an 
          infinite loop, and then 
       2) Construct another program C (for "contradiction") for which
          H must give the wrong answer, using diagonalization.

       Here is the idea of C in a table. In row j of the table,
       we record whether p(j) halts or loops on each input
       by writing down a list of 0s and 1s, where the i-th entry
       is 1 if p(j) halts on input i, and 0 if it loops.
       The table looks like this:
         p(1): 00110...
         p(2): 10101...
         p(3): 11110...
         ...
       For example, program p(3) halts on input i=4 because the
       4th digit in "11110..." is 1.
       The idea is to build a program C that is not on this list,
       because C
      can't be p(1) because we choose C to differ from p(1) on input 1
        (ie C loops if p(1) halts and C halts if p(1) loops)
      can't be p(2) because we choose C to differ from p(2) on input 2
      can't be p(3) because we choose C to differ from p(3) on input 3
      ...
       Here is C, which calls H as a subroutine:
          program C(integer i)
             if H(i,i) = 1  ... program p(i) halts on input i
                go into an infinite loop
             else           ... program p(i) loops on input i
                stop
             endif
       C is constructed to differ from each program on the "diagonal" 
       (i,i) of the table.

       Here is the contradiction. Since C is itself a program, 
       it must appear on the list of programs, say as program number c. 
       But it can't be on the list because its behavior 
       (halting or looping) differs from each program in the table.

       In more detail, we get a contradiction by asking what 
       happens when we try to compute C(c), i.e. 
       run C on itself. The idea is that H(c,c) should tell us
       whether C(c) goes into an infinite loop or not.
       There are two cases:
        1) Suppose H(c,c)=1, that is C(c) is supposed to halt.
           But looking at the code for C, we see that
           C(c) goes into an infinite loop if H(c,c)=1, a contradiction.
        2) Suppose H(c,c)=0, that is C(c) is supposed to loop.
           But looking at the code for C, we see that
           C(c) halts if H(c,c)=0, a contradiction.
       So assuming that the program H(j,i) existed led to a contradiction.
       So it can't exist.
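The construction of C can be sketched in Python for any candidate decider. This is an illustration and not a proof: it passes function objects around instead of program numbers, the names make_c and refute are made up, and the two candidate deciders are deliberately trivial. The sketch never runs C when C would loop; it reads the actual behavior off the code of C.

```python
def make_c(h):
    """Build program C from a claimed halting decider h(prog, arg) -> bool."""
    def c(arg):
        if h(arg, arg):
            while True:         # h says arg halts on itself, so C loops forever
                pass
        return None             # h says arg loops on itself, so C halts
    return c

def refute(h):
    """Return (what h claims about C run on C, what C actually does)."""
    c = make_c(h)
    claimed_halts = h(c, c)
    actual_halts = not claimed_halts    # by inspection of c's code, without running it
    return claimed_halts, actual_halts

print(refute(lambda prog, arg: True))    # candidate that always answers "halts"
print(refute(lambda prog, arg: False))   # candidate that always answers "loops"
```

In both cases the claimed and actual answers disagree, and the same happens for every candidate h, which mirrors the two cases in the argument above.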