Math 55 - Fall 2007 - Lecture notes # 15 - Oct 3 (Wednesday)

   Finish through end of Section 3.7 (not 3.8)

   Goals for today: Recall Euclidean algorithm for the gcd
                    Use it to solve a "congruence equation" 
                         a*x=b mod m for x
                         how to do division in modular arithmetic
                    Use it to solve a system of congruence equations:
                         Chinese Remainder Theorem
                    Apply it to cryptography
                    
   Recall property of Euclidean algorithm for gcd: given a and m, 
   it computes
       1) d = gcd(a,m)
       2) integers s and t such that a*s+m*t=d 
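
   A minimal sketch of the extended Euclidean algorithm in Python, to make
   properties 1) and 2) concrete. The name extended_gcd and the variable
   names are ours, not from the text; other formulations work equally well.

```python
def extended_gcd(a, m):
    """Return (d, s, t) with d = gcd(a, m) and a*s + m*t == d."""
    # Invariant: old_r == a*old_s + m*old_t  and  r == a*s + m*t.
    old_r, r = a, m
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

d, s, t = extended_gcd(13, 2436)
print(d, s, t, 13 * s + 2436 * t)   # prints: 1 937 -5 1
```

   The pair (13, 2436) is chosen only as an illustration; any nonnegative
   a and m can be used.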

   How to solve a*x == 1 mod m for x: 
   (analog of the reciprocal of a in modular arithmetic)
   Theorem: a*x ==1 mod m can be solved for x if and only if gcd(a,m)=1.
          When it can be solved, x is unique mod m, i.e. the only one in
          the range 0 to m-1, and is called "the inverse of a modulo m".
     EX: Solve 2*x==1 mod 5: try x=0,1,2,3,4, 
                       getting 2*x=0,2,4,1,3, 
         so x=3 is the unique answer (gcd(2,5)=1)
     EX: Solve 2*x==1 mod 4: try x=0,1,2,3, 
                       getting 2*x=0,2,0,2, 
         so there is no solution (gcd(2,4)=2)
     Proof: If gcd(a,m)=1, we have to show that we can solve for x:
            Use the Euclidean algorithm to find s and t such that 
            a*s+m*t=1. Thus a*s = 1-m*t == 1 mod m, so x=s is a solution.

            If gcd(a,m) /= 1, we have to show that no x satisfies 
            a*x==1 mod m: Recall that a*x == 1 mod m is equivalent to 
            a*x mod m = 1, and that a*x mod m = a*x+m*t 
            for some t.  But if gcd(a,m)=d>1, then d|a and d|m, 
            so d|(a*x+m*t) for any integer t, and in particular 
            d|(a*x mod m).  Since d does not divide 1, a*x mod m /= 1.

            To show that the solution x is unique mod m when it exists,
            suppose both that a*x1 == 1 mod m and a*x2 == 1 mod m, and
            that 1 <= x1 < m and 1 <= x2 < m; we have to show that x1=x2. 
            Now a*x1-a*x2 == 0 mod m, so m|(a*(x1-x2)). 
            Since gcd(a,m)=1, a and m have no common factors,
            and thus m|(x1-x2). Now x1-x2 satisfies two properties:
              1) m|(x1-x2), so x1-x2 is in the set 
                  {..., -2*m,-m,0,m,2*m,...}
              2) -m < x1-x2 < m, since 1 <= x1 < m and 1 <= x2 < m;
            The only value of x1-x2 satisfying these properties is 
            x1-x2=0, or x1=x2 as desired.
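
   The two examples above find x by trying every value. A small Python
   sketch of that search (the helper name inverses is ours) also lets us
   check the Theorem numerically for small m; this is evidence, not a proof.

```python
from math import gcd

def inverses(a, m):
    """All x in 0..m-1 with a*x == 1 mod m, found by trying every x."""
    return [x for x in range(m) if (a * x) % m == 1]

print(inverses(2, 5))   # [3]
print(inverses(2, 4))   # []

# Exactly one inverse when gcd(a, m) = 1, none otherwise.
for m in range(2, 50):
    for a in range(m):
        expected = 1 if gcd(a, m) == 1 else 0
        assert len(inverses(a, m)) == expected
```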

   Corollary: a*y == b mod m has a solution y for any b if and only if
       gcd(a,m) = 1  (analog of dividing b/a in modular arithmetic)
       proof: if gcd(a,m)=1, then the Theorem says we can solve 
              a*x == 1 mod m. Multiply through by b to get  
              a*(x*b) == b mod m, so we can take y = x*b 
              (b times "inverse of a") If gcd(a,m)>1, 
              then the Theorem tells us we cannot solve when b=1.
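
   A short sketch of the Corollary's recipe y = x*b in Python, assuming
   gcd(a,m)=1. It relies on the built-in pow(a, -1, m), available in
   Python 3.8 and later, to get the inverse; the name solve_congruence
   is ours.

```python
def solve_congruence(a, b, m):
    """Return the y in 0..m-1 with a*y == b mod m, assuming gcd(a, m) = 1."""
    x = pow(a, -1, m)      # inverse of a mod m; raises ValueError if none exists
    return (x * b) % m     # b times "inverse of a"

y = solve_congruence(2, 4, 5)
print(y, (2 * y) % 5)      # prints: 2 4
```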

ASK&WAIT: under what conditions on b can we solve a*y == b mod m for y?

   Chinese Remainder Theorem: Let m1, m2,..., mn be pairwise
       relatively prime numbers, ie gcd(mi,mj)=1 for all i /= j.
       Let m = m1*m2*...*mn.  Then the n equations  
         x == a1 mod m1 , x == a2 mod m2, ... , x == an mod mn
       have a unique solution mod m for any a1, a2,...,an, i.e. there is
       only one solution in the range from 0 to m-1.
   EX: x == 2 mod 3, x == 3 mod 5 
           x    x==2 mod 3 ?   x==3 mod 5?
           0                                
           1                                
           2        Yes
           3                      Yes       
           4                                
           5        Yes                     
           6                                
           7                                
           8        Yes           Yes  x=8 is unique solution mod 3*5=15
           9                                
          10                                
          11        Yes                     
          12                                
          13                      Yes       
          14        Yes                     

    Proof: We give an algorithm for computing x, and leave uniqueness to
        homework. Let Mi = m/mi, for i=1,...,n. Thus
        Mi = product of all mj except for mi, so gcd(Mi,mi)=1, since
        mj and mi have no common factors. By the last theorem, each
        Mi has an inverse yi mod mi, i.e. Mi*yi == 1 mod mi.
        We claim a solution is x = a1*M1*y1 + a2*M2*y2 + ... + an*Mn*yn.
        To confirm this we have to verify that x == ai mod mi for all i:
        x == (a1*M1*y1 + a2*M2*y2 + ... + an*Mn*yn) mod mi
          == ( a1*M1*y1 mod mi + ... + an*Mn*yn mod mi ) mod mi
          == ( ai*Mi*yi mod mi 
             + sum_{j /= i} aj*Mj*yj ) mod mi
          == ai mod mi       since Mi*yi == 1 mod mi
             + 0             since mi | Mj when j /= i
          == ai mod mi       as desired

   EX: x == 2 mod 3, x == 3 mod 5 again:
       a1=2, m1=3, a2=3, m2=5, M1=5, M2=3, 
       y1=2 since 2*5==1 mod 3   and y2=2 since 2*3==1 mod 5
       x = 2*5*2 + 3*3*2 = 38 == 8 mod 15
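
   The construction in the proof can be sketched in Python as follows.
   The function name crt is ours; math.prod and pow(M, -1, m) need
   Python 3.8 or later, and the moduli are assumed pairwise relatively
   prime.

```python
from math import prod

def crt(residues, moduli):
    """Solve x == a_i mod m_i for pairwise relatively prime moduli m_i."""
    m = prod(moduli)
    x = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = m // m_i                # product of all moduli except m_i
        y_i = pow(M_i, -1, m_i)       # inverse of M_i mod m_i
        x += a_i * M_i * y_i
    return x % m

print(crt([2, 3], [3, 5]))            # 8
```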
                    
  Cryptography
     Recall that a message (character string) is converted to a number M
     What happens when a Sender wants to send a secret message to 
     a Receiver:
       The Sender takes message M and encrypts it to get the
            encrypted message C = f_enc(M)
       The Sender sends C to the Receiver. Anyone may "intercept" 
            C on its way.
       The Receiver decrypts C to get the original message M = f_dec(C).

     For this to work as the Sender and Receiver desire:

       f_enc and f_dec have to be one-to-one, onto functions and be
         inverses of one another, i.e. M = f_dec(f_enc(M)) for all M
       It is easy for the Sender to evaluate f_enc
       It is easy for the Receiver to evaluate f_dec
       It is very hard for anyone other than the Receiver to evaluate 
         f_dec.  The harder it is, the better the secrecy.
         
     Two kinds of cryptography:
       Private key (traditional): need one "Key" for both f_enc and f_dec
         where K=Key is a shared secret between Sender, Receiver
  EX: shift:   C = f_enc(M) = M-K mod n, M = f_dec(C) = C+K mod n,
      easy to break
ASK&WAIT: How?
  EX: xor:  C = f_enc(M) = M xor K (thinking of M, C, K as bit strings
            of the same length)
            M = f_dec(C) = C xor K 
ASK&WAIT: Why are f_enc and f_dec inverses?
      hard to break if K used once
  EX: Original Washington/Moscow hotline worked this way
  EX: crypt command in UNIX, uses algorithm from German Enigma machine
      used in World War II, which was broken by Turing
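
   A small Python sketch of the shift and xor examples above, checking
   only that f_dec undoes f_enc. The function names are ours, and the
   shift alphabet size n=26 is an assumption for illustration.

```python
import secrets

def shift_enc(M, K, n):
    return (M - K) % n            # C = M-K mod n, as in the shift example

def shift_dec(C, K, n):
    return (C + K) % n            # M = C+K mod n

def xor_bytes(data, key):
    """Bitwise xor of two byte strings of the same length."""
    if len(data) != len(key):
        raise ValueError("message and key must have the same length")
    return bytes(d ^ k for d, k in zip(data, key))

# Shift: decrypting an encrypted letter gives the letter back.
assert all(shift_dec(shift_enc(M, 3, 26), 3, 26) == M for M in range(26))

# Xor: the same operation with the same key encrypts and decrypts.
M = b"STOP"
K = secrets.token_bytes(len(M))   # fresh random key, used once
C = xor_bytes(M, K)               # f_enc
assert xor_bytes(C, K) == M       # f_dec
```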

  Secrecy depends on keeping K a secret known only to Sender, Receiver
  so only they can evaluate f_enc and f_dec
  Disadvantage: if 1000 people want to talk to one another in secret,
  need 1000*999/2 = 499500 secret keys (one per pair), so all pairs can
  talk; too many keys!

  Public key: any Sender can do f_enc, but only one Receiver can do f_dec
  Advantage: for 1000 people to talk in secret, each person has his/her
       own secret f_dec, but can just publish the corresponding f_enc
  EX: RSA (Rivest/Shamir/Adleman)
    Need: 1) large number n that is product of two large primes p*q=n
            large means 200 to 400 decimal digits
          2) integer e that is relatively prime to (p-1)*(q-1)
          3) integer d = inverse of e mod (p-1)*(q-1)
    Everyone knows n and e, but only Receiver  knows d
   Then for message M, C = f_enc(M) = M^e mod n is the encrypted message
      For encrypted message C, M = f_dec(C) = C^d mod n is the decrypted 
          message
  EX: Try 2537=n=p*q=43*59, e=13, message = STOP = (ST,OP)=(1819,1415)
      using position of letters in alphabet (A=00,...,Z=25).  Then encrypted message 
      = ( 1819^13 mod 2537 , 1415^13 mod 2537 ) = ( 2081, 2182 ).
      To decrypt we use d = 937 and compute
      ( 2081^937 mod 2537 , 2182^937 mod 2537 ) = (1819,1415)
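
   The example can be replayed with Python's built-in three-argument pow,
   which computes powers mod n. This is only a check of the numbers above,
   not a usable RSA implementation (n is far too small).

```python
n, e, d = 2537, 13, 937
blocks = [1819, 1415]                      # "ST", "OP" with A=00, ..., Z=25
cipher = [pow(M, e, n) for M in blocks]    # f_enc(M) = M^e mod n
plain = [pow(C, d, n) for C in cipher]     # f_dec(C) = C^d mod n
print(cipher)   # [2081, 2182]
print(plain)    # [1819, 1415]
```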

  We will show that f_enc and f_dec are inverses of one another shortly.
  But first, why is f_enc() easy and f_dec() hard to evaluate?
    f_enc() requires repeatedly multiplying by M and taking the remainder mod n,
      both of which are easy, even if M and n are large.
    f_dec() equally easy if we know d, which only the Receiver knows.
      Why is it hard to figure out d? All you have to do is
       1) factor n=p*q
       2) use Euclidean algorithm to compute d so d*e ==1 mod (p-1)*(q-1)
     But 1) is very hard: Best algorithms would take billions of years
     if n has 400 digits. And any other known algorithm to compute d 
     leads to computing p and q too. So quality of encryption depends on
     large integers being very hard to factor. If you figure out an
     algorithm to factor quickly, you can become rich or famous.
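
   One standard way to keep f_enc() cheap is repeated squaring, which
   needs a number of multiplications proportional to the number of bits
   in e. The sketch below is an illustration (mod_pow is our name), not
   necessarily the method intended in lecture; Python's built-in
   pow(M, e, n) does the same job.

```python
def mod_pow(base, exponent, n):
    """Compute base**exponent mod n by repeated squaring (exponent >= 0)."""
    result = 1 % n
    base %= n
    while exponent > 0:
        if exponent & 1:                 # current low bit of the exponent is 1
            result = (result * base) % n
        base = (base * base) % n         # square for the next bit
        exponent >>= 1
    return result

print(mod_pow(1819, 13, 2537))           # 2081
```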
  
   Proof that f_dec() is inverse of f_enc requires
   Fermat's Little Theorem (proof later)
       If p is prime and p /| a, then a^(p-1) == 1 mod p

   Proof that f_dec(f_enc(M)) = M, where M < p,q
   f_dec(f_enc(M)) = f_dec(M^e mod n) = (M^e)^d mod n = M^(e*d) mod n.
   We need to show that M^(e*d) mod n = M mod n = M, since M < p*q = n.
   Now e*d == 1 mod (p-1)*(q-1) so e*d = 1+m*(p-1)*(q-1) for some m. Then
   M^(e*d) mod n = M^(1 + m*(p-1)*(q-1)) mod n
                 = M * M^(m*(p-1)*(q-1)) mod n
   Now since M < p and M < q, and p and q are prime, we must have
   gcd(M,p) = gcd(M,q) = 1. Then Fermat's Little Theorem implies that
   M^(p-1) == 1 mod p and M^(q-1) == 1 mod q.
   Thus  M^(e*d) = M * (M^(p-1))^(m*(q-1)) 
                == M * (1)^(m*(q-1)) mod p 
                == M mod p
   and  M^(e*d) = M * (M^(q-1))^(m*(p-1)) 
               == M * (1)^(m*(p-1)) mod q 
               == M mod q.
   Finally, by the Chinese Remainder Theorem, M^(e*d) is the unique
   solution mod p*q to
     x == M mod p
     x == M mod q
   so M^(e*d) mod n = M as desired.
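
   The steps of this proof can be checked numerically on the earlier
   example p=43, q=59, e=13, d=937. The Python below is a sanity check
   under the proof's assumption M < p,q, not a replacement for the proof;
   k is our name for the integer called m in the proof.

```python
p, q, e, d = 43, 59, 13, 937
n, phi = p * q, (p - 1) * (q - 1)

assert (e * d) % phi == 1
k = (e * d - 1) // phi            # e*d = 1 + k*(p-1)*(q-1)

for M in range(1, p):             # 1 <= M < p < q
    assert pow(M, p - 1, p) == 1  # Fermat's Little Theorem mod p
    assert pow(M, q - 1, q) == 1  # Fermat's Little Theorem mod q
    x = pow(M, e * d, n)
    assert x % p == M % p and x % q == M % q
    assert x == M                 # unique solution mod p*q

print(k)                          # 5
```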

   For RSA to be useful, we need to find a lot of large primes.
   It turns out that there are so many primes, you can just
   pick numbers randomly and test if they are prime; 
   there are enough primes that chances are you won't have
   to test too many random numbers before finding one.

   Def: pi(n) = the number of primes <= n
   Ex:  pi(20) = |{2,3,5,7,11,13,17,19}| = 8
   Theorem (Prime Number Theorem): The limit as n -> infinity of
      pi(n) / (n/ log_e n) = 1

   EX:   n     pi(n)   n/log_e(n)    pi(n)/ (n/log_e n)
        10^1       4       4.3        .92
        10^2      25      21.7       1.15
        10^3     168     144.8       1.16
        10^4    1229    1085.7       1.13
        10^5    9592    8685.9       1.10
        10^6   78498   72382.4       1.08
        10^7  664579  620420.7       1.07
        10^8 5761455 5428681.0       1.06
   The point is that the ratio in the last column is slowly approaching 1
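
   The first rows of the table can be recomputed with a short Python
   sketch. It counts primes with the sieve of Eratosthenes, which is our
   choice for illustration (prime_count is our name); larger n would need
   more memory and time.

```python
from math import isqrt, log

def prime_count(n):
    """pi(n), computed with the sieve of Eratosthenes (n >= 1)."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return sum(is_prime)

for k in range(1, 7):
    n = 10 ** k
    estimate = n / log(n)             # n / log_e n
    count = prime_count(n)
    print(n, count, round(estimate, 1), round(count / estimate, 2))
```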

   So about what fraction of 200 decimal digit numbers are prime?
     # 200 digit primes / # 200 digit numbers
  =  ( pi(10^200) - pi(10^199) ) / (10^200 - 10^199 )
  ~  ( 10^200/log_e(10^200) - 10^199/log_e(10^199) ) / (10^200 - 10^199)
  ~  .002 or about 1 out of 500
    So if you pick 500 random 200 digit numbers, 
    there is a reasonable chance that one is prime.
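
    The estimate can be evaluated directly. The Python sketch below
    (prime_fraction is our name) uses the approximation pi(n) ~ n/log_e n
    from the Prime Number Theorem, so the result is itself approximate.

```python
from math import log

def prime_fraction(digits):
    """Estimated fraction of primes among numbers with `digits` decimal digits."""
    # pi(10^k) ~ 10^k / (k * ln 10); numerator and denominator are both
    # divided by 10^(digits-1) to keep the numbers small.
    top = 10 / (digits * log(10)) - 1 / ((digits - 1) * log(10))
    return top / (10 - 1)

f = prime_fraction(200)
print(f, 1 / f)     # roughly .002, i.e. about 1 out of 500
```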

    But we still need a quick test that a particular number is
    prime. We already said that trying to factor big numbers
    is too expensive (which is why we can use RSA safely in
    the first place!), so we need something cheaper. 

    It turns out that Fermat's Little Theorem tells us (almost)
    all we need: a prime p satisfies  a^(p-1) == 1 mod p, which
    is cheap to test for some randomly chosen a not divisible by p;
    if a^(p-1) is not == 1 mod p, we are sure p is not prime.
    But if a^(p-1) == 1 mod p for enough randomly chosen a, we have
    strong evidence that p is prime. This is not quite enough
    (there is a set of nonprimes, called "Carmichael numbers", that
    pass this test), but the test can be improved to identify
    primes reliably.
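
    A minimal Python sketch of this test. The bases are fixed here so the
    output is reproducible; in practice they would be chosen at random.
    The name passes_fermat is ours, and 561 = 3*11*17 is the smallest
    Carmichael number.

```python
def passes_fermat(n, bases):
    """True if a^(n-1) == 1 mod n for every base a given."""
    return all(pow(a, n - 1, n) == 1 for a in bases)

print(passes_fermat(97, [2, 3, 5]))     # True: 97 is prime
print(passes_fermat(91, [2, 3, 5]))     # False: 91 = 7*13 is caught by a=2
print(passes_fermat(561, [2, 5, 7]))    # True, although 561 is not prime
```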
                    
   To finish cryptography, we need a proof of Fermat's Little Theorem:
   Thm: If p is prime and p /| a, then a^(p-1) == 1 mod p
   
   Here are some "numerical experiments" to devise a conjecture for the proof:
    
      consider integers 1 <= i < p, for some prime p, say p=7.
      Try multiplying them by any integer mod p, see what you get:
                   1 2 3 4 5 6
       *2 mod 7 => 2 4 6 1 3 5
       *3 mod 7 => 3 6 2 5 1 4
       *4 mod 7 => 4 1 5 2 6 3
       *5 mod 7 => 5 3 1 6 4 2
       *6 mod 7 => 6 5 4 3 2 1
ASK&WAIT:  What is the pattern?
       Can see same pattern for any prime p
    Conjecture (proven shortly): given any prime p and any 1 <= a < p,
      the numbers a*1 mod p, a*2 mod p , ... a*(p-1) mod p are 
      all different, i.e. just a permutation of 1,...,p-1
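
    The experiment for p=7 can be repeated for any prime with a few lines
    of Python (multiples_mod is our name). Checking rows this way supports
    the Conjecture but does not prove it.

```python
def multiples_mod(a, p):
    """The list a*1 mod p, a*2 mod p, ..., a*(p-1) mod p."""
    return [(a * i) % p for i in range(1, p)]

p = 7
for a in range(2, p):
    row = multiples_mod(a, p)
    print(f"*{a} mod {p} =>", *row)
    # each row should be a rearrangement of 1, ..., p-1
    assert sorted(row) == list(range(1, p))
```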

    Now take their product:
      (p-1)! = ((a*1) mod p) * ((a*2) mod p) * ... * ((a*(p-1)) mod p)
    or
      (p-1)! == (a*1)*(a*2)*...*(a*(p-1)) mod p
             == a^(p-1) * (p-1)! mod p

    Suppose we could "divide by" (p-1)!; 
    would get 1 == a^(p-1) mod p as desired

    Now let's do proof carefully:
    Proof of Conjecture: suppose 1 <= x,y < p , x /= y
             so -(p-1) <= x-y <= p-1, x /= y
             so p /| x-y
             so p /| a*(x-y)   (p is prime, p /| a and p /| x-y)
             so a*x mod p /= a*y mod p
       In other words, a*1 mod p, a*2 mod p , ... , a*(p-1) mod p 
       are all different, and none is 0 since p /| a*x, as conjectured.

   So now we have (p-1)! == a^(p-1)*(p-1)! mod p, and want
   to conclude 1 == a^(p-1) mod p
ASK&WAIT: What did we prove last time that lets us do this?
   Since p is prime, gcd((p-1)!,p)=1. Thus (p-1)!*x == 1 mod p has a unique
   solution x; multiply through by x to get
   (p-1)!*x == a^(p-1)*(p-1)!*x mod p 
     or
          1 == a^(p-1)*1 mod p
     as desired
For homework, you will show more, that 
   (p-1)! == -1 mod p (Wilson's Theorem)