- hash functions (structural cryptanalysis) (2 weeks)
  - birthday paradox
    (explain via "fish-in-the-lake" anecdote; if the lake has n fish,
     you only need O(sqrt(n)) catches before catching some fish twice)
  - parallel collision search
  - chaining attacks (e.g., meet-in-the-middle; see HAC)
  - time-space tradeoffs, Hellman's algorithm
  - double-width hashes from single-width primitives
  - FFT hashing
  - COMP128
  - dedicated constructions of PRFs


Review of the thought problem (autokey cipher):

                            [Kind of attack]
                     Indistinguishability     Key recovery
Ciphertext-only      <= O(n)                  ???
Known-plaintext      <= O(n)                  O(n^2 log n)
Chosen-plaintext     sqrt(n)                  O(n^2)
Chosen-ciphertext    2                        O(n^2)

Birthday attack
- a simple scenario
  - Suppose there are k different fish in a pond, and we're catching
    a fish at random, tagging it, and then throwing it back in
  - How many tries before we find the first repeated fish?  ~ sqrt(k)  [birthday]
  - How many tries before we see 63% of the fish?  ~ k  (1 - 1/e ~ 0.63)
  - How many tries before we've seen all fish?  ~ k log k  [coupon collector]
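
The three fish-pond estimates can be checked empirically. The following is a
minimal simulation sketch, not part of the original notes; the pond size k,
the trial counts, and the fixed seed are arbitrary choices for illustration.

```python
import math
import random

def first_repeat(k, rng):
    """Catches up to and including the first already-tagged fish."""
    tagged = set()
    while True:
        fish = rng.randrange(k)
        if fish in tagged:
            return len(tagged) + 1
        tagged.add(fish)

def fraction_seen(k, tries, rng):
    """Fraction of the k fish seen at least once after `tries` catches."""
    return len({rng.randrange(k) for _ in range(tries)}) / k

def tries_to_see_all(k, rng):
    """Catches needed until every one of the k fish has been tagged."""
    tagged, tries = set(), 0
    while len(tagged) < k:
        tagged.add(rng.randrange(k))
        tries += 1
    return tries

def mean(xs):
    return sum(xs) / len(xs)

rng = random.Random(1)  # fixed seed, only for repeatability
k = 2000                # pond size (arbitrary)

avg_repeat = mean([first_repeat(k, rng) for _ in range(500)])
avg_seen = mean([fraction_seen(k, k, rng) for _ in range(20)])
avg_all = mean([tries_to_see_all(k, rng) for _ in range(20)])

print(f"first repeat after {avg_repeat:.1f} tries; sqrt(k) = {math.sqrt(k):.1f}")
print(f"fraction seen after k tries: {avg_seen:.3f}; 1 - 1/e = {1 - math.exp(-1):.3f}")
print(f"all fish seen after {avg_all:.0f} tries; k ln k = {k * math.log(k):.0f}")
```

The measured averages should agree with sqrt(k), 63%, and k log k only up to
small constant factors, which is all the O() estimates claim.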
 
What's a hash function?  h : {0,1}^* -> {0,1}^n

Applications
- Storing your password in /etc/passwd
  - How many bits of output are needed for 128-bit security?  (128 bits)
  - Requirement
    - try #1: given y=h(x), can't find x.
      - not strong enough!  if you can find x' with h(x')=y, can use x'
        as an alternate password to log in
    - preimage resistant: given h(x), can't find x' with h(x')=h(x)
    - best brute-force attack: try random x' until h(x') = y;
      needs about 2^128 work for a 128-bit output
  - Will a CRC work?  No.  Not preimage-resistant.
    (What about a CRC with secret feedback polynomial?  No; can't
     keep the polynomial secret.)
- Detecting modification of files on hard drive
  - How many bits of output are needed for 128-bit security?  (256 bits)
    - birthday attack!
      pick 2^{n/2} random files (n = output length in bits),
      look for a pair that collides, store the first file on disk;
      then can later swap in the second file without detection
- Contract signing: hash the message, and then apply a public-key signature
  - How many bits of output are needed for 128-bit security?  (256 bits)
    - birthday attack!
      pick 2^{n/2} variations on "I buy your house for $500k"
      (add spaces, etc.); then pick 2^{n/2} variations on
      "I buy your house for $200k"; look for a collision between the two
      batches; then sign the result, and give it to seller, who verifies that
      we have committed to pay $500k; then after he signs over the title to
      us, we only pay him $200k, and when he hauls us into court, we show
      the judge the valid signature on the $200k offer.
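
The two-batch birthday attack on contract signing can be sketched on a toy
hash. This is an illustration under stated assumptions, not a real attack:
the "hash" is SHA-256 truncated to N_BITS = 24 bits so the search finishes
quickly, and the variations are made by appending a pattern of spaces and
tabs. The helper names (h, variant, find_cross_collision) are hypothetical.

```python
import hashlib

N_BITS = 24  # toy output width (assumption); the notes argue for 256 bits

def h(msg: bytes) -> int:
    """Toy n-bit hash: SHA-256 truncated to its top N_BITS bits."""
    digest = hashlib.sha256(msg).digest()
    return int.from_bytes(digest, "big") >> (256 - N_BITS)

def variant(base: str, i: int, width: int = 16) -> bytes:
    """i-th variation of base: bit b of i picks a tab or a space as the
    b-th trailing whitespace character."""
    pad = "".join("\t" if (i >> b) & 1 else " " for b in range(width))
    return (base + pad).encode()

def find_cross_collision(good: str, bad: str, batch: int):
    """Look for one variation of `good` and one of `bad` with equal hashes."""
    table = {h(variant(good, i)): i for i in range(batch)}
    for j in range(batch):
        m_bad = variant(bad, j)
        i = table.get(h(m_bad))
        if i is not None:
            return variant(good, i), m_bad
    return None  # unlucky; retry with a larger batch

# Batches of 2^14 are somewhat above 2^{n/2} = 2^12, so a hit is very likely.
pair = find_cross_collision("I buy your house for $500k",
                            "I buy your house for $200k", 1 << 14)
if pair is not None:
    good_msg, bad_msg = pair
    print(repr(good_msg), repr(bad_msg), hex(h(good_msg)))
```

A signature on h(good_msg) is equally valid for bad_msg, which is the whole
attack; with a 256-bit output the same search would cost about 2^128 hashes.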


Security notions for hashes:
- preimage resistance
- 2nd preimage resistance: given x, can't find x'!=x with h(x)=h(x')
- collision resistance: can't find any pair x != x' with h(x)=h(x')
- "pseudorandomness"

Best you can hope for with an n-bit hash:
- preimage resistance: 2^n work
- collision resistance: 2^{n/2} work

Hashing variable-length messages:
- Merkle-Damgard hashing
- It preserves preimage + collision resistance
- What if we leave out the length block?
  => fixed-point attacks
- What if we use CBC with a non-secret key for hashing? i.e., C(h,m)=E(h+m), + being xor
  => correcting-block attacks
- What if the compression function isn't preimage resistant?
  e.g., what if it is easily invertible?
  => meet-in-the-middle attack to invert hash with 2^{n/2} work
     (if inversion takes 2^s time, then mitm attack takes 2^{(n+s)/2} time:
      compute 2^{(n+s)/2} values forward, and 2^{(n-s)/2} values backward)
- What if we repeat the message twice, then hash it with
  an invertible compression function?
  => Coppersmith's triple-birthday attack (CRYPTO'85)
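
The meet-in-the-middle attack on an invertible compression function can be
sketched as follows. Everything here is a toy chosen for illustration: the
24-bit compression function, its constants, the IV, and the two-block hash
with no length block are assumptions, not a construction from the notes.
The modular inverse via pow() needs Python 3.8 or later.

```python
import random

N = 24                           # toy chaining-value width in bits
MASK = (1 << N) - 1
MUL = 0x9E3779                   # odd, hence invertible mod 2^N
MUL_INV = pow(MUL, -1, 1 << N)   # modular inverse (Python 3.8+)
IV = 0x123456                    # arbitrary public initial value

def compress(h, m):
    """Toy compression function; easy to invert in h once m is known."""
    x = ((h ^ m) * MUL) & MASK
    x ^= x >> 12                 # self-inverse because 12 >= N/2
    return (x + m) & MASK

def uncompress(y, m):
    """Return the h with compress(h, m) == y."""
    x = (y - m) & MASK
    x ^= x >> 12
    return ((x * MUL_INV) & MASK) ^ m

def toy_hash(blocks):
    """Iterated hash over the message blocks, without a length block."""
    h = IV
    for m in blocks:
        h = compress(h, m)
    return h

def mitm_preimage(target, forward=1 << 13, backward=1 << 16, seed=0):
    """Find a two-block message hashing to target by meeting in the middle."""
    rng = random.Random(seed)
    table = {}                   # chaining value after block 1 -> block 1
    for _ in range(forward):
        m1 = rng.randrange(1 << N)
        table[compress(IV, m1)] = m1
    for _ in range(backward):    # step backward from the target
        m2 = rng.randrange(1 << N)
        mid = uncompress(target, m2)
        if mid in table:
            return [table[mid], m2]
    return None

target = toy_hash([0xC0FFEE, 0x00BEEF])
found = mitm_preimage(target)
print(found, found is not None and toy_hash(found) == target)
```

With about 2^13 forward values, each backward step hits the table with
probability about 2^-11, so the search takes a few thousand steps instead
of the 2^24 needed by brute force.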

Black-box cryptanalysis and multipermutations
- Show an FFT network
- Show the "nonlinear" picture from Vaudenay & Schnorr's journal paper
- idea of "resolution"

Structural cryptanalysis of an S-N network
- Show a picture of a 1.5-round cipher:
  - A layer of 16 parallel 8-bit random S-boxes
  - A layer with a random linear map
  - Another layer of 16 parallel 8-bit random S-boxes
- What is its security under various guises?
  - public components
    => pre-image attacks are easy
  - secret components, random S-box functions
    => look for collisions in a single S-box in first layer: 2^4.5 messages
  - secret components, random bijective S-box functions
    => look at data path from Si in layer 1 to Sj in layer 2, and
       vary over all 256 inputs to Si; outputs from Sj will vary over
       all 256 values, or over 128 values twice, or 64 values four times
       each, or ...; gives a distinguishing attack
  - secret components, random bijective S-box functions, bijective linear map:
    => changing one byte in input is guaranteed to change all output bytes;
       gives a distinguishing attack
- What if we change to using 2 parallel 64-bit S-boxes?
  - if linear map is chosen randomly (not forced to be bijective),
    there will be a 2^32 work attack with good probability based on
    considering datapath from Si to Sj and looking for collisions in
    layer 2 of this path
- In general, for n-bit S-boxes and random linear map:
  - Let f be the datapath from Si to Sj
  - For a good hash function, f is a random function
  - For this construction, f is:
    - a random 1-to-1 (bijective) function (prob ~ .29), or
    - a random 2-to-1 function, or
    - a random 4-to-1 function, etc.
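
The structure of the datapath f can be checked experimentally. The sketch
below assumes 8-bit S-boxes and models the part of the linear map connecting
Si to Sj as a uniformly random 8x8 matrix over GF(2); both the model and the
helper names are illustrative assumptions. It checks that f is exactly
2^(n-r)-to-1 onto its image, where r is the rank of that matrix, and
estimates how often r = n (the 1-to-1 case).

```python
import random
from collections import Counter

N = 8  # S-box width in bits

def random_sbox(rng):
    """Random bijective N-bit S-box as a lookup table."""
    table = list(range(1 << N))
    rng.shuffle(table)
    return table

def random_matrix(rng):
    """Random N x N matrix over GF(2), one N-bit integer per row."""
    return [rng.getrandbits(N) for _ in range(N)]

def mat_vec(rows, x):
    """Matrix times bit-vector over GF(2): output bit i = parity(row_i & x)."""
    y = 0
    for i, row in enumerate(rows):
        y |= (bin(row & x).count("1") & 1) << i
    return y

def rank(rows):
    """Rank over GF(2) by Gaussian elimination on integer-encoded rows."""
    rows, r = list(rows), 0
    for bit in range(N):
        pivot = next((i for i in range(r, N) if (rows[i] >> bit) & 1), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(N):
            if i != r and (rows[i] >> bit) & 1:
                rows[i] ^= rows[r]
        r += 1
    return r

def datapath_trial(rng):
    """Return (set of preimage counts of f, rank of the linear part)."""
    s_i, s_j, m = random_sbox(rng), random_sbox(rng), random_matrix(rng)
    outputs = Counter(s_j[mat_vec(m, s_i[x])] for x in range(1 << N))
    return set(outputs.values()), rank(m)

def prob_invertible(n):
    """Probability that a random n x n matrix over GF(2) has full rank."""
    p = 1.0
    for i in range(1, n + 1):
        p *= 1 - 2.0 ** -i
    return p

rng = random.Random(2)  # fixed seed, only for repeatability
trials = [datapath_trial(rng) for _ in range(300)]
frac_bijective = sum(r == N for _, r in trials) / len(trials)
print(f"bijective fraction {frac_bijective:.2f}, formula {prob_invertible(N):.3f}")
```

In every trial each output value that occurs at all occurs exactly
2^(N - rank) times, which is what separates f from a random function.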