From daw@cs.berkeley.edu Sat Nov 14 17:39:43 PST 1998
Article: 1213 of sci.crypt.research
From: daw@cs.berkeley.edu (David Wagner)
Newsgroups: sci.crypt.research
Subject: Re: Entropy vs Work (Mistake in Crypto FAQ?)
Date: 14 Nov 1998 14:05:04 GMT
Organization: ISAAC Group, UC Berkeley
Lines: 93
Approved: crypt-submission@cs.auckland.ac.nz
Message-ID: <72k2mg$gm0$1@scream.auckland.ac.nz>
References: <72dcle$us7$1@joseph.cs.berkeley.edu>
Reply-To: daw@cs.berkeley.edu (David Wagner)
NNTP-Posting-Host: gate.student.auckland.ac.nz
X-Newsreader: NN version 6.5.0 #3 (NOV)
Path: agate!newsfeed.berkeley.edu!newsfeed.cwix.com!203.97.37.7!newsfeed.clear.net.nz!news.iprolink.co.nz!auckland.ac.nz!not-for-mail
Xref: agate sci.crypt.research:1213

In article <72dcle$us7$1@joseph.cs.berkeley.edu>,
John Pliam wrote:
> I believe I have a very strong family of counterexamples to a
> statement in the "Cryptography FAQ".
>
> > We can measure how bad a key distribution is by calculating
> > its entropy. This number E is the number of ``real bits
> > of information'' of the key: a cryptanalyst will typically
> > happen across the key within 2^E guesses. E is defined as
> > the sum of -p_K log_2 p_K, where p_K is the probability
> > of key K.
>
> As a general rule, this is dangerously wrong.

Agreed. Good catch!
As a general rule, there are many measures of entropy (not just the
Shannon entropy defined above), which are useful in different contexts;
confusing them can be dangerous.
Here are several examples:
+ Shannon entropy: H(X) = - \sum_x Pr_X(x) \log_2 Pr_X(x)
Useful for analyzing one-time pads, secret sharing, compression,
asymptotics, etc.
+ Renyi entropy (of order 2): H_2(X) = - \log_2 \sum_x Pr_X(x)^2
Useful for analyzing the probability of collisions.
+ ``Guessing'' entropy: H_G(X) = \log_2 \sum_i i p_i
if the probabilities Pr_X(x) are arranged so that p_1 >= p_2 >= ...
Useful for analyzing the expected cost of keysearch, inverting hash
functions, etc.

I'm stealing liberally from Chapter 3 of [1] here.

The different measures of entropy don't necessarily agree, of course.
I think many people are often very sloppy about their use of the word
``entropy.'' Your paper seems to be a useful reminder of just how
misleading our intuition can be.
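To see how sharply the three measures can disagree, here is a small Python sketch (my own illustration, using a hypothetical key distribution: one key with probability 1/2 among 1024 keys) that computes each of them from the definitions above:

```python
import math

def shannon(p):
    # H(X) = -sum_x Pr[x] log2 Pr[x]
    return -sum(q * math.log2(q) for q in p if q > 0)

def renyi2(p):
    # H_2(X) = -log2 sum_x Pr[x]^2
    return -math.log2(sum(q * q for q in p))

def guessing(p):
    # H_G(X) = log2 sum_i i*p_i, with probabilities sorted p_1 >= p_2 >= ...
    s = sorted(p, reverse=True)
    return math.log2(sum(i * q for i, q in enumerate(s, start=1)))

# Skewed distribution: one key with probability 1/2, 1023 equally rare keys.
p = [0.5] + [0.5 / 1023] * 1023
print(round(shannon(p), 2))   # 6.0
print(round(renyi2(p), 2))    # 2.0
print(round(guessing(p), 2))  # 8.01
```

Half the time the very first guess succeeds, yet the Shannon entropy is about 6 bits: the three numbers are answering three different questions.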

Perhaps an example would be useful. Let's take Renyi entropy, say.
Suppose we have a memoryless source which generates outputs according to
the distribution of the random variable X. Then H_2(X) is a good
measure of the collision properties of the source. The probability that
any two fixed outputs collide is 2^{-H_2(X)}; also, after n outputs, the
expected number of collisions is n(n-1)/2 * 2^{-H_2(X)}. See [1] for
other results.
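The collision estimate is easy to check by simulation; here is a sketch of mine, with an arbitrary four-symbol source:

```python
import math, random
from collections import Counter

def renyi2(p):
    # H_2(X) = -log2 sum_x Pr[x]^2
    return -math.log2(sum(q * q for q in p))

# Memoryless source over four symbols with a skewed distribution.
p = [0.5, 0.25, 0.125, 0.125]
n = 1000
predicted = n * (n - 1) / 2 * 2 ** (-renyi2(p))  # n(n-1)/2 * 2^{-H_2(X)}

# Draw n outputs and count colliding pairs directly.
random.seed(1)
outs = random.choices(range(4), weights=p, k=n)
observed = sum(c * (c - 1) // 2 for c in Counter(outs).values())
print(predicted, observed)  # the two counts should be close
```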

As an application of Renyi entropy, suppose we encrypt plaintext under
ECB mode, and the random variable X represents the distribution of
blocks of plaintext. Then after n outputs the expected number of
collisions in ciphertext blocks which leak information on the plaintext
is n(n-1)/2 * 2^{-H_2(X)}. So when we are interested in collision
properties, it seems that the Renyi entropy roughly matches our
intuition.
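The leak itself is easy to see with a toy stand-in for a block cipher (a fixed random byte permutation of my own devising; any real cipher in ECB mode behaves the same way at the block level):

```python
import random

# Toy stand-in for a block cipher with one-byte "blocks": a fixed random
# permutation of 0..255. Only the determinism of ECB matters here.
random.seed(0)
perm = list(range(256))
random.shuffle(perm)

plaintext = b"attack at dawn, attack at dusk"
ciphertext = [perm[b] for b in plaintext]  # ECB: each block encrypted independently

# Ciphertext blocks collide exactly where plaintext blocks collide,
# so every repeated plaintext block is visible to an eavesdropper.
for i in range(len(plaintext)):
    for j in range(i + 1, len(plaintext)):
        assert (ciphertext[i] == ciphertext[j]) == (plaintext[i] == plaintext[j])
```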

As an example application of ``guessing'' entropy, imagine analyzing the
Unix password hashing scheme, where we are given the hash of the user's
password. If we let the r.v. X represent the distribution of users'
passwords, then the expected amount of work needed (in an optimal
dictionary search attack) to recover a single target's password is
2^{H_G(X)}.
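Numerically, with a hypothetical password distribution (the numbers below are made up purely for illustration):

```python
import math

# Hypothetical password distribution: three very common passwords,
# then a uniform tail of 90,000 rare ones (illustrative numbers only).
probs = [0.05, 0.03, 0.02] + [0.9 / 90000] * 90000

# Optimal dictionary attack: try candidates in decreasing probability order.
probs.sort(reverse=True)
expected_work = sum(i * q for i, q in enumerate(probs, start=1))
H_G = math.log2(expected_work)  # guessing entropy; expected work is 2^{H_G}
print(round(expected_work), round(H_G, 1))  # about 40503 guesses, ~15.3 bits
```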

You give a generalization, wf_p(X), which measures the expected work
required to gain a probability p of success. This seems useful in the
case that we have many cryptanalytic targets. Suppose we have a
password file containing n hashes; if we do wf_{c/n}(X) work, the
expected number of passwords broken is c (assuming independence of
password choices).
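A sketch of wf_p under the natural reading (the least number of guesses, tried in decreasing-probability order, whose cumulative success probability reaches p); the distribution is again made up:

```python
def work_factor(probs, p):
    # wf_p(X): least number of guesses, in decreasing-probability order,
    # whose cumulative success probability reaches p.
    total = 0.0
    for k, q in enumerate(sorted(probs, reverse=True), start=1):
        total += q
        if total >= p:
            return k
    return len(probs)

# Skewed distribution: a few likely keys plus a uniform tail of 30 rare ones.
probs = [0.4, 0.2, 0.1] + [0.3 / 30] * 30
print(work_factor(probs, 0.5))       # 2: two guesses already succeed 60% of the time
print(work_factor(probs, 10 / 100))  # 1: the top guess alone succeeds 10% of the time
```

Trying the top wf_{c/n} candidates against each of the n hashes then breaks an expected n * (c/n) = c passwords.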

Note that [1] cites a paper by Massey [2] that apparently proves the
bound 2^{H_G(X)} >= 2^{H(X)-2} + 1. [1] also cites a paper by McEliece
and Yu [3] that apparently proves a weak bound in the other direction,
H(X) >= (2^{H_G(X)} - 1) 2 \log_2 |X| / (|X| - 1), where |X| represents
the number of different possible values X can take on. See [1] for more
results in this vein.
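Massey's bound is easy to sanity-check numerically (same skewed example distribution as before; a spot check, not a proof):

```python
import math

def shannon(p):
    # H(X) = -sum_x Pr[x] log2 Pr[x]
    return -sum(q * math.log2(q) for q in p if q > 0)

def expected_guesses(p):
    # 2^{H_G(X)} = sum_i i*p_i with p_1 >= p_2 >= ...
    s = sorted(p, reverse=True)
    return sum(i * q for i, q in enumerate(s, start=1))

# Skewed distribution: one likely key among 1024.
p = [0.5] + [0.5 / 1023] * 1023
lhs = expected_guesses(p)         # about 257 expected guesses
rhs = 2 ** (shannon(p) - 2) + 1   # Massey's lower bound, about 17
assert lhs >= rhs
```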

Note that for the uniform distribution, the measures of entropy roughly
agree [H(X) = H_2(X) = \log_2 |X|, H_G(X) = (\log_2 (|X|+1)) - 1].
Interestingly, [1] mentions that when X is nearly uniform in the sense
that H(X) = \log_2 |X| - c for small c, then the other measures of
entropy roughly agree too. Therefore, to prove security, it is
apparently enough to show that H(X) is close to \log_2 |X|.
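The bracketed identities can be verified directly for, say, |X| = 1024 (a quick sketch):

```python
import math

def shannon(p):
    return -sum(q * math.log2(q) for q in p if q > 0)

def renyi2(p):
    return -math.log2(sum(q * q for q in p))

def guessing(p):
    s = sorted(p, reverse=True)
    return math.log2(sum(i * q for i, q in enumerate(s, start=1)))

N = 1024
uniform = [1 / N] * N
print(shannon(uniform), renyi2(uniform))  # both equal log2 N = 10.0
print(guessing(uniform))                  # log2(N + 1) - 1, about 9.0014
```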

[1] Christian Cachin, Entropy Measures and Unconditional Security in
Cryptography, PhD thesis, ETH Zurich, May 1997.
ftp://ftp.inf.ethz.ch/pub/publications/dissertations/th12187.ps.gz

[2] James Massey, ``Guessing and entropy,'' Proc. 1994 IEEE Int'l Symp.
on Information Theory, 1994, p. 204.

[3] Robert McEliece and Zhong Yu, ``An inequality on entropy,'' Proc.
1995 IEEE Int'l Symp. on Information Theory, 1995, p. 329.