From daw@CS.Berkeley.EDU Wed Jul 29 11:25:01 PDT 1998
Article: 5551 of isaac.lists.coderpunks
Path: news.isaac.cs.berkeley.edu!not-for-mail
From: David Wagner
Newsgroups: isaac.lists.coderpunks
Subject: Re: Entropy loss
Date: 22 Jul 1998 17:07:10 -0700
Organization: ISAAC Group, UC Berkeley
Lines: 17
Sender: daemon@abraham.cs.berkeley.edu
Approved: mail2news@news.isaac.cs.berkeley.edu
Distribution: isaac
Message-ID: <199807222343.QAA14453@joseph.cs.berkeley.edu>
NNTP-Posting-Host: abraham.cs.berkeley.edu
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Mailer: ELM [version 2.4 PL25]
Precedence: bulk
Xref: joseph.cs.berkeley.edu isaac.lists.coderpunks:5551
In article you write:
> This raises an issue I've been wondering about for a while: how much
> entropy is lost by recursive hashing?
Suppose we've got a random function F : {0,1}^n -> {0,1}^n.
Let f_j denote the fraction of n-bit values which can appear as the
result of iterating F j times.  Each value is missed by F with
probability (1 - 2^{-n})^{2^n} ~ 1/e, so f_1 ~ 1 - 1/e; applying the
same balls-in-bins argument to the surviving f_j 2^n inputs gives
f_{j+1} ~ 1 - e^{-f_j}.
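A quick empirical sanity check (my addition, not from the post): draw a
random function on a small domain and measure what fraction of the domain
survives as an image after one and two iterations.

```python
import random

random.seed(1)
N = 1 << 16                                  # domain size 2^16
F = [random.randrange(N) for _ in range(N)]  # a uniformly random function

image1 = set(F)                    # values reachable by iterating F once
image2 = {F[x] for x in image1}    # values reachable by iterating F twice
print(len(image1) / N, len(image2) / N)      # close to 0.632 and 0.469
```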
The first few numbers in the sequence are .632, .469, .374, .312,
.268, .235, .210, etc. Asymptotically, f_j ~ 2/(j + log j) is a
pretty good approximation.
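The recurrence is easy to iterate numerically; a short sketch (my
addition) that reproduces the sequence above:

```python
import math

def image_fractions(j_max):
    """Iterate f_1 ~ 1 - 1/e, f_{j+1} ~ 1 - e^{-f_j}."""
    f = 1 - math.exp(-1)
    fracs = [f]
    for _ in range(j_max - 1):
        f = 1 - math.exp(-f)
        fracs.append(f)
    return fracs

fracs = image_fractions(7)
print([round(x, 3) for x in fracs])      # .632, .469, .374, .312, ...
# compare against the asymptotic approximation 2/(j + log j)
approx = [2 / (j + math.log(j)) for j in range(1, 8)]
```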
Moreover, I'd guess that the lost entropy is going to be on the
order of -lg f_j bits, which is not too large as long as j isn't
too big.  Even asymptotically, this is -lg f_j ~ lg(j + log j) - 1,
so if you hash it 2^k times, you'll only lose about k-1 bits of
entropy (just a guess -- I have no proof).
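Taking the guessed loss to be about lg(1/f_j) bits (the magnitude of
lg f_j), a sketch (my addition) checking it against k-1 for j = 2^k:

```python
import math

def image_fraction(j):
    """f_j from the recurrence f_1 ~ 1 - 1/e, f_{j+1} ~ 1 - e^{-f_j}."""
    f = 1 - math.exp(-1)
    for _ in range(j - 1):
        f = 1 - math.exp(-f)
    return f

for k in range(2, 11):
    j = 2 ** k
    loss = -math.log2(image_fraction(j))     # guessed entropy loss in bits
    print(f"j = 2^{k:2d}: ~{loss:.2f} bits lost (k-1 = {k - 1})")
```

For small k the estimate overshoots k-1 a little, but by k = 10 the
loss is already very close to 9 bits.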