Here's a clarification on applying linear cryptanalysis
to find a good approximation for a linear function. Sorry
about strewing confusion in class today on this topic.
Let L be an nxn boolean matrix. Then we can view it as a
linear function that takes n-bit strings to n-bit strings
(x maps to L x, with arithmetic mod 2), and every linear
function on n-bit strings can be viewed this way.
Recall that the probability of an approximation gamma -> gamma'
for a function F is
Pr[gamma' . F(x) = gamma . x],
with x a random n-bit string and the probability taken
over the choice of x. Here s.t represents the dot-product
of the n-bit strings s and t. Let's view both s and t as
column vectors. Also, write s^T for the transpose of s,
so that s^T is an n-bit row vector. Note that the
dot-product s.t is just s^T t, i.e., we take the transpose
of s (which is a row vector) and multiply it by t, with
all arithmetic done mod 2. (The multiplication is ok, since
we're multiplying a 1xn vector by an nx1 vector, and we get
a 1x1 matrix, i.e., a single bit, as expected.)
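To make the definition concrete, here is a minimal sketch that computes the probability of an approximation by brute-force enumeration over all n-bit inputs. The names dot, approx_prob and toy_F are made up for this illustration, and toy_F is just a hypothetical non-linear function, not anything from class.

```python
from itertools import product

def dot(s, t):
    """Dot product s^T t of two equal-length bit tuples, reduced mod 2."""
    return sum(a & b for a, b in zip(s, t)) % 2

def approx_prob(F, gamma, gamma_prime, n):
    """Pr over uniform n-bit x of [gamma' . F(x) == gamma . x], by enumeration."""
    hits = sum(
        dot(gamma_prime, F(x)) == dot(gamma, x)
        for x in product((0, 1), repeat=n)
    )
    return hits / 2 ** n

def toy_F(x):
    """Hypothetical non-linear 3-bit function, chosen only for illustration."""
    return (x[0] & x[1], x[1], x[2])

# gamma' selects the first output bit (x0 AND x1); gamma is the all-zero mask.
# x0 AND x1 equals 0 on 6 of the 8 inputs, so this prints 0.75.
print(approx_prob(toy_F, (0, 0, 0), (1, 0, 0), 3))
```

Enumeration costs 2^n evaluations of F, so this is only sensible for small n (e.g., a single S-box).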
A key property of the transpose is that (A B)^T = B^T A^T.
Another property is that (A^T)^T = A.
I'm finally ready to state the claim about how to approximate
a linear function L. We can take gamma' to be arbitrary,
and then we choose gamma = L^T gamma'. Notice that L^T is
the transpose of L, and thus is another nxn boolean matrix;
also, viewing gamma' as an n-bit column vector, we can multiply
L^T by gamma' to get an n-bit column vector, as required, so
the dimensions check out.
With this choice, I claim Pr[gamma' . L(x) = gamma . x] = 1.
This can be verified with some linear algebra:
gamma' . L(x)
= gamma'^T L(x)
= gamma'^T L x
= gamma'^T (L^T)^T x (2nd property of the transpose)
= (L^T gamma')^T x (1st property of the transpose)
= gamma^T x (by definition of gamma)
= gamma . x,
from which the claim follows. Therefore, every linear operation
in a cipher has a bias 1 approximation, i.e., one that holds
with probability 1, for every choice of gamma'.
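As a sanity check on the claim, here is a small sketch that tries every gamma' and every x for one example matrix and confirms that gamma = L^T gamma' always works. The matrix L_example and the helper names are arbitrary choices for this illustration; bit vectors are represented as tuples of 0/1.

```python
from itertools import product

def dot(s, t):
    """Dot product s^T t of two equal-length bit vectors, reduced mod 2."""
    return sum(a & b for a, b in zip(s, t)) % 2

def mat_vec(M, v):
    """Matrix-vector product M v over GF(2); M is a list of rows."""
    return tuple(dot(row, v) for row in M)

def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def holds_with_prob_one(L):
    """Check gamma' . L(x) == (L^T gamma') . x for all gamma' and all x."""
    n = len(L)
    Lt = transpose(L)
    for gamma_prime in product((0, 1), repeat=n):
        gamma = mat_vec(Lt, gamma_prime)  # gamma = L^T gamma'
        for x in product((0, 1), repeat=n):
            if dot(gamma_prime, mat_vec(L, x)) != dot(gamma, x):
                return False
    return True

# Arbitrary 3x3 boolean matrix, picked only as an example.
L_example = [[1, 1, 0],
             [0, 1, 1],
             [1, 0, 1]]

print(holds_with_prob_one(L_example))  # prints True
```

Of course the check proves nothing beyond the derivation above; it's just a quick way to catch a transposed matrix or a swapped mask when working an example by hand.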