Introduction

Until now we have dealt primarily with single images, the type of information encoded in them, and the means of extracting it. In some evolutionary circles, it is believed that estimating the motion of an advancing predator was important to a mobile animal's ability to flee and hence survive. In this lecture, we will deal with the problem of recovering the motion of objects in the 3-D world from the motion of segments on the 2-D image plane.

The technical problem with estimating the motion of objects in 3-D is that some of the information is lost in the image formation process, due to the perspective projection of the 3-D world onto the 2-D image plane. There are several ways of recovering the 3-D information from 2-D images using various "cues": motion, binocular stereopsis, texture, shading and contour. In this lecture we will content ourselves with studying motion flow.

If a 3-D point projects to a given point on the image plane, simple inversion of the perspective projection equation will not recover the 3-D information, since infinitely many points in 3-D project to that same image point: all those lying on the line through the center of projection and the image point (see Figure 1). Thus, some additional information is needed in order to recover the 3-D structure from 2-D images. One possible way to extract this 3-D information is from time-varying image sequences. This 3-D information is crucial for performing certain tasks, such as manipulation, navigation and recognition.
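The ambiguity described above can be checked numerically. The following is a minimal sketch, assuming a pinhole camera with the center of projection at the origin, the optical axis along Z, and focal length f, so that a point (X, Y, Z) projects to (f X / Z, f Y / Z). The function name `project`, the focal length and the sample point are illustrative choices, not taken from the lecture.

```python
def project(point, f=1.0):
    """Pinhole perspective projection of a 3-D point (X, Y, Z), Z > 0.

    Assumed model: center of projection at the origin, image plane at Z = f.
    """
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)


# An arbitrary 3-D point (illustrative values only).
P = (2.0, 1.0, 4.0)

# Scaling P by any t > 0 moves it along the line through the center of
# projection and P.
scales = (0.5, 1.0, 2.0, 8.0)
ray_points = [(t * P[0], t * P[1], t * P[2]) for t in scales]

# Every point on that line lands on the same image point, so the depth
# cannot be recovered from a single image point alone.
images = [project(Q) for Q in ray_points]

for Q, q in zip(ray_points, images):
    print(f"3-D point {Q} -> image point {q}")
```

All four 3-D points produce one and the same image point, which is why inverting the projection of a single image needs additional information.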

Figure 1: Displacement of a point in the environment causes a displacement of the corresponding image point. The relationship between the velocities can be found by differentiating the perspective projection equation. For more details see B. Horn, "Robot Vision", MIT Press, 1986, Chapters 12 and 17.

There is a lot of biological motivation for studying motion flow (besides avoiding being eaten by sabre-toothed tigers, that is):

Human beings do it all the time without even realizing it; for example, this is why we have saccadic eye movements (that is, our eyes jump from focusing on one spot to another). Thus, even if the scene has no motion and we are still, our eyes are moving. There are celebrated experiments on the movements of the pupils of people looking at the Mona Lisa, for instance, showing the gaze darting from the eyes to the lips to the mouth and then the hair and so on.
A simple experiment can demonstrate how motion reveals something about 3-D structure. Look at a scene containing a nearby tree and a distant tree while moving your head (either sideways or forward and backward): you will notice that the retinal image of the nearby tree moves more than that of the distant tree, i.e. the motion in the retinal plane is inversely proportional to the distance from the retinal plane.
There are a few examples from the animal world where motion helps animals obtain information about the structure of the environment, e.g. pigeons move their necks to get so-called "motion parallax".
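The head-movement experiment above can be sketched numerically. This is only an illustration under stated assumptions: a pinhole model with image coordinate x = f X / Z, a purely sideways head translation (the forward and backward case is not covered), and made-up depths for the two trees. The names `image_x` and `image_shift` are hypothetical helpers, not from the lecture.

```python
def image_x(X, Z, f=1.0):
    """Horizontal image coordinate of a point at lateral position X, depth Z."""
    return f * X / Z


def image_shift(X, Z, head_shift, f=1.0):
    """Change in image position when the observer moves sideways.

    Moving the head by head_shift along X is equivalent to the point
    moving by -head_shift relative to the eye; the depth Z is unchanged.
    """
    return image_x(X - head_shift, Z, f) - image_x(X, Z, f)


head_shift = 0.1   # sideways head movement (assumed units: metres)
near_depth = 2.0   # nearby tree (assumed)
far_depth = 50.0   # distant tree (assumed)

near = image_shift(0.0, near_depth, head_shift)
far = image_shift(0.0, far_depth, head_shift)

print("image shift of near tree:", near)
print("image shift of far tree: ", far)
# The ratio of the shifts equals the inverse ratio of the depths.
print("ratio near/far:", near / far, " depth ratio far/near:", far_depth / near_depth)
```

For a sideways translation the image shift is f times the head shift divided by Z, so the image of the nearby tree moves farther than that of the distant tree by exactly the ratio of their depths.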


S Sastry
Sun May 5 23:42:22 PDT 1996