Next: Synthesis Up: Hybrid rigid and non-rigid Previous: Image Analysis

Training

  Let e1, f1, s1 be the positions of the elbow, fist, and shoulder in image 1; e2, f2, and s2 are the corresponding points in image 2. These are 2D homogeneous image points in our experiments; they could also be 3D homogeneous points for images with depth or disparity. We have found that rendering is not sensitive to the precise locations of these points, so approximate tracking or annotation is sufficient; the non-rigid interpolation accounts for small localization errors. This is encouraging.

Let $\vec{S} = e_1 - s_1$ be the upper arm vector and $\vec{F} = f_1 - e_1$ the forearm vector. All training angles for the arm are placed in a matrix $\Theta$, where each row corresponds to an example image. $\Theta_{i,1}$ is the angle between $\vec{S}$ and $\vec{F}$ (i.e., the angle of the forearm relative to the upper arm segment): $\Theta_{i,1} = \cos^{-1} \frac{\vec{S}^T \vec{F}}{\Vert\vec{S}\Vert \, \Vert\vec{F}\Vert}$. $\Theta_{i,2}$ is the absolute angle of the upper arm segment: $\Theta_{i,2} = \cos^{-1} \frac{[0,1,0] \, \vec{S}}{\Vert\vec{S}\Vert}$. Additional example images for the arm are added as additional rows.

The following treatment shows the math for the forearm; the upper arm is treated in the same way.

The example images and dense correspondence are converted to image lists to make correspondences explicit. Each image point from the forearm in image 1 is represented as a vector $p = [x, y, 1, r, g, b]^T$, expressing the image position and the texture value of the point. These points are combined into a matrix $P = [p_1, p_2, p_3, \ldots]$. The corresponding points are placed in another matrix $Q = [q_1, q_2, q_3, \ldots]$. The matrices are ordered so that $p_i$ in image 1 corresponds to $q_i$ in image 2.
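Building these matrices is a simple stacking operation; a sketch under the assumption that positions and texture values arrive as per-point arrays (the helper name is ours):

```python
import numpy as np

def point_matrix(xy, rgb):
    """Stack corresponded segment points into the matrix P (or Q).

    xy:  (n, 2) array of image positions (x, y)
    rgb: (n, 3) array of texture values at those positions
    Returns a (6, n) matrix whose i-th column is p_i = [x, y, 1, r, g, b]^T.
    """
    n = len(xy)
    cols = np.hstack([np.asarray(xy, float),
                      np.ones((n, 1)),          # homogeneous coordinate
                      np.asarray(rgb, float)])  # texture component
    return cols.T
```

The first three rows of the result are the shape component and the last three are the texture component, matching the $S$/$X$ split used below.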

For each segment, we first align the examples. Let $T$ be a transformation that aligns the forearm in image 2 to the forearm in image 1: $T e_2 \doteq e_1$ and $T (f_2 - e_2) \doteq (f_1 - e_1)$. In other words, the two elbows coincide and the fists point in the same direction. Let $Q'$ be the aligned $Q$: $Q' = \left [ \begin{array}
{c} {T S} \\  {X}\end{array} \right ]$, where $Q =
\left [ \begin{array}
{c} {S} \\  {X}\end{array} \right ]$ ($S$ is the shape component of $Q$ and $X$ is the texture). Any additional example images are aligned to the same axis.
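The two constraints on $T$ leave it underdetermined; one natural choice is a 2D similarity transform (rotation, scale, and translation) in homogeneous coordinates. A sketch of that choice (the function name and complex-number construction are ours, not from the text):

```python
import numpy as np

def align_transform(e1, f1, e2, f2):
    """Build a 3x3 homogeneous similarity T with T e2 = e1 and
    T (f2 - e2) = (f1 - e1).  One choice among the transforms
    satisfying the two alignment constraints."""
    v1 = np.asarray(f1, float) - np.asarray(e1, float)  # target forearm
    v2 = np.asarray(f2, float) - np.asarray(e2, float)  # source forearm
    # Rotation+scale taking v2 to v1, via complex multiplication.
    z = complex(*v1) / complex(*v2)
    a, b = z.real, z.imag
    A = np.array([[a, -b], [b, a]])
    t = np.asarray(e1, float) - A @ np.asarray(e2, float)  # elbow to elbow
    T = np.eye(3)
    T[:2, :2] = A
    T[:2, 2] = t
    return T
```

Applying `T` to the homogeneous shape rows of $Q$ (the $[x, y, 1]$ part of each column) produces $T S$, leaving the texture rows $X$ untouched.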

P and Q are now used as training data for an RBF. Let $\vec{P}$ and $\vec{Q}$ be the vectorized forms of P and Q ($\vec{P} = [ p_{11}, p_{21},
\ldots, p_{12}, p_{22}, \ldots ]$). The RBF training equation is

\begin{displaymath}
\Phi C = D\end{displaymath}

where $D = \left [ \begin{array}
{c} \vec{P} \\  \vec{Q} \end{array} 
\right ]$ and $\Phi_{i,j} = \phi(\vert\vert\Delta(\Theta_i - \Theta_j)\vert\vert)$. This last term quantifies the distance between the parameter values of examples $i$ and $j$, with $\Theta_i = [ \theta_{i,1}, \theta_{i,2} ]$ as defined above. The difference between two scalar angle values must be computed with respect to cycles; we define $\Delta(a) = \vert \mbox{mod}(a,2\pi)
- \pi\vert$ and $\Delta([a, b, c, \ldots]) = [\Delta(a), \Delta(b), \Delta(c), \ldots]$. $\vert\vert \cdot \vert\vert$ denotes the L2-norm (Euclidean distance). We use $\phi(x) = x$ (a linear RBF). Additional example images are rectified as above and added as extra rows to the training matrices ($\Theta$ and $D$).
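A minimal sketch of the cyclic difference and the resulting basis matrix, transcribing the definitions above directly (function names are ours):

```python
import numpy as np

def delta(a):
    """Cyclic angle difference as defined in the text:
    Delta(a) = |mod(a, 2*pi) - pi|, applied elementwise."""
    return np.abs(np.mod(a, 2 * np.pi) - np.pi)

def rbf_matrix(Theta, phi=lambda x: x):
    """Phi[i, j] = phi(||Delta(Theta_i - Theta_j)||), with the
    linear basis phi(x) = x as the default."""
    n = Theta.shape[0]
    Phi = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            Phi[i, j] = phi(np.linalg.norm(delta(Theta[i] - Theta[j])))
    return Phi
```

With the linear basis, `rbf_matrix` reduces to the matrix of pairwise cyclic parameter distances between example images.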

We solve for the coefficients with

\begin{displaymath}
C = \Phi^+ D \end{displaymath}
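In code, the pseudoinverse solve is a one-liner (a sketch; the function name is ours):

```python
import numpy as np

def solve_coefficients(Phi, D):
    """RBF coefficients C = Phi^+ D, the pseudoinverse (least-squares)
    solution of the training equation."""
    return np.linalg.pinv(Phi) @ D
```

Using the pseudoinverse rather than a direct inverse keeps the solve well-defined when $\Phi$ is rank-deficient, e.g. when two example images have nearly identical training angles.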

The two body segments use the same training parameters $\Theta$. In general, each link in the chain should use training parameters corresponding to its relative orientation, as well as the orientation of any connected links.


Trevor Darrell
10/29/1998