Above: How the computer sees your arm. Once the background
(which in this case includes the table the person
is sitting at) has been subtracted out, the computer
fuzzes the image a bit. The resulting gradient tells the computer
how far off its guess is, minimizing the number of iterations
it takes to find the arm. The red lines are the computer's
guess of the arm's position; the computer then samples
the image at the blue crosses to see how good the
alignment is.

Right: A conceptual rendering of NASA's
Robonaut, which may be guided by such software. Half
humanoid, half scorpioid, Robonaut's "stinger"
allows it to attach itself to sockets in the Space
Station's exterior members or to the Space Shuttle's
manipulator arm. The backpack, which can be changed
from mission to mission, holds tools and accessories
(think vacuum-cleaner attachments), and can also be
used as a mounting point.

Below: Some Robonaut hardware,
like this prototype arm, is already taking shape.

As the camera rolls, the computer looks at each frame and finds the
person by subtracting a background image shot before the person arrived.
The system then uses what's called a Kalman filter, which incorporates a
mathematical model of how the object is allowed to move, to figure out
the arm's position. "They're usually used for projectiles: you know the
laws of physics, so you can estimate a very good trajectory from noisy
observations," Gonçalves explains. (In this case, the "noise" includes
such things as baggy sleeves that mask the arm's position.) The Kalman
filter also enables the system to operate in real time, because the
computer only examines the part of the image where the filter predicts
the arm must be. If you know the arm is moving up and to the left, for
example, there's no point in looking for it in the image's lower right
corner. "We process only 900 pixels out of 300,000 in the image."
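
To see why restricting the search pays off, here is a minimal sketch in Python. It is not the researchers' code. It subtracts a stored background from a frame, but only inside a square window around a predicted position. The function name, the 30-by-30 window, the brightness threshold, and the toy image are all assumptions, chosen so that 900 of roughly 300,000 pixels get examined; the article gives only those two totals.

```python
def foreground_in_window(frame, background, center, half_size=15, threshold=30):
    """Find pixels that differ from the background, looking only inside
    a square window around the predicted (row, col) position."""
    rows, cols = len(frame), len(frame[0])
    r0 = max(0, center[0] - half_size)
    r1 = min(rows, center[0] + half_size)
    c0 = max(0, center[1] - half_size)
    c1 = min(cols, center[1] + half_size)

    hits = []
    examined = 0
    for r in range(r0, r1):
        for c in range(c0, c1):
            examined += 1
            # Background subtraction: keep pixels that changed noticeably.
            if abs(frame[r][c] - background[r][c]) > threshold:
                hits.append((r, c))
    return hits, examined


# Toy 480 x 640 scene (an assumed image size): uniform grey background
# with one bright 4 x 4 patch standing in for part of the arm.
ROWS, COLS = 480, 640
background = [[50] * COLS for _ in range(ROWS)]
frame = [row[:] for row in background]
for r in range(100, 104):
    for c in range(200, 204):
        frame[r][c] = 200

# Pretend the filter predicted that the arm is near row 102, column 202.
hits, examined = foreground_in_window(frame, background, center=(102, 202))
print(len(hits), "foreground pixels;", examined, "of", ROWS * COLS, "examined")
```

The saving comes entirely from the prediction: the better the guess about where the arm will be, the smaller the window can be.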

In 1995, says Gonçalves, the available biomechanical models of human
motion "worked under limited conditions. One smooth gesture, say. Not
for general movement." So the trio created their own model that
described the relative positions and angular velocities of the elbow and
shoulder joints. It's a very simple model: two truncated cones with two
joints, four rotational degrees of freedom, and no hand motion. It
assumed the velocities were the same as they had been in the previous
frame, but it incorporated a random-velocity component that allowed it
to cope with speed and direction changes. (If you change direction
really violently, it may still lose you.)
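
The motion assumption just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the trio's model: the function name `step`, the frame rate, the starting angles and velocities, and the size of the random kick are all made up. In a real Kalman filter the randomness enters as a process-noise covariance rather than as sampled numbers; sampling is used here only to show what the model permits.

```python
import random


def step(angles, velocities, dt, velocity_sigma, rng):
    """Advance four joint angles by one frame.

    Each angle moves at the velocity it had in the previous frame; each
    velocity then receives a small random kick, which is what lets the
    model follow changes in speed and direction.
    """
    new_angles = [a + v * dt for a, v in zip(angles, velocities)]
    new_velocities = [v + rng.gauss(0.0, velocity_sigma) for v in velocities]
    return new_angles, new_velocities


rng = random.Random(0)               # fixed seed so the run is repeatable
angles = [0.0, 0.0, 0.0, 0.0]        # four rotational degrees of freedom (radians)
velocities = [0.5, 0.0, -0.2, 0.1]   # assumed angular velocities (radians per second)
dt = 1 / 30                          # assumed frame interval

for _ in range(30):                  # one second of simulated motion
    angles, velocities = step(angles, velocities, dt, velocity_sigma=0.05, rng=rng)

print("angles after one second:", angles)
```

A larger `velocity_sigma` lets the model follow more abrupt changes, at the cost of a less certain prediction and therefore a wider region of the image to search.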

The filter estimates where the arm is and compares the estimate with the
image. The first guess is never dead-on, says Di Bernardo, "so the
difference between the two gives you an error measurement. And you input
that error back into the model recursively, and it tries to bring the
error down to zero." Adds Gonçalves, "You could have an iterative
process that keeps repeating until it converges to the best pose at each
image, but that's not very efficient computationally. A Kalman filter
converges over time, but at each image it does only one iteration, so
you don't have to do a lot of computations."
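
The predict, compare, and correct cycle can be sketched with a textbook Kalman filter that tracks a single position and velocity. This is a one-dimensional stand-in, not the four-degree-of-freedom arm model, and the function name `kalman_step`, the noise levels `q` and `r`, and the simulated motion are assumptions made for the example. Note that the function is called exactly once per frame.

```python
import random


def kalman_step(state, cov, measurement, dt, q, r):
    """One predict-and-correct pass for a position/velocity state."""
    (p, v), ((p00, p01), (p10, p11)) = state, cov

    # Predict: assume the velocity is unchanged since the previous frame.
    p_pred = p + v * dt
    v_pred = v
    c00 = p00 + dt * (p01 + p10) + dt * dt * p11
    c01 = p01 + dt * p11
    c10 = p10 + dt * p11
    c11 = p11 + q   # the random-velocity component widens the uncertainty

    # Compare the prediction with what the image shows.
    error = measurement - p_pred

    # Feed the error back, weighted by how uncertain the prediction is.
    s = c00 + r
    k0, k1 = c00 / s, c10 / s
    new_state = (p_pred + k0 * error, v_pred + k1 * error)
    new_cov = (
        ((1 - k0) * c00, (1 - k0) * c01),
        (c10 - k1 * c00, c11 - k1 * c01),
    )
    return new_state, new_cov, error


rng = random.Random(42)              # fixed seed so the run is repeatable
dt = 1 / 30                          # assumed frame interval
state = (0.0, 0.0)                   # the first guess: deliberately wrong velocity
cov = ((1.0, 0.0), (0.0, 1.0))       # large initial uncertainty
true_position, true_velocity = 0.0, 2.0

for _ in range(300):                 # ten seconds of simulated frames
    true_position += true_velocity * dt
    measurement = true_position + rng.gauss(0.0, 0.05)   # noisy observation
    state, cov, error = kalman_step(state, cov, measurement, dt, q=1e-4, r=0.05 ** 2)

print("estimated position and velocity:", state)
print("true position and velocity:     ", (true_position, true_velocity))
```

The estimate starts out poor and improves frame by frame, which is the trade Gonçalves describes: convergence is spread over time instead of being paid for in full at every image.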

The system reliably estimates the arm's position to within five
centimeters in all directions, including along the camera's line of
sight, the hardest direction to calculate. Based on this work, the
Perona lab is contracting with JPL to provide the "front end" of a
vision-based control system that may be used for Robonaut, a humanoid
(from the waist up) robot that NASA is developing to help build the
space station. Robonaut is designed to cut down on human spacewalks: it
will mimic the movements made by an operator aboard the space shuttle,
pantomiming for a camera. So as the operator tightens a virtual pipe
with a virtual wrench, or whatever, Robonaut will tighten the real
thing. (A pair of TV cameras in Robonaut's head will allow the operator
to see what Robonaut is doing.) Says Gonçalves, "NASA didn't want any
electromagnetic sensors, because of the potential for interference with
other shuttle systems." "They really like the camera-based solution,"
Di Bernardo adds.

Having demonstrated that they could capture 3-D arm motion
without tracking specific features, the research group was ready to take
on the whole body. This was a far more ambitious project: there were
14 major joints (not counting fingers and toes), more than 50 degrees
of freedom, and an assortment of shapes to contend with. Meanwhile, computer
animation had made great strides, and fully jointed human models had become
available in commercial graphics packages. But these models didn't help
the Kalman filter decide where to look, says Gonçalves. "The
models are very good anatomically (the geometry of the skeleton, the
range of motion of the joints, the appearance of the surface), but
they're static. There's no model for how people move, no synchrony of
all the parts. Either a human animator draws a series of intermediate
poses, or the model takes data from a motion-capture system with markers.
The model doesn't generate the motion.