The Machine Stares Back by Douglas L. Smith
     
Above: How the computer sees your arm. Once the background (which in this case includes the table the person is sitting at) has been subtracted out, the computer fuzzes the image a bit. The resulting gradient tells the computer how far off its guess is, minimizing the number of iterations it takes to find the arm. The red lines are the computer's guess of the arm's position; the computer then samples the image at the blue crosses to see how good the alignment is.

Right: A conceptual rendering of NASA's Robonaut, which may be guided by such software. Half humanoid, half scorpioid, Robonaut's "stinger" allows it to attach itself to sockets in the Space Station's exterior members or to the Space Shuttle's manipulator arm. The backpack, which can be changed from mission to mission, holds tools and accessories (think vacuum-cleaner attachments), and can also be used as a mounting point.

Below: Some Robonaut hardware, like this prototype arm, is already taking shape.

As the camera rolls, the computer looks at each frame and finds the person by subtracting a background image shot before the person arrived. The system then uses what's called a Kalman filter, which incorporates a mathematical model of how the object is allowed to move, to figure out the arm's position. "They're usually used for projectiles; you know the laws of physics, so you can estimate a very good trajectory from noisy observations," Gonçalves explains. (In this case, the "noise" includes such things as baggy sleeves that mask the arm's position.) The Kalman filter also enables the system to operate in real time, because the computer examines only the part of the image where the filter predicts the arm must be. If you know the arm is moving up and to the left, for example, there's no point in looking for it in the image's lower right corner. "We process only 900 pixels out of 300,000 in the image."
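
To make the saving concrete, here is a minimal sketch of subtracting the background only inside a window around the predicted arm position. It is an illustration, not the group's code: the 640 x 480 frame size, the square 30 x 30 window (one hypothetical way to arrive at 900 pixels), the brightness threshold, and the names `search_window` and `foreground_in_window` are all assumptions.

```python
WIDTH, HEIGHT = 640, 480  # assumed frame size, roughly 300,000 pixels


def search_window(pred_x, pred_y, half_size, width, height):
    """Square window centred on the predicted position, clipped to the image."""
    x0 = max(0, pred_x - half_size)
    y0 = max(0, pred_y - half_size)
    x1 = min(width, pred_x + half_size)
    y1 = min(height, pred_y + half_size)
    return x0, y0, x1, y1


def foreground_in_window(frame, background, window, threshold=20):
    """Subtract the background only inside the window.

    Returns the (x, y) coordinates whose brightness differs from the stored
    background image by more than `threshold` (an assumed value).
    """
    x0, y0, x1, y1 = window
    return [
        (x, y)
        for y in range(y0, y1)
        for x in range(x0, x1)
        if abs(frame[y][x] - background[y][x]) > threshold
    ]


# Synthetic data: a flat dark background and a bright 20 x 10 "forearm" patch.
background = [[10] * WIDTH for _ in range(HEIGHT)]
frame = [row[:] for row in background]
for y in range(200, 210):
    for x in range(300, 320):
        frame[y][x] = 200

# Pretend the filter predicted the arm near pixel (310, 205).
window = search_window(310, 205, 15, WIDTH, HEIGHT)
arm_pixels = foreground_in_window(frame, background, window)

pixels_examined = (window[2] - window[0]) * (window[3] - window[1])
print(pixels_examined, "of", WIDTH * HEIGHT, "pixels examined")
print(len(arm_pixels), "foreground pixels found")
```

With these assumed numbers the window covers 900 of 307,200 pixels, about 0.3 percent of the frame, which is the order of saving the quote describes.
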
In 1995, says Gonçalves, the available biomechanical models of human motion "worked under limited conditions. One smooth gesture, say. Not for general movement." So the trio created their own model that described the relative positions and angular velocities of the elbow and shoulder joints. It's a very simple model: two truncated cones with two joints, four rotational degrees of freedom, and no hand motion. It assumed the velocities were the same as they had been in the previous frame, but it incorporated a random-velocity component that allowed it to cope with speed and direction changes. (If you change direction really violently, it may still lose you.)
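
A constant-velocity model with a random-velocity component can be sketched as a textbook two-state Kalman filter. The version below tracks a single joint angle rather than the four rotational degrees of freedom in the article, and every number in it (30 frames per second, the noise variances, the starting uncertainty) is an assumption chosen for illustration. The class name `JointAngleFilter` is hypothetical, and the correction step is the standard Kalman update, included only so the sketch runs end to end with one prediction and one correction per frame.

```python
import random


class JointAngleFilter:
    """Two-state Kalman filter for one joint: (angle, angular velocity)."""

    def __init__(self, dt, velocity_noise_var=1e-4, measurement_var=0.05 ** 2):
        self.dt = dt
        self.q = velocity_noise_var  # the "random-velocity component" (assumed)
        self.r = measurement_var     # noise in the image-based angle (assumed)
        self.angle = 0.0
        self.velocity = 0.0
        # Covariance of the estimate; start out very unsure of both states.
        self.p00, self.p01, self.p10, self.p11 = 1.0, 0.0, 0.0, 1.0

    def predict(self):
        """Assume the velocity is unchanged since the previous frame."""
        dt = self.dt
        self.angle += dt * self.velocity
        # P <- F P F^T + Q, with F = [[1, dt], [0, 1]] and noise on velocity only.
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return self.angle

    def update(self, measured_angle):
        """Feed the prediction error back into the state, once per frame."""
        error = measured_angle - self.angle
        s = self.p00 + self.r  # variance of the error
        k0 = self.p00 / s      # gain applied to the angle
        k1 = self.p10 / s      # gain applied to the velocity
        self.angle += k0 * error
        self.velocity += k1 * error
        # P <- (I - K H) P, with H = [1, 0] because only the angle is measured.
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return error


# Simulated elbow swinging at a steady 0.5 rad/s, seen through noisy readings.
rng = random.Random(0)
dt = 1 / 30  # assumed video frame rate
tracker = JointAngleFilter(dt)
for frame_index in range(1, 151):
    true_angle = 0.5 * frame_index * dt
    tracker.predict()
    tracker.update(true_angle + rng.gauss(0.0, 0.05))

print("estimated angle:", round(tracker.angle, 3), "true angle:", true_angle)
print("estimated velocity:", round(tracker.velocity, 3), "true velocity: 0.5")
```

Raising `velocity_noise_var` lets the filter follow sudden changes of direction more readily, at the cost of a jitterier estimate. That is the same trade-off behind the parenthetical warning that a really violent change of direction may still lose you.
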

The filter estimates where the arm is and compares the estimate with the image. The first guess is never dead-on, says Di Bernardo, "so the difference between the two gives you an error measurement. And you input that error back into the model recursively, and it tries to bring the error down to zero." Adds Gonçalves, "You could have an iterative process that keeps repeating until it converges to the best pose at each image, but that's not very efficient computationally. A Kalman filter converges over time, but at each image it does only one iteration, so you don't have to do a lot of computations."

The system reliably estimates the arm's position to within five centimeters in all directions, including along the camera's line of sight, the hardest direction to calculate. Based on this work, the Perona lab is contracting with JPL to provide the "front end" of a vision-based control system that may be used for Robonaut, a humanoid (from the waist up) robot that NASA is developing to help build the space station. Robonaut is designed to cut down on human spacewalks: it will mimic the movements made by an operator aboard the space shuttle, pantomiming for a camera. So as the operator tightens a virtual pipe with a virtual wrench, or whatever, Robonaut will tighten the real thing. (A pair of TV cameras in Robonaut's head will allow the operator to see what Robonaut is doing.) Says Gonçalves, "NASA didn't want any electromagnetic sensors, because of the potential for interference with other shuttle systems." "They really like the camera-based solution," Di Bernardo adds.

Having demonstrated that they could capture 3-D arm motion without tracking specific features, the research group was ready to take on the whole body. This was a far more ambitious project: there were 14 major joints (not counting fingers and toes), more than 50 degrees of freedom, and an assortment of shapes to contend with. Meanwhile, computer animation had made great strides, and fully jointed human models had become available in commercial graphics packages. But these models didn't help the Kalman filter decide where to look, says Gonçalves. "The models are very good anatomically (the geometry of the skeleton, the range of motion of the joints, the appearance of the surface), but they're static. There's no model for how people move, no synchrony of all the parts. Either a human animator draws a series of intermediate poses, or the model takes data from a motion-capture system with markers. The model doesn't generate the motion.
