Shape from Motion

When an ellipse is rotated in the image plane, an observer soon begins to see a tilted circular disk wobbling and rotating (imagine a rotating satellite dish). This phenomenon is known as the stereo-kinetic illusion, and it belongs to the larger class of shape-from-motion problems. Perceiving the 3-dimensional (3-D) surface layout of the environment from motion information is arguably one of the two most important questions in visual perception.

At play here is a fundamental problem in motion perception called the correspondence problem. When an ellipse rotates, the visual system must work out the direction of movement for every segment of the shape over time. To illustrate, imagine a fat ellipse. Any contour segment is similar in curvature to a number of other segments on the ellipse. This creates ambiguity about where a particular segment ends up after an arbitrary amount of rotation. In other words, the same visual input can be interpreted in several different ways, each percept corresponding to a different rotation speed, wobble pattern, etc. So which does the brain choose to perceive? Past research has suggested that the brain perceives the slowest and spatially smoothest interpretation that is consistent with the input.

We extended prior work on 2-D motion to 3-D motion, and found that our psychophysical data matched the theoretical prediction remarkably well (Rokers, Yuille, and Liu, 2006). The key insight is this: if the perceived shape is circular, the correspondence ambiguity is maximal, because every segment on the circle exactly matches every other segment. This maximal ambiguity gives the visual system more opportunities to find the “slowest and smoothest” percept. This is why, we believe, once the wobbling disk is perceived, it is not easy to revert to perceiving the ellipse: the circle percept is much more salient because it is maximally “slow and smooth”.
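As a rough numerical illustration of this ambiguity (not the model from the study cited above), the sketch below assumes that only the velocity component normal to a contour can be measured locally, and treats that normal component as the slowest motion consistent with the local measurements. It ignores the smoothness constraint entirely. The function names and the contour sampling are hypothetical choices made for this example.

```python
import numpy as np

def contour_flows(a, b, omega=1.0, n=360):
    """Velocities on an ellipse (semi-axes a, b) rotating in the image plane.

    Returns the true rigid-rotation field and its normal component, the only
    part that can be measured locally along a smooth contour.
    Points are sampled uniformly in the ellipse parameter, not in arc length.
    """
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    pts = np.stack([a * np.cos(t), b * np.sin(t)], axis=1)

    # Rigid rotation about the origin: v = omega * (-y, x).
    rigid = omega * np.stack([-pts[:, 1], pts[:, 0]], axis=1)

    # Unit normals from the gradient of x^2/a^2 + y^2/b^2.
    normals = np.stack([pts[:, 0] / a**2, pts[:, 1] / b**2], axis=1)
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)

    # Project the rigid velocity onto the normal at each point.
    normal_speed = np.sum(rigid * normals, axis=1, keepdims=True)
    normal_flow = normal_speed * normals
    return rigid, normal_flow

def mean_squared_speed(v):
    """Average squared speed of a velocity field sampled on the contour."""
    return float(np.mean(np.sum(v**2, axis=1)))

if __name__ == "__main__":
    for a, b in [(2.0, 1.0), (1.0, 1.0)]:
        rigid, normal_flow = contour_flows(a, b)
        print(f"a={a}, b={b}: rigid={mean_squared_speed(rigid):.3f}, "
              f"normal-only={mean_squared_speed(normal_flow):.3f}")
```

For the ellipse, the normal-only field is slower on average than the true rigid rotation, so a "slowest" criterion alone would not favor rigid rotation of the ellipse. For the circle, the normal component vanishes (up to rounding): rotating a circle in the image plane produces no locally measurable motion, so every rotation speed is equally consistent with the input.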

A specific problem we are currently working on is the following: an image of two intersecting rings rotates in the image plane and generates a percept of a wobbling cylinder. Once the cylinder is perceived, it is hard to revert to the percept of two circles rotating in the same plane. However, you can flip the relative depths of the two circles that define the cylinder. We ask the following questions: “Why is a cylinder perceived?” “What determines the length of the cylinder?” “Does everyone perceive the cylinder as having the same length?”

We are tackling the first two questions computationally and mathematically, using an approach similar to the one in the wobbling disk study. We are studying the third question in several ways. First, we take advantage of linear perspective and ask participants to adjust the size of one circle with respect to the other, so that the perceived cylinder is uniformly thick. Second, we create a stereo probe for participants to match to the front and back of the cylinder in order to measure the cylinder length. Third, using a 45° mirror, we can also generate the cylinder away from the computer monitor so that participants can reach for the virtual front and back of the cylinder (their finger locations are recorded by a multi-camera motion capture system via a sensor attached to the finger). Lastly, we ask: “Do different sensory modalities (monocular linear perspective, binocular stereoscopic vision, and motor reaching) provide self-consistent measurements of the cylinder length? If not, what does this imply?”
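To make the linear-perspective measurement concrete, here is a minimal sketch of how a size adjustment could in principle be converted into a cylinder length. It is not necessarily the analysis used in the study. It assumes a simple pinhole model, two rings of equal physical radius facing the observer, and a known viewing distance to the near ring; the function name and parameters are hypothetical.

```python
def implied_cylinder_length(size_ratio, near_distance):
    """Cylinder length implied by a perspective size adjustment.

    Assumes a pinhole model in which image size is inversely proportional
    to distance, so that
        size_ratio = far_image_radius / near_image_radius
                   = near_distance / (near_distance + length).
    The result is in the same units as near_distance.
    """
    if not 0.0 < size_ratio <= 1.0:
        raise ValueError("size_ratio must be in (0, 1]")
    if near_distance <= 0.0:
        raise ValueError("near_distance must be positive")
    return near_distance * (1.0 / size_ratio - 1.0)

if __name__ == "__main__":
    # Hypothetical setting: far ring drawn at 80% of the near ring's size,
    # near ring 60 cm from the eye.
    print(implied_cylinder_length(0.8, 60.0))
```

Under these assumptions, a far ring set to 80% of the near ring's size at a 60 cm viewing distance implies a cylinder about 15 cm long, and a ratio of 1 implies no depth at all. A length obtained this way could then be compared with the stereo-probe and reaching measurements.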
