SPROCKET is a computer vision system that looks at and understands
geared machines. Understanding means being able to explain how the machine
"makes sense"--how the axles and gears are arranged to turn each other in
some useful manner, and how the frame holds it all together. As
SPROCKET inspects a machine, it poses questions such as "What would
this gear have to connect to in order to serve some sensible function?" or
"What could keep this axle from falling out of the machine?" These
questions drive visual exploration. Their answers form the backbone of
SPROCKET's final understanding of the machine.
This page contains simple and complex examples of SPROCKET at work, a note on
its ancestry, and some possible applications of this technology. You can also download an early article on SPROCKET.
Here is a movie of SPROCKET exploring the
very simple machine pictured above. In the movie, highlighted areas
indicate the regions in which SPROCKET will look for parts, and the
trace shows the actual path of its visual focus of attention. If the movie
flashes by too quickly, here is an annotated
pictorial trace of how the gearbox was figured out. Of course, the
point of all this is to get an explanation of the scene.
This is a fairly complex gearbox with a subtle design and lots of opportunities for error. SPROCKET has to contend with the fact that much of the important structure of this machine is invisible or shadowed. (How many axles are there?) Moreover, it's almost impossible to properly distinguish where one gear ends and another begins. Here's how a slice of the machine looks to SPROCKET's gear-segmenting visual routine:
Can you tell how many gears are in this strip? Or where they begin and end?
Of course we humans use context to help us with problems like
these. SPROCKET does too, by trying to figure out how various ways
of divvying up the strip of texture into gears (and bits of frame) would
combine with other known parts of the machine to make a sensible drivetrain.
Here is part of a trace of
SPROCKET puzzling out exactly this problem.
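To make the idea concrete, here is a toy sketch of that hypothesize-and-test loop. It is not SPROCKET's actual routine: the 1-D strip, the candidate cut positions, the known axle positions, and the rule that a gear must be centered on exactly one axle are all assumptions invented for illustration, as are the function names.

```python
from itertools import combinations


def segmentations(width, candidate_cuts):
    """Yield every way to cut the strip [0, width] at a subset of the candidate cuts."""
    cuts = sorted({c for c in candidate_cuts if 0 < c < width})
    for k in range(len(cuts) + 1):
        for chosen in combinations(cuts, k):
            edges = (0, *chosen, width)
            yield list(zip(edges[:-1], edges[1:]))


def label(segments, axles, tol=0.25):
    """Label each segment 'gear' or 'frame'; return None if the reading is implausible.

    Hypothetical rule: a segment holding an axle is a gear and must be centered
    on that one axle (within tol); a segment with no axle is a bit of frame.
    """
    labelled, used = [], 0
    for lo, hi in segments:
        inside = [a for a in axles if lo < a < hi]
        if not inside:
            labelled.append(("frame", lo, hi))
            continue
        if len(inside) > 1 or abs((lo + hi) / 2 - inside[0]) > tol:
            return None  # two axles in one gear, or an off-center gear
        labelled.append(("gear", lo, hi))
        used += 1
    # Every known axle must end up carrying a gear.
    return labelled if used == len(axles) else None


def plausible_readings(width, candidate_cuts, axles, tol=0.25):
    """Keep only the segmentations that fit the already-known axles."""
    readings = (label(s, axles, tol) for s in segmentations(width, candidate_cuts))
    return [r for r in readings if r is not None]


if __name__ == "__main__":
    # Made-up strip: 10 units wide, four ambiguous texture edges, two known axles.
    for reading in plausible_readings(10, [2, 4, 6, 7], [3, 8.5]):
        print(reading)
```

In this made-up strip, 16 ways of cutting are considered and only 3 fit the known axles. Context prunes most of the ambiguity but not all of it, which is why further exploration of the scene is still needed.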
SPROCKET is an outgrowth of BUSTER, a vision system that
looks at and understands towers made of children's blocks. BUSTER
tries to show how block towers carry weight to the ground without falling
down; SPROCKET tries to show how geared machines carry motion
through their drivetrain without falling apart. Both use basic knowledge
about how the physical world works to help them explore a scene and make
sense of what they see. In the case of BUSTER, that basic
knowledge covers substantiality, rigidity, weight, and balance. In the case
of SPROCKET, that knowledge is extended to cover attachment,
friction, and containment. Many educational toys are aimed at helping children
master these principles, especially in terms of hand-eye coordination. This
makes sense because these basic aspects of physical causality account for
much of how the physical world looks and how it responds to our
manipulations. Naturally, this seems to be a good place to start building
Some potential elaborations of SPROCKET:
- a mechanic's assistant
- a LEGO playmate for children
- a mechanics tutor/critic
- an assembly-line inspector
- a reverse-engineer
- a foreman for a robot assembly crew
Reporters like the robot mechanic angle: SPROCKET has been in the news.
Matthew Brand / MIT Media Lab / email@example.com