
Manual annotation of new images in large image collections is
prohibitively expensive for commercial databases, and overly
time-consuming for the home photographer. However, low-cost imaging,
storage and communication technologies have already made accessible
millions of images that are meaningfully associated with text in the
form of captions or keywords. It is tempting to see these pairings of
visual and linguistic representations as a kind of distributed Rosetta
Stone from which we may learn to automatically translate between the
names of things and their appearances. Our algorithm learns to
recognize named objects from two cues: appearance patterns that recur
across an unstructured collection of captioned images, and a measure
of how well those patterns correspond with caption words.
Sven Dickinson
Faculty
Yulia Eskin
Undergraduate Student
Afsaneh Fazly
Postdoctoral Fellow
Mike Jamieson
Graduate Student
Suzanne Stevenson
Faculty