Publication

Vision Steered Beam-forming and Transaural Rendering for the Artificial Life Interactive Video Environment

Nov. 1, 1995

Michael A. Casey, William G. Gardner, Sumit Basu

Abstract

The average person with a networked computer can now understand why computers should have vision: to search the world's collections of digital video and images and "retrieve a picture of _______." Computer vision for intelligent browsing, querying, and retrieval of imagery is needed now, yet traditional approaches to computer vision remain far from a general solution to the scene understanding problem. In this paper I discuss the need for a solution that combines high-level and low-level vision and works in concert with input from a human user. The solution is based on 1) learning from the user what is visually important, and 2) learning associations between text descriptions and visual data. I describe some recent results in these areas and outline key challenges for future research in computer vision for digital libraries.