Bishop: Comprehension of complex spatial descriptions

The Bishop project explores the subtleties of the language people use to talk about spatial scenes. In particular, we investigate the descriptive strategies speakers employ when describing objects in relation to other objects. These strategies include ordering objects, grouping them visually, describing their spatial relations, and even referring back to objects that used to be in the scene. Speakers also frequently combine these strategies, as in "the green one to the left of the three purple ones". We are building a computational system that replicates both the individual phenomena and their compositional behaviour. As a result, the system understands relatively complex expressions that refer to a scene of objects and can indicate the object being described. This work applies directly to language understanding in natural language user interfaces, especially to augmenting direct manipulation interfaces with intelligent speech control. A good example is a speech interface for an in-car GPS map device, where users speak about objects on the map.
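To make the idea of composing strategies concrete, here is a minimal sketch of how the example phrase "the green one to the left of the three purple ones" might be resolved by chaining a colour filter, a grouping step, and a spatial relation. It is an illustration only, not the Bishop implementation: the `Obj` class, the function names, the pairwise-distance grouping rule, the `max_gap` threshold, and the scene coordinates are all assumptions made for this example.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot


@dataclass(frozen=True)
class Obj:
    """A scene object (hypothetical representation)."""
    name: str
    colour: str
    x: float
    y: float


def by_colour(objs, colour):
    """Colour filter: keep objects of the given colour."""
    return [o for o in objs if o.colour == colour]


def groups_of(objs, size, max_gap=2.0):
    """Grouping: all subsets of `size` objects that are pairwise within max_gap.

    The distance threshold is an assumed stand-in for visual grouping.
    """
    return [
        g for g in combinations(objs, size)
        if all(hypot(a.x - b.x, a.y - b.y) <= max_gap
               for a, b in combinations(g, 2))
    ]


def left_of(candidates, landmark_group):
    """Spatial relation: candidates left of the group's centroid, nearest first."""
    cx = sum(o.x for o in landmark_group) / len(landmark_group)
    left = [o for o in candidates if o.x < cx]
    return sorted(left, key=lambda o: cx - o.x)


def resolve(scene):
    """Compose the three strategies for the example phrase."""
    purple_groups = groups_of(by_colour(scene, "purple"), 3)
    if not purple_groups:
        return None
    matches = left_of(by_colour(scene, "green"), purple_groups[0])
    return matches[0] if matches else None


# Made-up scene: three purple objects close together, one far away.
scene = [
    Obj("g1", "green", 1.0, 1.0),
    Obj("g2", "green", 9.0, 1.0),
    Obj("p1", "purple", 4.0, 1.0),
    Obj("p2", "purple", 5.0, 1.0),
    Obj("p3", "purple", 5.0, 2.0),
    Obj("p4", "purple", 9.0, 5.0),
]

referent = resolve(scene)
print(referent.name if referent else "no match")
```

Each strategy is a small function over sets of objects, so a combined description becomes a chain of function calls. A real system would also need to parse the utterance and handle ambiguity, which this sketch leaves out.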

Peter Gorniak, Deb Roy

Related papers:

Peter Gorniak and Deb Roy. Grounded Semantic Composition for Visual Scenes. Journal of Artificial Intelligence Research, 2004.

Peter Gorniak and Deb Roy. A Visually Grounded Natural Language Interface for Reference to Spatial Scenes. ICMI 2003.