Visually-Grounded Language Understanding for the Interface of an Interactive Design System

In this project we demonstrate an application of our work on understanding natural language about spatial scenes. All 3-D modeling applications face the problem of letting users interact with a 2-D projection of a 3-D scene. Rather than relying on the common solutions of multiple views and selective display and editing of the scene, we apply our language learning and understanding research to enable speech-based selection and manipulation of objects in the 3-D scene. We demonstrate such an interface, based on our Bishop project, for the 3-D modeling application Blender.