Sports Video Search Using Situated Natural Language Processing

The quantity and availability of video content are soaring due to the combination of television networks and the Internet. The aim of this project is to develop more effective means to manage, search, and translate video content. We are developing algorithms that interpret the language in video (speech and closed-caption text) by exploiting aspects of the non-linguistic context, or situation, conveyed by the accompanying video. We model situations by automatically finding patterns in low-level audio and video features that represent events. These event patterns are then mapped to words spoken in the video to create a "grounded" dictionary of word meanings. Our research focuses on sports video, in particular Major League Baseball games. We are exploring applications in multimedia search and video-based machine translation.