Publication

Grounded Language Modeling for Automatic Speech Recognition of Sports Video

Jan. 1, 2008

People

Michael Fleischman, Deb Roy

Abstract

Grounded language models represent the relationship between words and the non-linguistic context in which they are said. This paper describes how they are learned from large corpora of unlabeled video and applied to the task of automatic speech recognition of sports video. Results show that grounded language models improve perplexity and word error rate over text-based language models and, further, support video information retrieval better than human-generated speech transcriptions.

ACL-08.pdf

Intentional Context in Situated Language Learning

Why are verbs harder to learn than nouns? Initial insights from a computational model of situated word learning

Situated Models of Meaning for Sports Video Retrieval

Unsupervised Content-Based Indexing of Sports Video Retrieval
