Mining Temporal Patterns of Movement for Video Content Classification

Michael Fleischman, Philip DeCamp, Deb Roy


Scalable approaches to video event recognition are limited by an inability to automatically generate representations of events that encode abstract temporal structure. This paper presents a method in which temporal information is captured by representing events using a lexicon of hierarchical patterns of movement that are mined from large corpora of unannotated video data. These patterns are then used as features for a discriminative model of event recognition that exploits tree kernels in a Support Vector Machine. Evaluations show the method learns informative patterns on a 1450-hour video corpus of natural human activities recorded in the home.
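The abstract's core idea is to compare events by the hierarchical movement patterns they share, using a tree kernel inside an SVM. As a minimal, illustrative sketch (not the authors' code), the following implements a Collins–Duffy style subset-tree kernel that counts common subtrees between two toy event trees; the node labels and tree shapes are hypothetical stand-ins for mined movement patterns.

```python
# Illustrative sketch, not the paper's implementation: a Collins-Duffy
# style tree kernel that counts shared subtrees between hierarchical
# pattern trees. The resulting similarity can serve as an SVM kernel.

class Node:
    """A labeled tree node; leaves model atomic movement patterns."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

    def production(self):
        # A node "production" is its label plus its children's labels,
        # mirroring the grammar-production check in parse-tree kernels.
        return (self.label, tuple(c.label for c in self.children))


def _collect(tree):
    """Return all nodes of the tree in a flat list."""
    out = [tree]
    for child in tree.children:
        out.extend(_collect(child))
    return out


def tree_kernel(t1, t2, decay=1.0):
    """Sum, over all node pairs, the number of common subtrees
    rooted at that pair (Collins & Duffy style), with a decay
    factor to downweight large matching fragments."""
    memo = {}

    def common(n1, n2):
        key = (id(n1), id(n2))
        if key in memo:
            return memo[key]
        if n1.production() != n2.production():
            score = 0.0
        elif not n1.children:
            score = decay          # matching leaf patterns
        else:
            score = decay
            for c1, c2 in zip(n1.children, n2.children):
                score *= 1.0 + common(c1, c2)
        memo[key] = score
        return score

    return sum(common(a, b) for a in _collect(t1) for b in _collect(t2))
```

For example, two identical trees `Node("event", [Node("move"), Node("pause")])` score higher with each other than with a tree whose second child differs, so events sharing more temporal substructure are judged more similar. In practice one would plug this function into an SVM that accepts precomputed or custom kernels rather than classify with it directly.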
