MIT Media Lab, E14-633
The proliferation of inexpensive video recording hardware and enormous storage capacity has enabled retailers to record customer behavior at an unprecedented scale. Who will watch the billions of hours of video captured every year?
This thesis presents Mimic, a system that processes video captured in a retail store into predictions of customers' proclivity to purchase. Mimic relies on the observation that the aggregate movement patterns of all of a store's patrons, the gestalt, capture behavior indicative of an imminent transaction. Video is distilled into a homogeneous feature vector of activity by first tracking the locations of customers, then discretizing their movements using a collection of functional locations.
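The distillation step described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the functional-location names, coordinates, and nearest-centroid assignment rule are all assumptions made for the example; the actual system's tracking and discretization details are not given in the abstract.

```python
from collections import Counter

# Hypothetical functional locations (label -> (x, y) centroid in floor
# coordinates); the real store's locations are not specified here.
FUNCTIONAL_LOCATIONS = {
    "entrance": (0.0, 0.0),
    "register": (4.0, 1.0),
    "display_a": (2.0, 3.0),
    "fitting_room": (5.0, 4.0),
}

def nearest_location(x, y):
    """Discretize a tracked position to its closest functional location."""
    return min(
        FUNCTIONAL_LOCATIONS,
        key=lambda name: (FUNCTIONAL_LOCATIONS[name][0] - x) ** 2
                         + (FUNCTIONAL_LOCATIONS[name][1] - y) ** 2,
    )

def activity_vector(tracks):
    """Distill per-customer (x, y) tracks into one aggregate feature vector:
    counts of observations at each functional location, over all patrons."""
    counts = Counter(
        nearest_location(x, y) for track in tracks for (x, y) in track
    )
    return [counts[name] for name in sorted(FUNCTIONAL_LOCATIONS)]

# Two customers' position tracks, aggregated into a single store-level vector.
vector = activity_vector([[(0.1, 0.2), (3.9, 1.1)], [(2.1, 2.9)]])
```

Because every store visit, regardless of length or path, reduces to a fixed-length vector over the same location vocabulary, vectors from different times and customers can be compared and fed to a standard classifier.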
Mimic is evaluated on a small operational retail store located in the Mall of America near Minneapolis, Minnesota. Its performance is characterized across a wide cross-section of the model’s parameters.
Using this classification scheme, the behavior of customers in the store can be examined at fine levels of detail without forgoing the scale afforded by big data. Mimic enables a suite of valuable tools. For ethnographic researchers, it offers a technique for identifying key moments in hundreds or thousands of hours of raw video. Retail managers gain a fine-grained metric to evaluate the performance of their stores, and interior designers acquire a critical component in a store layout optimization framework.
Host/Chair: Deb Roy
Committee: Eric Grimson, Leslie Kaelbling