Machine Learning and Pattern Recognition with Multiple Modalities


This project develops new theory and algorithms to enable computers to make rapid and accurate inferences from multiple modes of data, such as determining a person's affective state from multiple sensors�video, mouse behavior, chair pressure patterns, typed selections, or physiology. Recent efforts focus on understanding the level of a person's attention, useful for things such as determining when to interrupt. Our approach is Bayesian: formulating probabilistic models on the basis of domain knowledge and training data, and then performing inference according to the rules of probability theory. This type of sensor fusion work is especially challenging due to problems of sensor channel drop-out, different kinds of noise in different channels, dependence between channels, scarce and sometimes inaccurate labels, and patterns to detect that are inherently time-varying. We have constructed a variety of new algorithms for solving these problems and demonstrated their performance gains over other state-of-the-art methods.