Multimodal Ambulatory Sleep Detection Using Recurrent Neural Networks

Chen, W.*, Sano, A.*, Lopez, D., Taylor, S., McHill, A. W., Phillips, A. J., Barger, L. K., Czeisler, C. A., Picard, R. W. "Multimodal Ambulatory Sleep Detection Using Recurrent Neural Networks," Sleep 2017 (oral), June 2017.



While polysomnography (PSG) is currently the gold standard for sleep-wake scoring, existing PSG technologies are impractical for long-term home use. Meanwhile, semi-automatic scoring from sleep diaries and actigraphy is commonly used in ambulatory sleep studies, but significant effort is required from users to maintain accurate diaries and from researchers to check the entries for anomalies. There is thus a need for tools that enable accurate long-term evaluation of sleep timing and duration in daily life with less burden on users and researchers. To meet this need, we developed a system that uses deep neural networks to analyze large-scale physiological and behavioral data collected from smartphones and wearables, and compared it to actigraphy and sleep diaries.


We collected 5580 days of multimodal data (3-axis acceleration, skin conductance, and skin temperature from a wrist sensor; location and the timing of calls, short message service (SMS) messages, and screen-on events from an Android phone application) from 186 undergraduate students. A deep neural network model (a bidirectional long short-term memory recurrent neural network, an architecture commonly used for speech recognition and machine translation) was applied to the collected modalities for sleep/wake classification of each 1-min epoch and for detection of sleep episode onset and offset. Sleep diaries and actigraphy data were also collected and examined by a human expert who (i) classified every epoch as sleep or wake and (ii) identified sleep episode onset and offset times; these served as labels for training and testing our model.
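The two kinds of labels described above (per-epoch sleep/wake, and episode onset/offset times) are closely related. The following is a minimal sketch of one simple way to derive episode boundaries from a sequence of 1-min epoch labels. It is an illustration only, not the expert's scoring procedure or the authors' model; the function name `sleep_episodes` and the 0/1 encoding are assumptions made for this example.

```python
from typing import List, Tuple


def sleep_episodes(epochs: List[int]) -> List[Tuple[int, int]]:
    """Return (onset, offset) minute indices for each run of sleep epochs.

    Assumed encoding (hypothetical): epochs[i] is 1 if minute i is labeled
    sleep and 0 if it is labeled wake. The onset is the first sleep minute
    of a run; the offset is the first wake minute after it (or the end of
    the recording if the run is still open).
    """
    episodes = []
    onset = None
    for i, label in enumerate(epochs):
        if label == 1 and onset is None:
            onset = i  # wake -> sleep transition
        elif label == 0 and onset is not None:
            episodes.append((onset, i))  # sleep -> wake transition
            onset = None
    if onset is not None:
        # Recording ended while the subject was still labeled asleep.
        episodes.append((onset, len(epochs)))
    return episodes


# Nine 1-min epochs containing two sleep runs.
labels = [0, 0, 1, 1, 1, 0, 0, 1, 1]
episodes = sleep_episodes(labels)
print(episodes)
```

In practice, short gaps or brief awakenings would likely need to be merged or filtered before such runs are treated as sleep episodes; that step is omitted here.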


The deep learning algorithm achieved a best sleep/wake classification accuracy of 96.5%, and sleep episode onset and offset detection F1 scores (measuring detection exactness and completeness) of 0.86 and 0.84, with mean errors of 5.0 and 5.5 min, respectively, when compared to the labels based on human-scored actigraphy with sleep diaries. Among all modalities, a combination of acceleration, skin temperature, and time data gave the best overall average performance.
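To make the two kinds of metrics concrete, the sketch below computes per-epoch accuracy and an F1 score for event detection, where F1 is the harmonic mean of precision (exactness) and recall (completeness). The matching rule is an assumption for illustration: a predicted onset counts as correct if it falls within a tolerance window of a not-yet-matched true onset. The tolerance value, the function names, and the example numbers are hypothetical and are not taken from the study.

```python
from typing import List


def epoch_accuracy(pred: List[int], truth: List[int]) -> float:
    """Fraction of epochs whose predicted label equals the reference label."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("pred and truth must be non-empty and equal length")
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def detection_f1(pred_times: List[float], true_times: List[float],
                 tolerance_min: float) -> float:
    """F1 for event detection with greedy one-to-one matching.

    Each predicted time is matched to the closest unmatched true time
    within tolerance_min minutes (an assumed matching rule).
    """
    unmatched = sorted(true_times)
    true_positives = 0
    for p in sorted(pred_times):
        candidates = [t for t in unmatched if abs(p - t) <= tolerance_min]
        if candidates:
            best = min(candidates, key=lambda t: abs(p - t))
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(pred_times) if pred_times else 0.0
    recall = true_positives / len(true_times) if true_times else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up example values, in minutes from the start of a recording.
acc = epoch_accuracy([0, 1, 1, 0], [0, 1, 0, 0])
f1 = detection_f1([100, 400, 900], [105, 410, 1500], tolerance_min=15)
print(acc, f1)
```

Here two of the three predicted onsets fall within 15 min of a true onset, so precision and recall are both 2/3, and so is F1.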


The results indicate that long-term ambulatory sleep/wake records from large populations can be measured unobtrusively and accurately by exploiting the ubiquity of smartphones and wearable sensors and the power of deep learning.
