Project

PhysioLLM

Mengying Cathy Fang

An LLM-enabled system that integrates physiological data from wearables and contextual information to offer personalized, real-time health monitoring and support.

The advent of wearable health monitors, such as Fitbit, Apple Watch, and Samsung Gear, has made it possible to continuously collect detailed physiological data, such as heart rate, activity, and sleep stages. These devices bring convenience and awareness to personal health and provide a granular look into one's habits and how they affect physiology. The resulting data and trends can help nudge healthier behavior and may even help detect health problems. Accessible and accurate health monitoring is important, but individuals who wish to change their habits must currently first understand their physiological data in depth, then work out how it correlates with their daily routine, and finally think of ways to work towards positive changes. Users often struggle to make sense of the data and translate it into meaningful actions \cite{canali2022challenges}. Interactions with the data are typically predefined by the graphical user interfaces of the phone and wearable, which offer limited interaction and generic recommendations with few personalized insights.

Large Language Models (LLMs) present a promising solution to these challenges. First, they enable individuals to ask unconstrained questions and receive answers in natural language. Second, they have the potential to relate health data and behaviors to a wealth of health literature. Third, LLMs have a semantic understanding of context that could grant flexibility in producing insights from raw data. Integrating LLMs with physiological data offers the potential to build systems that allow users to ask questions and receive personalized responses, enhancing their understanding of their health and motivating positive behavior changes. This research addresses two main questions: (1) how to implement an LLM-based system that generates personalized insights from physiological data and communicates them through natural language, and (2) how such a system impacts users' understanding of their data and helps them develop actionable health goals.

We designed PhysioLLM, a novel system that utilizes an orchestration of LLMs to deliver personalized insights by incorporating users' own data from already available wearable health trackers together with contextual information. Unlike conventional health applications, our system conducts statistical analyses of the user's data to uncover patterns and relationships within the data. As a case study, we focus on improving sleep as the main health goal. Sleeping well is one of the most important things for staying healthy physically and mentally. The latest wearable devices offer in-depth reports on sleep, providing information on sleep timing, sleep stages, and commonly used metrics such as wake time after sleep onset. They also typically provide a sleep score to indicate overall sleep quality. However, it is often not obvious to users how they can improve their sleep score or how their daytime activity relates to their sleep.
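To make the idea of uncovering relationships within a user's data concrete, the following is a minimal sketch of one possible analysis: pairwise Pearson correlations between daily metrics. It is not the PhysioLLM implementation. The metric names, the sample values, and the helper functions `pearson` and `pairwise_correlations` are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(data):
    """Correlate every pair of metrics; keys are name pairs in sorted order."""
    names = sorted(data)
    return {
        (a, b): pearson(data[a], data[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical one-week record (one value per day); the numbers are made up.
week = {
    "steps": [4200, 8100, 10350, 6500, 12000, 3900, 9000],
    "sleep_score": [71, 78, 84, 75, 88, 69, 80],
}

if __name__ == "__main__":
    for (a, b), r in pairwise_correlations(week).items():
        print(f"{a} vs {b}: r = {r:.2f}")
```

With only seven daily values per metric, a coefficient like this is a rough descriptive summary and should not be read as evidence of a causal link.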

To understand what might improve individuals' understanding of their data and what questions they might ask a conversational interface, we recruited actual users for an in-situ experiment. Twenty-four adult Fitbit users shared their most recent week of Fitbit data. Each participant used one of three text-based chatbots: the complete PhysioLLM system with personal data and insights, an LLM chatbot with personal data but no access to insights, or a placebo off-the-shelf LLM chatbot with no personal data or generated insights. Before and after interacting with the interface, they filled out a survey that assessed their understanding of their sleep data, how motivated they felt, and how actionable their goals were.

The results show that chatting with an LLM-based system that provides effective personalized insights through our LLM architecture improves users' understanding of their own health. The interface was perceived as more personalized than a generic LLM-based chatbot. In fact, chatting with the generic chatbot left users with less motivation to change, and their goals were found to be less actionable.

We also interviewed two sleep experts to review the personal insights generated by the system and its responses and suggestions provided to the user. Overall, the experts found the insights reasonable but noted the system's tendency to overemphasize correlation values. They suggested improving the system by providing the LLM with more background on the data generation process and tuning responses to be more modest when based on sparse data and potentially spurious correlations.
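The experts' suggestion to temper claims drawn from sparse data could be approached in several ways. One option, sketched below under stated assumptions, is to convert each correlation into a cautiously worded sentence before it reaches the LLM. The function name `describe_correlation`, the strength cut-offs (0.3 and 0.6), and the 14-day minimum are illustrative assumptions, not values taken from the study or from the system.

```python
def describe_correlation(metric_a, metric_b, r, n_days, min_days=14):
    """Turn a correlation into a hedged sentence suitable for an LLM prompt.

    The thresholds here are illustrative assumptions, not validated cut-offs.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    magnitude = abs(r)
    if magnitude < 0.3:
        strength = "weak"
    elif magnitude < 0.6:
        strength = "moderate"
    else:
        strength = "strong"
    direction = "higher" if r >= 0 else "lower"
    text = (
        f"Over {n_days} days, days with higher {metric_a} tended to have "
        f"{direction} {metric_b} ({strength} association, r = {r:.2f})."
    )
    if n_days < min_days:
        # Sparse data: ask the downstream model to stay modest.
        text += (
            " This is based on very few days, so treat it as a tentative"
            " pattern that may be spurious, not as a cause."
        )
    return text


if __name__ == "__main__":
    print(describe_correlation("step count", "sleep score", 0.82, n_days=7))
```

Passing the sample size and an explicit caveat alongside the coefficient gives the model the context it needs to avoid overstating a one-week pattern.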

In summary, the contributions of this work are:

A novel orchestration of LLMs that integrates physiological and contextual data to support conversations about personalized health insights.
An in-the-wild study with 24 users who interacted with the system, along with insights derived from its quantitative and qualitative results.
Evidence that the interface is perceived as personalized and effectively improves users' understanding of their health through personalized insights.
A preliminary evaluation by two sleep experts of the accuracy and quality of the generated personal insights and suggestions.