
Project

Simulating human well-being with large language models: Systematic validation and misestimation across 64,000 individuals from 64 countries


Nigel Hoare/Unsplash+


Can AI Predict Human Happiness? Researchers Simulate Well-Being for 64,000 Individuals Worldwide

A new study reveals both promise and peril in using large language models to simulate human well-being across diverse global populations

Artificial intelligence has become remarkably adept at mimicking human language. But can it predict something as deeply personal and culturally nuanced as human happiness?

A new study suggests the answer is complex, and reveals critical limitations that could have far-reaching implications for how AI is deployed in policy, healthcare, and development contexts worldwide. The research, recently published in the Proceedings of the National Academy of Sciences (PNAS), represents the first large-scale systematic evaluation of whether leading AI models can accurately estimate life satisfaction across dramatically different psychological, cultural, and economic contexts.

In a large-scale investigation spanning 64,000 individuals from 64 countries, researchers Pat Pataranutaporn, Chayapatr Achiwaranguprok, and Pattie Maes of the MIT Media Lab, together with behavioral and well-being economist Nattavudh Powdthavee of Nanyang Technological University (NTU), conducted one of the largest tests to date of whether large language models can simulate human well-being. Their findings reveal a striking duality: today's AI systems can reproduce broad patterns of global life satisfaction, but they also carry deep structural biases that risk obscuring the lived realities of millions.

"This work focuses on understanding the potential and limitations of using AI to model elements of human experience, not to diminish its richness, but to use these tools to meaningfully inform policy and interventions that improve the human condition," explains Pat Pataranutaporn, assistant professor at MIT, co-director of the AHA (Advancing Humans with AI) program at MIT Media Lab, and co-lead author of the study. Pataranutaporn and co-author Chayapatr Achiwaranguprok are members of the new Cyborg Psychology research group at MIT Media Lab, which investigates the intersection of AI and human cognition.

Abstract

Subjective well-being is a key metric in economic, medical, and policy decision-making. As artificial intelligence provides scalable tools for modelling human outcomes, it is crucial to evaluate whether large language models (LLMs) can accurately predict well-being across diverse global populations. We evaluate four leading LLMs using data from 64,000 individuals in 64 countries. While LLMs capture broad correlates such as income and health, their predictive accuracy decreases in countries underrepresented in the training data, highlighting systematic biases rooted in global digital and economic inequality. A pre-registered experiment demonstrates that LLMs rely on surface-level linguistic similarity rather than conceptual understanding, leading to systematic misestimations in unfamiliar or resource-limited settings. Injecting findings from underrepresented contexts substantially enhances performance, but a significant gap remains. These results highlight both the promise and limitations of LLMs in predicting global well-being, underscoring the importance of robust validation prior to their implementation across these areas.