Becky Ham | MIT Media Lab
As it becomes easier to create hyper-realistic digital characters using artificial intelligence, much of the conversation around these tools has centered on misleading and potentially dangerous deepfake content. But the technology can also be used for positive purposes — to revive Albert Einstein to teach a physics class, talk through a career change with your older self, or anonymize people while preserving facial communication.
To encourage the technology’s positive possibilities, MIT Media Lab researchers and their collaborators at the University of California at Santa Barbara and Osaka University have compiled an open-source, easy-to-use character generation pipeline that combines AI models for facial gestures, voice, and motion and can be used to create a variety of audio and video outputs.
The pipeline also marks the resulting output with a traceable, as well as human-readable, watermark to distinguish it from authentic video content and to show how it was generated — an addition to help prevent its malicious use.
By making this pipeline easily available, the researchers hope to inspire teachers, students, and health-care workers to explore how such tools can help them in their respective fields. If more students, educators, health-care workers, and therapists have a chance to build and use these characters, the results could improve health and well-being and contribute to personalized education, the researchers write in Nature Machine Intelligence.
“It will be a strange world indeed when AIs and humans begin to share identities. This paper does an incredible job of thought leadership, mapping out the space of what is possible with AI-generated characters in domains ranging from education to health to close relationships, while giving a tangible roadmap on how to avoid the ethical challenges around privacy and misrepresentation,” says Jeremy Bailenson, founding director of the Stanford Virtual Human Interaction Lab, who was not associated with the study.
Although the world mostly knows the technology from deepfakes, “we see its potential as a tool for creative expression,” says the paper’s first author Pat Pataranutaporn, a PhD student in professor of media technology Pattie Maes’ Fluid Interfaces research group.
Other authors on the paper include Maes; Fluid Interfaces master’s student Valdemar Danry and PhD student Joanne Leong; Media Lab Research Scientist Dan Novy; Osaka University Assistant Professor Parinya Punpongsanon; and University of California at Santa Barbara Assistant Professor Misha Sra.
Deeper truths and deeper learning
Generative adversarial networks, or GANs, a combination of two neural networks that compete against each other, have made it easier to create photorealistic images, clone voices, and animate faces. Pataranutaporn, with Danry, first explored the technology’s possibilities in a project called Machinoia, where he generated multiple alternative representations of himself (as a child, as an old man, as female) to have a self-dialogue about life choices from different perspectives. The unusual deepfaking experience made him aware of his “journey as a person,” he says. “It was deep truth: to uncover something about yourself you’ve never thought of before, using your own data on your own self.”
Self-exploration is only one of the positive applications of AI-generated characters, the researchers say. Experiments show, for instance, that these characters can make students more enthusiastic about learning and improve cognitive task performance. The technology offers a way for instruction to be “personalized to your interest, your idols, your context, and can be changed over time,” Pataranutaporn explains, as a complement to traditional instruction.
For instance, the MIT researchers used their pipeline to create a synthetic version of Johann Sebastian Bach, which had a live conversation with renowned cellist Yo-Yo Ma in Media Lab Professor Tod Machover’s musical interfaces class, to the delight of both the students and Ma.
Other applications might include characters who help deliver therapy, to alleviate a growing shortage of mental health professionals and reach the estimated 44 percent of Americans with mental health issues who never receive counseling, or AI-generated content that delivers exposure therapy to people with social anxiety. In a related use case, the technology can be used to anonymize faces in video while preserving facial expressions and emotions, which may be useful for sessions where people want to share personally sensitive information such as health and trauma experiences, or for whistleblowers and witness accounts.
But there are also more artistic and playful use cases. In this fall’s Experiments in Deepfakes class, led by Maes and research affiliate Roy Shilkrot, students used the technology to animate the figures in a historical Chinese painting and to create a dating “breakup simulator,” among other projects.
Legal and ethical challenges
Many of the applications of AI-generated characters raise legal and ethical issues that must be discussed as the technology evolves, the researchers note in their paper. For instance, how will we decide who has the right to digitally recreate a historical character? Who is legally liable if an AI clone of a famous person promotes harmful behavior online? And is there any danger that we will prefer interacting with synthetic characters over humans?
“One of our goals with this research is to raise awareness about what is possible, ask questions and start public conversations about how this technology can be used ethically for societal benefit. What technical, legal, policy and educational actions can we take to promote positive use cases while reducing the possibility for harm?” states Maes.
By sharing the technology widely, while clearly labeling it as synthesized, Pataranutaporn says, “we hope to stimulate more creative and positive use cases, while also educating people about the technology’s potential benefits and harms.”
Pataranutaporn, Pat, Valdemar Danry, and Pattie Maes. "Machinoia, Machine of Multiple Me: Integrating with Past, Future and Alternative Selves." Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems. 2021.
Danry, Valdemar, et al. "Wearable Reasoner: Towards Enhanced Human Rationality Through A Wearable Device With An Explainable AI Assistant." ACM Augmented Humans 2020.