AlterEgo is a wearable silent speech system for seamless natural language communication with computing devices and other people. AlterEgo seeks to augment human intelligence and make computing, the Internet, and machine intelligence a natural extension of the user's own cognition by enabling a silent, discreet, and seamless conversation between person and machine. The wearable system reads electrical impulses from the surface of the skin in the lower face and neck that occur when a user internally vocalizes words or phrases, without actual speech, voice, or discernible movements. The system is a private, personal, and seamless alternative to the computing platforms used today.
The goals of AlterEgo are to cognitively augment humans, change the way people communicate with one another, and enable a discreet gateway to digital information (services and applications) where the interaction feels intrinsic to the user rather than extrinsic.
Our current interfaces are a barrier to effortless and private human-machine communication. People either have to shift their attention away from their surroundings to type, or they have to say their private messages out loud, in public. AlterEgo overcomes these barriers by allowing users to silently and seamlessly interface with a computer without the need for explicit actions. It enables a method of human-computer interaction without obstructing the user’s usual perception, thereby letting the user remain present in her surroundings.
A user's deliberate internal speech is characterized by neuromuscular signals in internal speech articulators that are captured by the AlterEgo system to reconstruct this speech. We use this to facilitate a novel user interface where a user can silently communicate in natural language and receive auditory output through bone conduction headphones, thereby enabling discreet, bi-directional interaction with a computing device, and providing a seamless form of intelligence augmentation.
Silent speech is different from either thinking of words or saying words out loud. Remember when you first learned to read? At first, you spoke the words you read out loud, but then you learned to voice them internally and silently. In order to then proceed to faster reading rates, you had to unlearn the “silent speaking” of the words you read. Silent speaking is a conscious effort to say a word, characterized by subtle movements of internal speech organs without actually voicing it. The process results in signals from your brain to your muscles which are picked up as neuromuscular signals and processed by our device.
No, this device cannot read your mind. The novelty of this system is that it reads signals from your facial and vocal cord muscles when you intentionally and silently voice words. The system does not have any direct and physical access to brain activity, and therefore cannot read a user's thoughts. It is crucial that the control over input resides absolutely with the user in all situations, and that such an interface not have access to a user's thoughts. The device only reads words that are deliberately silently spoken as inputs.
The AlterEgo system consists of the following components:
(1) A new peripheral myoneural interface for silent speech input which reads endogenous electrical signals from the surface of the face and neck that are then processed to detect words a person is silently speaking;
(2) Hardware and software to process electrophysiological signals, including a modular neural network based pipeline trained to detect and recognize word(s) silently spoken by the user;
(3) An intelligent system that processes user commands/queries and generates responses; and
(4) Bone conduction output to give audio information back to the user, such as the answer to a question, or confirmation regarding a command.
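The four components above can be read as a pipeline from electrode signals to audio feedback. The following is a minimal Python sketch of that flow, not the project's implementation. Every name in it is hypothetical, the signal format is assumed, and a nearest-template lookup stands in for the neural network, whose architecture the source does not describe.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

# All names here are hypothetical illustrations; the source does not specify
# the signal format, the recognition model, or the application interface.
Signal = List[List[float]]  # one list of samples per electrode channel


def preprocess(signal: Signal) -> Signal:
    """Component 2 (part): subtract each channel's mean, a stand-in for real filtering."""
    cleaned = []
    for channel in signal:
        mean = sum(channel) / len(channel) if channel else 0.0
        cleaned.append([x - mean for x in channel])
    return cleaned


def features(signal: Signal) -> List[float]:
    """Component 2 (part): mean absolute value per channel as a toy feature vector."""
    return [sum(abs(x) for x in ch) / len(ch) if ch else 0.0 for ch in signal]


@dataclass
class NearestTemplateRecognizer:
    """Stand-in for the neural network: picks the closest per-user word template."""
    templates: Dict[str, List[float]]  # word -> feature vector from user training data

    def recognize(self, feats: Sequence[float]) -> str:
        def squared_distance(word: str) -> float:
            return sum((a - b) ** 2 for a, b in zip(feats, self.templates[word]))
        return min(self.templates, key=squared_distance)


def respond(word: str) -> str:
    """Component 3: toy command handler with a made-up two-word vocabulary."""
    replies = {"time": "It is noon.", "add": "Added."}
    return replies.get(word, "Sorry, I did not catch that.")


def bone_conduction_output(text: str) -> str:
    """Component 4: stand-in for audio playback; returns what would be spoken."""
    return f"[audio] {text}"


def run_pipeline(signal: Signal, recognizer: NearestTemplateRecognizer) -> str:
    """Component 1 supplies `signal`; the remaining stages run in order."""
    word = recognizer.recognize(features(preprocess(signal)))
    return bone_conduction_output(respond(word))


if __name__ == "__main__":
    recognizer = NearestTemplateRecognizer({"time": [1.0, 0.0], "add": [0.0, 1.0]})
    sample = [[1.0, -1.0, 1.0, -1.0], [0.0, 0.0, 0.0, 0.0]]
    print(run_pipeline(sample, recognizer))  # prints "[audio] It is noon."
```

Keeping each stage a separate function mirrors the modular pipeline described in component (2): the recognizer could be swapped for a trained model without touching signal capture or audio output.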
We live in a world where we frequently interface with computers, but at the cost of our daily face-to-face communication and/or our privacy. AlterEgo allows users to seamlessly and efficiently interface with their computing devices without unplugging from their environments or even disrupting their real-world interactions. AlterEgo puts the power of computing within the user’s self, instead of at her fingertips, so that users can indiscernibly and effortlessly interface with a computer to record their ideas, send private messages, look up information, compute arithmetic, or interface with AI assistants. AlterEgo facilitates intelligence augmentation by removing the social and physical overhead of human-machine communication.
The platform opens up a wide range of possibilities. It allows a human user to connect to the Internet and access the knowledge of the web in real time as an extension of the user’s self; a user could internally vocalize a Google query and receive the answer through bone conduction without any observable action at all. The platform seeks to augment human speech as a way to transmit information between people.
The system has implications for telecommunications, where people could communicate with the ease and bandwidth of vocal speech with the addition of the fidelity and privacy that silent speech provides. The system can act as a digital memory; the user could internally record streams of information and access these at a later time through the system. Users with memory problems can silently ask the system to remind them of the name of an acquaintance or an answer to a question, without the embarrassment that comes from openly asking for this information. The system allows a human user to control Internet of Things (IoT) devices and diverse appliances without any observable action. AlterEgo, with a connection to a Bluetooth speaker, allows a user to internally vocalize a phrase and then translates the phrase into another language to enable multilingual conversation.
One example application uses AI Go and chess engines in conjunction with AlterEgo, enabling a human user to access the expertise of an AI in real time, as though it were a part of the user herself. The platform therefore lets any user play Go and chess at an expert level, in a demonstration of how AlterEgo could augment human decision-making through machine intelligence. We imagine a possible future scenario where doctors might internally and silently consult with a clinical decision-making AI agent through AlterEgo in order to improve the provision of medical care.
What is key is that the user does not have to disconnect from her surroundings to use computer services. The system can enhance the user’s engagement in the present moment or conversation. For example, when someone uses a word in a meeting that you don’t know, you can silently ask the system for a definition, so as to not be left out of the conversation. When you’ve met someone previously but have forgotten her name, the system can silently consult your address book to help you out.
The focus of our development so far has been on getting the technology to work. We have not prioritized the form factor, but believe that improvements in electrode design, modeling electrophysiology, materials, and industrial design would help the wearable to become inconspicuous. The design of the device is intrinsically connected to the human neurophysiology, and we are further researching more minimal device designs that afford accurate recognition of silent speech.
We currently have a working prototype that, after training with user-specific example data, demonstrates over 90% accuracy on an application-specific vocabulary. The system is currently user-dependent and requires individual training. We are currently working on iterations that would not require any personalization.
There is a long history of research both in academia and in industry on brain-to-computer communication. The approaches can be categorized as invasive (implanted) or non-invasive (external). Even though, to the best of our knowledge, there are no public, real-time, discreet conversational brain-computer interfaces, we compare our strategy with the traditional approach of reading from the brain directly. In the category of non-invasive brain-computer systems, most approaches are based on reading information directly from the brain using sensors positioned on the skull. While these systems are not invasive, they are still intrusive in that the system has direct access to brain activity. It is important that a device that would function as an everyday computational interface not have any access to a person's private thoughts.
Our research with AlterEgo is focused on building a device that is non-invasive as well as non-intrusive, where the input to the computing device is a deliberate and streamlined input on the part of the human user, with the human user having absolute control over what information she transmits to another person or computing device. AlterEgo reads information from the peripheral somatic system through internal speech movements, rather than directly from the brain. It detects the signals users send to their mouth and vocal cords when deliberately, but silently, voicing words. It does not read the thoughts coming up in the user’s mind, only the words a user consciously intends to send to the device, thereby enabling the user to keep her thoughts private. There have been other attempts at creating silent speech systems, but none adopt the more successful approach of using electrodes, signal processing, and machine learning that AlterEgo is based upon to enable a real-time silent computing system.
The system could possibly help people with speech impairments such as apraxia, cluttering, and other voice disorders. We have yet to conduct formal and extensive studies on how the platform could help people with speech impairments.
This is a university-based research project. We are continuing to develop the system, focusing on improvements such as reducing the number of electrodes required, designing more socially acceptable form factors, improving the neural networks that recognize the silent speech, reducing the training and customization required, and, last but not least, designing the end-to-end user experience and applications. Any hopes for commercialization are premature.
Sorry, not at this moment.
For general and technical inquiries: arnavk@mit.edu OR alterego@media.mit.edu
Press inquiries: press@media.mit.edu (please also visit the press kit)
Questions about sponsorship of Media Lab research: contact member-info@media.mit.edu