- MAS Affiliated Graduate Students
Stephen P. Kaputsos is an Embodied / Physical AI Architect, Researcher, and Innovator working at the intersection of physical AI, multimodal AI perception, and human-machine systems - architecting intelligent machines and autonomous systems that perceive, reason, and act in the real world. He holds Master of Science (S.M.) degrees from both Johns Hopkins University (JHU) and MIT, each with a 5.0 GPA. He draws on an interdisciplinary foundation in sensory & perceptual neuroscience, multimodal AI signal processing, robotic simulation, and spatial computing to architect and optimize physical AI systems that operate at the frontier of machine capability - using biological intelligence not only as inspiration but as a technical blueprint.
Stephen's research in the MIT graduate program focused on embodied AI training environments, cognitive architectures, and multimodal AI perceptual systems for autonomous robotic and aerial platforms, including drone systems. His work spans the full stack of physical AI: from multimodal AI signal acquisition, neural signal modeling, and bio-inspired sensory processing, to the design and application of machine learning (ML) and deep learning (DL) architectures underlying embodied AI systems, world model development, sim-to-real transfer, and the deployment of Vision-Language-Action models on robotic systems operating in unstructured real-world environments. A core thread throughout is the architecting of rich multimodal AI communication pipelines - integrating visual, auditory, tactile, and neural signal streams into unified representations that enable autonomous machines to perceive and act with biological-level sensory fidelity. He designs and builds multi-agent systems and agentic frameworks capable of coordinated autonomous behavior, alongside digital twin infrastructure that supports large-scale robot learning and physical simulation.
At the human-machine boundary, Stephen pioneers the interface layer where biological and artificial intelligence meet - leveraging multimodal AI signals, extended reality (XR), and edge AI to create systems where humans and autonomous machines communicate and collaborate fluidly across sensory modalities. This work extends naturally toward brain-computer interfaces and neural interface technologies, where his expertise in neural signal processing and multimodal AI communication positions him at the frontier of direct human-to-machine signaling. As autonomous systems grow more capable and more embedded in the physical and cognitive fabric of society, his research addresses not just how machines perceive and act - but how biological and artificial systems will increasingly share signals, perception, agency, and decision-making.
A researcher and recognized innovator whose work spans industry, academia, and high-stakes operational environments, Stephen has led teams, shaped research agendas, and designed and delivered a dedicated embodied AI training program for technology leaders and senior executives of the United States Air Force - bringing frontier physical AI research directly into operational practice. A graduate of the American Management Association's Strategic Leadership Program, he complements his technical vision with formal organizational leadership training. He has also served as a keynote speaker at RealityHack, one of the world's largest XR events. Stephen brings both the technical depth to architect next-generation physical AI systems and the vision to lead the teams and organizations building them - wherever physical AI and embodied intelligence are shaping the future.