
Publication

WatchThis: A Wearable Point-and-Ask Interface powered by Vision-Language Models for Contextual Queries

Copyright

Cathy Mengying Fang

Fang, C. M., Chwalek, P., Kuang, Q., & Maes, P. (2024, October). WatchThis: A Wearable Point-and-Ask Interface powered by Vision-Language Models for Contextual Queries. In Adjunct Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (pp. 1-4).

Abstract

This paper introduces WatchThis, a novel wearable device that enables natural language interactions with real-world objects and environments through pointing gestures. Building upon previous work in gesture-based computing interfaces, WatchThis leverages recent advancements in Large Language Models (LLMs) and Vision Language Models (VLMs) to create a hands-free, contextual querying system. The prototype consists of a wearable watch with a rotating, flip-up camera that captures the area of interest when pointing, allowing users to ask questions about their surroundings in natural language. This design addresses limitations of existing systems that require specific commands or occupy the hands, while also maintaining a non-discreet form factor for social awareness. The paper explores various applications of this point-and-ask interaction, including object identification, translation, and instruction queries.
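The paper itself does not include code, but the point-and-ask flow the abstract describes (capture a frame when the user points, pair it with a spoken question, and submit both to a VLM) can be sketched as below. The function name, model name, and request shape are illustrative assumptions modeled on a common OpenAI-style vision chat API; the request is only constructed here, not sent, and is not the authors' implementation.

```python
import base64
import json


def build_point_and_ask_query(image_bytes: bytes, question: str) -> dict:
    """Package a pointing-gesture camera capture and a natural-language
    question into a VLM chat request body (assumed OpenAI-style schema;
    the request is constructed but not sent)."""
    image_b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "gpt-4o",  # placeholder model name, an assumption
        "messages": [
            {
                "role": "user",
                "content": [
                    # The user's spoken question, transcribed to text.
                    {"type": "text", "text": question},
                    # The frame captured by the flip-up watch camera,
                    # inlined as a base64 data URL.
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/jpeg;base64,{image_b64}"
                        },
                    },
                ],
            }
        ],
    }


# Stand-in bytes for a JPEG frame captured while the user points.
fake_frame = b"\xff\xd8\xff\xe0fake-jpeg-bytes"
payload = build_point_and_ask_query(fake_frame, "What is this object?")
print(json.dumps(payload)[:40])
```

In a real deployment this payload would be POSTed to the VLM endpoint over the watch's network link, and the model's text answer read back to the user.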
