MEAM Ph.D. Thesis Defense: “Exploring Multimodal Sensing Across the Stack for Robot Manipulation”
November 11, 2024 at 11:00 AM - 12:00 PM
Despite substantial progress in robotics, achieving human-like manipulation remains a significant challenge. Existing robotic systems typically leverage human-inspired sensory modalities: vision, touch, and proprioception. However, these modalities have historically been studied and integrated in isolation, leading to limited performance in complex real-world tasks that require sensing across multiple modalities for robust generalization. As a result, robots have struggled to transition from structured lab environments to effective real-world applications.
This persistent challenge highlights two critical limitations: a narrow focus on only human-inspired senses, and the isolated integration of vision, touch, and proprioception. Neither provides robots with the adaptability they need for the real world. While traditional approaches have focused on each of these three modalities individually, complementary modalities and tightly integrated multimodal systems remain underexplored. With the unprecedented availability of diverse off-the-shelf sensors, powerful on-board computation to process rich data streams, advances in data-driven control and perception frameworks, and a new spotlight on robotic system integration, we now face a unique opportunity to construct new multimodal sensing paradigms.
This thesis explores three complementary aspects of multimodality selection and integration across sensor design, perception, and reinforcement learning (RL) control. First, we address the challenge of integrating an additional modality without compromising existing functionality, a sensing mechanism design problem that often forces trade-offs between different sensing modes. We leverage a selectively transmissive membrane to enable proximity depth sensing that seamlessly augments the visuotactile modality. Next, we exploit the complementary nature of our sensor’s dual modalities for sensor fusion. We demonstrate how carefully combining proximity and tactile modalities can enhance perception, enabling more robust and informative contact patch detection. Finally, we bridge the reality gap in robot policy learning, where sim-to-real transfer is particularly challenging due to the complex physics of contact-rich manipulation. By developing a sim-to-real tactile skin model, we achieve zero-shot transfer of tactile data during an in-hand translation task, allowing us to evaluate the impact of combining tactile feedback with proprioception in this dexterous control task.
Jessica Yin
Ph.D. Candidate, Department of Mechanical Engineering & Applied Mechanics, University of Pennsylvania
Jessica Yin is advised by Mark Yim.