
Gesturing Toward Abstraction
Multimodal Convention Formation in Collaborative Physical Tasks
Kiyosu Maeda, William P. McCarthy, Ching-Yi Tsai, Jeffrey Mu, Haoliang Wang, Robert D. Hawkins, Judith E. Fan, Parastoo Abtahi
ACM CHI'26 PAPER | 13 APRIL 2026

How do people use gestures and language to establish conventions and adapt communication over repeated physical collaboration?

This work investigates how communication strategies evolve through repeated collaboration as people coordinate on shared procedural abstractions. We first conducted an online unimodal study (n = 98) using natural language to probe abstraction hierarchies. In a follow-up lab study (n = 40), we then examined how multimodal communication (speech and gesture) changed during physical collaboration. Based on these findings, we extend probabilistic models of convention formation to multimodal settings, capturing shifts in modality preferences.
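For intuition on what a probabilistic model of convention formation with modality preferences might look like, here is a minimal sketch of an RSA-style (rational speech acts) speaker that trades off informativity against a per-modality production cost. This is an illustration of the general model family, not the paper's implementation; the function names, lexicon entries, cost values, and the `alpha` parameter are all assumptions made for the example.

```python
import numpy as np

# Minimal illustrative sketch (NOT the paper's model): an RSA-style speaker
# that jointly chooses an utterance and a modality. The lexicon and costs
# below are made-up values for demonstration only.

def speaker_probs(lexicon, modality_costs, alpha=3.0):
    """
    lexicon: dict mapping (utterance, modality) -> np.array over meanings,
             interpreted as the literal listener's P(meaning | signal).
    modality_costs: dict mapping modality -> production cost.
    Returns the signal list and P(signal | target meaning) as a matrix.
    """
    keys = list(lexicon.keys())
    # Utility = informativity (log-prob of the intended meaning) - cost.
    utilities = np.stack([
        np.log(lexicon[(u, m)] + 1e-9) - modality_costs[m]
        for (u, m) in keys
    ])  # shape: (num_signals, num_meanings)
    scores = np.exp(alpha * utilities)
    probs = scores / scores.sum(axis=0, keepdims=True)
    return keys, probs  # probs[i, t] = P(signal i | target meaning t)

# Toy example: two meanings, each expressible in speech or gesture.
lexicon = {
    ("rotate it", "speech"): np.array([0.9, 0.1]),
    ("twist-hand", "gesture"): np.array([0.8, 0.2]),
    ("place left", "speech"): np.array([0.1, 0.9]),
    ("point-left", "gesture"): np.array([0.2, 0.8]),
}
# As a gesture becomes conventionalized over repetitions, one could lower
# its cost (or sharpen its lexicon entry), shifting probability mass
# between modalities -- the kind of shift the paper's model captures.
keys, probs = speaker_probs(lexicon, {"speech": 0.5, "gesture": 0.2})
for (u, m), p in zip(keys, probs[:, 0]):
    print(f"P({u!r}, {m}) for meaning 0 = {p:.2f}")
```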

Multimodal Conventions Teaser
Shifts in multimodal instructions (speech and gesture) from the first repetition (R1) to the final repetition (R4). A) Instructions shift from redundant in R1 to complementary in R4 for block position and orientation. B) For abstract tower-level instructions, no position or orientation information is provided when a convention is established in R1, but in R4 redundancy is introduced to emphasize position and orientation changes. C) The virtual target tower on the 2×2 grid.

Note

See the official project website here, where you can find the dataset, code for both the unimodal and multimodal studies, and the app we built for viewing our data and experiment sessions.

I work with Kiyosu in the PSI Lab on this project, helping him run experiments, code data, and analyze results to distill insights.