Authors

Mark Higger and Zhuoyi Wang and Polina Rygina and Lara Ferreira Bezerra and Logan Daigler and Zane Aloia and Sheena Wu and Amanda Chen and Ishani Pandey and Tom Williams

Venue

ACM/IEEE International Conference on Human-Robot Interaction

Publication Year

2026

Abstract

Effective communication is critical to the success of many types of human-robot interaction. A key capability for enabling effective communication is accurate modeling of the cognitive status that entities hold (e.g., modeling what objects interlocutors are currently thinking about, or are generally aware of). However, existing models of cognitive status estimation are not well suited for situated and embodied interactions, as they do not account for nonverbal cues, which are a key way in which humans moderate cognitive status. To address this gap, we make three primary contributions. First, we introduce the BOWTIE corpus of dialogues from a multi-modal open-world referential task, annotated with cognitive status, gesture type, linguistic roles, and grammatical roles of entities across utterances. Second, we introduce RECS, the first LSTM-based model of cognitive status, which we train on the BOWTIE corpus. Third, we present empirical evidence for the success of RECS. These contributions stand to accelerate the future use of cognitively informed algorithms for robot language understanding and generation.