Lars Kunze and Tom Williams and Nick Hawes and Matthias Scheutz
AAAI Fall Symposium on AI and HRI
The ability to refer to entities such as objects, locations, and people is an important capability for robots designed to interact with humans. For example, a referring expression (RE) such as “Do you mean the box on the left?” might be used by a robot seeking to disambiguate between objects. In this paper, we present and evaluate algorithms for Referring Expression Generation (REG) in small-scale situated contexts. We first present data on how humans generate small-scale spatial REs. We then use this data to define five categories of observed small-scale spatial REs, and use these categories to create an ensemble of REG algorithms. Next, through a set of interrelated crowdsourced experiments, we evaluate REs generated by those algorithms and by humans both subjectively (by having participants rank REs) and objectively (by assessing task performance when participants use REs). While our machine-generated REs were subjectively rated lower than those generated by humans, they significantly outperformed human REs on the objective task-performance measure. Finally, we discuss the main contributions of this work: (1) a dataset of images and REs, (2) a categorization of observed small-scale spatial REs, (3) an ensemble of REG algorithms, and (4) a crowdsourcing-based framework for subjectively and objectively evaluating REG.