Our research seeks to enable and understand human-robot communication that is sensitive to environmental, cognitive, social, and moral context.
Open World Language Understanding and Generation
A key component of natural-language communication is the ability to refer to objects, locations, and people of interest. Accordingly, language-capable robots must be able to resolve references made by humans and generate their own referring expressions. In our research, we have developed a reference architecture for language-capable robots (within the DIARC Cognitive Robotic Architecture [Cognitive Architectures ’19]) that facilitates both these capabilities.
At the core of this architecture is the Consultant Framework [IIB’17], which allows our language understanding and generation components to “consult with” different sources of knowledge throughout the robot architecture without needing to know how each source represents its knowledge or where it is located (i.e., on what machine that component is running).
We have demonstrated how this Consultant Framework can be leveraged for both reference resolution and referring expression generation, allowing us to understand and generate referring expressions under uncertainty, when relevant information is distributed across different machines, and when multiple types of knowledge must be leveraged simultaneously (e.g., when an object is described with respect to a location, which may be challenging when objects and locations are represented differently and stored in different places). Moreover, our work on reference resolution is especially novel in that we are one of the few groups researching open-world reference resolution [AAAI’16][IROS’15][COGSCI’15]. Our reference resolution algorithms not only do not require target referents to be currently visible (as is typically required), but do not even require them to be known a priori. For example, a robot that hears “The medical kit in the room at the end of the hall” when the hall is known but the room and medical kit are not will automatically create new representations for the medical kit and the room, and record their relationships with each other and with the hallway, so that they can be searched for and/or identified [AAAI’13]. We have also demonstrated how this framework can be used for referring expression generation [INLG’17].
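The open-world behavior in the medical kit example can be illustrated with a minimal Python sketch. This is not the lab's implementation: the `Consultant` class, its `candidates`, `assess`, and `posit` methods, the `resolve_open_world` function, the 0/1 confidence values, and the threshold are all hypothetical names and simplifications chosen for illustration. The idea shown is only that a resolver can query knowledge sources through a uniform interface and, when nothing known matches, create a new representation instead of failing.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Consultant:
    """Hypothetical knowledge source with a uniform query interface."""
    domain: str
    facts: dict = field(default_factory=dict)  # entity id -> set of property names
    _ids: count = field(default_factory=count, repr=False)

    def candidates(self):
        """Return the ids of all entities this consultant knows about."""
        return list(self.facts)

    def assess(self, entity, prop):
        """Toy confidence: 1.0 if the property is recorded, otherwise 0.0."""
        return 1.0 if prop in self.facts[entity] else 0.0

    def posit(self, props):
        """Create a representation for a previously unknown entity."""
        entity = f"{self.domain}_new{next(self._ids)}"
        self.facts[entity] = set(props)
        return entity


def resolve_open_world(consultant, props, threshold=0.5):
    """Return (entity, was_created) for a non-empty list of properties.

    The best-scoring known entity is returned if it clears the threshold
    (an assumed value); otherwise a new entity is posited.
    """
    best, best_score = None, 0.0
    for entity in consultant.candidates():
        score = min(consultant.assess(entity, p) for p in props)
        if score > best_score:
            best, best_score = entity, score
    if best is not None and best_score >= threshold:
        return best, False
    return consultant.posit(props), True


# "The medical kit in the room at the end of the hall": only the hall is known.
locations = Consultant("loc", {"loc_hall": {"hall"}})
objects = Consultant("obj")

hall, hall_created = resolve_open_world(locations, ["hall"])
room, room_created = resolve_open_world(locations, ["room"])
kit, kit_created = resolve_open_world(objects, ["medical kit"])

# Record the relationships so the new entities can later be searched for.
relations = [("at_end_of", room, hall), ("in", kit, room)]
```

Here the hall resolves to an existing entity, while the room and medical kit are newly created and linked to it. A real system would replace the 0/1 scores with graded confidence and would distribute the consultants across machines.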
Our work on reference resolution, referring expression generation, pragmatic understanding, and pragmatic generation is tied together by our work on clarification request generation. We have demonstrated how these approaches can be integrated such that the uncertainty intervals on the interpretations produced by pragmatic understanding reflect the robot’s uncertainty and ignorance as to the intention or reference of its interlocutor’s utterance; how the dimensions of this interval can be used to determine when to ask for clarification; and how the clarification requests generated by the integrated pragmatic generation and referring expression generation components comply with human preferences [RSS’17][AURO’18].
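One way to picture how the dimensions of an uncertainty interval could drive the decision to ask for clarification is the Python sketch below. It assumes an interval of the form [belief, plausibility] on a candidate interpretation, treats the interval's width as ignorance and its closeness to 0.5 as uncertainty, and uses made-up thresholds. The function name, the two measures, and the threshold values are illustrative assumptions, not the criteria used in the cited work.

```python
def needs_clarification(belief, plausibility, max_width=0.3, min_margin=0.2):
    """Decide whether to ask for clarification about one interpretation.

    belief, plausibility: lower and upper bounds of the uncertainty interval.
    max_width, min_margin: hypothetical thresholds chosen for illustration.
    """
    if not 0.0 <= belief <= plausibility <= 1.0:
        raise ValueError("expected 0 <= belief <= plausibility <= 1")
    width = plausibility - belief              # wide interval: the robot is ignorant
    midpoint = (belief + plausibility) / 2.0
    margin = abs(midpoint - 0.5)               # near 0.5: the robot is uncertain
    return width > max_width or margin < min_margin


confident = needs_clarification(0.90, 0.95)   # narrow and far from 0.5
ignorant = needs_clarification(0.20, 0.90)    # wide interval
uncertain = needs_clarification(0.45, 0.55)   # narrow but centered on 0.5
```

Under these assumed thresholds, only the first interval is treated as safe to act on; the other two would trigger a clarification request.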
Enabling robots to understand and generate natural language descriptions… and to infer new information about the world through these descriptions.
Cognitive Memory Architectures and Anaphora Processing
The Consultant Framework and our algorithms for reference resolution (POWER) and referring expression generation (PIA) represent the first level of our reference architecture. On top of this is built a second level inspired by the Givenness Hierarchy, a theory from cognitive linguistics. The Givenness Hierarchy suggests a relationship between different referring forms (e.g., “it”, “this”, “that”) and different tiers of cognitive status (e.g., “in focus”, “activated”, “familiar”). In our research, we have used the Givenness Hierarchy to guide the process of reference resolution, such that depending on what referring form is used, a robot may opt to search through a small, cognitively inspired data structure (e.g., the Focus of Attention or Short-Term Memory) before searching through all of Long-Term Memory (which in our architecture is composed of the set of distributed knowledge bases managed by Consultants) [HRI’16][Oxford Handbook of Reference ’19][RSS:HCR’18]. We have shown how this leads to more effective and efficient reference resolution. In recent work, we are exploring how statistical models can be trained to directly estimate the cognitive status of a given entity based on how it has appeared in previous dialogue, models we hope to use for Givenness Hierarchy theoretic language generation [COGSCI’20a].
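The tiered search described above can be sketched in Python as follows. The mapping from referring form to starting tier, the tier names, and the `gh_resolve` function are simplifying assumptions made for illustration; the actual search plans used in the cited work are not given here. The sketch only shows why starting in a small structure can save work: the resolver counts how many entities it inspects before finding a match.

```python
# Memory tiers ordered from smallest to largest (names are illustrative).
TIERS = ["focus", "short_term", "long_term"]

# Hypothetical mapping from referring form to the first tier searched.
START_TIER = {"it": "focus", "this": "short_term", "that": "short_term"}


def gh_resolve(form, matches, memory):
    """Search tiers in order, starting from the tier suggested by the form.

    form:    referring form, e.g. "it", "this", "that", "the"
    matches: predicate that returns True for an acceptable referent
    memory:  dict mapping tier name -> list of entities
    Returns (entity, tier, entities_inspected), or (None, None, n) on failure.
    """
    start = TIERS.index(START_TIER.get(form, "long_term"))
    inspected = 0
    for tier in TIERS[start:]:
        for entity in memory[tier]:
            inspected += 1
            if matches(entity):
                return entity, tier, inspected
    return None, None, inspected


mug = {"id": "mug1", "type": "mug"}
box = {"id": "box1", "type": "box"}
kit = {"id": "kit1", "type": "medical kit"}

memory = {
    "focus": [mug],
    "short_term": [mug, box],
    "long_term": [kit, box, mug],  # stands in for the distributed knowledge bases
}

it_result = gh_resolve("it", lambda e: True, memory)
that_result = gh_resolve("that", lambda e: e["type"] == "box", memory)
the_result = gh_resolve("the", lambda e: e["type"] == "medical kit", memory)
```

In this toy setup, “it” is resolved after inspecting a single in-focus entity, “that box” is found in short-term memory, and a definite description goes straight to long-term memory.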
Outside of a Givenness Hierarchy theoretic context, we have also been exploring feature-based models of working memory, and have shown how augmenting sources of knowledge with Short Term Memory buffers may improve referring expression generation [ICSR’18].
Endowing robots with human-like models of working memory and attention, and leveraging these models to allow robots to efficiently and accurately understand and generate anaphoric and pronominal expressions.
Collaborators: Tufts University (HRI Lab)
Neurophysiological Modulation of Mixed-Reality Communication
Virtual, Augmented, and Mixed-Reality Technologies offer powerful new tools for human-robot interaction (as well as for conducting HRI research in general [VAMR’20]). In the MIRRORLab, we are particularly interested in the use of Augmented Reality as a new modality for robot communication. In human-robot interaction, natural language references are often accompanied by deictic gestures such as pointing, using human-like arm motions. However, with advancements in augmented reality technology, new options become available for deictic gesture, which may be more precise in picking out the target referent, and require less energy on the part of the robot. We have presented the first conceptual framework for categorizing the types of gestures available to robots within mixed-reality environments [VAMR’18] (and more generally, the ways in which AR and VR can increase robots’ potential for interaction [HRI-LBR’19][VAM-HRI’19]), conducted the first human-subject experiments exploring the potential utility of allocentric gestural visualizations [HRI’19][VAM-HRI’19][VAMR’19], and explored the use of VR testbeds for prototyping AR visualizations [SPIE-XR’20].
This research also relates to work we have been performing in human-robot gesture outside the context of augmented reality [HRI:NLG’20].
Trust- and Workload-Sensitive Communication
The Lunar Orbital Platform-Gateway will serve as a staging point for manned and unmanned missions to the Moon, Mars, and beyond. While the Gateway will sustain human crews for short periods of time, it will be primarily staffed by autonomous caretaker robots like the free-flying Astrobee platform, which will be the Gateway’s sole residents during quiescent (uncrewed) periods. This creates a unique human-technical system composed of two categories of human teammates (ground control workers permanently stationed on Earth, and astronauts who may transition over time between work on Earth, the Gateway, the Moon, and Mars) and three types of machine teammates (robot workers stationed on the Gateway, robot workers stationed on the Moon and Mars, and the Gateway itself). Our work seeks to address fundamental challenges for effective communication between human and machine teammates within this unique human-technical system, by drawing on both foundational and recent theoretical work on defining, operationalizing, and measuring the constructs of trust and workload in human-autonomous teams.
As a first step towards this goal, we are using virtual reality technologies to explore the different communication needs for nonvehicular mobile robots vs. autonomous vehicles such as robotic wheelchairs, in order to build human-robot trust.
This builds on our previous experience using VR technologies for a variety of purposes. In past work, for example, we used Virtual Reality for robot teleoperation, presenting a low-cost method for VR-based robot teleoperation [VAM-HRI’18] and using VR-based teleoperation to explore human perception of robot teleoperators [IROS’17].
Helping robots balance the workload costs and social benefits of proactive communication, and exploring how trust develops between humans and teams of robots with distributed intelligence, especially in space exploration contexts.
Once a robot has identified the literal meaning of an utterance and resolved the utterance’s references, the robot must determine the intention underlying that utterance. For reasons such as politeness, humans’ utterances often have different literal and intended meanings. Perhaps the clearest example of this is the utterance “Do you know what time it is?” Literally, this is a yes-or-no question. But in many cultures it would be inappropriate to simply respond “Yes”. Listeners are expected to abduce that the speaker’s intended meaning was something along the lines of “Tell me what time it is”, an utterance which would be too impolite to utter directly. Understanding such so-called indirect speech acts (which, we have shown, are common in human-robot dialogue [JHRI’17][HRI’18]) is especially difficult because a robot may be uncertain or ignorant of the context necessary to abduce an utterance’s correct interpretation.
To address this problem, we have developed a novel approach which uses a set of pragmatic rules and a system of Dempster-Shafer Theoretic Logical Operators to abduce the intended meanings of indirect speech acts in uncertain and open worlds, i.e., when the robot is uncertain or ignorant of the utterance, the context necessary to interpret it, or even of its own pragmatic rules. What is more, we have shown that this approach may be inverted: by using the same set of pragmatic rules, the robot can determine the most appropriate utterance form to use to communicate its own intentions [AAAI’15]. We have also shown how these rules can be learned from human data [AAAI’20].
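For readers unfamiliar with Dempster-Shafer theory, the Python sketch below shows its standard building blocks: mass functions over sets of hypotheses, Dempster's rule of combination, and the belief and plausibility bounds that form an uncertainty interval. This is textbook Dempster-Shafer machinery, not the logical operators or pragmatic rules of the cited work, and the two evidence sources and their mass values are invented for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for mass functions keyed by frozenset."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        overlap = a & b
        if overlap:
            combined[overlap] = combined.get(overlap, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by the mass that was not lost to conflict.
    return {s: mass / (1.0 - conflict) for s, mass in combined.items()}


def belief(m, hypothesis):
    """Total mass committed to subsets of the hypothesis (lower bound)."""
    return sum(mass for s, mass in m.items() if s <= hypothesis)


def plausibility(m, hypothesis):
    """Total mass not contradicting the hypothesis (upper bound)."""
    return sum(mass for s, mass in m.items() if s & hypothesis)


# Candidate intentions behind "Do you know what time it is?"
QUESTION = frozenset({"question"})
REQUEST = frozenset({"request"})
EITHER = QUESTION | REQUEST  # mass here expresses ignorance

# Invented evidence: a pragmatic rule and the dialogue context.
from_rule = {REQUEST: 0.6, EITHER: 0.4}
from_context = {REQUEST: 0.5, EITHER: 0.5}

fused = combine(from_rule, from_context)
request_interval = (belief(fused, REQUEST), plausibility(fused, REQUEST))
question_interval = (belief(fused, QUESTION), plausibility(fused, QUESTION))
```

With these invented masses, the fused interval for the request reading is approximately [0.8, 1.0] and for the literal question reading approximately [0.0, 0.2]. The mass left on `EITHER` is what lets the representation distinguish ignorance from evidence against a hypothesis.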
While this work has focused on directness/indirectness, we are also looking at linguistic norms more generally, and how humans’ adherence to such norms is shaped by various contextual factors [COGSCI’20b].
We have also explored how robot interaction design decisions might lead interactants to speak more politely to robot teammates, and how this primed politeness might carry over into interactions with humans. This research also involved novel educational research components [EAAI’20].
Enabling robots to understand, generate, and encourage language that conforms with sociocultural linguistic norms.
Collaborators: Tufts (HRI Lab), University of Miami
Confucian Robot Ethics
Our research has been exploring new approaches grounded in role-based ethics [DARK-HRI’19][CEPE’19][ALT.HRI’20]. We believe this is a promising framework for robot ethics for several reasons: (1) the Confucian focus on hierarchical structures may be a good fit for robots and may “play well” with current frameworks in human-robot teaming; (2) Confucian role ethics focuses on moral self-cultivation and care for others, rather than solely autonomous individualism; (3) there is exciting recent research on moral communication within the Confucian Role Ethics literature; and (4) non-Western ethics are underexplored in the current robot ethics literature.
Moral Remonstration and Proportionality of Response
In our work we seek to develop language understanding and generation algorithms that are not only effective and efficient, but also ethical. Language-capable robots have the potential to influence human teammates in a way that is unique among technologies, in part because they can be perceived as teammates and in-group members. Work in behavioral ethics suggests that humans’ moral and social norms need to be followed, enforced, and communicated by all community members in order for them to remain norms (see also our work on the interaction between moral and social agency [DARK-HRI’19]). Accordingly, if a robot is given an unethical command, it must not only reject the command, but be clear in its rejection in order to communicate its norm-adherence to other community members. Unfortunately, our research suggests that the design of current language-capable robots (in particular, the design of their clarification generation systems) may lead to robots inadvertently communicating unwillingness to adhere to moral norms; and as our research shows, this has the potential to negatively influence teammates’ conceptions of those same moral norms [COGSCI’18][ICRES’18A][ALT.HRI’19].
Moreover, we argue that it is not only important for robots to clearly reject requested norm violations, but that those rejections must be appropriately tailored to context, and to the size of the requested violation [AIES’19][HRI’20].
We are also pursuing a variety of other research directions related to ethical language generation. We have examined the nature of moral dilemmas faced by robots in realistic open-world environments [ICRES’18B]; and we have presented novel pedagogical strategies for the teaching of AI Ethics [EAAI’17][EAAI’20].