Our research seeks to leverage interdisciplinary insights to understand how effective language-capable robots can be designed and deployed in an ethical, responsible, and equitable manner. 

As shown below, our lab has extensive expertise in cognitive systems, quantitative experimental design, and qualitative analysis. In our most recent work, we are primarily applying these methods to study how roboticists wield power (as analyzed through a Black Feminist lens), and how that power can be wielded more equitably (using a Feminist Design methodology).



Funded Research Projects

Developing a Quantification System for Robot Moral Agency

Page under construction

How can we quantify the extent to which robots count as moral agents? In this project we are working to develop instruments for measuring autonomy, adaptability, interactivity, and capacity for moral action: the dimensions of moral agency under the framework presented by Floridi & Sanders.

Collaborators: Elizabeth Phillips (George Mason University), Boyoung Kim (George Mason Korea), Qin Zhu (Virginia Tech)
Funded By:

Dynamic, Adaptive Interfaces for Team-Level Goal Specification

Page under construction

How can teleoperators most effectively control multiple robots at once? In this project we are designing interfaces that (1) allow users to specify and refine team-level goals for multi-robot teams, and (2) allow users to Wizard-of-Oz multi-robot teams with flexible associations between body and identity.

Collaborators: Neil Dantam (Colorado School of Mines)
Funded By:

Givenness Hierarchy Theoretic Natural Language Generation for Situated Human-Robot Interaction

This project pursues the following major research questions and objectives:

1. How can ground truth data on cognitive status assumptions be collected? In service of this goal, we seek to collect a corpus of dialogue interactions in open-world situated human-human interaction contexts, and annotate this corpus with cognitive status information. 
2. How can behavioral information related to cognitive status be inferred from multisensory perceptions? In service of this goal, we seek to develop approaches for simultaneously understanding human behaviors, estimating the importance of objects involved in those behaviors, and fusing multisensory information. 
3. How can cognitive status be computationally modeled? In service of this goal, we seek to develop methods for learning and modeling GH-theoretic cognitive status dynamics. 
4. How can language be used together with visual perception to construct an attribute-based representation of object instances? In service of this goal, we seek to develop attribute-based zero-shot learning methods. 
5. How can cognitive status be leveraged during the selection of referring forms for natural language generation? In service of this goal, we seek to develop logical and statistical approaches towards referring form selection. 
6. How can cognitive status be leveraged during the selection of referring content for natural language generation? In service of this goal, we seek to develop (incremental) referring expression generation algorithms that explicitly leverage cognitive status when deciding which distractors need to be eliminated. 
7. How can representation learning be used to optimally select distinguishable attributes for referring expression generation? In service of this goal, we seek to develop representation learning algorithms that optimize for representations that are maximally discriminative in the context of referring expression generation. 

How can robots generate situated language in human-like ways? In this project we are leveraging the linguistic theory of the Givenness Hierarchy to develop new algorithms for robot referring form selection and referring expression generation.

Collaborators: Hao Zhang (UMass Amherst), Neil Dantam (Colorado School of Mines)
Funded By:

CAREER: Cognitively-Informed Memory Models for Language-Capable Robots

The goal of the first phase of this project is to systematically assess three key research questions:
• RQ 1.1: What level of decay and interference will result in optimal performance with respect to each key metric (accuracy, naturality, computational efficiency, ease of cognitive processing, and humanlikeness of referring expression generation)?
• RQ 1.2: When interference is used, should resources be distributed according to a limit imposed at the entity, consultant, or architectural level, in order to optimize each key metric?
• RQ 1.3: How should the architecture decide which entities to maintain representations for in Working Memory, in order to optimize each key metric?

The goal of the second phase of this project is to systematically assess:

• RQ 2.1: How can goal relevance be best modeled?
• RQ 2.2: How can goal relevance be best used to manage resources and representations in robot models of working memory, in order to optimize each key metric (accuracy, naturality, computational efficiency, ease of cognitive processing, and humanlikeness of referring expression generation)?

What is the connection between Working Memory and Language Production? In this project we are developing models of robot working memory and using them to enable more natural robot language generation.

Collaborators: N/A
Funded By:

YIP: Calibrated Norm Violation Response in Human-Machine Teaming

The overarching goal of this project is to model calibrated responses to norm violations in human-machine teaming by pursuing four objectives:
Objective 1: Explore the factors that affect assessment of norm violations
Objective 2: Identify metrics to assess responses to norm violations
Objective 3: Computationally model norm violation response calibration
Objective 4: Validate these models
In addition, we have received supplemental funding to also investigate:
Objective 5: Design, conduct, and analyze a wide-scoped norm violation response assessment experiment.
Objective 6: Use the results of this experiment to create a qualitative framework for understanding how people reason about robot norm violation responses, and the role that gender plays in that reasoning.

How strongly should robots push back on observed or requested norm violations? In this project we are empirically studying this question and using the answers to develop new algorithmic methods for robotic moral communication.

Collaborators: N/A
Funded By:

ECF: Performance of Autonomy and Identity for Trust- and Workload-Sensitive Interaction with Distributed Autonomous Systems

The Lunar Orbital Platform-Gateway will serve as a staging point for manned and unmanned missions to the Moon, Mars, and beyond. While the Gateway will sustain human crews for short periods of time, it will be primarily staffed by autonomous caretaker robots like the free-flying Astrobee platform: the Gateway’s sole residents during quiescent (uncrewed) periods. This creates a unique human-technical system comprised of two categories of human teammates (ground control workers permanently stationed on Earth, and astronauts who may transition over time between work on Earth, the Gateway, the Moon, and Mars) and three types of machine teammates (robot workers stationed on the Gateway, robot workers stationed on the Moon and Mars, and the Gateway itself). Our work seeks to address fundamental challenges for effective communication between human and machine teammates within this unique human-technical system, by drawing on both foundational and recent theoretical work on defining, operationalizing, and measuring the constructs of trust and workload in human-autonomous teams.

To do so, we sought to explore two key objectives.

Objective 1: Enable optimal performance of autonomy to manage trust and workload
Robots participating in deep space exploration will operate within a spatiotemporally distributed human-technical system. At present, downlink bandwidth between the ISS and Earth is already subject to frequent loss of signal and dropout. In the near future, robots will perform bandwidth-limited missions, e.g., to explore lunar lava tubes and permanently shadowed regions at the lunar poles. An increased number of robots will further limit both bandwidth and cognitive availability. In the far future, ultra-long communication delays will be introduced as humanity progresses to Mars and beyond.

We proposed to enable robots aboard the Gateway to maintain sensitivity to spatiotemporal distance when communicating, in order to (1) effectively build trust and rapport without enabling catastrophic overtrust, and (2) facilitate accurate human mental models of their own behaviors and dispositions without cognitively overloading teammates. We aimed to do so by bridging the competing perspectives of dynamic autonomy and mixed-initiative interaction through the novel concept of performance of autonomy, in which robots capable of a given level of autonomy may strategically (and in a way that is sensitive to spatiotemporal distance) perform lower levels of autonomy during decision making to achieve these trust- and workload-related goals.

Objective 2: Enable optimal performance of identity to promote resilient trust

Because the Gateway and its robotic workers will be integrated into a single system, when humans interact with different robot bodies aboard the Gateway, they will in fact be interacting with a single integrated system. Accordingly, we argue that the distinct identities presented by the Gateway and its robots are in fact performed for human benefit. Thus, when the integrated Gateway system needs to communicate with a human teammate, it may choose which body to use and which identity to perform. We explored a variety of group identity performance strategies, and studied their impact on human-robot trust.


In this project we are exploring two key questions: (1) How can multi-robot systems perform identity in different ways, and how do those identity performance strategies impact human-robot trust? (2) How can robots enhance teammates’ situation awareness through Performative Autonomy, in which they ask strategic questions?

Collaborators: NASA
Funded By:

APERTURE: Augmented Reality based Perception-Sensitive Robotic Gesture

Augmented reality head-mounted displays are being or will be deployed in many domains of national interest, from advanced and collaborative manufacturing, to warehousing and logistics, to urban and alpine search and rescue. Moreover, these displays are often being used in the context of human-robot interactions, providing a novel channel that can be used to communicate information about robots’ perceptions and intentions that would otherwise be obscured from human interactants. The APERTURE project sought to answer a fundamental question: whether within or beyond human-robot interaction contexts, how should augmented reality devices communicate with humans about objects of interest, to effectively pick out relevant objects without cognitively overloading interactants?

To answer this question, the project team pursued two strands of research. 

First, the team designed and evaluated nonverbal cues that could be rendered in mixed reality interactions, ranging from circles and arrows directly picking out target objects, to virtual arms reaching out to indicate objects, to combinations thereof. Moreover, the project team explored the effectiveness of these augmented reality cues on their own, or paired with natural language of different levels of complexity. These efforts demonstrated the effectiveness of the designed augmented reality cues for achieving different results, showed how these cues could be just as effective as those generated by physical robot arms, and demonstrated how combining different types of cues together enabled the “best of both worlds”. Overall, this thrust produced a range of design guidelines that can guide the deployment of mixed reality gestures across the domains of national interest listed above.

Second, the team explored how to measure different levels and types of cognitive load through brain-computer interfaces. Specifically, the project team developed new approaches to cognitive load estimation grounded in multiple resource theory. Our results demonstrate that functional Near Infrared Spectroscopy (fNIRS) can be used not only to measure overall cognitive load, as shown in previous work, but also to measure different types of cognitive load (auditory, visual, and working memory), and that fNIRS is overall both sensitive and diagnostic to load in complex tasks. Moreover, we have used the software testbed developed for this experiment to explore how different types of mixed-reality communication might perform under different levels and types of cognitive load, providing promising initial findings that may guide future research and enhance the design of human-technology interactions in mixed reality environments.

Finally, in addition to these research activities, the project team has used this project to do significant community building, growing the community of “Virtual, Augmented, and Mixed Reality for Human Robot Interaction” researchers into a vibrant subcommunity.


In this project we studied the effects of multi-modal communication strategies on different types of cognitive load, and used these insights to design more effective augmented reality based communication strategies for nonverbal robot communication. Our work on this project was instrumental in defining the new research field of Virtual, Augmented, and Mixed Reality for Human-Robot Interaction.

Funded Collaborators: Leanne Hirshfield (University of Colorado Boulder)
Other Collaborators: Christopher Wickens (Colorado State University)
Funded By:

Role-Based Norm Violation Response in Human-Robot Teams

Social robots deployed into human-like environments will need moral competence to avoid negatively impacting human moral ecosystems. Robots will need to be able to tell when what is asked of them is wrong, and reject those commands in appropriate ways. Most previous work on moral competence has been grounded in norm-based theories of morality. In contrast, this project provided an initial exploration of role-based theories of morality that emphasize robots and humans’ positions within a broader social and relational moral ecosystem. To understand how robots might be given role-based moral competence, the team explored several distinct research thrusts.

Overall, much of the project’s work focused on developing a theory of Confucian Robot Ethics grounded in Confucian Role Ethics, an ethical framework that emphasizes the cultivation of the moral self. The team took several approaches to achieving this goal, including developing algorithms for role-based moral reasoning; psychologically studying the effectiveness of role-based moral communication; and analyzing more broadly how robots should behave from a Confucian perspective. The project’s philosophical work involved a number of new explorations of robot ethics, and encouraged a shift from designing robots that do good to designing robots that encourage humans to be good. The project’s algorithmic work led to new algorithms for learning moral representations, and to the first knowledge representations for role-based moral reasoning and moral communication.

The project’s psychological work explored how robots’ moral guidance, phrased from different moral perspectives, might differ in its effectiveness at encouraging moral behavior from human users. Our results showed that role-based moral advice can be particularly effective, but that this depends on how the interaction is structured. In particular, our results showed the importance of moral reflection and moral practice for effective moral communication: opportunities for reflection on ethical principles may increase the efficacy of robots’ role-based moral language; and following robots’ moral language with opportunities for moral practice may facilitate role-based moral cultivation. This work also showed the ways that different cultural orientations led to differences in moral behavior under robot moral advising.

The project also explored more generally how robots should give moral advice. Our results showed that people hold different expectations for both humans and robots as to how to give moral advice vs act in moral situations, and even differences between how people should morally act when advised vs not advised. Finally, this work showed the ways that people are blamed more for disobeying human and robot advice than for acting against moral norms.

The project also explored the ways that robot interaction design could shape human politeness both towards robots and toward other people, showing that dominant strategies taken by tech companies were likely to backfire and decrease politeness, and instead suggesting easily adoptable alternate approaches.

Moreover, the project led to new ethical arguments beyond role ethics, and to new paradigms of engineering education. From the ethical perspective, the project led to new examinations of the types of perceptual capabilities robots are given, sometimes in the name of norm violation response, and the ethical problems with those capabilities. From the pedagogical perspective, the project explored experimental robot ethics as a key in-class activity that could be used to cultivate practical ethical reasoning among students. 


In this project we explored how robot design could be informed by Confucian Role Ethics and used the resulting insights to develop new methods for Role Ethics driven moral reasoning and moral communication.

Collaborators: Elizabeth Phillips (George Mason University), Qin Zhu (Virginia Tech)
Funded By:

Context-Aware Ethical Autonomy for Language Capable Robots

Here, we document the overarching, final contributions of this project over its four years.

First, led by PI Williams, we: (1) developed an understanding of robot command rejection, (2) showed how we could learn context-sensitive politeness norms, (3) developed a nuanced, theoretically grounded account of how robot design could encourage politeness norm adherence in others, (4) developed a new understanding of how context shapes politeness norm adherence, (5) developed new innovations in teaching students about robot ethics through the lens of experimental ethics surrounding sociocultural norm adherence, and (6) developed new knowledge representations and algorithms for norm-grounded relational roles usable in command rejection, and collected experimental evidence showing when and how explanations grounded in these different knowledge representations would be successful.

Second, led by PI Dantam, we (1) developed an approach for robots to identify, prove, and explain infeasible motions, through a learning and sampling framework that can prove infeasibility of considered motions, (2) proved the convergence rate of our probability of successfully learning obstacle regions, and (3) applied this technique to show order-of-magnitude performance improvements in relevant robotics problems. Overall, this work has developed a fundamentally new construct, the “learned configuration space manifold”, for robot planning that both helps the planning process and helps robots explain when planning is infeasible.

Third, led by PI Zhang, we developed several methods over the lifetime of the project for place recognition and perceptual adaptation. Recognizing and referring to locations or places is crucial for identifying the ethical context from learned, multimodal representations. We developed graph representation learning and matching methods to represent and identify locations, which fuse visual, spatial, and semantic information to form multimodal representations. These approaches explicitly addressed several technical challenges, including perceptual aliasing (i.e., different places may look similar) and long-term changes (i.e., the same place may look different over time, such as between summer and winter). Besides representation learning, perceptual adaptation methods were also developed to improve the consistency of robot perception, which further improves reference accuracy. 

Finally, in a truly collaborative effort involving the whole project team, we collectively developed and demonstrated an integrated robot solution for context-sensitive command rejection.

In this project, we explored how robots could behave and communicate in accordance with context-sensitive moral norms.

Collaborators: Hao Zhang (UMass Amherst), Neil Dantam (Colorado School of Mines)
Funded By:

Unfunded Research Projects


In this project, we are exploring the needs faced by “care wizards” who teleoperate robots in care settings, and developing new interfaces to enable care wizards to author, execute, and analyze content in care domains.

Collaborators: N/A

Past Research Efforts

Open World Language Understanding and Generation

A key component of natural-language communication is the ability to refer to objects, locations, and people of interest. Accordingly, language-capable robots must be able to resolve references made by humans and generate their own referring expressions. In our research, we have developed a reference architecture for language-capable robots (within the DIARC Cognitive Robotic Architecture [Cognitive Architectures ’19]) that facilitates both these capabilities.

At the core of this architecture is the Consultant Framework [IIB’17], which allows our language understanding and generation components to “consult with” different sources of knowledge throughout the robot architecture, without needing to know how knowledge is represented for each of those knowledge sources, or the location of that knowledge source (i.e., on what machine that component is running).
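
As a rough illustration of this idea, the sketch below shows how a language component might query differently implemented knowledge sources through one shared interface. The `Consultant` protocol, its method names, and the two toy consultants are hypothetical and chosen only for illustration; they are not the DIARC API.

```python
from __future__ import annotations

from typing import Protocol


class Consultant(Protocol):
    """Hypothetical shared interface; not the actual DIARC API."""

    def domain(self) -> list[str]: ...

    def assess(self, entity: str, prop: str) -> float: ...


class ObjectConsultant:
    """Toy consultant backed by a dict of per-object property sets."""

    def __init__(self) -> None:
        self._props = {"obj1": {"medkit", "red"}, "obj2": {"mug", "red"}}

    def domain(self) -> list[str]:
        return sorted(self._props)

    def assess(self, entity: str, prop: str) -> float:
        return 1.0 if prop in self._props.get(entity, set()) else 0.0


class LocationConsultant:
    """Toy consultant that stores its knowledge differently (a list of pairs)."""

    def __init__(self) -> None:
        self._facts = [("loc1", "room"), ("loc2", "hallway")]

    def domain(self) -> list[str]:
        return sorted({entity for entity, _ in self._facts})

    def assess(self, entity: str, prop: str) -> float:
        return 1.0 if (entity, prop) in self._facts else 0.0


def candidates(
    consultants: list[Consultant], prop: str, threshold: float = 0.5
) -> list[str]:
    """Ask every consultant which of its entities likely satisfy `prop`.

    The caller never inspects how each consultant represents knowledge.
    """
    return [
        entity
        for consultant in consultants
        for entity in consultant.domain()
        if consultant.assess(entity, prop) > threshold
    ]
```

Here `candidates` only relies on the shared interface, so a consultant could in principle be swapped for one running on another machine without changing the caller.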

We have demonstrated how this Consultant Framework can be leveraged for both reference resolution and referring expression generation, allowing us to understand and generate referring expressions under uncertainty, when relevant information is distributed across different machines, and when multiple types of knowledge need to be simultaneously leveraged (e.g., when an object is described with respect to a location, which may be challenging when objects and locations are represented differently and stored in different locations). Moreover, our work on reference resolution is especially novel in that we are one of the few groups researching open-world reference resolution [AAAI’16] [IROS’15] [COGSCI’15]. Our reference resolution algorithms not only do not require target referents to be currently visible (as is typically the case) but do not even require them to be known of a priori. For example, a robot hearing “The medical kit in the room at the end of the hall”, when the hall is known of but the room and medical kit are not, will through our algorithms automatically create new representations for the medical kit and room, and record their relationships with each other and with the hallway, such that they can be searched for and/or identified [AAAI’13]. We’ve also demonstrated how we can use this framework for referring expression generation [INLG’17].

Our work on reference resolution, referring expression generation, pragmatic understanding, and pragmatic generation is tied together by our work on clarification request generation. We have demonstrated how these two approaches can be integrated together such that the uncertainty intervals on the interpretations produced by pragmatic understanding reflect the robot’s uncertainty and ignorance as to the intention or reference of their interlocutor’s utterance; how the dimensions of this interval can be used to determine when to ask for clarification; and how the clarification requests generated by the integrated pragmatic generation and referring expression generation components comply with human preferences [RSS’17] [AURO’18].
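
To give a feel for how an uncertainty interval could drive this decision, here is a minimal sketch. It assumes each interpretation carries a [belief, plausibility] interval, treats the interval's width as ignorance and its closeness to 0.5 as uncertainty, and uses two threshold values invented for this example; the published approach may use a different measure.

```python
def needs_clarification(
    belief: float,
    plausibility: float,
    max_ignorance: float = 0.3,  # hypothetical threshold on interval width
    margin: float = 0.2,  # hypothetical minimum distance of the midpoint from 0.5
) -> bool:
    """Decide whether to ask for clarification about one interpretation.

    A wide interval signals ignorance; a midpoint near 0.5 signals that
    the robot cannot tell whether the interpretation holds or not.
    """
    if not 0.0 <= belief <= plausibility <= 1.0:
        raise ValueError("expected 0 <= belief <= plausibility <= 1")
    ignorance = plausibility - belief
    midpoint = (belief + plausibility) / 2.0
    return ignorance > max_ignorance or abs(midpoint - 0.5) < margin
```

Under these assumed thresholds, a confident interval such as [0.9, 0.95] would not trigger a question, while a wide interval such as [0.2, 0.9] or a narrow but ambivalent one such as [0.45, 0.55] would.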

We have applied our research to two main application domains. The first is assistive robotics, especially robot wheelchairs. About 40% of wheelchair users find it difficult or impossible to maneuver using a joystick, often due to tremors, a limited range of motion, or spastic rigidity. Natural language is a particularly well-suited alternative modality for wheelchair control due to its capacity for the natural, flexible communication of a wide array of commands. Although natural language-controlled wheelchairs have existed since the late seventies, they have significantly advanced since the mid 2000s[RAS’17]. Recent natural language-controlled wheelchairs identify landmarks, travel between multiple floors, ask and answer questions, and map their environments. In collaborative research with the University of Michigan, we have significantly extended the state of the art, by integrating the natural language capabilities described above with a robust spatial reasoning and navigation system, through a novel approach to architectural integration[AAMAS’17].
Our second primary application domain is search-and-rescue robotics. In collaborative research with the University of Bremen, we have integrated our language-capable robot architecture with the KnowRob Knowledge Processing System and its associated Alpine Search and Rescue Simulator [AURO’18]. This is a natural application domain in Colorado, due to the number of mountain rescue organizations in the Rocky Mountain region. We are currently also looking towards underground environments for application of our work, especially given the number of dangerous abandoned mines throughout the state of Colorado.

Enabling robots to understand and generate natural language descriptions… and to infer new information about the world through these descriptions.

Collaborators: Tufts University (HRI Lab); University of Bremen (Institute for AI); University of Michigan (Intelligent Robotics Lab); University of Oxford (GOALS Lab)

Cognitive Memory Architectures and Anaphora Processing

The Consultant Framework and our algorithms for reference resolution (POWER) and referring expression generation (PIA) represent the first level of our reference architecture. On top of this is built a second level inspired by the Givenness Hierarchy, a theory from cognitive linguistics. The Givenness Hierarchy suggests a relationship between different referring forms (e.g., “it”, “this”, “that”) and different tiers of cognitive status (e.g., “in focus”, “activated”, “familiar”). In our research, we have used the Givenness Hierarchy to guide the process of reference resolution, such that depending on what referring form is used, a robot may opt to search through a small, cognitively inspired data structure (e.g., the Focus of Attention or Short-Term Memory) before searching through all of Long Term Memory (which in our architecture is comprised of the set of distributed knowledge bases managed by Consultants) [HRI’16] [Oxford Handbook of Reference ’19] [RSS:HCR’18]. We have shown how this leads to more effective and efficient reference resolution. In recent work, we are exploring how statistical models can be trained to directly estimate the cognitive status of a given entity based on how it has appeared in previous dialogue, yielding a model we hope to use for Givenness Hierarchy theoretic language generation [COGSCI’20a].
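
The tiered search described above can be sketched roughly as follows. The mapping from referring forms to memory tiers, the tier names, and the `resolve` function are simplified assumptions made for this example; they are not the actual rules or data structures of our architecture.

```python
from __future__ import annotations

from typing import Callable, Optional

# Hypothetical, simplified plan: which memory tiers to search, in order,
# for each referring form. The real form-to-status rules are richer.
SEARCH_PLAN: dict[str, list[str]] = {
    "it": ["focus_of_attention"],
    "this": ["focus_of_attention", "short_term_memory"],
    "that": ["focus_of_attention", "short_term_memory", "long_term_memory"],
}


def resolve(
    form: str,
    matches: Callable[[str], bool],
    memory: dict[str, list[str]],
) -> Optional[tuple[str, str]]:
    """Return (entity, tier) for the first match, searching small tiers first.

    Unknown referring forms fall back to searching long-term memory only.
    """
    for tier in SEARCH_PLAN.get(form, ["long_term_memory"]):
        for entity in memory.get(tier, []):
            if matches(entity):
                return entity, tier
    return None
```

For example, with a memory in which `box2` sits in short-term memory and `box9` in long-term memory, resolving “that box” would stop at `box2` without ever touching long-term memory, which is where the efficiency gain in this sketch comes from.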


Outside of a Givenness Hierarchy theoretic context, we have also been exploring feature-based models of working memory, and have shown how augmenting sources of knowledge with Short Term Memory buffers may improve referring expression generation [ICSR’18].

Endowing robots with human-like models of working memory and attention, and leveraging these models to allow robots to efficiently and accurately understand and generate anaphoric and pronominal expressions.

Collaborators: Tufts University (HRI Lab)

Polite Communication

Once a robot has identified the literal meaning of an utterance and resolved the utterance’s references, the robot must determine the intention underlying that utterance. For reasons such as politeness, humans’ utterances often have different literal and intended meanings. Perhaps the clearest example of this is the utterance “Do you know what time it is?” Literally, this is a yes-or-no question. But in many cultures it would be inappropriate to simply respond “Yes”. Listeners are expected to abduce that the speaker’s intended meaning was something along the lines of “Tell me what time it is”, an utterance which would be too impolite to utter directly. Understanding such so-called indirect speech acts (which, we have shown, are common to human-robot dialogue [JHRI’17] [HRI’18]) is especially difficult because a robot may be uncertain or ignorant of the context necessary to abduce an utterance’s correct interpretation.

To address this problem, we have developed a novel approach which uses a set of pragmatic rules and a system of Dempster-Shafer Theoretic Logical Operators to abduce the intended meanings of indirect speech acts in uncertain and open worlds, i.e., when the robot is uncertain or ignorant of the utterance, the context necessary to interpret it, or even of its own pragmatic rules. What is more, we have shown that this approach may be inverted: by using the same set of pragmatic rules, the robot can determine the most appropriate utterance form to use to communicate its own intentions [AAAI’15]. We have also shown how these rules can be learned from human data [AAAI’20].

While this work has focused on directness/indirectness, we are also looking at linguistic norms more generally, and how humans’ adherence to such norms is shaped by various contextual factors [COGSCI’20b].

We have also explored how robot interaction design decisions might lead interactants to speak more politely to robot teammates, and how this primed politeness might carry over into interactions with humans. This research also involved novel educational research components [EAAI’20].

Enabling robots to understand, generate, and encourage language that conforms with sociocultural linguistic norms.

Collaborators: Tufts University (HRI Lab), University of Miami


Young Investigator Award
Early Career Award