Research Overview

Our lab focuses on enabling robots to assist human teammates in real-world manipulation tasks. We develop frameworks for robot learning at three levels of abstraction:

  1. Low-level motor skills: The robot learns, executes and refines motor skills that use tools to perform physical tasks.
  2. High-level task actions: The robot learns which action to perform and which objects to manipulate, given the current world state and the user's internal and physical state.
  3. Whom to assist: The robot learns which user to assist in order to optimize performance, while accounting for human trust in the robot.

Research Contribution #1: Low-level motor skills

The robot learns, executes and refines motor skills that use tools to perform physical tasks.

Much work in robotics has focused on "human-in-the-loop" learning techniques that improve the efficiency of the learning process. However, these algorithms typically make the strong assumption of a cooperative human supervisor who assists the robot. In reality, human observers also tend to act adversarially towards deployed robotic systems. We show that such adversarial interaction can in fact be beneficial: we propose a physical framework that leverages perturbations applied by a human adversary to guide the robot towards more robust models. In a manipulation task, we show that grasping success improves significantly when the robot trains with a human adversary, compared to training in a self-supervised manner.

Robot Learning via Human Adversarial Games

Duan, Jiali*; Wang, Qian*; Pinto, Lerrel; Kuo, C.-C. Jay; Nikolaidis, Stefanos.

IEEE/RSJ International Conference on Intelligent Robots and Systems; November 2019.

Best Cognitive Robotics Paper Award Nomination
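
To make the adversarial training setup concrete, here is a minimal sketch of the two-player training loop, assuming a simulated or physical grasping environment. The interfaces (env, robot_policy, adversary) and the binary reward structure are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of training a grasping policy against a human (or learned)
# adversary. The env/policy/adversary interfaces are hypothetical.

def train_with_adversary(env, robot_policy, adversary, episodes=1000):
    """The robot is rewarded only for grasps that survive the adversary's
    perturbation, so successful grasps must be robust."""
    for _ in range(episodes):
        obs = env.reset()
        grasp = robot_policy.sample(obs)      # robot proposes a grasp
        held = env.execute_grasp(grasp)       # attempt the grasp
        if held:
            # The adversary chooses the perturbation (e.g., a pull or
            # shake direction) most likely to dislodge the object.
            perturb = adversary.sample(obs, grasp)
            held = env.apply_perturbation(perturb)
            # Zero-sum structure: the adversary scores when the grasp fails.
            adversary.update(obs, grasp, perturb,
                             reward=0.0 if held else 1.0)
        robot_policy.update(obs, grasp, reward=1.0 if held else 0.0)
```

Because the adversary is rewarded for breaking grasps, the robot only collects positive labels on grasps that withstand perturbation, which is what pushes the learned model toward robustness.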

Research Contribution #2: High-level task actions

The robot learns which action to perform and which objects to manipulate, given the current world state and the user's internal and physical state.

People often watch videos on the web to learn how to cook new recipes, assemble furniture or repair a computer. We wish to give robots the same capability. This is challenging: there is large variation in manipulation actions, and some videos involve multiple people who collaborate by sharing and exchanging objects and tools. Furthermore, the learned representations need to be general enough to transfer to robotic systems. Previous systems have enabled the generation of semantic, human-interpretable robot commands in the form of visual sentences, but they require manual selection of short action clips, which are then processed individually. We propose a framework for executing demonstrated action sequences from full-length, unconstrained videos on the web. The framework takes as input a video annotated with object labels and bounding boxes, and outputs a collaborative manipulation action plan for one or more robotic arms. We demonstrate the performance of the system on three full-length collaborative cooking videos from the web, and propose an open-source platform for executing the learned plans in a simulation environment.

Learning Collaborative Action Plans from YouTube Videos

Zhang, Hejia; Lai, Po-Jen; Paul, Sayan; Kothawade, Suraj; Nikolaidis, Stefanos.

International Symposium on Robotics Research; October 2019.
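
As an illustration of what a learned plan for multiple arms might look like, here is a minimal sketch of an action-step representation. The field names and the example steps are assumptions for illustration, not the framework's actual schema.

```python
# Illustrative representation of a collaborative manipulation plan
# extracted from an annotated video. Field names are hypothetical.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ActionStep:
    arm: str                    # which robotic arm executes the step
    action: str                 # e.g., "grasp", "pour", "hand_over"
    target: str                 # object label from the video annotations
    tool: Optional[str] = None  # tool used for the action, if any

# A plan one might extract from a two-person cooking video: one arm holds
# a bowl while the other pours into it, then hands over a spoon.
plan: List[ActionStep] = [
    ActionStep(arm="arm_1", action="grasp", target="bowl"),
    ActionStep(arm="arm_2", action="pour", target="bowl", tool="cup"),
    ActionStep(arm="arm_2", action="hand_over", target="arm_1", tool="spoon"),
]

for step in plan:
    print(step)
```

A sequence of such steps, grounded in the video's object labels and bounding boxes, is the kind of human-interpretable plan that can be replayed on one or more robotic arms in simulation.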

Research Contribution #3: Whom to assist

The robot learns which user to assist in order to optimize performance, while accounting for human trust in the robot.

Much work in robotics and operations research has focused on optimal resource distribution, where an agent dynamically decides how to sequentially distribute resources among different candidates. However, most of this work ignores the notion of fairness in candidate selection. When a robot distributes resources to human team members, disproportionately favoring the highest-performing teammate can have negative effects on team dynamics and system acceptance. We introduce a multi-armed bandit algorithm with fairness constraints, where a robot distributes resources to human teammates of different skill levels. The robot does not know the skill level of each teammate, but learns it by observing their performance over time. We define fairness as a constraint on the minimum rate at which each teammate is selected throughout the task. We provide theoretical guarantees on performance and conduct a large-scale user study in which we adjust the level of fairness in our algorithm. Results show that fairness in resource distribution has a significant effect on users' trust in the system.

Multi-armed Bandits with Fairness Constraints for Distributing Resources to Human Teammates

Claure, Houston; Chen, Yifang; Modi, Jignesh; Jung, Malte; Nikolaidis, Stefanos.

IEEE/ACM International Conference on Human-Robot Interaction; March 2020.
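
For intuition, here is a minimal sketch of a UCB-style bandit with a minimum-selection-rate constraint of the kind described above. The function names, the choice of the UCB1 index, and the feasibility check are assumptions for illustration, not the paper's exact algorithm.

```python
# Sketch of a multi-armed bandit (arms = human teammates) with a fairness
# constraint: every arm must be selected at a minimum rate. Illustrative
# only; not the paper's exact formulation.

import math
import random

def fair_ucb(pull, n_arms, horizon, min_rate=0.2):
    """pull(arm) -> reward in [0, 1], e.g., the teammate's observed
    performance on the assigned resource."""
    assert n_arms * min_rate <= 1.0, "minimum rates must be feasible"
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        # Fairness constraint: any teammate falling below the minimum
        # selection rate must be chosen before reward is optimized.
        behind = [a for a in range(n_arms) if counts[a] < min_rate * t]
        if behind:
            arm = random.choice(behind)
        else:
            # Otherwise select by the standard UCB1 index. When `behind`
            # is empty, every count is positive, so the index is defined.
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2.0 * math.log(t) / counts[a]))
        sums[arm] += pull(arm)
        counts[arm] += 1
    return counts, sums

# Example: three simulated teammates with different (unknown) skill levels.
skills = [0.9, 0.5, 0.3]
counts, _ = fair_ucb(lambda a: float(random.random() < skills[a]),
                     n_arms=3, horizon=500)
print(counts)  # each teammate is selected in at least ~20% of rounds
```

Raising min_rate trades off reward (assigning more resources to weaker teammates) for fairness; the user study described above examines how that trade-off affects trust in the system.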