2023
Journals
Learning performance graphs from demonstrations via task-based evaluations
A. Puranic, J. Deshmukh, and S. Nikolaidis
IEEE Robotics and Automation Letters (RA-L), 2023
Conferences
Inverse reinforcement learning framework for transferring task sequencing policies from humans to robots in manufacturing applications
O. Manyar, Z. McNulty, S. Nikolaidis, and S. Gupta
Proceedings of the International Conference on Robotics and Automation (ICRA), May 2023
Contingency-aware task assignment and scheduling for human-robot teams
N. Dhanaraj, S. Narayan, S. Nikolaidis, and S. Gupta
Proceedings of the International Conference on Robotics and Automation (ICRA), May 2023
Transfer learning of human preferences for proactive robot assistance in assembly tasks
H. Nemlekar, A. Guan, N. Dhanaraj, S. Gupta, and S. Nikolaidis
Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI), March 2023
Best Systems Paper Award Finalist
2022
Journals
Evaluating Human-Robot Interaction Algorithms in Shared Autonomy via Quality Diversity Scenario Generation
M. Fontaine, S. Nikolaidis
ACM Transactions on Human-Robot Interaction, 2022
Abstract
The growth of scale and complexity of interactions between humans and robots highlights the need for new computational methods to automatically evaluate novel algorithms and applications. Exploring diverse scenarios of humans and robots interacting in simulation can improve understanding of the robotic system and avoid potentially costly failures in real-world settings. We formulate this problem as a quality diversity (QD) problem, where the goal is to discover diverse failure scenarios by simultaneously exploring both environments and human actions. We focus on the shared autonomy domain, where the robot attempts to infer the goal of a human operator, and adopt the QD algorithms CMA-ME and MAP-Elites to generate scenarios for two published algorithms in this domain: shared autonomy via hindsight optimization and linear policy blending. Some of the generated scenarios confirm previous theoretical findings, while others are surprising and bring about a new understanding of state-of-the-art implementations. Our experiments show that the QD algorithms CMA-ME and MAP-Elites outperform Monte-Carlo simulation and optimization-based methods in effectively searching the scenario space, highlighting their promise for automatic evaluation of algorithms in human-robot interaction.
Citation
@article{fontaine2022evaluating,
author = {Fontaine, Matthew C. and Nikolaidis, Stefanos},
title = {Evaluating Human–Robot Interaction Algorithms in Shared Autonomy via Quality Diversity Scenario Generation},
year = {2022},
issue_date = {September 2022},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {11},
number = {3},
url = {https://doi.org/10.1145/3476412},
doi = {10.1145/3476412},
abstract = {The growth of scale and complexity of interactions between humans and robots highlights the need for new computational methods to automatically evaluate novel algorithms and applications. Exploring diverse scenarios of humans and robots interacting in simulation can improve understanding of the robotic system and avoid potentially costly failures in real-world settings. We formulate this problem as a quality diversity (QD) problem, of which the goal is to discover diverse failure scenarios by simultaneously exploring both environments and human actions. We focus on the shared autonomy domain, in which the robot attempts to infer the goal of a human operator, and adopt the QD algorithms CMA-ME and MAP-Elites to generate scenarios for two published algorithms in this domain: shared autonomy via hindsight optimization and linear policy blending. Some of the generated scenarios confirm previous theoretical findings, while others are surprising and bring about a new understanding of state-of-the-art implementations. Our experiments show that the QD algorithms CMA-ME and MAP-Elites outperform Monte-Carlo simulation and optimization-based methods in effectively searching the scenario space, highlighting their promise for automatic evaluation of algorithms in human–robot interaction.},
journal = {J. Hum.-Robot Interact.},
month = {sep},
articleno = {25},
numpages = {30},
keywords = {automatic scenario generation, human-robot interaction, Quality diversity optimization}
}
Using Design Metaphors to Understand User Expectations of Socially Interactive Robot Embodiments
N. Dennler, C. Ruan, J. Hadiwijoyo, B. Chen, S. Nikolaidis, M. Matarić
ACM Transactions on Human-Robot Interaction, 2022
Abstract
The physical design of a robot suggests expectations of that robot's functionality for human users and collaborators. When those expectations align with the true capabilities of the robot, interaction with the robot is enhanced. However, misalignment of those expectations can result in an unsatisfying interaction. This paper uses Mechanical Turk to evaluate user expectation through the use of design metaphors as applied to a wide range of robot embodiments. The first study (N=382) associates crowd-sourced design metaphors to different robot embodiments. The second study (N=803) assesses initial social expectations of robot embodiments. The final study (N=805) addresses the degree of abstraction of the design metaphors and the functional expectations projected on robot embodiments. Together, these results can guide robot designers toward aligning user expectations with true robot capabilities, facilitating positive human-robot interaction.
Citation
@article{DBLP:journals/corr/abs-2201-10671,
author = {Nathaniel Dennler and
Changxiao Ruan and
Jessica Hadiwijoyo and
Brenna Chen and
Stefanos Nikolaidis and
Maja J. Mataric},
title = {Using Design Metaphors to Understand User Expectations of Socially
Interactive Robot Embodiments},
journal = {CoRR},
volume = {abs/2201.10671},
year = {2022},
url = {https://arxiv.org/abs/2201.10671},
eprinttype = {arXiv},
eprint = {2201.10671},
timestamp = {Tue, 01 Feb 2022 14:59:01 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-2201-10671.bib},
bibsource = {dblp computer science bibliography, https://dblp.org},
abstract = {The physical design of a robot suggests expectations of that robot's functionality for human users and collaborators. When those expectations align with the true capabilities of the robot, interaction with the robot is enhanced. However, misalignment of those expectations can result in an unsatisfying interaction. This paper uses Mechanical Turk to evaluate user expectation through the use of design metaphors as applied to a wide range of robot embodiments. The first study (N=382) associates crowd-sourced design metaphors to different robot embodiments. The second study (N=803) assesses initial social expectations of robot embodiments. The final study (N=805) addresses the degree of abstraction of the design metaphors and the functional expectations projected on robot embodiments. Together, these results can guide robot designers toward aligning user expectations with true robot capabilities, facilitating positive human-robot interaction.}
}
Preference-Driven Texture Modeling Through Interactive Generation and Search
S. Lu, M. Zheng, M. Fontaine, S. Nikolaidis, H. Culbertson
IEEE Transactions on Haptics, 2022
Abstract
Data-driven texture modeling and rendering has pushed the limit of realism in haptics. However, the lack of haptic texture databases, difficulties of model interpolation and expansion, and the complexity of real textures prevent data-driven methods from capturing a large variety of textures and from customizing models to suit specific output hardware or user needs. This work proposes an interactive texture generation and search framework driven by user input. We design a GAN-based texture model generator, which can create a wide range of texture models using Auto-Regressive processes. Our interactive texture search method, which we call preference-driven, follows an evolutionary strategy given guidance from user's preferred feedback within a set of generated texture models. We implemented this framework on a 3D haptic device and conducted a two-phase user study to evaluate the efficiency and accuracy of our method for previously unmodeled textures. The results showed that by comparing the feel of real and generated virtual textures, users can follow an evolutionary process to efficiently find a virtual texture model that matched or exceeded the realism of a data-driven model. Furthermore, for 4 out of 5 real textures, 80% of the preference-driven models from participants were rated comparable to the data-driven models.
Citation
@ARTICLE{9772285,
author={Lu, Shihan and Zheng, Mianlun and Fontaine, Matthew C. and Nikolaidis, Stefanos and Culbertson, Heather Marie},
journal={IEEE Transactions on Haptics},
title={Preference-Driven Texture Modeling Through Interactive Generation and Search},
year={2022},
pages={1-1},
doi={10.1109/TOH.2022.3173935},
abstract={Data-driven texture modeling and rendering has pushed the limit of realism in haptics. However, the lack of haptic texture databases, difficulties of model interpolation and expansion, and the complexity of real textures prevent data-driven methods from capturing a large variety of textures and from customizing models to suit specific output hardware or user needs. This work proposes an interactive texture generation and search framework driven by user input. We design a GAN-based texture model generator, which can create a wide range of texture models using Auto-Regressive processes. Our interactive texture search method, which we call preference-driven, follows an evolutionary strategy given guidance from user's preferred feedback within a set of generated texture models. We implemented this framework on a 3D haptic device and conducted a two-phase user study to evaluate the efficiency and accuracy of our method for previously unmodeled textures. The results showed that by comparing the feel of real and generated virtual textures, users can follow an evolutionary process to efficiently find a virtual texture model that matched or exceeded the realism of a data-driven model. Furthermore, for 4 out of 5 real textures, 80% of the preference-driven models from participants were rated comparable to the data-driven models.}
}
Conferences
Deep Surrogate Assisted Generation of Environments
V. Bhatt*, B. Tjanaka*, M. C. Fontaine*, S. Nikolaidis
Neural Information Processing Systems (NeurIPS), November 2022
A MIP-based approach for multi-robot geometric task-and-motion planning
H. Zhang, S.-H. Chan, J. Zhong, J. Li, S. Koenig, S. Nikolaidis
Proceedings of the IEEE International Conference on Automation Science and Engineering (CASE), 2022
Towards transferring human preferences from canonical to actual assembly tasks
H. Nemlekar, R. Guan, G. Luo, S. Gupta, S. Nikolaidis
Proceedings of the IEEE International Conference on Robot & Human Interactive Communication (RO-MAN), 2022
Abstract
To assist human users according to their individual preference in assembly tasks, robots typically require user demonstrations in the given task. However, providing demonstrations in actual assembly tasks can be tedious and time-consuming. Our thesis is that we can learn user preferences in assembly tasks from demonstrations in a representative canonical task. Inspired by previous work in economy of human movement, we propose to represent user preferences as a linear function of abstract task-agnostic features, such as movement and physical and mental effort required by the user. For each user, we learn their preference from demonstrations in a canonical task and use the learned preference to anticipate their actions in the actual assembly task without any user demonstrations in the actual task. We evaluate our proposed method in a model-airplane assembly study and show that preferences can be effectively transferred from canonical to actual assembly tasks, enabling robots to anticipate user actions.
Human-guided goal assignment to effectively manage workload for a smart robotic assistant
N. Dhanaraj, R. Malhan, H. Nemlekar, S. Nikolaidis, and S. Gupta
Proceedings of the IEEE International Conference on Robot & Human Interactive Communication (RO-MAN), 2022
Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning
B. Tjanaka, M. Fontaine, J. Togelius, S. Nikolaidis
Genetic and Evolutionary Computation Conference, 2022
Abstract
Consider a walking agent that must adapt to damage. To approach this task, we can train a collection of policies and have the agent select a suitable policy when damaged. Training this collection may be viewed as a quality diversity (QD) optimization problem, where we search for solutions (policies) which maximize an objective (walking forward) while spanning a set of measures (measurable characteristics). Recent work shows that differentiable quality diversity (DQD) algorithms greatly accelerate QD optimization when exact gradients are available for the objective and measures. However, such gradients are typically unavailable in RL settings due to non-differentiable environments. To apply DQD in RL settings, we propose to approximate objective and measure gradients with evolution strategies and actor-critic methods. We develop two variants of the DQD algorithm CMA-MEGA, each with different gradient approximations, and evaluate them on four simulated walking tasks. One variant achieves comparable performance (QD score) with the state-of-the-art PGA-MAP-Elites in two tasks. The other variant performs comparably in all tasks but is less efficient than PGA-MAP-Elites in two tasks. These results provide insight into the limitations of CMA-MEGA in domains that require rigorous optimization of the objective and where exact gradients are unavailable.
Citation
@inproceedings{10.1145/3512290.3528705,
author = {Tjanaka, Bryon and Fontaine, Matthew C. and Togelius, Julian and Nikolaidis, Stefanos},
title = {Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning},
year = {2022},
isbn = {9781450392372},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3512290.3528705},
doi = {10.1145/3512290.3528705},
abstract = {Consider the problem of training robustly capable agents. One approach is to generate a diverse collection of agent policies. Training can then be viewed as a quality diversity (QD) optimization problem, where we search for a collection of performant policies that are diverse with respect to quantified behavior. Recent work shows that differentiable quality diversity (DQD) algorithms greatly accelerate QD optimization when exact gradients are available. However, agent policies typically assume that the environment is not differentiable. To apply DQD algorithms to training agent policies, we must approximate gradients for performance and behavior. We propose two variants of the current state-of-the-art DQD algorithm that compute gradients via approximation methods common in reinforcement learning (RL). We evaluate our approach on four simulated locomotion tasks. One variant achieves results comparable to the current state-of-the-art in combining QD and RL, while the other performs comparably in two locomotion tasks. These results provide insight into the limitations of current DQD algorithms in domains where gradients must be approximated. Source code is available at https://github.com/icaros-usc/dqd-rl},
booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference},
pages = {1102--1111},
numpages = {10},
keywords = {neuroevolution, reinforcement learning, quality diversity},
location = {Boston, Massachusetts},
series = {GECCO '22}
}
Deep Surrogate Assisted MAP-Elites for Automated Hearthstone Deckbuilding
Y. Zhang, M. Fontaine, A. Hoover, S. Nikolaidis
Genetic and Evolutionary Computation Conference, 2022
Abstract
We study the problem of efficiently generating high-quality and diverse content in games. Previous work on automated deckbuilding in Hearthstone shows that the quality diversity algorithm MAP-Elites can generate a collection of high-performing decks with diverse strategic gameplay. However, MAP-Elites requires a large number of expensive evaluations to discover a diverse collection of decks. We propose assisting MAP-Elites with a deep surrogate model trained online to predict game outcomes with respect to candidate decks. MAP-Elites discovers a diverse dataset to improve the surrogate model accuracy, while the surrogate model helps guide MAP-Elites towards promising new content. In a Hearthstone deckbuilding case study, we show that our approach improves the sample efficiency of MAP-Elites and outperforms a model trained offline with random decks, as well as a linear surrogate model baseline, setting a new state-of-the-art for quality diversity approaches in automated Hearthstone deckbuilding. We include the source code for all the experiments at: https://github.com/icaros-usc/EvoStone2.
Citation
@inproceedings{10.1145/3512290.3528718,
author = {Zhang, Yulun and Fontaine, Matthew C. and Hoover, Amy K. and Nikolaidis, Stefanos},
title = {Deep Surrogate Assisted MAP-Elites for Automated Hearthstone Deckbuilding},
year = {2022},
isbn = {9781450392372},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3512290.3528718},
doi = {10.1145/3512290.3528718},
abstract = {We study the problem of efficiently generating high-quality and diverse content in games. Previous work on automated deckbuilding in Hearthstone shows that the quality diversity algorithm MAP-Elites can generate a collection of high-performing decks with diverse strategic gameplay. However, MAP-Elites requires a large number of expensive evaluations to discover a diverse collection of decks. We propose assisting MAP-Elites with a deep surrogate model trained online to predict game outcomes with respect to candidate decks. MAP-Elites discovers a diverse dataset to improve the surrogate model accuracy while the surrogate model helps guide MAP-Elites towards promising new content. In a Hearthstone deck-building case study, we show that our approach improves the sample efficiency of MAP-Elites and outperforms a model trained offline with random decks, as well as a linear surrogate model baseline, setting a new state-of-the-art for quality diversity approaches in automated Hearthstone deckbuilding. We include the source code for all the experiments at: https://github.com/icaros-usc/EvoStone2.},
booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference},
pages = {158--167},
numpages = {10},
keywords = {deep neural networks, surrogate modeling, MAP-elites},
location = {Boston, Massachusetts},
series = {GECCO '22}
}
Illuminating Diverse Neural Cellular Automata for Level Generation
S. Earle, J. Snider, M. Fontaine, S. Nikolaidis, J. Togelius
Genetic and Evolutionary Computation Conference, 2022
Abstract
We present a method of generating diverse collections of neural cellular automata (NCA) to design video game levels. While NCAs have so far only been trained via supervised learning, we present a quality diversity (QD) approach to generating a collection of NCA level generators. By framing the problem as a QD problem, our approach can train diverse level generators, whose output levels vary based on aesthetic or functional criteria. To efficiently generate NCAs, we train generators via Covariance Matrix Adaptation MAP-Elites (CMA-ME), a quality diversity algorithm which specializes in continuous search spaces. We apply our new method to generate level generators for several 2D tile-based games: a maze game, Sokoban, and Zelda. Our results show that CMA-ME can generate small NCAs that are diverse yet capable, often satisfying complex solvability criteria for deterministic agents. We compare against a Compositional Pattern-Producing Network (CPPN) baseline trained to produce diverse collections of generators and show that the NCA representation yields a better exploration of level-space.
Citation
@inproceedings{10.1145/3512290.3528754,
author = {Earle, Sam and Snider, Justin and Fontaine, Matthew C. and Nikolaidis, Stefanos and Togelius, Julian},
title = {Illuminating Diverse Neural Cellular Automata for Level Generation},
year = {2022},
isbn = {9781450392372},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3512290.3528754},
doi = {10.1145/3512290.3528754},
abstract = {We present a method of generating diverse collections of neural cellular automata (NCA) to design video game levels. While NCAs have so far only been trained via supervised learning, we present a quality diversity (QD) approach to generating a collection of NCA level generators. By framing the problem as a QD problem, our approach can train diverse level generators, whose output levels vary based on aesthetic or functional criteria. To efficiently generate NCAs, we train generators via Covariance Matrix Adaptation MAP-Elites (CMA-ME), a quality diversity algorithm which specializes in continuous search spaces. We apply our new method to generate level generators for several 2D tile-based games: a maze game, Sokoban, and Zelda. Our results show that CMA-ME can generate small NCAs that are diverse yet capable, often satisfying complex solvability criteria for deterministic agents. We compare against a Compositional Pattern-Producing Network (CPPN) baseline trained to produce diverse collections of generators and show that the NCA representation yields a better exploration of level-space.},
booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference},
pages = {68--76},
numpages = {9},
keywords = {cellular automata, evolutionary strategies, procedural content generation, neural networks},
location = {Boston, Massachusetts},
series = {GECCO '22}
}
2021
Journals
Autonomy in Physical Human-Robot Interaction: A Brief Survey
M. Selvaggio, M. Cognetti, S. Nikolaidis, S. Ivaldi, B. Siciliano
IEEE Robotics and Automation Letters, 2021
Abstract
Sharing the control of a robotic system with an autonomous controller allows a human to reduce his/her cognitive and physical workload during the execution of a task. In recent years, the development of inference and learning techniques has widened the spectrum of applications of shared control (SC) approaches, leading to robotic systems that are capable of seamless adaptation of their autonomy level. In this perspective, shared autonomy (SA) can be defined as the design paradigm that enables this adapting behavior of the robotic system. This letter collects the latest results achieved by the research community in the field of SC and SA with special emphasis on physical human-robot interaction (pHRI). Architectures and methods developed for SC and SA are discussed throughout the letter, highlighting the key aspects of each methodology. A discussion about open issues concludes this letter.
Citation
@article{selvaggio2021survey,
author={Selvaggio, Mario and Cognetti, Marco and Nikolaidis, Stefanos and Ivaldi, Serena and Siciliano, Bruno},
journal={IEEE Robotics and Automation Letters},
title={Autonomy in Physical Human-Robot Interaction: A Brief Survey},
year={2021},
volume={6},
number={4},
pages={7989-7996},
doi={10.1109/LRA.2021.3100603},
abstract={Sharing the control of a robotic system with an autonomous controller allows a human to reduce his/her cognitive and physical workload during the execution of a task. In recent years, the development of inference and learning techniques has widened the spectrum of applications of shared control (SC) approaches, leading to robotic systems that are capable of seamless adaptation of their autonomy level. In this perspective, shared autonomy (SA) can be defined as the design paradigm that enables this adapting behavior of the robotic system. This letter collects the latest results achieved by the research community in the field of SC and SA with special emphasis on physical human-robot interaction (pHRI). Architectures and methods developed for SC and SA are discussed throughout the letter, highlighting the key aspects of each methodology. A discussion about open issues concludes this letter.}
}
Learning From Demonstrations Using Signal Temporal Logic in Stochastic and Continuous Domains
A. Gopinath Puranic, J. V. Deshmukh, S. Nikolaidis
IEEE Robotics and Automation Letters, 2021
Abstract
Learning control policies that are safe, robust and interpretable are prominent challenges in developing robotic systems. Learning-from-demonstrations with formal logic is an arising paradigm in reinforcement learning to estimate rewards and extract robot control policies that seek to overcome these challenges. In this approach, we assume that mission-level specifications for the robotic system are expressed in a suitable temporal logic such as Signal Temporal Logic (STL). The main idea is to automatically infer rewards from user demonstrations (that could be suboptimal or incomplete) by evaluating and ranking them w.r.t. the given STL specifications. In contrast to existing work that focuses on deterministic environments and discrete state spaces, in this letter, we propose significant extensions that tackle stochastic environments and continuous state spaces.
Citation
@article{puranic2021signal,
author={Puranic, Aniruddh Gopinath and Deshmukh, Jyotirmoy V. and Nikolaidis, Stefanos},
journal={IEEE Robotics and Automation Letters},
title={Learning From Demonstrations Using Signal Temporal Logic in Stochastic and Continuous Domains},
year={2021},
volume={6},
number={4},
pages={6250-6257},
doi={10.1109/LRA.2021.3092676},
abstract={Learning control policies that are safe, robust and interpretable are prominent challenges in developing robotic systems. Learning-from-demonstrations with formal logic is an arising paradigm in reinforcement learning to estimate rewards and extract robot control policies that seek to overcome these challenges. In this approach, we assume that mission-level specifications for the robotic system are expressed in a suitable temporal logic such as Signal Temporal Logic (STL). The main idea is to automatically infer rewards from user demonstrations (that could be suboptimal or incomplete) by evaluating and ranking them w.r.t. the given STL specifications. In contrast to existing work that focuses on deterministic environments and discrete state spaces, in this letter, we propose significant extensions that tackle stochastic environments and continuous state spaces.}
}
Conferences
Differentiable Quality Diversity
M. Fontaine, S. Nikolaidis
Advances in Neural Information Processing Systems, 2021
NeurIPS 2021 Oral
Abstract
Quality diversity (QD) is a growing branch of stochastic optimization research that studies the problem of generating an archive of solutions that maximize a given objective function but are also diverse with respect to a set of specified measure functions. However, even when these functions are differentiable, QD algorithms treat them as "black boxes", ignoring gradient information. We present the differentiable quality diversity (DQD) problem, a special case of QD, where both the objective and measure functions are first order differentiable. We then present MAP-Elites via Gradient Arborescence (MEGA), a DQD algorithm that leverages gradient information to efficiently explore the joint range of the objective and measure functions. Results in two QD benchmark domains and in searching the latent space of a StyleGAN show that MEGA significantly outperforms state-of-the-art QD algorithms, highlighting DQD's promise for efficient quality diversity optimization when gradient information is available.
Citation
@inproceedings{fontaine2021dqd,
author = {Fontaine, Matthew and Nikolaidis, Stefanos},
booktitle = {Advances in Neural Information Processing Systems},
editor = {M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan},
pages = {10040--10052},
publisher = {Curran Associates, Inc.},
title = {Differentiable Quality Diversity},
url = {https://proceedings.neurips.cc/paper/2021/file/532923f11ac97d3e7cb0130315b067dc-Paper.pdf},
volume = {34},
year = {2021},
abstract={Quality diversity (QD) is a growing branch of stochastic optimization research that studies the problem of generating an archive of solutions that maximize a given objective function but are also diverse with respect to a set of specified measure functions. However, even when these functions are differentiable, QD algorithms treat them as "black boxes", ignoring gradient information. We present the differentiable quality diversity (DQD) problem, a special case of QD, where both the objective and measure functions are first order differentiable. We then present MAP-Elites via Gradient Arborescence (MEGA), a DQD algorithm that leverages gradient information to efficiently explore the joint range of the objective and measure functions. Results in two QD benchmark domains and in searching the latent space of a StyleGAN show that MEGA significantly outperforms state-of-the-art QD algorithms, highlighting DQD's promise for efficient quality diversity optimization when gradient information is available.}
}
Design and Evaluation of a Hair Combing System Using a General-Purpose Robotic Arm
N. Dennler, E. Shin, M. Matarić, S. Nikolaidis
2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021
Abstract
This work introduces an approach for automatic hair combing by a lightweight robot. For people living with limited mobility, dexterity, or chronic fatigue, combing hair is often a difficult task that negatively impacts personal routines. We propose a modular system for enabling general robot manipulators to assist with a hair-combing task. The system consists of three main components. The first component is the segmentation module, which segments the location of hair in space. The second component is the path planning module that proposes automatically-generated paths through hair based on user input. The final component creates a trajectory for the robot to execute. We quantitatively evaluate the effectiveness of the paths planned by the system with 48 users and qualitatively evaluate the system with 30 users watching videos of the robot performing a hair-combing task in the physical world. The system is shown to effectively comb different hairstyles.
Citation
@INPROCEEDINGS{dennler2021combing,
author={Dennler, Nathaniel and Shin, Eura and Matarić, Maja and Nikolaidis, Stefanos},
booktitle={2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
title={Design and Evaluation of a Hair Combing System Using a General-Purpose Robotic Arm},
year={2021},
pages={3739-3746},
doi={10.1109/IROS51168.2021.9636768},
abstract={This work introduces an approach for automatic hair combing by a lightweight robot. For people living with limited mobility, dexterity, or chronic fatigue, combing hair is often a difficult task that negatively impacts personal routines. We propose a modular system for enabling general robot manipulators to assist with a hair-combing task. The system consists of three main components. The first component is the segmentation module, which segments the location of hair in space. The second component is the path planning module that proposes automatically-generated paths through hair based on user input. The final component creates a trajectory for the robot to execute. We quantitatively evaluate the effectiveness of the paths planned by the system with 48 users and qualitatively evaluate the system with 30 users watching videos of the robot performing a hair-combing task in the physical world. The system is shown to effectively comb different hairstyles.},
}
Robotic Lime Picking by Considering Leaves as Permeable Obstacles
H. Nemlekar, Z. Liu, S. Kothawade, S. Niyaz, B. Raghavan, S. Nikolaidis
2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021
Abstract
The problem of robotic lime picking is challenging; lime plants have dense foliage which makes it difficult for a robotic arm to grasp a lime without coming in contact with leaves. Existing approaches either do not consider leaves, or treat them as obstacles and completely avoid them, often resulting in undesirable or infeasible plans. We focus on reaching a lime in the presence of dense foliage by considering the leaves of a plant as 'permeable obstacles' with a collision cost. We then adapt the rapidly exploring random tree star (RRT*) algorithm for the problem of fruit harvesting by incorporating the cost of collision with leaves into the path cost. To reduce the time required for finding low-cost paths to goal, we bias the growth of the tree using an artificial potential field (APF). We compare our proposed method with prior work in a 2-D environment and a 6-DOF robot simulation. Our experiments and a real-world demonstration on a robotic lime picking task demonstrate the applicability of our approach.
Citation
@INPROCEEDINGS{9636396,
author={Nemlekar, Heramb and Liu, Ziang and Kothawade, Suraj and Niyaz, Sherdil and Raghavan, Barath and Nikolaidis, Stefanos},
booktitle={2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
title={Robotic Lime Picking by Considering Leaves as Permeable Obstacles},
year={2021},
pages={3278-3284},
doi={10.1109/IROS51168.2021.9636396},
abstract={The problem of robotic lime picking is challenging; lime plants have dense foliage which makes it difficult for a robotic arm to grasp a lime without coming in contact with leaves. Existing approaches either do not consider leaves, or treat them as obstacles and completely avoid them, often resulting in undesirable or infeasible plans. We focus on reaching a lime in the presence of dense foliage by considering the leaves of a plant as 'permeable obstacles' with a collision cost. We then adapt the rapidly exploring random tree star (RRT*) algorithm for the problem of fruit harvesting by incorporating the cost of collision with leaves into the path cost. To reduce the time required for finding low-cost paths to goal, we bias the growth of the tree using an artificial potential field (APF). We compare our proposed method with prior work in a 2-D environment and a 6-DOF robot simulation. Our experiments and a real-world demonstration on a robotic lime picking task demonstrate the applicability of our approach.}
}
On the Importance of Environments in Human-Robot Coordination
M. Fontaine*, Y. Hsu*, Y. Zhang*, B. Tjanaka, S. Nikolaidis
Robotics: Science and Systems, July 2021
Abstract
When studying robots collaborating with humans, much of the focus has been on robot policies that coordinate fluently with human teammates in collaborative tasks. However, less emphasis has been placed on the effect of the environment on coordination behaviors. To thoroughly explore environments that result in diverse behaviors, we propose a framework for procedural generation of environments that are (1) stylistically similar to human-authored environments, (2) guaranteed to be solvable by the human-robot team, and (3) diverse with respect to coordination measures. We analyze the procedurally generated environments in the Overcooked benchmark domain via simulation and an online user study. Results show that the environments result in qualitatively different emerging behaviors and statistically significant differences in collaborative fluency metrics, even when the robot runs the same planning algorithm.
Citation
@inproceedings{fontaine2021importance,
title={On the Importance of Environments in Human-Robot Coordination},
author={Matthew C. Fontaine and Ya-Chuan Hsu and Yulun Zhang and Bryon Tjanaka and Stefanos Nikolaidis},
year={2021},
month={July},
url={https://arxiv.org/abs/2106.10853},
booktitle={Proceedings of Robotics: Science and Systems},
doi={10.15607/RSS.2021.XVII.038},
abstract={When studying robots collaborating with humans, much of the focus has been on robot policies that coordinate fluently with human teammates in collaborative tasks. However, less emphasis has been placed on the effect of the environment on coordination behaviors. To thoroughly explore environments that result in diverse behaviors, we propose a framework for procedural generation of environments that are (1) stylistically similar to human-authored environments, (2) guaranteed to be solvable by the human-robot team, and (3) diverse with respect to coordination measures. We analyze the procedurally generated environments in the Overcooked benchmark domain via simulation and an online user study. Results show that the environments result in qualitatively different emerging behaviors and statistically significant differences in collaborative fluency metrics, even when the robot runs the same planning algorithm.}
}
A Quality Diversity Approach to Automatically Generating Human-Robot Interaction Scenarios in Shared Autonomy
M. Fontaine, S. Nikolaidis
Robotics: Science and Systems, July 2021
Abstract
The growth of scale and complexity of interactions between humans and robots highlights the need for new computational methods to automatically evaluate novel algorithms and applications. Exploring diverse scenarios of humans and robots interacting in simulation can improve understanding of the robotic system and avoid potentially costly failures in real-world settings. We formulate this problem as a quality diversity (QD) problem, where the goal is to discover diverse failure scenarios by simultaneously exploring both environments and human actions. We focus on the shared autonomy domain, where the robot attempts to infer the goal of a human operator, and adopt the QD algorithm MAP-Elites to generate scenarios for two published algorithms in this domain: shared autonomy via hindsight optimization and linear policy blending. Some of the generated scenarios confirm previous theoretical findings, while others are surprising and bring about a new understanding of state-of-the-art implementations. Our experiments show that MAP-Elites outperforms Monte-Carlo simulation and optimization based methods in effectively searching the scenario space, highlighting its promise for automatic evaluation of algorithms in human-robot interaction.
Citation
@inproceedings{fontaine2021shared,
title={A Quality Diversity Approach to Automatically Generating Human-Robot
Interaction Scenarios in Shared Autonomy},
author={Matthew C. Fontaine and Stefanos Nikolaidis},
year={2021},
month={July},
url={https://arxiv.org/abs/2012.04283},
booktitle={Proceedings of Robotics: Science and Systems},
doi={10.15607/RSS.2021.XVII.036},
abstract={The growth of scale and complexity of interactions between humans and robots highlights the need for new computational methods to automatically evaluate novel algorithms and applications. Exploring diverse scenarios of humans and robots interacting in simulation can improve understanding of the robotic system and avoid potentially costly failures in real-world settings. We formulate this problem as a quality diversity (QD) problem, where the goal is to discover diverse failure scenarios by simultaneously exploring both environments and human actions. We focus on the shared autonomy domain, where the robot attempts to infer the goal of a human operator, and adopt the QD algorithm MAP-Elites to generate scenarios for two published algorithms in this domain: shared autonomy via hindsight optimization and linear policy blending. Some of the generated scenarios confirm previous theoretical findings, while others are surprising and bring about a new understanding of state-of-the-art implementations. Our experiments show that MAP-Elites outperforms Monte-Carlo simulation and optimization based methods in effectively searching the scenario space, highlighting its promise for automatic evaluation of algorithms in human-robot interaction.}
}
Personalizing User Engagement Dynamics in a Non-Verbal Communication Game for Cerebral Palsy
N. Dennler, C. Yunis, J. Realmuto, T. Sanger, S. Nikolaidis, M. Matarić
2021 30th IEEE International Conference on Robot Human Interactive Communication (RO-MAN), 2021
Abstract
Children and adults with cerebral palsy (CP) can have involuntary upper limb movements as a consequence of the symptoms that characterize their motor disability, leading to difficulties in communicating with caretakers and peers. We describe how a socially assistive robot may help individuals with CP to practice non-verbal communicative gestures using an active orthosis in a one-on-one number-guessing game. We performed a user study and data collection with participants with CP; we found that participants preferred an embodied robot over a screen-based agent, and we used the participant data to train personalized models of participant engagement dynamics that can be used to select personalized robot actions. Our work highlights the benefit of personalized models in the engagement of users with CP with a socially assistive robot and offers design insights for future work in this area.
Citation
@inproceedings{dennler2021personalizing,
author={Dennler, Nathaniel and Yunis, Catherine and Realmuto, Jonathan and Sanger, Terence and Nikolaidis, Stefanos and Matarić, Maja},
booktitle={2021 30th IEEE International Conference on Robot Human Interactive Communication (RO-MAN)},
title={Personalizing User Engagement Dynamics in a Non-Verbal Communication Game for Cerebral Palsy},
year={2021},
pages={873-879},
doi={10.1109/RO-MAN50785.2021.9515466},
abstract={Children and adults with cerebral palsy (CP) can have involuntary upper limb movements as a consequence of the symptoms that characterize their motor disability, leading to difficulties in communicating with caretakers and peers. We describe how a socially assistive robot may help individuals with CP to practice non-verbal communicative gestures using an active orthosis in a one-on-one number-guessing game. We performed a user study and data collection with participants with CP; we found that participants preferred an embodied robot over a screen-based agent, and we used the participant data to train personalized models of participant engagement dynamics that can be used to select personalized robot actions. Our work highlights the benefit of personalized models in the engagement of users with CP with a socially assistive robot and offers design insights for future work in this area.}
}
Learning Collaborative Pushing and Grasping Policies in Dense Clutter
B. Tang, M. Corsaro, G. Konidaris, S. Nikolaidis, S. Tellex
2021 IEEE International Conference on Robotics and Automation (ICRA), May 2021
Abstract
Robots must reason about pushing and grasping in order to engage in flexible manipulation in cluttered environments. Earlier works on learning pushing and grasping only consider each operation in isolation or are limited to top-down grasping and bin-picking. We train a robot to learn joint planar pushing and 6-degree-of-freedom (6-DoF) grasping policies by self-supervision. Two separate deep neural networks are trained to map from 3D visual observations to actions with a Q-learning framework. With collaborative pushes and expanded grasping action space, our system can deal with cluttered scenes with a wide variety of objects (e.g. grasping a plate from the side after pushing away surrounding obstacles). We compare our system to the state-of-the-art baseline model VPG in simulation and outperform it with 10% higher action efficiency and 20% higher grasp success rate. We then demonstrate our system on a KUKA LBR iiwa arm with a Robotiq 3-finger gripper.
Citation
@inproceedings{tang2021learning,
title={Learning Collaborative Pushing and Grasping Policies in Dense Clutter},
author={Tang, Bingjie and Corsaro, Matthew and Konidaris, George and Nikolaidis, Stefanos and Tellex, Stefanie},
booktitle={2021 IEEE International Conference on Robotics and Automation (ICRA)},
year={2021},
month={May},
url={https://cs.brown.edu/~gdk/pubs/push_grasp_clutter.pdf},
abstract={Robots must reason about pushing and grasping in order to engage in flexible manipulation in cluttered environments. Earlier works on learning pushing and grasping only consider each operation in isolation or are limited to top-down grasping and bin-picking. We train a robot to learn joint planar pushing and 6-degree-of-freedom (6-DoF) grasping policies by self-supervision. Two separate deep neural networks are trained to map from 3D visual observations to actions with a Q-learning framework. With collaborative pushes and expanded grasping action space, our system can deal with cluttered scenes with a wide variety of objects (e.g. grasping a plate from the side after pushing away surrounding obstacles). We compare our system to the state-of-the-art baseline model VPG in simulation and outperform it with 10% higher action efficiency and 20% higher grasp success rate. We then demonstrate our system on a KUKA LBR iiwa arm with a Robotiq 3-finger gripper.}
}
Two-Stage Clustering of Human Preferences for Action Prediction in Assembly Tasks
H. Nemlekar, J. Modi, S. Gupta, S. Nikolaidis
2021 IEEE International Conference on Robotics and Automation (ICRA), May 2021
Abstract
To effectively assist human workers in assembly tasks a robot must proactively offer support by inferring their preferences in sequencing the task actions. Previous work has focused on learning the dominant preferences of human workers for simple tasks largely based on their intended goal. However, people may have preferences at different resolutions: they may share the same high-level preference for the order of the sub-tasks but differ in the sequence of individual actions. We propose a two-stage approach for learning and inferring the preferences of human operators based on the sequence of sub-tasks and actions. We conduct an IKEA assembly study and demonstrate how our approach is able to learn the dominant preferences in a complex task. We show that our approach improves the prediction of human actions through cross-validation. Lastly, we show that our two-stage approach improves the efficiency of task execution in an online experiment, and demonstrate its applicability in a real-world robot-assisted IKEA assembly.
Citation
@inproceedings{nemlekar2021twostage,
title={Two-Stage Clustering of Human Preferences for Action Prediction in
Assembly Tasks},
author={Heramb Nemlekar and Jignesh Modi and Satyandra K. Gupta and Stefanos Nikolaidis},
booktitle={2021 IEEE International Conference on Robotics and Automation (ICRA)},
year={2021},
month={May},
url={https://arxiv.org/abs/2103.14994},
abstract={To effectively assist human workers in assembly tasks a robot must proactively offer support by inferring their preferences in sequencing the task actions. Previous work has focused on learning the dominant preferences of human workers for simple tasks largely based on their intended goal. However, people may have preferences at different resolutions: they may share the same high-level preference for the order of the sub-tasks but differ in the sequence of individual actions. We propose a two-stage approach for learning and inferring the preferences of human operators based on the sequence of sub-tasks and actions. We conduct an IKEA assembly study and demonstrate how our approach is able to learn the dominant preferences in a complex task. We show that our approach improves the prediction of human actions through cross-validation. Lastly, we show that our two-stage approach improves the efficiency of task execution in an online experiment, and demonstrate its applicability in a real-world robot-assisted IKEA assembly.}
}
Illuminating Mario Scenes in the Latent Space of a Generative Adversarial Network
M. Fontaine, R. Liu, A. Khalifa, J. Modi, J. Togelius, A. Hoover, S. Nikolaidis
AAAI Conference on Artificial Intelligence, February 2021
Abstract
Generative adversarial networks (GANs) are quickly becoming a ubiquitous approach to procedurally generating video game levels. While GAN generated levels are stylistically similar to human-authored examples, human designers often want to explore the generative design space of GANs to extract interesting levels. However, human designers find latent vectors opaque and would rather explore along dimensions the designer specifies, such as number of enemies or obstacles. We propose using state-of-the-art quality diversity algorithms designed to optimize continuous spaces, i.e. MAP-Elites with a directional variation operator and Covariance Matrix Adaptation MAP-Elites, to efficiently explore the latent space of a GAN to extract levels that vary across a set of specified gameplay measures. In the benchmark domain of Super Mario Bros, we demonstrate how designers may specify gameplay measures to our system and extract high-quality (playable) levels with a diverse range of level mechanics, while still maintaining stylistic similarity to human authored examples. An online user study shows how the different mechanics of the automatically generated levels affect subjective ratings of their perceived difficulty and appearance.
Citation
@article{fontaine2021illuminating,
title={Illuminating Mario Scenes in the Latent Space of a Generative Adversarial Network},
volume={35},
url={https://ojs.aaai.org/index.php/AAAI/article/view/16740},
journal={AAAI Conference on Artificial Intelligence},
author={Matthew C. Fontaine and Ruilin Liu and Ahmed Khalifa and Jignesh Modi and Julian Togelius and Amy K. Hoover and Stefanos Nikolaidis},
year={2021},
month={February},
abstract={Generative adversarial networks (GANs) are quickly becoming a ubiquitous approach to procedurally generating video game levels. While GAN generated levels are stylistically similar to human-authored examples, human designers often want to explore the generative design space of GANs to extract interesting levels. However, human designers find latent vectors opaque and would rather explore along dimensions the designer specifies, such as number of enemies or obstacles. We propose using state-of-the-art quality diversity algorithms designed to optimize continuous spaces, i.e. MAP-Elites with a directional variation operator and Covariance Matrix Adaptation MAP-Elites, to efficiently explore the latent space of a GAN to extract levels that vary across a set of specified gameplay measures. In the benchmark domain of Super Mario Bros, we demonstrate how designers may specify gameplay measures to our system and extract high-quality (playable) levels with a diverse range of level mechanics, while still maintaining stylistic similarity to human authored examples. An online user study shows how the different mechanics of the automatically generated levels affect subjective ratings of their perceived difficulty and appearance.}
}
2020
Journals
Trust-Aware Decision Making for Human-Robot Collaboration: Model Learning and Planning
M. Chen*, S. Nikolaidis*, H. Soh, D. Hsu, S. Srinivasa
ACM Transactions on Human-Robot Interaction, January 2020
Abstract
Trust in autonomy is essential for effective human-robot collaboration and user adoption of autonomous systems such as robot assistants. This article introduces a computational model that integrates trust into robot decision making. Specifically, we learn from data a partially observable Markov decision process (POMDP) with human trust as a latent variable. The trust-POMDP model provides a principled approach for the robot to (i) infer the trust of a human teammate through interaction, (ii) reason about the effect of its own actions on human trust, and (iii) choose actions that maximize team performance over the long term. We validated the model through human subject experiments on a table clearing task in simulation (201 participants) and with a real robot (20 participants). In our studies, the robot builds human trust by manipulating low-risk objects first. Interestingly, the robot sometimes fails intentionally to modulate human trust and achieve the best team performance. These results show that the trust-POMDP calibrates trust to improve human-robot team performance over the long term. Further, they highlight that maximizing trust alone does not always lead to the best performance.
Citation
@article{chen2020trust,
author = {Chen, Min and Nikolaidis, Stefanos and Soh, Harold and Hsu, David and Srinivasa, Siddhartha},
title = {Trust-Aware Decision Making for Human-Robot Collaboration: Model Learning and Planning},
year = {2020},
issue_date = {February 2020},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {9},
number = {2},
url = {https://doi.org/10.1145/3359616},
doi = {10.1145/3359616},
abstract = {Trust in autonomy is essential for effective human-robot collaboration and user adoption of autonomous systems such as robot assistants. This article introduces a computational model that integrates trust into robot decision making. Specifically, we learn from data a partially observable Markov decision process (POMDP) with human trust as a latent variable. The trust-POMDP model provides a principled approach for the robot to (i) infer the trust of a human teammate through interaction, (ii) reason about the effect of its own actions on human trust, and (iii) choose actions that maximize team performance over the long term. We validated the model through human subject experiments on a table clearing task in simulation (201 participants) and with a real robot (20 participants). In our studies, the robot builds human trust by manipulating low-risk objects first. Interestingly, the robot sometimes fails intentionally to modulate human trust and achieve the best team performance. These results show that the trust-POMDP calibrates trust to improve human-robot team performance over the long term. Further, they highlight that maximizing trust alone does not always lead to the best performance.},
journal = {J. Hum.-Robot Interact.},
month = jan,
articleno = {9},
numpages = {23},
keywords = {human-robot collaboration, Trust models, partially observable Markov decision process (POMDP)}
}
Conferences
Learning from Demonstrations using Signal Temporal Logic
A. Puranic, J. Deshmukh, S. Nikolaidis
Conference on Robot Learning, November 2020
Abstract
We present a model-based reinforcement learning framework for robot locomotion that achieves walking based on only 4.5 minutes of data collected on a quadruped robot. To accurately model the robot’s dynamics over a long horizon, we introduce a loss function that tracks the model’s prediction over multiple timesteps. We adapt model predictive control to account for planning latency, which allows the learned model to be used for real time control. Additionally, to ensure safe exploration during model learning, we embed prior knowledge of leg trajectories into the action space. The resulting system achieves fast and robust locomotion. Unlike model-free methods, which optimize for a particular task, our planner can use the same learned dynamics for various tasks, simply by changing the reward function. To the best of our knowledge, our approach is more than an order of magnitude more sample efficient than current model-free methods.
Citation
@InProceedings{puranic2020signal,
title = {Learning from Demonstrations using Signal Temporal Logic},
author = {Aniruddh Puranic and Jyotirmoy Deshmukh and Stefanos Nikolaidis},
booktitle = {Conference on Robot Learning},
year = {2020},
month = {November},
abstract = {We present a model-based reinforcement learning framework for robot locomotion that achieves walking based on only 4.5 minutes of data collected on a quadruped robot. To accurately model the robot’s dynamics over a long horizon, we introduce a loss function that tracks the model’s prediction over multiple timesteps. We adapt model predictive control to account for planning latency, which allows the learned model to be used for real time control. Additionally, to ensure safe exploration during model learning, we embed prior knowledge of leg trajectories into the action space. The resulting system achieves fast and robust locomotion. Unlike model-free methods, which optimize for a particular task, our planner can use the same learned dynamics for various tasks, simply by changing the reward function. To the best of our knowledge, our approach is more than an order of magnitude more sample efficient than current model-free methods.}
}
Robot Learning in Mixed Adversarial and Collaborative Settings
S. Yoon, S. Nikolaidis
2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), January 2020
Abstract
Previous work has shown that interacting with a human adversary can significantly improve the efficiency of the learning process in robot grasping. However, people are not consistent in applying adversarial forces; instead they may alternate between acting antagonistically with the robot or helping the robot achieve its tasks. We propose a physical framework for robot learning in a mixed adversarial/collaborative setting, where a second agent may act as a collaborator or as an antagonist, unbeknownst to the robot. The framework leverages prior estimates of the reward function to infer whether the actions of the second agent are collaborative or adversarial. Integrating the inference in an adversarial learning algorithm can significantly improve the robustness of learned grasps in a manipulation task.
Citation
@INPROCEEDINGS{9341753,
author={Yoon, Seung Hee and Nikolaidis, Stefanos},
booktitle={2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
title={Robot Learning in Mixed Adversarial and Collaborative Settings},
year={2020},
month={January},
pages={9329-9336},
doi={10.1109/IROS45743.2020.9341753},
abstract={Previous work has shown that interacting with a human adversary can significantly improve the efficiency of the learning process in robot grasping. However, people are not consistent in applying adversarial forces; instead they may alternate between acting antagonistically with the robot or helping the robot achieve its tasks. We propose a physical framework for robot learning in a mixed adversarial/collaborative setting, where a second agent may act as a collaborator or as an antagonist, unbeknownst to the robot. The framework leverages prior estimates of the reward function to infer whether the actions of the second agent are collaborative or adversarial. Integrating the inference in an adversarial learning algorithm can significantly improve the robustness of learned grasps in a manipulation task.},
}
Video Game Level Repair via Mixed Integer Linear Programming
H. Zhang*, M. Fontaine*, A. Hoover, J. Togelius, B. Dilkina, S. Nikolaidis
AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, October 2020
Abstract
Recent advancements in procedural content generation via machine learning enable the generation of video-game levels that are aesthetically similar to human-authored examples. However, the generated levels are often unplayable without additional editing. We propose a “generate-then-repair” framework for automatic generation of playable levels adhering to specific styles. The framework constructs levels using a generative adversarial network (GAN) trained with human-authored examples and repairs them using a mixed-integer linear program (MIP) with playability constraints. A key component of the framework is computing minimum cost edits between the GAN generated level and the solution of the MIP solver, which we cast as a minimum cost network flow problem. Results show that the proposed framework generates a diverse range of playable levels, that capture the spatial relationships between objects exhibited in the human-authored levels.
Citation
@article{zhang2020repair,
title={Video Game Level Repair via Mixed Integer Linear Programming},
volume={16},
url={https://ojs.aaai.org/index.php/AIIDE/article/view/7424},
abstract={Recent advancements in procedural content generation via machine learning enable the generation of video-game levels that are aesthetically similar to human-authored examples. However, the generated levels are often unplayable without additional editing. We propose a “generate-then-repair” framework for automatic generation of playable levels adhering to specific styles. The framework constructs levels using a generative adversarial network (GAN) trained with human-authored examples and repairs them using a mixed-integer linear program (MIP) with playability constraints. A key component of the framework is computing minimum cost edits between the GAN generated level and the solution of the MIP solver, which we cast as a minimum cost network flow problem. Results show that the proposed framework generates a diverse range of playable levels, that capture the spatial relationships between objects exhibited in the human-authored levels.},
number={1},
journal={Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment},
author={Zhang, Hejia and Fontaine, Matthew and Hoover, Amy and Togelius, Julian and Dilkina, Bistra and Nikolaidis, Stefanos},
year={2020},
month={October},
pages={151-158}
}
Covariance Matrix Adaptation for the Rapid Illumination of Behavior Space
M. Fontaine, J. Togelius, S. Nikolaidis, A. Hoover
2020 Genetic and Evolutionary Computation Conference, June 2020
Abstract
We focus on the challenge of finding a diverse collection of quality solutions on complex continuous domains. While quality diversity (QD) algorithms like Novelty Search with Local Competition (NSLC) and MAP-Elites are designed to generate a diverse range of solutions, these algorithms require a large number of evaluations for exploration of continuous spaces. Meanwhile, variants of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) are among the best-performing derivative-free optimizers in single-objective continuous domains. This paper proposes a new QD algorithm called Covariance Matrix Adaptation MAP-Elites (CMA-ME). Our new algorithm combines the self-adaptation techniques of CMA-ES with archiving and mapping techniques for maintaining diversity in QD. Results from experiments based on standard continuous optimization benchmarks show that CMA-ME finds better-quality solutions than MAP-Elites; similarly, results on the strategic game Hearthstone show that CMA-ME finds both a higher overall quality and broader diversity of strategies than both CMA-ES and MAP-Elites. Overall, CMA-ME more than doubles the performance of MAP-Elites using standard QD performance metrics. These results suggest that QD algorithms augmented by operators from state-of-the-art optimization algorithms can yield high-performing methods for simultaneously exploring and optimizing continuous search spaces, with significant applications to design, testing, and reinforcement learning among other domains.
Citation
@inproceedings{fontaine2020covariance,
author = {Fontaine, Matthew C. and Togelius, Julian and Nikolaidis, Stefanos and Hoover, Amy K.},
title = {Covariance Matrix Adaptation for the Rapid Illumination of Behavior Space},
year = {2020},
month = {June},
isbn = {9781450371285},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3377930.3390232},
doi = {10.1145/3377930.3390232},
abstract = {We focus on the challenge of finding a diverse collection of quality solutions on complex continuous domains. While quality diversity (QD) algorithms like Novelty Search with Local Competition (NSLC) and MAP-Elites are designed to generate a diverse range of solutions, these algorithms require a large number of evaluations for exploration of continuous spaces. Meanwhile, variants of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) are among the best-performing derivative-free optimizers in single-objective continuous domains. This paper proposes a new QD algorithm called Covariance Matrix Adaptation MAP-Elites (CMA-ME). Our new algorithm combines the self-adaptation techniques of CMA-ES with archiving and mapping techniques for maintaining diversity in QD. Results from experiments based on standard continuous optimization benchmarks show that CMA-ME finds better-quality solutions than MAP-Elites; similarly, results on the strategic game Hearthstone show that CMA-ME finds both a higher overall quality and broader diversity of strategies than both CMA-ES and MAP-Elites. Overall, CMA-ME more than doubles the performance of MAP-Elites using standard QD performance metrics. These results suggest that QD algorithms augmented by operators from state-of-the-art optimization algorithms can yield high-performing methods for simultaneously exploring and optimizing continuous search spaces, with significant applications to design, testing, and reinforcement learning among other domains.},
booktitle = {Proceedings of the 2020 Genetic and Evolutionary Computation Conference},
pages = {94--102},
numpages = {9},
keywords = {hearthstone, evolutionary algorithms, MAP-Elites, optimization, quality diversity, illumination algorithms},
location = {Canc\'{u}n, Mexico},
series = {GECCO '20}
}
Fair Contextual Multi-Armed Bandits: Theory and Experiments
Y. Chen, A. Cuellar, H. Luo, J. Modi, H. Nemlekar, S. Nikolaidis
36th Conference on Uncertainty in Artificial Intelligence (UAI), August 2020
Abstract
When an AI system interacts with multiple users, it frequently needs to make allocation decisions. For instance, a virtual agent decides whom to pay attention to in a group, or a factory robot selects a worker to deliver a part. Demonstrating fairness in decision making is essential for such systems to be broadly accepted. We introduce a Multi-Armed Bandit algorithm with fairness constraints, where fairness is defined as a minimum rate at which a task or a resource is assigned to a user. The proposed algorithm uses contextual information about the users and the task and makes no assumptions on how the losses capturing the performance of different users are generated. We provide theoretical guarantees of performance and empirical results from simulation and an online user study. The results highlight the benefit of accounting for contexts in fair decision making, especially when users perform better at some contexts and worse at others.
Citation
@InProceedings{chen2020experiments,
title = {Fair Contextual Multi-Armed Bandits: Theory and Experiments},
author = {Chen, Yifang and Cuellar, Alex and Luo, Haipeng and Modi, Jignesh and Nemlekar, Heramb and Nikolaidis, Stefanos},
booktitle = {Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI)},
pages = {181--190},
year = {2020},
editor = {Jonas Peters and David Sontag},
volume = {124},
series = {Proceedings of Machine Learning Research},
month = {03--06 Aug},
publisher = {PMLR},
pdf = {http://proceedings.mlr.press/v124/chen20a/chen20a.pdf},
url = {http://proceedings.mlr.press/v124/chen20a.html},
abstract = {When an AI system interacts with multiple users, it frequently needs to make allocation decisions. For instance, a virtual agent decides whom to pay attention to in a group, or a factory robot selects a worker to deliver a part. Demonstrating fairness in decision making is essential for such systems to be broadly accepted. We introduce a Multi-Armed Bandit algorithm with fairness constraints, where fairness is defined as a minimum rate at which a task or a resource is assigned to a user. The proposed algorithm uses contextual information about the users and the task and makes no assumptions on how the losses capturing the performance of different users are generated. We provide theoretical guarantees of performance and empirical results from simulation and an online user study. The results highlight the benefit of accounting for contexts in fair decision making, especially when users perform better at some contexts and worse at others.},
}
The Fair Contextual Multi-Armed Bandit
Y. Chen, A. Cuellar, H. Luo, J. Modi, H. Nemlekar, S. Nikolaidis
19th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS) (short paper), May 2020
Abstract
When an AI system interacts with multiple users, it frequently needs to make allocation decisions. For instance, a virtual agent decides whom to pay attention to in a group setting, or a factory robot selects a worker to deliver a part. Demonstrating fairness in decision making is essential for such systems to be broadly accepted. We introduce a Multi-Armed Bandit algorithm with fairness constraints, where fairness is defined as a minimum rate that a task or a resource is assigned to a user. The proposed algorithm uses contextual information about the users and the task and makes no assumptions on how the losses capturing the performance of different users are generated. We view this as an exciting step towards including fairness constraints in resource allocation decisions.
Citation
@inproceedings{chen2020bandit,
author = {Chen, Yifang and Cuellar, Alex and Luo, Haipeng and Modi, Jignesh and Nemlekar, Heramb and Nikolaidis, Stefanos},
title = {The Fair Contextual Multi-Armed Bandit},
year = {2020},
month = {May},
isbn = {9781450375184},
publisher = {International Foundation for Autonomous Agents and Multiagent Systems},
address = {Richland, SC},
abstract = {When an AI system interacts with multiple users, it frequently needs to make allocation decisions. For instance, a virtual agent decides whom to pay attention to in a group setting, or a factory robot selects a worker to deliver a part. Demonstrating fairness in decision making is essential for such systems to be broadly accepted. We introduce a Multi-Armed Bandit algorithm with fairness constraints, where fairness is defined as a minimum rate that a task or a resource is assigned to a user. The proposed algorithm uses contextual information about the users and the task and makes no assumptions on how the losses capturing the performance of different users are generated. We view this as an exciting step towards including fairness constraints in resource allocation decisions.},
booktitle = {Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems},
pages = {1810--1812},
numpages = {3},
keywords = {multi-armed bandits, resource allocation, fairness},
location = {Auckland, New Zealand},
series = {AAMAS '20}
}
Multi-Armed Bandits with Fairness Constraints for Distributing Resources to Human Teammates
H. Claure, Y. Chen, J. Modi, M. Jung, S. Nikolaidis
2020 ACM/IEEE International Conference on Human-Robot Interaction, March 2020
Abstract
How should a robot that collaborates with multiple people decide upon the distribution of resources (e.g. social attention, or parts needed for an assembly)? People are uniquely attuned to how resources are distributed. A decision to distribute more resources to one team member than another might be perceived as unfair with potentially detrimental effects for trust. We introduce a multi-armed bandit algorithm with fairness constraints, where a robot distributes resources to human teammates of different skill levels. In this problem, the robot does not know the skill level of each human teammate, but learns it by observing their performance over time. We define fairness as a constraint on the minimum rate that each human teammate is selected throughout the task. We provide theoretical guarantees on performance and perform a large-scale user study, where we adjust the level of fairness in our algorithm. Results show that fairness in resource distribution has a significant effect on users' trust in the system.
Citation
@inproceedings{claure2020bandits,
author = {Claure, Houston and Chen, Yifang and Modi, Jignesh and Jung, Malte and Nikolaidis, Stefanos},
title = {Multi-Armed Bandits with Fairness Constraints for Distributing Resources to Human Teammates},
year = {2020},
month = {March},
isbn = {9781450367462},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3319502.3374806},
doi = {10.1145/3319502.3374806},
abstract = {How should a robot that collaborates with multiple people decide upon the distribution of resources (e.g. social attention, or parts needed for an assembly)? People are uniquely attuned to how resources are distributed. A decision to distribute more resources to one team member than another might be perceived as unfair with potentially detrimental effects for trust. We introduce a multi-armed bandit algorithm with fairness constraints, where a robot distributes resources to human teammates of different skill levels. In this problem, the robot does not know the skill level of each human teammate, but learns it by observing their performance over time. We define fairness as a constraint on the minimum rate that each human teammate is selected throughout the task. We provide theoretical guarantees on performance and perform a large-scale user study, where we adjust the level of fairness in our algorithm. Results show that fairness in resource distribution has a significant effect on users' trust in the system.},
booktitle = {Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction},
pages = {299--308},
numpages = {10},
keywords = {reinforcement learning, multi-armed bandits, trust, fairness},
location = {Cambridge, United Kingdom},
series = {HRI '20}
}
Communicating Robot Goals via Haptic Feedback in Manipulation Tasks
R. Pocius, N. Zamani, H. Culbertson, S. Nikolaidis
2020 ACM/IEEE International Conference on Human-Robot Interaction, March 2020
Abstract
In shared autonomy, human teleoperation blends with intelligent robot autonomy to create robot control. This combination enables assistive robot manipulators to help human operators by predicting and reaching the human's desired target. However, this reduces the control authority of the user and the transparency of the interaction. This negatively affects their willingness to use the system. We propose haptic feedback as a seamless and natural way for the robot to communicate information to the user and assist them in completing the task. A proof-of-concept demonstration of our system illustrates the effectiveness of haptic feedback in communicating the robot's goals to the user. We hypothesize that this can be an effective way to improve performance in teleoperated manipulation tasks, while retaining the control authority of the user.
Citation
@inproceedings{pocius2020haptic,
author = {Pocius, Rey and Zamani, Naghmeh and Culbertson, Heather and Nikolaidis, Stefanos},
title = {Communicating Robot Goals via Haptic Feedback in Manipulation Tasks},
year = {2020},
month = {March},
isbn = {9781450370578},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3371382.3377444},
doi = {10.1145/3371382.3377444},
abstract = {In shared autonomy, human teleoperation blends with intelligent robot autonomy to create robot control. This combination enables assistive robot manipulators to help human operators by predicting and reaching the human's desired target. However, this reduces the control authority of the user and the transparency of the interaction. This negatively affects their willingness to use the system. We propose haptic feedback as a seamless and natural way for the robot to communicate information to the user and assist them in completing the task. A proof-of-concept demonstration of our system illustrates the effectiveness of haptic feedback in communicating the robot's goals to the user. We hypothesize that this can be an effective way to improve performance in teleoperated manipulation tasks, while retaining the control authority of the user.},
booktitle = {Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction},
pages = {591--593},
numpages = {3},
keywords = {manipulation, teleoperation, haptics, human-robot collaboration},
location = {Cambridge, United Kingdom},
series = {HRI '20}
}
Preprints ¶
Robot Learning and Execution of Collaborative Manipulation Plans from YouTube Videos
H. Zhang, S. Nikolaidis
CoRR, February 2019
Abstract
People often watch videos on the web to learn how to cook new recipes, assemble furniture or repair a computer. We wish to enable robots with the very same capability. This is challenging; there is a large variation in manipulation actions and some videos even involve multiple persons, who collaborate by sharing and exchanging objects and tools. Furthermore, the learned representations need to be general enough to be transferable to robotic systems. On the other hand, previous work has shown that the space of human manipulation actions has a linguistic, hierarchical structure that relates actions to manipulated objects and tools. Building upon this theory of language for action, we propose a framework for understanding and executing demonstrated action sequences from full-length, unconstrained cooking videos on the web. The framework takes as input a cooking video annotated with object labels and bounding boxes, and outputs a collaborative manipulation action plan for one or more robotic arms. We demonstrate performance of the system in a standardized dataset of 100 YouTube cooking videos, as well as in three full-length YouTube videos that include collaborative actions between two participants. We additionally propose an open-source platform for executing the learned plans in a simulation environment as well as with an actual robotic arm.
Citation
@article{DBLP:journals/corr/abs-1911-10686,
author = {Hejia Zhang and
Stefanos Nikolaidis},
title = {Robot Learning and Execution of Collaborative Manipulation Plans from
YouTube Videos},
journal = {CoRR},
volume = {abs/1911.10686},
year = {2019},
month = {February},
url = {http://arxiv.org/abs/1911.10686},
archivePrefix = {arXiv},
eprint = {1911.10686},
timestamp = {Tue, 03 Dec 2019 14:15:54 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-1911-10686.bib},
bibsource = {dblp computer science bibliography, https://dblp.org},
abstract = {People often watch videos on the web to learn how to cook new recipes, assemble furniture or repair a computer. We wish to enable robots with the very same capability. This is challenging; there is a large variation in manipulation actions and some videos even involve multiple persons, who collaborate by sharing and exchanging objects and tools. Furthermore, the learned representations need to be general enough to be transferable to robotic systems. On the other hand, previous work has shown that the space of human manipulation actions has a linguistic, hierarchical structure that relates actions to manipulated objects and tools. Building upon this theory of language for action, we propose a framework for understanding and executing demonstrated action sequences from full-length, unconstrained cooking videos on the web. The framework takes as input a cooking video annotated with object labels and bounding boxes, and outputs a collaborative manipulation action plan for one or more robotic arms. We demonstrate performance of the system in a standardized dataset of 100 YouTube cooking videos, as well as in three full-length YouTube videos that include collaborative actions between two participants. We additionally propose an open-source platform for executing the learned plans in a simulation environment as well as with an actual robotic arm.}
}
2019¶
Conferences ¶
Robot Learning via Human Adversarial Games
J. Duan*, Q. Wang*, L. Pinto, C. Jay Kuo, S. Nikolaidis
2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), November 2019
Best Cognitive Robotics Paper Award Nomination
Abstract
Much work in robotics has focused on “human-in-the-loop” learning techniques that improve the efficiency of the learning process. However, these algorithms have made the strong assumption of a cooperating human supervisor that assists the robot. In reality, human observers tend to also act in an adversarial manner towards deployed robotic systems. We show that this can in fact improve the robustness of the learned models by proposing a physical framework that leverages perturbations applied by a human adversary, guiding the robot towards more robust models. In a manipulation task, we show that grasping success improves significantly when the robot trains with a human adversary as compared to training in a self-supervised manner.
Citation
@INPROCEEDINGS{duan2019adversarial,
author={Duan, Jiali and Wang, Qian and Pinto, Lerrel and Kuo, C.-C. Jay and Nikolaidis, Stefanos},
booktitle={2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
title={Robot Learning via Human Adversarial Games},
year={2019},
month={November},
pages={1056--1063},
doi={10.1109/IROS40897.2019.8968306},
abstract={Much work in robotics has focused on “human-in-the-loop” learning techniques that improve the efficiency of the learning process. However, these algorithms have made the strong assumption of a cooperating human supervisor that assists the robot. In reality, human observers tend to also act in an adversarial manner towards deployed robotic systems. We show that this can in fact improve the robustness of the learned models by proposing a physical framework that leverages perturbations applied by a human adversary, guiding the robot towards more robust models. In a manipulation task, we show that grasping success improves significantly when the robot trains with a human adversary as compared to training in a self-supervised manner.},
}
Learning Collaborative Action Plans from Unlabeled YouTube Videos
H. Zhang, P. Lai, S. Paul, S. Kothawade, S. Nikolaidis
Robotics Research, The 19th International Symposium, ISRR 2019, October 2019
Abstract
Videos from the World Wide Web provide a rich source of information that robots could use to acquire knowledge about manipulation tasks. Previous work has focused on generating action sequences from unconstrained videos for a single robot performing manipulation tasks by itself. However, robots operating in the same physical space with people need to not only perform actions autonomously, but also coordinate seamlessly with their human counterparts. This often requires representing and executing collaborative manipulation actions, such as handing over a tool or holding an object for the other agent. We present a system for knowledge acquisition of collaborative manipulation action plans that outputs commands to the robot in the form of visual sentence. We show the performance of the system in 12 unlabeled action clips taken from collaborative cooking videos on YouTube. We view this as the first step towards extracting collaborative manipulation action sequences from unconstrained, unlabeled online videos.
Citation
@inproceedings{zhang2019youtube,
author={Hejia Zhang and Po-Jen Lai and Sayan Paul and Suraj Kothawade and Stefanos Nikolaidis},
title={Learning Collaborative Action Plans from Unlabeled YouTube Videos},
booktitle={Robotics Research, The 19th International Symposium, {ISRR} 2019},
location={Hanoi, Vietnam},
year={2019},
month={October},
abstract={Videos from the World Wide Web provide a rich source of information that robots could use to acquire knowledge about manipulation tasks. Previous work has focused on generating action sequences from unconstrained videos for a single robot performing manipulation tasks by itself. However, robots operating in the same physical space with people need to not only perform actions autonomously, but also coordinate seamlessly with their human counterparts. This often requires representing and executing collaborative manipulation actions, such as handing over a tool or holding an object for the other agent. We present a system for knowledge acquisition of collaborative manipulation action plans that outputs commands to the robot in the form of visual sentence. We show the performance of the system in 12 unlabeled action clips taken from collaborative cooking videos on YouTube. We view this as the first step towards extracting collaborative manipulation action sequences from unconstrained, unlabeled online videos.},
}
Surprise! Predicting Infant Visual Attention in a Socially Assistive Robot Contingent Learning Paradigm
L. Klein, L. Itti, B. Smith, M. Rosales, S. Nikolaidis, M. Matarić
2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), October 2019
Abstract
Early intervention to address developmental disability in infants has the potential to promote improved outcomes in neurodevelopmental structure and function [1]. Researchers are starting to explore Socially Assistive Robotics (SAR) as a tool for delivering early interventions that are synergistic with and enhance human-administered therapy. For SAR to be effective, the robot must be able to consistently attract the attention of the infant in order to engage the infant in a desired activity. This work presents the analysis of eye gaze tracking data from five 6-8 month old infants interacting with a Nao robot that kicked its leg as a contingent reward for infant leg movement. We evaluate a Bayesian model of low-level surprise on video data from the infants' head-mounted camera and on the timing of robot behaviors as a predictor of infant visual attention. The results demonstrate that over 67% of infant gaze locations were in areas the model evaluated to be more surprising than average. We also present an initial exploration using surprise to predict the extent to which the robot attracts infant visual attention during specific intervals in the study. This work is the first to validate the surprise model on infants; our results indicate the potential for using surprise to inform robot behaviors that attract infant attention during SAR interactions.
Citation
@INPROCEEDINGS{klein2019surprise,
author={Klein, Lauren and Itti, Laurent and Smith, Beth A. and Rosales, Marcelo and Nikolaidis, Stefanos and Matarić, Maja J.},
booktitle={2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)},
title={Surprise! Predicting Infant Visual Attention in a Socially Assistive Robot Contingent Learning Paradigm},
year={2019},
month={October},
pages={1--7},
doi={10.1109/RO-MAN46459.2019.8956385},
abstract={Early intervention to address developmental disability in infants has the potential to promote improved outcomes in neurodevelopmental structure and function [1]. Researchers are starting to explore Socially Assistive Robotics (SAR) as a tool for delivering early interventions that are synergistic with and enhance human-administered therapy. For SAR to be effective, the robot must be able to consistently attract the attention of the infant in order to engage the infant in a desired activity. This work presents the analysis of eye gaze tracking data from five 6-8 month old infants interacting with a Nao robot that kicked its leg as a contingent reward for infant leg movement. We evaluate a Bayesian model of low-level surprise on video data from the infants' head-mounted camera and on the timing of robot behaviors as a predictor of infant visual attention. The results demonstrate that over 67% of infant gaze locations were in areas the model evaluated to be more surprising than average. We also present an initial exploration using surprise to predict the extent to which the robot attracts infant visual attention during specific intervals in the study. This work is the first to validate the surprise model on infants; our results indicate the potential for using surprise to inform robot behaviors that attract infant attention during SAR interactions.}
}
Robot Object Referencing through Legible Situated Projections
T. Weng, L. Perlmutter, S. Nikolaidis, S. Srinivasa, M. Cakmak
2019 International Conference on Robotics and Automation (ICRA), May 2019
Abstract
The ability to reference objects in the environment is a key communication skill that robots need for complex, task-oriented human-robot collaborations. In this paper we explore the use of projections, which are a powerful communication channel for robot-to-human information transfer as they allow for situated, instantaneous, and parallelized visual referencing. We focus on the question of what makes a good projection for referencing a target object. To that end, we mathematically formulate legibility of projections intended to reference an object, and propose alternative arrow-object match functions for optimally computing the placement of an arrow to indicate a target object in a cluttered scene. We implement our approach on a PR2 robot with a head-mounted projector. Through an online (48 participants) and an in-person (12 participants) user study we validate the effectiveness of our approach, identify the types of scenes where projections may fail, and characterize the differences between alternative match functions.
Citation
@INPROCEEDINGS{weng2019projections,
author={Weng, Thomas and Perlmutter, Leah and Nikolaidis, Stefanos and Srinivasa, Siddhartha and Cakmak, Maya},
booktitle={2019 International Conference on Robotics and Automation (ICRA)},
title={Robot Object Referencing through Legible Situated Projections},
year={2019},
month={May},
pages={8004--8010},
doi={10.1109/ICRA.2019.8793638},
abstract={The ability to reference objects in the environment is a key communication skill that robots need for complex, task-oriented human-robot collaborations. In this paper we explore the use of projections, which are a powerful communication channel for robot-to-human information transfer as they allow for situated, instantaneous, and parallelized visual referencing. We focus on the question of what makes a good projection for referencing a target object. To that end, we mathematically formulate legibility of projections intended to reference an object, and propose alternative arrow-object match functions for optimally computing the placement of an arrow to indicate a target object in a cluttered scene. We implement our approach on a PR2 robot with a head-mounted projector. Through an online (48 participants) and an in-person (12 participants) user study we validate the effectiveness of our approach, identify the types of scenes where projections may fail, and characterize the differences between alternative match functions.}
}
Demonstrations ¶
Robot-Assisted Hair Brushing
E. Shin, H. Zhang, R. Pocius, N. Dennler, H. Culbertson, N. Zamani, S. Nikolaidis
NeurIPS 2019 Demonstrations
Citation
@inproceedings{shin2019,
title = {Robot-Assisted Hair Brushing},
author = {Shin, Eura and Zhang, Hejia and Pocius, Rey and Dennler, Nathan and Culbertson, Heather and Zamani, Naghmeh and Nikolaidis, Stefanos},
booktitle = {Thirty-third Conference on Neural Information Processing Systems (NeurIPS) Demonstrations},
year = {2019}
}
Preprints ¶
Auto-conditioned Recurrent Mixture Density Networks for Complex Trajectory Generation
H. Zhang, E. Heiden, R. Julian, Z. He, J. Lim, G. Sukhatme
CoRR, March 2018
Abstract
Personal robots assisting humans must perform complex manipulation tasks that are typically difficult to specify in traditional motion planning pipelines, where multiple objectives must be met and the high-level context be taken into consideration. Learning from demonstration (LfD) provides a promising way to learn these kinds of complex manipulation skills even from non-technical users. However, it is challenging for existing LfD methods to efficiently learn skills that can generalize to task specifications that are not covered by demonstrations. In this paper, we introduce a state transition model (STM) that generates joint-space trajectories by imitating motions from expert behavior. Given a few demonstrations, we show in real robot experiments that the learned STM can quickly generalize to unseen tasks and synthesize motions having longer time horizons than the expert trajectories. Compared to conventional motion planners, our approach enables the robot to accomplish complex behaviors from high-level instructions without laborious hand-engineering of planning objectives, while being able to adapt to changing goals during the skill execution. In conjunction with a trajectory optimizer, our STM can construct a high-quality skeleton of a trajectory that can be further improved in smoothness and precision. In combination with a learned inverse dynamics model, we additionally present results where the STM is used as a high-level planner.
Citation
@article{DBLP:journals/corr/abs-1810-00146,
author = {Hejia Zhang and
Eric Heiden and
Ryan Julian and
Zhangpeng He and
Joseph J. Lim and
Gaurav S. Sukhatme},
title = {Auto-conditioned Recurrent Mixture Density Networks for Complex Trajectory
Generation},
journal = {CoRR},
volume = {abs/1810.00146},
year = {2018},
month = {March},
url = {http://arxiv.org/abs/1810.00146},
archivePrefix = {arXiv},
eprint = {1810.00146},
timestamp = {Tue, 30 Oct 2018 10:49:09 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-1810-00146.bib},
bibsource = {dblp computer science bibliography, https://dblp.org},
abstract = {Personal robots assisting humans must perform complex manipulation tasks that are typically difficult to specify in traditional motion planning pipelines, where multiple objectives must be met and the high-level context be taken into consideration. Learning from demonstration (LfD) provides a promising way to learn these kinds of complex manipulation skills even from non-technical users. However, it is challenging for existing LfD methods to efficiently learn skills that can generalize to task specifications that are not covered by demonstrations. In this paper, we introduce a state transition model (STM) that generates joint-space trajectories by imitating motions from expert behavior. Given a few demonstrations, we show in real robot experiments that the learned STM can quickly generalize to unseen tasks and synthesize motions having longer time horizons than the expert trajectories. Compared to conventional motion planners, our approach enables the robot to accomplish complex behaviors from high-level instructions without laborious hand-engineering of planning objectives, while being able to adapt to changing goals during the skill execution. In conjunction with a trajectory optimizer, our STM can construct a high-quality skeleton of a trajectory that can be further improved in smoothness and precision. In combination with a learned inverse dynamics model, we additionally present results where the STM is used as a high-level planner.}
}
2018¶
Journals ¶
Planning with Verbal Communication for Human-Robot Collaboration
S. Nikolaidis, M. Kwon, J. Forlizzi, S. Srinivasa
ACM Transactions on Human-Robot Interaction, November 2018
Abstract
Human collaborators effectively coordinate their actions through both verbal and non-verbal communication. We believe that the same should hold for human-robot teams. We propose a formalism that enables a robot to decide optimally between taking a physical action toward task completion and issuing an utterance to the human teammate. We focus on two types of utterances: verbal commands, where the robot asks the human to take a physical action, and state-conveying actions, where the robot informs the human about its internal state, which captures the information that the robot uses in its decision making. Human subject experiments show that enabling the robot to issue verbal commands is the most effective form of communicating objectives, while retaining user trust in the robot. Communicating information about the robot’s state should be done judiciously, since many participants questioned the truthfulness of the robot statements when the robot did not provide sufficient explanation about its actions.
Citation
@article{nikolaidis2018verbal,
author = {Nikolaidis, Stefanos and Kwon, Minae and Forlizzi, Jodi and Srinivasa, Siddhartha},
title = {Planning with Verbal Communication for Human-Robot Collaboration},
year = {2018},
issue_date = {December 2018},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {7},
number = {3},
url = {https://doi.org/10.1145/3203305},
doi = {10.1145/3203305},
abstract = {Human collaborators effectively coordinate their actions through both verbal and non-verbal communication. We believe that the same should hold for human-robot teams. We propose a formalism that enables a robot to decide optimally between taking a physical action toward task completion and issuing an utterance to the human teammate. We focus on two types of utterances: verbal commands, where the robot asks the human to take a physical action, and state-conveying actions, where the robot informs the human about its internal state, which captures the information that the robot uses in its decision making. Human subject experiments show that enabling the robot to issue verbal commands is the most effective form of communicating objectives, while retaining user trust in the robot. Communicating information about the robot’s state should be done judiciously, since many participants questioned the truthfulness of the robot statements when the robot did not provide sufficient explanation about its actions.},
journal = {J. Hum.-Robot Interact.},
month = nov,
articleno = {22},
numpages = {21},
keywords = {planning under uncertainty, partially observable Markov decision process, Human-robot collaboration, verbal communication}
}
Conferences
Planning with Trust for Human-Robot Collaboration
M. Chen*, S. Nikolaidis*, H. Soh, D. Hsu, S. Srinivasa
2018 ACM/IEEE International Conference on Human-Robot Interaction, February 2018
Best Technical Advances Paper Award Nomination
Abstract
Trust is essential for human-robot collaboration and user adoption of autonomous systems, such as robot assistants. This paper introduces a computational model which integrates trust into robot decision-making. Specifically, we learn from data a partially observable Markov decision process (POMDP) with human trust as a latent variable. The trust-POMDP model provides a principled approach for the robot to (i) infer the trust of a human teammate through interaction, (ii) reason about the effect of its own actions on human behaviors, and (iii) choose actions that maximize team performance over the long term. We validated the model through human subject experiments on a table-clearing task in simulation (201 participants) and with a real robot (20 participants). The results show that the trust-POMDP improves human-robot team performance in this task. They further suggest that maximizing trust in itself may not improve team performance.
Citation
@inproceedings{chen2018trust,
author = {Chen, Min and Nikolaidis, Stefanos and Soh, Harold and Hsu, David and Srinivasa, Siddhartha},
title = {Planning with Trust for Human-Robot Collaboration},
year = {2018},
month = {February},
isbn = {9781450349536},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3171221.3171264},
doi = {10.1145/3171221.3171264},
abstract = {Trust is essential for human-robot collaboration and user adoption of autonomous systems, such as robot assistants. This paper introduces a computational model which integrates trust into robot decision-making. Specifically, we learn from data a partially observable Markov decision process (POMDP) with human trust as a latent variable. The trust-POMDP model provides a principled approach for the robot to (i) infer the trust of a human teammate through interaction, (ii) reason about the effect of its own actions on human behaviors, and (iii) choose actions that maximize team performance over the long term. We validated the model through human subject experiments on a table-clearing task in simulation (201 participants) and with a real robot (20 participants). The results show that the trust-POMDP improves human-robot team performance in this task. They further suggest that maximizing trust in itself may not improve team performance.},
booktitle = {Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction},
pages = {307--315},
numpages = {9},
keywords = {human-robot collaboration, partially observable markov decision process (pomdp), trust models},
location = {Chicago, IL, USA},
series = {HRI '18}
}
2017
Journals
Human-robot mutual adaptation in collaborative tasks: Models and experiments
S. Nikolaidis, D. Hsu, S. Srinivasa
The International Journal of Robotics Research, February 2017
Abstract
Adaptation is critical for effective team collaboration. This paper introduces a computational formalism for mutual adaptation between a robot and a human in collaborative tasks. We propose the Bounded-Memory Adaptation Model, which is a probabilistic finite-state controller that captures human adaptive behaviors under a bounded-memory assumption. We integrate the Bounded-Memory Adaptation Model into a probabilistic decision process, enabling the robot to guide adaptable participants towards a better way of completing the task. Human subject experiments suggest that the proposed formalism improves the effectiveness of human-robot teams in collaborative tasks, when compared with one-way adaptations of the robot to the human, while maintaining the human’s trust in the robot.
Citation
@article{nikolaidis2017adaptation,
author = {Stefanos Nikolaidis and David Hsu and Siddhartha Srinivasa},
title ={Human-robot mutual adaptation in collaborative tasks: Models and experiments},
journal = {The International Journal of Robotics Research},
volume = {36},
number = {5-7},
pages = {618-634},
year = {2017},
month = {February},
doi = {10.1177/0278364917690593},
eprint = {https://doi.org/10.1177/0278364917690593},
abstract = {Adaptation is critical for effective team collaboration. This paper introduces a computational formalism for mutual adaptation between a robot and a human in collaborative tasks. We propose the Bounded-Memory Adaptation Model, which is a probabilistic finite-state controller that captures human adaptive behaviors under a bounded-memory assumption. We integrate the Bounded-Memory Adaptation Model into a probabilistic decision process, enabling the robot to guide adaptable participants towards a better way of completing the task. Human subject experiments suggest that the proposed formalism improves the effectiveness of human-robot teams in collaborative tasks, when compared with one-way adaptations of the robot to the human, while maintaining the human’s trust in the robot.}
}
Conferences
Game-Theoretic Modeling of Human Adaptation in Human-Robot Collaboration
S. Nikolaidis, S. Nath, A. Procaccia, S. Srinivasa
2017 ACM/IEEE International Conference on Human-Robot Interaction, March 2017
Abstract
In human-robot teams, humans often start with an inaccurate model of the robot capabilities. As they interact with the robot, they infer the robot's capabilities and partially adapt to the robot, i.e., they might change their actions based on the observed outcomes and the robot's actions, without replicating the robot's policy. We present a game-theoretic model of human partial adaptation to the robot, where the human responds to the robot's actions by maximizing a reward function that changes stochastically over time, capturing the evolution of their expectations of the robot's capabilities. The robot can then use this model to decide optimally between taking actions that reveal its capabilities to the human and taking the best action given the information that the human currently has. We prove that under certain observability assumptions, the optimal policy can be computed efficiently. We demonstrate through a human subject experiment that the proposed model significantly improves human-robot team performance, compared to policies that assume complete adaptation of the human to the robot.
Citation
@inproceedings{nikolaidis2017game,
author = {Nikolaidis, Stefanos and Nath, Swaprava and Procaccia, Ariel D. and Srinivasa, Siddhartha},
title = {Game-Theoretic Modeling of Human Adaptation in Human-Robot Collaboration},
year = {2017},
month = {March},
isbn = {9781450343367},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/2909824.3020253},
doi = {10.1145/2909824.3020253},
abstract = {In human-robot teams, humans often start with an inaccurate model of the robot capabilities. As they interact with the robot, they infer the robot's capabilities and partially adapt to the robot, i.e., they might change their actions based on the observed outcomes and the robot's actions, without replicating the robot's policy. We present a game-theoretic model of human partial adaptation to the robot, where the human responds to the robot's actions by maximizing a reward function that changes stochastically over time, capturing the evolution of their expectations of the robot's capabilities. The robot can then use this model to decide optimally between taking actions that reveal its capabilities to the human and taking the best action given the information that the human currently has. We prove that under certain observability assumptions, the optimal policy can be computed efficiently. We demonstrate through a human subject experiment that the proposed model significantly improves human-robot team performance, compared to policies that assume complete adaptation of the human to the robot.},
booktitle = {Proceedings of the 2017 ACM/IEEE International Conference on Human-Robot Interaction},
pages = {323--331},
numpages = {9},
keywords = {human-robot collaboration, human adaptation, game-theory},
location = {Vienna, Austria},
series = {HRI '17}
}
Human-Robot Mutual Adaptation in Shared Autonomy
S. Nikolaidis, Y. Zhu, D. Hsu, S. Srinivasa
2017 ACM/IEEE International Conference on Human-Robot Interaction, March 2017
Abstract
Shared autonomy integrates user input with robot autonomy in order to control a robot and help the user to complete a task. Our work aims to improve the performance of such a human-robot team: the robot tries to guide the human towards an effective strategy, sometimes against the human's own preference, while still retaining his trust. We achieve this through a principled human-robot mutual adaptation formalism. We integrate a bounded-memory adaptation model of the human into a partially observable stochastic decision model, which enables the robot to adapt to an adaptable human. When the human is adaptable, the robot guides the human towards a good strategy, maybe unknown to the human in advance. When the human is stubborn and not adaptable, the robot complies with the human's preference in order to retain their trust. In the shared autonomy setting, unlike many other common human-robot collaboration settings, only the robot actions can change the physical state of the world, and the human and robot goals are not fully observable. We address these challenges and show in a human subject experiment that the proposed mutual adaptation formalism improves human-robot team performance, while retaining a high level of user trust in the robot, compared to the common approach of having the robot strictly following participants' preference.
Citation
@inproceedings{nikolaidis2017mutual,
author = {Nikolaidis, Stefanos and Zhu, Yu Xiang and Hsu, David and Srinivasa, Siddhartha},
title = {Human-Robot Mutual Adaptation in Shared Autonomy},
year = {2017},
month = {March},
isbn = {9781450343367},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/2909824.3020252},
doi = {10.1145/2909824.3020252},
abstract = {Shared autonomy integrates user input with robot autonomy in order to control a robot and help the user to complete a task. Our work aims to improve the performance of such a human-robot team: the robot tries to guide the human towards an effective strategy, sometimes against the human's own preference, while still retaining his trust. We achieve this through a principled human-robot mutual adaptation formalism. We integrate a bounded-memory adaptation model of the human into a partially observable stochastic decision model, which enables the robot to adapt to an adaptable human. When the human is adaptable, the robot guides the human towards a good strategy, maybe unknown to the human in advance. When the human is stubborn and not adaptable, the robot complies with the human's preference in order to retain their trust. In the shared autonomy setting, unlike many other common human-robot collaboration settings, only the robot actions can change the physical state of the world, and the human and robot goals are not fully observable. We address these challenges and show in a human subject experiment that the proposed mutual adaptation formalism improves human-robot team performance, while retaining a high level of user trust in the robot, compared to the common approach of having the robot strictly following participants' preference.},
booktitle = {Proceedings of the 2017 ACM/IEEE International Conference on Human-Robot Interaction},
pages = {294--302},
numpages = {9},
keywords = {planning under uncertainty, human-robot mutual adaptation, shared autonomy},
location = {Vienna, Austria},
series = {HRI '17}
}
2016
Conferences
Formalizing Human-Robot Mutual Adaptation: A Bounded Memory Model
S. Nikolaidis, A. Kuznetsov, D. Hsu, S. Srinivasa
The Eleventh ACM/IEEE International Conference on Human Robot Interaction, March 2016
Abstract
Mutual adaptation is critical for effective team collaboration. This paper presents a formalism for human-robot mutual adaptation in collaborative tasks. We propose the bounded-memory adaptation model (BAM), which captures human adaptive behaviours based on a bounded memory assumption. We integrate BAM into a partially observable stochastic model, which enables robot adaptation to the human. When the human is adaptive, the robot will guide the human towards a new, optimal collaborative strategy unknown to the human in advance. When the human is not willing to change their strategy, the robot adapts to the human in order to retain human trust. Human subject experiments indicate that the proposed formalism can significantly improve the effectiveness of human-robot teams, while human subject ratings on the robot performance and trust are comparable to those achieved by cross training, a state-of-the-art human-robot team training practice.
Citation
@inproceedings{nikolaidis2016mutual,
author = {Nikolaidis, Stefanos and Kuznetsov, Anton and Hsu, David and Srinivasa, Siddhartha},
title = {Formalizing Human-Robot Mutual Adaptation: A Bounded Memory Model},
year = {2016},
month = {March},
isbn = {9781467383707},
publisher = {IEEE Press},
abstract = {Mutual adaptation is critical for effective team collaboration. This paper presents a formalism for human-robot mutual adaptation in collaborative tasks. We propose the bounded-memory adaptation model (BAM), which captures human adaptive behaviours based on a bounded memory assumption. We integrate BAM into a partially observable stochastic model, which enables robot adaptation to the human. When the human is adaptive, the robot will guide the human towards a new, optimal collaborative strategy unknown to the human in advance. When the human is not willing to change their strategy, the robot adapts to the human in order to retain human trust. Human subject experiments indicate that the proposed formalism can significantly improve the effectiveness of human-robot teams, while human subject ratings on the robot performance and trust are comparable to those achieved by cross training, a state-of-the-art human-robot team training practice.},
booktitle = {The Eleventh ACM/IEEE International Conference on Human Robot Interaction},
pages = {75--82},
numpages = {8},
keywords = {human-robot mutual adaptation, bounded memory, human-robot collaboration},
location = {Christchurch, New Zealand},
series = {HRI '16}
}
Viewpoint-Based Legibility Optimization
S. Nikolaidis, A. Dragan, S. Srinivasa
The Eleventh ACM/IEEE International Conference on Human Robot Interaction, March 2016
Abstract
Much robotics research has focused on intent-expressive (legible) motion. However, algorithms that can autonomously generate legible motion have implicitly made the strong assumption of an omniscient observer, with access to the robot's configuration as it changes across time. In reality, human observers have a particular viewpoint, which biases the way they perceive the motion. In this work, we free robots from this assumption and introduce the notion of an observer with a specific point of view into legibility optimization. In doing so, we account for two factors: (1) depth uncertainty induced by a particular viewpoint, and (2) occlusions along the motion, during which (part of) the robot is hidden behind some object. We propose viewpoint and occlusion models that enable autonomous generation of viewpoint-based legible motions, and show through large-scale user studies that the produced motions are significantly more legible compared to those generated assuming an omniscient observer.
Citation
@inproceedings{nikolaidis2016legibility,
author = {Nikolaidis, Stefanos and Dragan, Anca and Srinivasa, Siddhartha},
title = {Viewpoint-Based Legibility Optimization},
year = {2016},
month = {March},
isbn = {9781467383707},
publisher = {IEEE Press},
abstract = {Much robotics research has focused on intent-expressive (legible) motion. However, algorithms that can autonomously generate legible motion have implicitly made the strong assumption of an omniscient observer, with access to the robot's configuration as it changes across time. In reality, human observers have a particular viewpoint, which biases the way they perceive the motion. In this work, we free robots from this assumption and introduce the notion of an observer with a specific point of view into legibility optimization. In doing so, we account for two factors: (1) depth uncertainty induced by a particular viewpoint, and (2) occlusions along the motion, during which (part of) the robot is hidden behind some object. We propose viewpoint and occlusion models that enable autonomous generation of viewpoint-based legible motions, and show through large-scale user studies that the produced motions are significantly more legible compared to those generated assuming an omniscient observer.},
booktitle = {The Eleventh ACM/IEEE International Conference on Human Robot Interaction},
pages = {271--278},
numpages = {8},
keywords = {observer viewpoint, human-robot collaboration, trajectory optimization, conveying intent},
location = {Christchurch, New Zealand},
series = {HRI '16}
}
2015
Journals
Improved human–robot team performance through cross-training, an approach inspired by human team training practices
S. Nikolaidis, P. Lasota, R. Ramakrishnan, J. Shah
The International Journal of Robotics Research, November 2015
Abstract
We design and evaluate a method of human–robot cross-training, a validated and widely used strategy for the effective training of human teams. Cross-training is an interactive planning method in which team members iteratively switch roles with one another to learn a shared plan for the performance of a collaborative task. We first present a computational formulation of the robot mental model, which encodes the sequence of robot actions necessary for task completion and the expectations of the robot for preferred human actions, and show that the robot model is quantitatively comparable to the mental model that captures the inter-role knowledge held by the human. Additionally, we propose a quantitative measure of robot mental model convergence and an objective metric of model similarity. Based on this encoding, we formulate a human–robot cross-training method and evaluate its efficacy through experiments involving human subjects (n=60). We compare human–robot cross-training to standard reinforcement learning techniques, and show that cross-training yields statistically significant improvements in quantitative team performance measures, as well as significant differences in perceived robot performance and human trust. Finally, we discuss the objective measure of robot mental model convergence as a method to dynamically assess human errors. This study supports the hypothesis that the effective and fluent teaming of a human and a robot may best be achieved by modeling known, effective human teamwork practices.
Citation
@article{nikolaidis2015ijrr,
author = {Stefanos Nikolaidis and Przemyslaw Lasota and Ramya Ramakrishnan and Julie Shah},
title ={Improved human–robot team performance through cross-training, an approach inspired by human team training practices},
journal = {The International Journal of Robotics Research},
volume = {34},
number = {14},
pages = {1711-1730},
year = {2015},
month = {November},
doi = {10.1177/0278364915609673},
eprint = {https://doi.org/10.1177/0278364915609673},
abstract = {We design and evaluate a method of human–robot cross-training, a validated and widely used strategy for the effective training of human teams. Cross-training is an interactive planning method in which team members iteratively switch roles with one another to learn a shared plan for the performance of a collaborative task. We first present a computational formulation of the robot mental model, which encodes the sequence of robot actions necessary for task completion and the expectations of the robot for preferred human actions, and show that the robot model is quantitatively comparable to the mental model that captures the inter-role knowledge held by the human. Additionally, we propose a quantitative measure of robot mental model convergence and an objective metric of model similarity. Based on this encoding, we formulate a human–robot cross-training method and evaluate its efficacy through experiments involving human subjects (n=60). We compare human–robot cross-training to standard reinforcement learning techniques, and show that cross-training yields statistically significant improvements in quantitative team performance measures, as well as significant differences in perceived robot performance and human trust. Finally, we discuss the objective measure of robot mental model convergence as a method to dynamically assess human errors. This study supports the hypothesis that the effective and fluent teaming of a human and a robot may best be achieved by modeling known, effective human teamwork practices.}
}
Conferences
Efficient Model Learning from Joint-Action Demonstrations for Human-Robot Collaborative Tasks
S. Nikolaidis, R. Ramakrishnan, K. Gu, J. Shah
Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction, March 2015
Best Enabling Technologies Paper Award
Abstract
We present a framework for automatically learning human user models from joint-action demonstrations that enables a robot to compute a robust policy for a collaborative task with a human. First, the demonstrated action sequences are clustered into different human types using an unsupervised learning algorithm. A reward function is then learned for each type through the employment of an inverse reinforcement learning algorithm. The learned model is then incorporated into a mixed-observability Markov decision process (MOMDP) formulation, wherein the human type is a partially observable variable. With this framework, we can infer online the human type of a new user that was not included in the training set, and can compute a policy for the robot that will be aligned to the preference of this user. In a human subject experiment (n=30), participants agreed more strongly that the robot anticipated their actions when working with a robot incorporating the proposed framework (p<0.01), compared to manually annotating robot actions. In trials where participants faced difficulty annotating the robot actions to complete the task, the proposed framework significantly improved team efficiency (p<0.01). The robot incorporating the framework was also found to be more responsive to human actions compared to policies computed using a hand-coded reward function by a domain expert (p<0.01). These results indicate that learning human user models from joint-action demonstrations and encoding them in a MOMDP formalism can support effective teaming in human-robot collaborative tasks.
Citation
@inproceedings{nikolaidis2015joint,
author = {Nikolaidis, Stefanos and Ramakrishnan, Ramya and Gu, Keren and Shah, Julie},
title = {Efficient Model Learning from Joint-Action Demonstrations for Human-Robot Collaborative Tasks},
year = {2015},
month = {March},
isbn = {9781450328838},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/2696454.2696455},
doi = {10.1145/2696454.2696455},
abstract = {We present a framework for automatically learning human user models from joint-action demonstrations that enables a robot to compute a robust policy for a collaborative task with a human. First, the demonstrated action sequences are clustered into different human types using an unsupervised learning algorithm. A reward function is then learned for each type through the employment of an inverse reinforcement learning algorithm. The learned model is then incorporated into a mixed-observability Markov decision process (MOMDP) formulation, wherein the human type is a partially observable variable. With this framework, we can infer online the human type of a new user that was not included in the training set, and can compute a policy for the robot that will be aligned to the preference of this user. In a human subject experiment (n=30), participants agreed more strongly that the robot anticipated their actions when working with a robot incorporating the proposed framework (p<0.01), compared to manually annotating robot actions. In trials where participants faced difficulty annotating the robot actions to complete the task, the proposed framework significantly improved team efficiency (p<0.01). The robot incorporating the framework was also found to be more responsive to human actions compared to policies computed using a hand-coded reward function by a domain expert (p<0.01). These results indicate that learning human user models from joint-action demonstrations and encoding them in a MOMDP formalism can support effective teaming in human-robot collaborative tasks.},
booktitle = {Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction},
pages = {189--196},
numpages = {8},
keywords = {model learning, human-robot collaboration, mixed observability markov decision process},
location = {Portland, Oregon, USA},
series = {HRI '15}
}
2013
Conferences
Human-Robot Cross-Training: Computational Formulation, Modeling and Evaluation of a Human Team Training Strategy
S. Nikolaidis, J. Shah
8th ACM/IEEE International Conference on Human-Robot Interaction, March 2013
Abstract
We design and evaluate human-robot cross-training, a strategy widely used and validated for effective human team training. Cross-training is an interactive planning method in which a human and a robot iteratively switch roles to learn a shared plan for a collaborative task. We first present a computational formulation of the robot's interrole knowledge and show that it is quantitatively comparable to the human mental model. Based on this encoding, we formulate human-robot cross-training and evaluate it in human subject experiments (n = 36). We compare human-robot cross-training to standard reinforcement learning techniques, and show that cross-training provides statistically significant improvements in quantitative team performance measures. Additionally, significant differences emerge in the perceived robot performance and human trust. These results support the hypothesis that effective and fluent human-robot teaming may be best achieved by modeling effective practices for human teamwork.
Citation
@inproceedings{nikolaidis2013training,
author = {Nikolaidis, Stefanos and Shah, Julie},
title = {Human-Robot Cross-Training: Computational Formulation, Modeling and Evaluation of a Human Team Training Strategy},
year = {2013},
month = {March},
isbn = {9781467330558},
publisher = {IEEE Press},
abstract = {We design and evaluate human-robot cross-training, a strategy widely used and validated for effective human team training. Cross-training is an interactive planning method in which a human and a robot iteratively switch roles to learn a shared plan for a collaborative task. We first present a computational formulation of the robot's interrole knowledge and show that it is quantitatively comparable to the human mental model. Based on this encoding, we formulate human-robot cross-training and evaluate it in human subject experiments (n = 36). We compare human-robot cross-training to standard reinforcement learning techniques, and show that cross-training provides statistically significant improvements in quantitative team performance measures. Additionally, significant differences emerge in the perceived robot performance and human trust. These results support the hypothesis that effective and fluent human-robot teaming may be best achieved by modeling effective practices for human teamwork.},
booktitle = {Proceedings of the 8th ACM/IEEE International Conference on Human-Robot Interaction},
pages = {33--40},
numpages = {8},
keywords = {cross-training, shared mental models, human-robot team fluency},
location = {Tokyo, Japan},
series = {HRI '13}
}
2012
Conferences
Optimization of Temporal Dynamics for Adaptive Human-Robot Interaction in Assembly Manufacturing
R. Wilcox, S. Nikolaidis, J. Shah
Robotics: Science and Systems, July 2012
Abstract
Human-robot collaboration presents an opportunity to improve the efficiency of manufacturing and assembly processes, particularly for aerospace manufacturing where tight integration and variability in the build process make physical isolation of robotic-only work challenging. In this paper, we develop a robotic scheduling and control capability that adapts to the changing preferences of a human co-worker or supervisor while providing strong guarantees for synchronization and timing of activities. This innovation is realized through dynamic execution of a flexible optimal scheduling policy that accommodates temporal disturbance. We describe the Adaptive Preferences Algorithm that computes the flexible scheduling policy and show empirically that execution is fast, robust, and adaptable to changing preferences for workflow. We achieve satisfactory computation times, on the order of seconds for moderately-sized problems, and demonstrate the capability for human-robot teaming using a small industrial robot.
Citation
@INPROCEEDINGS{wilcox2012rss,
AUTHOR = {Ronald Wilcox and Stefanos Nikolaidis and Julie Shah},
TITLE = {Optimization of Temporal Dynamics for Adaptive Human-Robot Interaction in Assembly Manufacturing},
BOOKTITLE = {Proceedings of Robotics: Science and Systems},
YEAR = {2012},
ADDRESS = {Sydney, Australia},
MONTH = {July},
DOI = {10.15607/RSS.2012.VIII.056},
abstract = {Human-robot collaboration presents an opportunity to improve the efficiency of manufacturing and assembly processes, particularly for aerospace manufacturing where tight integration and variability in the build process make physical isolation of robotic-only work challenging. In this paper, we develop a robotic scheduling and control capability that adapts to the changing preferences of a human co-worker or supervisor while providing strong guarantees for synchronization and timing of activities. This innovation is realized through dynamic execution of a flexible optimal scheduling policy that accommodates temporal disturbance. We describe the Adaptive Preferences Algorithm that computes the flexible scheduling policy and show empirically that execution is fast, robust, and adaptable to changing preferences for workflow. We achieve satisfactory computation times, on the order of seconds for moderately-sized problems, and demonstrate the capability for human-robot teaming using a small industrial robot.},
}
2009
Conferences
Optimal arrangement of ceiling cameras for home service robots using genetic algorithms
S. Nikolaidis, T. Arai
RO-MAN 2009 - The 18th IEEE International Symposium on Robot and Human Interactive Communication, September 2009
Abstract
In the near future robots will be used in home environments to provide assistance for the elderly and challenged people. As home environments are complicated, external sensors like ceiling cameras need to be placed on the environment to provide the robot with information about its position. The pose of cameras influences the area covered by the cameras, as well as the error of the robot localization. We examine the problem of finding the arrangement of ceiling cameras at home environments that maximizes the area covered and minimizes the localization error. Genetic algorithms are proposed for the single and multi-objective optimization problem. Simulation results indicate that we can obtain the optimal arrangement of cameras that satisfies the given objectives and the required constraints.
Citation
@INPROCEEDINGS{nikolaidis2009arrangement,
author={Nikolaidis, Stefanos and Arai, Tamio},
booktitle={RO-MAN 2009 - The 18th IEEE International Symposium on Robot and Human Interactive Communication},
title={Optimal arrangement of ceiling cameras for home service robots using genetic algorithms},
year={2009},
month={September},
pages={573--580},
doi={10.1109/ROMAN.2009.5326341},
abstract={In the near future, robots will be used in home environments to provide assistance for the elderly and challenged people. As home environments are complicated, external sensors like ceiling cameras need to be placed in the environment to provide the robot with information about its position. The pose of the cameras influences the area covered by the cameras, as well as the error of the robot localization. We examine the problem of finding the arrangement of ceiling cameras in home environments that maximizes the area covered and minimizes the localization error. Genetic algorithms are proposed for the single and multi-objective optimization problem. Simulation results indicate that we can obtain the optimal arrangement of cameras that satisfies the given objectives and the required constraints.}
}
Optimal camera placement considering mobile robot trajectory
S. Nikolaidis, R. Ueda, A. Hayashi, T. Arai
2008 IEEE International Conference on Robotics and Biomimetics, February 2009
Abstract
In the near future, robots will be used in home environments to provide assistance for the elderly and challenged people. The arrangement of sensors greatly influences the quality of information provided to the robot. We therefore examine the problem of the optimal arrangement of vision sensors for the case of a robot following a pre-defined path. A methodology to evaluate the arrangement of sensors is proposed, focusing on the case of a home environment with ceiling cameras. Simulation results indicate that we can obtain a sub-optimal and practical arrangement with the minimum number of sensors that satisfies the necessary condition.
Citation
@inproceedings{nikolaidis2009placement,
author={Nikolaidis, Stefanos and Ueda, Ryuichi and Hayashi, Akinobu and Arai, Tamio},
booktitle={2008 IEEE International Conference on Robotics and Biomimetics},
title={Optimal camera placement considering mobile robot trajectory},
year={2009},
month={February},
pages={1393--1396},
doi={10.1109/ROBIO.2009.4913204},
abstract={In the near future, robots will be used in home environments to provide assistance for the elderly and challenged people. The arrangement of sensors greatly influences the quality of information provided to the robot. We therefore examine the problem of the optimal arrangement of vision sensors for the case of a robot following a pre-defined path. A methodology to evaluate the arrangement of sensors is proposed, focusing on the case of a home environment with ceiling cameras. Simulation results indicate that we can obtain a sub-optimal and practical arrangement with the minimum number of sensors that satisfies the necessary condition.},
}
Global Pose Estimation of Multiple Cameras with Particle Filters
R. Ueda, S. Nikolaidis, A. Hayashi, T. Arai
Distributed Autonomous Robotic Systems 8, January 2009
Abstract
Though image processing algorithms are sophisticated and provided as software libraries, it is still difficult to assure that complicated programs can work in various situations. In this paper, we propose a novel global pose estimation method for network cameras to actualize auto-calibration. This method uses native information from images. The sets of partial information are integrated with particle filters. Though some kinds of limitation still exist in the method, we can verify that the particle filters can deal with the nonlinearity of estimation with the experiment.
Citation
@inbook{ueda2009pose,
author="Ueda, Ryuichi and Nikolaidis, Stefanos and Hayashi, Akinobu and Arai, Tamio",
editor="Asama, Hajime and Kurokawa, Haruhisa and Ota, Jun and Sekiyama, Kosuke",
title="Global Pose Estimation of Multiple Cameras with Particle Filters",
bookTitle="Distributed Autonomous Robotic Systems 8",
year="2009",
month="January",
publisher="Springer Berlin Heidelberg",
address="Berlin, Heidelberg",
pages="73--82",
abstract="Though image processing algorithms are sophisticated and provided as software libraries, it is still difficult to assure that complicated programs can work in various situations. In this paper, we propose a novel global pose estimation method for network cameras to actualize auto-calibration. This method uses native information from images. The sets of partial information are integrated with particle filters. Though some kinds of limitation still exist in the method, we can verify that the particle filters can deal with the nonlinearity of estimation with the experiment.",
isbn="978-3-642-00644-9",
doi="10.1007/978-3-642-00644-9_7",
url="https://doi.org/10.1007/978-3-642-00644-9_7"
}
2008
Conferences
Real-Time Detection And Visualization of Clarinet Bad Sounds
A. Gkiokas, K. Perifanos, S. Nikolaidis
11th Int. Conference on Digital Audio Effects (DAFx-08), September 2008
Abstract
This paper describes an approach to real-time performance visualization in the context of music education. A tool is described that produces sound visualizations during a student performance that are intuitively linked to common mistakes frequently observed in the performances of novice to intermediate students. The paper discusses the case of clarinet students. Nevertheless, the approach is also well suited for a wide range of wind or other instruments where similar mistakes are often encountered.
Citation
@inproceedings{gkiokas2008detection,
author={Aggelos Gkiokas and Kostas Perifanos and Stefanos Nikolaidis},
booktitle={Proceedings of the 11th International Conference on Digital Audio Effects},
title={Real-Time Detection And Visualization of Clarinet Bad Sounds},
year={2008},
month={September},
abstract={This paper describes an approach to real-time performance visualization in the context of music education. A tool is described that produces sound visualizations during a student performance that are intuitively linked to common mistakes frequently observed in the performances of novice to intermediate students. The paper discusses the case of clarinet students. Nevertheless, the approach is also well suited for a wide range of wind or other instruments where similar mistakes are often encountered.},
}
2007
Conferences
RFID Based Object Localization System Using Ceiling Cameras with Particle Filter
P. Kamol, S. Nikolaidis, R. Ueda, T. Arai
Future Generation Communication and Networking (FGCN 2007), December 2007
Abstract
In this paper, we propose an object localization method for home environments. This method utilizes RFID equipment, a mobile robot, and some ceiling cameras. The RFID system estimates a rough position of each object. The autonomous robot with RFID antennas explores the environment so as to detect other objects on the floor. Each object with an attached RFID tag is then recognized by utilizing its feature information stored in this tag. Finally, the precise localization of each object is achieved by the ceiling cameras with particle filters. The accuracy and the robustness of the proposed method are verified through an experiment.
Citation
@inproceedings{prachya2007rfid,
author={Kamol, Prachya and Nikolaidis, Stefanos and Ueda, Ryuichi and Arai, Tamio},
booktitle={Future Generation Communication and Networking (FGCN 2007)},
title={RFID Based Object Localization System Using Ceiling Cameras with Particle Filter},
year={2007},
month={December},
volume={2},
pages={37--42},
doi={10.1109/FGCN.2007.194},
url={https://ieeexplore.ieee.org/document/4426200},
abstract={In this paper, we propose an object localization method for home environments. This method utilizes RFID equipment, a mobile robot, and some ceiling cameras. The RFID system estimates a rough position of each object. The autonomous robot with RFID antennas explores the environment so as to detect other objects on the floor. Each object with an attached RFID tag is then recognized by utilizing its feature information stored in this tag. Finally, the precise localization of each object is achieved by the ceiling cameras with particle filters. The accuracy and the robustness of the proposed method are verified through an experiment.},
}