In this article, we present RISE—a Robotics Integration and Scenario-Management Extensible-Architecture—for designing human–robot dialogs and conducting Human–Robot Interaction (HRI) studies. In current HRI research, interdisciplinarity in the creation and implementation of interaction studies is becoming increasingly important. In addition, there is a lack of reproducibility of the research results. With the presented open-source architecture, we aim to address these two topics. Therefore, we discuss the advantages and disadvantages of various existing tools from different sub-fields within robotics. Requirements for an architecture can be derived from this overview of the literature, which 1) supports interdisciplinary research, 2) allows reproducibility of the research, and 3) is accessible to other researchers in the field of HRI. With our architecture, we tackle these requirements by providing a Graphical User Interface which explains the robot behavior and allows introspection into the current state of the dialog. Additionally, it offers controlling possibilities to easily conduct Wizard of Oz studies. To achieve transparency, the dialog is modeled explicitly, and the robot behavior can be configured. Furthermore, the modular architecture offers an interface for external features and sensors and is expandable to new robots and modalities.
Frontiers in Robotics and AI
Introduction: Robotic exoskeletons are emerging technologies that have demonstrated their effectiveness in assisting with Activities of Daily Living. However, kinematic disparities between human and robotic joints can result in misalignment between humans and exoskeletons, leading to discomfort and potential user injuries.
Methods: In this paper, we present an ergonomic knee exoskeleton based on a dual four-bar linkage mechanism powered by hydraulic artificial muscles for stair ascent assistance. The device comprises two asymmetric four-bar linkage mechanisms on the medial and lateral sides to accommodate the internal rotation of the knee and address the kinematic discrepancies between these sides. A genetic algorithm was employed to optimize the parameters of the four-bar linkage mechanism to minimize misalignment between human and exoskeleton knee joints. The proposed device was evaluated through two experiments. The first experiment measured the reduction in undesired load due to misalignment, while the second experiment evaluated the device’s effectiveness in assisting stair ascent in a healthy subject.
Results: The experimental results indicate that the proposed device has a significantly reduced undesired load compared to the traditional revolute joint, decreasing from 14.15 N and 18.32 N to 1.88 N and 1.07 N on the medial and lateral sides, respectively. Moreover, a substantial reduction in muscle activities during stair ascent was observed, with a 55.94% reduction in surface electromyography signal.
Discussion: The reduced undesired load of the proposed dual four-bar linkage mechanism highlights the importance of the adopted asymmetrical design for reduced misalignment and increased comfort. Moreover, the proposed device was effective at reducing the effort required during stair ascent.
The present research is innovative as we followed a user-centered approach to implement and train two working memory architectures on an industrial RB-KAIROS + robot: GRU, a state-of-the-art architecture, and WorkMATe, a biologically-inspired alternative. Although user-centered approaches are essential to create a comfortable and safe HRI, they are still rare in industrial settings. Closing this research gap, we conducted two online user studies with large heterogeneous samples. The major aim of these studies was to evaluate the RB-KAIROS + robot’s appearance, movements, and perceived memory functions before (User Study 1) and after the implementation and training of robot working memory (User Study 2). In User Study 1, we furthermore explored participants’ ideas about robot memory and what aspects of the robot’s movements participants found positive and what aspects they would change. The effects of participants’ demographic background and attitudes were controlled for. In User Study 1, participants’ overall evaluations of the robot were moderate. Participant age and negative attitudes toward robots led to more negative robot evaluations. According to exploratory analyses, these effects were driven by perceived low experience with robots. Participants expressed clear ideas of robot memory and precise suggestions for a safe, efficient, and comfortable robot navigation which are valuable for further research and development. In User Study 2, the implementation of WorkMATe and GRU led to more positive evaluations of perceived robot memory, but not of robot appearance and movements. Participants’ robot evaluations were driven by their positive views of robots. Our results demonstrate that considering potential users’ views can greatly contribute to an efficient and positively perceived robot navigation, while users’ experience with robots is crucial for a positive HRI.
Affective behaviors enable social robots to not only establish better connections with humans but also serve as a tool for the robots to express their internal states. It has been well established that emotions are important to signal understanding in Human-Robot Interaction (HRI). This work aims to harness the power of Large Language Models (LLM) and proposes an approach to control the affective behavior of robots. By interpreting emotion appraisal as an Emotion Recognition in Conversation (ERC) tasks, we used GPT-3.5 to predict the emotion of a robot’s turn in real-time, using the dialogue history of the ongoing conversation. The robot signaled the predicted emotion using facial expressions. The model was evaluated in a within-subjects user study (N = 47) where the model-driven emotion generation was compared against conditions where the robot did not display any emotions and where it displayed incongruent emotions. The participants interacted with the robot by playing a card sorting game that was specifically designed to evoke emotions. The results indicated that the emotions were reliably generated by the LLM and the participants were able to perceive the robot’s emotions. It was found that the robot expressing congruent model-driven facial emotion expressions were perceived to be significantly more human-like, emotionally appropriate, and elicit a more positive impression. Participants also scored significantly better in the card sorting game when the robot displayed congruent facial expressions. From a technical perspective, the study shows that LLMs can be used to control the affective behavior of robots reliably in real-time. Additionally, our results could be used in devising novel human-robot interactions, making robots more effective in roles where emotional interaction is important, such as therapy, companionship, or customer service.
This paper summarizes the structure and findings from the first Workshop on Troubles and Failures in Conversations between Humans and Robots. The workshop was organized to bring together a small, interdisciplinary group of researchers working on miscommunication from two complementary perspectives. One group of technology-oriented researchers was made up of roboticists, Human-Robot Interaction (HRI) researchers and dialogue system experts. The second group involved experts from conversation analysis, cognitive science, and linguistics. Uniting both groups of researchers is the belief that communication failures between humans and machines need to be taken seriously and that a systematic analysis of such failures may open fruitful avenues in research beyond current practices to improve such systems, including both speech-centric and multimodal interfaces. This workshop represents a starting point for this endeavour. The aim of the workshop was threefold: Firstly, to establish an interdisciplinary network of researchers that share a common interest in investigating communicative failures with a particular view towards robotic speech interfaces; secondly, to gain a partial overview of the “failure landscape” as experienced by roboticists and HRI researchers; and thirdly, to determine the potential for creating a robotic benchmark scenario for testing future speech interfaces with respect to the identified failures. The present article summarizes both the “failure landscape” surveyed during the workshop as well as the outcomes of the attempt to define a benchmark scenario.
Interaction with artificial social agents is often designed based on models of human interaction and dialogue. While this is certainly useful for basic interaction mechanisms, it has been argued that social communication strategies and social language use, a “particularly human” ability, may not be appropriate and transferable to interaction with artificial conversational agents. In this paper, we present qualitative research exploring whether users expect artificial agents to use politeness—a fundamental mechanism of social communication—in language-based human-robot interaction. Based on semi-structured interviews, we found that humans mostly ascribe a functional, rule-based use of polite language to humanoid robots and do not expect them to apply socially motivated politeness strategies that they expect in human interaction. This study 1) provides insights for interaction design for social robots’ politeness use from a user perspective, and 2) contributes to politeness research based on the analysis of our participants’ perspectives on politeness.
Identifying an accurate dynamics model remains challenging for humanoid robots. The difficulty is mainly due to the following two points. First, a good initial model is required to evaluate the feasibility of motions for data acquisition. Second, a highly nonlinear optimization problem needs to be solved to design movements to acquire the identification data. To cope with the first point, in this paper, we propose a curriculum of identification to gradually learn an accurate dynamics model from an unreliable initial model. For the second point, we propose using a large-scale human motion database to efficiently design the humanoid movements for the parameter identification. The contribution of our study is developing a humanoid identification method that does not require the good initial model and does not need to solve the highly nonlinear optimization problem. We showed that our curriculum-based approach was able to more efficiently identify humanoid model parameters than a method that just randomly picked reference motions for identification. We evaluated our proposed method in a simulation experiment and demonstrated that our curriculum was led to obtain a wide variety of motion data for efficient parameter estimation. Consequently, our approach successfully identified an accurate model of an 18-DoF, simulated upper-body humanoid robot.
The number of older adults living alone is rapidly increasing. Loneliness in older adults not only degrade their quality of life but also causes troubles such as heavy burden on the medical staff, especially when cognitive decline is present. Social robots could be used in several ways to reduce such problems. As a first step towards this goal, we introduced conversation robots into the homes of older adults with cognitive decline to evaluate the robot’s availability and acceptance during several months. The study involved two steps, one for evaluating the robustness of the proposed robotic system, and the second one to examine the long-term acceptance of social robots by older adults with cognitive decline living alone. Our data shows that after several weeks of human-robot interaction, the participants continued to use the robot and successfully integrated them into their lives. These results open the possibility of further research involving how sustained interaction can be achieved, as well as which factors contributed to the acceptance of the robot.
The remarkable growth of unmanned aerial vehicles (UAVs) has also sparked concerns about safety measures during their missions. To advance towards safer autonomous aerial robots, this work presents a vision-based solution to ensuring safe autonomous UAV landings with minimal infrastructure. During docking maneuvers, UAVs pose a hazard to people in the vicinity. In this paper, we propose the use of a single omnidirectional panoramic camera pointing upwards from a landing pad to detect and estimate the position of people around the landing area. The images are processed in real-time in an embedded computer, which communicates with the onboard computer of approaching UAVs to transition between landing, hovering or emergency landing states. While landing, the ground camera also aids in finding an optimal position, which can be required in case of low-battery or when hovering is no longer possible. We use a YOLOv7-based object detection model and a XGBooxt model for localizing nearby people, and the open-source ROS and PX4 frameworks for communication, interfacing, and control of the UAV. We present both simulation and real-world indoor experimental results to show the efficiency of our methods.
Introduction: Our work introduces a real-time robotic localization and mapping system for buried pipe networks.
Methods: The system integrates non-vision-based exploration and navigation with an active-vision-based localization and topological mapping algorithm. This algorithm is selectively activated at topologically key locations, such as junctions. Non-vision-based sensors are employed to detect junctions, minimizing the use of visual data and limiting the number of images taken within junctions.
Results: The primary aim is to provide an accurate and efficient mapping of the pipe network while ensuring real-time performance and reduced computational requirements.
Discussion: Simulation results featuring robots with fully autonomous control in a virtual pipe network environment are presented. These simulations effectively demonstrate the feasibility of our approach in principle, offering a practical solution for mapping and localization in buried pipes.
Introduction: This study was to examine whether inter-user haptic feedback would have a differential impact on skill acquisition based on the nature of the surgical task involved. Specifically, we hypothesized that haptic feedback would facilitate target orientation more than cutting tasks in the context of laparoscopic surgery.
Methods: Ten novice participants were recruited and assigned to one of two training groups. Each group underwent six half-hour training sessions dedicated to laparoscopic pattern-cutting tasks. In the haptic group, five participants received expert guidance during the training sessions, whereas the remaining five participants in the control group engaged in self-practice. All trials were recorded on video, enabling a comparative analysis of task performance between the participants’ left hand (target manipulation) and right hand (cutting task). Additionally, the number of haptic feedback instances provided to the trainees in the haptic group was recorded.
Results: Practice led to a reduction in total task time, grasping time, and cutting errors. However, no significant differences were observed between the two training groups, except for the grasping time, where haptic feedback significantly reduced the grasping time compared to the control group. Moreover, the frequency of haptic feedback instances provided to the trainees was notably higher for the grasping than for the cutting task.
Discussion: Our study suggests that haptic feedback has a more substantial impact on orientation tasks than on cutting tasks in laparoscopic surgery training. However, we acknowledge that a larger sample size would provide a more robust evaluation of this effect.
People often form immediate expectations about other people, or groups of people, based on visual appearance and characteristics of their voice and speech. These stereotypes, often inaccurate or overgeneralized, may translate to robots that carry human-like qualities. This study aims to explore if nationality-based preconceptions regarding appearance and accents can be found in people’s perception of a virtual and a physical social robot. In an online survey with 80 subjects evaluating different first-language-influenced accents of English and nationality-influenced human-like faces for a virtual robot, we find that accents, in particular, lead to preconceptions on perceived competence and likeability that correspond to previous findings in social science research. In a physical interaction study with 74 participants, we then studied if the perception of competence and likeability is similar after interacting with a robot portraying one of four different nationality representations from the online survey. We find that preconceptions on national stereotypes that appeared in the online survey vanish or are overshadowed by factors related to general interaction quality. We do, however, find some effects of the robot’s stereotypical alignment with the subject group, with Swedish subjects (the majority group in this study) rating the Swedish-accented robot as less competent than the international group, but, on the other hand, recalling more facts from the Swedish robot’s presentation than the international group does. In an extension in which the physical robot was replaced by a virtual robot interacting in the same scenario online, we further found the same results that preconceptions are of less importance after actual interactions, hence demonstrating that the differences in the ratings of the robot between the online survey and the interaction is not due to the interaction medium. We hence conclude that attitudes towards stereotypical national representations in HRI have a weak effect, at least for the user group included in this study (primarily educated young students in an international setting).
Postural stabilization during rapid and powerful hopping actions represents a significant challenge for legged robotics. One strategy utilized by humans to negotiate this difficulty is the robust activation of biarticular thigh muscles. Guided by this physiological principle, this study aims to enhance the postural stability of a hopping robot through the emulation of this human mechanism. A legged robot powered by pneumatic artificial muscles (PAMs) was designed to mimic human anatomical structures. A critical aspect of this development was creating a tension-oriented stretch reflex system engineered to initiate muscle activation in response to perturbations. Our research encompassed three experiments: 1) assessing the trunk pitch angle with and without the integration of stretch reflexes, 2) evaluating the consistency of hops made with and without reflexes, and 3) understanding the correlation between the reflex strength equilibrium in the biarticular thigh muscles and trunk pitch angle. The results indicated that the integration of the stretch reflex minimized perturbations, thereby allowing the robot to perform double the continuous hops. As hypothesized, adjusting the reflex strength equilibrium caused a shift in the angle. This reflex mechanism offers potential application to PAM-driven robots and signifies a promising avenue for enhancing postural stability in diverse forms of locomotion, including walking and running.
Humans regularly use all inner surfaces of the hand during manipulation, whereas traditional formulations for robots tend to use only the tips of their fingers, limiting overall dexterity. In this paper, we explore the use of the whole hand during spatial robotic dexterous within-hand manipulation. We present a novel four-fingered robotic hand called the Model B, which is designed and controlled using a straight-forward potential energy-based motion model that is based on the hand configuration and applied actuator torques. In this way the hand-object system is driven to a new desired configuration, often through sliding and rolling between the object and hand, and with the fingers “caging” the object to prevent ejection. This paper presents the first ever application of the energy model in three dimensions, which was used to compare the theoretical manipulability of popular robotic hands, which then inspired the design of the Model B. We experimentally validate the hand’s performance with extensive benchtop experimentation with test objects and real world objects, as well as on a robotic arm, and demonstrate complex spatial caging manipulation on a variety of objects in all six object dimensions (three translation and three rotation) using all inner surfaces of the fingers and the palm.
Background: Robot-assisted fracture reduction systems can potentially reduce the risk of infection and improve outcomes, leading to significant health and economic benefits. However, these systems are still in the laboratory stage and not yet ready for commercialization due to unresolved difficulties. While previous reviews have focused on individual technologies, system composition, and surgical stages, a comprehensive review is necessary to assist future scholars in selecting appropriate research directions for clinical use.
Methods: A literature review using Google Scholar identified articles on robot-assisted fracture reduction systems. A comprehensive search yielded 17,800, 18,100, and 16,700 results for “fracture reduction,” “computer-assisted orthopedic surgery,” and “robot-assisted fracture reduction,” respectively. Approximately 340 articles were selected, and 90 highly relevant articles were chosen for further reading after reviewing the abstracts.
Results and Conclusion: Robot-assisted fracture reduction systems offer several benefits, including improved reduction accuracy, reduced physical work and radiation exposure, enhanced preoperative planning and intraoperative visualization, and shortened learning curve for skill acquisition. In the future, these systems will become integrated and practical, with automatic preoperative planning and high intraoperative safety.
Introduction: Physical therapy is crucial to rehabilitating hand function needed for activities of daily living after neurological traumas such as traumatic brain injury (TBI). Virtual reality (VR) can motivate participation in motor rehabilitation therapies. This study examines how multimodal feedback in VR to train grasp-and-place function will impact the neurological and motor responses in TBI participants (n = 7) compared to neurotypicals (n = 13).
Methods: We newly incorporated VR with our existing intelligent glove system to seamlessly enhance the augmented visual and audio feedback to inform participants about grasp security. We then assessed how multimodal feedback (audio plus visual cues) impacted electroencephalography (EEG) power, grasp-and-place task performance (motion pathlength, completion time), and electromyography (EMG) measures.
Results: After training with multimodal feedback, electroencephalography (EEG) alpha power significantly increased for TBI and neurotypical groups. However, only the TBI group demonstrated significantly improved performance or significant shifts in EMG activity.
Discussion: These results suggest that the effectiveness of motor training with augmented sensory feedback will depend on the nature of the feedback and the presence of neurological dysfunction. Specifically, adding sensory cues may better consolidate early motor learning when neurological dysfunction is present. Computerized interfaces such as virtual reality offer a powerful platform to personalize rehabilitative training and improve functional outcomes based on neuropathology.
Can we conceive machines that can formulate autonomous intentions and make conscious decisions? If so, how would this ability affect their ethical behavior? Some case studies help us understand how advances in understanding artificial consciousness can contribute to creating ethical AI systems.
Successful conversational interaction with a social robot requires not only an assessment of a user’s contribution to an interaction, but also awareness of their emotional and attitudinal states as the interaction unfolds. To this end, our research aims to systematically trigger, but then interpret human behaviors to track different states of potential user confusion in interaction so that systems can be primed to adjust their policies in light of users entering confusion states. In this paper, we present a detailed human-robot interaction study to prompt, investigate, and eventually detect confusion states in users. The study itself employs a Wizard-of-Oz (WoZ) style design with a Pepper robot to prompt confusion states for task-oriented dialogues in a well-defined manner. The data collected from 81 participants includes audio and visual data, from both the robot’s perspective and the environment, as well as participant survey data. From these data, we evaluated the correlations of induced confusion conditions with multimodal data, including eye gaze estimation, head pose estimation, facial emotion detection, silence duration time, and user speech analysis—including emotion and pitch analysis. Analysis shows significant differences of participants’ behaviors in states of confusion based on these signals, as well as a strong correlation between confusion conditions and participants own self-reported confusion scores. The paper establishes strong correlations between confusion levels and these observable features, and lays the ground or a more complete social and affect oriented strategy for task-oriented human-robot interaction. The contributions of this paper include the methodology applied, dataset, and our systematic analysis.
Gait is an important basic function of human beings and an integral part of life. Many mental and physical abnormalities can cause noticeable differences in a person’s gait. Abnormal gait can lead to serious consequences such as falls, limited mobility and reduced life satisfaction. Gait analysis, which includes joint kinematics, kinetics, and dynamic Electromyography (EMG) data, is now recognized as a clinically useful tool that can provide both quantifiable and qualitative information on performance to aid in treatment planning and evaluate its outcome. With the assistance of new artificial intelligence (AI) technology, the traditional medical environment has undergone great changes. AI has the potential to reshape medicine, making gait analysis more accurate, efficient and accessible. In this study, we analyzed basic information about gait analysis and AI articles that met inclusion criteria in the WoS Core Collection database from 1992–2022, and the VosViewer software was used for web visualization and keyword analysis. Through bibliometric and visual analysis, this article systematically introduces the research status of gait analysis and AI. We introduce the application of artificial intelligence in clinical gait analysis, which affects the identification and management of gait abnormalities found in various diseases. Machine learning (ML) and artificial neural networks (ANNs) are the most often utilized AI methods in gait analysis. By comparing the predictive capability of different AI algorithms in published studies, we evaluate their potential for gait analysis in different situations. Furthermore, the current challenges and future directions of gait analysis and AI research are discussed, which will also provide valuable reference information for investors in this field.