Feed aggregator

We propose a tool-use model that enables a robot to act toward a provided goal. It is important to consider the features of four factors, namely tools, objects, actions, and effects, at the same time, because they are related to each other and one factor can influence the others. The tool-use model is constructed with deep neural networks (DNNs) using multimodal sensorimotor data: image, force, and joint angle information. To allow the robot to learn tool use, we collect training data by controlling the robot to perform various object operations using several tools with multiple actions that lead to different effects. The tool-use model is then trained on these data, learning sensorimotor coordination and acquiring the relationships among tools, objects, actions, and effects in its latent space. We can give the robot a task goal by providing an image showing the target placement and orientation of the object. Using the goal image with the tool-use model, the robot detects the features of the tools and objects and automatically determines how to act to reproduce the target effects. The robot then generates actions, adjusting to the real-time situation, even when the tools and objects are unknown and more complicated than the trained ones.
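
For readers who want something concrete, below is a minimal sketch of what such a multimodal tool-use model could look like in PyTorch. It is an illustration only, not the authors' architecture: the layer sizes, the LSTM core, and the omission of the goal-image conditioning are all assumptions made for brevity.

import torch
import torch.nn as nn

class ToolUseModel(nn.Module):
    # Encode image, force, and joint-angle inputs into a shared latent space,
    # run a recurrent core, and decode the next joint-angle command.
    def __init__(self, img_feat=64, force_dim=6, joint_dim=7, latent_dim=128):
        super().__init__()
        self.img_enc = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
            nn.Flatten(), nn.LazyLinear(img_feat), nn.ReLU(),
        )
        self.fuse = nn.Linear(img_feat + force_dim + joint_dim, latent_dim)
        self.rnn = nn.LSTMCell(latent_dim, latent_dim)
        self.dec_joint = nn.Linear(latent_dim, joint_dim)

    def forward(self, img, force, joint, state=None):
        feats = torch.cat([self.img_enc(img), force, joint], dim=-1)
        h, c = self.rnn(torch.relu(self.fuse(feats)), state)
        return self.dec_joint(h), (h, c)

# one prediction step on dummy data (batch of 1, 64x64 RGB image)
model = ToolUseModel()
next_joint, state = model(torch.zeros(1, 3, 64, 64), torch.zeros(1, 6), torch.zeros(1, 7))

In a system like the one described, the same recurrent core would be unrolled over each demonstration and trained to predict the next sensorimotor state, with a goal-image encoding concatenated alongside the current image features.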

Tactile hands-only training is particularly important for medical palpation. Generally, equipment for palpation training is expensive, static, or provides too few study cases to practice on. We have therefore developed a novel haptic surface concept for palpation training using ferrogranular jamming. The concept's design consists of a tactile field spanning 260 x 160 mm and uses ferromagnetic granules to alter the shape, position, and hardness of palpable irregularities. The granules are enclosed in a compliant vacuum-sealed chamber connected to a pneumatic system. A variety of geometric shapes (output) can be obtained by manipulating and arranging the granules with permanent magnets, and the tactile hardness of the palpable output can be controlled by adjusting the chamber's vacuum level. A psychophysical experiment (N = 28) investigated how people interact with the palpable surface and evaluated the proposed concept. Untrained participants characterized irregularities with different positions, forms, and hardness through palpation, and their performance was evaluated. A baseline (no irregularity) was compared to three irregularity conditions: two circular shapes with different hardness (Hard Lump and Soft Lump) and an Annulus shape. All participants correctly identified an irregularity in the three irregularity conditions, whereas 78.6% correctly identified the baseline. Overall agreement between participants was high (κ = 0.723). The Intersection over Union (IoU) between participants' sketched outlines and the actual shape was Mdn = 79.3% for Soft Lump, Mdn = 68.8% for Annulus, and Mdn = 76.7% for Hard Lump. The distance from the actual to the drawn center was Mdn = 6.4 mm (Soft Lump), Mdn = 5.3 mm (Annulus), and Mdn = 7.4 mm (Hard Lump), which is small compared to the size of the field. Participants subjectively evaluated the Soft Lump to be significantly softer than the Hard Lump and Annulus. Moreover, 71% of participants felt they improved their palpation skills throughout the experiment. Together, these results show that the concept can render irregularities with different position, form, and hardness, and that users are able to locate and characterize them through palpation. Participants experienced an improvement in palpation skills throughout the experiment, which indicates the concept's feasibility as a palpation training device.
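
For reference, the two shape-accuracy metrics reported above (IoU of the sketched outline against the actual shape, and the center-to-center distance) can be computed from rasterized masks as in the short sketch below; the grid size, millimetre scale, and toy shapes are made up for illustration and are not the study's analysis code.

import numpy as np

def iou(drawn, actual):
    # Intersection over Union of two boolean masks on the same grid.
    inter = np.logical_and(drawn, actual).sum()
    union = np.logical_or(drawn, actual).sum()
    return inter / union if union else 0.0

def center_distance_mm(drawn, actual, mm_per_px=1.0):
    # Distance between the centroids of the two masks, in millimetres.
    c_drawn = np.argwhere(drawn).mean(axis=0)
    c_actual = np.argwhere(actual).mean(axis=0)
    return float(np.linalg.norm(c_drawn - c_actual) * mm_per_px)

# toy example: two offset discs on a 260 x 160 grid (1 px = 1 mm)
yy, xx = np.mgrid[0:160, 0:260]
actual = (xx - 130) ** 2 + (yy - 80) ** 2 < 20 ** 2
drawn = (xx - 135) ** 2 + (yy - 82) ** 2 < 22 ** 2
print(iou(drawn, actual), center_distance_mm(drawn, actual))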

The storytelling lens in human-computer interaction has primarily focused on personas, design fiction, and other stories crafted by designers, while informal personal narratives from everyday people, such as storytelling by older adults, have not been considered meaningful data. Storytelling may provide a clear path to conceptualizing how technologies such as social robots can support the lives of older or disabled individuals. To explore this, we engaged 28 older adults in a year-long co-design process, examining the informal stories they told as a means of generating and expressing technology ideas and needs. This paper presents an analysis of participants' stories about their prior experiences with technology, stories shaped by social context, and speculative scenarios for the future of social robots. From this analysis, we present suggestions for social robot design, considerations of older adults' values around technology design, and a case for participant stories as sources of design knowledge that can shift perspectives on older adults and technology.

Communication delay represents a fundamental challenge in telerobotics: on the one hand, it compromises the stability of teleoperated robots; on the other hand, it decreases the user's awareness of the designated task. In the scientific literature, this problem has been addressed both with statistical models and with neural networks (NNs) that perform sensor prediction while keeping the user in full control of the robot's motion. We propose shared control as a tool to compensate for and mitigate the effect of communication delay. Shared control has been proven to enhance precision and speed in reaching and manipulation tasks, especially in the medical and surgical fields. We analyze the effects of added delay and propose a teleoperated leader-follower architecture that implements both a predictive system and shared control, in a one-dimensional reaching and recognition task with haptic feedback. We propose four control modalities of increasing autonomy: non-predictive human control (hc), predictive human control (phc), shared predictive human-robot control (phrc), and predictive robot control (prc). When analyzing how the added delay affects the subjects' performance, the results show that hc is very sensitive to the delay: users are not able to stop at the desired position, and trajectories exhibit wide oscillations. The degree of autonomy introduced is shown to be effective in decreasing the total time required to accomplish the task. Furthermore, we provide an in-depth analysis of environmental interaction forces and performed trajectories. Overall, the shared control modality, phrc, represents a good trade-off, with peak performance in accuracy and task time, good reaching speed, and moderate contact with the object of interest.
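
A rough sketch of how the four modalities could differ in code is given below. The constant-velocity prediction over the delay and the fixed linear blend used for shared control are illustrative assumptions, not the specific architecture evaluated in the paper.

DT, DELAY_STEPS, ALPHA = 0.01, 30, 0.5   # 10 ms loop, 300 ms delay, blend weight (assumed)

def robot_policy(x, goal, k=2.0):
    # Simple autonomous reaching controller: proportional to the goal error.
    return k * (goal - x)

def predict(x, v):
    # Constant-velocity extrapolation of the follower state across the delay.
    return x + v * DELAY_STEPS * DT

def command(mode, u_human, x, v, goal):
    x_hat = predict(x, v)
    if mode == "hc":      # non-predictive human control: delayed command passes through
        return u_human
    if mode == "phc":     # predictive human control: x_hat is shown to the operator,
        return u_human    # but the command itself is still the human's
    if mode == "phrc":    # shared predictive human-robot control: blend of both
        return ALPHA * u_human + (1 - ALPHA) * robot_policy(x_hat, goal)
    if mode == "prc":     # predictive robot control: fully autonomous
        return robot_policy(x_hat, goal)
    raise ValueError(mode)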

Augmenting the physical strength of a human operator during unpredictable human-directed (volitional) movements is a relevant capability for several proposed exoskeleton applications, including mobility augmentation, manual material handling, and tool operation. Unlike controllers and augmentation systems designed for repetitive tasks (e.g., walking), we approach physical strength augmentation by a task-agnostic method of force amplification—using force/torque sensors at the human–machine interface to estimate the human task force, and then amplifying it with the exoskeleton. We deploy an amplification controller that is integrated into a complete whole-body control framework for controlling exoskeletons that includes human-led foot transitions, inequality constraints, and a computationally efficient prioritization. A powered lower-body exoskeleton is used to demonstrate behavior of the control framework in a lab environment. This exoskeleton can assist the operator in lifting an unknown backpack payload while remaining fully backdrivable.
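
The core amplification idea can be sketched as a short control-loop step like the one below; the amplification gain, the low-pass filter, and the simple Jacobian-transpose mapping are illustrative assumptions rather than the authors' whole-body control formulation.

import numpy as np

ALPHA = 3.0             # amplification factor (assumed)
DT, TAU = 0.001, 0.05   # 1 kHz control loop, filter time constant (assumed)

def amplification_step(f_interface, f_filt_prev, jacobian_T):
    # Low-pass filter the measured human-machine interface force, then command
    # the exoskeleton to supply the remaining (ALPHA - 1) share of the task force,
    # so the payload effectively feels ALPHA times the human's force.
    beta = DT / (TAU + DT)
    f_filt = (1 - beta) * f_filt_prev + beta * f_interface
    tau_cmd = jacobian_T @ ((ALPHA - 1.0) * f_filt)   # map wrench to joint torques
    return tau_cmd, f_filt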

Remote teleoperation of robots can broaden the reach of domain specialists across a wide range of industries such as home maintenance, health care, light manufacturing, and construction. However, current direct control methods are impractical, and existing tools for programming robots remotely have focused on users with significant robotics experience. Extending remote robot programming to end users, i.e., users who are experts in a domain but novices in robotics, requires tools that balance the rich features necessary for complex teleoperation tasks with ease of use. The primary challenge to usability is that novice users are unable to specify complete and robust task plans that would allow a robot to perform its duties autonomously, particularly in highly variable environments. Our solution is to allow operators to specify shorter sequences of high-level commands, which we call task-level authoring, to create periods of variable robot autonomy. This approach allows inexperienced users to create robot behaviors in uncertain environments by interleaving exploration, specification of behaviors, and execution as separate steps. End users are able to break down the specification of tasks and adapt to the current needs of the interaction and environment, combining the reactivity of direct control with asynchronous operation. In this paper, we describe a prototype system contextualized in light manufacturing and its empirical validation in a user study in which 18 participants with some programming experience were able to perform a variety of complex telemanipulation tasks with little training. Our results show that our approach allowed users to create flexible periods of autonomy and solve rich manipulation tasks. Furthermore, participants significantly preferred our system over comparable, more direct interfaces, demonstrating the potential of our approach for enabling end users to effectively perform remote robot programming.
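
Below is a hypothetical illustration of what a task-level authoring sequence might look like as data plus a simple executor. The command names and the robot.run_skill call are invented for the example; they are not the authors' interface.

from dataclasses import dataclass
from typing import Sequence

@dataclass
class Command:
    name: str      # a high-level skill, e.g. "move_to", "pick", "place"
    target: str    # a named object or location chosen during exploration

def execute(sequence: Sequence[Command], robot) -> bool:
    # Run the authored sequence autonomously; stop and hand control back to the
    # operator if a step fails, so they can re-explore or re-specify.
    for cmd in sequence:
        if not robot.run_skill(cmd.name, cmd.target):   # hypothetical robot API
            return False
    return True

# a short sequence an operator might author after exploring the workcell
sequence = [Command("move_to", "bin_A"),
            Command("pick", "gear_3"),
            Command("move_to", "fixture"),
            Command("place", "fixture_slot_1")]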

This study proposes two novel methods for determining the muscular internal force (MIF) based on joint stiffness, using an MIF feedforward controller for the musculoskeletal system. The controller was developed in a previous study, where we found that it could achieve any desired end-point position without the use of sensors by providing the MIF as a feedforward input to the individual muscles. However, achieving motion with good response and low stiffness using this system posed a challenge. Furthermore, the controller was subject to an ill-posed problem in which the input could not be uniquely determined. We propose two methods to improve the control performance of this controller. The first method determines an MIF that can independently control the response and the stiffness at a desired position, and the second defines an arbitrary vector describing the stiffnesses at the initial and desired positions to uniquely determine the MIF balance at each position. The numerical simulation results reported in this study demonstrate the effectiveness of both proposed methods.
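
For context, one common simplified way to relate the MIF to joint stiffness (our notation and simplifications, not necessarily the paper's) treats each muscle's stiffness as roughly proportional to its tension, so that with moment-arm matrix $A(q)$ and muscle tensions $u$:

\tau = A(q)\,u, \qquad k_i \approx c_i\,u_i, \qquad
K(q) \approx A(q)\,\mathrm{diag}(k_1,\dots,k_n)\,A(q)^{\mathsf{T}}.

Under this view, any co-contraction component $u_0$ with $A(q)\,u_0 = 0$ leaves the joint torque unchanged while increasing $K(q)$, which is why the MIF is not unique and why, in principle, stiffness can be tuned independently of the equilibrium posture.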



This week we have a special DARPA SubT edition of Video Friday, both because the SubT Final is happening this week and is amazing, and also because (if I'm being honest) the SubT Final is happening this week and is amazing and I've spent all week covering it mostly in a cave with zero access to the Internet. Win-win, right? So today, the videos to watch are DARPA's recaps of the preliminary competition days, plus (depending on when you're tuning in) a livestream of the prize round highlights, the awards ceremony, and the SubT Summit, with roundtable discussions featuring both the Virtual and Systems track teams.

DARPA Subterranean Challenge Final Event Day 1 - Introduction to the SubT Challenge

DARPA Subterranean Challenge Final Event - Day 2 - Competition Coverage

DARPA Subterranean Challenge Final Event - Day 3 - Competition Coverage

DARPA Subterranean Challenge Final Event Day 4 - Prize Round Coverage

DARPA Subterranean Challenge Final Event Day 4 - Awards Ceremony and SubT Summit




Soft continuum robots have been accepted as a promising category of biomedical robots, owing to their inherent compliance, which lets them interact safely with their surroundings. In minimally invasive surgery, the continuum concept shares the same vision as the robotization of conventional endoscopy/laparoscopy. Unlike rigid-link robots with accurate analytical kinematics/dynamics, soft robots are subject to modeling uncertainties arising from intrinsic and extrinsic factors, which degrade model-based control performance. However, the trade-off between the flexibility and the controllability of soft manipulators is not readily optimized and places demands on the particular modeling approach adopted. To this end, data-driven modeling strategies built on machine learning algorithms offer an encouraging way forward for the control of soft continuum robots. In this article, we overview the current state of kinematic/dynamic model-free control schemes for continuum manipulators, particularly learning-based ones, and discuss their similarities and differences. Perspectives on and trends in the development of new control methods are also examined through a review of existing limitations and challenges.
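
As one concrete example of the model-free, learning-based direction surveyed here, the sketch below estimates the actuation-to-tip Jacobian online with a Broyden-style rank-one update and servos the tip with damped least squares; this is a generic scheme from the literature sketched under our own assumptions, not a method specifically endorsed by the article.

import numpy as np

class ModelFreeTipController:
    def __init__(self, n_act, n_task, gain=0.5, damping=1e-3):
        self.J = np.eye(n_task, n_act)      # rough initial Jacobian guess
        self.gain, self.damping = gain, damping

    def update_jacobian(self, dq, dx):
        # Broyden rank-one update from the last actuation change dq and the
        # observed tip displacement dx; no analytical model of the soft arm.
        denom = float(dq @ dq)
        if denom > 1e-12:
            self.J += np.outer(dx - self.J @ dq, dq) / denom

    def step(self, x, x_des):
        # Damped least-squares actuation step toward the desired tip position.
        e = self.gain * (x_des - x)
        JT = self.J.T
        return JT @ np.linalg.solve(self.J @ JT + self.damping * np.eye(len(e)), e)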

Robots are an opportunity for interactive and engaging learning activities. In this paper we consider the premise that haptic force feedback delivered through a held robot can enrich learning of science-related concepts by building physical intuition as learners design experiments and physically explore them to solve problems they have posed. Further, we conjecture that combining this rich feedback with pen-and-paper interactions, e.g., sketching experiments they want to try, could lead to fluid interactions and benefit focus. However, a number of technical barriers interfere with testing this approach and with making it accessible to learners and their teachers. In this paper, we propose a framework for Physically Assisted Learning (PAL), based on the stages of experiential learning, which can guide designers in developing and evaluating effective technology and which directs focus to how haptic feedback could assist with the design and explore learning stages. To this end, we demonstrate a possible technical pathway that supports the full experience of designing an experiment by drawing a physical system on paper and then interacting with it physically after the system recognizes the sketch, interprets it as a model, and renders it haptically. Our proposed framework is rooted in theoretical needs and current advances in experiential learning, pen-and-paper interaction, and haptic technology. We further explain how to instantiate the PAL framework using available technologies and discuss a path forward to a larger vision of physically assisted learning.
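
The final "renders it haptically" step can be as simple as the impedance-style force law sketched below for a single drawn spring; the stiffness and damping values and the device calls are placeholders, not part of the framework itself.

def spring_force(x, x_anchor, v, k=200.0, b=2.0):
    # Restoring force for a sketched spring anchored at x_anchor (SI units),
    # with a small damping term for rendering stability.
    return -k * (x - x_anchor) - b * v

# inside a (hypothetical) 1 kHz haptic loop:
# x, v = device.read_position_velocity()
# device.apply_force(spring_force(x, x_anchor=0.05, v=v))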

Lower-limb exoskeletons are a promising option for increasing the mobility of persons with leg impairments in the near future. However, it is still challenging for them to ensure the stability and agility needed to face obstacles, particularly the variety found in urban environments. That is why most lower-limb exoskeletons must be used with crutches: the stability and agility requirements are deferred to the patient. Clinical experience shows that the use of crutches not only leads to shoulder pain and exhaustion but also fully occupies the hands for daily tasks. In November 2020, Wandercraft presented Atalante Evolution, the first self-stabilized and crutch-less exoskeleton, at the powered exoskeleton race of the Cybathlon 2020 Global Edition. The Cybathlon aims to promote research and development in powered assistive technology to the public, unlike the Paralympics, where only participants with unpowered assistive technology are allowed. The race is designed to represent the challenges a person could face every day in their environment: climbing stairs, walking over rough terrain, or descending ramps. Atalante Evolution is a 12-degree-of-freedom exoskeleton capable of moving dynamically with a person with complete paraplegia. The challenge of this competition is to generate and execute new dynamic motions in a short time in order to achieve the different tasks. In this paper, we present an overview of the Atalante Evolution system and of our framework for dynamic trajectory generation based on the direct collocation method. Next, the flexibility and efficiency of the dynamic motion generation framework are demonstrated through the tools we developed for generating the wide variety of stable motions required by the competition. A smartphone application was developed to allow the pilot to choose between different modes and to control the motion direction according to the actual situation in order to reach a destination. The advanced mechatronic design and the active cooperation of the pilot with the device are also highlighted. As a result, Atalante Evolution allowed the pilot to complete four out of six obstacles without crutches. Our developments led to stable dynamic movements of the exoskeleton, hands-free walking, more natural stand-up and turning motions, and consequently a better physical condition of the pilot after the race compared with the other competitors. The versatility and good results of these developments give hope that exoskeletons will soon be able to move through challenging everyday environments, allowing patients to live a normal life in complete autonomy.
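
To give a flavor of direct collocation, the toy transcription below optimizes a rest-to-rest motion of a double integrator with trapezoidal defect constraints and a generic NLP solver; it illustrates the method only and has nothing to do with Wandercraft's exoskeleton model or tooling.

import numpy as np
from scipy.optimize import minimize

N, T = 20, 1.0          # knot points and horizon
h = T / (N - 1)

def unpack(z):
    x = z[:2 * N].reshape(N, 2)   # states at each knot: position, velocity
    u = z[2 * N:]                 # control (acceleration) at each knot
    return x, u

def objective(z):
    _, u = unpack(z)
    return h * np.sum(u ** 2)     # minimize control effort

def defects(z):
    # Trapezoidal collocation: the state increment must match the average dynamics.
    x, u = unpack(z)
    d = []
    for k in range(N - 1):
        f_k = np.array([x[k, 1], u[k]])
        f_k1 = np.array([x[k + 1, 1], u[k + 1]])
        d.append(x[k + 1] - x[k] - 0.5 * h * (f_k + f_k1))
    return np.concatenate(d)

def boundary(z):
    x, _ = unpack(z)
    return np.concatenate([x[0] - [0.0, 0.0], x[-1] - [1.0, 0.0]])  # rest to rest

res = minimize(objective, np.zeros(3 * N), method="SLSQP",
               constraints=[{"type": "eq", "fun": defects},
                            {"type": "eq", "fun": boundary}])
x_opt, u_opt = unpack(res.x)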



Each of the DARPA Subterranean Challenge teams is allowed to bring up to 20 people to the Louisville Mega Cavern for the final event. Of those 20 people, only five can accompany the robots to the course staging area to set up the robots. And of those five, just one person can be what DARPA calls the Human Supervisor.

The Human Supervisor role, which most teams refer to as Robot Operator, is the only person allowed to interface with the robots while they're on the course. Or, it's probably more accurate to say that the team's base station computer is the only thing allowed to interface with robots on the course, and the human operator is the only person allowed to use the base station. The operator can talk to their teammates at the staging area, but that's about it—the rest of the team can't even look at the base station screens.

Robot operator is a unique job that can be different for each team, depending on what kinds of robots that team has deployed, how autonomous those robots are, and what strategy the team is using during the competition. On the second day of the SubT preliminary competition, we talked with robot operators from all eight Systems Track teams to learn more about their robots, exactly what they do during the competition runs, and their approach to autonomy.

"DARPA is interested in approaches that are highly autonomous without the need for substantive human interventions; capable of remotely mapping and/or navigating complex and dynamic terrain; and able to operate with degraded and unreliable communication links. The team is permitted to have a single Human Supervisor at a Base Station… The Human Supervisor is permitted to view, access, and/or analyze both course data and status data. Only the Human Supervisor is permitted to use wireless communications with the systems during the competition run."

DARPA's idea here is that most of the robots competing in SubT will be mostly autonomous most of the time, hence their use of "supervisor" rather than "operator." Requiring substantial human-in-the-loop-ness is problematic for a couple of reasons—first, direct supervision requires constant communication, and we've seen how problematic communication can be on the SubT course. And second, operation means the need for a skilled and experienced operator, which is fine if you're a SubT team that's been practicing for years but could be impractical for a system of robots that's being deployed operationally.

So how are teams making the robot operator role work, and how close are they to being robot supervisors instead? I went around the team garages on the second day of preliminary runs and asked each team operator the same three questions about their roles. I also asked the operators, "What is one question I should ask the next operator I talk to?" and added this as a bonus question, with each operator answering a question suggested by a different team operator.

Team Robotika
Robot Operator: Martin Dlouhy

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

This is the third time we've participated in a SubT event; we've tried various robots, small ones, bigger ones, but for us, these two robots seem to be optimal. Because we are flying from Czech Republic, the robots have to fit in our checked luggage. We also don't have the smaller robots or the drones that we had because like three weeks ago, we didn't even know if we would be allowed to enter the United States. So this is optimal for what we can bring to the competition, and we would like to demonstrate that we can do something with a simple solution.

Once your team of robots is on the course, what do you do during the run?

We have two robots, so it's easier than for some other teams. When the robots are in network range, I have some small tools to locally analyze data to help find artifacts that are hard for the robots to see, like the cellphone or the gas source. If everything goes fine, I basically don't have to be there. We've been more successful in the Virtual SubT competition because over half our team are software developers. We've really pushed hard to make the Virtual and System software as close as possible, and in Virtual, it's fully autonomous from beginning to end. There's one step that I do manually as operator—the robots have neural networks to recognize artifacts, but it's on me to click confirm to submit the artifact reports to DARPA.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible?

I would actually like an operator-less solution, and we could run it, but it's still useful to have a human operator—it's safer for the robot, because it's obvious to a human when the robot is not doing well.

Bonus operator question: What are the lowest and highest level decisions you have to make?

The lowest level is, I open the code and change it on the fly. I did it yesterday to change some of the safety parameters. I do this all the time, it's normal. The highest level is asking the team, "guys, how are we going to run our robots today."

Team MARBLE
Robot Operator: Dan Riley

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

We've been using the Huskies [wheeled robots] since the beginning of the competition, it's a reliable platform with a lot of terrain capability. It's a workhorse that can do a lot of stuff. We were also using a tank-like robot at one time, but we had traversability issues so we decided to drop that one for this competition. We also had UAVs, because there's a lot of value in not having to worry about the ground while getting to areas that you can't get to with a ground robot, but unfortunately we had to drop that too because of the number of people and time that we had. We decided to focus on what we knew we could do well, and make sure that our baseline system was super solid. And we added the Spot robots within the last two months mostly to access areas that the Huskies can't, like going up and down stairs and tricky terrain. It's fast, and we really like it.

Our team of robots is closely related to our deployment strategy. The way our planner and multi-robot coordination works is that the first robot really just plows through the course looking for big frontiers and new areas, and then subsequent robots will fill in the space behind looking for more detail. So we deploy the Spots first to push the environment since they're faster than the Huskies, and the Huskies will follow along and fill in the communications network.

We know we don't want to run five robots tomorrow. Before we got here, we saw the huge cavern and thought that running more robots would be better. But based on the first couple runs, we now know that the space inside is much smaller, so we think four robots is good.

Once your team of robots is on the course, what do you do during the run?

The main thing I'm watching for is artifact reports from robots. While I'm waiting for artifact reports, I'm monitoring where the robots are going, and mainly I want to see them going to new areas. If I see them backtracking or going where another robot has explored already, I have the ability to send them new goal points in another area. When I get an artifact report, I look at the image to verify that it's a good report. For objects that may not be visible, like the cell phone [which has to be detected through the wireless signal it emits], if it's early in the mission I'll generally wait and see if I get any other reports from another robot on it. The localization isn't great on those artifacts, so once I do submit, if it doesn't score, I have to look around to find an area where it might be. For instance, we found this giant room with lots of shelves and stuff, and that's a great place to put a cell phone, and sure enough, that's where the cell phone was.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible?

We pride ourselves on our autonomy. From the very beginning, that was our goal, and actually in earlier competitions I had very little control over the robot, I could not even send it a goal point. All I was getting was reports—it was a one-way street of information. I might have been able to stop the robot, but that was about it. Later on, we added the goal point capability and an option to drive the robot if I need to take over to get it out of a situation.

I'm actually the lead for our Virtual Track team as well, and that's already decision-free. We're running the exact same software stack on our robots, and the only difference is that the virtual system also does artifact reporting. Honestly, I'd say that we're more effective having the human be able to make some decisions, but the exact same system works pretty well without having any human at all.

Bonus operator question: How much sleep did you get last night?

I got eight hours, and I could have had more, except I sat around watching TV for a while. We stressed ourselves out a lot during the first two competitions, and we had so many problems. It was horrible, so we said, "we're not doing that again!" A lot of our problems started with the setup and launching phase, just getting the robots started up and ready to go and out of the gate. So we spent a ton of time making sure that our startup procedures were all automated. And when you're able to start up easily, things just go well.

Team Explorer
Robot Operator: Chao Cao

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

We tried to diversify our robots for the different kinds of environments in the challenge. We have wheeled vehicles, aerial vehicles, and legged vehicles (Spot robots). Our wheeled vehicles are different sizes; two are relatively big and one is smaller, and two are articulated in the middle to give them better mobility performance in rough terrain. Our smaller drones can be launched from the bigger ground robots, and we have a larger drone with better battery life and more payload.

In total, there are 11 robots, which is quite a lot to be managed by a single human operator under a constrained time limit, but if we manage those robots well, we can explore quite a large three dimensional area.

Once your team of robots is on the course, what do you do during the run?

Most of the time, to be honest, it's like playing a video game. It's about allocating resources to gain rewards (which in this case are artifacts) by getting the robots spread out to maximize coverage of the course. I'm monitoring the status of the robots, where they're at, and what they're doing. Most of the time I rely on the autonomy of the robots, including for exploration, coordination between multiple robots, and detecting artifacts. But there are still times when the robots might need my help, for example yesterday one of the bigger robots got itself stuck in the cave branch but I was able to intervene and get it to drive out.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible?

Humans have a semantic understanding of the environment. Just by looking at a camera image, I can predict what an environment will be like and how risky it will be, but robots don't have that kind of higher level decision capability. So I might want a specific kind of robot to go into a specific kind of environment based on what I see, and I can redirect robots to go into areas that are a better fit for them. For me as an operator, at least from my personal experience, I think it's still quite challenging for robots to perform this kind of semantic understanding, and I still have to make those decisions.

Bonus operator question: What is your flow for decision making?

Before each run, we'll have a discussion among all the team members to figure out a rough game plan, including a deployment sequence—which robots go first, should the drones be launched from the ground vehicles or from the staging area. During the run, things are changing, and I have to make decisions based on the environment. I'll talk to the pit crew about what I can see through the base station, and then I'll make an initial proposal based on my instincts for what I think we should do. But I'm very focused during the run and have a lot of tasks to do, so my teammates will think about time constraints and how conservative we want to be and where other robots are because I can't think through all of those possibilities, and then they'll give me feedback. Usually this back and forth is quick and smooth.

The Robot Operator is the only person allowed to interface with the robots while they're on the course—the operator pretty much controls the entire run by themselves. Photo: DARPA

Team CTU-CRAS-NORLAB
Robot Operator: Vojtech Salnsky

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

We chose many different platforms. We have some tracked robots, wheeled robots, Spot robots, and some other experimental UGVs [small hexapods and one big hexapod], and every UGV has a different ability to traverse terrain, and we are trying to cover all possible locomotion types to be able to traverse anything on the course. Besides the UGVs, we're using UAVs as well that are able to go through both narrow corridors and bigger spaces.

We brought a large number of robots, but the number that we're using, about ten, is enough to be able to explore a large part of the environment. Deploying more would be really hard for the pit crew of only five people, and there isn't enough space for more robots.

Once your team of robots is on the course, what do you do during the run?

It differs run by run, but the robots are mostly autonomous, so they decide where to go and I'm looking for artifact detections uploaded by the robots and approving or disapproving them. If I see that a robot is stuck somewhere, I can help it decide where to go. If it looks like a robot may lose communications, I can move some robots to make a chain from other robots to extend our network. I can do high level direction for exploration, but I don't have to—the robots are updating their maps and making decisions to best explore the whole environment.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible?

Terrain assessment is subtle. At a higher level, the operator has to decide where to send a walking robot and where to send a rolling robot. It's tiny details on the ground and a feeling about the environment that help the operator make those decisions, and that is not done autonomously.

Bonus operator question: How much bandwidth do you have?

I'm on the edge. I have a map, I have some subsampled images, I have detections, I have topological maps, but it would be better to have everything in 4K and dense point clouds.

Team CSIRO Data61
Robot Operator: Brendan Tidd

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

We've got three robot types that are here today—Spot legged robots, big tracked robots called Titans, and drones. The legged ones have been pretty amazing, especially for urban environments with narrow stairs and doorways. The tracked robots are really good in the tricky terrain of cave environments. And the drones can obviously add situational awareness from higher altitudes and detect those high artifacts.

Once your team of robots is on the course, what do you do during the run?

We use the term "operator" but I'm actually supervising. Our robots are all autonomous, they all know how to divide and conquer, they're all going to optimize exploring for depth, trying to split up where they can and not get in each other's way. In particular the Spots and the Titans have a special relationship where the Titan will give way to the Spot if they ever cross paths, for obvious reasons. So my role during the run is to coordinate node placement, that's something that we haven't automated—we've got a lot of information that comes back that I use to decide on good places to put nodes, and probably the next step is to automate that process. I also decide where to launch the drone. The launch itself is one click, but it still requires me to know where a good place is. If everything goes right, in general the robots will just do their thing.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible?

The node drop thing is vital, but I think it's quite a complex thing to automate because there are so many different aspects to consider. The node mesh is very dynamic, it's affected by all the robots that are around it and obviously by the environment. Similarly, the drone launch, but that requires the robots to know when it's worth it to launch a drone. So those two things, but also pushing on the nav stack to make sure it can handle the crazy stuff. And I guess the other side is the detection. It's not a trivial thing knowing what's a false positive or not, that's a hard thing to automate.

Bonus operator question: How stressed are you, knowing that it's just you controlling all the robots during the run?

Coping with that is a thing! I've got music playing when I'm operating, I actually play in a metal band and we get on stage sometimes and the feeling is very similar, so it's really helpful to have the music there. But also the team, you know? I'm confident in our system, and if I wasn't, that would really affect my mental state. But we test a lot, and all that preparedness helps with the stress.

Team CoSTAR
Robot Operator: Kyohei Otsu

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

We have wheeled vehicles, legged vehicles, and aerial drones, so we can cover many terrains, handle stairs, and fly over obstacles. We picked three completely different mobility systems to be able to use many different strategies. The robots can autonomously adjust their roles by themselves; some explore, some help with communication for other robots. The number of robots we use depends on the environment—yesterday we deployed seven robots onto the course because we assumed that the environment would be huge, but it's a bit smaller than we expected, so we'll adapt our number to fit that environment.

Once your team of robots is on the course, what do you do during the run?

Our robots are autonomous, and I think we have very good autonomy software. During setup the robots need some operator attention; I have to make sure that everything is working including sensors, mobility systems, and all the algorithms. But after that, once I send the robot into the course, I totally forget about it and focus on another robot. Sometimes I intervene to better distribute our team of robots—that's something that a human is good at, using prior knowledge to understand the environment. And I look at artifact reports, that's most of my job.

In the first phases of the Subterranean Challenge, we were getting low level information from the robots and sometimes using low level commands. But as the project proceeded and our technology matured, we found that it was too difficult for the operator, so we added functionality for the robot to make all of those low level decisions, and the operator just deals with high level decisions.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible? [answered by CoSTAR co-Team Lead Joel Burdick]

Two things: the system reports that it thinks it found an artifact, and the operator has to confirm yes or no. He has to also confirm that the location seems right. The other thing is that our multi-robot coordination isn't as sophisticated as it could be, so the operator may have to retask robots to different areas. If we had another year, we'd be much closer to automating those things.

Bonus Operator Question: Would you prefer if your system was completely autonomous and your job was not necessary?

Yeah, I'd prefer that!

Team Coordinated Robotics
Robot Operator: Kevin Knoedler

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

The ideal mix in my mind is a fleet of small drones with lidar, but they are very hard to test, and very hard to get right. Ground vehicles aren't necessarily easier to get right, but they're easier to test, and if you can test something, you're a lot more likely to succeed. So that's really the big difference with the team of robots we have here.

Once your team of robots is on the course, what do you do during the run?

Some of the robots have an automatic search function where if they find something they report back, and what I'd like to be doing is just monitoring. But, the search function only works in larger areas. So right now the goal is for me to drive them through the narrow areas, get them into the wider areas, and let them go, but getting them to that search area is something that I mostly need to do manually one at a time.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible?

Ideally, the robots would be able to get through those narrow areas on their own. It's actually a simpler problem to solve than larger areas, it's just not where we focused our effort.

Bonus operator question: How many interfaces do you use to control your robots?

We have one computer with two monitors, one controller, and that's it.

Team CERBERUS
Robot Operator: Marco Tranzatto

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

We have a mix of legged and flying robots, supported by a rover carrying a wireless antenna. The idea is to take legged robots for harsh environments where wheel robots may not perform as well, combined with aerial scouts that can explore the environment fast to provide initial situational awareness to the operator so that I can decide where to deploy the legged machines. So the goal is to combine the legged and flying robots in a unified mission to give as much information as possible to the human operator. We also had some bigger robots, but we found them to be a bit too big for the environment that DARPA has prepared for us, so we're not going to deploy them.

Once your team of robots is on the course, what do you do during the run?

We use two main modes: one is fully autonomous on the robots, and the other one is supervised autonomy where I have an overview of what the robots are doing and can override specific actions. Based on the high level information that I can see, I can decide to control a single robot to give it a manual waypoint to reposition it to a different frontier inside the environment. I can go from high level control down to giving these single commands, but the commands are still relatively high level, like "go here and explore." Each robot has artifact scoring capabilities, and all these artifact detections are sent to the base station once the robot is in communication range, and the human operator has to say, "okay this looks like a possible artifact so I accept it" and then can submit the position either as reported by the robot or the optimized position reported by the mapping server.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible?

Each robot is autonomous by itself. But the cooperation between robots is still like… The operator has to set bounding boxes to tell each robot where to explore. The operator has a global overview, and then inside these boxes, the robots are autonomous. So I think at the moment in our pipeline, we still need a centralized human supervisor to say which robot explores in which direction. We are close to automating this, but we're not there yet.

Bonus operator question: What is one thing you would add to make your life as an operator easier?

I would like to have a more centralized way to give commands to the robots. At the moment I need to select each robot and give it a specific command. It would be very helpful to have a centralized map where I can tell a robot to say explore in a given area while considering data from a different robot. This was in our plan, but we didn't manage to deploy it yet.



Each of the DARPA Subterranean Challenge teams is allowed to bring up to 20 people to the Louisville Mega Cavern for the final event. Of those 20 people, only five can accompany the robots to the course staging area to set up the robots. And of those five, just one person can be what DARPA calls the Human Supervisor.

The Human Supervisor role, which most teams refer to as Robot Operator, is the only person allowed to interface with the robots while they're on the course. Or, it's probably more accurate to say that the team's base station computer is the only thing allowed to interface with robots on the course, and the human operator is the only person allowed to use the base station. The operator can talk to their teammates at the staging area, but that's about it—the rest of the team can't even look at the base station screens.

Robot operator is a unique job that can be different for each team, depending on what kinds of robots that team has deployed, how autonomous those robots are, and what strategy the team is using during the competition. On the second day of the SubT preliminary competition, we talked with robot operators from all eight Systems Track teams to learn more about their robots, exactly what they do during the competition runs, and their approach to autonomy.

"DARPA is interested in approaches that are highly autonomous without the need for substantive human interventions; capable of remotely mapping and/or navigating complex and dynamic terrain; and able to operate with degraded and unreliable communication links. The team is permitted to have a single Human Supervisor at a Base Station… The Human Supervisor is permitted to view, access, and/or analyze both course data and status data. Only the Human Supervisor is permitted to use wireless communications with the systems during the competition run."

DARPA's idea here is that most of the robots competing in SubT will be mostly autonomous most of the time, hence their use of "supervisor" rather than "operator." Requiring substantial human-in-the-loop-ness is problematic for a couple of reasons—first, direct supervision requires constant communication, and we've seen how problematic communication can be on the SubT course. And second, operation means the need for a skilled and experienced operator, which is fine if you're a SubT team that's been practicing for years but could be impractical for a system of robots that's being deployed operationally.

So how are teams making the robot operator role work, and how close are they to being robot supervisors instead? I went around the team garages on the second day of preliminary runs, and asked each team operator the same three questions about their roles. I also asked the operators, "What is one question I should I ask the next operator I talk to?" I added this as a bonus question, with each operator answering a question suggested by a different team operator.

Team RobotikaRobot Operator: Martin Dlouhy

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

This is the third time we've participated in a SubT event; we've tried various robots, small ones, bigger ones, but for us, these two robots seem to be optimal. Because we are flying from Czech Republic, the robots have to fit in our checked luggage. We also don't have the smaller robots or the drones that we had because like three weeks ago, we didn't even know if we would be allowed to enter the United States. So this is optimal for what we can bring to the competition, and we would like to demonstrate that we can do something with a simple solution.

Once your team of robots is on the course, what do you do during the run?

We have two robots, so it's easier than for some other teams. When the robots are in network range, I have some small tools to locally analyze data to help find artifacts that are hard for the robots to see, like the cellphone or the gas source. If everything goes fine, I basically don't have to be there. We've been more successful in the Virtual SubT competition because over half our team are software developers. We've really pushed hard to make the Virtual and System software as close as possible, and in Virtual, it's fully autonomous from beginning to end. There's one step that I do manually as operator—the robots have neural networks to recognize artifacts, but it's on me to click confirm to submit the artifact reports to DARPA.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible?

I would actually like an operator-less solution, and we could run it, but it's still useful to have a human operator—it's safer for the robot, because it's obvious to a human when the robot is not doing well.

Bonus operator question: What are the lowest and highest level decisions you have to make?

The lowest level is, I open the code and change it on the fly. I did it yesterday to change some of the safety parameters. I do this all the time, it's normal. The highest level is asking the team, "guys, how are we going to run our robots today."

Team MARBLERobot Operator: Dan Riley

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

We've been using the Huskies [wheeled robots] since the beginning of the competition, it's a reliable platform with a lot of terrain capability. It's a workhorse that can do a lot of stuff. We were also using a tank-like robot at one time, but we had traversability issues so we decided to drop that one for this competition. We also had UAVs, because there's a lot of value in not having to worry about the ground while getting to areas that you can't get to with a ground robot, but unfortunately we had to drop that too because of the number of people and time that we had. We decided to focus on what we knew we could do well, and make sure that our baseline system was super solid. And we added the Spot robots within the last two months mostly to access areas that the Huskies can't, like going up and down stairs and tricky terrain. It's fast, and we really like it.

Our team of robots is closely related to our deployment strategy. The way our planner and multi-robot coordination works is that the first robot really just plows through the course looking for big frontiers and new areas, and then subsequent robots will fill in the space behind looking for more detail. So we deploy the Spots first to push the environment since they're faster than the Huskies, and the Huskies will follow along and fill in the communications network.

We know we don't want to run five robots tomorrow. Before we got here, we saw the huge cavern and thought that running more robots would be better. But based on the first couple runs, we now know that the space inside is much smaller, so we think four robots is good.

Once your team of robots is on the course, what do you do during the run?

The main thing I'm watching for is artifact reports from robots. While I'm waiting for artifact reports, I'm monitoring where the robots are going, and mainly I want to see them going to new areas. If I see them backtracking or going where another robot has explored already, I have the ability to send them new goal points in another area. When I get an artifact report, I look at the image to verify that it's a good report. For objects that may not be visible, like the cell phone [which has to be detected through the wireless signal it emits], if it's early in the mission I'll generally wait and see if I get any other reports from another robot on it. The localization isn't great on those artifacts, so once I do submit, if it doesn't score, I have to look around to find an area where it might be. For instance, we found this giant room with lots of shelves and stuff, and that's a great place to put a cell phone, and sure enough, that's where the cell phone was.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible?

We pride ourselves on our autonomy. From the very beginning, that was our goal, and actually in earlier competitions I had very little control over the robot, I could not even send it a goal point. All I was getting was reports—it was a one-way street of information. I might have been able to stop the robot, but that was about it. Later on, we added the goal point capability and an option to drive the robot if I need to take over to get it out of a situation.

I'm actually the lead for our Virtual Track team as well, and that's already decision-free. We're running the exact same software stack on our robots, and the only difference is that the virtual system also does artifact reporting. Honestly, I'd say that we're more effective having the human be able to make some decisions, but the exact same system works pretty well without having any human at all.

Bonus operator question: How much sleep did you get last night?

I got eight hours, and I could have had more, except I sat around watching TV for a while. We stressed ourselves out a lot during the first two competitions, and we had so many problems. It was horrible, so we said, "we're not doing that again!" A lot of our problems started with the setup and launching phase, just getting the robots started up and ready to go and out of the gate. So we spent a ton of time making sure that our startup procedures were all automated. And when you're able to start up easily, things just go well.

Team ExplorerRobot Operator: Chao Cao

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

We tried to diversify our robots for the different kinds of environments in the challenge. We have wheeled vehicles, aerial vehicles, and legged vehicles (Spot robots). Our wheeled vehicles are different sizes; two are relatively big and one is smaller, and two are articulated in the middle to give them better mobility performance in rough terrain. Our smaller drones can be launched from the bigger ground robots, and we have a larger drone with better battery life and more payload.

In total, there are 11 robots, which is quite a lot to be managed by a single human operator under a constrained time limit, but if we manage those robots well, we can explore quite a large three dimensional area.

Once your team of robots is on the course, what do you do during the run?

Most of the time, to be honest, it's like playing a video game. It's about allocating resources to gain rewards (which in this case are artifacts) by getting the robots spread out to maximize coverage of the course. I'm monitoring the status of the robots, where they're at, and what they're doing. Most of the time I rely on the autonomy of the robots, including for exploration, coordination between multiple robots, and detecting artifacts. But there are still times when the robots might need my help, for example yesterday one of the bigger robots got itself stuck in the cave branch but I was able to intervene and get it to drive out.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible?

Humans have a semantic understanding of the environment. Just by looking at a camera image, I can predict what an environment will be like and how risky it will be, but robots don't have that kind of higher level decision capability. So I might want a specific kind of robot to go into a specific kind of environment based on what I see, and I can redirect robots to go into areas that are a better fit for them. For me as an operator, at least from my personal experience, I think it's still quite challenging for robots to perform this kind of semantic understanding, and I still have to make those decisions.

Bonus operator question: What is your flow for decision making?

Before each run, we'll have a discussion among all the team members to figure out a rough game plan, including a deployment sequence—which robots go first, should the drones be launched from the ground vehicles or from the staging area. During the run, things are changing, and I have to make decisions based on the environment. I'll talk to the pit crew about what I can see through the base station, and then I'll make an initial proposal based on my instincts for what I think we should do. But I'm very focused during the run and have a lot of tasks to do, so my teammates will think about time constraints and how conservative we want to be and where other robots are because I can't think through all of those possibilities, and then they'll give me feedback. Usually this back and forth is quick and smooth.

The Robot Operator is the only person allowed to interface with the robots while they're on the course—the operators pretty much controls the entire run by themselves.DARPA

Team CTU-CRAS-NORLABRobot Operator: Vojtech Salnsky

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

We chose many different platforms. We have some tracked robots, wheeled robots, Spot robots, and some other experimental UGVs [small hexapods and one big hexapod], and every UGV has a different ability to traverse terrain, and we are trying to cover all possible locomotion types to be able to traverse anything on the course. Besides the UGVs, we're using UAVs as well that are able to go through both narrow corridors and bigger spaces.

We brought a large number of robots, but the number that we're using, about ten, is enough to be able to explore a large part of the environment. Deploying more would be really hard for the pit crew of only five people, and there isn't enough space for more robots.

Once your team of robots is on the course, what do you do during the run?

It differs run by run, but the robots are mostly autonomous, so they decide where to go and I'm looking for artifact detections uploaded by the robots and approving or disapproving them. If I see that a robot is stuck somewhere, I can help it decide where to go. If it looks like a robot may lose communications, I can move other robots to form a chain that extends our network. I can give high-level direction for exploration, but I don't have to—the robots are updating their maps and making decisions to best explore the whole environment.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible?

Terrain assessment is subtle. At a higher level, the operator has to decide where to send a walking robot and where to send a rolling robot. It's tiny details on the ground and a feeling about the environment that help the operator make those decisions, and that is not done autonomously.

Bonus operator question: How much bandwidth do you have?

I'm on the edge. I have a map, I have some subsampled images, I have detections, I have topological maps, but it would be better to have everything in 4K and dense point clouds.

Team CSIRO Data61
Robot Operator: Brendan Tidd

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

We've got three robot types that are here today—Spot legged robots, big tracked robots called Titans, and drones. The legged ones have been pretty amazing, especially for urban environments with narrow stairs and doorways. The tracked robots are really good in the tricky terrain of cave environments. And the drones can obviously add situational awareness from higher altitudes and detect those high artifacts.

Once your team of robots is on the course, what do you do during the run?

We use the term "operator" but I'm actually supervising. Our robots are all autonomous, they all know how to divide and conquer, they're all going to optimize exploring for depth, trying to split up where they can and not get in each other's way. In particular the Spots and the Titans have a special relationship where the Titan will give way to the Spot if they ever cross paths, for obvious reasons. So my role during the run is to coordinate node placement, that's something that we haven't automated—we've got a lot of information that comes back that I use to decide on good places to put nodes, and probably the next step is to automate that process. I also decide where to launch the drone. The launch itself is one click, but it still requires me to know where a good place is. If everything goes right, in general the robots will just do their thing.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible?

The node drop thing is vital, but I think it's quite a complex thing to automate because there are so many different aspects to consider. The node mesh is very dynamic; it's affected by all the robots that are around it and obviously by the environment. The drone launch is similar, but it requires the robots to know when it's worth it to launch a drone. So those two things, but also pushing on the nav stack to make sure it can handle the crazy stuff. And I guess the other side is the detection. It's not trivial to know what's a false positive or not; that's a hard thing to automate.

Bonus operator question: How stressed are you, knowing that it's just you controlling all the robots during the run?

Coping with that is a thing! I've got music playing when I'm operating, I actually play in a metal band and we get on stage sometimes and the feeling is very similar, so it's really helpful to have the music there. But also the team, you know? I'm confident in our system, and if I wasn't, that would really affect my mental state. But we test a lot, and all that preparedness helps with the stress.

Team CoSTAR
Robot Operator: Kyohei Otsu

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

We have wheeled vehicles, legged vehicles, and aerial drones, so we can cover many terrains, handle stairs, and fly over obstacles. We picked three completely different mobility systems to be able to use many different strategies. The robots can autonomously adjust their roles by themselves; some explore, some help with communication for other robots. The number of robots we use depends on the environment—yesterday we deployed seven robots onto the course because we assumed that the environment would be huge, but it's a bit smaller than we expected, so we'll adapt our number to fit that environment.

Once your team of robots is on the course, what do you do during the run?

Our robots are autonomous, and I think we have very good autonomy software. During setup the robots need some operator attention; I have to make sure that everything is working including sensors, mobility systems, and all the algorithms. But after that, once I send the robot into the course, I totally forget about it and focus on another robot. Sometimes I intervene to better distribute our team of robots—that's something that a human is good at, using prior knowledge to understand the environment. And I look at artifact reports, that's most of my job.

In the first phases of the Subterranean Challenge, we were getting low level information from the robots and sometimes using low level commands. But as the project proceeded and our technology matured, we found that it was too difficult for the operator, so we added functionality for the robot to make all of those low level decisions, and the operator just deals with high level decisions.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible? [answered by CoSTAR co-Team Lead Joel Burdick]

Two things: the system reports that it thinks it found an artifact, and the operator has to confirm yes or no. He also has to confirm that the location seems right. The other thing is that our multi-robot coordination isn't as sophisticated as it could be, so the operator may have to retask robots to different areas. If we had another year, we'd be much closer to automating those things.

Bonus Operator Question: Would you prefer if your system was completely autonomous and your job was not necessary?

Yeah, I'd prefer that!

Team Coordinated Robotics
Robot Operator: Kevin Knoedler

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

The ideal mix in my mind is a fleet of small drones with lidar, but they are very hard to test, and very hard to get right. Ground vehicles aren't necessarily easier to get right, but they're easier to test, and if you can test something, you're a lot more likely to succeed. So that's really the big difference with the team of robots we have here.

Once your team of robots is on the course, what do you do during the run?

Some of the robots have an automatic search function where if they find something they report back, and what I'd like to be doing is just monitoring. But the search function only works in larger areas. So right now the goal is for me to drive them through the narrow areas, get them into the wider areas, and let them go, but getting them to those search areas is something that I mostly need to do manually, one at a time.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible?

Ideally, the robots would be able to get through those narrow areas on their own. It's actually a simpler problem to solve than larger areas, it's just not where we focused our effort.

Bonus operator question: How many interfaces do you use to control your robots?

We have one computer with two monitors, one controller, and that's it.

Team CERBERUS
Robot Operator: Marco Tranzatto

Tell me about the team of robots that you're operating and why you think it's the optimal team for exploring underground environments.

We have a mix of legged and flying robots, supported by a rover carrying a wireless antenna. The idea is to take legged robots for harsh environments where wheeled robots may not perform as well, combined with aerial scouts that can explore the environment fast to provide initial situational awareness to the operator so that I can decide where to deploy the legged machines. So the goal is to combine the legged and flying robots in a unified mission to give as much information as possible to the human operator. We also had some bigger robots, but we found them to be a bit too big for the environment that DARPA has prepared for us, so we're not going to deploy them.

Once your team of robots is on the course, what do you do during the run?

We use two main modes: one is fully autonomous on the robots, and the other one is supervised autonomy where I have an overview of what the robots are doing and can override specific actions. Based on the high level information that I can see, I can decide to control a single robot to give it a manual waypoint to reposition it to a different frontier inside the environment. I can go from high level control down to giving these single commands, but the commands are still relatively high level, like "go here and explore." Each robot has artifact scoring capabilities, and all these artifact detections are sent to the base station once the robot is in communication range, and the human operator has to say, "okay this looks like a possible artifact so I accept it" and then can submit the position either as reported by the robot or the optimized position reported by the mapping server.

What autonomous decisions would you like your robots to be able to make that they aren't currently making, and what would it take to make that possible?

Each robot is autonomous by itself, but the cooperation between robots is still limited: the operator has to set bounding boxes to tell each robot where to explore. The operator has a global overview, and then inside these boxes, the robots are autonomous. So I think at the moment in our pipeline, we still need a centralized human supervisor to say which robot explores in which direction. We are close to automating this, but we're not there yet.

Bonus operator question: What is one thing you would add to make your life as an operator easier?

I would like to have a more centralized way to give commands to the robots. At the moment I need to select each robot and give it a specific command. It would be very helpful to have a centralized map where I can tell a robot to, say, explore a given area while considering data from a different robot. This was in our plan, but we haven't managed to deploy it yet.



This article is part of our special report on AI, “The Great AI Reckoning.”

"I should probably not be standing this close," I think to myself, as the robot slowly approaches a large tree branch on the floor in front of me. It's not the size of the branch that makes me nervous—it's that the robot is operating autonomously, and that while I know what it's supposed to do, I'm not entirely sure what it will do. If everything works the way the roboticists at the U.S. Army Research Laboratory (ARL) in Adelphi, Md., expect, the robot will identify the branch, grasp it, and drag it out of the way. These folks know what they're doing, but I've spent enough time around robots that I take a small step backwards anyway.

The robot, named RoMan, for Robotic Manipulator, is about the size of a large lawn mower, with a tracked base that helps it handle most kinds of terrain. At the front, it has a squat torso equipped with cameras and depth sensors, as well as a pair of arms that were harvested from a prototype disaster-response robot originally developed at NASA's Jet Propulsion Laboratory for a DARPA robotics competition. RoMan's job today is roadway clearing, a multistep task that ARL wants the robot to complete as autonomously as possible. Instead of instructing the robot to grasp specific objects in specific ways and move them to specific places, the operators tell RoMan to "go clear a path." It's then up to the robot to make all the decisions necessary to achieve that objective.

The ability to make decisions autonomously is not just what makes robots useful, it's what makes robots robots. We value robots for their ability to sense what's going on around them, make decisions based on that information, and then take useful actions without our input. In the past, robotic decision making followed highly structured rules—if you sense this, then do that. In structured environments like factories, this works well enough. But in chaotic, unfamiliar, or poorly defined settings, reliance on rules makes robots notoriously bad at dealing with anything that could not be precisely predicted and planned for in advance.

RoMan, along with many other robots including home vacuums, drones, and autonomous cars, handles the challenges of semistructured environments through artificial neural networks—a computing approach that loosely mimics the structure of neurons in biological brains. About a decade ago, artificial neural networks began to be applied to a wide variety of semistructured data that had previously been very difficult for computers running rules-based programming (generally referred to as symbolic reasoning) to interpret. Rather than recognizing specific data structures, an artificial neural network is able to recognize data patterns, identifying novel data that are similar (but not identical) to data that the network has encountered before. Indeed, part of the appeal of artificial neural networks is that they are trained by example, by letting the network ingest annotated data and learn its own system of pattern recognition. For neural networks with multiple layers of abstraction, this technique is called deep learning.

Even though humans are typically involved in the training process, and even though artificial neural networks were inspired by the neural networks in human brains, the kind of pattern recognition a deep learning system does is fundamentally different from the way humans see the world. It's often nearly impossible to understand the relationship between the data input into the system and the interpretation of the data that the system outputs. And that difference—the "black box" opacity of deep learning—poses a potential problem for robots like RoMan and for the Army Research Lab.

In chaotic, unfamiliar, or poorly defined settings, reliance on rules makes robots notoriously bad at dealing with anything that could not be precisely predicted and planned for in advance.

This opacity means that robots that rely on deep learning have to be used carefully. A deep-learning system is good at recognizing patterns, but lacks the world understanding that a human typically uses to make decisions, which is why such systems do best when their applications are well defined and narrow in scope. "When you have well-structured inputs and outputs, and you can encapsulate your problem in that kind of relationship, I think deep learning does very well," says Tom Howard, who directs the University of Rochester's Robotics and Artificial Intelligence Laboratory and has developed natural-language interaction algorithms for RoMan and other ground robots. "The question when programming an intelligent robot is, at what practical size do those deep-learning building blocks exist?" Howard explains that when you apply deep learning to higher-level problems, the number of possible inputs becomes very large, and solving problems at that scale can be challenging. And the potential consequences of unexpected or unexplainable behavior are much more significant when that behavior is manifested through a 170-kilogram two-armed military robot.

After a couple of minutes, RoMan hasn't moved—it's still sitting there, pondering the tree branch, arms poised like a praying mantis. For the last 10 years, the Army Research Lab's Robotics Collaborative Technology Alliance (RCTA) has been working with roboticists from Carnegie Mellon University, Florida State University, General Dynamics Land Systems, JPL, MIT, QinetiQ North America, University of Central Florida, the University of Pennsylvania, and other top research institutions to develop robot autonomy for use in future ground-combat vehicles. RoMan is one part of that process.

The "go clear a path" task that RoMan is slowly thinking through is difficult for a robot because the task is so abstract. RoMan needs to identify objects that might be blocking the path, reason about the physical properties of those objects, figure out how to grasp them and what kind of manipulation technique might be best to apply (like pushing, pulling, or lifting), and then make it happen. That's a lot of steps and a lot of unknowns for a robot with a limited understanding of the world.

This limited understanding is where the ARL robots begin to differ from other robots that rely on deep learning, says Ethan Stump, chief scientist of the AI for Maneuver and Mobility program at ARL. "The Army can be called upon to operate basically anywhere in the world. We do not have a mechanism for collecting data in all the different domains in which we might be operating. We may be deployed to some unknown forest on the other side of the world, but we'll be expected to perform just as well as we would in our own backyard," he says. Most deep-learning systems function reliably only within the domains and environments in which they've been trained. Even if the domain is something like "every drivable road in San Francisco," the robot will do fine, because that's a data set that has already been collected. But, Stump says, that's not an option for the military. If an Army deep-learning system doesn't perform well, they can't simply solve the problem by collecting more data.

ARL's robots also need to have a broad awareness of what they're doing. "In a standard operations order for a mission, you have goals, constraints, a paragraph on the commander's intent—basically a narrative of the purpose of the mission—which provides contextual info that humans can interpret and gives them the structure for when they need to make decisions and when they need to improvise," Stump explains. In other words, RoMan may need to clear a path quickly, or it may need to clear a path quietly, depending on the mission's broader objectives. That's a big ask for even the most advanced robot. "I can't think of a deep-learning approach that can deal with this kind of information," Stump says.

Robots at the Army Research Lab test autonomous navigation techniques in rough terrain [top, middle] with the goal of being able to keep up with their human teammates. ARL is also developing robots with manipulation capabilities [bottom] that can interact with objects so that humans don't have to. Photos: Evan Ackerman

While I watch, RoMan is reset for a second try at branch removal. ARL's approach to autonomy is modular, where deep learning is combined with other techniques, and the robot is helping ARL figure out which tasks are appropriate for which techniques. At the moment, RoMan is testing two different ways of identifying objects from 3D sensor data: UPenn's approach is deep-learning-based, while Carnegie Mellon is using a method called perception through search, which relies on a more traditional database of 3D models. Perception through search works only if you know exactly which objects you're looking for in advance, but training is much faster since you need only a single model per object. It can also be more accurate when perception of the object is difficult—if the object is partially hidden or upside-down, for example. ARL is testing these strategies to determine which is the most versatile and effective, letting them run simultaneously and compete against each other.
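
To make the contrast concrete, here is a minimal, hypothetical sketch of the general idea behind perception through search: candidate poses of known 3D models are scored against the observed point cloud, and the best-scoring hypothesis is taken as the detection. The function names, the nearest-point scoring rule, and the externally supplied pose list are illustrative assumptions, not ARL's or Carnegie Mellon's actual implementation.

```python
# Illustrative sketch only: score how well each posed model explains the
# observed point cloud, then keep the best-scoring (model, pose) hypothesis.
import numpy as np

def score_hypothesis(model_points, observed_points, pose):
    """Average distance from each transformed model point to its nearest
    observed point (lower means the hypothesis explains the data better)."""
    R, t = pose                                   # 3x3 rotation, 3-vector translation
    transformed = model_points @ R.T + t
    dists = np.linalg.norm(
        transformed[:, None, :] - observed_points[None, :, :], axis=-1
    ).min(axis=1)
    return dists.mean()

def recognize(model_library, observed_points, candidate_poses):
    """Search over every known model and candidate pose; return the best match."""
    best_name, best_pose, best_score = None, None, np.inf
    for name, model_points in model_library.items():
        for pose in candidate_poses:
            s = score_hypothesis(model_points, observed_points, pose)
            if s < best_score:
                best_name, best_pose, best_score = name, pose, s
    return best_name, best_pose, best_score
```

The strengths and weaknesses described above fall straight out of this structure: the search needs a model for every object it might encounter, but each model is just a point cloud, and partial occlusion merely degrades the score rather than breaking a learned feature extractor.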

Perception is one of the things that deep learning tends to excel at. "The computer vision community has made crazy progress using deep learning for this stuff," says Maggie Wigness, a computer scientist at ARL. "We've had good success with some of these models that were trained in one environment generalizing to a new environment, and we intend to keep using deep learning for these sorts of tasks, because it's the state of the art."

ARL's modular approach might combine several techniques in ways that leverage their particular strengths. For example, a perception system that uses deep-learning-based vision to classify terrain could work alongside an autonomous driving system based on an approach called inverse reinforcement learning, where the model can rapidly be created or refined by observations from human soldiers. Traditional reinforcement learning optimizes a solution based on established reward functions, and is often applied when you're not necessarily sure what optimal behavior looks like. This is less of a concern for the Army, which can generally assume that well-trained humans will be nearby to show a robot the right way to do things. "When we deploy these robots, things can change very quickly," Wigness says. "So we wanted a technique where we could have a soldier intervene, and with just a few examples from a user in the field, we can update the system if we need a new behavior." A deep-learning technique would require "a lot more data and time," she says.
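
As a rough illustration of the flavor of inverse reinforcement learning described here (this is not ARL's code, and the terrain features are invented for the example), a linear cost over terrain features can be nudged so that a soldier's demonstrated path looks at least as cheap as the path the planner currently prefers:

```python
# Hedged sketch: one projection-style IRL update on a linear terrain cost.
# feature_map[cell] is an assumed feature vector for that grid cell, e.g.
# (roughness, slope, vegetation); weights are the learned per-feature costs.
import numpy as np

def feature_counts(path, feature_map):
    """Sum the feature vectors of every cell a path passes through."""
    return sum(feature_map[cell] for cell in path)

def irl_update(weights, demo_path, planner_path, feature_map, lr=0.1):
    """Shift weights so the demonstrated path becomes relatively cheaper:
    features the planner used but the demonstrator avoided get more costly."""
    grad = feature_counts(planner_path, feature_map) - feature_counts(demo_path, feature_map)
    return weights + lr * grad
```

A few such corrections from a user in the field would reshape the cost the planner optimizes against, which is the kind of fast, sample-efficient adaptation Wigness contrasts with retraining a deep network.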

It's not just data-sparse problems and fast adaptation that deep learning struggles with. There are also questions of robustness, explainability, and safety. "These questions aren't unique to the military," says Stump, "but it's especially important when we're talking about systems that may incorporate lethality." To be clear, ARL is not currently working on lethal autonomous weapons systems, but the lab is helping to lay the groundwork for autonomous systems in the U.S. military more broadly, which means considering ways in which such systems may be used in the future.

The requirements of a deep network are to a large extent misaligned with the requirements of an Army mission, and that's a problem.

Safety is an obvious priority, and yet there isn't a clear way of making a deep-learning system verifiably safe, according to Stump. "Doing deep learning with safety constraints is a major research effort. It's hard to add those constraints into the system, because you don't know where the constraints already in the system came from. So when the mission changes, or the context changes, it's hard to deal with that. It's not even a data question; it's an architecture question." ARL's modular architecture, whether it's a perception module that uses deep learning or an autonomous driving module that uses inverse reinforcement learning or something else, can form parts of a broader autonomous system that incorporates the kinds of safety and adaptability that the military requires. Other modules in the system can operate at a higher level, using different techniques that are more verifiable or explainable and that can step in to protect the overall system from adverse unpredictable behaviors. "If other information comes in and changes what we need to do, there's a hierarchy there," Stump says. "It all happens in a rational way."

Nicholas Roy, who leads the Robust Robotics Group at MIT and describes himself as "somewhat of a rabble-rouser" due to his skepticism of some of the claims made about the power of deep learning, agrees with the ARL roboticists that deep-learning approaches often can't handle the kinds of challenges that the Army has to be prepared for. "The Army is always entering new environments, and the adversary is always going to be trying to change the environment so that the training process the robots went through simply won't match what they're seeing," Roy says. "So the requirements of a deep network are to a large extent misaligned with the requirements of an Army mission, and that's a problem."

Roy, who has worked on abstract reasoning for ground robots as part of the RCTA, emphasizes that deep learning is a useful technology when applied to problems with clear functional relationships, but when you start looking at abstract concepts, it's not clear whether deep learning is a viable approach. "I'm very interested in finding how neural networks and deep learning could be assembled in a way that supports higher-level reasoning," Roy says. "I think it comes down to the notion of combining multiple low-level neural networks to express higher level concepts, and I do not believe that we understand how to do that yet." Roy gives the example of using two separate neural networks, one to detect objects that are cars and the other to detect objects that are red. It's harder to combine those two networks into one larger network that detects red cars than it would be if you were using a symbolic reasoning system based on structured rules with logical relationships. "Lots of people are working on this, but I haven't seen a real success that drives abstract reasoning of this kind."
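
Roy's example can be made concrete with a toy sketch (the detectors here are hypothetical placeholders): with symbolic rules, composing "car" and "red" is a one-line conjunction, whereas two separately trained networks can only be combined at their outputs, and there is no comparably simple way to merge them into a single network that represents "red car."

```python
# Toy illustration of the compositionality gap Roy describes; car_net and
# red_net stand in for trained classifiers and are not real models.

def is_red_car_symbolic(obj, is_car, is_red):
    # Symbolic composition: a plain logical conjunction of two predicates.
    return is_car(obj) and is_red(obj)

def is_red_car_learned(image, car_net, red_net, threshold=0.5):
    # With learned detectors we can only AND their scores at inference time;
    # producing one network that jointly encodes "red car" would require new
    # training data and retraining, not a rule.
    return car_net(image) > threshold and red_net(image) > threshold
```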

For the foreseeable future, ARL is making sure that its autonomous systems are safe and robust by keeping humans around for both higher-level reasoning and occasional low-level advice. Humans might not be directly in the loop at all times, but the idea is that humans and robots are more effective when working together as a team. When the most recent phase of the Robotics Collaborative Technology Alliance program began in 2009, Stump says, "we'd already had many years of being in Iraq and Afghanistan, where robots were often used as tools. We've been trying to figure out what we can do to transition robots from tools to acting more as teammates within the squad."

RoMan gets a little bit of help when a human supervisor points out a region of the branch where grasping might be most effective. The robot doesn't have any fundamental knowledge about what a tree branch actually is, and this lack of world knowledge (what we think of as common sense) is a fundamental problem with autonomous systems of all kinds. Having a human leverage our vast experience into a small amount of guidance can make RoMan's job much easier. And indeed, this time RoMan manages to successfully grasp the branch and noisily haul it across the room.

Turning a robot into a good teammate can be difficult, because it can be tricky to find the right amount of autonomy. Too little and it would take most or all of the focus of one human to manage one robot, which may be appropriate in special situations like explosive-ordnance disposal but is otherwise not efficient. Too much autonomy and you'd start to have issues with trust, safety, and explainability.

"I think the level that we're looking for here is for robots to operate on the level of working dogs," explains Stump. "They understand exactly what we need them to do in limited circumstances, they have a small amount of flexibility and creativity if they are faced with novel circumstances, but we don't expect them to do creative problem-solving. And if they need help, they fall back on us."

RoMan is not likely to find itself out in the field on a mission anytime soon, even as part of a team with humans. It's very much a research platform. But the software being developed for RoMan and other robots at ARL, called Adaptive Planner Parameter Learning (APPL), will likely be used first in autonomous driving, and later in more complex robotic systems that could include mobile manipulators like RoMan. APPL combines different machine-learning techniques (including inverse reinforcement learning and deep learning) arranged hierarchically underneath classical autonomous navigation systems. That allows high-level goals and constraints to be applied on top of lower-level programming. Humans can use teleoperated demonstrations, corrective interventions, and evaluative feedback to help robots adjust to new environments, while the robots can use unsupervised reinforcement learning to adjust their behavior parameters on the fly. The result is an autonomy system that can enjoy many of the benefits of machine learning, while also providing the kind of safety and explainability that the Army needs. With APPL, a learning-based system like RoMan can operate in predictable ways even under uncertainty, falling back on human tuning or human demonstration if it ends up in an environment that's too different from what it trained on.
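
One way to picture the hierarchy described for APPL (this is an interpretation of the article, not ARL's code, and every name below is an assumption) is a learned layer that proposes parameters for a classical planner, with human corrections both overriding the proposal immediately and being stored as examples for later learning:

```python
# Hedged sketch: learned parameter proposals sit on top of a classical planner,
# and human corrective interventions are captured as training examples.

class ParameterLearner:
    def __init__(self, defaults):
        self.defaults = defaults          # e.g. {"max_speed": 0.5, "margin": 0.4}
        self.corrections = []             # (context, params) pairs from humans

    def propose(self, context):
        # Placeholder policy: reuse the most recent human correction for this
        # context, otherwise fall back to safe defaults. A real system would
        # learn a mapping from context to parameters instead.
        for ctx, params in reversed(self.corrections):
            if ctx == context:
                return params
        return self.defaults

    def record_correction(self, context, params):
        self.corrections.append((context, params))

def navigate(classical_planner, learner, context, human_override=None):
    params = learner.propose(context)
    if human_override is not None:                 # corrective intervention
        params = human_override(context, params)
        learner.record_correction(context, params)
    return classical_planner(context, params)      # classical stack does the driving
```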

It's tempting to look at the rapid progress of commercial and industrial autonomous systems (autonomous cars being just one example) and wonder why the Army seems to be somewhat behind the state of the art. But as Stump finds himself having to explain to Army generals, when it comes to autonomous systems, "there are lots of hard problems, but industry's hard problems are different from the Army's hard problems." The Army doesn't have the luxury of operating its robots in structured environments with lots of data, which is why ARL has put so much effort into APPL, and into maintaining a place for humans. Going forward, humans are likely to remain a key part of the autonomous framework that ARL is developing. "That's what we're trying to build with our robotics systems," Stump says. "That's our bumper sticker: 'From tools to teammates.' "

This article appears in the October 2021 print issue as "Deep Learning Goes to Boot Camp."







Jetpacks might sound fun, but learning how to control a pair of jet engines strapped to your back is no easy feat. Now a British startup wants to simplify things by developing a jetpack with an autopilot system that makes operating it more like controlling a high-end drone than learning how to fly.

Jetpacks made the leap from sci-fi to the real world as far back as the 1960s, but since then they haven't found much use outside of gimmicky appearances in movies and halftime shows. In recent years, though, the idea has received renewed interest, and its proponents are keen to show that the technology is no longer just for stuntmen and may even have practical applications.

American firm Jetpack Aviation will teach anyone to fly its JB-10 jetpack for a cool $4,950 and recently sold its latest JB-12 model to an "undisclosed military." And an Iron Man-like, jet-powered flying suit developed by British start-up Gravity Industries has been tested as a way for marines to board ships and as a way to get medics to the top of mountains quickly.

Flying jetpacks can take a lot of training to master, though. That's what prompted Hollywood animatronics expert Matt Denton and Royal Navy Commander Antony Quinn to found Maverick Aviation and develop one that takes the complexities of flight control out of the pilot's hands.

The Maverick Jetpack features four miniature jet turbines attached to an aluminum, titanium, and carbon fiber frame, and will travel at up to 30 miles per hour. But the secret ingredient is software that automatically controls the engines to maintain a stable hover and seamlessly converts the pilot's instructions into precise movements.

"It's going to be very much like flying a drone," says Denton. "We wanted to come up with something that anyone could fly. It's all computer-controlled and you'll just be using the joystick."

One of the key challenges, says Denton, was making the engines responsive enough to allow the rapid tweaks required for flight stabilization. This is relatively simple to achieve on a drone, whose electric motors can be adjusted in the blink of an eye, but jet turbines can take several seconds to ramp up and down between zero and full power.

To get around this, the company added servos to each turbine that let them move independently to quickly alter the direction of thrust—a process known as thrust vectoring. By shifting the alignment of the four engines the flight control software can keep the jetpack perfectly positioned using feedback from inertial measurement units, GPS, altimeters and ground distance sensors. Simple directional instructions from the pilot can also be automatically translated into the required low-level tweaks to the turbines.
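
A minimal sketch of the stabilization idea described above (purely illustrative, not Maverick Aviation's controller; the gains and tilt limit are made-up numbers): because turbine thrust responds slowly, fast corrections come from tilting the engines, so a simple PD loop can map attitude error from the IMU directly to small gimbal angles.

```python
# Illustrative thrust-vectoring attitude loop: IMU attitude error in, per-axis
# gimbal tilt out. All constants are assumptions for the sketch.
import math

KP, KD = 2.0, 0.4                      # proportional and derivative gains (assumed)
MAX_GIMBAL = math.radians(10)          # assumed mechanical tilt limit

def clamp(x, limit=MAX_GIMBAL):
    return max(-limit, min(limit, x))

def gimbal_command(roll_err, pitch_err, roll_err_rate, pitch_err_rate):
    """PD law: attitude errors and their rates of change (from the IMU) become
    gimbal tilt angles in radians, saturated at the mechanical limit."""
    return (
        clamp(KP * roll_err + KD * roll_err_rate),
        clamp(KP * pitch_err + KD * pitch_err_rate),
    )
```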

It's a clever way to improve the mobility of the system, says Ben Akih-Kumgeh, an associate professor of aerospace engineering at Syracuse University. "It's not only a smart way of overcoming any lag that you may have, but it also helps with the lifespan of the engine," he adds. "[In] any mechanical system, the durability depends on how often you change the operating conditions."

The software is fairly similar to a conventional drone flight controller, says Denton, but they have had to accommodate some additional complexities. Thrust magnitude and thrust direction have to be managed by separate control loops due to their very different reaction times, but they still need to sync up seamlessly to coordinate adjustments. The entire control process is also complicated by the fact that the jetpack has a human strapped to it.
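
One plausible way to organize the two loops Denton describes (an assumption about the structure, not the company's software) is to run a slow loop for thrust magnitude and a fast loop for thrust direction, with the fast loop always reading the slow loop's latest output so the two stay coordinated:

```python
# Hedged sketch of a dual-rate controller: a 20 Hz loop sets total thrust for
# altitude (turbines ramp slowly), a 500 Hz loop sets thrust direction for
# attitude. Rates and gains are illustrative assumptions.

FAST_DT = 0.002     # fast attitude / thrust-direction loop period (s)
SLOW_DT = 0.050     # slow altitude / thrust-magnitude loop period (s)

class DualRateController:
    def __init__(self, hover_thrust=9.81):
        self.total_thrust = hover_thrust   # shared state written by the slow loop
        self._elapsed = 0.0

    def slow_step(self, altitude_err):
        # Thrust magnitude: gentle proportional correction around hover thrust.
        self.total_thrust = max(0.0, 9.81 + 0.8 * altitude_err)

    def fast_step(self, roll_err, pitch_err):
        # Thrust direction: quick corrections using the latest commanded magnitude.
        return 2.0 * roll_err, 2.0 * pitch_err, self.total_thrust

    def tick(self, altitude_err, roll_err, pitch_err, dt=FAST_DT):
        self._elapsed += dt
        if self._elapsed >= SLOW_DT:       # slow loop fires once per ~25 fast ticks
            self.slow_step(altitude_err)
            self._elapsed -= SLOW_DT
        return self.fast_step(roll_err, pitch_err)
```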

"Once you've got a shifting payload, like a person who's wobbling their arms around and moving their legs, then it does become a much more complex problem," says Denton.

In the long run, says Denton, the company hopes to add higher-level functions that could allow the jetpack to move automatically between points marked on a map. The hope is that by automating as much of the flight control as possible, users will be able to focus on the task at hand, whether that's fixing a wind turbine or inspecting a construction site.

Surrendering so much control to a computer might give some pause for thought, but Denton says there will be plenty of redundancy built in. "The idea will be that we'll have plenty of fallback modes where, if part of the system fails, it'll fall back to a more manual flight mode," he said. "The user would have training to basically tackle any of those conditions."

It might be some time before you can start basic training, though, as the company has yet to fly its turbine-powered jetpack. Currently, flight testing is being conducted on a scaled-down model powered by electric ducted fans, says Denton, though their responsiveness has been deliberately dulled so they behave like turbines. The company is hoping to conduct the first human test flights next summer.

Don't get your hopes up about commuting to work by jetpack any time soon, though, says Akih-Kumgeh. The huge amount of noise these devices produce makes it unlikely that they would be allowed to operate within city limits. The near-term applications are more likely to be search and rescue missions, where time and speed trump efficiency, he says.



Jetpacks might sound fun, but learning how to control a pair of jet engines strapped to your back is no easy feat. Now a British startup wants to simplify things by developing a jetpack with an autopilot system that makes operating it more like controlling a high-end drone than learning how to fly.

Jetpacks made the leap from sci-fi to the real world as far back as the 1960s, but since then the they haven't found much use outside of gimmicky appearances in movies and halftime shows. In recent years though, the idea has received renewed interest. And its proponents are keen to show that the technology is no longer just for stuntmen and may even have practical applications.

American firm Jetpack Aviation will teach anyone to fly its JB-10 jetpack for a cool $4,950 and recently sold its latest JB-12 model to an "undisclosed military." And an Iron Man-like, jet-powered flying suit developed by British start-up Gravity Industries has been tested as a way for marines to board ships and as a way to get medics to the top of mountains quickly.

Flying jetpacks can take a lot of training to master though. That's what prompted Hollywood animatronics expert Matt Denton and Royal Navy Commander Antony Quinn to found Maverick Aviation, and develop one that takes the complexities of flight control out the pilot's hands.

The Maverick Jetpack features four miniature jet turbines attached to an aluminum, titanium and carbon fiber frame, and will travel at up to 30 miles per hour. But the secret ingredient is software that automatically controls the engines to maintain a stable hover, and seamlessly convert the pilot's instructions into precise movements.

"It's going to be very much like flying a drone," says Denton. "We wanted to come up with something that anyone could fly. It's all computer-controlled and you'll just be using the joystick."

One of the key challenges, says Denton, was making the engines responsive enough to allow the rapid tweaks required for flight stabilization. This is relatively simple to achieve on a drone, whose electric motors can be adjusted in a blink of an eye, but jet turbines can take several seconds to ramp up and down between zero and full power.

To get around this, the company added servos to each turbine that let them move independently to quickly alter the direction of thrust—a process known as thrust vectoring. By shifting the alignment of the four engines the flight control software can keep the jetpack perfectly positioned using feedback from inertial measurement units, GPS, altimeters and ground distance sensors. Simple directional instructions from the pilot can also be automatically translated into the required low-level tweaks to the turbines.

It's a clever way to improve the mobility of the system, says Ben Akih-Kumgeh, an associate professor of aerospace engineering at Syracuse University. "It's not only a smart way of overcoming any lag that you may have, but it also helps with the lifespan of the engine," he adds. “[In] any mechanical system, the durability depends on how often you change the operating conditions."

The software is fairly similar to a conventional drone flight controller, says Denton, but they have had to accommodate some additional complexities. Thrust magnitude and thrust direction have to be managed by separate control loops due to their very different reaction times, but they still need to sync up seamlessly to coordinate adjustments. The entire control process is also complicated by the fact that the jetpack has a human strapped to it.
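
A toy sketch of that split, with the rates and error dynamics made up purely for illustration: the thrust-direction loop acts on every fast tick, while the thrust-magnitude loop only updates on the slower schedule the turbines can follow.

    FAST_DT = 0.002   # assumed 500 Hz thrust-direction (vectoring) loop
    SLOW_EVERY = 25   # thrust-magnitude loop every 25 fast ticks (~20 Hz, assumed)

    attitude_error = 0.20   # toy state: radians of unwanted tilt
    altitude_error = 1.00   # toy state: metres of altitude to recover

    for tick in range(250):                # half a second of flight at the fast rate
        attitude_error *= 0.98             # fast loop: vectoring bleeds off tilt quickly
        if tick % SLOW_EVERY == 0:
            altitude_error *= 0.90         # slow loop: turbine thrust trimmed gradually

    print(f"after 0.5 s: tilt {attitude_error:.4f} rad, altitude {altitude_error:.2f} m")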

"Once you've got a shifting payload, like a person who's wobbling their arms around and moving their legs, then it does become a much more complex problem," says Denton.

In the long run, says Denton, the company hopes to add higher-level functions that could allow the jetpack to move automatically between points marked on a map. The hope is that by automating as much of the flight control as possible, users will be able to focus on the task at hand, whether that's fixing a wind turbine or inspecting a construction site.

Surrendering so much control to a computer might give some pause for thought, but Denton says there will be plenty of redundancy built in. "The idea will be that we'll have plenty of fallback modes where, if part of the system fails, it'll fall back to a more manual flight mode," he says. "The user would have training to basically tackle any of those conditions."

It might be some time before you can start basic training, though, as the company has yet to fly its turbine-powered jetpack. Currently, flight testing is being conducted on a scaled-down model powered by electric ducted fans, says Denton, though their responsiveness has been deliberately dulled so that they behave like turbines. The company hopes to conduct the first human test flights next summer.

Don't get your hopes up about commuting to work by jetpack any time soon, though, says Akih-Kumgeh. The huge amount of noise these devices produce makes it unlikely that they would be allowed to operate within city limits. The near-term applications are more likely to be search and rescue missions, where time and speed trump efficiency, he says.

Social Robots are coming. They are being designed to enter our lives and help in everything from childrearing to elderly care, from household chores to personal therapy, and the list goes on. There is great promise that these machines will further the progress that their predecessors achieved, enhancing our lives and relieving us of the many tasks with which we would rather not be occupied. But there is a dilemma. On the one hand, these machines are just that, machines. Accordingly, some thinkers propose that we maintain this perspective and relate to Social Robots as “tools”. Yet, in treating them as such, it is argued, we deny our own natural empathy, ultimately inculcating vicious as opposed to virtuous dispositions. Many thinkers thus apply Kant’s approach to animals—“he who is cruel to animals becomes hard also in his dealings with men”—contending that we must not maltreat robots lest we maltreat humans. On the other hand, because we innately anthropomorphize entities that behave with autonomy and mobility (let alone entities that exhibit beliefs, desires and intentions), we become emotionally entangled with them. Some thinkers actually encourage such relationships. But there are problems here also. For starters, many maintain that it is imprudent to have “empty,” unidirectional relationships, for we will then fail to appreciate authentic reciprocal relationships. Furthermore, such relationships can lead to our being manipulated, to our shunning of real human interactions as “messy,” to our incorrectly allocating resources away from humans, and more. In this article, I review the various positions on this issue and propose an approach that I believe sits in the middle ground between the one extreme of treating Social Robots as mere machines and the other extreme of accepting Social Robots as having human-like status. I call the approach “The Virtuous Servant Owner” and base it on the virtue ethics of the medieval Jewish philosopher Maimonides.

Recently, advancements in computational machinery have facilitated the integration of artificial intelligence (AI) into almost every field and industry. This fast-paced development in AI and sensing technologies has stirred an evolution in the realm of robotics. Concurrently, augmented reality (AR) applications are providing solutions to a myriad of robotics applications, such as demystifying robot motion intent and supporting intuitive control and feedback. In this paper, research papers combining the potential of AI and AR in robotics over the last decade are presented and systematically reviewed. Four sources for data collection were utilized: Google Scholar, the Scopus database, the proceedings of the International Conference on Robotics and Automation 2020, and the references and citations of all identified papers. A total of 29 papers were analyzed from two perspectives: a theme-based perspective showcasing the relation between AR and AI, and an application-based analysis highlighting how the robotics application was affected. These two sections are further categorized based on the type of robotics platform and the type of robotics application, respectively. We analyze the work done and highlight some of the prevailing limitations hindering the field. Results also explain how AR and AI can be combined to solve the model-mismatch paradigm by creating a closed feedback loop between the user and the robot. This forms a solid base for increasing the efficiency of the robotic application and enhancing the user’s situational awareness, safety, and acceptance of AI robots. Our findings affirm the promising future for robust integration of AR and AI in numerous robotic applications.
