Robotics applications that require the execution of complex tasks in real-world scenarios still face many challenges related to highly unstructured and dynamic environments, in domains such as emergency response and search and rescue, where robots must operate for prolonged periods while trading off computational performance against power autonomy. In particular, there is a crucial need for robots capable of adapting to such settings while providing robustness and extended power autonomy. A possible approach to reconciling the conflicting demands of a computationally capable system and long power autonomy is cloud robotics, which can boost the computational capabilities of the robot while reducing energy consumption by offloading resources to the cloud. Nevertheless, the communication constraints typical of field robotics, namely limited bandwidth, latency, and intermittent connectivity, make cloud-enabled robotics solutions challenging to deploy in real-world applications. In this context, we designed and realized the XBot2D software architecture, which provides a hybrid cloud manager capable of dynamically and seamlessly allocating robotics skills for distributed computation based on the current network conditions, the required latency, and the computational and energy resources of the robot in use. The proposed framework leverages the two dimensions, i.e., 2D (local and cloud), transparently for the user, providing support for Real-Time (RT) skill execution on the local robot as well as machine learning and A.I. resources on the cloud, with the possibility of automatically relocating the above based on the required performance and communication quality. The XBot2D implementation and its functionalities are presented and validated in realistic tasks involving the CENTAURO robot and the Amazon Web Services Elastic Compute Cloud (AWS EC2) infrastructure under different network conditions.
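As a rough illustration of the kind of decision such a hybrid cloud manager makes, here is a minimal Python sketch of latency- and bandwidth-aware skill placement. The class names, thresholds, and placement policy are illustrative assumptions, not XBot2D's actual API.

```python
from dataclasses import dataclass

@dataclass
class NetworkState:
    latency_ms: float      # measured round-trip time to the cloud endpoint
    bandwidth_mbps: float  # currently available uplink bandwidth

@dataclass
class Skill:
    name: str
    max_latency_ms: float  # deadline the skill must meet (RT skills: a few ms)
    payload_mbps: float    # data rate the skill would need to stream to the cloud
    realtime: bool         # hard real-time skills must stay on the robot

def place_skill(skill: Skill, net: NetworkState) -> str:
    """Decide where to run a skill given current network conditions.

    Hypothetical policy: real-time skills always run locally; everything else
    is offloaded only if the network can meet the skill's deadline and
    bandwidth requirement, which saves onboard compute and energy.
    """
    if skill.realtime:
        return "local"
    if net.latency_ms < skill.max_latency_ms and net.bandwidth_mbps > skill.payload_mbps:
        return "cloud"
    return "local"

# Example: a learned object detector tolerates 150 ms; a low-level controller does not.
net = NetworkState(latency_ms=60.0, bandwidth_mbps=20.0)
print(place_skill(Skill("object_detection", 150.0, 8.0, realtime=False), net))  # -> cloud
print(place_skill(Skill("whole_body_control", 2.0, 1.0, realtime=True), net))   # -> local
```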
Episode 3: Drones That Can Fly Better Than You Can
Evan Ackerman: I’m Evan Ackerman, and welcome to Chatbot, a new podcast from IEEE Spectrum where robotics experts interview each other about things that they find fascinating. On this episode of Chatbot, we’ll be talking with Davide Scaramuzza and Adam Bry about agile autonomous drones. Adam Bry is the CEO of Skydio, a company that makes consumer camera drones with an astonishing amount of skill at autonomous tracking and obstacle avoidance. The foundation for Skydio’s drones can be traced back to Adam’s work on autonomous agile drones at MIT, and after spending a few years at Google working on Project Wing’s delivery drones, Adam cofounded Skydio in 2014. Skydio is currently on their third generation of consumer drones, and earlier this year, the company brought on three PhD students from Davide’s lab to expand their autonomy team. Davide Scaramuzza directs the Robotics and Perception Group at the University of Zürich. His lab is best known for developing extremely agile drones that can autonomously navigate through complex environments at very high speeds. Faster, it turns out, than even the best human drone racing champions. Davide’s drones rely primarily on computer vision, and he’s also been exploring potential drone applications for a special kind of camera called an event camera, which is ideal for fast motion under challenging lighting conditions. So Davide, you’ve been doing drone research for a long time now, like a decade, at least, if not more.
Davide Scaramuzza: Since 2009. 15 years.
Ackerman: So what still fascinates you about drones after so long?
Scaramuzza: So what fascinates me about drones is their freedom. So that was the reason why I decided, back then in 2009, to actually move from ground robots—I was working at the time on self-driving cars—to drones. And actually, the trigger was when Google announced the self-driving car project, and then for me and many researchers, it was clear that actually many things were now transitioning from academia to industry, and so we had to come up with new ideas and things. And then with my PhD adviser at that time [inaudible] we realized, actually, that drones, especially quadcopters, were just coming out, but they were all remote controlled or they were actually using GPS. And so then we said, “What about flying drones autonomously, but with the onboard cameras?” And this had never been done until then. But what fascinates me about drones is the fact that, actually, they can overcome obstacles on the ground very quickly, and especially, this can be very useful for many applications that matter to us all today, like, first of all, search and rescue, but also other things like inspection of difficult infrastructures like bridges, power [inaudible] oil platforms, and so on.
Ackerman: And Adam, your drones are doing some of these things, many of these things. And of course, I am fascinated by drones and by what your drone is able to do, but I’m curious. When you introduce it to people who have maybe never seen it, how do you describe, I guess, almost the magic of what it can do?
Adam Bry: So the way that we think about it is pretty simple. Our basic goal is to build in the skills of an expert pilot into the drone itself, which involves a little bit of hardware. It means we need sensors that see everything in every direction and we need a powerful computer on board, but is mostly a software problem. And it becomes quite application-specific. So for consumers, for example, our drones can follow and film moving subjects and avoid obstacles and create this incredibly compelling dynamic footage. And the goal there is really what would happen if you had the world’s best drone pilot flying that thing, trying to film something in an interesting, compelling way. We want to make that available to anybody using one of our products, even if they’re not an expert pilot, and even if they’re not at the controls when it’s flying itself. So you can just put it in your hand, tell it to take off, it’ll turn around and start tracking you, and then you can do whatever else you want to do, and the drone takes care of the rest. In the industrial world, it’s entirely different. So for inspection applications, say, for a bridge, you just tell the drone, “Here’s the structure or scene that I care about,” and then we have a product called 3D Scan that will automatically explore it, build a real-time 3D map, and then use that map to take high-resolution photos of the entire structure.
And to follow on a bit to what Davide was saying, I mean, I think if you sort of abstract away a bit and think about what capability do drones offer, thinking about camera drones, it’s basically you can put an image sensor or, really, any kind of sensor anywhere you want, any time you want, and then the extra thing that we’re bringing in is without needing to have a person there to control it. And I think the combination of all those things together is transformative, and we’re seeing the impact of that in a lot of these applications today, but I think that that really— realizing the full potential is a 10-, 20-year kind of project.
Ackerman: It’s interesting when you talk about the way that we can think about the Skydio drone is like having an expert drone pilot to fly this thing, because there’s so much skill involved. And Davide, I know that you’ve been working on very high-performance drones that can maybe challenge even some of these expert pilots in performance. And I’m curious, when expert drone pilots come in and see what your drones can do autonomously for the first time, is it scary for them? Are they just excited? How do they react?
Scaramuzza: First of all, actually, they say, “Wow.” So they can not believe what they see. But then they get super excited, but at the same time, nervous. So we started working on autonomous drone racing five years ago, but in the first three years, we have been flying very slowly, like three meters per second. So they were really snails. But then in the last two years is when actually we started really pushing the limits, both in control and planning and perception. So these are our most recent drone, by the way. And now we can really fly at the same level of agility as humans. Not yet at the level to beat human, but we are very, very close. So we started the collaboration with Marvin, who is the Swiss champion, and he’s only— now he’s 16 years old. So last year he was 15 years old. So he’s a boy. And he actually was very mad at the drone. So he was super, super nervous when he saw this. So he didn’t even smile the first time. He was always saying, “I can do better. I can do better.” So actually, his reaction was quite scared. He was scared, actually, by what the drone was capable of doing, but he knew that, basically, we were using the motion capture. Now [inaudible] try to play in a fair comparison with a fair setting where both the autonomous drone and the human-piloted drone are using both onboard perceptions or egocentric vision, then things might end up differently.
Because in fact, actually, our vision-based drone, so flying with onboard vision, was quite slow. But actually now, after one year of pushing, we are at a level, actually, that we can fly a vision-based drone at the level of Marvin, and we are even a bit better than Marvin at the current moment, using only onboard vision. So we can fly— in this arena, the space allows us to go up to 72 kilometers per hour. We reached the 72 kilometers per hour, and we even beat Marvin in three consecutive laps so far. So that’s [inaudible]. But we want to now also compete against other pilots, other world champions, and see what’s going to happen.
Ackerman: Okay. That’s super impressive.
Bry: Can I jump in and ask a question?
Ackerman: Yeah, yeah, yeah.
Bry: I’m interested if you— I mean, since you’ve spent a lot of time with the expert pilots, if you learn things from the way that they think and fly, or if you just view them as a benchmark to try to beat, and the algorithms are not so much inspired by what they do.
Scaramuzza: So we did all these things. So we did it also in a scientific manner. So first, of course, we interviewed them. We asked any sort of question, what type of features are you actually focusing your attention, and so on, how much is the people around you, the supporters actually influencing you, and the hearing the other opponents actually screaming while they control [inaudible] influencing you. So there is all these psychological effects that, of course, influencing pilots during a competition. But then what we tried to do scientifically is to really understand, first of all, what is the latency of a human pilot. So there have been many studies that have been done for car racing, Formula One, back in the 80s and 90s. So basically, they put eye trackers and tried to understand— they tried to understand, basically, what is the latency between what you see until basically you act on your steering wheel. And so we tried to do the same for human pilots. So we basically installed an eye tracking device on our subjects. So we called 20 subjects from all across Switzerland, some people also from outside Switzerland, with different levels of expertise.
But they were quite good. Okay? We are not talking about median experts, but actually already very good experts. And then we would let them rehearse on the track, and then basically, we were capturing their eye gazes, and then we basically measured the time latency between changes in eye gaze and changes in throttle commands on the joystick. And we measured, and this latency was 220 milliseconds.
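As a hedged sketch of how such a lag can be estimated offline from logged signals, the snippet below cross-correlates the gaze-change signal with the throttle-change signal and reports the delay of the best alignment; the signal names, sampling rate, and method are assumptions, not necessarily the analysis pipeline used in the study.

```python
import numpy as np

def estimate_latency_ms(gaze: np.ndarray, throttle: np.ndarray, fs_hz: float) -> float:
    """Estimate the lag (ms) by which throttle changes follow gaze changes.

    Both inputs are 1-D signals sampled at fs_hz. We differentiate them to
    capture *changes*, then find the lag that maximizes the cross-correlation.
    """
    dg = np.diff(gaze)
    dth = np.diff(throttle)
    dg = (dg - dg.mean()) / (dg.std() + 1e-9)
    dth = (dth - dth.mean()) / (dth.std() + 1e-9)
    corr = np.correlate(dth, dg, mode="full")      # throttle changes vs. gaze changes
    lags = np.arange(-len(dg) + 1, len(dg))
    best = lags[np.argmax(corr)]                   # positive lag: throttle lags gaze
    return 1000.0 * best / fs_hz

# Synthetic check: a throttle signal delayed by ~220 ms at 500 Hz sampling.
fs = 500.0
t = np.arange(0, 10, 1 / fs)
gaze = np.sin(2 * np.pi * 0.7 * t) + 0.05 * np.random.randn(t.size)
throttle = np.roll(gaze, int(0.22 * fs))
print(round(estimate_latency_ms(gaze, throttle, fs)))  # ~220
```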
Ackerman: Wow. That’s high.
Scaramuzza: That includes the brain latency and the behavioral latency. So that time to send the control commands, once you process the information, the visual information to the fingers. So—
Bry: I think [crosstalk] it might just be worth, for the audience anchoring that, what’s the typical control latency for a digital control loop. It’s— I mean, I think it’s [crosstalk].
Scaramuzza: It’s typically in the— it’s typically in the order of— well, from images to control commands, usually 20 milliseconds, although we can also fly with the much higher latencies. It really depends on the speed you want to achieve. But typically, 20 milliseconds. So if you compare 20 milliseconds versus the 220 milliseconds of the human, you can already see that, eventually, the machine should beat the human. Then the other thing that you asked me was, what did we learn from human pilots? So what we learned was— interestingly, we learned that basically they were always pushing the throttle of the joystick at the maximum thrust, but actually, this is—
Bry: Because that’s very consistent with optimal control theory.
Scaramuzza: Exactly. But what we then realized, and they told us, was that it was interesting for them to observe that actually, for the AI, was better to brake earlier rather than later as the human was actually doing. And we published these results in Science Robotics last summer. And we did this actually using an algorithm that computes the time optimal trajectory from the start to the finish through all the gates, and by exploiting the full quadrotor dynamical model. So it’s really using not approximation, not point-mass model, not polynomial trajectories. The full quadrotor model, it takes a lot to compute, let me tell you. It takes like one hour or more, depending on the length of the trajectory, but it does a very good job, to a point that Gabriel Kocher, who works for the Drone Racing League, told us, “Ah, this is very interesting. So I didn’t know, actually, I can push even faster if I start braking before this gate.”
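For illustration, here is a minimal sketch of the kind of full rigid-body quadrotor model (as opposed to a point-mass approximation) that such a time-optimal planner would roll out; the mass and inertia values are placeholders, and the actual planner is considerably more involved.

```python
import numpy as np

# Placeholder parameters; a real platform's values would be identified experimentally.
m, g = 0.8, 9.81                      # mass [kg], gravity [m/s^2]
J = np.diag([0.003, 0.003, 0.005])    # inertia matrix [kg m^2]

def quad_dynamics(x: np.ndarray, u: np.ndarray) -> np.ndarray:
    """Full rigid-body quadrotor dynamics x_dot = f(x, u).

    State x = [p(3), v(3), q(4), w(3)]: position, velocity, attitude quaternion
    (w, x, y, z), body rates. Input u = [T, tau(3)]: collective thrust and body torques.
    """
    v, q, w = x[3:6], x[6:10], x[10:13]
    T, tau = u[0], u[1:4]

    # Rotation matrix from body to world frame, from the unit quaternion.
    qw, qx, qy, qz = q / np.linalg.norm(q)
    R = np.array([
        [1 - 2*(qy**2 + qz**2), 2*(qx*qy - qw*qz),     2*(qx*qz + qw*qy)],
        [2*(qx*qy + qw*qz),     1 - 2*(qx**2 + qz**2), 2*(qy*qz - qw*qx)],
        [2*(qx*qz - qw*qy),     2*(qy*qz + qw*qx),     1 - 2*(qx**2 + qy**2)],
    ])

    p_dot = v
    v_dot = np.array([0, 0, -g]) + R @ np.array([0, 0, T / m])   # thrust along body z
    # Quaternion kinematics: q_dot = 0.5 * q ⊗ [0, w]
    wx, wy, wz = w
    q_dot = 0.5 * np.array([
        -qx*wx - qy*wy - qz*wz,
         qw*wx + qy*wz - qz*wy,
         qw*wy - qx*wz + qz*wx,
         qw*wz + qx*wy - qy*wx,
    ])
    w_dot = np.linalg.solve(J, tau - np.cross(w, J @ w))          # Euler's equation
    return np.concatenate([p_dot, v_dot, q_dot, w_dot])

# Hover check: thrust balancing gravity, identity attitude, zero rates.
x0 = np.concatenate([np.zeros(3), np.zeros(3), [1, 0, 0, 0], np.zeros(3)])
print(quad_dynamics(x0, np.array([m * g, 0, 0, 0])))  # ≈ all zeros
```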
Bry: Yeah, it seems like it went the other way around. The optimal control strategy taught the human something.
Ackerman: Davide, do you have some questions for Adam?
Scaramuzza: Yes. So since you mentioned that basically, one of the scenarios or one of the applications that you are targeting, it is basically cinematography, where basically, you want to take amazing shots at the level of Hollywood, maybe producers, using your autonomous drones. And this is actually very interesting. So what I want to ask you is, in general, so going beyond cinematography, if you look at the performance of autonomous drones in general, it still looks to me that, for generic applications, they are still behind human pilot performance. I’m thinking of beyond cinematography and beyond the racing. I’m thinking of search and rescue operations and many things. So my question to Adam is, do you think that providing a higher level of agility to your platform could potentially unlock new use cases or even extend existing use cases of the Skydio drones?
Bry: You’re asking specifically about agility, flight agility, like responsiveness and maneuverability?
Scaramuzza: Yes. Yes. Exactly.
Bry: I think that it is— I mean, in general, I think that most things with drones have this kind of product property where the more you get better at something, the better it’s going to be for most users, and the more applications will be unlocked. And this is true for a lot of things. It’s true for some things that we even wish it wasn’t true for, like flight time. Like the longer the flight time, the more interesting and cool things people are going to be able to do with it, and there’s kind of no upper limit there. Different use cases, it might taper off, but you’re going to unlock more and more use cases the longer you can fly. I think that agility is one of these parameters where the more, the better, although I will say it’s not the thing that I feel like we’re hitting a ceiling on now in terms of being able to provide value to our users. There are cases within different applications. So for example, search and rescue, being able to fly through a really tight gap or something, where it would be useful. And for capturing cinematic videos, similar story, like being able to fly at high speed through some really challenging course, where I think it would make a difference. So I think that there are areas out there in user groups that we’re currently serving where it would matter, but I don’t think it’s like the— it’s not the thing that I feel like we’re hitting right now in terms of sort of the lowest-hanging fruit to unlock more value for users. Yeah.
Scaramuzza: So you believe, though, that in the long term, actually achieving human-level agility would actually be added value for your drones?
Bry: Definitely. Yeah. I mean, one sort of mental model that I think about for the long-term direction of the products is looking at what birds can do. And the agility that birds have and the kinds of maneuvers that that makes them capable of, and being able to land in tricky places, or being able to slip through small gaps, or being able to change direction quickly, that affords them capability that I think is definitely useful to have in drones and would unlock some value. But I think the other really interesting thing is that the autonomy problem spans multiple sort of ranges of hierarchy, and when you get towards the top, there’s human judgment that I think is very— I mean, it’s crucial to a lot of things that people want to do with drones, and it’s very difficult to automate, and I think it’s actually relatively low value to automate. So for example, in a search and rescue mission, a person might have— a search and rescue worker might have very particular context on where somebody is likely to be stuck or maybe be hiding or something that would be very difficult to encode into a drone. They might have some context from a clue that came up earlier in the case or something about the environment or something about the weather.
And so one of the things that we think a lot about in how we build our products—we’re a company. We’re trying to make useful stuff for people, so we have a pretty pragmatic approach on these fronts— is basically— we’re not religiously committed to automating everything. We’re basically trying to automate the things where we can give the best tool to somebody to then apply the judgment that they have as a person and an operator to get done what they want to get done.
Scaramuzza: And actually, yeah, now that you mentioned this, I have another question. So I’ve watched many of your previous tech talks and also interacted with you guys at conferences. So what I learned—and correct me if I’m wrong—is that you’re using a lot of deep learning on the perception side, so as part of a 3D construction, semantic understanding. But it seems to me that on the control and planning side, you’re still relying basically on optimal control. And I wanted to ask you, so if this is the case, are you happy there with optimal control? We also know that Boston Dynamics is actually using only optimal control. Actually, they even claim they are not using any deep learning in control and planning. So is this actually also what you experience? And if this is the case, do you believe in the future, actually, you will be using deep learning also in planning and control, and where exactly do you see the benefits of deep learning there?
Bry: Yeah, that’s a super interesting question. So what you described at a high level is essentially right. So our perception stack— and we do a lot of different things in perception, but we’re pretty heavily using deep learning throughout, for semantic understanding, for spatial understanding, and then our planning and control stack is based on more conventional kind of optimal control optimization and full-state feedback control techniques, and it generally works pretty well. Having said that, we did— we put out a blog post on this. We did a research project where we basically did end-to-end— pretty close to an end-to-end learning system where we replaced a good chunk of the planning stack with something that was based on machine learning, and we got it to the point where it was good enough for flight demonstrations. And for the amount of work that we put into it, relative to the capability that we got, I think the results were really compelling. And my general outlook on this stuff— I think that the planning and controls is an area where the models, I think, provide a lot of value. Having a structured model based on physics and first principles does provide a lot of value, and it’s admissible to that kind of modeling. You can write down the mass and the inertia and the rotor parameters, and the physics of quadcopters are such that those things tend to be pretty accurate and tend to work pretty well, and by starting with that structure, you can come up with quite a capable system.
Having said that, I think that the— to me, the trajectory of machine learning and deep learning is such that eventually I think it will dominate almost everything, because being able to learn based on data and having these representations that are incredibly flexible and can encode sort of subtle relationships that might exist but wouldn’t fall out of a more conventional physics model, I think is really powerful, and then I also think being able to do more end-to-end stuff where subtle sort of second- or third-order perception impact— or second- or third-order perception or real world, physical world things can then trickle through into planning and control actions, I think is also quite powerful. So generally, that’s the direction I see us going, and we’ve done some research on this. And I think the way you’ll see it going is we’ll use sort of the same optimal control structure we’re using now, but we’ll inject more learning into it, and then eventually, the thing might evolve to the point where it looks more like a deep network in end-to-end.
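One common way to "inject more learning" into an optimal-control structure is to add a learned residual on top of the physics model. The sketch below illustrates that general idea with a scikit-learn-style regressor; it is an assumption-laden illustration of the technique, not Skydio's implementation.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

class ResidualDynamics:
    """Physics-based model plus a learned correction term.

    The nominal model captures mass, inertia, and rotor physics from first
    principles; the residual regressor is fit on flight logs to capture effects
    the physics misses (drag, ground effect, battery sag). The planner or
    controller then rolls out f_nominal(x, u) + residual(x, u).
    """

    def __init__(self, f_nominal, residual_model):
        self.f_nominal = f_nominal      # callable: (x, u) -> x_dot, from first principles
        self.residual = residual_model  # any regressor with fit/predict (assumed interface)

    def fit_residual(self, X, U, Xdot_measured):
        # Train the residual on the gap between measured and predicted derivatives.
        errors = Xdot_measured - np.array([self.f_nominal(x, u) for x, u in zip(X, U)])
        self.residual.fit(np.hstack([X, U]), errors)

    def __call__(self, x, u):
        return self.f_nominal(x, u) + self.residual.predict(np.hstack([x, u])[None])[0]

# Toy example: the nominal model misses a cubic drag term; the residual learns it.
f_nom = lambda x, u: np.array([-x[0] + u[0]])
true_f = lambda x, u: np.array([-x[0] + u[0] - 0.3 * x[0] ** 3])
X = np.random.uniform(-2, 2, (200, 1)); U = np.random.uniform(-1, 1, (200, 1))
Xdot = np.array([true_f(x, u) for x, u in zip(X, U)])
model = ResidualDynamics(f_nom, make_pipeline(PolynomialFeatures(3), Ridge(alpha=1e-3)))
model.fit_residual(X, U, Xdot)
print(model(np.array([1.5]), np.array([0.0])), true_f(np.array([1.5]), np.array([0.0])))
```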
Scaramuzza: Now, earlier you mentioned that you foresee that in the future, drones will be flying more agilely, similar to human pilots, and even in tight spaces. You mentioned passing through a narrow gap or even in a small corridor. So when you navigate in tight spaces, of course, ground effect is very strong. So do you guys then model these aerodynamic effects, ground effect— not just ground effect. Do you try to model all possible aerodynamic effects, especially when you fly close to structures?
Bry: It’s an interesting question. So today we don’t model— we estimate the wind. We estimate the local wind velocity—and we’ve actually found that we can do that pretty accurately—around the drone, and then the local wind that we’re estimating gets fed back into the control system to compensate. And so that’s kind of like a catch-all bucket for— you could think about ground effect as like a variation— this is not exactly how it works, obviously, but you could think about it as like a variation in the local wind, and our response times on those, like the ability to estimate wind and then feed it back into control, is pretty quick, although it’s not instantaneous. So if we had like a feed forward model where we knew as we got close to structures, “This is how the wind is likely to vary,” we could probably do slightly better. And I think you’re— what you’re pointing at here, I basically agree with. I think the more that you kind of try to squeeze every drop of performance out of these things you’re flying with maximum agility in very dense environments, the more these things start to matter, and I could see us wanting to do something like that in the future, and that stuff’s fun. I think it’s fun when you sort of hit the limit and then you have to invent better new algorithms and bring more information to bear to get the performance that you want.
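A minimal sketch of the general idea described here: estimate a lumped disturbance (wind, ground effect) as the unexplained part of the measured acceleration, filter it, and feed it back for compensation. The filter constant and interface are assumptions, not Skydio's actual estimator.

```python
import numpy as np

class WindCompensator:
    """Lumped disturbance estimator with feedforward compensation.

    Treats anything the nominal model does not explain (wind, ground effect)
    as an extra acceleration d, low-pass filters it, and asks the controller
    to cancel it: a catch-all bucket with a fast but not instantaneous response.
    """

    def __init__(self, alpha: float = 0.1):
        self.alpha = alpha              # per-step filter gain (assumed value)
        self.d_hat = np.zeros(3)        # estimated disturbance acceleration

    def update(self, a_measured: np.ndarray, a_model: np.ndarray) -> np.ndarray:
        # Residual acceleration = what the IMU saw minus what thrust/gravity predict.
        residual = a_measured - a_model
        self.d_hat = (1 - self.alpha) * self.d_hat + self.alpha * residual
        return self.d_hat

    def desired_acceleration(self, a_command: np.ndarray) -> np.ndarray:
        # Feedforward: command extra acceleration to cancel the estimated disturbance.
        return a_command - self.d_hat

comp = WindCompensator(alpha=0.2)
for _ in range(50):                     # a steady 1 m/s^2 unmodeled push along x
    comp.update(np.array([1.0, 0.0, 0.0]), np.zeros(3))
print(comp.desired_acceleration(np.zeros(3)))  # ≈ [-1, 0, 0]: lean into the wind
```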
On this— perhaps related. You can tell me. So you guys have done a lot of work with event cameras, and I think that you were— this might not be right, but from what I’ve seen, I think you were one of the first, if not the first, to put event cameras on quadcopters. I’d be very interested in— and you’ve probably told these stories a lot, but I still think it’d be interesting to hear. What steered you towards event cameras? How did you find out about them, and what made you decide to invest in research in them?
Scaramuzza: [crosstalk] first of all, let me explain what an event camera is. An event camera is a camera that has also pixels, but differently from a standard camera, an event camera only sends information when there is motion. So if there is no motion, then the camera doesn’t stream any information. Now, the camera does this through smart pixels, differently from a standard camera, where every pixel triggers information the same time at equidistant time intervals. In an event camera, the pixels are smart, and they only trigger information whenever a pixel detects motion. Usually, a motion is recorded as a change of intensity. And the stream of events happens asynchronously, and therefore, the byproduct of this is that you don’t get frames, but you only get a stream of information continuously in time with microsecond temporal resolution. So one of the key advantages of event cameras is that, basically, you can actually record phenomena that actually would take expensive high-speed cameras to perceive. But the key difference with a standard camera is that an event camera works in differential mode. And because it works in differential mode, by basically capturing per-pixel intensity differences, it consumes very little power, and it also has no motion blur, because it doesn’t accumulate photons over time.
So I would say that for robotics, what I— because you asked me how did I find out. So what I really, really saw, actually, that was very useful for robotics about event cameras were two particular things. First of all, the very high temporal resolution, because this can be very useful for safety, critical systems. And I’m thinking about drones, but also to avoid collisions in the automotive setting, because now we are also working in automotive settings as well. And also when you have to navigate in low-light environments, where using a standard camera with the high exposure times, you would actually be coping with a lot of motion blur that would actually cause a feature loss and other artifacts, like impossibility to detect objects and so on. So event cameras excel at this. No motion blur and very low latency. Another thing that could be also very interesting for especially lightweight robotics—and I’m thinking of micro drones—would be actually the fact that they consume also very little power. So little power, in fact, just to be on an event camera consumes one milliwatt, on average, because in fact, the power consumption depends on the dynamics of the scene. If nothing moves, then the power consumption is very negligible. If something moves, it is between one milliwatt or maximum 10 milliwatt.
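The standard way to formalize this behavior is an event-generation model: a pixel fires an event whenever its log intensity changes by more than a contrast threshold. The sketch below applies that model to ordinary video frames to produce synthetic events; it is an idealized, frame-clocked approximation (a real sensor triggers asynchronously with microsecond resolution), and the threshold value is an assumption.

```python
import numpy as np

def frames_to_events(frames: np.ndarray, timestamps: np.ndarray, C: float = 0.2):
    """Convert a stack of grayscale frames (T, H, W) into a list of events.

    An event (t, x, y, polarity) is emitted whenever a pixel's log intensity
    has changed by more than the contrast threshold C since its last event.
    Static pixels emit nothing, which is why event cameras produce little data
    and consume little power when the scene is not moving.
    """
    log_f = np.log(frames.astype(np.float64) + 1e-3)
    ref = log_f[0].copy()                          # per-pixel reference level
    events = []
    for t in range(1, len(frames)):
        diff = log_f[t] - ref
        ys, xs = np.nonzero(np.abs(diff) >= C)
        for y, x in zip(ys, xs):
            events.append((timestamps[t], x, y, 1 if diff[y, x] > 0 else -1))
            ref[y, x] = log_f[t, y, x]             # reset the reference at that pixel
    return events

# A bright square moving one pixel per frame on a dark, static background.
T, H, W = 5, 16, 16
frames = np.full((T, H, W), 10, dtype=np.uint8)
for t in range(T):
    frames[t, 6:10, 2 + t:6 + t] = 200
print(len(frames_to_events(frames, np.arange(T) * 1e-3)))  # events only at moving edges
```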
Now, the interesting thing is that if you then couple event cameras with the spiking neuromorphic chips that also consume less than one milliwatt, you can actually mount them on a micro drones, and you can do amazing things, and we started working on it. The problem is that how do you train spiking networks? But that’s another story. Other interesting things where I see potential applications of event cameras are also, for example— now, think about your keyframe features of the Skydio drones. And here what you are doing, guys, is that basically, you are flying the drones around, and then you’re trying to send 3D positions and orientation of where you would like then [inaudible] to fly faster through. But the images have been captured while the drone is still. So basically, you move the drone to a certain position, you orient it in the direction where later you want it to fly, and then you record the position and orientation, and later, the drone will fly agilely through it. But that means that, basically, the drone should be able to relocalize fast with respect to this keyframe. Well, at some point, there are failure modes. We already know it. Failure modes. When the illumination goes down and there is motion blur, and this is actually something where I see, actually, the event camera could be beneficial. And then other things, of course [crosstalk]—
Ackerman: Do you agree with that, Adam?
Bry: Say again?
Ackerman: Do you agree, Adam?
Bry: I guess I’m— and this is why kind of I’m asking the question. I’m very curious about event cameras. When I have kind of the pragmatic hat on of trying to build these systems and make them as useful as possible, I see event cameras as quite complementary to traditional cameras. So it’s hard for me to see a future where, for example, on our products, we would be only using event cameras. But I can certainly imagine a future where, if they were compelling from a size, weight, cost standpoint, we would have them as an additional sensing mode to get a lot of the benefits that Davide is talking about. And I don’t know if that’s a research direction that you guys are thinking about. And in a research context, I think it’s very cool and interesting to see what can you do with just an event camera. I think that the most likely scenario to me is that they would become like a complementary sensor, and there’s probably a lot of interesting things to be done of using standard cameras and event cameras side by side and getting the benefits of both, because I think that the context that you get from a conventional camera that’s just giving you full static images of the scene, combined with an event camera could be quite interesting. You can imagine using the event camera to sharpen and get better fidelity out of the conventional camera, and you could use the event camera for faster response times, but it gives you less of a global picture than the conventional camera. So Davide’s smiling. Maybe I’m— I’m sure he’s thought about all these ideas as well.
Scaramuzza: Yeah. We have been working on that exact thing, combining event cameras with standard cameras, now for the past three years. So initially, when we started almost 10 years ago, of course, we only focused on event cameras alone, because it was intellectually very challenging. But the reality is that an event camera—let’s not forget—it’s a differential sensor. So it’s only complementary with standard camera. You will never get the full absolute intensity from out of an event camera. We show that you can actually reproduce the grayscale intensity up to an unknown absolute intensity with very high fidelity, by the way, but it’s only complementary to a standard camera, as you correctly said. So actually, you already mentioned everything we are working on and we have also already published. So for example, you mentioned unblurring blurry frames. This also has already been done, not by my group, but a group of Richard Hartley at the University of Canberra in Australia. And what we also showed in my group last year is that you can also generate super slow motion video by combining an event camera with a standard camera, by basically using the events in the blind time between two frames to interpolate and generate arbitrary frames at any arbitrary time. And so we show that we could actually upsample a low frame rate video by a factor of 50, and this with only consuming one-fortieth of the memory footprint. And this is interesting, because—
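As an idealized sketch of the principle behind event-based frame interpolation, the snippet below accumulates event polarities per pixel in log-intensity space to synthesize a frame inside the blind time between two captured frames; published methods, including the learned interpolation mentioned here, are far more sophisticated.

```python
import numpy as np

def interpolate_frame(frame0: np.ndarray, events, t_query: float, C: float = 0.2) -> np.ndarray:
    """Synthesize the image at time t_query from the previous frame plus events.

    Each event (t, x, y, polarity) means the log intensity at (x, y) changed by
    roughly +/- C at time t. Accumulating those changes up to t_query and
    applying them to frame0 in log space yields an intermediate frame, which is
    how events can fill the blind time between two frames.
    """
    log_img = np.log(frame0.astype(np.float64) + 1e-3)
    for t, x, y, polarity in events:
        if t > t_query:
            break                       # events are assumed sorted by time
        log_img[y, x] += polarity * C
    return np.clip(np.exp(log_img), 0, 255).astype(np.uint8)

frame0 = np.full((8, 8), 50, dtype=np.uint8)
events = [(0.001, 3, 4, +1), (0.002, 3, 4, +1)]   # two positive events at one pixel
print(interpolate_frame(frame0, events, t_query=0.0025)[4, 3])  # brighter than 50
```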
Bry: Do you think from— this is a curiosity question. From a hardware standpoint, I’m wondering if it’ll go the next— go even a bit further, like if we’ll just start to see image sensors that do both together. I mean, you could certainly imagine just putting the two pieces of silicon right next to each other, or— I don’t know enough about image sensor design, but even at the pixel level, you could have pixel— like just superimposed on the same piece of silicon. You could have event pixels next to standard accumulation pixels and get both sets of data out of one sensor.
Scaramuzza: Exactly. So both things have been done. So—
Bry: [crosstalk].
Scaramuzza: —the latest one I described, we actually installed an event camera side by side with a very high-resolution standard camera. But there is already an event camera called DAVIS that outputs both frames and events between the frames. This has been available already since 2016, but at the very low resolution, and only last year it reached the VGA resolution. That’s why we are combining—
Bry: That’s like [crosstalk].
Scaramuzza: —an event camera with a high-resolution standard camera, because want to basically see what we could possibly do one day when these event cameras are also available [inaudible] resolution together with a standard camera overlaid on the same pixel array. But there is a good news, because you also asked me another question about cost of this camera. So the price, as you know very well, drops as soon as there is a mass product for it. The good news is that Samsung has now a product called SmartThings Vision Sensor that basically is conceived for indoor home monitoring, so to basically detect people falling at home, and this device automatically triggers an emergency call. So this device is using an event camera, and it costs €180, which is much less than the cost of an event camera when you buy it from these companies. It’s around €3,000. So that’s a very good news. Now, if there will be other bigger applications, we can expect that the price would go down a lot, below even $5. That’s what these companies are openly saying. I mean, what I expect, honestly, is that it will follow what we experience with the time-of-flight cameras. I mean, the first time-of-flight cameras cost around $15,000, and then 15 years later, they were below $150. I’m thinking of the first Kinect tool that was time-of-flight and so on. And now we have them in all sorts of smartphones. So it all depends on the market.
Ackerman: Maybe one more question from each of you guys, if you’ve got one you’ve been saving for the end.
Scaramuzza: Okay. The very last question [inaudible]. Okay. I ask, Adam, and then you tell me if you want to answer or rather not. It’s, of course, about defense. So the question I prepared, I told Evan. So I read in the news that Skydio donated 300K of equivalent of drones to Ukraine. So my question is, what are your views on military use or dual use of quadcopters, and what is the philosophy of Skydio regarding defense applications of drones? I don’t know if you want to answer.
Bry: Yeah, that’s a great question. I’m happy to answer that. So our mission, which we’ve talked about quite publicly, is to make the world more productive, creative, and safe with autonomous flight. And the position that we’ve taken, and which I feel very strongly about, is that working with the militaries of free democracies is very much in alignment and in support of that mission. So going back three or four years, we’ve been working with the US Army. We won the Army’s short-range reconnaissance program, which was essentially a competition to select the official kind of soldier-carried quadcopter for the US Army. And the broader trend there, which I think is really interesting and in line with what we’ve seen in other technology categories, is basically the consumer and civilian technology just raced ahead of the traditional defense systems. The military has been using drones for decades, but their soldier-carried systems were these multi-hundred-thousand-dollar things that are quite clunky, quite difficult to use, not super capable. And our products and other products in the consumer world basically got to the point where they had comparable and, in many cases, superior capability at a fraction of the cost.
And I think— to the credit of the US military and other departments of defense and ministries of defense around the world, I think people realized that and decided that they were better off going with these kind of dual-use systems that were predominantly designed and scaled in civilian markets, but also had defense applicability. And that’s what we’ve done as a company. So it’s essentially our consumer civilian product that’s extended and tweaked in a couple of ways, like the radios, some of the security protocols, to serve defense customers. And I’m super proud of the work that we’re doing in Ukraine. So we’ve donated $300,000 worth of systems. At this point, we’ve sold way, way more than that, and we have hundreds of systems in Ukraine that are being used by Ukrainian defense forces, and I think that’s good important work. The final piece of this that I’ll say is we’ve also decided and we aren’t doing and we won’t put weapons on our drones. So we’re not going to build actual munition systems, which I think is— I don’t think there’s anything ethically wrong with that. Ultimately, militaries need weapons systems, and those have an important role to play, but it’s just not something that we want to do as a company, and is kind of out of step with the dual-use philosophy, which is really how we approach these things.
I have a question that I’m— it’s aligned with some of what we’ve talked about, but I’m very interested in how you think about and focus the research in your lab, now that this stuff is becoming more and more commercialized. There’s companies like us and others that are building real products based on a lot of the algorithms that have come out of academia. And in general, I think it’s an incredibly exciting time where the pace of progress is accelerating, there’s more and more interesting algorithms out there, and it seems like there’s benefits flowing both ways between research labs and between these companies, but I’m very interested in how you’re thinking about that these days.
Scaramuzza: Yes. It’s a very interesting question. So first of all, I think of you also as a robotics company. And so what you are demonstrating is what [inaudible] of robotics in navigation and perception can do, and the fact that you can do it on a drone, it means you can also do it on other robots. And that actually is a call for us researchers, because it pushes us to think of new avenues where we can actually contribute. Otherwise, it looks like everything has been done. And so what, for example, we have been working on in my lab is trying to— so towards the goal of achieving human-level performance, how do humans navigate? They don’t do optimal control and geometric 3D reconstruction. We have a brain that does everything end to end, or at least with the [inaudible] subnetworks. So one thing that we have been playing with has been now deep learning for already now, yeah, six years. But in the last two years, we realized, actually, that you can do a lot with deep networks, and also, they have some advantages compared to the usual traditional autonomy architectures— architecture of autonomous robots. So what is the standard way to control robots, be it flying or ground? You have [inaudible] estimation. They have a perception. So basically, spatial AI, semantic understanding. Then you have localization, path planning, and control.
Now, all these modules are basically communicating with one another. Of course, you want them to communicate in a smart way, because you want to also try to plan trajectories that facilitate perception, so you have no motion blur while you navigate, and so on. But somehow, they are always conceived by humans. And so what we are trying to understand is whether you can actually replace some of these blocks or even all blocks and up to each point with deep networks, which begs the question, can you even train a policy end to end that takes as input some sort of sensory, like either images or even sensory obstructions, and outputs control commands of some sort of output abstraction, like [inaudible] or like waypoints? And what we found out is that, yes, this can be done. Of course, the problem is that for training these policies, you need a lot of data. And how do you generate this data? You can not fly drones in the real world. So we started working more and more in simulation. So now we are actually training all these things in simulation, even for forests. And thanks to the video game engines like Unity, now you can download a lot of these 3D environments and then deploy your algorithms there that train and teach a drone to fly in just a bunch of hours rather than flying and crashing drones in the real world, which is very costly as well. But the problem is that we need better simulators.
We need better simulators, and I’m not just thinking of for the realism. I think that one is actually somewhat solved. So I think we need the better physics like aerodynamic effects and other non-idealities. These are difficult to model. So we are also working on these kind of things. And then, of course, another big thing would be you would like to have a navigation policy that is able to abstract and generalize to different type of tasks, and possibly, at some point, even tell your drone or robot a high-level description of the task, and the drone or the robot would actually accomplish the task. That would be the dream. I think that the robotics community, we are moving towards that.
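As a toy illustration of the kind of learned navigation block described above, here is a small policy network mapping a flat observation vector to a relative waypoint and speed command; the architecture, dimensions, and output parameterization are placeholders, not the lab's actual networks.

```python
import torch
import torch.nn as nn

class WaypointPolicy(nn.Module):
    """Map an observation vector to a short-horizon waypoint command.

    Stand-in for replacing perception/planning modules with a learned policy:
    input could be visual features plus state (here a flat 64-D placeholder),
    output is a relative waypoint (dx, dy, dz) and a speed scalar.
    """

    def __init__(self, obs_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),                 # (dx, dy, dz, speed)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        out = self.net(obs)
        waypoint = out[..., :3]
        speed = torch.sigmoid(out[..., 3:]) * 20.0  # cap commanded speed (placeholder)
        return torch.cat([waypoint, speed], dim=-1)

# In simulation, such a policy would be trained by imitating an expert planner
# or by reinforcement learning before being deployed on the real drone.
policy = WaypointPolicy()
print(policy(torch.randn(1, 64)).shape)  # torch.Size([1, 4])
```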
Bry: Yeah. I agree. I agree, and I’m excited about it.
Ackerman: We’ve been talking with Adam Bry from Skydio and Davide Scaramuzza from the University of Zürich about agile autonomous drones, and thanks again to our guests for joining us. For Chatbot and IEEE Spectrum, I’m Evan Ackerman.
Episode 3: Drones That Can Fly Better Than You Can
Evan Ackerman: I’m Evan Ackerman, and welcome to Chatbot, a new podcast from IEEE Spectrum where robotics experts interview each other about things that they find fascinating. On this episode of Chatbot, we’ll be talking with Davide Scaramuzza and Adam Bry about agile autonomous drones. Adam Bry is the CEO of Skydio, a company that makes consumer camera drones with an astonishing amount of skill at autonomous tracking and obstacle avoidance. Foundation for Skydio’s drones can be traced back to Adam’s work on autonomous agile drones at MIT, and after spending a few years at Google working on Project Wing’s delivery drones, Adam cofounded Skydio in 2014. Skydio is currently on their third generation of consumer drones, and earlier this year, the company brought on three PhD students from Davide’s lab to expand their autonomy team. Davide Scaramuzza directs the Robotics and Perception group at the University of Zürich. His lab is best known for developing extremely agile drones that can autonomously navigate through complex environments at very high speeds. Faster, it turns out, than even the best human drone racing champions. Davide’s drones rely primarily on computer vision, and he’s also been exploring potential drone applications for a special kind of camera called an event camera, which is ideal for fast motion under challenging lighting conditions. So Davide, you’ve been doing drone research for a long time now, like a decade, at least, if not more.
Davide Scaramuzza: Since 2009. 15 years.
Ackerman: So what still fascinates you about drones after so long?
Scaramuzza: So what fascinates me about drones is their freedom. So that was the reason why I decided, back then in 2009, to actually move from ground robots—I was working at the time on self-driving cars—to drones. And actually, the trigger was when Google announced the self-driving car project, and then for me and many researchers, it was clear that actually many things were now transitioning from academia to industry, and so we had to come up with new ideas and things. And then with my PhD adviser at that time [inaudible] we realized, actually, that drones, especially quadcopters, were just coming out, but they were all remote controlled or they were actually using GPS. And so then we said, “What about flying drones autonomously, but with the onboard cameras?” And this had never been done until then. But what fascinates me about drones is the fact that, actually, they can overcome obstacles on the ground very quickly, and especially, this can be very useful for many applications that matter to us all today, like, first of all, search and rescue, but also other things like inspection of difficult infrastructures like bridges, power [inaudible] oil platforms, and so on.
Ackerman: And Adam, your drones are doing some of these things, many of these things. And of course, I am fascinated by drones and by what your drone is able to do, but I’m curious. When you introduce it to people who have maybe never seen it, how do you describe, I guess, almost the magic of what it can do?
Adam Bry: So the way that we think about it is pretty simple. Our basic goal is to build in the skills of an expert pilot into the drone itself, which involves a little bit of hardware. It means we need sensors that see everything in every direction and we need a powerful computer on board, but is mostly a software problem. And it becomes quite application-specific. So for consumers, for example, our drones can follow and film moving subjects and avoid obstacles and create this incredibly compelling dynamic footage. And the goal there is really what would happen if you had the world’s best drone pilot flying that thing, trying to film something in an interesting, compelling way. We want to make that available to anybody using one of our products, even if they’re not an expert pilot, and even if they’re not at the controls when it’s flying itself. So you can just put it in your hand, tell it to take off, it’ll turn around and start tracking you, and then you can do whatever else you want to do, and the drone takes care of the rest. In the industrial world, it’s entirely different. So for inspection applications, say, for a bridge, you just tell the drone, “Here’s the structure or scene that I care about,” and then we have a product called 3D Scan that will automatically explore it, build a real-time 3D map, and then use that map to take high-resolution photos of the entire structure.
And to follow on a bit to what Davide was saying, I mean, I think if you sort of abstract away a bit and think about what capability do drones offer, thinking about camera drones, it’s basically you can put an image sensor or, really, any kind of sensor anywhere you want, any time you want, and then the extra thing that we’re bringing in is without needing to have a person there to control it. And I think the combination of all those things together is transformative, and we’re seeing the impact of that in a lot of these applications today, but I think that that really— realizing the full potential is a 10-, 20-year kind of project.
Ackerman: It’s interesting when you talk about the way that we can think about the Skydio drone is like having an expert drone pilot to fly this thing, because there’s so much skill involved. And Davide, I know that you’ve been working on very high-performance drones that can maybe challenge even some of these expert pilots in performance. And I’m curious, when expert drone pilots come in and see what your drones can do autonomously for the first time, is it scary for them? Are they just excited? How do they react?
Scaramuzza: First of all, actually, they say, “Wow.” So they can not believe what they see. But then they get super excited, but at the same time, nervous. So we started working on autonomous drone racing five years ago, but in the first three years, we have been flying very slowly, like three meters per second. So they were really snails. But then in the last two years is when actually we started really pushing the limits, both in control and planning and perception. So these are our most recent drone, by the way. And now we can really fly at the same level of agility as humans. Not yet at the level to beat human, but we are very, very close. So we started the collaboration with Marvin, who is the Swiss champion, and he’s only— now he’s 16 years old. So last year he was 15 years old. So he’s a boy. And he actually was very mad at the drone. So he was super, super nervous when he saw this. So he didn’t even smile the first time. He was always saying, “I can do better. I can do better.” So actually, his reaction was quite scared. He was scared, actually, by what the drone was capable of doing, but he knew that, basically, we were using the motion capture. Now [inaudible] try to play in a fair comparison with a fair setting where both the autonomous drone and the human-piloted drone are using both onboard perceptions or egocentric vision, then things might end up differently.
Because in fact, actually, our vision-based drone, so flying with onboard vision, was quite slow. But actually now, after one year of pushing, we are at a level, actually, that we can fly a vision-based drone at the level of Marvin, and we are even a bit better than Marvin at the current moment, using only onboard vision. So we can fly— in this arena, the space allows us to go up to 72 kilometers per hour. We reached the 72 kilometers per hour, and we even beat Marvin in three consecutive laps so far. So that’s [inaudible]. But we want to now also compete against other pilots, other world champions, and see what’s going to happen.
Ackerman: Okay. That’s super impressive.
Bry: Can I jump in and ask a question?
Ackerman: Yeah, yeah, yeah.
Bry: I’m interested if you— I mean, since you’ve spent a lot of time with the expert pilots, if you learn things from the way that they think and fly, or if you just view them as a benchmark to try to beat, and the algorithms are not so much inspired by what they do.
Scaramuzza: So we did all these things. So we did it also in a scientific manner. So first, of course, we interviewed them. We asked any sort of question, what type of features are you actually focusing your attention, and so on, how much is the people around you, the supporters actually influencing you, and the hearing the other opponents actually screaming while they control [inaudible] influencing you. So there is all these psychological effects that, of course, influencing pilots during a competition. But then what we tried to do scientifically is to really understand, first of all, what is the latency of a human pilot. So there have been many studies that have been done for car racing, Formula One, back in the 80s and 90s. So basically, they put eye trackers and tried to understand— they tried to understand, basically, what is the latency between what you see until basically you act on your steering wheel. And so we tried to do the same for human pilots. So we basically installed an eye tracking device on our subjects. So we called 20 subjects from all across Switzerland, some people also from outside Switzerland, with different levels of expertise.
But they were quite good. Okay? We are not talking about median experts, but actually already very good experts. And then we would let them rehearse on the track, and then basically, we were capturing their eye gazes, and then we basically measured the time latency between changes in eye gaze and changes in throttle commands on the joystick. And we measured, and this latency was 220 milliseconds.
Ackerman: Wow. That’s high.
Scaramuzza: That includes the brain latency and the behavioral latency. So that time to send the control commands, once you process the information, the visual information to the fingers. So—
Bry: I think [crosstalk] it might just be worth, for the audience anchoring that, what’s the typical control latency for a digital control loop. It’s— I mean, I think it’s [crosstalk].
Scaramuzza: It’s typically in the— it’s typically in the order of— well, from images to control commands, usually 20 milliseconds, although we can also fly with the much higher latencies. It really depends on the speed you want to achieve. But typically, 20 milliseconds. So if you compare 20 milliseconds versus the 220 milliseconds of the human, you can already see that, eventually, the machine should beat the human. Then the other thing that you asked me was, what did we learn from human pilots? So what we learned was— interestingly, we learned that basically they were always pushing the throttle of the joystick at the maximum thrust, but actually, this is—
Bry: Because that’s very consistent with optimal control theory.
Scaramuzza: Exactly. But what we then realized, and they told us, was that it was interesting for them to observe that actually, for the AI, was better to brake earlier rather than later as the human was actually doing. And we published these results in Science Robotics last summer. And we did this actually using an algorithm that computes the time optimal trajectory from the start to the finish through all the gates, and by exploiting the full quadrotor dynamical model. So it’s really using not approximation, not point-mass model, not polynomial trajectories. The full quadrotor model, it takes a lot to compute, let me tell you. It takes like one hour or more, depending on the length of the trajectory, but it does a very good job, to a point that Gabriel Kocher, who works for the Drone Racing League, told us, “Ah, this is very interesting. So I didn’t know, actually, I can push even faster if I start braking before this gate.”
Bry: Yeah, it seems like it went the other way around. The optimal control strategy taught the human something.
Ackerman: Davide, do you have some questions for Adam?
Scaramuzza: Yes. So since you mentioned that basically, one of the scenarios or one of the applications that you are targeting, it is basically cinematography, where basically, you want to take amazing shots at the level of Hollywood, maybe producers, using your autonomous drones. And this is actually very interesting. So what I want to ask you is, in general, so going beyond cinematography, if you look at the performance of autonomous drones in general, it still looks to me that, for generic applications, they are still behind human pilot performance. I’m thinking of beyond cinematography and beyond the racing. I’m thinking of search and rescue operations and many things. So my question to Adam is, do you think that providing a higher level of agility to your platform could potentially unlock new use cases or even extend existing use cases of the Skydio drones?
Bry: You’re asking specifically about agility, flight agility, like responsiveness and maneuverability?
Scaramuzza: Yes. Yes. Exactly.
Bry: I think that it is— I mean, in general, I think that most things with drones have this kind of product property where the more you get better at something, the better it’s going to be for most users, and the more applications will be unlocked. And this is true for a lot of things. It’s true for some things that we even wish it wasn’t true for, like flight time. Like the longer the flight time, the more interesting and cool things people are going to be able to do with it, and there’s kind of no upper limit there. Different use cases, it might taper off, but you’re going to unlock more and more use cases the longer you can fly. I think that agility is one of these parameters where the more, the better, although I will say it’s not the thing that I feel like we’re hitting a ceiling on now in terms of being able to provide value to our users. There are cases within different applications. So for example, search and rescue, being able to fly through a really tight gap or something, where it would be useful. And for capturing cinematic videos, similar story, like being able to fly at high speed through some really challenging course, where I think it would make a difference. So I think that there are areas out there in user groups that we’re currently serving where it would matter, but I don’t think it’s like the— it’s not the thing that I feel like we’re hitting right now in terms of sort of the lowest-hanging fruit to unlock more value for users. Yeah.
Scaramuzza: So you believe, though, that in the long term, actually achieving human-level agility would actually be added value for your drones?
Bry: Definitely. Yeah. I mean, one sort of mental model that I think about for the long-term direction of the products is looking at what birds can do. And the agility that birds have and the kinds of maneuvers that that makes them capable of, and being able to land in tricky places, or being able to slip through small gaps, or being able to change direction quickly, that affords them capability that I think is definitely useful to have in drones and would unlock some value. But I think the other really interesting thing is that the autonomy problem spans multiple sort of ranges of hierarchy, and when you get towards the top, there’s human judgment that I think is very— I mean, it’s crucial to a lot of things that people want to do with drones, and it’s very difficult to automate, and I think it’s actually relatively low value to automate. So for example, in a search and rescue mission, a person might have— a search and rescue worker might have very particular context on where somebody is likely to be stuck or maybe be hiding or something that would be very difficult to encode into a drone. They might have some context from a clue that came up earlier in the case or something about the environment or something about the weather.
And so one of the things that we think a lot about in how we build our products—we’re a company. We’re trying to make useful stuff for people, so we have a pretty pragmatic approach on these fronts— is basically— we’re not religiously committed to automating everything. We’re basically trying to automate the things where we can give the best tool to somebody to then apply the judgment that they have as a person and an operator to get done what they want to get done.
Scaramuzza: And actually, yeah, now that you mentioned this, I have another question. So I’ve watched many of your previous tech talks and also interacted with you guys at conferences. So what I learned—and correct me if I’m wrong—is that you’re using a lot of deep learning on the perception side, as part of 3D reconstruction and semantic understanding. But it seems to me that on the control and planning side, you’re still relying basically on optimal control. And I wanted to ask you, so if this is the case, are you happy there with optimal control? We also know that Boston Dynamics is actually using only optimal control. Actually, they even claim they are not using any deep learning in control and planning. So is this actually also what you experience? And if this is the case, do you believe in the future, actually, you will be using deep learning also in planning and control, and where exactly do you see the benefits of deep learning there?
Bry: Yeah, that’s a super interesting question. So what you described at a high level is essentially right. So our perception stack— and we do a lot of different things in perception, but we’re pretty heavily using deep learning throughout, for semantic understanding, for spatial understanding, and then our planning and control stack is based on more conventional kind of optimal control optimization and full-state feedback control techniques, and it generally works pretty well. Having said that, we did— we put out a blog post on this. We did a research project where we basically did end-to-end— pretty close to an end-to-end learning system where we replaced a good chunk of the planning stack with something that was based on machine learning, and we got it to the point where it was good enough for flight demonstrations. And for the amount of work that we put into it, relative to the capability that we got, I think the results were really compelling. And my general outlook on this stuff— I think that planning and controls is an area where the models provide a lot of value. Having a structured model based on physics and first principles does provide a lot of value, and it’s amenable to that kind of modeling. You can write down the mass and the inertia and the rotor parameters, and the physics of quadcopters are such that those things tend to be pretty accurate and tend to work pretty well, and by starting with that structure, you can come up with quite a capable system.
Having said that, I think that the— to me, the trajectory of machine learning and deep learning is such that eventually I think it will dominate almost everything, because being able to learn based on data and having these representations that are incredibly flexible and can encode sort of subtle relationships that might exist but wouldn’t fall out of a more conventional physics model, I think is really powerful, and then I also think being able to do more end-to-end stuff where subtle sort of second- or third-order perception impact— or second- or third-order perception or real world, physical world things can then trickle through into planning and control actions, I think is also quite powerful. So generally, that’s the direction I see us going, and we’ve done some research on this. And I think the way you’ll see it going is we’ll use sort of the same optimal control structure we’re using now, but we’ll inject more learning into it, and then eventually, the thing might evolve to the point where it looks more like a deep network in end-to-end.
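To make the physics-first idea above concrete, here is a minimal sketch in Python (an illustration only, not Skydio’s actual planning and control stack): the mass, gravity, and rotor thrust coefficient are written down explicitly, and a simple full-state feedback law sits on top of the model. All constants, gains, and function names are assumptions.

import numpy as np

MASS = 1.2          # kg, assumed vehicle mass
GRAVITY = 9.81      # m/s^2
K_THRUST = 8.5e-6   # N/(rad/s)^2, assumed rotor thrust coefficient

def rotor_speeds_to_thrust(omega):
    """Total thrust from the four rotor speeds via the quadratic rotor model."""
    return K_THRUST * np.sum(np.asarray(omega) ** 2)

def translational_dynamics(state, thrust, attitude_R):
    """Time derivative of [position, velocity] given collective thrust and attitude."""
    vel = state[3:]
    accel = (attitude_R @ np.array([0.0, 0.0, thrust])) / MASS \
            - np.array([0.0, 0.0, GRAVITY])
    return np.concatenate([vel, accel])

def full_state_feedback(state, ref, kp=4.0, kd=2.5):
    """PD full-state feedback on position/velocity, returning a desired acceleration."""
    pos_err = ref[:3] - state[:3]
    vel_err = ref[3:] - state[3:]
    return kp * pos_err + kd * vel_err + np.array([0.0, 0.0, GRAVITY])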
Scaramuzza: Now, earlier you mentioned that you foresee that in the future, drones will be flying more agilely, similar to human pilots, and even in tight spaces. You mentioned passing through a narrow gap or even in a small corridor. So when you navigate in tight spaces, of course, ground effect is very strong. So do you guys then model these aerodynamic effects, ground effect— not just ground effect. Do you try to model all possible aerodynamic effects, especially when you fly close to structures?
Bry: It’s an interesting question. So today we don’t model— we estimate the wind. We estimate the local wind velocity—and we’ve actually found that we can do that pretty accurately—around the drone, and then the local wind that we’re estimating gets fed back into the control system to compensate. And so that’s kind of like a catch-all bucket for— you could think about ground effect as like a variation— this is not exactly how it works, obviously, but you could think about it as like a variation in the local wind, and our response times on those, like the ability to estimate wind and then feed it back into control, are pretty quick, although not instantaneous. So if we had like a feed forward model where we knew as we got close to structures, “This is how the wind is likely to vary,” we could probably do slightly better. And I think you’re— what you’re pointing at here, I basically agree with. I think the more that you kind of try to squeeze every drop of performance out of these things, flying with maximum agility in very dense environments, the more these things start to matter, and I could see us wanting to do something like that in the future, and that stuff’s fun. I think it’s fun when you sort of hit the limit and then you have to invent better new algorithms and bring more information to bear to get the performance that you want.
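As an illustration of the “estimate the local wind, then feed it back” idea described above, here is a minimal sketch in Python (purely illustrative, not the production controller): the unexplained part of the measured acceleration is treated as a slowly varying disturbance, low-pass filtered, and subtracted from the commanded acceleration. The filter gain and the class and method names are assumptions.

import numpy as np

class WindCompensator:
    def __init__(self, alpha=0.1):
        self.alpha = alpha                  # low-pass factor for the estimate
        self.wind_accel = np.zeros(3)       # estimated disturbance acceleration (m/s^2)

    def update(self, measured_accel, modeled_accel):
        # Whatever acceleration the physics model cannot explain is attributed to
        # local wind; ground effect would show up in this catch-all bucket too.
        residual = measured_accel - modeled_accel
        self.wind_accel = (1 - self.alpha) * self.wind_accel + self.alpha * residual
        return self.wind_accel

    def compensate(self, desired_accel):
        # Feed the disturbance estimate back into the commanded acceleration.
        return desired_accel - self.wind_accel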
On this— perhaps related. You can tell me. So you guys have done a lot of work with event cameras, and I think that you were— this might not be right, but from what I’ve seen, I think you were one of the first, if not the first, to put event cameras on quadcopters. I’d be very interested in— and you’ve probably told these stories a lot, but I still think it’d be interesting to hear. What steered you towards event cameras? How did you find out about them, and what made you decide to invest in research in them?
Scaramuzza: [crosstalk] first of all, let me explain what an event camera is. An event camera is a camera that also has pixels, but differently from a standard camera, an event camera only sends information when there is motion. So if there is no motion, then the camera doesn’t stream any information. Now, the camera does this through smart pixels, differently from a standard camera, where every pixel triggers information at the same time, at equidistant time intervals. In an event camera, the pixels are smart, and they only trigger information whenever a pixel detects motion. Usually, motion is recorded as a change of intensity. And the stream of events happens asynchronously, and therefore, the byproduct of this is that you don’t get frames, but you only get a stream of information continuously in time with microsecond temporal resolution. So one of the key advantages of event cameras is that you can actually record phenomena that would otherwise take expensive high-speed cameras to perceive. But the key difference with a standard camera is that an event camera works in differential mode. And because it works in differential mode, by basically capturing per-pixel intensity differences, it consumes very little power, and it also has no motion blur, because it doesn’t accumulate photons over time.
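As an illustration of the per-pixel behavior just described, here is a minimal sketch in Python of an event-generation model. The contrast threshold value and the one-event-per-change simplification are assumptions; real sensors fire on log-intensity steps and can emit several events per pixel for a large change.

import numpy as np

CONTRAST_THRESHOLD = 0.15  # assumed per-pixel log-intensity change needed to fire

def generate_events(prev_log_I, new_log_I, timestamp):
    """Return a list of (t, x, y, polarity) events between two log-intensity maps."""
    events = []
    diff = new_log_I - prev_log_I
    ys, xs = np.nonzero(np.abs(diff) >= CONTRAST_THRESHOLD)
    for y, x in zip(ys, xs):
        polarity = 1 if diff[y, x] > 0 else -1   # brighter or darker
        events.append((timestamp, x, y, polarity))
        # The reference level resets at the last event, so static pixels stay silent.
        prev_log_I[y, x] = new_log_I[y, x]
    return events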
So I would say that for robotics, what I— because you asked me how I found out. What I really, really saw as very useful for robotics about event cameras were two particular things. First of all, the very high temporal resolution, because this can be very useful for safety-critical systems. And I’m thinking about drones, but also about avoiding collisions in the automotive setting, because now we are also working in automotive settings as well. And also when you have to navigate in low-light environments, where using a standard camera with high exposure times, you would be coping with a lot of motion blur that would cause feature loss and other artifacts, like the impossibility of detecting objects and so on. So event cameras excel at this: no motion blur and very low latency. Another thing that could also be very interesting, especially for lightweight robotics—and I’m thinking of micro drones—is the fact that they also consume very little power. So little power, in fact, that just to be on, an event camera consumes one milliwatt on average, because the power consumption depends on the dynamics of the scene. If nothing moves, then the power consumption is negligible. If something moves, it is between one milliwatt and a maximum of 10 milliwatts.
Now, the interesting thing is that if you then couple event cameras with spiking neuromorphic chips that also consume less than one milliwatt, you can actually mount them on a micro drone, and you can do amazing things, and we started working on it. The problem is, how do you train spiking networks? But that’s another story. Other interesting things where I see potential applications of event cameras are also, for example— now, think about the keyframe feature of the Skydio drones. And here what you are doing, guys, is that basically, you are flying the drones around, and then you’re trying to send 3D positions and orientations of where you would like then [inaudible] to fly faster through. But the images have been captured while the drone is still. So basically, you move the drone to a certain position, you orient it in the direction where later you want it to fly, and then you record the position and orientation, and later, the drone will fly agilely through it. But that means that, basically, the drone should be able to relocalize fast with respect to this keyframe. Well, at some point, there are failure modes. We already know them: when the illumination goes down and there is motion blur. And this is actually something where I see the event camera could be beneficial. And then other things, of course [crosstalk]—
Ackerman: Do you agree with that, Adam?
Bry: Say again?
Ackerman: Do you agree, Adam?
Bry: I guess I’m— and this is why kind of I’m asking the question. I’m very curious about event cameras. When I have kind of the pragmatic hat on of trying to build these systems and make them as useful as possible, I see event cameras as quite complementary to traditional cameras. So it’s hard for me to see a future where, for example, on our products, we would be only using event cameras. But I can certainly imagine a future where, if they were compelling from a size, weight, cost standpoint, we would have them as an additional sensing mode to get a lot of the benefits that Davide is talking about. And I don’t know if that’s a research direction that you guys are thinking about. And in a research context, I think it’s very cool and interesting to see what can you do with just an event camera. I think that the most likely scenario to me is that they would become like a complementary sensor, and there’s probably a lot of interesting things to be done of using standard cameras and event cameras side by side and getting the benefits of both, because I think that the context that you get from a conventional camera that’s just giving you full static images of the scene, combined with an event camera could be quite interesting. You can imagine using the event camera to sharpen and get better fidelity out of the conventional camera, and you could use the event camera for faster response times, but it gives you less of a global picture than the conventional camera. So Davide’s smiling. Maybe I’m— I’m sure he’s thought about all these ideas as well.
Scaramuzza: Yeah. We have been working on that exact thing, combining event cameras with standard cameras, now for the past three years. So initially, when we started almost 10 years ago, of course, we only focused on event cameras alone, because it was intellectually very challenging. But the reality is that an event camera—let’s not forget—is a differential sensor. So it’s only complementary to a standard camera. You will never get the full absolute intensity out of an event camera. We showed that you can actually reproduce the grayscale intensity up to an unknown absolute intensity with very high fidelity, by the way, but it’s only complementary to a standard camera, as you correctly said. So actually, you already mentioned everything we are working on and have also already published. So for example, you mentioned unblurring blurry frames. This has already been done, not by my group, but by the group of Richard Hartley at the Australian National University in Canberra, Australia. And what we also showed in my group last year is that you can also generate super slow motion video by combining an event camera with a standard camera, by basically using the events in the blind time between two frames to interpolate and generate frames at any arbitrary time. And so we showed that we could actually upsample a low-frame-rate video by a factor of 50, while consuming only one-fortieth of the memory footprint. And this is interesting, because—
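For readers curious how the events in the “blind time” between two frames can be used, here is a minimal sketch in Python. It is not the published method (which handles noise, per-pixel thresholds, and learned refinement); the contrast value, the time-ordering assumption, and the function name are all assumptions.

import numpy as np

def interpolate_frame(last_frame, events, t_target, contrast=0.15):
    """last_frame: grayscale image at t0; events: iterable of (t, x, y, polarity)."""
    log_I = np.log(last_frame.astype(np.float64) + 1e-3)
    for t, x, y, pol in events:
        if t > t_target:
            break                      # events are assumed to be time-ordered
        log_I[y, x] += pol * contrast  # each event is one contrast step in log intensity
    return np.clip(np.exp(log_I), 0, 255).astype(np.uint8)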
Bry: Do you think from— this is a curiosity question. From a hardware standpoint, I’m wondering if it’ll go the next— go even a bit further, like if we’ll just start to see image sensors that do both together. I mean, you could certainly imagine just putting the two pieces of silicon right next to each other, or— I don’t know enough about image sensor design, but even at the pixel level, you could have pixel— like just superimposed on the same piece of silicon. You could have event pixels next to standard accumulation pixels and get both sets of data out of one sensor.
Scaramuzza: Exactly. So both things have been done. So—
Bry: [crosstalk].
Scaramuzza: —the latest one I described, we actually installed an event camera side by side with a very high-resolution standard camera. But there is already an event camera called DAVIS that outputs both frames and events between the frames. This has been available since 2016, but at very low resolution, and only last year did it reach VGA resolution. That’s why we are combining—
Bry: That’s like [crosstalk].
Scaramuzza: —an event camera with a high-resolution standard camera, because we want to basically see what we could possibly do one day when these event cameras are also available [inaudible] resolution together with a standard camera overlaid on the same pixel array. But there is good news, because you also asked me another question about the cost of this camera. So the price, as you know very well, drops as soon as there is a mass product for it. The good news is that Samsung now has a product called SmartThings Vision Sensor that is basically conceived for indoor home monitoring, to detect people falling at home, and this device automatically triggers an emergency call. So this device is using an event camera, and it costs €180, which is much less than the cost of an event camera when you buy it from these companies, which is around €3,000. So that’s very good news. Now, if there are other, bigger applications, we can expect that the price will go down a lot, even below $5. That’s what these companies are openly saying. I mean, what I expect, honestly, is that it will follow what we experienced with time-of-flight cameras. I mean, the first time-of-flight cameras cost around $15,000, and then 15 years later, they were below $150. I’m thinking of the first Kinect that was time-of-flight and so on. And now we have them in all sorts of smartphones. So it all depends on the market.
Ackerman: Maybe one more question from each of you guys, if you’ve got one you’ve been saving for the end.
Scaramuzza: Okay. The very last question [inaudible]. Okay. I’ll ask, Adam, and then you tell me whether or not you want to answer. It’s, of course, about defense. So the question I prepared, I told Evan. So I read in the news that Skydio donated $300K worth of drones to Ukraine. So my question is, what are your views on military use or dual use of quadcopters, and what is the philosophy of Skydio regarding defense applications of drones? I don’t know if you want to answer.
Bry: Yeah, that’s a great question. I’m happy to answer that. So our mission, which we’ve talked about quite publicly, is to make the world more productive, creative, and safe with autonomous flight. And the position that we’ve taken, and which I feel very strongly about, is that working with the militaries of free democracies is very much in alignment and in support of that mission. So going back three or four years, we’ve been working with the US Army. We won the Army’s short-range reconnaissance program, which was essentially a competition to select the official kind of soldier-carried quadcopter for the US Army. And the broader trend there, which I think is really interesting and in line with what we’ve seen in other technology categories, is basically the consumer and civilian technology just raced ahead of the traditional defense systems. The military has been using drones for decades, but their soldier-carried systems were these multi-hundred-thousand-dollar things that are quite clunky, quite difficult to use, not super capable. And our products and other products in the consumer world basically got to the point where they had comparable and, in many cases, superior capability at a fraction of the cost.
And I think— to the credit of the US military and other departments of defense and ministries of defense around the world, I think people realized that and decided that they were better off going with these kinds of dual-use systems that were predominantly designed and scaled in civilian markets, but also had defense applicability. And that’s what we’ve done as a company. So it’s essentially our consumer civilian product that’s extended and tweaked in a couple of ways, like the radios, some of the security protocols, to serve defense customers. And I’m super proud of the work that we’re doing in Ukraine. So we’ve donated $300,000 worth of systems. At this point, we’ve sold way, way more than that, and we have hundreds of systems in Ukraine that are being used by Ukrainian defense forces, and I think that’s good, important work. The final piece of this that I’ll say is we’ve also decided that we aren’t, and won’t be, putting weapons on our drones. So we’re not going to build actual munition systems, which I think is— I don’t think there’s anything ethically wrong with that. Ultimately, militaries need weapons systems, and those have an important role to play, but it’s just not something that we want to do as a company, and it’s kind of out of step with the dual-use philosophy, which is really how we approach these things.
I have a question that I’m— it’s aligned with some of what we’ve talked about, but I’m very interested in how you think about and focus the research in your lab, now that this stuff is becoming more and more commercialized. There’s companies like us and others that are building real products based on a lot of the algorithms that have come out of academia. And in general, I think it’s an incredibly exciting time where the pace of progress is accelerating, there’s more and more interesting algorithms out there, and it seems like there’s benefits flowing both ways between research labs and between these companies, but I’m very interested in how you’re thinking about that these days.
Scaramuzza: Yes. It’s a very interesting question. So first of all, I think of you also as a robotics company. And so what you are demonstrating is what [inaudible] of robotics in navigation and perception can do, and the fact that you can do it on a drone means you can also do it on other robots. And that actually is a call for us researchers, because it pushes us to think of new avenues where we can actually contribute. Otherwise, it looks like everything has been done. And so what, for example, we have been working on in my lab is trying to— so towards the goal of achieving human-level performance, how do humans navigate? They don’t do optimal control and geometric 3D reconstruction. We have a brain that does everything end to end, or at least with the [inaudible] subnetworks. So one thing that we have been playing with is deep learning, for, yeah, six years now. But in the last two years, we realized, actually, that you can do a lot with deep networks, and also, they have some advantages compared to the usual traditional autonomy architectures of autonomous robots. So what is the standard way to control robots, be they flying or ground robots? You have [inaudible] estimation. Then you have perception. So basically, spatial AI, semantic understanding. Then you have localization, path planning, and control.
Now, all these modules are basically communicating with one another. Of course, you want them to communicate in a smart way, because you also want to try to plan trajectories that facilitate perception, so you have no motion blur while you navigate, and so on. But somehow, they are always conceived by humans. And so what we are trying to understand is whether you can actually replace some of these blocks, or even all the blocks up to a certain point, with deep networks, which begs the question, can you even train a policy end to end that takes as input some sort of sensory data, like either images or even sensory abstractions, and outputs control commands at some sort of output abstraction, like [inaudible] or like waypoints? And what we found out is that, yes, this can be done. Of course, the problem is that for training these policies, you need a lot of data. And how do you generate this data? You cannot just fly drones in the real world. So we started working more and more in simulation. So now we are actually training all these things in simulation, even for forests. And thanks to video game engines like Unity, now you can download a lot of these 3D environments and then deploy your algorithms there to train and teach a drone to fly in just a bunch of hours rather than flying and crashing drones in the real world, which is very costly as well. But the problem is that we need better simulators.
We need better simulators, and I’m not just thinking of the realism. I think that one is actually somewhat solved. I think we need better physics, like aerodynamic effects and other non-idealities. These are difficult to model. So we are also working on these kinds of things. And then, of course, another big thing is that you would like to have a navigation policy that is able to abstract and generalize to different types of tasks, and possibly, at some point, even to tell your drone or robot a high-level description of the task, and the drone or the robot would actually accomplish it. That would be the dream. I think that we, the robotics community, are moving towards that.
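As an illustration of the kind of end-to-end policy described above, here is a minimal sketch in Python: sensory input in, a waypoint-style output abstraction out, with training intended to happen in simulation. The architecture, layer sizes, input layout, and class name are all assumptions, not the lab’s actual network.

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

class NavigationPolicy:
    """Tiny MLP mapping a flattened depth patch plus drone state to a 3D waypoint."""
    def __init__(self, input_dim=16 * 16 + 6, hidden=64, rng=np.random.default_rng(0)):
        self.W1 = rng.normal(0, 0.1, (hidden, input_dim))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0, 0.1, (3, hidden))   # output: relative waypoint
        self.b2 = np.zeros(3)

    def act(self, depth_patch, drone_state):
        # depth_patch: 16x16 array; drone_state: 6-vector (e.g., velocity and attitude terms).
        x = np.concatenate([depth_patch.ravel(), drone_state])
        h = relu(self.W1 @ x + self.b1)
        return self.W2 @ h + self.b2  # (dx, dy, dz) handed to a low-level controller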
Bry: Yeah. I agree. I agree, and I’m excited about it.
Ackerman: We’ve been talking with Adam Bry from Skydio and Davide Scaramuzza from the University of Zürich about agile autonomous drones, and thanks again to our guests for joining us. For Chatbot and IEEE Spectrum, I’m Evan Ackerman.
Navigation over tortuous terrain such as that found in natural subterranean environments presents a significant challenge to field robots. The diversity of hazards, from large boulders to muddy or even partially submerged earth, eludes complete definition. The challenge is amplified if the presence and nature of these hazards must be shared among multiple agents operating in the same space. Furthermore, highly efficient mapping and robust navigation solutions are critical to operations such as semi-autonomous search and rescue. We propose an efficient and modular framework for semantic grid mapping of subterranean environments. Our approach encodes occupancy and traversability information, as well as the presence of stairways, into a grid map that is distributed amongst a robot fleet despite bandwidth constraints. We demonstrate that the mapping method enables safe and enduring exploration of subterranean environments. The performance of the system is showcased in high-fidelity simulations, in physical experiments, and in Team MARBLE’s entry in the DARPA Subterranean Challenge, which received third place.
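To illustrate how occupancy, traversability, and stairway information could be encoded compactly enough to share over a constrained link, here is a minimal sketch in Python. The field layout, bit widths, and function names are assumptions for illustration, not Team MARBLE’s actual map format.

import numpy as np

def encode_cell(occupied, traversability, is_stairway):
    """occupied: bool; traversability: 0..31 (0 = blocked); is_stairway: bool."""
    byte = (int(occupied) << 7) | (int(is_stairway) << 6) | (int(traversability) & 0x1F)
    return np.uint8(byte)

def decode_cell(byte):
    return {
        "occupied": bool((byte >> 7) & 1),
        "stairway": bool((byte >> 6) & 1),
        "traversability": int(byte & 0x1F),
    }

# Example: a 10 x 10 patch of the grid costs 100 bytes before compression,
# which is the kind of budget a bandwidth-limited multi-robot link can tolerate.
patch = np.array([[encode_cell(False, 31, False)] * 10] * 10, dtype=np.uint8)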
Episode 2: How Labrador and iRobot Create Domestic Robots That Really Help
Evan Ackerman: I’m Evan Ackerman, and welcome to ChatBot, a new podcast from IEEE Spectrum where robotics experts interview each other about things that they find fascinating. On this episode of ChatBot, we’ll be talking with Mike Dooley and Chris Jones about useful robots in the home. Mike Dooley is the CEO and co-founder of Labrador Systems, the startup that’s developing an assistive robot in the form of a sort of semi-autonomous mobile table that can help people move things around their homes. Before founding Labrador, Mike led the development of Evolution Robotics’ innovative floor-cleaning robots. And when Evolution was acquired by iRobot in 2012, Mike became iRobot’s VP of product and business development. Labrador Systems is getting ready to launch its first robot, the Labrador Retriever, in 2023. Chris Jones is the chief technology officer at iRobot, which is arguably one of the most successful commercial robotics companies of all time. Chris has been at iRobot since 2005, and he spent several years as a senior investigator at iRobot research working on some of iRobot’s more unusual and experimental projects. iRobot Ventures is one of the investors in Labrador Systems. Chris, you were doing some interesting stuff at iRobot back in the day too. I think a lot of people may not know how diverse iRobot’s robotics projects were.
Chris Jones: I think iRobot as a company, of course, being around since 1990, has done all sorts of things. Toys, commercial robots, consumer, military, industrial, all sorts of different things. But yeah, myself in particular, I spent the first seven, eight years of my time at iRobot doing a lot of super fun kind of far-out-there research types of projects, a lot of them funded by places like DARPA and working with some great academic collaborators, and of course, a whole crew of colleagues at iRobot. But yeah, some of those ranged from completely squishy robots to robot arms to robots that could climb mountainsides to robots under the water, all sorts of different fun, useful, and really challenging (which makes it fun) types of robot concepts.
Ackerman: And those are all getting incorporated to the next generation Roomba, right?
Jones: I don’t know that I can comment on—
Ackerman: That’s not a no. Yeah. Okay. So Mike, I want to make sure that people who aren’t familiar with Labrador get a good understanding of what you’re working on. So can you describe kind of Labrador’s robot, what it does and why it’s important?
Mike Dooley: Yeah. So at Labrador, we’re developing a robot called the Retriever, and it’s really designed as an extra pair of hands for individuals who have some issue, either with pain, a health issue, or an injury, that impacts their daily activities, particularly in the home. And so this is a robot designed to help people live more independently, to augment their abilities, and to give them back some degree of autonomy where they’re fighting the issue that they’re facing. And the robot, I think it’s been— after previewing it at CES, it has been called a self-driving shelf. It’s designed to be really a mobile platform that’s about the size of a side table but has the ability to carry things as large as a laundry basket or to set dinner and plates on, and it automatically navigates from place to place. It raises up to countertop height when you’re by the kitchen sink and lowers down when you’re by your armchair. And it has the ability to retrieve too. So it’s a cross between robots that are used in warehousing and furniture, mixed together to make something that’s comfortable and safe for the home environment, but it’s really meant to help folks who have some difficulty moving themselves. This is meant to give them some degree of that independence back, as well as extend the impact for caregivers.
Ackerman: Yeah, I thought that was a fantastic idea when I first saw it at CES, and I’m so glad that you’ve been able to continue working on it. And especially with some support from folks like iRobot, right? Chris, iRobot is an investor in Labrador?
Jones: Correct. Through iRobot Ventures, we’re an early investor in Labrador, and we continue to be super excited about what they’re doing. I mean, for us, anyone who has great ideas for how robots can help people, in particular assist people in their home with independent living, etc., is working on something we strongly believe is going to be a great application for robots. And when making investments, I’ll just add, of course, that at the earliest stage, a lot of it is about the team, right? And Mike and the rest of his team are super compelling, right? That, paired with a vision for something that we believe is a great application for robots, makes it an easy decision, right, to say they’re someone we’d like to support. So we love seeing their progress.
Ackerman: Yeah, me too.
Dooley: And we appreciate your support very much. So yeah.
Ackerman: All right, so what do you guys want to talk about? Mike, you want to kick things off?
Dooley: I can lead off. Yeah, so in full disclosure, at some point in my life, I was-- Chris, what’s the official name for an iRobot employee? I forgot what they came up with. It’s not iRoboteer, is it?
Jones: iRoboteer. Yeah.
Dooley: Okay, okay. All right, so I was an iRoboteer in my past life and crossed over with Chris for a number of years. And I know they’ve renovated the building a couple times now, but these products you mentioned, or the robots you mentioned at the beginning, a lot of them are on display in a museum. And so I think my first question to Chris is, can you think of one of those, either one that you worked on or not, where you go, “Man, this should have taken off,” or you wish it would have? Because there’s a lot in there, so.
Jones: Yes, there are a lot. You’re right. We have a museum, and it has been renovated in the last couple years, Mike, so you should come back and visit and check out the new updated museum. How would I answer that? There are so many things in there. I would say one that I have some sentimentality toward, and I think it holds some really compelling promise, even though at least to date, it hasn’t gone anywhere outside of the museum, Evan, is related to the squishy robots I was talking about. And in my mind, one of the key challenges in unlocking future value in robots, and in particular in autonomous robots, for example, in the home, is manipulation, physical manipulation of the environment in the home. And Mike and Labrador are doing a little bit of this, right, by being able to maneuver and pick up, carry, drop off some things around the home. But the idea of a robot that’s able to physically pick up, grasp objects, pick them up off the floor, off a counter, open and close doors, all of those things is kind of the Holy Grail, right, if you can cost-effectively and robustly do that. In the home, there’s all sorts of great applications for that. And one of those research projects that’s in the museum was actually something called the Jamming Gripper. Mike, I don’t know if you remember seeing that at all, but this takes me back. And Evan, actually, I’m sure there are some IEEE stories and stuff back in the day from this. But this was an idea for a very compliant, soft manipulator. It’s not a hand. It’s actually very close to imagining a very soft membrane that’s filled with coffee grounds. So imagine a bag of coffee, right? Very soft and compliant.
But vacuum-packed coffee, you pull a vacuum on that bag. It turns rigid in the shape that it was in. It’s like a brick, which is a great concept for thinking about robot manipulation. That’s one idea. We had spent some research time with some folks in academia, had built a huge number of prototypes, and I still feel like there’s something there. There’s a really interesting concept there that can help with that more general purpose manipulation of objects in the home. So Mike, if you want to talk to us about licensing, maybe we can do that for Labrador with all your applications.
Dooley: Yeah. Actually, that’s what you should add. It would probably increase your budget dramatically, but you should add live demonstrations to the museum. See if you can have projects to get people to bring some of those back. Because I’m sure I saw it. I never knew it was doing that.
Jones: I mean, maybe we can continue this. There might be a little bit of a thread to continue that question into—the first one that came to my mind, Mike, when I was thinking about what to ask. And it’s something I have a lot of admiration and respect for in you and how you do your job, which is that you’re super good at engaging and listening to users kind of in their context to understand what their problems are, such that you can best kind of articulate or define or ideate things that could help them address problems that they encounter in their everyday life. And that then allows you kind of as a leader, right, to use that to motivate quick prototype development to get the next level of testing or validation of what if this, right? And those things may or may not involve duct tape, right, involve some very crude things that are trying to elicit kind of that response or feedback from a user in terms of, is this something that would be valuable to you in overcoming some challenges that I’ve observed you having, let’s say, in your home environment? So I’m curious, Mike, how do you think about that process and how that translates into shaping a product design or the identification of an opportunity? I’m curious, maybe what you’ve learned through Labrador. I know you spent a lot of time in people’s homes to do exactly that. So I’m curious, how do you conduct that work? What are you looking for? How does that guide your development process?
Dooley: The word for what you’re talking about is customer empathy: are you feeling their pain? Are you understanding their need, and how are you connecting with it? And my undergrad’s in psychology, so I always was interested in what makes people think the way they do. I remember an iRobot study going into a home. We were in the last day of testing with somebody, a busy mom. And we’re testing Braava Jet. It’s a little robot that iRobot sells that’s really good for tight spaces, for spraying and scrubbing floors, like kitchens and bathrooms. And the mom said something, almost with exhaustion. I said, “What is it?” She says, “Does this do as good of a job as you could do?” And I think most people from iRobot would admit, “No. Can it match the elbow grease, all the effort and everything I can put into this?” And she says, “But at least I can set this up, hit a button, and I can go to sleep. And at least it’s getting the job done. It’s doing something, and it gives me my time back.” And when you hear that, people go, “Well, Roomba is just something that cleans for people or whatever.” Like, “No. Roomba gives people their time back.” And once you’re on that channel, then you start thinking about, “Okay, what can we do more with the product that does that, that’s hitting that sort of core thing?” So yeah, and I think it’s having the humbleness to not build the product you want but to build it to the need, and then also the humbleness about where you can meet that need and where you can’t. Because robotics is hard, and we can’t make Rosey yet and things like that, so.
Ackerman: Mike, I’m curious, did you have to make compromises like that? Is there an example you could give with Labrador?
Dooley: Oh, jeez, all the— yeah. I mean, no, Labrador is perfect. No, I mean, we go through that all the time. I think on Labrador, no, we can’t do everything people want. What you’re trying to say is— I think there are different languages for minimum viable product or good enough. There was somebody at Amazon who used the term— I’m going to blank on it. It was like wonderful enough or something, or they have a nicer—
Jones: Lovable?
Dooley: Lovable. Yeah, lovable enough or something. And I think that that’s what you have to remember, is like, so on one hand, you have to sort of have this open heart that you want to help people. And on the other hand, you have to have a really tight wallet because you just can’t spend enough to meet everything that people want. And so a classic example is, Labrador goes up and down a certain amount of height. And with people’s cabinets, someone in a wheelchair would love it if we would go up to the upper cabinets above the kitchen sink or other locations. And when you look at that, mechanically we can, but that then creates— there are product realities about stability and tilt testing. And so we have to fit those. Chris knows that well with Ava, for instance: how heavy the base has to be for every inch you raise the mass above a certain amount. And so we have to make a limit. You have to say, “Hey, here’s the envelope. We’re going to do this to this, or we’re going to carry this much because that’s as much as we could deliver with this sort of function.” And then, is that lovable enough? Is that rewarding enough to people? And I think that’s the hard [inaudible], is that you have to do these deliveries within constraints. And I think sometimes when I’m talking to folks, they’re either outside robotics or they’re very much on the engineering side and not thinking about the product. They tend to think that you have to do everything. And it’s like, that’s not how product development works; you have to do just the critical first step, because then that makes this a category, and then you can do the next one and the next one. I think it brings to mind— Roomba has gone through an incredible evolution of what its functions were and how it worked and its performance, from the very first version to what Chris and team offer now. But if they had tried to do today’s version back then, they wouldn’t have been able to achieve it. And others fail because they probably went at it from the wrong angle. And yeah.
Jones: Evan, I think you asked if there was anything that was operating under constraints. I think product development in general, I presume, but certainly robotics, is all about constraints. It’s how do you operate within those? How do you understand where those boundaries are and having to make those calls as to— how are you going to have to— how are you going to decide to constrain your solution, right, to make sure that it’s something that’s feasible for you to do, right? It’s meeting a compelling need. It’s feasible for you to do. You can robustly deliver it. Trying to get that entire equation to work means you do have to reckon with those constraints kind of across the board to find the right solve. Mike, I’m curious. You do your user research, you have that customer empathy, you’ve perhaps worked through some of these surprising challenges that I’m sure you’ve encountered along the way with Labrador. You ultimately get to a point that you’re able to do pilots in homes, right? You’re actually now this— maybe the duct tape is gone or it’s at least hidden, right? It’s something that looks and feels more like a product and you’re actually getting into some type of more extended pilot of the product or idea of the product in users’ homes. What are the types of things you’re looking to accomplish with those pilots? Or what have you learned when you go from, “All right, I’ve been watching this user in their home with those challenges. So now I’m actually leaving something in their home without me being there and expecting them to be able to use it”? What’s the benefit or the learnings that you encounter in conducting that type of work?
Dooley: Yeah, it’s a weird type of experiment and there’s different schools of thought of how you do stuff. Some people want to go in and research everything to death and be a fly on the wall. And we went through this— I won’t say the source of it. A program we had to go through because of some of the— because of some of the funding that we’re getting from another project. And the quote in the beginning, they put up a slide that I think it’s from Steve Jobs. I’m sure I’m going to butcher it, that people don’t know what they want until I show them or something. I forget what the exact words are. And they were saying, “Yeah, that’s true for Steve Jobs, but for you, you can really talk to the customer and they’re going to tell you what they need.” I don’t believe that.
Jones: They need a faster horse, right? They don’t need a car.
Dooley: Yeah, exactly.
Jones: They’re going to tell you they need a faster horse.
Dooley: Yeah, so I’m in the Steve Jobs camp and on that. And it’s not because people aren’t intelligent. It’s just that they’re not in that world of knowing what possibilities you’re talking about. So I think there is this sort of soft skill between, okay, listen to their pain point. What is that difficulty of it? You’ve got a hypothesis to say, “Okay, out of everything you said, I think there’s an overlap here. And now I want to find out—” and we did that. We did that in the beginning. We did different ways of explaining the concept, and then the first level we did was just explain it over the phone and see what people thought of it and almost test it neutrally. Say, “Hey, here’s an idea.” And then, “Oh, here’s an idea like Roomba and here’s an idea like Alexa. What do you like or dislike?” Then we would actually build a prototype that was remote-controlled and brought it in their home, and now we finally do the leave-behind. And the whole thing is it’s like how to say it. It’s like you’re sort of releasing it to the world and we get out of the way. The next part is that it’s like letting a kid go and play soccer on their own and you’re not yelling or anything or don’t even watch. You just sort of let it happen. And what you’re trying to do is organically look at how are people— you’ve created this new reality. How are people interacting with it? And what we can see is the robots, they won’t do this in the future, but right now they talk on Slack. So when they send it to the kitchen, I can look up and I can see, “Hey, user one just sent it to the kitchen, and now they’re sending it to their armchair, and they’re probably having an afternoon snack. Oh, they sent it to the laundry room. Now they sent it over to the closet. They’re doing the laundry.” And the thing for us was just watching how fast were people adopting certain things, and then what were they using it for. And the striking thing that was—
Jones: That’s interesting.
Dooley: Yeah, go ahead.
Jones: I was just going to say, I mean, that’s interesting because I think I’m sure it’s very natural to put the product in someone’s home and kind of have a rigid expectation of, “No, no, this is how you use it. No, no, you’re doing it wrong. Let me show you how you use this.” But what you’re saying is it’s almost, yeah, you’re trying your best to solve their need here, but at some point you kind of leave it there, and now you’re also back into that empathy mode. It’s like, “Now with this tool, how do you use it?” and see kind of what happens.
Dooley: I think you said it in a really good way, is that you’ve changed this variable in the experiment. You’ve introduced this, and now you go back to just observing, just hearing what they’re— just watching what they’re doing with it, being as unintrusive as possible, which is like, “We’re not there anymore.” Yeah, the robot’s logging it and we can see it, but it’s just on them. And we’re trying to stay out of the process and see how they engage with it. And that’s sort of like the thing that— we’ve shared it before, but we were just seeing that people were using it 90 to 100 times a month, especially after the first month. It was like, we were looking at just the steady state. Would this become a habit or routine, and then what were they using it for?
Jones: So you’re saying when you see that, you have kind of a data point of one or a small number, but you have such a tangible understanding of the impact that this seems to be having that, as an entrepreneur, it gives you a lot of confidence that may not be visible to people outside the walls who are just trying to look at what you’re doing in the business. They see one data point, which is harder to grapple with, but you, being that close and understanding that connection between what the product is doing and the need, that gives you or the team a substantial confidence boost, right? “This is working. We need to scale it. We have to show that this ports to other people in their homes, etc.” It gives you that confidence.
Dooley: Yeah, and then when we take the robots away, because we only have so many and we rotate them, getting the guilt trip emojis two months later from people, “I miss my robot. When are you going to build a new one?” and all that and stuff. So—
Jones: Do people name the robots?
Dooley: Yeah. They immediately do that and come up with creative names for it. One was called Rosey, naturally, but another was— I’m forgetting the name she called it. It was inspired by a science fiction story about an artificial AI companion. And it was just quite a bit of different angles of— because she saw this as her assistant. She saw this as sort of this thing. But yeah, so I think that, again, for a robot, what you can see in the design is that the classic thing at CES is to make a robot with a face and arms that doesn’t really do anything with those, but it pretends to be humanoid or human-like. And so we went the entire other route with this. And the fact that people then still relate to it that way means that— we’re not trying to be cold or dispassionate. We’re just really interested in, can they get that value? Are they reacting to what the robot is doing, not to the sort of halo that you dressed it up with?
Jones: Yeah, I mean, as you know, like with Roomba or Braava and things like that, it’s the same thing. People project anthropomorphism or project that personality onto them, but that’s not really there, right, in a strong way. So yeah.
Dooley: Yeah, no, and it’s weird. And it’s something they do with robots in a weird way that they don’t-- people don’t name their dishwasher usually or something. But no, I would have-
Dooley: Yeah, [inaudible]. I did for a while. The stove got jealous, and then we had this whole thing when the refrigerator got into it.
Ackerman: I’ve heard anecdotally that maybe this was true with PackBots. I don’t know if it’s true with Roombas. That people want their robot back. They don’t want you to replace their old robot with a new robot. They want you to fix the old robot and have that same physical robot. It’s that lovely connection.
Jones: Yeah, certainly, with PackBot on kind of the military robot side, for bomb disposal and things like that, you would directly get those technicians who had a damaged robot, and they didn’t want a new robot. They wanted this one fixed, right? Because again, they anthropomorphize, or there is some type of a bond there. And I think that’s been true with all of the robots, right? It’s something about the mobility, right, that embodies them with some type of a— people project a personality on it. So they don’t have to be fancy and have arms and faces necessarily for people to project that on them. So that seems to be a common trait for any autonomously mobile platform.
Ackerman: Yeah. Mike, it was interesting to hear you say that. You’re being very thoughtful about that, and so I’m wondering if Chris, you can address that a little bit too. I don’t know if they do this anymore, but for a while, robots would speak to you, and I think it was a female voice that they had if they had an issue or something or needed to be cleaned. And that I always found to be an interesting choice because it’s sort of like the company is now giving this robot a human characteristic that’s very explicit. And I’m wondering how much thought went into that, and has that changed over the years about how much you’re willing to encourage people to anthropomorphize?
Jones: I mean, it’s a good question. I mean, that’s evolved, I would say, over the years, from not so much to there being more of kind of a vocalization coming from the robot for certain scenarios. It is an important part. For some users, that is a primary way of interacting. I would say more of that type of feedback these days comes through more of kind of the mobile experience, through the app, to give both the feedback, additional information, and actionable next steps. If you need to empty the dustbin or whatever it is, that’s just a richer place to put that and a more accepted or common way for that to happen. So I don’t know, I would say that’s the direction things have trended, but that’s not because we don’t want to humanize the robot itself. It’s just more of a practical place where people these days will expect it. It’s almost like Mike was saying about the dishwasher and the stove, etc. If everything is trying to talk to you like that or kind of project its own embodiment into your space, it could be overwhelming. So I think it’s easier to connect people at the right place and the right time with the right information, perhaps, if it’s through the mobile experience though.
But it is. That human-robot interaction or that experience design is nuanced and tricky. I’m certainly not an expert there myself, but it’s hard to find that right balance, that right mix of what you ask or expect of the user versus what you assume or don’t give them an option on. Because you also don’t want to overload them with too much information or too many options or too many questions, right, as they try to operate the product. So sometimes you do have to make assumptions, make defaults, right, that maybe can be changed if there’s really a need, though that might require more digging. And Mike, I was curious. That was a question I had for you: you have a physically, meaningfully sized product that’s operating autonomously in someone’s home, right?
Dooley: Yes.
Jones: Roomba can drive around and will navigate, and it’s a little more expected that we might bump into some things as we’re trying to clean and clean up against walls or furniture and all of that. And it’s small enough that that isn’t an issue. How do you design for a product of the size that you’re working on, right? What went into the kind of human-robot interaction side of that to allow people who need to use this in their home, who are not technologists, to take advantage of the great value, right, that you’re trying to deliver for them? But it’s got to be super simple. How did you think about that HRI kind of design?
Dooley: There’s a lot wrapped into that. I think the bus stop metaphor is the first part of it. What’s the simplest way that they can command it? Like, everybody can relate to armchair or front door, that sort of thing. And so that idea that the robot just goes to these destinations is super simplifying. People get that metaphor almost in a nanosecond. So that was one part of it. And then you sort of explain the rules of the road of how the robot can go from place to place. It’s got these bus routes, but they’re elastic, and it can go around you if needed. But there’s all these types of interactions. Okay, we figured out what happens when you’re coming down the hall and the robot’s coming down. Let’s say you’re somebody else and you just walk towards each other. And I know in hospitals, the robot’s programmed to go to the side of the corridor. There’s no side in a home. That’s the stuff. So those are things that we still have to iron out, but there are timeouts and there are things of—that’s where we’ll be—we’re not doing it yet, but it’d be great to recognize that’s a person, not a closed door or something, and respond to it. So right now, we have to tell the users, “Okay, it’ll spin for a time to make sure you’re there, but then it’ll give up. And if you really wanted to, you could tell it to go back from your app. You could get out of the way if you want, or you could stop it by doing this.”
And so that’ll get refined as we get to the market, but those interactions, yeah, you’re right. You have this big robot that’s coming down. And one of the surprising things was it’s not just people. One of the women in the pilot had a Border Collie, and Border Collies are, by instinct, bred to herd sheep. So it would hear the robot. The robot’s very quiet, but she would command it. It would hear the robot coming down the hall and it would put its paw out to stop it, and that became its game. It started herding the robot. And so it’s really this weird thing, this metaphor you’re getting at.
Jones: Robots are pretty stubborn. The robot probably just sat there for like five minutes, like, “Come on. Who’s going to blink?”
Dooley: Yeah. Yeah. And the AI we’d love to add, we have to catch up with where you guys are at or license some of your vision recognition algorithms because, first, we’re trying to navigate and avoid obstacles. And that’s where all the tech is going into in terms of the design and the tiers of safety that we’re doing. But it’s just like what the user wanted in that case is, if it’s the dog, can you play my voice, say, “Get out” or, “Move,” or whatever, or something, “Go away”? Because she sent me a video of this. It’s like it was happening to her too, is she would send the robot out. The dogs would get all excited, and she’s behind it in her wheelchair. And now the dogs are waiting for her on the other side of the robot, the robot’s wondering what to do, and they’re all in the hall. And so yeah, there’s this sort of complication that gets in there that you have multiple agents going on there.
Ackerman: Maybe one more question from each of you guys. Mike, you want to go first?
Dooley: I’m trying to think. I have one more. And when you have new engineers start—let’s say they haven’t worked on robots before. They might be experienced. They’re coming out of school or they’re from other industries and they’re coming in. What is some key thing that they learn, or what sort of transformation goes on in their mind when they finally get in the zone of what it means to develop robots? And it’s a really broad question, but there’s sort of a rookie thing.
Jones: Yeah. What’s an aha moment that’s common for people new to robotics? And I think this is woven throughout this entire conversation here, which is, at the macro level, robots are actually hard. It’s difficult to kind of put the entire electromechanical software system together. It’s hard to perceive the world. If a robot’s driving around the home on its own, it needs to have a pretty good understanding of kind of what’s around it. Is something there, is something not there? The richer that understanding can be, the more adaptable or personalized it can be. But generating that understanding is also hard. Robots have to be built to deal with all of those unanticipated scenarios that they’re going to encounter when they’re let out into the wild. So I think it’s surprising to a lot of people how long that long tail of corner cases that you have to grapple with ends up being. If you ignore one of them, it can end the product, right? It’s a long tail of things. If any one of them rears its head enough for those users, they’ll stop using the product because, “Well, this thing doesn’t work, and this has happened like twice to me now in the year I’ve had it. I’m kind of done with it,” right?
So you really have to grapple with the very long, long tail of corner cases when the technology hits the real world. I think that’s a super surprising one for people who are new to robotics. It’s more than being a hardware consumer product company, a consumer electronics company. You do need to deal with those challenges of perception, mobility in the home, the chaos of— specifically, we’re talking about more of the home environment here, not the more structured environments on the industrial side. And I think everyone has to go through that learning curve of understanding the impact that can have.
Dooley: Yeah. Of the dogs and cats.
Jones: Yeah, I mean, who would have thought cats are going to jump on the thing or Border Collies are going to try to herd it, right? And you have to just-- and you don’t learn those things until you get products out there. And that’s, Mike, what I was asking you about pilots and what do you hope to learn or the experience there. Is you have to take that step if you’re going to start kind of figuring out what those elements are going to start looking like. It’s very hard to do just intellectually or on paper or in the lab. You have to let them out there. So that’s a learning lesson there. Mike, maybe a similar question for you, but--
Ackerman: This is the last one, so make it a good one.
Jones: Yep. The last one, it better be a good one, huh? It’s a similar question for you, but maybe cut more on address to an entrepreneur in the robotic space. I’m curious, for a robot company to succeed, there’s a lot of, I’ll call them, ecosystem partners, right, that have to be there. Manufacturing, channel, or go-to-market partners, funding, right, to support a capital-intensive development process, and many more. I’m curious, what have you learned or what do people need to going into a robotics development or looking to be a robotics entrepreneur, what do people miss? What have you learned? What have you seen? What are the partners that are the most important? And I’m not asking for, “Oh, iRobot’s an investor. Speak nicely on the financial investor side.” That’s not what I’m after. But what have you learned, that you better not ignore this set of partners because if one of them falls through or it doesn’t work or is ineffective, it’s going to be hard for all the other pieces to come together?
Dooley: Yeah, it’s complex. I think just like you said, robots is hard. I think when we got acquired by iRobot and we were having some of the first meetings over— it’s Mike from software. Halloran.
Ackerman: This was Evolution Robotics?
Dooley: Evolution. Yeah, but Mike Halloran from iRobot, we came to the office at the Evolution’s office, and he just said, “Robots are hard. They’re really hard.” And it’s like, that’s the point we knew there was harmony. We were sort of under this thing. And so for everything what Chris is saying is that all of that is high stakes. And so you sort of have to be-- you have to be good enough on all those fronts of all those partners. And so some of it is critical path technology. Depth cameras, that function is really critical to us, and it’s critical to work well and then cost and scale. And so just being flexible about how we can deal with that and looking at that sort of chain and how do we sort of start at one level and scale it through? So you look at sort of, okay, what are these key enabling technologies that have to work? And that’s one bucket that are there. Then the partnerships on the business side, we’re in a complex ecosystem. I think the other rude awakening when people look at this is like, “Well, yeah, why doesn’t-- as people get older, they have disabilities. That’s what you have-- that’s your insurance funds.” It’s like, “No, it doesn’t.” It doesn’t for a lot of-- unless you have specific types of insurance. We’re partnering with Nationwide. They have long-term care insurance - and that’s why they’re working with us - that pays for these sorts of issues and things. Or Medicaid will get into these issues depending on somebody’s need.
And so I think what we’re trying to understand is—this goes back to that original question about customer empathy—is that how do we adjust what we’re doing? That we have this vision. I want to help people like my mom where she is now and where she was 10 years ago when she was experiencing difficulties with mobility initially. And we have to stage that. We have to get through that progression. And so who are the people that we work with now that solves a pain point that can be something that they have control over that is economically viable to them? And sometimes that means adjusting a bit of what we’re doing, because it’s just this step onto the long path as we do it.
Ackerman: Awesome. Well, thank you both again. This was a great conversation.
Jones: Yeah, thanks for having us and for hosting, Evan and Mike. Great to talk to you.
Dooley: Nice seeing you again, Chris and Evan. Same. Really enjoyed it.
Ackerman: We’ve been talking with Chris Jones from iRobot and Mike Dooley from Labrador Systems about developing robots for the home. And thanks again to our guests for joining us, for ChatBot and IEEE Spectrum. I’m Evan Ackerman.
Episode 2: How Labrador and iRobot Create Domestic Robots That Really Help
Evan Ackerman: I’m Evan Ackerman, and welcome to ChatBot, a new podcast from IEEE Spectrum where robotics experts interview each other about things that they find fascinating. On this episode of ChatBot, we’ll be talking with Mike Dooley and Chris Jones about useful robots in the home. Mike Dooley is the CEO and co-founder of Labrador Systems, the startup that’s developing an assistive robot in the form of a sort of semi-autonomous mobile table that can help people move things around their homes. Before founding Labrador, Mike led the development of Evolution Robotics’ innovative floor-cleaning robots. And when Evolution was acquired by iRobot in 2012, Mike became iRobot’s VP of product and business development. Labrador Systems is getting ready to launch its first robot, the Labrador Retriever, in 2023. Chris Jones is the chief technology officer at iRobot, which is arguably one of the most successful commercial robotics companies of all time. Chris has been at iRobot since 2005, and he spent several years as a senior investigator at iRobot research working on some of iRobot’s more unusual and experimental projects. iRobot Ventures is one of the investors in Labrador Systems. Chris, you were doing some interesting stuff at iRobot back in the day too. I think a lot of people may not know how diverse iRobot’s robotics projects have been.
Chris Jones: I think iRobot as a company, of course, being around since 1990, has done all sorts of things. Toys, commercial robots, consumer, military, industrial, all sorts of different things. But yeah, myself in particular, I spent the first seven, eight years of my time at iRobot doing a lot of super fun kind of far-out-there research types of projects, a lot of them funded by places like DARPA and working with some great academic collaborators, and of course, a whole crew of colleagues at iRobot. But yeah, some of those ranged from completely squishy robots to robot arms to robots that could climb mountainsides to robots under the water, all sorts of different fun, useful, and really challenging, which makes it fun, types of robot concepts.
Ackerman: And those are all getting incorporated to the next generation Roomba, right?
Jones: I don’t know that I can comment on—
Ackerman: That’s not a no. Yeah. Okay. So Mike, I want to make sure that people who aren’t familiar with Labrador get a good understanding of what you’re working on. So can you describe kind of Labrador’s robot, what it does and why it’s important?
Mike Dooley: Yeah. So Labrador, we’re developing a robot called the Retriever, and it’s really designed as an extra pair of hands for individuals who have some issue, either with pain, a health issue, or an injury, that impacts their daily activities, particularly in the home. And so this is a robot designed to help people live more independently, to augment their abilities, and to give them some degree of autonomy back where they’re fighting the issue that they’re facing. And the robot, I think it’s been— after previewing it at CES, it has been called a self-driving shelf. It’s designed to be really a mobile platform that’s about the size of a side table but has the ability to carry things as large as a laundry basket, or you can set dinner and plates on it, and it automatically navigates from place to place. It raises up to go to countertop height when you’re by the kitchen sink and lowers down when you’re by your armchair. And it has the ability to retrieve too. So it’s a cross between robots that are used in warehousing and furniture, mixed together to make something that’s comfortable and safe for the environment, but it’s really meant to help folks who have some difficulty moving themselves. This is meant to give them some degree of that independence back, as well as extend the impact for caregivers.
Ackerman: Yeah, I thought that was a fantastic idea when I first saw it at CES, and I’m so glad that you’ve been able to continue working on it. And especially with some support from folks like iRobot, right? Chris, iRobot is an investor in Labrador?
Jones: Correct. Through iRobot Ventures, we’re an early investor in Labrador, and we continue to be super excited about what they’re doing. I mean, for us, anyone who has great ideas for how robots can help people, in particular, assist people in their home with independent living, etc., that’s something we strongly believe is going to be a great application for robots. And when making investments at that earliest stage, I’ll just add, of course, a lot of it is about the team, right? And so Mike and the rest of his team are super compelling, right? That, paired with a vision that we believe is a great application for robots, makes it an easy decision, right, to say this is someone we’d like to support. So we love seeing their progress.
Ackerman: Yeah, me too.
Dooley: And we appreciate your support very much. So yeah.
Ackerman: All right, so what do you guys want to talk about? Mike, you want to kick things off?
Dooley: I can lead off. Yeah, so in full disclosure, at some point in my life, I was-- Chris, what’s the official name for an iRobot employee? I forgot what they came up with. It’s not iRoboteer, is it?
Jones: iRoboteer. Yeah.
Dooley: Okay, okay. All right, so I was an iRoboteer in my past life and crossed over with Chris for a number of years. And I know they’ve renovated the building a couple times now, but these products you mentioned, or the robots you mentioned at the beginning, a lot of them are on display in a museum. And so I think my first question to Chris is, can you think of one of those, either one that you worked on or maybe one you didn’t, where you go, “Man, this should have taken off,” or you wished it would have? Because there’s a lot in there, so.
Jones: Yes, there are a lot. You’re right. We have a museum, and it has been renovated in the last couple years, Mike, so you should come back and visit and check out the new updated museum. How would I answer that? There are so many things in there. I would say one that I have some sentimentality toward, and I think it holds some really compelling promise, even though, at least to date, it hasn’t gone anywhere outside of the museum, Evan, is related to the squishy robots I was talking about. And in my mind, one of the key challenges in unlocking future value in robots, and in particular in autonomous robots, for example, in the home, is manipulation, physical manipulation of the environment in the home. And Mike and Labrador are doing a little bit of this, right, by being able to maneuver and pick up, carry, drop off some things around the home. But the idea of a robot that’s able to physically grasp objects, pick them up off the floor, off a counter, open and close doors, all of those things is kind of the Holy Grail, right, if you can cost-effectively and robustly do that. In the home, there’s all sorts of great applications for that. And one of those research projects that’s in the museum was actually something called the Jamming Gripper. Mike, I don’t know if you remember seeing that at all, but this takes me back. And Evan, actually, I’m sure there are some IEEE stories and stuff back in the day from this. But this was an idea for a very compliant, soft manipulator. It’s not a hand. It’s actually very close to imagining a very soft membrane that’s filled with coffee grounds. So imagine a bag of coffee, right? Very soft and compliant.
But vacuum-packed coffee, you pull a vacuum on that bag. It turns rigid in the shape that it was in. It’s like a brick, which is a great concept for thinking about robot manipulation. That’s one idea. We had spent some research time with some folks in academia, had built a huge number of prototypes, and I still feel like there’s something there. There’s a really interesting concept there that can help with that more general purpose manipulation of objects in the home. So Mike, if you want to talk to us about licensing, maybe we can do that for Labrador with all your applications.
Dooley: Yeah. Actually, that’s what you should add. It would probably increase your budget dramatically, but you should add live demonstrations to the museum. See if you can have projects to get people to bring some of those back. Because I’m sure I saw it, but I never knew it could do that.
Jones: I mean, maybe we can continue this. There might be a little bit of a thread to continue that question into—the first one that came to my mind, Mike, when I was thinking about what to ask. And it’s something I have a lot of admiration or respect for you and how you do your job, which is you’re super good at engaging and listening to users kind of in their context to understand what their problems are. Such that you can best kind of articulate or define or ideate things that could help them address problems that they encounter in their everyday life. And that then allows you kind of as a leader, right, to use that to motivate quick prototype development to get the next level of testing or validation of what if this, right? And those things may or may not involve duct tape, right, involve some very crude things that are trying to elicit kind of that response or feedback from a user in terms of, is this something that would be valuable to you in overcoming some challenges that I’ve observed you having, let’s say, in your home environment? So I’m curious, Mike, how do you think about that process and how that translates into shaping a product design or the identification of an opportunity? I’m curious, maybe what you’ve learned through Labrador. I know you spent a lot of time in people’s homes to do exactly that. So I’m curious, how do you conduct that work? What are you looking for? How does that guide your development process?
Dooley: The word that you’re talking about is customer empathy: are you feeling their pain? Are you understanding their need, and how are you connecting with it? And my undergrad’s in psychology, so I always was interested in what makes people think the way they do. I remember an iRobot study going into a home. We were on the last day of testing with somebody, a busy mom. And we’re testing Braava Jet. It’s a little robot that iRobot sells that’s really good for places with tight spaces, for spraying and scrubbing floors, like kitchens and bathrooms. And the mom said, she almost said it with exhaustion— I said, “What is it?” She says, “Does this do as good of a job as you could do?” And I think most people from iRobot would admit, “No. Can it match the elbow grease, all the effort and everything I can put into this?” And she says, “But at least I can set this up, hit a button, and I can go to sleep. And at least it’s getting the job done. It’s doing something, and it gives me my time back.” And when you hear that, people go, “Well, Roomba is just something that cleans for people or whatever.” Like, “No. Roomba gives people their time back.” And once you’re on that channel, then you start thinking about, “Okay, what can we do more with the product that does that, that’s hitting that sort of core thing?” So yeah, I think it’s having the humbleness to not build the product you want but build it to the need, and then also the humbleness about where you can meet that need and where you can’t. Because robotics is hard, and we can’t make Rosey yet and things like that, so.
Ackerman: Mike, I’m curious, did you have to make compromises like that? Is there an example you could give with Labrador?
Dooley: Oh, jeez, all the— yeah. I mean, no, Labrador is perfect. No, I mean, we go through that all the time. I think on Labrador, no, we can’t do everything people want. What you’re trying to say is— I think there are different languages of minimum viable product or good enough. There was somebody at Amazon who used the term— I’m going to blank on it. It was like wonderful enough or something, or they have a nicer—
Jones: Lovable?
Dooley: Lovable. Yeah, lovable enough or something. And I think that’s what you have to remember. So on one hand, you have to sort of have this open heart that you want to help people. And on the other hand, you have to have a really tight wallet, because you just can’t spend enough to meet everything that people want. And so just a classic example is, Labrador goes up and down a certain amount of height. And with people’s cabinets, someone in a wheelchair would love it if we would go up to the upper cabinets above the kitchen sink or other locations. And when you look at that, mechanically we can, but that then creates-- there’s product realities about stability and tilt testing. And so we have to fit those. Chris knows that well with Ava, for instance: how heavy the base has to be for every inch you raise the mass above a certain amount. And so we have to make a limit. You have to say, “Hey, here’s the envelope. We’re going to do this to this, or we’re going to carry this much, because that’s as much as we could deliver with this sort of function.” And then, is that lovable enough? Is that rewarding enough to people? And I think that’s the hard [inaudible], is that you have to do these deliveries within constraints. And I think sometimes when I’m talking to folks, they’re either outside robotics or they’re very much on the engineering side and not thinking about the product. They tend to think that you have to do everything. And that’s not how product development works. You have to do just the critical first step, because that makes this a category, and then you can do the next one and the next one. I think it brings to mind— Roomba has gone through an incredible evolution of what its functions were and how it worked and its performance, from the very first version to what Chris and team offer now. But if they had tried to do today’s version back then, they wouldn’t have been able to achieve it. And others fail because they probably went at it from the wrong angle. And yeah.
Jones: Evan, I think you asked if there was anything that involved operating under constraints. I think product development in general, I presume, but certainly robotics, is all about constraints. It’s how do you operate within those? How do you understand where those boundaries are, and how do you make those calls as to— how are you going to decide to constrain your solution, right, to make sure that it’s something that’s feasible for you to do, right? It’s meeting a compelling need. It’s feasible for you to do. You can robustly deliver it. Trying to get that entire equation to work means you do have to reckon with those constraints kind of across the board to find the right solve. Mike, I’m curious. You do your user research, you have that customer empathy, you’ve perhaps worked through some of these surprising challenges that I’m sure you’ve encountered along the way with Labrador. You ultimately get to a point where you’re able to do pilots in homes, right? You’re actually now this— maybe the duct tape is gone or it’s at least hidden, right? It’s something that looks and feels more like a product, and you’re actually getting into some type of more extended pilot of the product, or the idea of the product, in users’ homes. What are the types of things you’re looking to accomplish with those pilots? Or what have you learned when you go from, “All right, I’ve been watching this user in their home with those challenges. So now I’m actually leaving something in their home without me being there and expecting them to be able to use it”? What’s the benefit or the learnings that you encounter in conducting that type of work?
Dooley: Yeah, it’s a weird type of experiment and there’s different schools of thought of how you do stuff. Some people want to go in and research everything to death and be a fly on the wall. And we went through this— I won’t say the source of it. A program we had to go through because of some of the— because of some of the funding that we’re getting from another project. And the quote in the beginning, they put up a slide that I think it’s from Steve Jobs. I’m sure I’m going to butcher it, that people don’t know what they want until I show them or something. I forget what the exact words are. And they were saying, “Yeah, that’s true for Steve Jobs, but for you, you can really talk to the customer and they’re going to tell you what they need.” I don’t believe that.
Jones: They need a faster horse, right? They don’t need a car.
Dooley: Yeah, exactly.
Jones: They’re going to tell you they need a faster horse.
Dooley: Yeah, so I’m in the Steve Jobs camp and on that. And it’s not because people aren’t intelligent. It’s just that they’re not in that world of knowing what possibilities you’re talking about. So I think there is this sort of soft skill between, okay, listen to their pain point. What is that difficulty of it? You’ve got a hypothesis to say, “Okay, out of everything you said, I think there’s an overlap here. And now I want to find out—” and we did that. We did that in the beginning. We did different ways of explaining the concept, and then the first level we did was just explain it over the phone and see what people thought of it and almost test it neutrally. Say, “Hey, here’s an idea.” And then, “Oh, here’s an idea like Roomba and here’s an idea like Alexa. What do you like or dislike?” Then we would actually build a prototype that was remote-controlled and brought it in their home, and now we finally do the leave-behind. And the whole thing is it’s like how to say it. It’s like you’re sort of releasing it to the world and we get out of the way. The next part is that it’s like letting a kid go and play soccer on their own and you’re not yelling or anything or don’t even watch. You just sort of let it happen. And what you’re trying to do is organically look at how are people— you’ve created this new reality. How are people interacting with it? And what we can see is the robots, they won’t do this in the future, but right now they talk on Slack. So when they send it to the kitchen, I can look up and I can see, “Hey, user one just sent it to the kitchen, and now they’re sending it to their armchair, and they’re probably having an afternoon snack. Oh, they sent it to the laundry room. Now they sent it over to the closet. They’re doing the laundry.” And the thing for us was just watching how fast were people adopting certain things, and then what were they using it for. And the striking thing that was—
Jones: That’s interesting.
Dooley: Yeah, go ahead.
Jones: I was just going to say, I mean, that’s interesting because I think I’m sure it’s very natural to put the product in someone’s home and kind of have a rigid expectation of, “No, no, this is how you use it. No, no, you’re doing it wrong. Let me show you how you use this.” But what you’re saying is it’s almost, yeah, you’re trying your best to solve their need here, but at some point you kind of leave it there, and now you’re also back into that empathy mode. It’s like, “Now with this tool, how do you use it?” and see kind of what happens.
Dooley: I think you said it in a really good way, is that you’ve changed this variable in the experiment. You’ve introduced this, and now you go back to just observing, just hearing what they’re— just watching what they’re doing with it, being as unintrusive as possible, which is like, “We’re not there anymore.” Yeah, the robot’s logging it and we can see it, but it’s just on them. And we’re trying to stay out of the process and see how they engage with it. And that’s sort of like the thing that— we’ve shared it before, but we were just seeing that people were using it 90 to 100 times a month, especially after the first month. It was like, we were looking at just the steady state. Would this become a habit or routine, and then what were they using it for?
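For readers curious about the mechanics behind this kind of pilot telemetry, here is a minimal sketch of how a robot could post navigation events to a Slack channel through a standard incoming webhook. It is purely illustrative, not Labrador’s actual implementation; the webhook URL, user IDs, and destination names are hypothetical placeholders.

    import json
    import urllib.request
    from datetime import datetime, timezone

    # Hypothetical placeholder; a real deployment would use its own webhook URL.
    SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/EXAMPLE/WEBHOOK/URL"

    def post_navigation_event(user_id: str, destination: str) -> None:
        # Build a one-line message like the ones described above,
        # e.g. "user-1 sent the robot to 'kitchen'".
        payload = {
            "text": f"{datetime.now(timezone.utc).isoformat()} "
                    f"{user_id} sent the robot to '{destination}'"
        }
        request = urllib.request.Request(
            SLACK_WEBHOOK_URL,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            response.read()  # Slack's webhook endpoint replies with "ok" on success

    # Example: log that pilot user one sent the robot to the kitchen.
    # post_navigation_event("user-1", "kitchen")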
Jones: So you’re saying when you see that, you have kind of a data point of one or a small number, but you have such a tangible understanding of the impact that this seems to be having that, as an entrepreneur, right, it gives you a lot of confidence that may not be visible to people outside the walls just trying to look at what you’re doing in the business. They see one data point, which is harder to grapple with. But you, being that close and understanding that connection between what the product is doing and the needs, that gives you or the team a substantial confidence boost, right? “This is working. We need to scale it. We have to show that this ports to other people in their homes, etc.” It gives you that confidence.
Dooley: Yeah, and then when we take the robots away, because we only have so many and we rotate them, getting the guilt trip emojis two months later from people, “I miss my robot. When are you going to build a new one?” and all that and stuff. So—
Jones: Do people name the robots?
Dooley: Yeah. They immediately do that and come up with creative names for it. One was called Rosey, naturally, but others were like— I’m forgetting the name she called it. It was inspired by a science fiction story about an artificial AI companion. And it was just quite a bit of different angles of— because she saw this as her assistant. She saw this as sort of this thing. But yeah, so I think that, again, for a robot, the classic thing you see in the design at CES is to make a robot with a face and arms that doesn’t really do anything with those, but it pretends to be humanoid or human-like. And so we went the entire other route with this. And the fact that people then still relate to it that way means that-- we’re not trying to be cold or dispassionate. We’re just really interested in, can they get that value? Are they reacting to what the robot is doing, not to the sort of halo that you dressed it up with?
Jones: Yeah, I mean, as you know, like with Roomba or Braava and things like that, it’s the same thing. People project anthropomorphism or project that personality onto them, but that’s not really there, right, in a strong way. So yeah.
Dooley: Yeah, no, and it’s weird. And it’s something they do with robots in a weird way that they don’t-- people don’t name their dishwasher usually or something. But no, I would have-
Dooley: Yeah, [inaudible]. I did for a while. The stove got jealous, and then we had this whole thing when the refrigerator got into it.
Ackerman: I’ve heard anecdotally that maybe this was true with PackBots. I don’t know if it’s true with Roombas. That people want their robot back. They don’t want you to replace their old robot with a new robot. They want you to fix the old robot and have that same physical robot. It’s that lovely connection.
Jones: Yeah, certainly, PackBot on kind of the military robot side for bomb disposal and things like that, you would directly get those technicians who had a damaged robot, who they didn’t want a new robot. They wanted this one fixed, right? Because again, they anthropomorphize or there is some type of a bond there. And I think that’s been true with all of the robots, right? It’s something about the mobility, right, that embodies them with some type of a-- people project a personality on it. So they don’t have to be fancy and have arms and faces necessarily for people to project that on them. So that seems to be a common trait for any autonomously mobile platform.
Ackerman: Yeah. Mike, it was interesting to hear you say that. You’re being very thoughtful about that, and so I’m wondering if Chris, you can address that a little bit too. I don’t know if they do this anymore, but for a while, robots would speak to you, and I think it was a female voice that they had if they had an issue or something or needed to be cleaned. And that I always found to be an interesting choice because it’s sort of like the company is now giving this robot a human characteristic that’s very explicit. And I’m wondering how much thought went into that, and has that changed over the years about how much you’re willing to encourage people to anthropomorphize?
Jones: I mean, it’s a good question. I mean, that’s evolved, I would say, over the years, from not so much to there being more of a vocalization coming from the robot for certain scenarios. It is an important part. For some users, that is a primary way of interacting. I would say more of that type of feedback these days comes through the mobile experience, through the app, to give the feedback, additional information, actionable next steps. If you need to empty the dustbin or whatever it is, that’s just a richer place to put that and a more accepted or common way for that to happen. So I don’t know, I would say that’s the direction things have trended, but that’s not because we’re trying to avoid humanizing the robot itself. It’s just a more practical place where people these days will expect it. It’s almost like Mike was saying about the dishwasher and the stove, etc. If everything is trying to talk to you like that or kind of project its own embodiment into your space, it could be overwhelming. So I think it’s easier to connect people at the right place and the right time with the right information, perhaps, if it’s through the mobile experience.
But it is. That human-robot interaction or that experience design is a nuanced and tricky one. I’m certainly not an expert there myself, but it’s hard to find that right balance, that right mix of, what do you ask or expect of the user versus what do you assume or don’t give them an option? Because you also don’t want to overload them with too much information or too many options or too many questions, right, as you try to operate the product. So sometimes you do have to make assumptions, make defaults, right, that maybe can be changed if there’s really a need to that might require more digging. And Mike, I was curious. That was a question I had for you, was you have a physically, a meaningfully-sized product that’s operating autonomously in someone’s home, right?
Dooley: Yes.
Jones: Roomba can drive around and will navigate, and it’s a little more expected that we might bump into some things as we’re trying to clean and clean up against walls or furniture and all of that. And it’s small enough that that isn’t an issue. How do you design for a product of the size that you’re working on, right? What went into the human-robot interaction side of that to allow people who need to use this in their home, who are not technologists, to take advantage of the great value, right, that you’re trying to deliver for them? But it’s got to be super simple. How did you think about that HRI kind of design?
Dooley: There’s a lot wrapped into that. I think the bus stop is the first part of it. What’s the simplest way that they can command it, in a metaphor? Everybody can relate to armchair or front door, that sort of thing. And so that idea that the robot just goes to these destinations is super simplifying. People get that. It’s almost a nanosecond, how fast they get that metaphor. So that was one part of it. And then you sort of explain the rules of the road of how the robot can go from place to place. It’s got these bus routes, but they’re elastic, and it can go around you if needed. But there’s all these types of interactions. Okay, we figured out what happens when you’re coming down the hall and the robot’s coming down, or let’s say it’s somebody else and they just walk towards each other. And I know in hospitals, the robot’s programmed to go to the side of the corridor. There’s no side in a home. That’s the stuff. So those are things that we still have to iron out, but there’s timeouts and there’s things of—that’s where we’ll be—we’re not doing it yet, but it’d be great to recognize that’s a person, not a closed door or something, and respond to it. So right now, we have to tell the users, “Okay, it’ll spin for a time to make sure you’re there, but then it’ll give up. And if you really wanted to, you could tell it to go back from your app. You could get out of the way if you want, or you could stop it by doing this.”
And so that’ll get refined as we get to the market, but those interactions, yeah, you’re right. You have this big robot that’s coming down. And one of the surprising things was it’s not just people. One of the women in the pilot had a Border Collie, and Border Collies are, by instinct, bred to herd sheep. The robot’s very quiet, but when she would command it, the dog would hear it coming down the hall and would put its paw out to stop it, and that became its game. It started herding the robot. And so it’s really this weird thing, this metaphor you’re getting at.
Jones: Robots are pretty stubborn. The robot probably just sat there for like five minutes, like, “Come on. Who’s going to blink?”
Dooley: Yeah. Yeah. And the AI we’d love to add, we have to catch up with where you guys are at or license some of your vision recognition algorithms because, first, we’re trying to navigate and avoid obstacles. And that’s where all the tech is going into in terms of the design and the tiers of safety that we’re doing. But it’s just like what the user wanted in that case is, if it’s the dog, can you play my voice, say, “Get out” or, “Move,” or whatever, or something, “Go away”? Because she sent me a video of this. It’s like it was happening to her too, is she would send the robot out. The dogs would get all excited, and she’s behind it in her wheelchair. And now the dogs are waiting for her on the other side of the robot, the robot’s wondering what to do, and they’re all in the hall. And so yeah, there’s this sort of complication that gets in there that you have multiple agents going on there.
Ackerman: Maybe one more question from each of you guys. Mike, you want to go first?
Dooley: I’m trying to think. I have one more. And when you have new engineers start—let’s say they haven’t worked on robots before. They might be experienced. They’re coming out of school or they’re from other industries and they’re coming in. What is some key thing that they learn, or what sort of transformation goes on in their mind when they finally get in the zone of what it means to develop robots? And it’s a really broad question, but there’s sort of a rookie thing.
Jones: Yeah. What’s an aha moment that’s common for people new to robotics? And I think this is woven throughout this entire conversation here, which is, at the macro level, robots are actually hard. It’s difficult to put the entire electromechanical software system together. It’s hard to perceive the world. If a robot’s driving around the home on its own, it needs to have a pretty good understanding of what’s around it. Is something there, is something not there? The richer that understanding can be, the more adaptable or personalized the robot can be. But generating that understanding is also hard. Robots have to be built to deal with all of those unanticipated scenarios that they’re going to encounter when they’re let out into the wild. So I think it’s surprising to a lot of people how long that long tail of corner cases ends up being that you have to grapple with. If you ignore one of them, it can end the product, right? It’s a long tail of things. If any one of them rears its head enough for those users, they’ll stop using the product because, “Well, this thing doesn’t work, and this has happened like twice to me now in the year I’ve had it. I’m kind of done with it,” right?
So you really have to grapple with the very long, long tail of corner cases when the technology hits the real world. I think that’s a super surprising one for people who are new to robotics. It’s more than being a hardware consumer product company, a consumer electronics company. You do need to deal with those challenges of perception, of mobility in the home, the chaos of— specifically, we’re talking about more of the home environment, not the more structured environments on the industrial side. And I think that’s a learning curve everyone has to go through, understanding the impact that can have.
Dooley: Yeah. Of the dogs and cats.
Jones: Yeah, I mean, who would have thought cats are going to jump on the thing or Border Collies are going to try to herd it, right? And you don’t learn those things until you get products out there. And that’s, Mike, what I was asking you about with pilots and what you hope to learn from the experience there. You have to take that step if you’re going to start figuring out what those elements are going to look like. It’s very hard to do just intellectually or on paper or in the lab. You have to let them out there. So that’s a learning lesson there. Mike, maybe a similar question for you, but--
Ackerman: This is the last one, so make it a good one.
Jones: Yep. The last one, it better be a good one, huh? It’s a similar question for you, but maybe cut more to address an entrepreneur in the robotics space. I’m curious, for a robot company to succeed, there’s a lot of, I’ll call them, ecosystem partners, right, that have to be there. Manufacturing, channel, or go-to-market partners, funding, right, to support a capital-intensive development process, and many more. I’m curious, what do people need to know going into robotics development or looking to be a robotics entrepreneur? What do people miss? What have you learned? What have you seen? Which partners are the most important? And I’m not asking for, “Oh, iRobot’s an investor. Speak nicely on the financial investor side.” That’s not what I’m after. But what have you learned, that you better not ignore this set of partners, because if one of them falls through or doesn’t work or is ineffective, it’s going to be hard for all the other pieces to come together?
Dooley: Yeah, it’s complex. I think, just like you said, robots are hard. I think when we got acquired by iRobot and we were having some of the first meetings over— it was Mike from software. Halloran.
Ackerman: This was Evolution Robotics?
Dooley: Evolution. Yeah. Mike Halloran from iRobot came to Evolution’s office, and he just said, “Robots are hard. They’re really hard.” And it’s like, that’s the point we knew there was harmony. We were sort of under this thing. And so, for everything Chris is saying, all of that is high stakes. And so you have to be good enough on all those fronts with all those partners. Some of it is critical path technology. Depth cameras, that function is really critical to us, and it’s critical that it works well and then costs and scales. And so just being flexible about how we can deal with that, looking at that sort of chain, and how do we start at one level and scale it through? So you look at, okay, what are these key enabling technologies that have to work? That’s one bucket. Then there are the partnerships on the business side; we’re in a complex ecosystem. I think the other rude awakening when people look at this is like, “Well, yeah, as people get older, they have disabilities. That’s what your insurance funds.” It’s like, “No, it doesn’t.” It doesn’t for a lot of people, unless you have specific types of insurance. We’re partnering with Nationwide. They have long-term care insurance - and that’s why they’re working with us - that pays for these sorts of issues and things. Or Medicaid will get into these issues depending on somebody’s need.
And so I think what we’re trying to understand—this goes back to that original question about customer empathy—is how do we adjust what we’re doing? We have this vision. I want to help people like my mom, where she is now and where she was 10 years ago when she was first experiencing difficulties with mobility. And we have to stage that. We have to get through that progression. And so who are the people we work with now for whom we solve a pain point, something they have control over and that is economically viable for them? And sometimes that means adjusting a bit of what we’re doing, because it’s just this step onto the long path as we do it.
Ackerman: Awesome. Well, thank you both again. This was a great conversation.
Jones: Yeah, thanks for having us and for hosting, Evan and Mike. Great to talk to you.
Dooley: Nice seeing you again, Chris and Evan. Same. Really enjoyed it.
Ackerman: We’ve been talking with Chris Jones from iRobot and Mike Dooley from Labrador Systems about developing robots for the home. And thanks again to our guests for joining us. For ChatBot and IEEE Spectrum, I’m Evan Ackerman.
Episode 1: Making Boston Dynamics’ Robots Dance
Evan Ackerman: I’m Evan Ackerman, and welcome to ChatBot, a robotics podcast from IEEE Spectrum. On this episode of ChatBot, we’ll be talking with Monica Thomas and Amy LaViers about robots and dance. Monica Thomas is a dancer and choreographer. Monica has worked with Boston Dynamics to choreograph some of their robot videos in which Atlas, Spot, and even Handle dance to songs like Do You Love Me? The “Do You Love Me?” video has been viewed 37 million times. And if you haven’t seen it yet, it’s pretty amazing to see how these robots can move. Amy LaViers is the director of the Robotics, Automation, and Dance Lab, or RAD Lab, which she founded in 2015 as a professor in Mechanical Science and Engineering at the University of Illinois, Urbana-Champaign. The RAD Lab is a collective for art making, commercialization, education, outreach, and research at the intersection of dance and robotics. And Amy’s work explores the creative relationships between machines and humans, as expressed through movement. So Monica, can you just tell me-- I think people in the robotics field may not know who you are or why you’re on the podcast at this point, so can you just describe how you initially got involved with Boston Dynamics?
Monica Thomas: Yeah. So I got involved really casually. I know people who work at Boston Dynamics and Marc Raibert, their founder and head. They’d been working on Spot, and they added the arm to Spot. And Marc was kind of like, “I kind of think this could dance.” And they were like, “Do you think this could dance?” And I was like, “It could definitely dance. That definitely could do a lot of dancing.” And so we just started trying to figure out, can it move in a way that feels like dance to people watching it? And the first thing we made was Uptown Spot. And it was really just figuring out moves that the robot does kind of already naturally. And that’s when they started developing, I think, Choreographer, their tool. But in terms of my thinking, it was just that I was watching what the robot did as its normal patterns, like going up, going down, walking this place, different steps, different gaits, what is interesting to me, what looks beautiful to me, what looks funny to me, and then imagining what else we could be doing, considering the angles of the joints. And then it just grew from there. And so once that one was out, Marc was like, “What about the rest of the robots? Could they dance? Maybe we could do a dance with all of the robots.” And I was like, “We could definitely do a dance with all of the robots. Any shape can dance.” So that’s when we started working on what turned into Do You Love Me? I didn’t really realize what a big deal it was until it came out and it went viral. And I was like, “Oh—” are we allowed to swear, or—?
Ackerman: Oh, yeah. Yeah.
Thomas: Yeah. So I was like, “[bleep bleep, bleeeep] is this?” I didn’t know how to deal with it. I didn’t know how to think about it. As a performer, the largest audience I performed for in a day was like 700 people, which is a big audience as a live performer. So when you’re hitting millions, it’s just like it doesn’t even make sense anymore, and yeah. So that was pretty mind-boggling. And then also because of kind of how it was introduced and because there is a whole world of choreo-robotics, which I was not really aware of because I was just doing my thing. Then I realized there’s all of this work that’s been happening that I couldn’t reference, didn’t know about, and conversations that were really important in the field that I also was unaware of and then suddenly was a part of. So I think doing work that has more viewership is really—it was a trip and a half—is a trip and a half. I’m still learning about it. Does that answer your question?
Ackerman: Yeah. Definitely.
Thomas: It’s a long-winded answer, but.
Ackerman: And Amy, so you have been working in these two disciplines for a long time, in the disciplines of robotics and in dance. So what made you decide to combine these two things, and why is that important?
Amy LaViers: Yeah. Well, both things, I guess in some way, have always been present in my life. I’ve danced since I was three, probably, and my dad and all of his brothers and my grandfathers were engineers. So in some sense, they were always there. And it was really-- I could tell you the date. I sometimes forget what it was, but it was a Thursday, and I was taking classes in dancing and in control of mechanical systems, and I was realizing this overlap. I mean, I don’t think I’m combining them. I feel like they already kind of have this intersection that just exists. And I realized-- or I stumbled into that intersection myself, and I found lots of people working in it. And I was-- oh, my interests in both these fields kind of reinforce one another in a way that’s really exciting and interesting. I also happened to be an almost graduating-- I was in the last class of my junior year of college, so I was thinking, “What am I going to do with myself?” Right? So it was very happenstance in that way. And again, I mean, I just felt like— it was like I walked into a room where all of a sudden, a lot of things made sense to me, and a lot of interests of mine were both present.
Ackerman: And can you summarize, I guess, the importance here? Because I feel like— I’m sure this is something you’ve run into, is that it’s easy for engineers or roboticists just to be— I mean, honestly, a little bit dismissive of this idea that it’s important for robots to have this expressivity. So why is it important?
LaViers: That is a great question that if I could summarize what my life is like, it’s me on a computer going like this, trying to figure out the words to answer that succinctly. But one way I might ask it, earlier when we were talking, you mentioned this idea of functional behavior versus expressive behavior, which comes up a lot when we start thinking in this space. And I think one thing that happens-- and my training and background in Laban Movement Analysis really emphasizes this duality between function and expression as opposed to the either/or. It’s kind of like the mind-body split, the idea that these things are one integrated unit. Function and expression are an integrated unit. And something that is functional is really expressive. Something that is expressive is really functional.
Ackerman: It definitely answers the question. And it looks like Monica is resonating with you a little bit, so I’m just going to get out of the way here. Amy, do you want to just start this conversation with Monica?
LaViers: Sure. Sure. Monica has already answered, literally, my first question, so I’m already having to shuffle a little bit. But I’m going to rephrase. My first question was, can robots dance? And I love how emphatically and beautifully you answered that with, “Any shape can dance.” I think that’s so beautiful. That was a great answer, and I think it brings up— you can debate, is this dance, or is this not? But there’s also a way to look at any movement through the lens of dance, and that includes factory robots that nobody ever sees.
Thomas: It’s exciting. I mean, it’s a really nice way to walk through the world, so I actually recommend it for everyone, just like taking a time and seeing the movement around you as dance. I don’t know if it’s allowing it to be intentional or just to be special, meaningful, something.
LaViers: That’s a really big challenge, particularly for an autonomous system. And for any moving system, I think that’s hard, artificial or not. I mean, it’s hard for me. My family’s coming into town this weekend. I’m like, “How do I act so that they know I love them?” Right? That’s a dramatized version of real life, right: how do I be welcoming to my guests? And that’ll be, how do I move?
Thomas: What you’re saying is a reminder of, one of the things that I really enjoy watching robots move is that I’m allowed to project as much as I want to on them without taking away something from them. When you project too much on people, you lose the person, and that’s not really fair. But when you’re projecting on objects, things that are objects but that we personify— or not even personify, that we anthropomorphize or whatever, it is just a projection of us. But it’s acceptable. So nice for it to be acceptable, a place where you get to do that.
LaViers: Well, okay. Then can I ask my fourth question even though it’s not my turn? Because that’s just too perfect to what it is, which is just, what did you learn about yourself working with these robots?
Thomas: Well, I learned how much I love visually watching movement. I’ve always watched, but I don’t think it was as clear to me how much I like movement. The work that I made was really about context. It was about what’s happening in society, what’s happening in me as a person. But I never got into that school of dance that really spends time just really paying attention to movement or letting movement develop or explore, exploring movement. That wasn’t what I was doing. And with robots, I was like, “Oh, but yeah, I get it better now. I see it more now.” So much in life right now, for me, is not contained, and it doesn’t have answers. And translating movement across species from my body to a robot, that does have answers. It has multiple answers. It’s not like there’s a yes and a no, but you can answer a question. And it’s so nice to answer questions sometimes. I sat with this thing, and here’s something I feel like is an acceptable solution. Wow. That’s a rarity in life. So I love that about working with robots. I mean, also, they’re cool, I think. And it is also— they’re just cool. I mean, that’s true too. It’s also interesting. I guess the last thing that I really loved—and I didn’t have much opportunity to do this or as much as you’d expect because of COVID—is being in space with robots. It’s really interesting, just like being in space with anything that is different than your norm is notable. Being in space with an animal that you’re not used to being with is notable. And there’s just something really cool about being with something very different. And for me, robots are very different and not acclimatized.
Ackerman: Okay. Monica, you want to ask a question or two?
Thomas: Yeah. I do. The order of my questions is ruined also. I was thinking about the RAD Lab, and I was wondering if there are guiding principles that you feel are really important in that interdisciplinary work that you’re doing, and also any lessons maybe from the other side that are worth sharing.
LaViers: The usual way I describe it and describe my work more broadly is, I think there are a lot of roboticists that hire dancers, and they make robots and those dancers help them. And there are a lot of dancers that hire engineers, and those engineers build something for them that they use inside of their work. And what I’m interested in, the little litmus test or challenge I paint for myself and my collaborators, is we want to be right in between those two things, right, where we are making something. First of all, we’re treating each other as peers, as technical peers, as artistic peers, as— if the robot moves on stage, I mean, that’s choreography. If the choreographer asks for the robot to move in a certain way, that’s robotics. That’s the inflection point we want to be at. And so that means, for example, in terms of crediting the work, we try to credit the creative contributions. And not just like, “Oh, well, you did 10 percent of the creative contributions.” We really try to treat each other as co-artistic collaborators and co-technical developers. And so artists are on our papers, and engineers are in our programs, to put it that way. And likewise, that changes the questions we want to ask. We want to make something that pushes robotics just an inch further, a millimeter further. And we want to do something that pushes dance just an inch further, a millimeter further. We would love it if people would ask us, “Is this dance?” We get, “Is this robotics?” quite a lot. So that makes me feel like we must be doing something interesting in robotics.
And every now and then, I think we do something interesting for dance too, and certainly, many of my collaborators do. And that inflection point, that’s just where I think it’s interesting. And I think that’s where— that’s the room I stumbled into, is where we’re asking those questions as opposed to just developing a robot and hiring someone to help us do that. I mean, it can be hard in that environment; people feel like their expertise is being given to the other side. And then, where am I an expert? And we’ve heard editors at publication venues say, “Well, this dancer can’t be a co-author,” and we’ve had venues where we’re working on the program and people say, “Well, no, this engineer isn’t a performer,” but I’m like, “But he’s cueing the robot, and if he messes up, then we all mess up.” I mean, that’s vulnerability too. So we have those conversations that are really touchy and a little sensitive and a little— and so how do you create that space where people feel safe and comfortable and valued and attributed for their work, and where they can make a track record and do this again in another project, in another context and— so, I don’t know, if I’ve learned anything, I mean, I’ve learned that you just have to really talk about attribution all the time. I bring it up every time, and then I bring it up before we even think about writing a paper. And then I bring it up when we make the draft. And the first thing I put in the draft is everybody’s name in the order it’s going to appear, with the affiliations and with the—subscripts on that don’t get added at the last minute. And when the editor of a very famous robotics venue says, “This person can’t be a co-author,” that person doesn’t get taken off as a co-author; that person is a co-author, and we figure out another way to make it work. And so I think that’s learning, or that’s just a struggle anyway.
Ackerman: Monica, I’m curious if when you saw the Boston Dynamics videos go viral, did you feel like there was much more of a focus on the robots and the mechanical capabilities than there was on the choreography and the dance? And if so, how did that make you feel?
Thomas: Yeah. So yes. Right. When dances I’ve made have been reviewed, which I’ve always really appreciated, it has been about the dance. It’s been about the choreography. And actually, kind of going way back to what we were talking about a couple things ago, a lot of the reviews that you get around this are about people, their reactions, right? Because, again, we can project so much onto robots. So I learned a lot about people, how people think about robots. There’s a lot of really overt themes, and then there’s individual nuance. But yeah, it wasn’t really about the dance, and it was in the middle of the pandemic too. So there’s really high isolation. I had no idea how people who cared about dance thought about it for a long time. And then every once in a while, I get one person here or one person there say something. So it’s a totally weird experience. Yes.
The way that I took information about the dance was kind of paying attention to the affective experience, the emotional experience that people had watching this. The dance was— nothing in that dance was— we use the structures of the traditions of dance in it for intentional reason. I chose that because I wasn’t trying to alarm people or show people ways that robots move that totally hit some old part of our brain that makes us absolutely panicked. That wasn’t my interest or the goal of that work. And honestly, at some point, it’d be really interesting to explore what the robots can just do versus what I, as a human, feel comfortable seeing them do. But the emotional response that people got told me a story about what the dance was doing in a backward-- also, what the music’s doing because—let’s be real—that music does— right? We stacked the deck.
LaViers: Yeah. And now that brings— I feel like that serves up two of my questions, and I might let you pick which one maybe we go to. I mean, one of my questions, I wrote down some of my favorite moments from the choreography that I thought we could discuss. Another question—and maybe we can do both of these in serie—is a little bit about— I’ll blush even just saying it, and I’m so glad that the people can’t see the blushing. But also, there’s been so much nodding, and I’m noticing that that won’t be in the audio recording. We’re nodding along to each other so much. But the other side—and you can just nod in a way that gives me your—the other question that comes up for that is, yeah, what is the monetary piece of this, and where are the power dynamics inside this? And how do you feel about how that sits now as that video continues to just make its rounds on the internet and establish value for Boston Dynamics?
Thomas: I would love to start with the first question. And the second one is super important, and maybe another day for that one.
Ackerman: Okay. That’s fair. That’s fair.
LaViers: Yep. I like that. I like that. So the first question, so my favorite moments of the piece that you choreographed to Do You Love Me? For the Boston Dynamics robots, the swinging arms at the beginning, where you don’t fully know where this is going. It looks so casual and so, dare I say it, natural, although it’s completely artificial, right? And the proximal rotation of the legs, I feel like it’s a genius way of getting around no spine. But you really make use of things that look like hip joints or shoulder joints as a way of, to me, accessing a good wriggle or a good juicy moment, and then the Spot space hold, I call it, where the head of the Spot is holding in place and then the robot wiggles around that, dances around that. And then the moment when you see all four complete—these distinct bodies, and it looks like they’re dancing together. And we touched on that earlier—any shape can dance—but making them all dance together I thought was really brilliant and effective in the work. So it’s one of those moments, super interesting, or you have a funny story about, I thought we could talk about it further.
Thomas: I have a funny story about the hip joints. So the initial— well, not the initial, but when they do the mashed potato, that was the first dance move that we started working on, on Atlas. And for folks who don’t know, the mashed potato is kind of the feet are going in and out; the knees are going in and out. So we ran into a couple of problems, which—and the twist. I guess it’s a combo. Both of them like you to roll your feet on the ground like rub, and that friction was not good for the robots. So when we first started really moving into the twist, which has this torso twisting— the legs are twisting. The foot should be twisting on the floor. The foot is not twisting on the floor, and the legs were so turned out that the shape of the pelvic region looked like a over-full diaper. So, I mean, it was wiggling, but it made the robot look young. It made the robot look like it was in a diaper that needed to be changed. It did not look like a twist that anybody would want to do near anybody else. And it was really amazing how— I mean, it was just hilarious to see it. And the engineers come in. They’re really seeing the movement and trying to figure out what they need for the movement. And I was like, “Well, it looks like it has a very full diaper.” And they were like, “Oh.” They knew it didn’t quite look right, but it was like—because I think they really don’t project as much as I do, I’m very projective that’s one of the ways that I’ve watched work, or you’re pulling from the work that way, but that’s not what they were looking at. And so yeah, then you change the angles of the legs, how turned in it is and whatever, and it resolved to a degree, I think, fairly successfully. It doesn’t really look like a diaper anymore. But that wasn’t really— and also to get that move right took us over a month.
Ackerman: Wow.
LaViers: Wow.
Thomas: We got much faster after that because it was the first, and we really learned. But it took a month of programming, me coming in, naming specific ways of reshifting it before we got a twist that felt natural if amended because it’s not the same way that--
LaViers: Yeah. Well, and it’s fascinating to think about how to get it to look the same. You had to change the way it did the movement, is what I heard you describing there, and I think that’s so fascinating, right? And just how distinct the morphologies between our body and any of these bodies, even the very facile human-ish looking Atlas, that there’s still a lot of really nuanced and fine-grained and human work-intensive labor to go into getting that to look the same as what we all think of as the twist or the mashed potato.
Thomas: Right. Right. And it does need to be something that we can project those dances onto, or it doesn’t work, in terms of this dance. It could work in another one. Yeah.
LaViers: Right. And you brought that up earlier, too, of trying to work inside of some established forms of dance as opposed to making us all terrified by the strange movement that can happen, which I think is interesting. And I hope one day you get to do that dance too.
Thomas: Yeah. No, I totally want to do that dance too.
Ackerman: Monica, do you have one last question you want to ask?
Thomas: I do. And this is— yeah. I want to ask you, kind of what does embodied or body-based intelligence offer in robotic engineering? So I feel like, you, more than anyone, can speak to that because I don’t do that side.
LaViers: Well, I mean, I think it can bring a couple of things. One, it can bring— I mean, the first moment in my career or life that that calls up for me is, I was watching one of my lab mates, when I was a doctoral student, give a talk about a quadruped robot that he was working on, and he was describing the crawling strategy like the gate. And someone said— and I think it was roughly like, “Move the center of gravity inside the polygon of support, and then pick up— the polygon of support formed by three of the legs. And then pick up the fourth leg and move it. Establish a new polygon of support. Move the center of mass into that polygon of support.” And it’s described with these figures. Maybe there’s a center of gravity. It’s like a circle that’s like a checkerboard, and there’s a triangle, and there’s these legs. And someone stands up and is like, “That makes no sense like that. Why would you do that?” And I’m like, “Oh, oh, I know, oh, because that’s one of the ways you can crawl.” I actually didn’t get down on the floor and do it because I was not so outlandish at that point.
But today, in the RAD lab, that would be, “Everyone on all fours, try this strategy out.” Does it feel like a good idea? Are there other ideas that we would use to do this pattern that might be worth exploring here as well? And so truly rolling around on the floor and moving your body and pretending to be a quadruped, which— in my dance classes, it’s a very common thing to practice crawling because we all forget how to crawl. We want to crawl with the cross-lateral pattern and the homo-lateral pattern, and we want to keep our butts down-- or keep the butts up, but we want to have that optionality so that we look like we’re facile, natural crawlers. We train that, right? And so for a quadruped robot talk and discussion, I think there’s a very literal way that an embodied exploration of the idea is a completely legitimate way to do research.
Ackerman: Yeah. I mean, Monica, this is what you were saying, too, as you were working with these engineers. Sometimes it sounded like they could tell that something wasn’t quite right, but they didn’t know how to describe it, and they didn’t know how to fix it because they didn’t have that language and experience that both of you have.
Thomas: Yeah. Yeah, exactly that.
Ackerman: Okay. Well, I just want to ask you each one more really quick question before we end here, which is that, what is your favorite fictional robot and why? I hope this isn’t too difficult, especially since you both work with real robots, but. Amy, you want to go first?
LaViers: I mean, I’m going to feel like a party pooper. I don’t like any robots, real or fictional. The fictional ones annoy me because-- the fictional ones annoy me because of the disambiguation issue and WALL-E and Eva are so cute. And I do love cute things, but are those machines, or are those characters? And are we losing sight of that? I mean, my favorite robot to watch move, this one-- I mean, I love the Keepon dancing to Spoon. That is something that if you’re having an off day, you google Keepon dancing to Spoon— Keepon is one word, K-E-E-P-O-N, dancing to Spoon, and you just bop. It’s just a bop. I love it. It’s so simple and so pure and so right.
Ackerman: It’s one of my favorite robots of all time, Monica. I don’t know if you’ve seen this, but it’s two little yellow balls like this, and it just goes up and down and rocks back and forth. But it does it so to music. It just does it so well. It’s amazing.
Thomas: I will definitely be watching that [crosstalk].
Ackerman: Yeah. And I should have expanded the question, and now I will expand it because Monica hasn’t answered yet. Favorite robot, real or fictional?
Thomas: So I don’t know if it’s my favorite. This one breaks my heart, and I’m currently having an empathy overdrive issue as a general problem. But there’s a robot installation - and I should know its name, but I don’t— where the robot reaches out, and it grabs the oil that they’ve created it to leak and pulls it towards its body. And it’s been doing this for several years now, but it’s really slowing down now. And I don’t think it even needs the oil. I don’t think it’s a robot that uses oil. It just thinks that it needs to keep it close. And it used to happy dance, and the oil has gotten so dark and the red rust color of, oh, this is so morbid of blood, but it just breaks my heart. So I think I love that robot and also want to save it in the really unhealthy way that we sometimes identify with things that we shouldn’t be thinking about that much.
Ackerman: And you both gave amazing answers to that question.
LaViers: And the piece is Sun Yuan and Peng Yu’s Can’t Help Myself.
Ackerman: That’s right. Yeah.
LaViers: And it is so beautiful. I couldn’t remember the artist’s name either, but—you’re right—it’s so beautiful.
Thomas: It’s beautiful. The movement is beautiful. It’s beautifully considered as an art piece, and the robot is gorgeous and heartbreaking.
Ackerman: Yeah. Those answers were so unexpected, and I love that. So thank you both, and thank you for being on this podcast. This was an amazing conversation. We didn’t have nearly enough time, so we’re going to have to come back to so much.
LaViers: Thank you for having me.
Thomas: Thank you so much for inviting me. [music]
Ackerman: We’ve been talking with Monica Thomas and Amy LaViers about robots and dance. And thanks again to our guests for joining us for ChatBot and IEEE Spectrum. I’m Evan Ackerman.
Chatbot Episode 1: Making Boston Dynamics’ Robots Dance
Evan Ackerman: I’m Evan Ackerman, and welcome to ChatBot, a robotics podcast from IEEE Spectrum. On this episode of ChatBot, we’ll be talking with Monica Thomas and Amy LaViers about robots and dance. Monica Thomas is a dancer and choreographer. Monica has worked with Boston Dynamics to choreograph some of their robot videos in which Atlas, Spot, and even Handle dance to songs like Do You Love Me? The “Do You Love Me?” video has been viewed 37 million times. And if you haven’t seen it yet, it’s pretty amazing to see how these robots can move. Amy LaViers is the director of the Robotics, Automation, and Dance Lab, or RAD Lab, which she founded in 2015 as a professor in Mechanical Science and Engineering at the University of Illinois, Urbana-Champaign. The RAD Lab is a collective for art making, commercialization, education, outreach, and research at the intersection of dance and robotics. And Amy’s work explores the creative relationships between machines and humans, as expressed through movement. So Monica, can you just tell me-- I think people in the robotics field may not know who you are or why you’re on the podcast at this point, so can you just describe how you initially got involved with Boston Dynamics?
Monica Thomas: Yeah. So I got involved really casually. I know people who work at Boston Dynamics and Marc Raibert, their founder and head. They’d been working on Spot, and they added the arm to Spot. And Marc was kind of like, “I kind of think this could dance.” And they were like, “Do you think this could dance?” And I was like, “It could definitely dance. That definitely could do a lot of dancing.” And so we just started trying to figure out, can it move in a way that feels like dance to people watching it? And the first thing we made was Uptown Spot. And it was really just figuring out moves that the robot does kind of already naturally. And that’s when they started developing, I think, Choreographer, their tool. But in terms of my thinking, it was just I was watching what the robot did as its normal patterns, like going up, going down, walking this place, different steps, different gaits, what is interesting to me, what looks beautiful to me, what looks funny to me, and then imagining what else we could be doing, considering the angles of the joints. And then it just grew from there. And so once that one was out, Marc was like, “What about the rest of the robots? Could they dance? Maybe we could do a dance with all of the robots.” And I was like, “We could definitely do a dance with all of the robots. Any shape can dance.” So that’s when we started working on what turned into Do You Love Me? I didn’t really realize what a big deal it was until it came out and it went viral. And I was like, “Oh—” are we allowed to swear, or—?
Ackerman: Oh, yeah. Yeah.
Thomas: Yeah. So I was like, “[bleep bleep, bleeeep] is this?” I didn’t know how to deal with it. I didn’t know how to think about it. As a performer, the largest audience I performed for in a day was like 700 people, which is a big audience as a live performer. So when you’re hitting millions, it’s just like it doesn’t even make sense anymore, and yeah. So that was pretty mind-boggling. And then also because of kind of how it was introduced and because there is a whole world of choreo-robotics, which I was not really aware of because I was just doing my thing. Then I realized there’s all of this work that’s been happening that I couldn’t reference, didn’t know about, and conversations that were really important in the field that I also was unaware of and then suddenly was a part of. So I think doing work that has more viewership is really—it was a trip and a half—is a trip and a half. I’m still learning about it. Does that answer your question?
Ackerman: Yeah. Definitely.
Thomas: It’s a long-winded answer, but.
Ackerman: And Amy, so you have been working in these two disciplines for a long time, in the disciplines of robotics and in dance. So what made you decide to combine these two things, and why is that important?
Amy LaViers: Yeah. Well, both things, I guess in some way, have always been present in my life. I’ve danced since I was three, probably, and my dad and all of his brothers and my grandfathers were engineers. So in some sense, they were always there. And it was really-- I could tell you the date. I sometimes forget what it was, but it was a Thursday, and I was taking classes in dance and in control of mechanical systems, and I was realizing this overlap. I mean, I don’t think I’m combining them. I feel like they already kind of have this intersection that just exists. And I realized-- or I stumbled into that intersection myself, and I found lots of people working in it. And I was-- oh, my interests in both these fields kind of reinforce one another in a way that’s really exciting and interesting. I also happened to be an almost graduating-- I was in the last class of my junior year of college, so I was thinking, “What am I going to do with myself?” Right? So it was very happenstance in that way. And again, I mean, I just felt like— it was like I walked into a room where all of a sudden, a lot of things made sense to me, and a lot of interests of mine were both present.
Ackerman: And can you summarize, I guess, the importance here? Because I feel like— I’m sure this is something you’ve run into, is that it’s easy for engineers or roboticists just to be— I mean, honestly, a little bit dismissive of this idea that it’s important for robots to have this expressivity. So why is it important?
LaViers: That is a great question—if I could summarize what my life is like, it’s me on a computer going like this, trying to figure out the words to answer that succinctly. But one way I might answer it: earlier when we were talking, you mentioned this idea of functional behavior versus expressive behavior, which comes up a lot when we start thinking in this space. And I think one thing that happens-- and my training and background in Laban Movement Analysis really emphasizes this duality between function and expression as opposed to the either/or. It’s kind of like the mind-body split: the idea is that these things are one integrated unit. Function and expression are an integrated unit. And something that is functional is really expressive. Something that is expressive is really functional.
Ackerman: It definitely answers the question. And it looks like Monica is resonating with you a little bit, so I’m just going to get out of the way here. Amy, do you want to just start this conversation with Monica?
LaViers: Sure. Sure. Monica has already answered, literally, my first question, so I’m already having to shuffle a little bit. But I’m going to rephrase. My first question was, can robots dance? And I love how emphatically and beautifully you answered that with, “Any shape can dance.” I think that’s so beautiful. That was a great answer, and I think it brings up— you can debate, is this dance, or is this not? But there’s also a way to look at any movement through the lens of dance, and that includes factory robots that nobody ever sees.
Thomas: It’s exciting. I mean, it’s a really nice way to walk through the world, so I actually recommend it for everyone, just like taking a time and seeing the movement around you as dance. I don’t know if it’s allowing it to be intentional or just to be special, meaningful, something.
LaViers: That’s a really big challenge, particularly for an autonomous system. And for any moving system, I think that’s hard, artificial or not. I mean, it’s hard for me. My family’s coming into town this weekend. I’m like, “How do I act so that they know I love them?” Right? That’s a dramatized version of real life, right: how do I be welcoming to my guests? And that’ll be, how do I move?
Thomas: What you’re saying is a reminder that one of the things I really enjoy about watching robots move is that I’m allowed to project as much as I want onto them without taking away something from them. When you project too much on people, you lose the person, and that’s not really fair. But when you’re projecting on objects, things that are objects but that we personify— or not even personify, that we anthropomorphize or whatever, it is just a projection of us. But it’s acceptable. It’s so nice for it to be acceptable, a place where you get to do that.
LaViers: Well, okay. Then can I ask my fourth question even though it’s not my turn? Because that’s just too perfect to what it is, which is just, what did you learn about yourself working with these robots?
Thomas: Well, I learned how much I love visually watching movement. I’ve always watched, but I don’t think it was as clear to me how much I like movement. The work that I made was really about context. It was about what’s happening in society, what’s happening in me as a person. But I never got into that school of dance that really spends time just really paying attention to movement or letting movement develop or explore, exploring movement. That wasn’t what I was doing. And with robots, I was like, “Oh, but yeah, I get it better now. I see it more now.” So much in life right now, for me, is not contained, and it doesn’t have answers. And translating movement across species from my body to a robot, that does have answers. It has multiple answers. It’s not like there’s a yes and a no, but you can answer a question. And it’s so nice to answer questions sometimes. I sat with this thing, and here’s something I feel like is an acceptable solution. Wow. That’s a rarity in life. So I love that about working with robots. I mean, also, they’re cool, I think. And it is also— they’re just cool. I mean, that’s true too. It’s also interesting. I guess the last thing that I really loved—and I didn’t have much opportunity to do this or as much as you’d expect because of COVID—is being in space with robots. It’s really interesting, just like being in space with anything that is different than your norm is notable. Being in space with an animal that you’re not used to being with is notable. And there’s just something really cool about being with something very different. And for me, robots are very different and not acclimatized.
Ackerman: Okay. Monica, you want to ask a question or two?
Thomas: Yeah. I do. The order of my questions is ruined also. I was thinking about the RAD Lab, and I was wondering if there are guiding principles that you feel are really important in that interdisciplinary work that you’re doing, and also any lessons maybe from the other side that are worth sharing.
LaViers: The usual way I describe it and describe my work more broadly is, I think there are a lot of roboticists that hire dancers, and they make robots and those dancers help them. And there are a lot of dancers that hire engineers, and those engineers build something for them that they use inside of their work. And what I’m interested in, the little litmus test or challenge I paint for myself and my collaborators, is we want to be right in between those two things, right, where we are making something. First of all, we’re treating each other as peers, as technical peers, as artistic peers, as— if the robot moves on stage, I mean, that’s choreography. If the choreographer asks for the robot to move in a certain way, that’s robotics. That’s the inflection point we want to be at. And so that means, for example, in terms of crediting the work, we try to credit the creative contributions. And not just like, “Oh, well, you did 10 percent of the creative contributions.” We really try to treat each other as co-artistic collaborators and co-technical developers. And so artists are on our papers, and engineers are in our programs, to put it in that way. And likewise, that changes the questions we want to ask. We want to make something that pushes robotics just an inch further, a millimeter further. And we want to do something that pushes dance just an inch further, a millimeter further. We would love it if people would ask us, “Is this dance?” We get, “Is this robotics?” quite a lot. So that makes me feel like we must be doing something interesting in robotics.
And every now and then, I think we do something interesting for dance too, and certainly, many of my collaborators do. And that inflection point, that’s just what I think is interesting. And I think that’s where— that’s the room I stumbled into, is where we’re asking those questions as opposed to just developing a robot and hiring someone to help us do that. I mean, it can be hard in that environment that people feel like their expertise is being given to the other side. And then, where am I an expert? And we’ve heard editors at publication venues say, “Well, this dancer can’t be a co-author,” and we’ve had venues where we’re working on the program and people say, “Well, no, this engineer isn’t a performer,” but I’m like, “But he’s cueing the robot, and if he messes up, then we all mess up.” I mean, that’s vulnerability too. So we have those conversations that are really touchy and a little sensitive and a little— and so how do you create that space where people feel safe and comfortable and valued and attributed for their work and that they can make a track record and do this again in another project, in another context and— so, I don’t know, if I’ve learned anything, I mean, I’ve learned that you just have to really talk about attribution all the time. I bring it up every time, and then I bring it up before we even think about writing a paper. And then I bring it up when we make the draft. And the first thing I put in the draft is everybody’s name in the order it’s going to appear, with the affiliations and with the subscripts—those don’t get added at the last minute. And when the editor of a very famous robotics venue says, “This person can’t be a co-author,” that person doesn’t get taken off as a co-author; that person is a co-author, and we figure out another way to make it work. And so I think that’s learning, or that’s just a struggle anyway.
Ackerman: Monica, I’m curious if when you saw the Boston Dynamics videos go viral, did you feel like there was much more of a focus on the robots and the mechanical capabilities than there was on the choreography and the dance? And if so, how did that make you feel?
Thomas: Yeah. So yes. Right. When dances I’ve made have been reviewed, which I’ve always really appreciated, it has been about the dance. It’s been about the choreography. And actually, kind of going way back to what we were talking about a couple things ago, a lot of the reviews that you get around this are about people, their reactions, right? Because, again, we can project so much onto robots. So I learned a lot about people, how people think about robots. There’s a lot of really overt themes, and then there’s individual nuance. But yeah, it wasn’t really about the dance, and it was in the middle of the pandemic too. So there’s really high isolation. I had no idea how people who cared about dance thought about it for a long time. And then every once in a while, I get one person here or one person there say something. So it’s a totally weird experience. Yes.
The way that I took information about the dance was kind of paying attention to the affective experience, the emotional experience that people had watching this. The dance was— nothing in that dance was— we used the structures of the traditions of dance in it for an intentional reason. I chose that because I wasn’t trying to alarm people or show people ways that robots move that totally hit some old part of our brain that makes us absolutely panicked. That wasn’t my interest or the goal of that work. And honestly, at some point, it’d be really interesting to explore what the robots can just do versus what I, as a human, feel comfortable seeing them do. But the emotional response that people got told me a story about what the dance was doing in a backward way-- also, what the music’s doing because—let’s be real—that music does— right? We stacked the deck.
LaViers: Yeah. And now that brings— I feel like that serves up two of my questions, and I might let you pick which one maybe we go to. I mean, one of my questions, I wrote down some of my favorite moments from the choreography that I thought we could discuss. Another question—and maybe we can do both of these in series—is a little bit about— I’ll blush even just saying it, and I’m so glad that the people can’t see the blushing. But also, there’s been so much nodding, and I’m noticing that that won’t be in the audio recording. We’re nodding along to each other so much. But the other side—and you can just nod in a way that gives me your—the other question that comes up for that is, yeah, what is the monetary piece of this, and where are the power dynamics inside this? And how do you feel about how that sits now as that video continues to just make its rounds on the internet and establish value for Boston Dynamics?
Thomas: I would love to start with the first question. And the second one is super important, and maybe another day for that one.
Ackerman: Okay. That’s fair. That’s fair.
LaViers: Yep. I like that. I like that. So the first question, so my favorite moments of the piece that you choreographed to Do You Love Me? for the Boston Dynamics robots: the swinging arms at the beginning, where you don’t fully know where this is going. It looks so casual and so, dare I say it, natural, although it’s completely artificial, right? And the proximal rotation of the legs, I feel like it’s a genius way of getting around no spine. But you really make use of things that look like hip joints or shoulder joints as a way of, to me, accessing a good wriggle or a good juicy moment, and then the Spot space hold, I call it, where the head of the Spot is holding in place and then the robot wiggles around that, dances around that. And then the moment when you see all four complete—these distinct bodies, and it looks like they’re dancing together. And we touched on that earlier—any shape can dance—but making them all dance together I thought was really brilliant and effective in the work. So if one of those moments is super interesting, or you have a funny story about one, I thought we could talk about it further.
Thomas: I have a funny story about the hip joints. So the initial— well, not the initial, but when they do the mashed potato, that was the first dance move that we started working on, on Atlas. And for folks who don’t know, the mashed potato is kind of the feet are going in and out; the knees are going in and out. So we ran into a couple of problems, which—and the twist. I guess it’s a combo. Both of them like you to roll your feet on the ground, like rub, and that friction was not good for the robots. So when we first started really moving into the twist, which has this torso twisting— the legs are twisting. The foot should be twisting on the floor. The foot is not twisting on the floor, and the legs were so turned out that the shape of the pelvic region looked like an over-full diaper. So, I mean, it was wiggling, but it made the robot look young. It made the robot look like it was in a diaper that needed to be changed. It did not look like a twist that anybody would want to do near anybody else. And it was really amazing how— I mean, it was just hilarious to see it. And the engineers come in. They’re really seeing the movement and trying to figure out what they need for the movement. And I was like, “Well, it looks like it has a very full diaper.” And they were like, “Oh.” They knew it didn’t quite look right, but it was like—because I think they really don’t project as much as I do. I’m very projective; that’s one of the ways that I watch work, or pull from the work, but that’s not what they were looking at. And so yeah, then you change the angles of the legs, how turned in it is and whatever, and it resolved to a degree, I think, fairly successfully. It doesn’t really look like a diaper anymore. But that wasn’t really— and also to get that move right took us over a month.
Ackerman: Wow.
LaViers: Wow.
Thomas: We got much faster after that because it was the first, and we really learned. But it took a month of programming, me coming in, naming specific ways of reshifting it before we got a twist that felt natural if amended because it’s not the same way that--
LaViers: Yeah. Well, and it’s fascinating to think about how to get it to look the same. You had to change the way it did the movement, is what I heard you describing there, and I think that’s so fascinating, right? And just how distinct the morphologies are between our body and any of these bodies, even the very facile, human-ish looking Atlas, that there’s still a lot of really nuanced, fine-grained, labor-intensive human work that goes into getting that to look the same as what we all think of as the twist or the mashed potato.
Thomas: Right. Right. And it does need to be something that we can project those dances onto, or it doesn’t work, in terms of this dance. It could work in another one. Yeah.
LaViers: Right. And you brought that up earlier, too, of trying to work inside of some established forms of dance as opposed to making us all terrified by the strange movement that can happen, which I think is interesting. And I hope one day you get to do that dance too.
Thomas: Yeah. No, I totally want to do that dance too.
Ackerman: Monica, do you have one last question you want to ask?
Thomas: I do. And this is— yeah. I want to ask you, kind of what does embodied or body-based intelligence offer in robotic engineering? So I feel like, you, more than anyone, can speak to that because I don’t do that side.
LaViers: Well, I mean, I think it can bring a couple of things. One, it can bring— I mean, the first moment in my career or life that that calls up for me is, I was watching one of my lab mates, when I was a doctoral student, give a talk about a quadruped robot that he was working on, and he was describing the crawling strategy, like the gait. And someone said— and I think it was roughly like, “Move the center of gravity inside the polygon of support, and then pick up— the polygon of support formed by three of the legs. And then pick up the fourth leg and move it. Establish a new polygon of support. Move the center of mass into that polygon of support.” And it’s described with these figures. Maybe there’s a center of gravity. It’s like a circle that’s like a checkerboard, and there’s a triangle, and there’s these legs. And someone stands up and is like, “That makes no sense like that. Why would you do that?” And I’m like, “Oh, oh, I know, oh, because that’s one of the ways you can crawl.” I actually didn’t get down on the floor and do it because I was not so outlandish at that point.
But today, in the RAD lab, that would be, “Everyone on all fours, try this strategy out.” Does it feel like a good idea? Are there other ideas that we would use to do this pattern that might be worth exploring here as well? And so truly rolling around on the floor and moving your body and pretending to be a quadruped, which— in my dance classes, it’s a very common thing to practice crawling because we all forget how to crawl. We want to crawl with the cross-lateral pattern and the homo-lateral pattern, and we want to keep our butts down-- or keep the butts up, but we want to have that optionality so that we look like we’re facile, natural crawlers. We train that, right? And so for a quadruped robot talk and discussion, I think there’s a very literal way that an embodied exploration of the idea is a completely legitimate way to do research.
Ackerman: Yeah. I mean, Monica, this is what you were saying, too, as you were working with these engineers. Sometimes it sounded like they could tell that something wasn’t quite right, but they didn’t know how to describe it, and they didn’t know how to fix it because they didn’t have that language and experience that both of you have.
Thomas: Yeah. Yeah, exactly that.
Ackerman: Okay. Well, I just want to ask you each one more really quick question before we end here, which is that, what is your favorite fictional robot and why? I hope this isn’t too difficult, especially since you both work with real robots, but. Amy, you want to go first?
LaViers: I mean, I’m going to feel like a party pooper. I don’t like any robots, real or fictional. The fictional ones annoy me because of the disambiguation issue, and WALL-E and EVE are so cute. And I do love cute things, but are those machines, or are those characters? And are we losing sight of that? I mean, my favorite robot to watch move, this one-- I mean, I love the Keepon dancing to Spoon. That is something that if you’re having an off day, you google Keepon dancing to Spoon— Keepon is one word, K-E-E-P-O-N, dancing to Spoon, and you just bop. It’s just a bop. I love it. It’s so simple and so pure and so right.
Ackerman: It’s one of my favorite robots of all time, Monica. I don’t know if you’ve seen this, but it’s two little yellow balls like this, and it just goes up and down and rocks back and forth. But it does it to music. It just does it so well. It’s amazing.
Thomas: I will definitely be watching that [crosstalk].
Ackerman: Yeah. And I should have expanded the question, and now I will expand it because Monica hasn’t answered yet. Favorite robot, real or fictional?
Thomas: So I don’t know if it’s my favorite. This one breaks my heart, and I’m currently having an empathy overdrive issue as a general problem. But there’s a robot installation—and I should know its name, but I don’t—where the robot reaches out, and it grabs the oil that they’ve created it to leak and pulls it towards its body. And it’s been doing this for several years now, but it’s really slowing down now. And I don’t think it even needs the oil. I don’t think it’s a robot that uses oil. It just thinks that it needs to keep it close. And it used to happy dance, and the oil has gotten so dark, the red rust color of—oh, this is so morbid—of blood, but it just breaks my heart. So I think I love that robot and also want to save it in the really unhealthy way that we sometimes identify with things that we shouldn’t be thinking about that much.
Ackerman: And you both gave amazing answers to that question.
LaViers: And the piece is Sun Yuan and Peng Yu’s Can’t Help Myself.
Ackerman: That’s right. Yeah.
LaViers: And it is so beautiful. I couldn’t remember the artist’s name either, but—you’re right—it’s so beautiful.
Thomas: It’s beautiful. The movement is beautiful. It’s beautifully considered as an art piece, and the robot is gorgeous and heartbreaking.
Ackerman: Yeah. Those answers were so unexpected, and I love that. So thank you both, and thank you for being on this podcast. This was an amazing conversation. We didn’t have nearly enough time, so we’re going to have to come back to so much.
LaViers: Thank you for having me.
Thomas: Thank you so much for inviting me. [music]
Ackerman: We’ve been talking with Monica Thomas and Amy LaViers about robots and dance. And thanks again to our guests for joining us for ChatBot and IEEE Spectrum. I’m Evan Ackerman.
Microfliers, or miniature wireless robots deployed in numbers, are sometimes used today for large-scale surveillance and monitoring purposes, such as in environmental or biological studies. Because of the fliers’ ability to disperse in air, they can spread out to cover large areas after being dropped from a single location, including in places where access is otherwise difficult. Plus, they are smaller, lighter, and cheaper to deploy than multiple drones.
One of the challenges in creating more efficient microfliers has been in reducing power consumption. One way to do so, as researchers from the University of Washington (UW) and Université Grenoble Alpes have demonstrated, is to get rid of the battery. With inspiration from the Japanese art of paper folding, origami, they designed programmable microfliers that can disperse in the wind and change shape using electronic actuation. This is achieved by a solar-powered actuator that can produce up to 200 millinewtons of force in 25 milliseconds.
“Think of these little fliers as a sensor platform to measure environmental conditions, like, temperature, light, and other things.”
—Vikram Iyer, University of Washington
“The cool thing about these origami designs is, we’ve created a way for them to change shape in midair, completely battery free,” says Vikram Iyer, computer scientist and engineer at UW, one of the authors. “It’s a pretty small change in shape, but it creates a very dramatic change in falling behavior…that allows us to get some control over how these things are flying.”
Tumbling and stable states: A) The origami microflier here is in its tumbling state and B) postlanding configuration. As it descends, the flier tumbles, with a typical tumbling pattern pictured in C. D) The origami microflier is here in its stable descent state. The fliers’ range of landing locations, E, reveals their dispersal patterns after being released from the parent drone. Vicente Arroyos, Kyle Johnson, and Vikram Iyer/University of Washington
This research builds on the researchers’ earlier work published in 2022, demonstrating sensors that can disperse in air like dandelion seeds. For the current study, “the goal was to deploy hundreds of these sensors and control where they land, to achieve precise deployments,” says coauthor Shyamnath Gollakota, who leads the Mobile Intelligence Lab at UW. The microfliers, each weighing less than 500 milligrams, can travel almost 100 meters in a light breeze, and wirelessly transmit data about air pressure and temperature via Bluetooth up to a distance of 60 meters. The group’s findings were published in Science Robotics earlier this month.
Discovering the difference in the falling behavior of the two origami states was serendipity, Gollakota says: “When it is flat, it’s almost like a leaf, tumbling [in the] the wind,” he says. “A very slight change from flat to a little bit of a curvature [makes] it fall like a parachute in a very controlled motion.” In their tumbling state, in lateral wind gusts, the microfliers achieve up to three times the dispersal distance as in their stable state, he adds.
This close-up of the microflier reveals the electronics and circuitry on its top side. Vicente Arroyos, Kyle Johnson, and Vikram Iyer/University of Washington
There have been other origami-based systems in which motors, electrostatic actuators, shape-memory alloys, and electrothermal polymers, for example, have been used, but these did not address the challenges facing the researchers, Gollakota says. One was to find the sweet spot between an actuation mechanism strong enough to not change shape without being triggered, yet lightweight enough to keep power consumption low. Next, it had to produce a rapid transition response while falling to the ground. Finally, it needed to have a lightweight energy storage solution onboard to trigger the transition.
The mechanism, which Gollakota describes as “pretty commonsensical,” still took them a year to come up with. There’s a stem in the middle of the origami, comprising a solenoid coil (a coil that acts as a magnet when a current passes through it) and two small magnets. Four hinged carbon-fiber rods attach the stem to the edges of the structure. When a pulse of current is applied to the solenoid coil, it pushes the magnets toward each other, making the structure snap into its alternative shape.
All it requires is a tiny bit of power, just enough to put the magnets within the right distance from each other for the magnetic forces to work, Gollakota says. There is an array of thin, lightweight solar cells to harvest energy, which is stored in a little capacitor. The circuit is fabricated directly on the foldable origami structure, and also includes a microcontroller, timer, Bluetooth receiver, and pressure and temperature sensors.
“We can program these things to trigger the shape change based on any of these things—after a fixed time, when we send it a radio signal, or, at an altitude [or temperature] that this device detects,” Iyer adds. The origami structure is bistable, meaning it does not need any energy to maintain shape once it has transitioned.
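To make that trigger logic concrete, here is a minimal sketch, in Python, of how a firing loop like the one Iyer describes might be organized on the flier’s microcontroller. Everything here is an assumption for illustration: the function names (read_pressure_altitude_m, radio_command_received, fire_solenoid_pulse) are hypothetical placeholders, not the team’s actual firmware API.

```python
import random
import time

# Hypothetical hardware helpers -- placeholders standing in for the (unpublished)
# firmware calls on the microflier's microcontroller.
def read_pressure_altitude_m() -> float:
    return 120.0 - 5.0 * random.random()  # fake altitude estimate from the pressure sensor

def radio_command_received() -> bool:
    return False  # no Bluetooth trigger command in this stub

def fire_solenoid_pulse() -> None:
    print("solenoid pulse: origami snaps from tumbling to stable descent state")

def run_trigger_loop(deploy_after_s: float | None = 3.0,
                     altitude_below_m: float | None = None) -> None:
    """Fire the shape change on the first trigger that becomes true:
    a fixed timer, a radio command, or an altitude threshold."""
    start = time.monotonic()
    while True:
        if deploy_after_s is not None and time.monotonic() - start >= deploy_after_s:
            break  # fixed-time trigger
        if radio_command_received():
            break  # remote radio trigger
        if altitude_below_m is not None and read_pressure_altitude_m() <= altitude_below_m:
            break  # altitude trigger
        time.sleep(0.05)
    # Because the origami is bistable, a single short pulse is enough;
    # no power is needed afterward to hold the new shape.
    fire_solenoid_pulse()

if __name__ == "__main__":
    run_trigger_loop(deploy_after_s=2.0)
```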
The researchers say their design can be extended to incorporate sensors for a variety of environmental monitoring applications. “Think of these little fliers as a sensor platform to measure environmental conditions, like temperature, light, and other things, [and] how they vary throughout the atmosphere,” Iyer says. Or they can deploy sensors on the ground for things like digital agriculture, climate change–related studies, and tracking forest fires.
In their current prototype, the microfliers only shape-change in one direction, but the researchers want to make them transition in both directions, to be able to toggle the two states, and control the trajectory even better. They also imagine a swarm of microfliers communicating with one another, controlling their behavior, and self-organizing how they are falling and dispersing.
Introduction: Geometric pattern formation is crucial in many tasks involving large-scale multi-agent systems. Examples include mobile agents performing surveillance, swarms of drones or robots, and smart transportation systems. Currently, most control strategies proposed to achieve pattern formation in network systems either show good performance but require expensive sensors and communication devices, or have lower sensor requirements but perform more poorly.
Methods: In this paper, we provide a distributed displacement-based control law that allows large groups of agents to achieve triangular and square lattices, with low sensor requirements and without needing communication between the agents. Also, a simple yet powerful adaptation law is proposed to automatically tune the control gains in order to reduce the design effort, while improving robustness and flexibility.
Results: We show the validity and robustness of our approach via numerical simulations and experiments, comparing it, where possible, with other approaches from the existing literature.
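As a rough illustration of the general idea of a displacement-based formation rule (not the specific control or adaptation law proposed in the paper), the following Python/NumPy sketch nudges each agent toward a desired spacing from the neighbors it can sense; with only local distance information, a planar swarm of this kind tends to settle into a triangular lattice.

```python
import numpy as np

def lattice_step(positions: np.ndarray,
                 desired_dist: float = 1.0,
                 sensing_radius: float = 1.5,
                 gain: float = 0.5,
                 dt: float = 0.05) -> np.ndarray:
    """One integration step of a generic displacement-based formation rule:
    each agent moves so that neighbors within its sensing radius sit at the
    desired spacing. Illustrative only; not the paper's proposed law."""
    n = positions.shape[0]
    velocities = np.zeros_like(positions)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            offset = positions[j] - positions[i]
            dist = np.linalg.norm(offset)
            if 0.0 < dist <= sensing_radius:
                # Attract if too far, repel if too close (virtual spring).
                velocities[i] += gain * (dist - desired_dist) * offset / dist
    return positions + dt * velocities

# Usage sketch: 50 agents scattered at random relax toward a triangular packing.
rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 5.0, size=(50, 2))
for _ in range(500):
    pos = lattice_step(pos)
```

An adaptation law of the kind the abstract mentions would, roughly speaking, adjust the gain online per agent instead of fixing it by hand, which is what reduces the design effort.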
The concept of sustainability and sustainable development has been widely discussed and has been the subject of many EU and UN conferences, resulting in agendas, goals, and resolutions. Yet the literature shows that the three dimensions of sustainability (ecological, social, and economic) are unevenly accounted for in the design of mechatronic products. The stated reasons range from a lack or inapplicability of tools for integration into the design process, models for simulation, and impact analyses to necessary changes in policy and social behavior. The influence designers have on the sustainability of a product lies mostly in the early phases of the development process, such as requirements engineering and concept evaluation. Currently, these concepts emerge mostly from performance-based requirements rather than sustainability-impact-based requirements, which is also true for service robots in urban environments. So far, the main research focus in this innovative and growing product branch has been performance in perception, navigation, and interaction. This paper focuses on integrating all three dimensions of sustainability into the design process. To that end, we describe the development of an urban service robot supporting municipal waste management in the city of Berlin. The goal set for the robot is to improve the service and support the employees while reducing emissions. For that, we make use of a product development process (PDP) and its adaptable nature to build a specific development process suited to including the three dimensions of sustainability during the requirements engineering and evaluation activities. We show how established design methods such as life cycle assessment and life cycle costing can be applied to the development of urban service robots, and which aspects remain underrepresented. The social dimension, in particular, required us to look beyond standardized methods in the field of mechanical engineering. Based on our findings, we introduce a new activity to the development process, which we call preliminary social assessment, in order to incorporate social aspects in the early design phase.
6D pose recognition has been a crucial factor in the success of robotic grasping, and recent deep learning based approaches have achieved remarkable results on benchmarks. However, their generalization capabilities in real-world applications remain unclear. To overcome this gap, we introduce 6IMPOSE, a novel framework for sim-to-real data generation and 6D pose estimation. 6IMPOSE consists of four modules: First, a data generation pipeline that employs the 3D software suite Blender to create synthetic RGBD image datasets with 6D pose annotations. Second, an annotated RGBD dataset of five household objects was generated using the proposed pipeline. Third, a real-time two-stage 6D pose estimation approach that integrates the object detector YOLO-V4 and a streamlined, real-time version of the 6D pose estimation algorithm PVN3D optimized for time-sensitive robotics applications. Fourth, a codebase designed to facilitate the integration of the vision system into a robotic grasping experiment. Our approach demonstrates the efficient generation of large amounts of photo-realistic RGBD images and the successful transfer of the trained inference model to robotic grasping experiments, achieving an overall success rate of 87% in grasping five different household objects from cluttered backgrounds under varying lighting conditions. This is made possible by fine-tuning data generation and domain randomization techniques and optimizing the inference pipeline, overcoming the generalization and performance shortcomings of the original PVN3D algorithm. Finally, we make the code, synthetic dataset, and all the pre-trained models available on GitHub.
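To show the general shape of such a two-stage pipeline (a detector first, then a pose estimator run only on the cropped detections), here is a minimal Python sketch. The function names and outputs are hypothetical placeholders for illustration, not the actual 6IMPOSE API, which is available in the authors’ repository.

```python
import numpy as np

# Hypothetical stand-ins for the two stages of a detector-plus-pose-estimator pipeline.
def detect_objects(rgb: np.ndarray) -> list[dict]:
    """Stage 1 (YOLO-style detector): return bounding boxes and class labels."""
    return [{"label": "mug", "box": (120, 80, 220, 200)}]  # placeholder output

def estimate_pose(rgb_crop: np.ndarray, depth_crop: np.ndarray, label: str) -> np.ndarray:
    """Stage 2 (PVN3D-style estimator): return a 4x4 object-to-camera transform."""
    return np.eye(4)  # placeholder pose

def two_stage_pipeline(rgb: np.ndarray, depth: np.ndarray) -> list[tuple[str, np.ndarray]]:
    """Run detection first, then estimate pose only inside each cropped region,
    which keeps the second (heavier) stage fast enough for real-time grasping."""
    results = []
    for det in detect_objects(rgb):
        x0, y0, x1, y1 = det["box"]
        pose = estimate_pose(rgb[y0:y1, x0:x1], depth[y0:y1, x0:x1], det["label"])
        results.append((det["label"], pose))
    return results

# Usage sketch with synthetic RGBD frames.
rgb = np.zeros((480, 640, 3), dtype=np.uint8)
depth = np.zeros((480, 640), dtype=np.float32)
print(two_stage_pipeline(rgb, depth))
```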
There seem to be two general approaches to cooking automation. There’s the “let’s make a robot that can operate in a human kitchen because everyone has a human kitchen” approach, which seems like a good idea, except that you then have to build your robot to function in human environments, which is super hard. On the other end of the spectrum, there’s the “let’s make a dedicated automated system because automation is easier than robotics” approach, which seems like a good idea, except that you then have to be willing to accept compromises in recipes and texture and taste, because preparing food in an automated way just does not yield the same result, as anyone who has ever attempted to Cuisinart their way out of developing some knife skills can tell you.
The Robotics and Mechanisms Lab (RoMeLa) at UCLA, run by Dennis Hong, has been working on a compromise approach that leverages both robot-friendly automation and the kind of human skills that make things taste right. Called Project YORI, which somehow stands for “Yummy Operations Robot Initiative” while also meaning “cooking” in Korean, the system combines a robot-optimized environment with a pair of arms that can operate kitchen tools sort of like a human.
“Instead of trying to mimic how humans cook,” the researchers say, “we approached the problem by thinking how cooking would be accomplished if a robot cooks. Thus the YORI system does not use the typical cooking methods, tools or utensils which are developed for humans.” In addition to a variety of automated cooking systems, the tools that YORI does use are modified to work with a tool changing system, which mostly eliminates the problem of grasping something like a knife well enough that you can precisely and repeatedly exert a substantial amount of force through it, and also helps keep things structured and accessible.
In terms of cooking methods, the system takes advantage of technology when and where it works better than conventional human cooking techniques. For example, in order to tell whether ingredients are fresh or to determine when food is cooked ideally, YORI “utilizes unique chemical sensors,” which I guess are the robot equivalent of a nose and taste buds and arguably would do a more empirical job than some useless recipe metric like “season to taste.”
The advantage of a system like this is versatility. In theory, it’s not as constrained by recipes that you can cram into a system built around automation because of those added robotic capabilities, while also being somewhat practical—or at least, more practical than a robot designed to interact with a lightly modified human kitchen. And it’s actually designed to be practical(ish), in the sense that it’s being developed under a partnership with Woowa Brothers, the company that runs the leading food delivery service in South Korea. It’s obviously still a work in progress—you can see a human hand sneaking in there from time to time. But the approach seems interesting, and I hope that RoMeLa keeps making progress on it, because I’m hungry.
Video Friday is your weekly selection of awesome robotics videos, collected by your friends at IEEE Spectrum robotics. We also post a weekly calendar of upcoming robotics events for the next few months. Please send us your events for inclusion.
Enjoy today’s videos!
Musical dancing is an ubiquitous phenomenon in human society. Providing robots the ability to dance has the potential to make the human/robot coexistence more acceptable. Hence, dancing robots have generated a considerable research interest in the recent years. In this paper, we present a novel formalization of robot dancing as planning and control of optimally timed actions based on beat timings and additional features extracted from the music.
Wow! Okay, all robotics videos definitely need confetti cannons.
[ DFKI ]
What an incredibly relaxing robot video this is.
Except for the tree bit, I mean.
Skydio has a fancy new drone, but not for you!
Skydio X10, a drone designed for first responders, infrastructure operators, and the U.S. and allied militaries around the world. It has the sensors to capture every detail of the data that matters and the AI-powered autonomy to put those sensors wherever they are needed. It packs more capability and versatility in a smaller and easier-to-use package than has ever existed.
[ Skydio X10 ]
An innovative adaptive bipedal robot with bio-inspired multimodal locomotion control can autonomously adapt its body posture to balance on pipes, surmount obstacles of up to 14 centimeters in height (48 percent of its height), and stably move between horizontal and vertical pipe segments. This cutting-edge robotics technology addresses challenges that out-pipe inspection robots have encountered and can enhance out-pipe inspections within the oil and gas industry.
Thanks, Poramate!
I’m not totally sure how you’d control all of these extra arms in a productive way, but I’m sure they’ll figure it out!
[ KIMLAB ]
The video is one of the tests we tried on the X30 robot dog in the R&D period, to examine the speed of its stair-climbing ability.
[ Deep Robotics ]
They’re calling this the “T-REX” but without a pair of tiny arms. Missed opportunity there.
[ AgileX ]
Drag your mouse to look around within this 360-degree panorama captured by NASA’s Curiosity Mars rover. See the steep slopes, layered buttes, and dark rocks surrounding Curiosity while it was parked below Gediz Vallis Ridge, which formed as a result of violent debris flows that were later eroded by wind into a towering formation. This happened about 3 billion years ago, during one of the last wet periods seen on this part of the Red Planet.
[ NASA ]
I don’t know why you need to drive out into the woods to drop-test your sensor rack. Though maybe the stunning Canadian backwoods scenery is reason enough.
[ NORLab ]
Here’s footage of Reachy in the kitchen, opening the fridge’s door and others, cleaning dirt and coffee stains.
If they ever make Reachy’s face symmetrical, I will refuse to include it in any more Video Fridays. O_o
[ Pollen Robotics ]
Inertial odometry is an attractive solution to the problem of state estimation for agile quadrotor flight. In this work, we propose a learning-based odometry algorithm that uses an inertial measurement unit (IMU) as the only sensor modality for autonomous drone racing tasks. We show that our inertial odometry algorithm is superior to the state-of-the-art filter-based and optimization-based visual-inertial odometry as well as the state-of-the-art learned-inertial odometry in estimating the pose of an autonomous racing drone.
[ UZH RPG ]
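For readers curious where a learned model slots into IMU-only state estimation, here is a minimal strapdown dead-reckoning sketch: body-frame accelerations are rotated into the world frame, gravity is removed, velocity and position are integrated, and a hypothetical trained network regresses a velocity correction from a window of IMU samples. This is illustrative only and not the implementation behind the video.

import numpy as np

GRAVITY = np.array([0.0, 0.0, -9.81])  # world-frame gravity, z up

def integrate_imu(R_wb, v, p, acc_b, gyro_b, dt):
    """One Euler step of strapdown integration from body-frame IMU data."""
    a_w = R_wb @ acc_b + GRAVITY                   # specific force -> world acceleration
    p = p + v * dt + 0.5 * a_w * dt**2
    v = v + a_w * dt
    wx, wy, wz = gyro_b * dt                       # first-order orientation update
    R_wb = R_wb @ np.array([[1.0, -wz,  wy],
                            [ wz, 1.0, -wx],
                            [-wy,  wx, 1.0]])
    return R_wb, v, p

def corrected_velocity(v, imu_window, net):
    """Hypothetical hook: add the drift correction predicted by a trained model."""
    return v + net(imu_window)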
Robotic Choreographer is the world’s first dance performance-only robot arm born from the concept of performers that are bigger and faster than humans. This robot has a total length of 3 meters, two rotation axes that rotate infinitely, and an arm rotating up to five times for 1 second.
[ MPlusPlus ] via [ Kazumichi Moriyama ]
This video shows the latest development from Extend Robotics, demonstrating the completion of integration of the Mitsubishi Electric Melfa robot. Key demonstrations include 6 degrees-of-freedom (DoF) precision control with real-time inverse kinematics, dual Kinect camera, low-latency streaming and fusion, and high precision control drawing.
[ Extend Robotics ]
Here’s what’s been going on at the GRASP Lab at UPenn.
[ GRASP Lab ]
This paper presents an in-pipe robot with three underactuated parallelogram crawler modules that can automatically shift its body shape when it encounters obstacles. The shape-shifting movement is driven by a single actuator through a simple differential mechanism built from only a pair of spur gears, which allows downsizing, cost reduction, and simpler control for obstacle adaptation. Because the parallelogram shape does not change the total belt circumference, no additional mechanism is needed to maintain belt tension. Moreover, the proposed crawler can form an anterior-posterior symmetric parallelogram relative to the moving direction, which yields high adaptability in both the forward and backward directions. However, whether locomotion or shape-shifting is driven depends on the gear ratio of the differential mechanism, because the two movements are switched purely mechanically. Therefore, to clarify the gear-ratio requirements for passive adaptation, the two outputs of each crawler mechanism (the torques of the flippers and of the front pulley) are analyzed quasi-statically, and the influence of environmental and design parameters on the robot’s performance is verified in real experiments. In these experiments, although the robot could not adapt to the stepped pipe in the vertical section, it successfully shifted its crawlers to the parallelogram shape only in the horizontal section, using our simulated output ratio.
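To make the gear-ratio argument concrete, here is an illustrative quasi-static check, under the common idealization that an ideal differential locks its two output torques in a fixed ratio set by the gearing and that the branch whose resisting load is exceeded first starts to move. The split formula and all numbers are assumptions for illustration, not the paper's model.

def torque_split(tau_in, n):
    """Ideal differential: pulley and flipper torques locked in the ratio 1 : n."""
    tau_pulley = tau_in / (1.0 + n)        # locomotion branch
    tau_flipper = n * tau_in / (1.0 + n)   # shape-shifting branch
    return tau_pulley, tau_flipper

def yielding_branch(tau_in, n, load_pulley, load_flipper):
    """Which branch moves first, given the resisting loads on each output."""
    tau_p, tau_f = torque_split(tau_in, n)
    if tau_f > load_flipper and tau_p <= load_pulley:
        return "shape-shift"
    if tau_p > load_pulley and tau_f <= load_flipper:
        return "locomotion"
    return "both" if tau_p > load_pulley else "stalled"

# Example: with n = 3 the flippers see three times the pulley torque, so the
# crawler deforms around an obstacle before the tracks are driven.
print(yielding_branch(tau_in=2.0, n=3.0, load_pulley=0.8, load_flipper=1.2))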
Video Friday is your weekly selection of awesome robotics videos, collected by your friends at IEEE Spectrum robotics. We also post a weekly calendar of upcoming robotics events for the next few months. Please send us your events for inclusion.
Enjoy today’s videos!
Researchers at the University of Washington have developed small robotic devices that can change how they move through the air by “snapping” into a folded position during their descent. When these “microfliers” are dropped from a drone, they use a Miura-ori origami fold to switch from tumbling and dispersing outward through the air to dropping straight to the ground.
And you can make your own! The origami part, anyway:
[ Science Robotics ] via [ UW ]
Thanks, Sarah!
A central question in robotics is how to design a control system for an agile, mobile robot. This paper studies this question systematically, focusing on a challenging setting: autonomous drone racing. We show that a neural network controller trained with reinforcement learning (RL) outperforms optimal control (OC) methods in this setting. Our findings allow us to push an agile drone to its maximum performance, achieving a peak acceleration greater than 12 g and a peak velocity of 108 km/h.
Also, please see our feature story on a related topic.
[ Science Robotics ]
Ascento has a fresh $4.3m in funding to develop its cute two-wheeled robot for less-cute security applications.
[ Ascento ]
Thanks, Miguel!
The evolution of Roomba is here. Introducing three new robots, with three new powerful ways to clean. For over 30 years, we have been on a mission to build robots that help people to do more. Now, we are answering the call from consumers to expand our robot lineup to include more 2 in 1 robot vacuum and mop options.
[ iRobot ]
As the beginning of 2023 Weekly KIMLAB, we want to introduce PAPRAS, Plug-And-Play Robotic Arm System. A series of PAPRAS applications will be posted in coming weeks. If you are interested in details of PAPRAS, please check our paper.
Gerardo Bledt was the Head of our Locomotion and Controls Team at Apptronik. He tragically passed away this summer. He was a friend, colleague, and force of nature. He was a maestro with robots, and showed all of us what was possible. We dedicate Apollo and our work to Gerardo.
[ Apptronik ]
This robot plays my kind of Jenga.
This teleoperated robot was built by Lingkang Zhang, who tells us that it was inspired by Sanctuary AI’s robot.
[ HRC Model 4 ]
Thanks, Lingkang!
Soft universal grippers are advantageous to safely grasp a wide variety of objects. However, due to their soft material, these grippers have limited lifetimes, especially when operating in unstructured and unfamiliar environments. Our self-healing universal gripper (SHUG) can grasp various objects and recover from substantial realistic damages autonomously. It integrates damage detection, heat-assisted healing, and healing evaluation. Notably, unlike other universal grippers, the entire SHUG can be fully reprocessed and recycled.
Thanks, Bram!
How would the movie Barbie look with robots?
[ Misty ]
Zoox is so classy that if you get in during the day and get out at night, it’ll apparently give you a free jean jacket.
[ Zoox ]
X30, the next generation of industrial inspection quadruped robot is on its way. It is now moving and climbing faster, and it has stronger adaptability to adverse environments with advanced add-ons.
[ DeepRobotics ]
Join us on an incredible journey with Alma, a cutting-edge robot with the potential to revolutionize the lives of people with disabilities. This short documentary takes you behind the scenes of our team’s preparation for the Cybathlon challenge, a unique competition that brings together robotics and human ingenuity to solve real-world challenges.
[ Cybathlon ]
NASA’s Moon rover prototype completed software tests. The VIPER mission is managed by NASA’s Ames Research Center in California’s Silicon Valley and is scheduled to be delivered to Mons Mouton near the South Pole of the Moon in late 2024 by Astrobotic’s Griffin lander as part of the Commercial Lunar Payload Services initiative. VIPER will inform future Artemis landing sites by helping to characterize the lunar environment and help determine locations where water and other resources could be harvested to sustain humans over extended stays.
[ NASA ]
We are excited to announce Husky Observer, a fully integrated system that enables robotics developers to accelerate inspection solutions. Built on top of the versatile Husky platform, this new configuration will enable robotics developers to build their inspection solutions and fast track their system development.
[ Clearpath ]
Land mines and other unexploded ordnance from wars past and present maim or kill thousands of civilians in dozens of nations every year. Finding and disarming them is a slow, dangerous process. Researchers from the Columbia Climate School’s Lamont-Doherty Earth Observatory and other institutions are trying to harness drones, geophysics and artificial intelligence to make the process faster and safer.
[ Columbia ]
Drones are being used by responders in the terrible Morocco earthquake. This 5-minute video describes the five ways in which drones are typically used in earthquake response, and four ways in which they aren’t.
[ CRASAR ]