Feed aggregator

Small soft robotic systems are being explored for myriad applications in medicine. Specifically, magnetically actuated microrobots capable of remote manipulation hold significant potential for the targeted delivery of therapeutics and biologicals. Much of the previous effort in microrobotics has been dedicated to locomotion in aqueous environments and on hard surfaces. However, our human bodies are made of dense biological tissues, requiring researchers to develop new microrobots that can locomote atop tissue surfaces. Tumbling microrobots are a sub-category of these devices, capable of walking across surfaces when guided by rotating magnetic fields. Using microrobots to deliver payloads to specific regions of sensitive tissues is a primary goal of medical microrobotics. Central nervous system (CNS) tissues are a prime candidate given their delicate structure and highly region-specific function. Here we demonstrate surface walking of soft alginate capsules atop a rat cortex and a mouse spinal cord ex vivo, achieving small-molecule delivery to up to six different locations on each type of tissue with high spatial specificity. The softness of the alginate gel prevents injuries that may arise from friction with CNS tissues during millirobot locomotion. Development of this technology may be useful in clinical and preclinical applications such as drug delivery, neural stimulation, and diagnostic imaging.
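
To give a feel for how a rotating magnetic field commands a tumbling capsule, here is a minimal sketch (not the authors' control code) that computes a field vector rotating in a vertical plane whose horizontal axis sets the direction of travel. The field magnitude, rotation frequency, and heading are illustrative placeholders, not values from the paper.

```python
import numpy as np

def rotating_field(t, heading_deg=0.0, B0=10e-3, freq=1.0):
    """Magnetic flux density (tesla) of a field rotating in a vertical plane.

    The rotation axis lies in the horizontal plane at `heading_deg`, so a
    magnetized capsule tumbles end-over-end along that heading.
    All parameters are illustrative, not values from the paper.
    """
    heading = np.deg2rad(heading_deg)
    # Unit vectors spanning the vertical rotation plane:
    u = np.array([np.cos(heading), np.sin(heading), 0.0])  # along travel direction
    v = np.array([0.0, 0.0, 1.0])                          # vertical
    phase = 2.0 * np.pi * freq * t
    return B0 * (np.cos(phase) * u + np.sin(phase) * v)

# Example: sample the field command over one rotation period at 100 Hz.
ts = np.arange(0.0, 1.0, 0.01)
fields = np.array([rotating_field(t, heading_deg=30.0) for t in ts])
print(fields.shape)  # (100, 3)
```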

When life knocks you down, you’ve got to get back up. Ladybugs take this advice seriously in the most literal sense. If caught on their backs, the insects are able to use their tough exterior wings, called elytra (lately made famous by the game Minecraft), to right themselves in just a fraction of a second.

Inspired by this approach, researchers have created self-righting drones with artificial elytra. Simulations and experiments show that the artificial elytra can not only help salvage fixed-wing drones from compromising positions, but also improve the aerodynamics of the vehicles during flight. The results are described in a study published July 9 in IEEE Robotics and Automation Letters.

Charalampos Vourtsis is a doctoral assistant at the Laboratory of Intelligent Systems, Ecole Polytechnique Federale de Lausanne in Switzerland who co-created the new design. He notes that beetles, including ladybugs, have existed for tens of millions of years. “Over that time, they have developed several survival mechanisms that we found to be a source of inspiration for applications in modern robotics,” he says.

His team was particularly intrigued by beetles’ elytra, which for ladybugs are the famous black-spotted, red exterior wings. Underneath the elytra is the hind wing, the semi-transparent appendage that’s actually used for flight.

When stuck on their backs, ladybugs use their elytra to stabilize themselves, and then thrust their legs or hind wings in order to pitch over and self-right. Vourtsis’ team designed Micro Aerial Vehicles (MAVs) that use a similar technique, but with actuators to provide the self-righting force. “Similar to the insect, the artificial elytra feature degrees of freedom that allow them to reorient the vehicle if it flips over or lands upside down,” explains Vourtsis.

The researchers created and tested artificial elytra of different lengths (11, 14 and 17 centimeters) and torques to determine the most effective combination for self-righting a fixed-wing drone. While torque had little impact on performance, the length of elytra was found to be influential.

On a flat, hard surface, the shorter elytra lengths yielded mixed results. However, the longer length was associated with a perfect success rate. The longer elytra were then tested on different inclines of 10°, 20° and 30°, and at different orientations. The drones used the elytra to self-right themselves in all scenarios, except for one position at the steepest incline.  

The design was also tested on seven different terrains: pavement, coarse sand, fine sand, rocks, shells, wood chips and grass. The drones were able to self-right with a perfect success rate across all terrains, with the exception of grass and fine sand. Vourtsis notes that the current design was made from widely available materials and a simple scale model of the beetle’s elytra—but further optimization may help the drones self-right on these more difficult terrains.

As an added bonus, the elytra were found to add non-negligible lift during flight, which offsets their weight.  

Vourtsis says his team hopes to benefit from other design features of the beetles’ elytra. “We are currently investigating elytra for protecting folding wings when the drone moves on the ground among bushes, stones, and other obstacles, just like beetles do,” explains Vourtsis. “That would enable drones to fly long distances with large, unfolded wings, and safely land and locomote in a compact format in narrow spaces.”

Last week, DARPA announced the twelve teams who will be competing in the Virtual Track of the DARPA Subterranean Challenge Finals, scheduled to take place in September in Louisville, KY. The robots and the environment may be virtual, but the prize money is very real, with $1.5 million of DARPA cash on the table for the teams who are able to find the most subterranean artifacts in the shortest amount of time.

You can check out the list of Virtual Track competitors here, but we’ll be paying particularly close attention to Team Coordinated Robotics and Team BARCS, who have been trading first and second place back and forth across the three previous competitions. But there are many other strong contenders, and since nearly a year will have passed between the Final and the previous Cave Circuit, there’s been plenty of time for all teams to have developed creative new ideas and improvements.

As a quick reminder, the SubT Final will include elements of tunnels, caves, and the urban underground. As before, teams will be using simulated models of real robots to explore the environment looking for artifacts (like injured survivors, cell phones, backpacks, and even hazardous gas), and they’ll have to manage things like austere navigation, degraded sensing and communication, dynamic obstacles, and rough terrain.

While we’re not sure exactly what the Virtual Track is going to look like, one of the exciting aspects of a virtual competition like this is how DARPA is not constrained by things like available physical space or funding. They could make a virtual course that incorporates the inside of the Egyptian pyramids, the Cheyenne Mountain military complex, and my basement, if they were so inclined. We are expecting a combination of the overall themes of the three previous virtual courses (tunnel, cave, and urban), but connected up somehow, and likely with a few surprises thrown in for good measure.

To some extent, the Virtual Track represents the best case scenario for SubT robots, in the sense that fewer things will just spontaneously go wrong. This is something of a compromise, since things very often spontaneously go wrong when you’re dealing with real robots in the real world. This is not to diminish the challenges of the Virtual Track in the least—even the virtual robots aren’t invincible, and their software will need to keep them from running into simulated walls or falling down simulated stairs. But as far as I know, the virtual robots will not experience damage during transport to the event, electronics shorting, motors burning out, emergency stop buttons being accidentally pressed, and that sort of thing. If anything, this makes the Virtual Track more exciting to watch, because you’re seeing teams of virtual robots on their absolute best behavior challenging each other primarily on the cleverness and efficiency of their programmers. 

The other reason that the Virtual Track is more exciting is that unlike the Systems Track, there are no humans in the loop at all. Teams submit their software to DARPA, and then sit back and relax (or not) and watch their robots compete all by themselves in real time. This is a hugely ambitious way to do things, because a single human even a little bit in the loop can provide the kind of critical contextual world knowledge and mission perspective that robots often lack. A human in there somewhere is fine in the near to medium term, but full autonomy is the dream.

As for the Systems Track (which involves real robots on the physical course in Louisville), we’re not yet sure who all of the final competitors will be. The pandemic has made travel complicated, and some international teams aren’t yet sure whether they’ll be able to make it. Either way, we’ll be there at the end of September, when we’ll be able to watch both the Systems and Virtual Track teams compete for the SubT Final championship.

Sensor morphology and structure can significantly aid and improve tactile sensing capabilities, through mechanisms such as improved sensitivity or morphological computation. However, different tactile tasks require different morphologies, which poses a challenge: how should sensors best be designed, and how can sensor morphology be varied? We introduce a jamming filter which, when placed over a tactile sensor, can be shaped and molded online, thus varying the sensor structure. We demonstrate how this is beneficial for sensory tasks, analyzing how the change in sensor structure varies the information gained through the sensor. Moreover, we show that appropriate morphology can significantly influence discrimination, and observe that selecting an appropriate filter can increase object classification accuracy with standard classifiers by up to 28%.
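
The abstract does not spell out the classification setup, so the sketch below is only a hedged illustration of the final comparison: two synthetic feature sets stand in for tactile readings collected with an unshaped and a well-shaped filter, and a standard support-vector classifier's cross-validated accuracy is compared between them. The data, the `separation` parameter, and the resulting numbers are placeholders, not results from the paper.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def synthetic_tactile_data(n_objects=4, n_samples=50, n_taxels=16, separation=1.0):
    """Placeholder tactile features: one Gaussian cluster per object class.

    `separation` stands in for how well the sensor morphology separates
    objects in feature space; it is purely illustrative.
    """
    X, y = [], []
    for label in range(n_objects):
        center = rng.normal(scale=separation, size=n_taxels)
        X.append(center + rng.normal(scale=1.0, size=(n_samples, n_taxels)))
        y.append(np.full(n_samples, label))
    return np.vstack(X), np.concatenate(y)

# Hypothetical comparison: a well-chosen filter shape -> better class separation.
X_flat, y_flat = synthetic_tactile_data(separation=0.8)      # unshaped filter
X_shaped, y_shaped = synthetic_tactile_data(separation=1.6)  # shaped filter

clf = SVC(kernel="rbf")
acc_flat = cross_val_score(clf, X_flat, y_flat, cv=5).mean()
acc_shaped = cross_val_score(clf, X_shaped, y_shaped, cv=5).mean()
print(f"accuracy without shaping: {acc_flat:.2f}, with shaping: {acc_shaped:.2f}")
```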

You’re an engineer trying to work out a solution to a complicated problem.  You have been at this problem for the last three days. You’ve been leveraging your expertise in innovative methods and other disciplined processes, but you still haven’t gotten to where you need to be.

Imagine if you could forgo those last three days of work and instead reach a novel solution in just 30 minutes. In addition to saving yourself days of effort, you would not only have arrived at a solution to your vexing engineering issue, but you would also have prepared all the necessary documentation to apply for intellectual property (IP) protection for it.

This is now what’s available from IP.com with its latest suite of workflow solutions, dubbed IQ Ideas Plus™. IQ Ideas Plus makes it easy for inventors to submit, refine, and collaborate on ideas that are then delivered to the IP team for review. This new workflow solution is built on IP.com’s AI natural language processing engine, Semantic Gist™, which the company has been refining since 1994. The IQ Ideas Plus portfolio was introduced earlier this year in the U.S. and has started rolling out worldwide.

“The great thing about Semantic Gist is that it is set up to do a true semantic search,” explained Dr. William Fowlkes, VP Analytics and Workflow Solutions at IP.com and developer of the IQ Ideas Plus solution. “It works off of your description. It does not require you to use arcane codes to define subject matters, to use keywords, or rely on complex Boolean constructs to find the key technology that you're looking for.”

The program leverages AI to analyze your words: the description of your problem is turned into a query. The AI engine then analyzes that query for its technical content and, using cosine-similarity techniques and vector math, searches eight or nine million patents from any field for ones that are similar to your problem.

“Even patents that look like they're in a different field sometimes have some pieces, some key technology nuggets, that are actually similar to your problem and it will find those,” added Fowlkes.
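
As a rough illustration of the cosine-similarity retrieval Fowlkes describes (the Semantic Gist engine itself is proprietary), the sketch below ranks a handful of toy "patent" texts against a problem description using TF-IDF vectors and cosine similarity. The corpus, the vectorizer, and the query are all stand-ins.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy corpus standing in for millions of patent abstracts.
patents = [
    "Heat dissipation assembly for battery packs using phase-change material",
    "Vibration damping mount for aerospace actuators",
    "Method for reducing thermal stress in power electronics enclosures",
]
problem = "How do I keep an electronics enclosure cool under high thermal load?"

# Vectorize the problem description together with the corpus,
# then rank patents by cosine similarity to the query.
vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(patents + [problem])
scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()

for score, text in sorted(zip(scores, patents), reverse=True):
    print(f"{score:.2f}  {text}")
```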

In a typical session, you might spend 10 – 15 minutes describing your problem on the IQ Ideas Plus template, which includes root cause analysis, when you need to fix a specific problem, or system improvement analysis, when you are asked to develop the next big thing for an existing product. The template lists those elements that you need to include so that you describe all the relevant factors and how they work together. 

The template involves a graphical user interface (GUI) that starts by asking you to name your new analysis and to describe the type of analysis you’ll be conducting: “Solve a Problem”, or “Improve a System”. 

After you’ve chosen to ‘Solve a Problem’, for example, you are given a drop-down menu that asks you what field this problem resides in, i.e., mechanical engineering, electrical engineering, etc. The next drop-down menu then asks what sub-group this field belongs to, i.e., aerospace. After you’ve chosen your fields, you write a fairly simple description of your problem and ask for a solution (How do I fix…?). 

You then press the button, and three to five seconds later, you’re provided two lists – “Functional Concepts” and “Inventive Principles”. One can think of the Functional Concepts list as a thorough catalogue of all the prior art in this area. What really distinguishes the IQ Ideas Plus process is the “Inventive Principles” list, which consists of abstractions drawn from previous patents or patent applications.

The semantic engine returns ordered lists with the most relevant results at the top. Of course, as you scroll down through the list, after the first 10 to 20, the results become less and less relevant.

What will often happen is that as you work through both the “Functional Concepts” and “Inventive Principles” lists you begin to realize that you’ve omitted elements to your description, or that your description should go in a slightly different direction based on the results. While this represents a slightly iterative process, each iteration is just as fast as the first. In fact, it's faster because you no longer need to spend 10 minutes writing down your changes. All along the process, there's a workbook, similar to an electronic lab notebook, for you to jot down your ideas. 

As you jot down your ideas based on the recommendations from the AI, it will offer you the ability to run a concept evaluation, telling you whether the concept is “marginally acceptable” or “good”, for example. You can use this concept evaluation tool to understand whether you have written your problem and solution in a way that is unique or novel, or whether you should consider going back to the drawing board and keep iterating on it.

When you get a high-scoring idea, the next module, called “Inventor’s Aide,” helps you write a very clear invention disclosure. In many organizations, drafting and submitting disclosures can be a pain point for engineers. Inventor’s Aide makes the process fast and easy, providing suggestions to make the language clear and concise.

With the IQ Ideas Plus suite of tools, all of the paperwork (i.e., a list of related or similar patents, companies active in the field, a full technology landscape review, etc.) is included as attachments to your invention disclosure so that when it gets sent to the patent committee, they can look at the idea and know what art is already there and what technologies are in the space. They can then vet your idea, which has been delivered in a clear, concise manner with no jargon, so they understand the idea you have written. 

The cycle time between a patent review committee looking at your disclosure and you getting it back can sometimes take weeks.  IQ Ideas Plus shortens the cycle time, drives efficiencies and reduces a lot of frustration on both ends of the equation.  Moving more complete disclosures through the system improves the grant rate of the applications because the tool has helped document necessary legwork during the process. 

“IQ Ideas does a great job of both helping you to find novel solutions using the brainstorming modules, and then analyzing those new ideas using the Inventor’s Aide module,” Fowlkes said.

Fowlkes argues that this really benefits both sides of the invention process – product development engineers and IP teams. For the engineers, filing invention disclosures is a very burdensome task. For the patent review committees or IP counsel, getting clear, concise disclosures, free of jargon and acronyms and complete with documentation of prior art attached, makes the review faster and more efficient.

Professor Greg Gdowski, Executive Director of the Center for Medical Technology & Innovation at the University of Rochester, deployed IQ Ideas Plus with his students earlier this year. According to Gdowski, IQ Ideas Plus has proven very valuable.

“We train our students in carrying out technology landscapes on unmet clinical needs that are observed in our surgical operating rooms. Despite our best efforts, the students always miss technologies that are out there in the form of patents or patent applications. IQ Ideas Plus not only helped us brainstorm additional solutions, but it also revealed existing technologies that would have complicated the solution space had they not been identified.”

Gdowski said another important advantage of using IQ Ideas Plus was that it helped the team understand the distribution of patents and companies working on technology related to a specific unmet clinical need (or problem).   “IQ Ideas Plus gives engineers a new lens by which to evaluate solutions to problems and to execute intellectual property landscapes,” Gdowski added.

IQ Ideas Plus enables faster idea generation and collaboration and produces more complete documents for submission and review, so the best ideas surface sooner and get to market faster.

Footnote:
Greg Gdowski is the IEEE Region 1 Director-Elect Candidate
Dr. William Fowlkes is an IEEE Senior Member

This is a guest post. The views expressed here are solely those of the author and do not represent positions of IEEE Spectrum or the IEEE.

Most people associate artificial intelligence with robots as an inseparable pair. In fact, the term “artificial intelligence” is rarely used in research labs. Terminology specific to certain kinds of AI and other smart technologies is more relevant. Whenever I’m asked the question “Is this robot operated by AI?”, I hesitate to answer—wondering whether it would be appropriate to call the algorithms we develop “artificial intelligence.”

First used by scientists such as John McCarthy and Marvin Minsky in the 1950s, and frequently appearing in sci-fi novels or films for decades, AI is now being used in smartphone virtual assistants and autonomous vehicle algorithms. Both historically and today, AI can mean many different things—which can cause confusion.

However, people often express the preconception that AI is an artificially realized version of human intelligence. And that preconception might come from our cognitive bias as human beings. 

We judge robots’ or AI’s tasks in comparison to humans

If you happened to follow this news in 2016, how did you feel when AlphaGo, an AI developed by DeepMind, defeated the 9-dan Go player Lee Sedol? You may have been surprised or terrified, thinking that AI had surpassed the ability of geniuses. Still, winning a game with an exponential number of possible moves like Go only means that AI has exceeded a very limited part of human intelligence. The same goes for IBM’s AI, Watson, which competed on ‘Jeopardy!’, the television quiz show.

I believe many were impressed to see the Mini Cheetah, developed in my MIT Biomimetic Robotics Laboratory, perform a backflip. While jumping backwards and landing on the ground is very dynamic, eye-catching and, of course, difficult for humans, the algorithm for that particular motion is incredibly simple compared to one that enables stable walking, which requires much more complex feedback loops. Achieving robot tasks that are seemingly easy for us is often extremely difficult and complicated. This gap occurs because we tend to judge a task’s difficulty by human standards.

Achieving robot tasks that are seemingly easy for us is often extremely difficult and complicated.

We tend to generalize AI functionality after watching a single robot demonstration. When we see someone on the street doing backflips, we tend to assume this person would be good at walking and running, and also be flexible and athletic enough to be good at other sports. Very likely, such judgement about this person would not be wrong.

However, can we also apply this judgement to robots? It’s easy for us to generalize and determine AI performance based on an observation of a specific robot motion or function, just as we do with humans. By watching a video of a robot hand solving a Rubik’s Cube at OpenAI, an AI research lab, we think that the AI can perform all other simpler tasks because it can perform such a complex one. We overlook the fact that this AI’s neural network was trained only for a limited type of task: solving the Rubik’s Cube in that configuration. If the situation changes—for example, holding the cube upside down while manipulating it—the algorithm does not work as well as might be expected.

Unlike AI, humans can combine individual skills and apply them to multiple complicated tasks. Once we learn how to solve a Rubik’s Cube, we can quickly work on the cube even when we’re told to hold it upside down, though it may feel strange at first. Human intelligence can naturally combine the objectives of not dropping the cube and solving the cube. Most robot algorithms will require new data or reprogramming to do so. A person who can spread jam on bread with a spoon can do the same using a fork. It is obvious. We understand the concept of “spreading” jam, and can quickly get used to using a completely different tool. Also, while autonomous vehicles require actual data for each situation, human drivers can make rational decisions based on pre-learned concepts to respond to countless situations. These examples show one characteristic of human intelligence in stark contrast to robot algorithms, which cannot perform tasks with insufficient data.

Illustration: Hyung Taek Yoon

Mammals have been evolving continuously for more than 65 million years. The entire time humans have spent learning math, using languages, and playing games sums to a mere 10,000 years. In other words, humanity spent a tremendous amount of time developing abilities directly related to survival, such as walking, running, and using our hands. It may therefore not be surprising that computers can compute much faster than humans, as they were developed for this purpose in the first place. Likewise, it is natural that computers cannot easily acquire the ability to use hands and feet freely for various purposes as humans do; those skills were attained through more than ten million years of evolution.

This is why it is unreasonable to compare robot or AI performance from demonstrations to that of an animal or human’s abilities. It would be rash to believe that robot technologies involving walking and running like animals are complete, while watching videos of the Cheetah robot running across fields at MIT and leaping over obstacles. Numerous robot demonstrations still rely on algorithms set for specialized tasks in bounded situations. There is a tendency, in fact, for researchers to select demonstrations that seem difficult, as it can produce a strong impression. However, this level of difficulty is from the human perspective, which may be irrelevant to the actual algorithm performance.

Humans are easily influenced by instantaneous and reflective perception before any logical thoughts. And this cognitive bias is strengthened when the subject is very complicated and difficult to analyze logically—for example, a robot that uses machine learning. 

Robotic demonstrations still rely on algorithms set for specialized tasks in bounded situations.

So where does our human cognitive bias come from? I believe it comes from our psychological tendency to subconsciously anthropomorphize the subjects we see. Humans have evolved as social animals, probably developing the ability to understand and empathize with each other in the process. Our tendency to anthropomorphize subjects would have come from the same evolutionary process. People tend to use the expression “teaching robots” when they refer to programming algorithms; we are simply used to anthropomorphized expressions. As the 18th-century philosopher David Hume said, “There is a universal tendency among mankind to conceive all beings like themselves. We find human faces in the moon, armies in the clouds.”

Of course, we not only anthropomorphize subjects’ appearance but also their state of mind. For example, when Boston Dynamics released a video of its engineers kicking a robot, many viewers reacted by saying “this is cruel,” and that they “pity the robot.” A comment saying, “one day, robots will take revenge on that engineer” received likes. In reality, the engineer was simply testing the robot’s balancing algorithm. However, before any thought process to comprehend this situation, the aggressive motion of kicking combined with the struggling of the animal-like robot is instantaneously transmitted to our brains, leaving a strong impression. Like this, such instantaneous anthropomorphism has a deep effect on our cognitive process. 

Humans process information qualitatively, and computers, quantitatively

Looking around, our daily lives are filled with algorithms, as can be seen in the machines and services that run on them. All algorithms operate on numbers. We use terms such as “objective function,” a numerical function that represents a certain objective. Many algorithms have the sole purpose of reaching the maximum or minimum value of this function, and an algorithm’s characteristics differ based on how it achieves this.

The goals of tasks such as winning a game of Go or chess are relatively easy to quantify. The easier quantification is, the better the algorithms work. In contrast, humans often make decisions without quantitative thinking.
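
As a minimal sketch of what "reaching the minimum value of an objective function" means in code, the toy quadratic objective and gradient-descent loop below are arbitrary examples, not tied to any system discussed in this essay.

```python
def objective(x):
    """Toy objective: squared distance from the target value 3."""
    return (x - 3.0) ** 2

def gradient(x):
    """Derivative of the toy objective with respect to x."""
    return 2.0 * (x - 3.0)

# Gradient descent: repeatedly step against the gradient.
# The step size and iteration count are arbitrary illustrative choices.
x = 0.0
for _ in range(100):
    x -= 0.1 * gradient(x)

print(x, objective(x))  # x approaches 3, the minimizer of the objective
```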

As an example, consider cleaning a room. The way we clean differs subtly from day to day, depending on the situation, depending on whose room it is, and depending on how one feels. Were we trying to maximize a certain function in this process? We did no such thing. The act of cleaning has been done with an abstract objective of “clean enough.” Besides, the standard for how much is “enough” changes easily. This standard may be different among people, causing conflicts particularly among family members or roommates. 

There are many other examples. When you wash your face every day, which quantitative indicators do you intend to maximize with your hand movements? How hard do you rub? When choosing what to wear? When choosing what to have for dinner? When choosing which dish to wash first? The list goes on. We are used to making decisions that are good enough by putting together the information we already have. However, we rarely check whether every single decision is optimized. Most of the time, it is impossible to know, because we would have to satisfy numerous contradicting indicators with limited data. When selecting groceries with a friend at the store, we cannot each quantify our standards for groceries and make a decision based on those numerical values. Usually, when one picks something out, the other will either say “OK!” or suggest another option. This is very different from saying this vegetable “is the optimal choice!” It is more like saying “this is good enough.”

This operational difference between people and algorithms may cause trouble when designing work or services we expect robots to perform. While algorithms perform tasks based on quantitative values, human satisfaction, the outcome of the task, is difficult to quantify completely. It is not easy to quantify the goal of a task that must adapt to individual preferences or changing circumstances, like the aforementioned room cleaning or dishwashing tasks. That is, to coexist with humans, robots may have to evolve not to optimize particular functions, but to achieve “good enough.” Of course, the latter is much more difficult to achieve robustly in real-life situations where you need to manage so many conflicting objectives and qualitative constraints.

Actually, we do not know what we are doing

Try to recall the most recent meal you had before reading this. Can you remember what you had? Then, can you also remember the process of chewing and swallowing the food? Do you know exactly what your tongue was doing at that very moment? Our tongue does so many things for us. It helps us put food in our mouths, distribute the food between our teeth, swallow the finely chewed pieces, or even send large pieces back toward our teeth, if needed. We do all of this naturally, even while talking to a friend, with the same tongue also in charge of pronunciation. How much do our conscious decisions contribute to the movement of our tongues, which accomplish so many complex tasks simultaneously? It may seem like we are moving our tongues as we want, but in fact, there are more moments when the tongue is moving automatically, taking high-level commands from our consciousness. This is why we cannot remember the detailed movements of our tongues during a meal. We know little about their movement in the first place.

We may assume that our hands are the most consciously controllable organ, but many hand movements also happen automatically and unconsciously, or subconsciously at most. For those who disagree, try putting something like your keys in your pocket and taking them back out. In that short moment, countless micromanipulations were instantly and seamlessly coordinated to complete the task. We often cannot perceive each action separately. We do not even know what units we should divide them into, so we collectively express them with abstract words such as organize, wash, apply, rub, and wipe. These verbs are qualitatively defined. They often refer to aggregates of fine movements and manipulations whose composition changes depending on the situation. Of course, it is easy even for children to understand and use these concepts, but from the perspective of algorithm development, these words are endlessly vague and abstract.

Illustration: Hyung Taek Yoon

Let’s try to teach someone how to make a sandwich by spreading peanut butter on bread. We can show how this is done and explain it with a few simple words. Now assume a slightly different situation. Say there is an alien who uses the same language as us, but knows nothing about human civilization or culture. (I know this assumption is already contradictory..., but please bear with me.) Can we explain over the phone how to make a peanut butter sandwich? We will probably get stuck trying to explain how to scoop peanut butter out of the jar. Even grasping the slice of bread is not so simple. We have to grasp the bread firmly enough to spread the peanut butter, but not so hard as to ruin the shape of the soft bread. At the same time, we should not drop the bread either. It is easy for us to think of how to grasp the bread, but it will not be easy to express this through speech or text, let alone in a function. Even when it is a human who is learning, can we learn a carpenter’s work over the phone? Can we precisely correct tennis or golf postures over the phone? It is difficult to discern to what extent the details we see are done consciously or unconsciously.

My point is that not everything we do with our hands and feet can be expressed directly in our language. Things that happen between successive actions often occur automatically and unconsciously, and thus we explain our actions in a much simpler way than how they actually take place. This is why our actions seem very simple, and why we forget how incredible they really are. The limitations of expression often lead to underestimation of actual complexity. We should recognize that the difficulty of describing actions in language can hinder research progress in fields where the vocabulary is not well developed.

Until recently, AI has been practically applied in information services related to data processing. Some prominent examples today include voice recognition and facial recognition. Now, we are entering a new era of AI that can effectively perform physical services in our midst. That is, the time is coming in which automation of complex physical tasks becomes imperative.

Particularly, our increasingly aging society poses a huge challenge. Shortage of labor is no longer a vague social problem. It is urgent that we discuss how to develop technologies that augment humans’ capability, allowing us to focus on more valuable work and pursue lives uniquely human. This is why not only engineers but also members of society from various fields should improve their understanding of AI and unconscious cognitive biases. It is easy to misunderstand artificial intelligence, as noted above, because it is substantively unlike human intelligence.

Things that are very natural among humans may become cognitive biases when applied to AI and robots. Without a clear understanding of our cognitive biases, we cannot set appropriate directions for technology research, application, and policy. For productive development as a scientific community, we need keen attention to our own cognition and deliberate debate as we promote appropriate development and applications of technology.

Sangbae Kim leads the Biomimetic Robotics Laboratory at MIT. The preceding is an adaptation of a blog post Kim published in June for Naver Labs.

We demonstrate how a reinforcement learning agent can use compositional recurrent neural networks to learn to carry out commands specified in linear temporal logic (LTL). Our approach takes as input an LTL formula, structures a deep network according to the parse of the formula, and determines satisfying actions. This compositional structure of the network enables zero-shot generalization to significantly more complex unseen formulas. We demonstrate this ability in multiple problem domains with both discrete and continuous state-action spaces. In a symbolic domain, the agent finds a sequence of letters that satisfy a specification. In a Minecraft-like environment, the agent finds a sequence of actions that conform to a formula. In the Fetch environment, the robot finds a sequence of arm configurations that move blocks on a table to fulfill the commands. While most prior work can learn to execute one formula reliably, we develop a novel form of multi-task learning for RL agents that allows them to learn from a diverse set of tasks and generalize to a new set of diverse tasks without any additional training. The compositional structures presented here are not specific to LTL, thus opening the path to RL agents that perform zero-shot generalization in other compositional domains.
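
The paper's architecture is not reproduced here, but the compositional idea (one small learned module per LTL operator, composed by following the formula's parse tree) can be sketched in a few lines of NumPy. The module sizes, the tuple-based parse representation, and the example formula are illustrative assumptions, not the authors' implementation; the resulting embedding is the kind of quantity a policy network could condition on.

```python
import numpy as np

rng = np.random.default_rng(0)
HID = 8  # illustrative hidden size

def make_module():
    """A tiny stand-in for a learned operator module: one linear layer + tanh."""
    W = rng.normal(scale=0.5, size=(HID, 2 * HID))
    b = np.zeros(HID)
    return lambda left, right: np.tanh(W @ np.concatenate([left, right]) + b)

# One module per LTL operator; atomic propositions get fixed embeddings.
operators = {op: make_module()
             for op in ("until", "and", "or", "eventually", "always", "next")}
atoms = {}

def embed(formula):
    """Recursively build the network output for a parsed LTL formula.

    A formula is either an atom (string) or a tuple (operator, arg[, arg]).
    Unary operators reuse the binary module with a zero right argument.
    """
    if isinstance(formula, str):
        return atoms.setdefault(formula, rng.normal(size=HID))
    op, *args = formula
    left = embed(args[0])
    right = embed(args[1]) if len(args) > 1 else np.zeros(HID)
    return operators[op](left, right)

# "reach the key, and eventually reach the door" as a toy parse tree:
phi = ("and", "key", ("eventually", "door"))
print(embed(phi).shape)  # (8,) -- a formula embedding a policy could condition on
```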

A significant challenge for the control of a robotic lower-extremity rehabilitation exoskeleton is to ensure stability and robustness during programmed tasks or motions, which is crucial for the safety of the mobility-impaired user. Due to the varying levels of a user’s disability, the human-exoskeleton interaction forces and external perturbations are unpredictable, could vary substantially, and may cause conventional motion controllers to behave unreliably or the robot to fall down. In this work, we propose a new reinforcement learning-based motion controller for a lower-extremity rehabilitation exoskeleton, aiming to perform collaborative squatting exercises with efficiency, stability, and strong robustness. Unlike most existing rehabilitation exoskeletons, our exoskeleton has ankle actuation in both the sagittal and frontal planes and is equipped with multiple foot force sensors to estimate the center of pressure (CoP), an important indicator of system balance. The proposed motion controller takes advantage of the CoP information by incorporating it into the state input of the control policy network and adding it to the reward during learning to maintain a well-balanced system state during motions. In addition, we use dynamics randomization and adversarial force perturbations, including large human interaction forces, during training to further improve control robustness. To evaluate the effectiveness of the learned controller, we conduct numerical experiments with different settings to demonstrate its ability to control the exoskeleton to repetitively perform well-balanced and robust squatting motions under strong perturbations and realistic human interaction forces.
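
The exact reward used in the paper is not given here; the sketch below only illustrates the idea of folding CoP information into the reward, combining a joint-trajectory tracking term with a penalty on the CoP's offset from the center of the foot support region. The weights, foot dimensions, and state values are made-up placeholders.

```python
import numpy as np

def squat_reward(joint_pos, joint_target, cop_xy, foot_center_xy,
                 foot_half_extent=np.array([0.10, 0.05]),
                 w_track=1.0, w_balance=0.5):
    """Illustrative reward: track the squat trajectory while keeping the
    center of pressure (CoP) near the middle of the foot support region.

    All weights and foot dimensions are made-up values for illustration.
    """
    # Trajectory tracking term: penalize squared joint-angle error.
    track = -np.sum((joint_pos - joint_target) ** 2)

    # Balance term: penalize CoP offset, normalized by the foot half-extents,
    # so the reward drops sharply as the CoP approaches the support boundary.
    offset = (cop_xy - foot_center_xy) / foot_half_extent
    balance = -np.sum(offset ** 2)

    return w_track * track + w_balance * balance

# Example call with placeholder state values.
r = squat_reward(joint_pos=np.array([0.4, 0.8, 0.4]),
                 joint_target=np.array([0.5, 0.9, 0.5]),
                 cop_xy=np.array([0.02, -0.01]),
                 foot_center_xy=np.zeros(2))
print(r)
```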

Space exploration and exploitation depend on the development of on-orbit robotic capabilities for tasks such as servicing of satellites, removal of orbital debris, and construction and maintenance of orbital assets. Manipulation and capture of objects on-orbit are key enablers for these capabilities. This survey addresses fundamental aspects of manipulation and capture, such as the dynamics of space manipulator systems (SMS), i.e., satellites equipped with manipulators, the contact dynamics between manipulator grippers/payloads and targets, and the methods for identifying the properties of SMSs and their targets. It also presents recent work on sensing pose and system states, on motion planning for capturing a target, and on feedback control methods for SMSs during motion or interaction tasks. Finally, the paper reviews major ground-testing testbeds for capture operations, as well as several notable missions and technologies developed for the capture of targets on-orbit.

Video Friday is your weekly selection of awesome robotics videos, collected by your Automaton bloggers. We’ll also be posting a weekly calendar of upcoming robotics events for the next few months; here’s what we have so far (send us your events!):

Humanoids 2020 – July 19-21, 2021 – [Online Event]
RO-MAN 2021 – August 8-12, 2021 – [Online Event]
DARPA SubT Finals – September 21-23, 2021 – Louisville, KY, USA
WeRobot 2021 – September 23-25, 2021 – Coral Gables, FL, USA
IROS 2021 – September 27 – October 1, 2021 – [Online Event]
ROSCon 2021 – October 21-23, 2021 – New Orleans, LA, USA

Let us know if you have suggestions for next week, and enjoy today's videos.

This 3D-printed soft robotic hand uses fluidic circuits (which respond differently to different input pressures) so that a single input source can actuate three fingers independently.

[ UMD ]

Thanks, Fan!

Nano quadcopters are ideal for gas source localization (GSL) as they are safe, agile and inexpensive. However, their extremely restricted sensors and computational resources make GSL a daunting challenge. In this work, we propose a novel bug algorithm named ‘Sniffy Bug’, which allows a fully autonomous swarm of gas-seeking nano quadcopters to localize a gas source in unknown, cluttered, and GPS-denied environments.

[ MAVLab ]
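The blurb above names Sniffy Bug but not its internals, so the following is only a generic, hedged sketch of bug-style gas seeking in a surge-and-cast spirit: keep heading while readings improve, turn otherwise. The toy concentration field, step sizes, and turning rule are invented and are not the MAVLab algorithm.

```python
import numpy as np

def gas(pos):
    """Toy concentration field peaking at a hidden source; purely illustrative."""
    source = np.array([3.0, 1.0])
    return np.exp(-np.linalg.norm(pos - source) ** 2)

def seek(start, steps=200, step=0.05):
    """Greedy surge-and-cast search: surge while readings improve, cast (turn)
    otherwise. A stand-in for, not a reproduction of, the Sniffy Bug logic."""
    pos = np.array(start, dtype=float)
    heading = 0.0
    best = gas(pos)
    for _ in range(steps):
        trial = pos + step * np.array([np.cos(heading), np.sin(heading)])
        reading = gas(trial)
        if reading >= best:          # improving: keep surging forward
            pos, best = trial, reading
        else:                        # worse: cast to a new heading
            heading += np.pi / 4
    return pos

print(seek(start=(0.0, 0.0)))        # ends near the toy source at (3, 1)
```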

Large-scale aerial deployment of miniature sensors in tough environmental conditions requires a deployment device that is lightweight, robust and steerable. We present a novel samara-inspired craft that is capable of autorotating and diving.

[ Paper ]

Scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have recently created a new algorithm to help a robot find efficient motion plans to ensure physical safety of its human counterpart. In this case, the bot helped put a jacket on a human, which could potentially prove to be a powerful tool in expanding assistance for those with disabilities or limited mobility.

[ MIT CSAIL ]

Listening to the language here about SoftBank's Whiz cleaning robot, I’ve got some concerns.

My worry is that the value that the robot is adding here is mostly in the perception of cleaning, rather than actually, you know, cleaning. Which is still value, and that’s fine, but whether it’s commercially viable in the long term is less certain.

[ SoftBank ]

This paper presents a novel method for multi-legged robots to probe and test the terrain for collapses using their legs while walking. The proposed method improves on existing terrain probing approaches and integrates the probing action into the walking cycle. A follow-the-leader strategy with a suitable gait and stance is presented and implemented on a hexapod robot.

[ CSIRO ]

Robotics researchers from NVIDIA and the University of Southern California presented DiSECt, the first differentiable simulator for robotic cutting, at the 2021 Robotics: Science and Systems (RSS) conference. The simulator accurately predicts the forces acting on a knife as it presses and slices through natural soft materials, such as fruits and vegetables.

[ NVIDIA ]
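Differentiable simulation, in miniature: the toy spring-damper "cutting force" below is a stand-in (not DiSECt's model) showing how automatic differentiation yields the gradient of a simulated knife force with respect to a material parameter, which is what lets such simulators fit their parameters to real force measurements.

```python
import torch

# Toy differentiable "cutting" force: a spring-damper resisting knife penetration.
# Stiffness is the parameter we differentiate with respect to; it stands in for
# the material parameters a differentiable cutting simulator would identify.
stiffness = torch.tensor(200.0, requires_grad=True)
depth, speed, damping = torch.tensor(0.01), torch.tensor(0.05), 5.0

force = stiffness * depth + damping * speed      # simulated knife force
force.backward()                                 # d(force)/d(stiffness) via autodiff
print(float(force), float(stiffness.grad))       # gradient equals the depth, 0.01
```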

These videos from Moley Robotics have too many cuts in them to properly judge how skilled the robot is, but as far as I know, it only cooks the "perfect" steak in the sense that it will cook a steak of a given weight for a given time.

[ Moley ]

Most robotic hands are designed to be general purpose, because making task-specific hands is very tedious. Existing methods wrestle with trade-offs among the design complexity needed for contact-rich tasks, the practical constraints of manufacturing, and contact handling.

This led researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) to create a new method to computationally optimize the shape and control of a robotic manipulator for a specific task. Their system uses software to manipulate the design, simulate the robot doing a task, and then provide an optimization score to assess the design and control.

[ MIT CSAIL ]
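The CSAIL system above follows a design-simulate-score loop. As a generic, hedged sketch of that pattern only (their optimizer and simulator are far more sophisticated), a random search over one shape parameter and one control gain might look like this; the parameter names and toy scoring function are invented.

```python
import random

def simulate(design, controller):
    """Stand-in for a physics simulation of the task; returns a task score.
    Here: a toy function rewarding designs near an arbitrary 'good' setting."""
    finger_len, ctrl_gain = design["finger_len"], controller["gain"]
    return -((finger_len - 0.07) ** 2 + (ctrl_gain - 2.0) ** 2)

def optimize(iters=500):
    """Random search over shape and control parameters, keeping the best score."""
    best, best_score = None, float("-inf")
    for _ in range(iters):
        design = {"finger_len": random.uniform(0.03, 0.12)}
        controller = {"gain": random.uniform(0.5, 5.0)}
        score = simulate(design, controller)
        if score > best_score:
            best, best_score = (design, controller), score
    return best, best_score

print(optimize())
```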

Drone Adventures maps wildlife in Namibia from above.

[ Drone Adventures ]

Some impressive electronics disassembly tasks using a planner that just unscrews things, shakes them, and sees whether it then needs to unscrew more things.

[ Imagine ]

The reconfigurable robot ReQuBiS can transition between biped, quadruped, and snake configurations without re-arranging its modules, unlike most state-of-the-art designs. Its design also allows the robot to split into two agents that perform tasks in parallel in its biped and snake configurations.

[ Paper ] via [ IvLabs ]

Thanks, Fan!

World Vision Kenya aims to improve the climate resilience of nine villages in Tana River County, sustainably manage the ecosystem in a changing climate, and restore the communities’ livelihoods by reseeding hotspot areas with indigenous trees, covering at least 250 acres in every village. This can be challenging to achieve, considering the vast areas needing coverage. That’s why World Vision Kenya partnered with Kenya Flying Labs to help make this process faster, easier, and more efficient (and more fun!).

[ WeRobotics ]

Pieter Abbeel’s Robot Brains Podcast has started posting video versions of the episodes, if you’re into that sort of thing. There are interesting excerpts as well, a few of which we can share here.

[ Robot Brains ]

RSS took place this week with paper presentations, talks, Q&As, and more, but here are two of the keynotes that are definitely worth watching.

[ RSS 2021 ]

The latest update to Tesla’s self-driving technology ups the company’s stake in a bold bet that it can deliver autonomous vehicles using cameras alone. But despite improving capabilities in vision-based self-driving, experts say it faces fundamental hurdles.

Last Saturday, Tesla rolled out the much-delayed version 9 of its “Full Self-Driving” (FSD) software, which gives Tesla vehicles limited ability to navigate autonomously. The package, which is already on sale as a $10,000 add-on, has been in beta testing with a select group of drivers since last October. But the latest update marks a significant shift by ditching input from radar sensors and relying solely on the car’s cameras.

This follows the announcement in May that Tesla will be removing radar altogether from its Model 3 and Model Y cars built in the US and suggests the company is doubling down on a strategy at odds with most other self-driving projects. Autonomous vehicles built by Alphabet subsidiary Waymo and GM-owned Cruise fuse input from cameras, radar and ultra-precise lidar and only ply streets pre-mapped using high-resolution 3D laser scans.

Tesla CEO Elon Musk has been vocal in his criticism of lidar due to its high cost and has instead advocated for a “pure vision” approach. That’s controversial due to the lack of redundancy that comes with relying on a single sensor. But the rationale is clear, says Kilian Weinberger, an associate professor at Cornell University who works on computer vision for autonomous vehicles.

“Cameras are dirt cheap compared to lidar,” he says. “By doing this they can put this technology into all the cars they’re selling. If they sell 500,000 cars all of these cars are driving around collecting data for them.”

Data is the lifeblood of the machine learning systems at the heart of self-driving technology. Tesla’s big bet, says Weinberger, is that the mountain of video its fleet amasses will help it reach full autonomy faster than competitors can with the smaller amounts of lidar data they collect from small fleets of more sensor-laden cars driven by employees.

Speaking at the Conference on Computer Vision and Pattern Recognition last month, Tesla’s AI chief Andrej Karpathy revealed the company had built a supercomputer, which he claimed was the fifth most powerful in the world, to process all this data. He also explained the decision to drop radar, saying that after training on more than 1.5 petabytes of video augmented with both radar data and human labeling, the vision-only system now significantly outperforms the previous approach.

The justification for dropping radar does make sense, says Weinberger, and he adds that the gap between lidar and cameras has narrowed in recent years. Lidar’s big selling point is incredibly accurate depth sensing achieved by bouncing lasers off objects—but vision-based systems can also estimate depth, and their capabilities have improved significantly.

Weinberger and colleagues made a breakthrough in 2019 by converting camera-based depth estimations into the same kind of 3D point clouds used by lidar, significantly improving accuracy. At the Scaled Machine Learning Conference last year, Karpathy revealed that the company was using such a “pseudo-lidar” technique.
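For readers unfamiliar with the pseudo-lidar idea, a minimal sketch: back-project a per-pixel depth estimate into a 3D point cloud using the pinhole camera model, so that lidar-style 3D detectors can consume camera data. The intrinsics and the flat toy depth map below are made up; this is the standard back-projection, not Tesla's or the researchers' code.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (meters, HxW) into an Nx3 point cloud in the
    camera frame: x = (u - cx) * z / fx, y = (v - cy) * z / fy, z = depth."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]          # drop invalid (zero-depth) pixels

# Toy usage with an invented 4x4 depth map and arbitrary intrinsics.
depth = np.full((4, 4), 10.0)
cloud = depth_to_point_cloud(depth, fx=500.0, fy=500.0, cx=2.0, cy=2.0)
print(cloud.shape)                            # (16, 3)
```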

How you estimate depth is important, though. One approach compares images from two cameras spaced sufficiently far apart to triangulate the distance to objects. The other is to train AI on huge numbers of images until it learns to pick up depth cues. Weinberger says this is probably the approach Tesla uses, because its front-facing cameras are too close together for the first technique.

The benefit of triangulation-based techniques is that measurements are based in physics, much like lidar, says Leaf Jiang, CEO of start-up NODAR, which develops camera-based 3D vision technology based on this approach. Inferring distance is inherently more vulnerable to mistakes in ambiguous situations, he says, for instance, distinguishing an adult at 50 meters from a child at 25 meters. “It tries to figure out distance based on perspective cues or shading cues, or whatnot, and that’s not always reliable,” he says.
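For a rectified stereo pair, the triangulation Jiang describes reduces to the textbook relation depth = focal length × baseline / disparity, which also shows why a narrow baseline makes distant depth estimates sensitive to small disparity errors. The focal length, baselines, and half-pixel error below are illustrative assumptions.

```python
# Rectified stereo: depth z = f * B / d, with focal length f (pixels),
# baseline B (meters), and disparity d (pixels). Numbers are illustrative.
def stereo_depth(f_px, baseline_m, disparity_px):
    return f_px * baseline_m / disparity_px

f, wide_b, narrow_b = 1000.0, 0.50, 0.10       # assumed focal length and baselines
for b in (wide_b, narrow_b):
    d_true = f * b / 50.0                      # disparity of an object 50 m away
    z_err = abs(stereo_depth(f, b, d_true - 0.5) - 50.0)   # 0.5 px disparity error
    print(f"baseline {b:.2f} m: 0.5 px error -> ~{z_err:.1f} m depth error at 50 m")
```

With the wide baseline the half-pixel error costs a few meters at 50 m; with the narrow baseline it costs well over ten, which is the sensitivity problem a closely spaced camera pair faces.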

How you sense depth is only part of the problem, though. State-of-the-art machine learning simply recognizes patterns, which means it struggles with novel situations. Unlike a human driver, if the system hasn’t encountered a scenario before, it has no ability to reason about what to do. “Any AI system has no understanding of what's actually going on,” says Weinberger.

The logic behind collecting ever more data is that you will capture more of the rare scenarios that could flummox your AI, but there’s a fundamental limit to this approach. “Eventually you have unique cases. And unique cases you can’t train for,” says Weinberger. “The benefits of adding more and more data are diminishing at some point.”

This is the so-called “long tail problem,” says Marc Pollefeys, a professor at ETH Zurich who has worked on camera-based self-driving, and it presents a major hurdle for going from the kind of driver assistance systems already common in modern cars to truly autonomous vehicles. The underlying technology is similar, he says. But while an automatic braking system designed to augment a driver’s reactions can afford to miss the occasional pedestrian, the margin for error when in complete control of the car is fractions of a percent.

Other self-driving companies try to get around this by reducing the scope for uncertainty. If you pre-map roads, you only need to focus on the small amount of input that doesn’t match, says Pollefeys. Similarly, the chance of three different sensors making the same mistake simultaneously is vanishingly small.

The scalability of such an approach is certainly questionable. But trying to go from a system that mostly works to one that almost never makes mistakes by simply pushing ever more data through a machine learning pipeline is “doomed to fail,” says Pollefeys.

“When we see that something works 99 percent of the time, we think it can’t be too hard to make it work 100 percent,” he says. “And that’s actually not the case. Making 10 times fewer mistakes is a gigantic effort.”

Videos posted by Tesla owners after the FSD update, showing their vehicles lurching out into highways or being blind to concrete pillars in the middle of the road, demonstrate the gulf that still needs to be bridged and suggest Musk’s prediction of full autonomy by the end of the year may have been overly optimistic.

But Pollefeys thinks it’s unlikely Tesla will abandon the narrative that full autonomy is close at hand. “A lot of people already paid for it [Tesla’s FSD package], so they have to keep the hope alive,” he says. “They’re stuck in that story.”

Tesla didn’t respond to an interview request.

Today’s electric motor design requires multiphysics analysis across a wide torque and speed operating range to accommodate rapid development cycles and system integration. Ansys Motor-CAD is accelerating this work-in-progress. Try Ansys Motor-CAD for free for 30-days and let us show you how we can help lower product development costs and reduce time to market today!

Reinforcement learning simulation environments pose an important experimental test bed and facilitate data collection for developing AI-based robot applications. Most of them, however, focus on single-agent tasks, which limits their application to the development of social agents. This study proposes the Chef’s Hat simulation environment, which implements a multi-agent competitive card game that is a complete reproduction of the homonymous board game, designed to provoke competitive strategies in humans and emotional responses. The game was shown to be ideal for developing personalized reinforcement learning, in an online learning closed-loop scenario, as its state representation is extremely dynamic and directly related to each of the opponent’s actions. To adapt current reinforcement learning agents to this scenario, we also developed the COmPetitive Prioritized Experience Replay (COPPER) algorithm. With the help of COPPER and the Chef’s Hat simulation environment, we evaluated the following: (1) 12 experimental learning agents, trained via four different regimens (self-play, play against a naive baseline, PER, or COPPER) with three algorithms based on different state-of-the-art learning paradigms (PPO, DQN, and ACER), and two “dummy” baseline agents that take random actions, (2) the performance difference between COPPER and PER agents trained using the PPO algorithm and playing against different agents (PPO, DQN, and ACER) or all DQN agents, and (3) human performance when playing against two different collections of agents. Our experiments demonstrate that COPPER helps agents learn to adapt to different types of opponents, improving the performance when compared to off-line learning models. An additional contribution of the study is the formalization of the Chef’s Hat competitive game and the implementation of the Chef’s Hat Player Club, a collection of trained and assessed agents as an enabler for embedding human competitive strategies in social continual and competitive reinforcement learning.
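COPPER's competitive modifications are not spelled out above, so the sketch below shows only the standard proportional prioritized experience replay it builds on: transitions are sampled with probability proportional to a TD-error-based priority. The capacity, alpha, and epsilon values are conventional PER-style defaults, not the paper's settings.

```python
import random

class PrioritizedReplay:
    """Minimal proportional prioritized replay (a sketch of the PER baseline)."""
    def __init__(self, capacity=10000, alpha=0.6, eps=1e-3):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.data, self.priorities = [], []

    def add(self, transition, td_error):
        if len(self.data) >= self.capacity:       # drop the oldest transition
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append((abs(td_error) + self.eps) ** self.alpha)

    def sample(self, batch_size):
        total = sum(self.priorities)
        weights = [p / total for p in self.priorities]
        return random.choices(self.data, weights=weights, k=batch_size)

buf = PrioritizedReplay()
for i in range(100):
    buf.add(transition=(f"s{i}", "a", 0.0, f"s{i+1}"), td_error=i * 0.01)
print(len(buf.sample(8)))     # high-TD-error transitions are sampled more often
```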

In this work, we present several heuristic-based and data-driven active vision strategies for viewpoint optimization of an arm-mounted depth camera to aid robotic grasping. These strategies aim to collect data efficiently to boost the performance of an underlying grasp synthesis algorithm. We created an open-source benchmarking platform in simulation (https://github.com/galenbr/2021ActiveVision), and provide an extensive study for assessing the performance of the proposed methods as well as comparing them against various baseline strategies. We also provide an experimental study with a real-world two-finger parallel-jaw gripper setup by utilizing an existing grasp planning benchmark in the literature. With these analyses, we quantitatively demonstrate the versatility of heuristic methods that prioritize certain types of exploration, and qualitatively show their robustness both to novel objects and to the transition from simulation to the real world. We identified scenarios in which our methods did not perform well, as well as objectively difficult scenarios, and present a discussion of which avenues for future research show promise.
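As a hedged illustration of the kind of loop such viewpoint-optimization heuristics plug into (not the benchmarked strategies themselves), a next-best-view sketch might select, at each step, the candidate camera pose a heuristic scores highest, until a grasp-confidence stand-in clears a threshold; every function below is an invented placeholder.

```python
import math

def candidate_viewpoints(n=8, radius=0.4, height=0.3):
    """Viewpoints on a ring above the object; purely an illustrative sampling."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n), height) for k in range(n)]

def information_gain(view, visited):
    """Toy heuristic: prefer views far from those already visited."""
    if not visited:
        return 1.0
    return min(math.dist(view, v) for v in visited)

def grasp_confidence(n_views):
    """Stand-in for a grasp-synthesis score that improves as more data is fused."""
    return 1.0 - 0.8 ** n_views

def next_best_view_loop(threshold=0.9):
    visited = []
    while grasp_confidence(len(visited)) < threshold:
        view = max(candidate_viewpoints(),
                   key=lambda v: information_gain(v, visited))
        visited.append(view)     # move the camera there and fuse the new depth data
    return visited

print(len(next_best_view_loop()), "viewpoints needed")
```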

Socially assistive robots are being designed to support people’s well-being in contexts such as art therapy, where human therapists are scarce, by making art together with people in an appropriate way. A challenge is that various complex and idiosyncratic concepts relating to art, such as emotions and creativity, are not yet well understood. Guided by the principles of speculative design, the current article describes the use of a collaborative prototyping approach involving artists and engineers to explore this design space, especially with regard to general and personalized art-making strategies. This led to identifying a goal: to generate representational or abstract art that connects emotionally with people’s art and shows creativity. For this, an approach involving personalized “visual metaphors” was proposed, which balances the degree to which a robot’s art is influenced by interacting persons. The results of a small survey-based user study provided further insight into people’s perceptions: the general design was perceived as intended and appealed to participants, and personalization via representational symbols appeared to lead to easier and clearer communication of emotions than abstract symbols did. In closing, the article describes a simplified demo and discusses future challenges. The contribution of the current work thus lies in suggesting how a robot can interact with people in an emotional and creative way through personalized art; the aim is to stimulate ideation in this promising area and facilitate acceptance of such robots in everyday human environments.

This paper conceptualizes the problem of emergency evacuation as a paradigm for investigating human-robot interaction. We argue that emergency evacuation offers unique and important perspectives on human-robot interaction while also demanding close attention to the ethical ramifications of the technologies developed. We present a series of approaches for developing emergency evacuation robots and detail several essential design considerations. This paper concludes with a discussion of the ethical implications of emergency evacuation robots and a roadmap for their development, implementation, and evaluation.

A bounded-cost path planning method is developed for underwater vehicles, assisted by a data-driven flow modeling method. The modeled flow field is partitioned into a set of cells of piecewise-constant flow speed. A flow partition algorithm and a parameter estimation algorithm are proposed to learn the flow field structure and parameters, with justified convergence. A bounded-cost path planning algorithm is developed that takes advantage of the partitioned flow model. An extended potential search method is proposed to determine the sequence of partitions that the optimal path crosses. The optimal path within each partition is then determined by solving a constrained optimization problem. Theoretical justification is provided that the proposed extended potential search method generates the optimal solution. The planned path has the highest probability of satisfying the bounded cost constraint. The performance of the algorithms is demonstrated with experimental and simulation results, which show that the proposed method is more computationally efficient than some existing methods.
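To make the piecewise-constant flow model concrete, a small sketch of how a path's cost can be evaluated cell by cell: within each cell the flow is constant, so a segment's travel time follows from the vehicle's still-water speed and the flow components along and across the segment. The cell layout, speeds, and straight-line path below are invented; this is only the cost evaluation, not the paper's extended potential search.

```python
import numpy as np

def segment_time(p0, p1, flow, v_still):
    """Travel time along segment p0->p1 inside a cell with constant flow,
    assuming the vehicle steers so its net velocity points along the segment
    and its still-water speed v_still exceeds the cross-flow component."""
    d = np.subtract(p1, p0).astype(float)
    dist = np.linalg.norm(d)
    u = d / dist
    along = float(np.dot(flow, u))                    # flow component along the path
    cross2 = float(np.dot(flow, flow)) - along ** 2   # squared cross-flow component
    ground_speed = along + np.sqrt(max(v_still ** 2 - cross2, 1e-9))
    return dist / ground_speed

def path_time(waypoints, cell_flows, v_still=1.0):
    """Sum segment times; the k-th segment is assumed to lie in cell k."""
    return sum(segment_time(waypoints[k], waypoints[k + 1], cell_flows[k], v_still)
               for k in range(len(waypoints) - 1))

# Toy example: a straight path crossing two cells, with a favorable then an
# adverse 0.3 m/s current along +x; compare the total against a cost budget.
waypoints = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
flows = [np.array([0.3, 0.0]), np.array([-0.3, 0.0])]
print(path_time(waypoints, flows))
```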
