
New AI Mimics Human Learning and Could Revolutionize Robotics


Researchers in Japan have developed a brain-inspired neural network that allows a robotic arm to combine words and movements it learned separately and to carry out commands it was never explicitly trained on. In their experiments, the robot generalized to new instructions far better when its training included varied combinations of words and movements. The work has implications for robotics, artificial intelligence, and even neuroscience, helping us better understand how humans develop the ability to generalize knowledge and apply learning to new situations.


Humans have an impressive ability to apply previously acquired knowledge to solve new problems. A key ingredient of this ability is compositionality: the capacity to break a complex concept down into smaller, reusable parts and to recombine those parts in new ways.


For example, if someone learns the action “push a cart,” they can apply that knowledge to understand how “push a chair” works, even if they have never done it before.
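
As a purely illustrative sketch (not the authors' model), the Python snippet below shows the idea in its simplest form: if the meanings of verbs and nouns have been learned as separate, reusable parts, a command never seen as a whole, such as "push the chair", can still be interpreted by recombining them. The dictionaries and the interpret function are hypothetical stand-ins for learned knowledge.

```python
# Minimal illustration of compositionality: verbs and nouns learned separately
# can be recombined to interpret a command that was never seen as a whole.

# Hypothetical learned fragments: verbs map to motion primitives, nouns to objects.
learned_verbs = {"push": "apply horizontal force", "lift": "apply upward force"}
learned_nouns = {"cart": "object: cart", "chair": "object: chair"}

def interpret(command: str) -> str:
    """Compose a meaning from reusable verb and noun parts."""
    verb, _, noun = command.split()  # e.g. "push the chair"
    return f"{learned_verbs[verb]} to {learned_nouns[noun]}"

# "push the chair" was never learned as a whole, yet it can be interpreted.
print(interpret("push the chair"))
```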


This characteristic allows us to generalize what we learn and adapt it to new situations. For robots, the same ability remains a major challenge, and it raises a central question: how can robots develop this mental flexibility, learning to combine words and concepts dynamically as they interact with the world around them?


In other words, how can a robot learn to understand commands like “get the ball” and then apply that knowledge to understand “get the cube,” even if it has never been specifically trained to do so?

To explore this question, scientists at the Okinawa Institute of Science and Technology in Japan created a neural network model inspired by the human brain. This model was designed to integrate three types of information at the same time:


  1. Vision (what the robot sees),


  2. Proprioception (the robot's perception of the movement and position of its own arm),


  3. Language (the instructions it receives).
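
To make the idea of combining these three streams more concrete, here is a minimal sketch of a network that fuses vision, proprioception, and language inputs into a single latent state. It uses PyTorch, and the layer types, sizes, and names are illustrative assumptions, not the architecture described in the paper.

```python
# A simplified sketch (assumed, not the authors' architecture) of fusing
# vision, proprioception, and language into one shared latent state.
import torch
import torch.nn as nn

class MultimodalEncoder(nn.Module):
    def __init__(self, vision_dim=64, proprio_dim=7, lang_dim=32, latent_dim=128):
        super().__init__()
        self.vision_enc = nn.Linear(vision_dim, latent_dim)    # what the robot sees
        self.proprio_enc = nn.Linear(proprio_dim, latent_dim)  # arm joint configuration
        self.lang_enc = nn.Linear(lang_dim, latent_dim)        # instruction embedding
        self.fuse = nn.Linear(3 * latent_dim, latent_dim)      # shared latent state

    def forward(self, vision, proprio, language):
        h = torch.cat([self.vision_enc(vision),
                       self.proprio_enc(proprio),
                       self.lang_enc(language)], dim=-1)
        return torch.tanh(self.fuse(h))

# Example: one time step of fused sensorimotor-linguistic state.
enc = MultimodalEncoder()
state = enc(torch.randn(1, 64), torch.randn(1, 7), torch.randn(1, 32))
print(state.shape)  # torch.Size([1, 128])
```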


The model follows the free-energy principle, a theory of how the human brain makes predictions about its environment and adjusts its internal states and behavior to reduce the error in those predictions.
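
In very rough terms, this predictive idea can be illustrated with a toy loop: the agent keeps an internal estimate, predicts its sensory input, and updates the estimate to reduce the prediction error. The example below is a deliberate oversimplification of the free-energy principle; the quantities and the learning rate are assumptions made for the demo, not values from the paper.

```python
# Toy prediction-error loop: the agent's belief is nudged toward whatever
# value makes its sensory predictions less wrong (a crude stand-in for
# minimizing free energy).
import numpy as np

true_position = 2.0   # hidden state of the environment (assumed for the demo)
belief = 0.0          # the agent's internal estimate
learning_rate = 0.3   # assumed update step

for step in range(10):
    observation = true_position + np.random.normal(0, 0.05)  # noisy sensory input
    prediction = belief                                       # what the agent expects
    error = observation - prediction                          # prediction error
    belief += learning_rate * error                           # update to reduce error
    print(f"step {step}: belief = {belief:.3f}, error = {error:+.3f}")
```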


The goal was to create a system that could learn to relate words to physical actions and that could generalize this learning to new combinations of words and movements.

The researchers tested this model on a robotic arm in a series of simulation experiments. The results were revealing: the robot was much better at understanding new commands when its training included varied combinations of words and movements.


That is, if it learned to “push the ball” and “lift the cube”, it had an easier time understanding a new instruction such as “lift the ball”, even without having been directly trained to do so.


This happened because the robot was able to mentally organize the words and movements in a structured way, creating reusable patterns, just as humans do.
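
A schematic way to picture this kind of evaluation (not the paper's exact protocol) is to enumerate all verb-noun commands, train on only a subset of the combinations, and test on the ones that were held out: the more varied the training subset, the more of the compositional space the model has sampled. The verbs, nouns, and splits below are illustrative.

```python
# Schematic train/test split over verb-noun combinations to probe
# compositional generalization (illustrative, not the paper's setup).
from itertools import product

verbs = ["push", "lift", "get"]
nouns = ["ball", "cube", "cup"]
all_commands = [f"{v} the {n}" for v, n in product(verbs, nouns)]

# Narrow training sees few combinations; varied training sees many.
narrow_train = {"push the ball", "lift the cube"}
varied_train = {"push the ball", "push the cube", "lift the cube",
                "lift the cup", "get the ball", "get the cup"}

for name, train in [("narrow", narrow_train), ("varied", varied_train)]:
    held_out = [c for c in all_commands if c not in train]
    print(f"{name} training: {len(train)} seen, {len(held_out)} unseen combinations")
```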


To better understand which elements of the model were essential for this learning, the researchers performed ablation tests, which consist of removing certain parts of the system to see how this affects performance.


These tests showed that two factors were fundamental for the robot to learn correctly:


  • Visual attention: the robot needed to focus on the right objects to understand and execute the action correctly.


  • Working memory: the robot needed to hold temporary information in its "mind" to coordinate movements and achieve its goals accurately.
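
The logic of such an ablation study can be sketched schematically: run the same task with individual components switched off and compare performance. The sketch below is a toy stand-in; the run_task function and its success-rate numbers are placeholders invented for illustration, not results from the paper.

```python
# Toy ablation loop: disable one component at a time and compare success.
# The function and all numbers are hypothetical placeholders.

def run_task(use_visual_attention=True, use_working_memory=True):
    """Hypothetical task runner returning a success rate for a configuration."""
    success = 0.9                # assumed baseline with the full model
    if not use_visual_attention:
        success -= 0.5           # assumed drop: the wrong objects get attended
    if not use_working_memory:
        success -= 0.4           # assumed drop: goals are lost mid-sequence
    return max(success, 0.0)

configs = {
    "full model": {},
    "no visual attention": {"use_visual_attention": False},
    "no working memory": {"use_working_memory": False},
}
for name, kwargs in configs.items():
    print(f"{name}: success rate = {run_task(**kwargs):.2f}")
```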


These findings are important because they help us understand how the interaction between language and movement can lead to the development of compositionality, both in humans and in artificial systems.

In practical terms, this means that future robots could learn in a way that is more similar to humans, becoming more adaptable to new situations and to commands that were not directly programmed.


This research could have significant implications for robotics, artificial intelligence, and even neuroscience, helping us better understand how humans develop this unique ability to generalize knowledge and apply learning to new situations.



READ MORE:


Development of compositionality through interactive learning of language and action of robots

Prasanna Vijayaraghavan, Jeffrey Frederic Queisser, Sergio Verduzco Flores, and Jun Tani

Science Robotics, 22 Jan 2025, Vol. 10, Issue 98

DOI: 10.1126/scirobotics.adp0751


Abstract:


Humans excel at applying learned behavior to unlearned situations. A crucial component of this generalization behavior is our ability to compose/decompose a whole into reusable parts, an attribute known as compositionality. One of the fundamental questions in robotics concerns this characteristic: How can linguistic compositionality be developed concomitantly with sensorimotor skills through associative learning, particularly when individuals only learn partial linguistic compositions and their corresponding sensorimotor patterns? To address this question, we propose a brain-inspired neural network model that integrates vision, proprioception, and language into a framework of predictive coding and active inference on the basis of the free-energy principle. The effectiveness and capabilities of this model were assessed through various simulation experiments conducted with a robot arm. Our results show that generalization in learning to unlearned verb-noun compositions is significantly enhanced when training variations of task composition are increased. We attribute this to self-organized compositional structures in linguistic latent state space being influenced substantially by sensorimotor learning. Ablation studies show that visual attention and working memory are essential to accurately generate visuomotor sequences to achieve linguistically represented goals. These insights advance our understanding of mechanisms underlying development of compositionality through interactions of linguistic and sensorimotor experience.



