
Robots That Feel: AI Combines Vision And Touch To Manipulate Objects With Human Precision

  • Writer: Lidi Garcia
  • Sep 12
  • 5 min read

This work presents a publicly available robotic system that combines vision and touch for more precise manipulation. The researchers developed a system called TactileAloha, which uses artificial intelligence to fuse tactile and visual information and learn to perform complex tasks. Their experiments showed that sensing the texture and fine details of objects allows the robot to perform tasks that would be virtually impossible with vision alone.


In recent years, robotics has advanced significantly, especially in systems that use cameras to "see" the environment. This artificial vision is very useful because it provides information about the space and objects around the robot.


However, vision alone is not enough for certain tasks that require precise touch.


Imagine, for example, having to distinguish which side of a zip tie is smooth and which is serrated, tell the two sides of a Velcro strip apart, or differentiate fabrics by texture. In these situations, humans rely on touch, and it is precisely this sense of touch that is now being brought to robots.


Touch is essential to us because it helps us recognize what we're holding and control the force of our interaction with the environment. Without it, it would be difficult to pick up delicate objects without breaking them or fit pieces together correctly.



The idea behind this study was to develop a robotic system that doesn't rely solely on vision, but is also capable of sensing textures, details, and even the orientation of an object.


The system developed was named TactileAloha. It works by attaching a special sensor called GelSight to the robot's gripper. This sensor is capable of capturing detailed information about the surface of objects, such as small irregularities, graininess, or the precise position of specific parts.


This way, the robot not only sees the object through the camera but also feels its texture, much like we do when we run our fingers over something.
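To make that idea concrete, here is a toy sketch in Python (not taken from the paper) of how a tactile image can reveal texture: a serrated surface pressed into the sensor's gel produces strong local intensity gradients, while a smooth surface produces weak ones. The function name and threshold logic are purely illustrative.

```python
import numpy as np

def texture_score(tactile_img):
    """Toy illustration (not the paper's method): estimate how 'textured'
    a GelSight-style tactile frame is from its local intensity gradients.

    tactile_img: 2D numpy array (grayscale tactile frame). A serrated or
    ribbed surface pressed into the gel produces strong, regular gradients;
    a smooth surface produces weak ones.
    """
    gy, gx = np.gradient(tactile_img.astype(float))
    return float(np.sqrt(gx**2 + gy**2).mean())

# Hypothetical usage: compare two tactile frames to guess which side of a
# zip tie (toothed vs. smooth) the gripper is currently touching.
# side = "toothed" if texture_score(frame_a) > texture_score(frame_b) else "smooth"
```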



Overview of the TactileAloha platform. In addition to the tactile sensor and camera updates, a monitoring window displays the camera and sensor observations in real time during data collection to ensure data accuracy. The work area is covered with a pad to protect the robot's grippers during manipulation, four C-clamps fasten the corners of the platform to reduce the impact of vibrations caused by the robot's movement, and a foot pedal facilitates data collection.


To teach the robot to use this information, the researchers combined three types of data:


- Camera images (the robot's view).


- Touch signals captured by the tactile sensor.


- The internal movements of the robot's joints (proprioception, the robot's sense of its own posture and motion).


These data streams were then combined by an artificial intelligence model. The model learned by observing human examples (a method called imitation learning), in which the robot copies successful strategies that were demonstrated to it.
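The paper's abstract (reproduced below) describes this fusion as encoding the tactile signals with a pre-trained ResNet, combining them with visual and proprioceptive features, and feeding the result to a transformer-based policy that predicts a chunk of future actions. The sketch below illustrates that kind of architecture in PyTorch; the class name, the ResNet-18 backbone, the layer sizes, the 32-step chunk, and the 14-dimensional joint state are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class MultimodalPolicy(nn.Module):
    """Sketch of a tactile-vision policy: encode each modality,
    fuse the features, and predict a chunk of future actions."""

    def __init__(self, proprio_dim=14, action_dim=14, chunk_size=32, d_model=256):
        super().__init__()
        # Pre-trained ResNet backbones as visual/tactile encoders (classification head removed).
        self.visual_encoder = nn.Sequential(*list(models.resnet18(weights="DEFAULT").children())[:-1])
        self.tactile_encoder = nn.Sequential(*list(models.resnet18(weights="DEFAULT").children())[:-1])
        # Project every modality into a shared token dimension.
        self.visual_proj = nn.Linear(512, d_model)
        self.tactile_proj = nn.Linear(512, d_model)
        self.proprio_proj = nn.Linear(proprio_dim, d_model)
        # Transformer over the fused tokens.
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=4)
        # One learned query token per future action step ("action chunking").
        self.action_queries = nn.Parameter(torch.randn(chunk_size, d_model))
        self.action_head = nn.Linear(d_model, action_dim)

    def forward(self, camera_img, tactile_img, proprio):
        # camera_img, tactile_img: (B, 3, H, W); proprio: (B, proprio_dim)
        vis = self.visual_proj(self.visual_encoder(camera_img).flatten(1))     # (B, d_model)
        tac = self.tactile_proj(self.tactile_encoder(tactile_img).flatten(1))  # (B, d_model)
        pro = self.proprio_proj(proprio)                                       # (B, d_model)
        queries = self.action_queries.expand(camera_img.shape[0], -1, -1)      # (B, chunk, d_model)
        tokens = torch.cat([vis[:, None], tac[:, None], pro[:, None], queries], dim=1)
        fused = self.transformer(tokens)
        # Read the predicted action chunk off the query positions.
        return self.action_head(fused[:, 3:])                                  # (B, chunk, action_dim)
```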


Furthermore, the researchers weighted the training so that the actions the robot will execute soonest, the near-future steps of each predicted sequence, count more heavily, since these are the most critical for completing the task successfully.
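A minimal sketch of what such a weighted training loss could look like; the exponential decay schedule and the use of an L1 error here are assumptions for illustration, not the paper's exact formula.

```python
import torch

def weighted_chunk_loss(pred_actions, target_actions, decay=0.01):
    """Illustrative weighted L1 loss over a predicted action chunk.

    pred_actions, target_actions: (batch, chunk_size, action_dim).
    Earlier steps in the chunk (the near-future actions the robot will
    execute first) get larger weights; later steps are down-weighted.
    The exponential schedule is an assumption, not the authors' exact choice.
    """
    chunk_size = pred_actions.shape[1]
    steps = torch.arange(chunk_size, device=pred_actions.device, dtype=pred_actions.dtype)
    weights = torch.exp(-decay * steps)        # weight 1.0 for the first step, decaying afterwards
    weights = weights / weights.sum()          # normalize so the loss scale stays comparable
    per_step = (pred_actions - target_actions).abs().mean(dim=(0, 2))  # (chunk_size,)
    return (weights * per_step).sum()
```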


Another important point was how the robot's movements are controlled over time. Instead of planning one action at a time, the system predicts a short sequence of future actions at once. To decide what to do at each moment, the robot then combines the overlapping predictions made at different times and chooses the most reliable movement, keeping the motion accurate and smooth.
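The sketch below shows one common way to perform this kind of temporal aggregation: keep the chunks predicted at recent timesteps, collect every prediction they contain for the current moment, and blend them with exponential weights. The paper describes an improved aggregation scheme whose details are not given in this summary, so the specific weighting used here (older predictions weighted more heavily, as in common action-chunking ensembles) is an assumption.

```python
import numpy as np

def aggregate_actions(prediction_buffer, t, chunk_size, m=0.1):
    """Blend the overlapping predictions for the action at timestep t.

    prediction_buffer: dict {t_pred: chunk}, where chunk is an array of
    shape (chunk_size, action_dim) predicted at timestep t_pred. Every
    chunk predicted within the last chunk_size steps contains a guess
    for the action at t; we average them with exponential weights.
    """
    candidates, weights = [], []
    for t_pred in sorted(prediction_buffer):       # oldest predictions first
        offset = t - t_pred                        # position of timestep t inside that chunk
        if 0 <= offset < chunk_size:
            candidates.append(prediction_buffer[t_pred][offset])
            weights.append(np.exp(-m * len(weights)))  # oldest prediction gets the largest weight
    weights = np.array(weights)
    weights /= weights.sum()
    return np.average(np.stack(candidates), axis=0, weights=weights)
```

In practice the buffer only needs to keep the last `chunk_size` chunks, since older ones no longer contain a prediction for the current timestep.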


To test the system, the scientists chose two very practical tasks: inserting one zip tie (cable tie) into another and fastening a strip of Velcro. In both, feeling the texture and correct orientation of the material was essential.



The robot performs the cable tie insertion task: insert one cable tie into the head of the other. Two cable ties are initially placed in the work area, held upright by two 3D-printed supports to facilitate gripping. The orientation of the near cable tie is fixed, while the orientation of the far cable tie is random; in addition, the head of the far cable tie cannot be observed because it is occluded by the robot's base. During manipulation, the robot's arms first grasp both cable ties (A.1, B.1). The tactile sensor then detects the orientation, and the arms adjust their posture so that the toothed side aligns with the tongue of the receiving cable tie (A.2, B.2). Finally, the left cable tie is inserted into the head of the right cable tie (A.3, B.3) with a precise movement. Depending on the initial orientation of the cable tie, different manipulation sequences are generated automatically based on tactile detection, as shown in A and B.


The results showed that TactileAloha handled these texture-dependent tasks that vision-only systems fail to address, and its success rate was on average about 11% higher (a relative improvement) than the previous state-of-the-art method that also uses tactile input.



In summary, this work presented a publicly available robotic system that combines vision and touch for more precise manipulation. It also presented an artificial intelligence method that combines tactile and visual information to learn to perform complex tasks. Finally, experiments demonstrated that sensing the texture and details of objects allows robots to perform tasks that would be virtually impossible with vision alone.


This demonstrates how the future of robotics is increasingly closer to replicating our own human capabilities, combining vision and touch so that machines can interact more naturally and effectively with the world around them.



READ MORE:


TactileAloha: Learning Bimanual Manipulation With Tactile Sensing

Ningquan Gu, Kazuhiro Kosuge, and Mitsuhiro Hayashibe

IEEE Robotics and Automation Letters, vol. 10, no. 8, pp. 8348-8355, Aug. 2025, 

doi: 10.1109/LRA.2025.3585396.


Abstract: 


Tactile texture is vital for robotic manipulation but challenging for camera vision-based observation. To address this, we propose TactileAloha, an integrated tactile-vision robotic system built upon Aloha, with a tactile sensor mounted on the gripper to capture fine-grained texture information and support real-time visualization during teleoperation, facilitating efficient data collection and manipulation. Using data collected from our integrated system, we encode tactile signals with a pre-trained ResNet and fuse them with visual and proprioceptive features. The combined observations are processed by a transformer-based policy with action chunking to predict future actions. We use a weighted loss function during training to emphasize near-future actions, and employ an improved temporal aggregation scheme at deployment to enhance action precision. Experimentally, we introduce two bimanual tasks: zip tie insertion and Velcro fastening, both requiring tactile sensing to perceive the object texture and align two object orientations by two hands. Our proposed method adaptively changes the generated manipulation sequence itself based on tactile sensing in a systematic manner. Results show that our system, leveraging tactile information, can handle texture-related tasks that camera vision-based methods fail to address. Moreover, our method achieves an average relative improvement of approximately 11.0% compared to state-of-the-art method with tactile input, demonstrating its performance.

 
 
 
