
To interact with the real world, AI will gain physical intelligence


Recent AI models are surprisingly humanlike in their ability to generate text, audio, and video when prompted. Until now, however, these algorithms have remained largely confined to the digital world rather than the physical, three-dimensional world we live in. In fact, whenever we try to apply these models to the real world, even the most sophisticated struggle to get it right. Just think, for example, of how challenging it has been to develop safe and reliable self-driving cars. For all their artificial intelligence, these models not only lack any knowledge of physics but also often hallucinate, which leads them to make inexplicable mistakes.

This is the year, however, when AI will finally make the leap from the digital world into the physical world we live in. Expanding AI beyond its digital boundaries requires reworking how machines think, fusing the digital intelligence of AI with the mechanical prowess of robotics. This is what I call “physical intelligence”: a new form of intelligent machine that can understand dynamic environments, cope with unpredictability, and make decisions in real time. Unlike the models used by standard AI, physical intelligence is rooted in physics, in an understanding of the fundamental principles of the real world, such as cause and effect.

Such features allow physical-intelligence models to interact with and adapt to different environments. In my research group at MIT, we develop models of physical intelligence that we call liquid networks. In one experiment, we trained two drones, one operated by a standard AI model and the other by a liquid network, to locate objects in a forest during the summer, using data captured by human pilots. While both drones performed equally well when asked to do exactly what they were trained to do, only the liquid-network drone succeeded when asked to locate objects in different circumstances, such as during winter or in an urban environment. This experiment showed us that, unlike traditional AI systems, which stop evolving after their initial training phase, liquid networks continue to learn and adapt from experience, just as humans do.
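To make the idea concrete, here is a minimal sketch of the mathematics inside a liquid network: each neuron follows a differential equation whose effective time constant is modulated by the current input, so the cell's behavior keeps changing with the data it sees. The sizes, weights, and the simple Euler solver below are illustrative assumptions for exposition, not our lab's actual code.

```python
import numpy as np

def ltc_step(x, u, params, dt=0.01):
    """One explicit-Euler step of a liquid time-constant (LTC) cell:
        dx/dt = -(1/tau + f(x, u)) * x + f(x, u) * A
    The learned gate f scales each neuron's decay rate with the input,
    which is what makes the dynamics 'liquid' rather than fixed."""
    W, U, b, tau, A = params
    f = np.tanh(W @ x + U @ u + b)        # input-dependent gate
    dx = -(1.0 / tau + f) * x + f * A     # state- and input-dependent dynamics
    return x + dt * dx

rng = np.random.default_rng(0)
n_state, n_in = 8, 3
params = (
    rng.normal(scale=0.1, size=(n_state, n_state)),  # recurrent weights W
    rng.normal(scale=0.1, size=(n_state, n_in)),     # input weights U
    np.zeros(n_state),                               # bias b
    np.ones(n_state),                                # base time constants tau
    rng.normal(size=n_state),                        # equilibrium targets A
)

x = np.zeros(n_state)
for t in range(100):                      # drive the cell with a toy signal
    u = np.sin(0.1 * t) * np.ones(n_in)
    x = ltc_step(x, u, params)
```

Because the time constants depend on the input rather than being frozen at training time, the cell's dynamics shift with what it observes, which is, loosely, the property that lets such models keep adapting after their initial training.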

Physical intelligence is also capable of interpreting and physically executing complex commands derived from text or images, bridging the gap between digital instruction and real-world execution. For example, in my lab we have developed a physically intelligent system that, in less than a minute, can iteratively design and then 3D-print small robots from prompts such as “robot that can walk forward” or “robot that can pick up objects.”

Other laboratories are also making significant advances. For example, the robotics startup Covariant, founded by UC Berkeley researcher Pieter Abbeel, is developing chatbots, similar to ChatGPT, that can control robotic arms when prompted. The company has already raised more than $222 million to develop and deploy sorting robots in warehouses around the world. A team at Carnegie Mellon University has also recently demonstrated that a robot with a single camera and imprecise actuation can perform dynamic, complex parkour moves, including jumping onto obstacles twice its height and across gaps twice its length, using a single neural network trained via reinforcement learning.
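As a rough illustration of the reinforcement-learning recipe behind results like that, the toy sketch below improves a single policy network by policy gradients (REINFORCE). The bandit-style task, dimensions, and learning rate are made-up assumptions for exposition and bear no relation to the CMU team's actual training setup.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_act = 4, 2
W = np.zeros((n_act, n_obs))              # a single linear "policy network"

def act(obs):
    """Sample an action from a softmax policy over W @ obs."""
    logits = W @ obs
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return rng.choice(n_act, p=p), p

for episode in range(500):
    obs = rng.normal(size=n_obs)
    action, p = act(obs)
    reward = 1.0 if action == (obs[0] > 0) else 0.0  # toy task: act on the sign of obs[0]
    grad = -np.outer(p, obs)              # gradient of log pi(action | obs) ...
    grad[action] += obs                   # ... equals (one_hot(action) - p) outer obs
    W += 0.1 * reward * grad              # REINFORCE: reinforce rewarded actions
```

Real legged-locomotion pipelines differ enormously in scale and simulation machinery, but the underlying loop, act, observe the reward, and nudge the policy toward rewarded behavior, is the same.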

If 2023 was the year of text-to-image and 2024 the year of text-to-video, then 2025 will mark the era of physical intelligence, with a new generation of devices, not only robots but everything from power grids to smart homes, that can interpret what we tell them and carry out tasks in the real world.


