The field of robotics, a classic application of artificial intelligence, has recently been energized by generative AI, programs such as OpenAI's large language models that can respond to natural-language statements. For example, Google's DeepMind unit this year unveiled RT-2, a large language model that can be presented with an image and a command, and then produce both a plan of action and the coordinates necessary to complete the command.

But there is a threshold that generative programs have not crossed: they can handle "high-level" tasks, such as planning a robot's route to a destination, but they cannot handle "low-level" tasks, such as manipulating a robot's joints for fine motor control.

New work from Nvidia published this month suggests language models may be closer to crossing that divide. A program called Eureka uses language models to set goals that in turn can be used to direct robots at a low level, including inducing them to perform fine-motor tasks such as robot hands manipulating objects. Eureka is likely just the first of many efforts that will be needed to cross the divide, because it operates inside a computer simulation of robotics and does not yet control a physical robot in the real world.
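To make the high-level/low-level divide concrete, here is a minimal sketch, not Nvidia's actual code, of the core idea behind Eureka as described above: a language model writes a reward function as code, and a reinforcement-learning trainer in simulation uses that scalar reward to shape low-level joint behavior. The function name, state fields, and penalty weight below are illustrative assumptions.

```python
import math

# A reward function of the kind a language model might emit for a
# hypothetical "move fingertip to target" manipulation task.
# Signature and fields are assumptions for illustration only.
def llm_generated_reward(fingertip_pos, target_pos, joint_velocities):
    """Dense reward: nearer the target is better; wild joint motion is penalized."""
    dist = math.dist(fingertip_pos, target_pos)
    smoothness_penalty = 0.01 * sum(v * v for v in joint_velocities)
    return -dist - smoothness_penalty

# The RL trainer sees only the scalar reward; the language model never
# commands the joints directly. It steers them through the reward it wrote.
reward_near = llm_generated_reward((0.1, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0))
reward_far = llm_generated_reward((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0))
assert reward_near > reward_far  # states nearer the goal score higher
```

The design point is the separation of concerns: the generative model contributes the objective, while a conventional low-level controller, trained in simulation, does the fine motor work.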
About OODA Analyst
OODA comprises a unique team of international experts capable of providing advanced intelligence and analysis, strategy and planning support, risk and threat management, training, decision support, crisis response, and security services to global corporations and governments.