Robots That Learn From Watching You Are Coming Whether You Like It or Not

Nvidia wants to make robots smarter by making them creepier.

March 19, 2024

Humans, like other primates, are among the few species on Earth that can watch someone do something and then learn from it: monkey see, monkey do.

But just because that skill is rare in the animal kingdom doesn’t mean it has to be rare in the robotic one, especially if Nvidia has anything to do with it.

Meet GR00T

There’s a lot to unpack from Nvidia’s GTC 2024 keynote, including a whole new generation of AI-enabling chips called Blackwell, but nestled in that unveiling was GR00T, which stands for Generalist Robot 00 Technology.

As the name implies, GR00T is focused on accelerating robotic development. How does it plan to do that? In the creepiest way possible, of course. As per Nvidia:

Robots powered by GR00T... will be designed to understand natural language and emulate movements by observing human actions — quickly learning coordination, dexterity and other skills in order to navigate, adapt and interact with the real world.

So, to synopsize, Nvidia wants to teach robots how to teach themselves. Part of its trick is making GR00T multimodal, which means, for example, that one could show GR00T a video of a person performing an action and the model might be able to mime the action based on that video input. No programming, no simulating, just AI mimicry.

While GR00T has a long way to go before it can call itself a robotics revolution, it does make a strong case for addressing one of the biggest problems facing functional humanoid robots: the sheer variability of tasks and scenarios one might encounter.

Since it’s impossible to pre-program a response for every real-life scenario a bot might encounter out in the world, the ability to learn or react on the fly with generative AI could theoretically provide the level of versatility robots need to navigate their entropic environments.

Plus, this is Nvidia we’re talking about here, the company whose chips have powered much of the current AI revolution, so it’s pretty hard to rule out the success of a model like GR00T, even if the challenges of functional humanoid robots are vast.

Naturally, Nvidia also has hardware, called Jetson Thor, to go along with its robotic endeavors. Nvidia says Jetson Thor is a “modular architecture optimized for performance, power and size” and is specifically designed to run multimodal models like GR00T.

When Will We See GR00T in Robots?

Lots of big names in robotics are already on board with GR00T. Nvidia says it’s building a “comprehensive AI platform” for companies like Agility Robotics, Boston Dynamics, Figure AI, and more.

And if competition is an accelerator, then humanoid bots also have that working in their favor. OpenAI, a name as big in AI as Nvidia, has thrown its hat into the robot ring with bots like those made by Figure AI. The results are already just as impressive and creepy as you might imagine.

Either way, it looks like I could end up eating more words after all. Maybe at-home robots are closer than I thought.