Go to the 1:21 mark to see what the eureka moment was for me. Note that "thirsty" does not appear anywhere in my code, only actions like "pick" and "place", and the word "bottle" comes from vision.
Being the first company to integrate GPT-3 into a humanoid body, creating an embodied LLM, is nothing short of extraordinary. It is akin to witnessing the dawn of a new era in AI-human interaction, one where a humanoid can intuitively understand a request without explicit coding. This breakthrough represents a leap forward in technology, blurring the line between man and machine and paving the way for a future where seamless communication between humans and AI becomes a reality.
The robot stack contains only basic motion functions. The LLM performs the higher-level "compute", connecting user intent, robotic affordances, and environmental context. That kind of reasoning is where robotics has always hit a ceiling in the past.
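As a rough illustration of that division of labor, here is a minimal sketch (hypothetical function and prompt names, not the actual stack): the robot exposes only primitives like pick and place, the vision system supplies object labels, and the LLM is what connects "I'm thirsty" to picking up the bottle. The model name and client usage follow the OpenAI Python library and are assumptions for illustration.

```python
import json
from openai import OpenAI  # assumes OPENAI_API_KEY is set in the environment

client = OpenAI()

# The only primitives the robot stack exposes; "thirsty" never appears in code.
SYSTEM_PROMPT = (
    "You control a humanoid robot. Available actions: "
    "pick(object), place(object, location). "
    "Given the objects the robot currently sees and what the user said, "
    "reply with a JSON list of actions, nothing else."
)

def plan(user_utterance: str, visible_objects: list[str]) -> list[dict]:
    """Ask the LLM to turn user intent plus vision context into primitive actions."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumption; the demo described in the post used GPT-3
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {
                "role": "user",
                "content": f"Visible objects: {visible_objects}. User said: {user_utterance}",
            },
        ],
    )
    return json.loads(response.choices[0].message.content)

# Example: the LLM infers that a thirsty user wants the bottle, even though
# no line of code maps "thirsty" to anything.
# plan("I'm thirsty", ["bottle", "cup", "apple"])
# -> [{"action": "pick", "object": "bottle"},
#     {"action": "place", "object": "bottle", "location": "user"}]
```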
Imagine the possibilities - let's use it for good.