Hands are more critical for robot intelligence than AI, says robotics CEO

May 31, 2023

(Image sourced via Sanctuary AI)

For the CEO of a company called Sanctuary AI, Geordie Rose seems rather dismissive of the world’s current tech obsession: generative and large-language AIs, such as OpenAI’s GPT engine. He tells me:

I don’t want to underestimate generative AIs’ abilities. But what cats and dogs do when they understand the world, without language, is dramatically more advanced than the best possible AI systems that we can build today. That alone should give people pause when they talk about large-language models [LLMs] as if they're the Second Coming!

They're really not. They're just an important new technology that is a part of the landscape, that’s all. But LLMs have virtually nothing to do with the true problem of building an intelligence. In fact, they don't address the central problem of intelligence at all, which is how to understand the world around you, and then act on it.

For Rose, and for the humanoid robots built and operated by Sanctuary AI, that is the only meaningful definition of intelligence. And it is one the company is actively pursuing with its sixth-generation, general-purpose robot, Phoenix – just announced – and its new AI-based control system, Carbon.

Rose continues:

Intelligence is about understanding the world. And understanding it well enough to be able to act on it to your benefit: that's what intelligence actually is.

Rose’s point is that LLMs have no conception of the human world at all; they merely sound intelligent, because they use the same language that clever people do – largely because they are recycling things that clever people once said. Yet many users are still taken in by what he calls LLMs’ “circus act”.

For example, you could ask a GPT-powered system what a humanoid robot is, and it would tell you. But behind that answer would be zero understanding of the concept itself: an absence of sentience or cognition, and of any ability to engage with the world around it. The latter is what Carbon seeks to add to general-purpose robots.

Despite this, companies like Engineered Arts, maker of the impressive ‘Ex Machina’-style Ameca robot, have used GPT to turn their machines into something that resembles our science-fiction image of robots: humanoid doppelgangers that can converse in multiple languages and answer in-depth questions.

Rose notes:

Running a large-language model in a robot is 30 minutes’ work for anybody who's half competent. It’s just not an impressive thing, it’s really easy to do.

But what we’re focused on at Sanctuary AI is understanding how to do something valuable with robots that people would actually pay for. Forget about the technology itself. A user might ask, ‘What are robot companies doing that is useful, that I can't do myself?’ That's what we focus on.

Or as another robotics CEO told me once, people won’t pay for robots; but they will pay for whatever useful services they provide.

Sanctuary AI is one of the few companies that understands this obvious point. Other makers’ realistic androids, or running, jumping, somersaulting robots, may be significant engineering achievements, but what are they actually for, given the enormous likely cost of such machines?

When diginomica last spoke to Rose back in March, he explained how training robots to do repetitive tasks without a human in the loop – autonomously sorting and stacking items on shelves, for example – is far more challenging, time-consuming, and complex than teaching LLMs about any subject at all. For the latter, you simply need to scrape enough human-made data, plus a system that can respond accurately to natural-language prompts.

But since then, there have been two significant developments at Sanctuary AI: the new Phoenix robot, and Carbon. He explains:

To be general-purpose, a robot needs to be able to do nearly any task, in the same way you’d expect a person to, and in the environment where the work actually is.

While it is easy to get fixated on the physical aspects of a robot, our view is that the robot is just a tool for the real star of the show, which in our case is our proprietary AI control system: the robot’s new, Carbon-based mind.

An archetype for future AI systems

Carbon is a cognitive architecture and software platform for humanoid general-purpose robots, such as Phoenix and the generations that will follow it. The long-term aim is to create a robot that can think and complete tasks like a person, via deep learning and reinforcement learning, and photo-realistic, physics-based world simulations.

However, Carbon will also couple logical and symbolic reasoning with LLMs for general, conversational knowledge, he says. But he adds:

The speaking part is not that valuable for robots to work. It’s unlikely to help a robot do its job, unless it needs to know specific facts about the world in order to complete a task.

But to make a robot that’s actually useful, you need to start integrating vision systems, audio systems, goal-seeking behaviours, and much more. And what you get is something like Carbon.

Carbon is an archetype for the way that AI systems will be built in the future. It's a combination of different techniques that all work together to achieve a valuable goal.

LLMs are part of that technology universe, but they're not some path to transcendence. They're just a tool, like a calculator. And in order to make them into something that's far more than that, you need to integrate all sorts of other things. And that's what Carbon does.

Teleoperation, fleet management, and human-in-the-loop operation are also part of what Carbon provides, with the latter still essential until future generations of robots are sufficiently advanced and trained to operate autonomously, in the same way that driver-assistance systems are just one stage in the slow journey towards fully autonomous cars.

But what does the Phoenix robot itself offer that previous generations did not? The answer, says Rose, is far more advanced and capable hands – in Sanctuary AI’s case, incorporating micro-hydraulics and other technologies.

Indeed, he explains that sophisticated hands are the single most important element in the development of intelligent robots – in much the same way that having opposable thumbs was stage one in humans’ evolutionary journey from using primitive tools to creating robots and AIs.

He explains:

If you are building humanoid robots, the place where you have to focus the most effort is the hands. In many ways, the rest of the robot’s form factor is only there to get the hands to where they need to be. So, if you don't have highly capable hands on your robot, it doesn't really matter what else it can do. Because the things a robot can do without hands are, generally, not that valuable. They’re just performative.

If I wanted to build a robot that can do a bunch of athletic things, but it couldn’t do anything more, what would be the point? I can see there's a pure research angle, and that's fine. But if you want to build a viable business, you need to be able to make something that actually delivers value. So, how do you build a robot that delivers that? The answer is you replicate the human hand.

But building a hand that works is far harder than building a robot that can do athletic manoeuvres. For us, the problem that we've set ourselves is the very ambitious one of replicating human form, function, and cognition. And we've made a lot of progress in that, so I’m very proud of what we are doing.

He adds:

The system itself is undergoing a process of untethering, which means it's going to be free to move around. In about three months, we'll have a fully untethered robot.

Good news for the robotics sector. But some may ask, why does a general-purpose robot need to be humanoid? One answer is that it will exist in a built environment designed for, and by, human beings:

There are a lot of reasons for a humanoid form factor. One is that virtually every artefact that we, as a society, build is constructed to conform to the human body. Everything from chairs to cars to door handles.

So, a general-purpose robot would need to exist and work in that world, in a range of environments – from factory floors, warehouses, and large retail spaces to energy platforms and power stations.

Then Rose adds:

And because you need highly capable hands.

My take

The ‘autonomous car’ comparison is inaccurate, says Rose. That’s because driving on mapped roads that are designed for vehicles, with a supporting infrastructure, is relatively simple compared with building a robot that can work autonomously in any location. But even so, it’s fair to say that we are at the ‘driver assistance’ stage of humanoid robotics.

Such devices, some still human ‘piloted’, increasingly look human, move like us, and, with LLMs onboard or in the cloud, can simulate intelligence. But we are some way from the vision of a genuinely intelligent, multi-purpose robot that can both comprehend the world around it and carry out tasks autonomously, to order.

To reach that point demands a company like Sanctuary AI, which understands the one critical point: humanoid robots must actually be for something; they must carry out useful tasks and services, safely, reliably, and adeptly, with minimal human intervention.

By contrast, any expensive piece of design/engineering that, while entertaining or superficially impressive, rapidly loses its appeal after being switched on, would be little more than a hyper-realistic mannequin.
