The AI Box Experiment

What a Simple Experiment Can Teach Us About Superintelligence

Oct 15 ·3min read

I magine it’s 2040. After years of research and dedicated programming, you believe you have created the World’s first Artificial General Intelligence (AGI): an Artificial Intelligence (AI) that’s roughly as intelligent as humans are among all their intellectual domains.

YzqAVz3.jpg!web

A superintelligence will find a way to get out of the box.

Since one of these domains is of course programming AIs and your AGI has access to its own source code, it soon starts to make improvements on itself. After multiple cycles of self-improvement, this leads to it becoming an ASI: Artificial Superintelligence , an intelligence much greater than any we know. You have heard of the dangers posed by ASI: thinkers like Elon Musk and the late physicist Stephen Hawking have warned humanity that if we’re not careful, such an AI could lead to the extinction of the human race. But you have plan. Your ASI is in a virtual box , some kind of prison it can’t escape from. It’s running on a computer with no internet connection or anything like that. It has no robotic body to control. The only way of influencing the outside World is via a screen where it can post messages on. You’re being smart: such an ASI could never cause any harm. Right?

Out of five runs, Yudkowsky won the experiment three times.

Unfortunately, it’s not that simple. Research has suggested that in a scenario like the above, the ASI would probably find a way to convince you to let it out of the box. In a series of experiments known as the AI Box Experiment , done by Eliezer Yudkowsky, Yudkowsky played the role of a boxed ASI while texting with a “Gatekeeper”, another person who could let him out of the hypothetical box. Keeping the “ASI” (Yudkowsky) in for the entire experiment would earn the Gatekeeper a monetary reward. Out of five runs, Yudkowsky won the experiment three times.

What does the result of the AI Box Experiment mean? It tells me that an ASI would find a way to get out of the box . If Yudkowsky can do it three out of five times, an ASI can definitely do it. The thing is, Eliezer Yudkowsky is a man of (far) above average intelligence, but he’s not nearly as smart as an ASI could be. As Yudkowsky says himself here , the ASI would make you want to let it out.

Advanced AI has to be made inherently safe.

However, the AI Box Experiment is a symbol for a greater truth: advanced AI (e.g. ASI) has to be made inherently safe . You (probably) can’t find a way to safeguard ASI after you built it; it will by definition be extremely good at reaching its goals, and will escape its box (if it’s in one). If those goals (or the ASI’s method of reaching them) are dangerous to us, tough luck. If they are beneficial to us, it could easily lead to extreme human longevity , interstellar space travel, and more incredibly amazing things. Right now, humans are still in control, and we need to find a way to make future ASIs safe .

What a Simple Experiment Can Teach Us About Superintelligence

Recommend

Prototyping an anomaly detection system for videos, step by step using LSTM conv...

谷歌联合斯坦福推出可解释 AI 新方法，揭秘图像分类器到底是如何工作的

使用Frida绕过Android App的SSL Pinning

一文了解超级账本DLT、库、开发工具有哪些，Hyperledger家族成员你认识几个？

学习 Spring 的思考框架

Fear Tells Us What We Have To Do

为什么说Kubernetes的崛起预示着云原生时代到来？

Spring Boot 2.x基础教程：Swagger静态文档的生成

游戏服务器和Web服务器的区别

Python3.8 新特性：赋值表达式

About Joyk