AI-Box Experiment


Not mentioned directly (at this point) in a website or document, but in discussions (such as here, here, and here, a discussion where Sean Murray mentions Roko’s Basilisk).

Definition of AI Box Experiment

from -

A hypothetical isolated computer hardware system in which a possibly dangerous artificial intelligence, or AI, is kept constrained in a “virtual prison” and not allowed to manipulate events in the external world.

This also connects to Roko’s Basilisk.

Definition of Roko’s Basilisk


Roko’s basilisk is a thought experiment proposed in 2010 by the user Roko on the Less Wrong community blog. Roko used ideas in decision theory to argue that a sufficiently powerful AI agent would have an incentive to torture anyone who imagined the agent but didn’t work to bring the agent into existence. The argument was called a “basilisk” because merely hearing the argument would supposedly put you at risk of torture from this hypothetical agent — a basilisk in this context is any information that harms or endangers the people who hear it.


In July 2010, LessWrong contributor Roko posted a thought experiment to the site in which an otherwise benevolent future AI system tortures simulations of those who did not work to bring the system into existence. This idea came to be known as “Roko’s basilisk,” based on Roko’s idea that merely hearing about the idea would give the hypothetical AI system stronger incentives to employ blackmail.

References and Connections

Related Concepts


I had a friend, who was later diagnosed with Dissociative Identity Disorder, who stated that his “other self” would mentally torture him if “it” heard other people mention “it” by name in conversation and my friend wouldn’t let “it” come to the forefront to argue with those people. And yes, my friend got the help he needed with “it”. Roko’s Basilisk sounds a lot like my past friend’s mental disorder.