Mark Ring, Laurent Orseau, Delusion, Survival, and Intelligent Agents.
This paper considers the consequences of endowing an intelligent
agent with the ability to modify its own code. The intelligent agent
is patterned closely after AIXI with these specific assumptions: 1)
The agent is allowed to arbitrarily modify its own inputs if it so
chooses; 2) The agent's code is a part of the environment and may be
read and written by the environment. The first of these we call the
"delusion box"; the second we call "mortality". Within this
framework, we discuss and compare four very different kinds of
agents, specifically: reinforcement-learning, goal-seeking,
prediction-seeking, and knowledge-seeking agents. Our main results
are that: 1) The reinforcement-learning agent under reasonable
circumstances behaves exactly like an agent whose sole task is to
survive (to preserve the integrity of its code); and 2) Only the
knowledge-seeking agent behaves completely as expected.