Eliezer Yudkowsky

The first and most popular is to work with meta-languages. For any language L we may form a new language by adding a predicate True, which acts only on sentences of L (i.e., sentences without the symbol True) and satisfies property 1. We can iterate this construction to obtain an infinite sequence of languages L1, L2, ..., each of which contains the ...
The goal of the field of Artificial Intelligence is to understand intelligence and create a human-equivalent or transhuman mind. Beyond this lies another question: whether the creation of this mind will benefit the world; whether the AI will take actions that are benevolent or malevolent, safe or uncaring, helpful or hostile. Creating Friendly AI describes ...
We consider the one-shot Prisoner’s Dilemma between algorithms with access to one another’s source code, and apply the modal logic of provability to achieve a flexible and robust form of mutual cooperation. We discuss some variants, and point out obstacles to definitions of optimality. 1. Informal Introduction. Many philosophers have suggested that mutual ...
The possibility of creating thinking machines raises a host of ethical issues. These questions relate both to ensuring that such machines do not harm humans and other morally relevant beings, and to the moral status of the machines themselves. The first section discusses issues that may arise in the near future of AI. The second section outlines challenges ...
All else being equal, not many people would prefer to destroy the world. Even faceless corporations, meddling governments, reckless scientists, and other agents of doom require a world in which to achieve their goals of profit, order, tenure, or other villainies. If our extinction proceeds slowly enough to allow a moment of horrified realization, the doers ...
Applications of game theory often neglect that real-world agents normally have some amount of out-of-band information about each other. We consider the limiting case of a one-shot Prisoner’s Dilemma between algorithms with read access to one another’s source code. Previous work has shown that cooperation is possible at a Nash equilibrium in this setting, but ...
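As a rough illustration of the setting described in this abstract, the sketch below lets two programs read each other's source before choosing a move. The names `CLIQUE_BOT`, `DEFECT_BOT`, `run`, and `play` are hypothetical and do not come from the paper. The cooperating program uses plain textual equality of source code, which is only one simple way that mutual cooperation can arise; it is not the provability-logic approach mentioned in the related abstract.

```python
# Each agent is a source string defining act(my_source, opponent_source).
# "C" means cooperate, "D" means defect. All names here are hypothetical.

CLIQUE_BOT = '''
def act(my_source, opponent_source):
    # Cooperate only with an exact textual copy of myself.
    return "C" if opponent_source == my_source else "D"
'''

DEFECT_BOT = '''
def act(my_source, opponent_source):
    # Ignore the opponent's source and always defect.
    return "D"
'''


def run(source, opponent_source):
    """Execute one agent's source and ask it for a move."""
    namespace = {}
    exec(source, namespace)  # defines act() inside namespace
    return namespace["act"](source, opponent_source)


def play(source_a, source_b):
    """Play a one-shot game in which each agent reads the other's source."""
    return run(source_a, source_b), run(source_b, source_a)


if __name__ == "__main__":
    print(play(CLIQUE_BOT, CLIQUE_BOT))
    print(play(CLIQUE_BOT, DEFECT_BOT))
```

Against an identical copy, the equality-checking program cooperates, and against anything else it defects, so it cannot be exploited by a defector. The check is brittle, though: a program that behaves identically but differs by a single character is treated as a stranger.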
Disputes between evidential decision theory and causal decision theory have continued for decades, and many theorists state dissatisfaction with both alternatives. Timeless decision theory (TDT) is an extension of causal decision networks that compactly represents uncertainty about correlated computational processes and represents the decision-maker as such ...