# Universal Policies for Software-Defined MDPs

@article{Selsam2020UniversalPF, title={Universal Policies for Software-Defined MDPs}, author={Daniel Selsam and Jesse Michael Han and Leonardo Mendonça de Moura and Patrice Godefroid}, journal={ArXiv}, year={2020}, volume={abs/2012.11401} }

We introduce a new programming paradigm called oracle-guided decision programming in which a program specifies a Markov Decision Process (MDP) and the language provides a universal policy. We prototype a new programming language, Dodona, that manifests this paradigm using a primitive choose representing nondeterministic choice. The Dodona interpreter returns either a value or a choicepoint that includes a lossless encoding of all information necessary in principle to make an optimal decision…
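The interpreter contract described in the abstract (a run yields either a value or a choicepoint, and a policy resolves each choicepoint) can be illustrated with a minimal Python sketch. This is not Dodona and not the authors' implementation. The names `run`, `drive`, `Value`, `Choicepoint`, `subset_sum`, and `exhaustive_policy` are hypothetical. The sketch also assumes that a replayable history of decisions is an acceptable stand-in for the "lossless encoding" the abstract mentions, and it uses brute-force search where the paper envisions a universal policy.

```python
from dataclasses import dataclass
from typing import Any, Callable, Generator, List, Tuple, Union

# A "program" is a generator function; `yield options` stands in for choose.
Program = Callable[[], Generator[List[Any], Any, Any]]

@dataclass(frozen=True)
class Value:
    result: Any               # final result of a finished run

@dataclass(frozen=True)
class Choicepoint:
    options: List[Any]        # alternatives offered by the pending choice
    history: Tuple[Any, ...]  # decisions so far; enough to replay the run

def run(program: Program, decisions: Tuple[Any, ...] = ()) -> Union[Value, Choicepoint]:
    """Replay `program` under `decisions`; stop at the first undecided choice."""
    gen = program()
    try:
        options = next(gen)            # run to the first choice
        for d in decisions:
            options = gen.send(d)      # answer it, run to the next choice
    except StopIteration as done:
        return Value(done.value)       # program finished with a value
    return Choicepoint(list(options), decisions)

def drive(program: Program, policy: Callable[[Choicepoint], Any]) -> Any:
    """Resolve every choicepoint with `policy` until the program returns."""
    state = run(program)
    while isinstance(state, Choicepoint):
        state = run(program, state.history + (policy(state),))
    return state.result

def subset_sum():
    """Toy MDP: take or skip each number; reward 1 if the total is 11."""
    total = 0
    for n in (3, 5, 8):
        if (yield [True, False]):
            total += n
    return 1 if total == 11 else 0

def exhaustive_policy(program: Program) -> Callable[[Choicepoint], Any]:
    """Brute-force stand-in for an oracle: pick the option with the best outcome."""
    def best(state: Union[Value, Choicepoint]) -> Any:
        if isinstance(state, Value):
            return state.result
        return max(best(run(program, state.history + (o,))) for o in state.options)
    return lambda cp: max(
        cp.options, key=lambda o: best(run(program, cp.history + (o,)))
    )

reward = drive(subset_sum, exhaustive_policy(subset_sum))
```

Because a choicepoint here carries the full decision history, any policy can re-run the program from that point as often as it likes, which is what makes the exhaustive search possible. A learned policy would replace `exhaustive_policy` without changing `run` or `drive`.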

## 3 Citations

### Learning to Find Proofs and Theorems by Learning to Refine Search Strategies

- Computer Science, NeurIPS
- 2022

We propose a new approach to automated theorem proving where an AlphaZero-style agent is self-training to refine a generic high-level expert strategy expressed as a nondeterministic program. An…

### Proof Artifact Co-training for Theorem Proving with Language Models

- Computer Science, ICLR
- 2022

This work proposes PACT, a general methodology for extracting abundant self-supervised data from kernel-level proof terms for co-training alongside the usual tactic prediction objective, and applies it to Lean, an interactive proof assistant which hosts some of the most sophisticated formalized mathematics to date.

### Proof Artifact Co-Training for Theorem Proving with Language Models

- Computer Science
- 2021

This work proposes PACT (Proof Artifact Co-Training), a general methodology for extracting abundant self-supervised data from kernel-level proof terms for joint training alongside the usual tactic prediction objective and applies this methodology to Lean, a proof assistant host to some of the most sophisticated formalized mathematics to date.

## 49 References

### State abstraction for programmable reinforcement learning agents

- Computer Science, AAAI/IAAI
- 2002

This paper explores safe state abstraction in hierarchical reinforcement learning, where learned behaviors must conform to a given partial, hierarchical program, and shows how to achieve this for a partial programming language that is essentially Lisp augmented with nondeterministic constructs.

### Inference Compilation and Universal Probabilistic Programming

- Computer Science, AISTATS
- 2017

We introduce a method for using deep neural networks to amortize the cost of inference in models from the family induced by universal probabilistic programming languages, establishing a framework…

### Church: a language for generative models

- Computer Science, UAI
- 2008

This work introduces Church, a universal language for describing stochastic generative processes, based on the Lisp model of lambda calculus, containing a pure Lisp as its deterministic subset.

### Scheme: An Interpreter for Extended Lambda Calculus

- Computer Science, High. Order Symb. Comput.
- 1998

A completely annotated interpreter for SCHEME, written in MacLISP, is presented to acquaint programmers with the tricks of the trade of implementing non-recursive control structures in a recursive language like LISP.

### ProGraML: Graph-based Deep Learning for Program Optimization and Analysis

- Computer Science, ArXiv
- 2020

This work introduces ProGraML - Program Graphs for Machine Learning - a novel graph-based program representation using a low level, language agnostic, and portable format; and machine learning models capable of performing complex downstream tasks over these graphs.

### Machine Learning in Compilers: Past, Present and Future

- Computer Science, 2020 Forum for Specification and Design Languages (FDL)
- 2020

A retrospective of machine learning in compiler optimisation from its earliest inception, through some of the works that set themselves apart, to today's deep learning, finishing with the vision of the field's future.

### A Theory of Universal Artificial Intelligence based on Algorithmic Complexity

- Computer Science, ArXiv
- 2000

This work constructs a modified algorithm AIXItl, which is still effectively more intelligent than any other time t and space l bounded agent, and gives strong arguments that the resulting AIXI model is the most intelligent unbiased agent possible.

### Dynamic Neural Program Embedding for Program Repair

- Computer Science, ICLR
- 2018

A novel semantic program embedding that is learned from program execution traces is proposed, showing that program states expressed as sequential tuples of live variable values not only capture program semantics more precisely, but also offer a more natural fit for Recurrent Neural Networks to model.

### Compiler-based graph representations for deep learning models of code

- Computer Science, CC
- 2020

This paper uses graph neural networks (GNNs) to learn predictive compiler tasks on two representations based on ASTs and CDFGs. The approach significantly outperforms the state of the art in the task of heterogeneous OpenCL mapping, while providing orders of magnitude faster inference times, which is crucial for compiler optimizations.

### Global Relational Models of Source Code

- Computer Science, ICLR
- 2020

This work bridges the divide between global and structured models by introducing two new hybrid model families that are both global and incorporate structural bias: Graph Sandwiches, which wrap traditional (gated) graph message-passing layers in sequential message-passing layers; and Graph Relational Embedding Attention Transformers, which bias traditional Transformers with relational information from graph edge types.