RLang: A Declarative Language for Expressing Prior Knowledge for Reinforcement Learning

Rafael Rodríguez-Sánchez, Benjamin A. Spiegel, Jennifer Wang, Roma Patel, Stefanie Tellex, George Dimitri Konidaris
Communicating useful background knowledge to reinforcement learning (RL) agents is an important and effective method for accelerating learning. We introduce RLang, a domain-specific language (DSL) for communicating domain knowledge to an RL agent. Unlike other existing DSLs proposed by the RL community that ground to single elements of a decision-making formalism (e.g., the reward function or policy function), RLang can specify information about every element of a Markov decision process. We…

Proximal Policy Optimization Algorithms

We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective
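The "surrogate" objective referenced here is PPO's clipped objective. A minimal NumPy sketch (function name and the default clipping coefficient `eps=0.2` follow the paper's notation; this is an illustration, not the reference implementation):

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective (to be maximized).

    ratio: pi_new(a|s) / pi_old(a|s) per sample; advantage: estimated A(s, a).
    Clipping removes the incentive to move the ratio outside [1-eps, 1+eps].
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Taking the elementwise minimum makes the bound pessimistic (a lower bound).
    return np.minimum(unclipped, clipped)

# A large ratio with positive advantage is capped at (1 + eps) * advantage:
print(ppo_clip_objective(np.array([2.0]), np.array([1.0])))
```

In practice this per-sample objective is averaged over a minibatch and maximized with a gradient method, alternating with fresh environment rollouts as the summary describes.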

Deep Reinforcement Learning with Double Q-Learning

This paper proposes a specific adaptation to the DQN algorithm and shows that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but that this also leads to much better performance on several games.
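The adaptation is to decouple action selection from action evaluation when forming the bootstrap target. A tabular sketch (the `q_online`/`q_target` arrays indexed by `[state, action]` are hypothetical stand-ins for the two DQN networks):

```python
import numpy as np

def double_q_target(q_online, q_target, reward, next_state, gamma=0.99):
    """Double DQN target: the online table picks the argmax action,
    the target table evaluates it, which reduces overestimation bias."""
    a_star = int(np.argmax(q_online[next_state]))          # selection
    return reward + gamma * q_target[next_state, a_star]   # evaluation
```

Standard Q-learning would instead use `max(q_target[next_state])` for both roles, so any upward noise in a value estimate is both selected and propagated.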

R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning

R-MAX is a very simple model-based reinforcement learning algorithm which can attain near-optimal average reward in polynomial time and formally justifies the "optimism under uncertainty" bias used in many RL algorithms.
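The "optimism under uncertainty" bias can be illustrated by R-MAX's optimistic initialization: every unknown state-action pair is assumed to yield the maximal return, so the agent is driven toward under-visited pairs. A sketch of just that initialization step (the full algorithm also tracks visit counts and replans on a learned model):

```python
import numpy as np

def rmax_q_init(n_states, n_actions, r_max, gamma=0.95):
    """Optimistic value table in the spirit of R-MAX: unknown (s, a) pairs
    start at the maximal discounted return r_max / (1 - gamma)."""
    return np.full((n_states, n_actions), r_max / (1.0 - gamma))
```

Because known pairs get realistic (lower) values while unknown pairs keep the optimistic bound, greedy planning naturally steers exploration toward what has not yet been learned.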

Creating Advice-Taking Reinforcement Learners

This work presents and evaluates a design that addresses this shortcoming by allowing a connectionist Q-learner to accept advice given, at any time and in a natural manner, by an external observer, and shows that, given good advice, a learner can achieve statistically significant gains in expected reward.

Simple statistical gradient-following algorithms for connectionist reinforcement learning

This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units that are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, and they do this without explicitly computing gradient estimates.
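The best-known member of this class is REINFORCE, whose update follows the reward-weighted gradient of the log-policy. A minimal sketch for a softmax policy over discrete actions in a bandit setting (function name and learning rate are illustrative):

```python
import numpy as np

def reinforce_step(theta, action, reward, lr=0.1):
    """One REINFORCE update: theta += lr * reward * grad log pi(action).

    theta: logits of a softmax policy over discrete actions.
    The log-softmax gradient is one_hot(action) - probs.
    """
    probs = np.exp(theta - theta.max())
    probs /= probs.sum()
    grad_log_pi = -probs
    grad_log_pi[action] += 1.0
    return theta + lr * reward * grad_log_pi
```

Note the gradient of expected reinforcement is estimated from sampled actions and rewards alone, matching the summary's point that no explicit gradient of the environment is computed.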

A Survey of Reinforcement Learning Informed by Natural Language

The time is right to investigate a tight integration of natural language understanding into Reinforcement Learning in particular, and the state of the field is surveyed, including work on instruction following, text games, and learning from textual domain knowledge.

Modular Multitask Reinforcement Learning with Policy Sketches

Experiments show that using the approach to learn policies guided by sketches gives better performance than existing techniques for learning task-specific or shared policies, while naturally inducing a library of interpretable primitive behaviors that can be recombined to rapidly adapt to new tasks.

Rectifier Nonlinearities Improve Neural Network Acoustic Models

This work explores the use of deep rectifier networks as acoustic models for the 300-hour Switchboard conversational speech recognition task, and analyzes hidden layer representations to quantify differences in how ReL units encode inputs as compared to sigmoidal units.
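One variant studied in this line of work is the leaky rectifier, which keeps a small slope on negative inputs instead of zeroing them. A one-line NumPy sketch:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    """Leaky rectifier: passes positives unchanged and scales negatives
    by alpha, so units never produce an exactly-zero gradient region."""
    return np.where(x > 0, x, alpha * x)
```

With `alpha=0`, this reduces to the standard rectifier; unlike sigmoidal units, neither variant saturates for large positive inputs.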

Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions

This paper presents a grounded CCG semantic parsing approach that learns a joint model of meaning and context for interpreting and executing natural language instructions, using various types of weak supervision.

PPDDL 1.0: The language for the probabilistic part of IPC-4

Proc. International Planning Competition, 2004