Corpus ID: 235490131

Scenic4RL: Programmatic Modeling and Generation of Reinforcement Learning Environments

Abdus Salam Azad, Edward J. Kim, Qiancheng Wu, Kimin Lee, Ion Stoica, P. Abbeel, Sanjit A. Seshia
The capability of a reinforcement learning (RL) agent depends directly on the diversity of learning scenarios the environment generates and on how closely those scenarios capture real-world situations. However, existing environments and simulators lack support for systematically modeling distributions over initial states and transition dynamics. Furthermore, in complex domains such as soccer, the space of possible scenarios is infinite, which makes it impossible for one research group to provide a comprehensive set…
Assisting Reinforcement Learning in Real-time Strategy Environments with SCENIC
The success of Reinforcement Learning (RL) methods relies heavily on the diversity and quality of learning scenarios generated by the environment. However, while RL methods are applied to…
Formal Analysis of AI-Based Autonomy: From Modeling to Runtime Assurance
VerifAI, an open-source toolkit for the formal design and analysis of systems that include AI/ML components, is presented, and its use for generating runtime monitors that capture the safe operational environment of such systems is described.


The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract)
The promise of ALE is illustrated by developing and benchmarking domain-independent agents designed using well-established AI techniques for both reinforcement learning and planning, and an evaluation methodology made possible by ALE is proposed.
Natural Environment Benchmarks for Reinforcement Learning
This work proposes three new families of benchmark RL domains that contain some of the complexity of the natural world, while still supporting fast and extensive data acquisition.
RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning
This paper proposes RL Unplugged, a suite of benchmarks for evaluating and comparing offline RL methods, which will increase the reproducibility of experiments and make it possible to study challenging tasks with a limited computational budget, thus making RL research both more systematic and more accessible across the community.
Benchmarking Safe Exploration in Deep Reinforcement Learning
This work proposes to standardize constrained RL as the main formalism for safe exploration, and presents the Safety Gym benchmark suite, a new slate of high-dimensional continuous control environments for measuring research progress on constrained RL.
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
This work introduces benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL, and releases benchmark tasks and datasets with a comprehensive evaluation of existing algorithms and an evaluation protocol together with an open-source codebase.
Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation
It is shown that, for some games, procedural level generation enables generalization to new levels within the same distribution, and that it is possible to achieve better performance with less data by manipulating the difficulty of the levels in response to the performance of the agent.
Scenic: A Language for Scenario Specification and Data Generation
A domain-specific language, Scenic, is designed for describing scenarios that are distributions over scenes and the behaviors of their agents over time, which combines concise, readable syntax for spatiotemporal relationships with the ability to declaratively impose hard and soft constraints over the scenario.
Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning
An open-source simulated benchmark for meta-reinforcement learning and multi-task learning consisting of 50 distinct robotic manipulation tasks is proposed to make it possible to develop algorithms that generalize to accelerate the acquisition of entirely new, held-out tasks.
Leveraging Procedural Generation to Benchmark Reinforcement Learning
This work empirically demonstrates that diverse environment distributions are essential to adequately train and evaluate RL agents, thereby motivating the extensive use of procedural content generation, and uses this benchmark to investigate the effects of scaling model size.
Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents
This paper takes a big-picture look at how the ALE is being used by the research community, focusing on how diverse the evaluation methodologies in the ALE have become and highlighting some key concerns when evaluating agents on this platform.