DRiLLS: Deep Reinforcement Learning for Logic Synthesis

@inproceedings{Hosny2020DRiLLSDR,
  title={DRiLLS: Deep Reinforcement Learning for Logic Synthesis},
  author={Abdelrahman Hosny and Soheil Hashemi and Mohamed Shalan and Sherief Reda},
  booktitle={2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC)},
  year={2020},
  pages={581--586}
}
Logic synthesis requires extensive tuning of the synthesis optimization flow where the quality of results (QoR) depends on the sequence of optimizations used. Efficient design space exploration is challenging due to the exponential number of possible optimization permutations. Therefore, automating the optimization process is necessary. In this work, we propose a novel reinforcement learning-based methodology that navigates the optimization space without human intervention. We demonstrate the… 
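The abstract describes an agent that learns which sequence of optimizations to apply, since QoR depends on the order of transformations. As a rough, hypothetical illustration of that idea (not the paper's actual implementation, which drives a real synthesis tool and uses a deep policy network), the search can be sketched as a simple tabular value-learning loop over toy transformation sequences; the QoR model and the `search_flow` helper below are invented stand-ins:

```python
import random

# Hypothetical sketch: an agent learns per-step values for ABC-style
# transformations. toy_qor is a toy stand-in for running a synthesis tool.
ACTIONS = ["rewrite", "refactor", "resub", "balance"]

def toy_qor(sequence):
    """Toy area score: lower is better; repeats give diminishing returns."""
    score = 1000.0
    for i, op in enumerate(sequence):
        score -= 50.0 / (1 + sequence[:i].count(op))
    return score

def search_flow(episodes=200, flow_len=4, eps=0.2, seed=0):
    """Epsilon-greedy search over per-step action values."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(flow_len)]
    best_seq, best_score = None, float("inf")
    for _ in range(episodes):
        seq = []
        for step in range(flow_len):
            if rng.random() < eps:
                idx = rng.randrange(len(ACTIONS))       # explore
            else:
                idx = min(range(len(ACTIONS)),          # exploit: lowest score
                          key=lambda i: q[step][i])
            seq.append(ACTIONS[idx])
        score = toy_qor(seq)
        for step, op in enumerate(seq):                 # credit the whole flow
            i = ACTIONS.index(op)
            q[step][i] += 0.1 * (score - q[step][i])
        if score < best_score:
            best_seq, best_score = list(seq), score
    return best_seq, best_score
```

The key point the abstract makes survives even in this toy: because permutations explode combinatorially, the agent learns from sampled flows rather than enumerating them.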

Citations

Rethinking Reinforcement Learning based Logic Synthesis
TLDR
A new RL-based method is developed that automatically recognizes critical operators and generates common operator sequences generalizable to unseen circuits; it achieves a good balance among delay, area, and runtime, and is practical for industrial use.
BOiLS: Bayesian Optimisation for Logic Synthesis
TLDR
BOiLS, the first algorithm adapting Bayesian optimisation to navigate the space of synthesis operations, is proposed; it demonstrates superior performance compared to the state of the art in terms of both sample efficiency and QoR values.
RL-Guided Runtime-Constrained Heuristic Exploration for Logic Synthesis
TLDR
This work proposes a runtime-constrained reinforcement learning (RL) approach that generates scripts to carry out logic synthesis flows, and develops a framework around the EPFL mockturtle libraries that produces custom scripts using this RL-based approach.
Too Big to Fail? Active Few-Shot Learning Guided Logic Synthesis
TLDR
This work proposes a new approach, Bulls-Eye, that tunes a pre-trained model on past synthesis data to accurately predict the quality of a synthesis recipe for an unseen netlist, achieving 2x-10x run-time improvement and better quality-of-result (QoR) than state-of-the-art machine learning approaches.
Collaborative Distillation Meta Learning for Simulation Intensive Hardware Design
TLDR
The CDML framework consists of a context-based meta learner and a collaborative distillation scheme that produce a reusable solver, which outperforms both neural baselines and iterative conventional design methods on the real-world objective of power integrity, with zero-shot transferability.
Reinforcement Learning for Scalable Logic Optimization with Graph Neural Networks
TLDR
This work proposes to combine graph convolutional networks with reinforcement learning and a novel, scalable node embedding method to learn which local transforms should be applied to the logic graph and shows that this method achieves a similar size reduction as ABC on smaller circuits and outperforms it by 1.5–1.75× on larger random graphs.
Autonomous Application of Netlist Transformations Inside Lagrangian Relaxation-Based Optimization
TLDR
This work extends LR-based optimization by interleaving in each iteration various techniques, such as gate and flip-flop sizing, buffering to fix late and early timing violations, pin swapping, gate merge/split transformations, and useful clock skew, and allows for the autonomous execution of the optimization flow.
Hybrid Graph Models for Logic Optimization via Spatio-Temporal Information
TLDR
This work proposes hybrid graph neural network (GNN) based approaches towards highly accurate quality-of-result (QoR) estimations with great generalization capability, specifically targeting logic synthesis optimization.
DeepTPI: Test Point Insertion with Deep Reinforcement Learning
TLDR
This paper trains a novel DRL agent, instantiated as the combination of a graph neural network (GNN) and a Deep Q-Learning network (DQN), to maximize the test coverage improvement.
End-to-end Automatic Logic Optimization Exploration via Domain-specific Multi-armed Bandit
TLDR
A generic end-to-end sequential decision-making framework, FlowTune, is proposed for synthesis tool-flow optimization, built on a novel high-performance, domain-specific, multi-stage multi-armed bandit (MAB) approach.
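FlowTune's core primitive is the multi-armed bandit. As a hedged illustration of the bandit idea only (the classic UCB1 rule, not the paper's domain-specific, multi-stage variant), the sketch below shows how an exploration bonus steers pulls toward the best arm without evaluating every option exhaustively:

```python
import math
import random

def ucb1(arm_means, pulls=2000, seed=0):
    """UCB1 over Bernoulli arms; returns how often each arm was pulled."""
    rng = random.Random(seed)
    n = len(arm_means)
    counts = [0] * n
    sums = [0.0] * n
    for t in range(1, pulls + 1):
        if t <= n:
            arm = t - 1  # initialize: play each arm once
        else:
            # empirical mean plus an exploration bonus that shrinks
            # as an arm accumulates pulls
            arm = max(range(n),
                      key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts

counts = ucb1([0.2, 0.5, 0.8])
# the best arm (index 2) ends up with the large majority of pulls
```

In a synthesis-flow setting, each "arm" would correspond to a candidate optimization choice and the reward to a QoR improvement; the multi-stage structure in FlowTune chains such decisions, which this single-stage sketch does not capture.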
...

References

SHOWING 1-10 OF 20 REFERENCES
Continuous control with deep reinforcement learning
TLDR
This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
On learning-based methods for design-space exploration with High-Level Synthesis
TLDR
A study on the application of learning-based methods for the DSE problem is presented, and a learning model for HLS that is superior to the best models described in the literature is proposed.
Scalable Auto-Tuning of Synthesis Parameters for Optimizing High-Performance Processors
TLDR
A novel learning-based algorithm for synthesis parameter optimization that has been integrated into the existing autonomous parameter-tuning system, which was used to design multiple 22nm industrial chips and is currently being used for 14nm chips.
A synthesis-parameter tuning system for autonomous design-space exploration
TLDR
The overall organization of SynTunSys is presented, its main components are described, and results from employing it for the design of an industrial chip, the IBM z13 22nm high-performance server chip, yielding on average a 36% improvement in total negative slack and a 7% power reduction.
Developing Synthesis Flows Without Human Knowledge
TLDR
This work presents a fully autonomous framework that artificially produces design-specific synthesis flows without human guidance or baseline flows, using a Convolutional Neural Network (CNN).
Human-level control through deep reinforcement learning
TLDR
This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Human-level performance in first-person multiplayer games with population-based deep reinforcement learning
TLDR
It is demonstrated for the first time that an agent can achieve human-level performance in a popular 3D multiplayer first-person video game, Quake III Arena Capture the Flag, using only pixels and game points as input.
Policy Gradient Methods for Reinforcement Learning with Function Approximation
TLDR
This paper proves for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.
Mastering the game of Go without human knowledge
TLDR
An algorithm based solely on reinforcement learning is introduced, without human data, guidance or domain knowledge beyond game rules, that achieves superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.
Q-learning
TLDR
This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989), showing that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely.
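The convergence conditions above (every action repeatedly sampled in every state, with discretely represented action-values) can be illustrated with a minimal tabular Q-learning sketch on a toy chain environment; the environment and hyperparameters here are invented for illustration:

```python
import random

# Toy 1-D chain: states 0..4, move left (-1) or right (+1),
# reward 1.0 on reaching the terminal goal state 4.
N_STATES = 5
ACTIONS = (-1, +1)

def q_learning(episodes=2000, alpha=0.5, gamma=0.9, eps=0.3, seed=0):
    rng = random.Random(seed)
    # discrete (tabular) action-values, as the theorem requires
    q = {(s, a): 0.0 for s in range(N_STATES - 1) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != N_STATES - 1:
            # epsilon-greedy keeps sampling every action in every state
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda b: q[(s, b)])
            s2 = min(max(s + a, 0), N_STATES - 1)
            r = 1.0 if s2 == N_STATES - 1 else 0.0
            target = r if s2 == N_STATES - 1 else \
                r + gamma * max(q[(s2, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
    return q

q = q_learning()
# Q(s, +1) approaches gamma**(3 - s), e.g. Q(3, +1) -> 1.0
```

If exploration were switched off (eps=0), some state-action pairs would stop being sampled and the convergence guarantee would no longer apply, which is exactly the condition the theorem makes precise.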
...