AlphaSeq: Sequence Discovery With Deep Reinforcement Learning

@article{Shao2020AlphaSeqSD,
  title={AlphaSeq: Sequence Discovery With Deep Reinforcement Learning},
  author={Yulin Shao and Soung Chang Liew and Taotao Wang},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2020},
  volume={31},
  pages={3319--3333}
}
Sequences play an important role in many applications and systems. Discovering sequences with desired properties has long been an interesting intellectual pursuit. This article puts forth a new paradigm, AlphaSeq, to discover desired sequences algorithmically using deep reinforcement learning (DRL) techniques. AlphaSeq treats the sequence discovery problem as an episodic symbol-filling game, in which a player fills symbols in the vacant positions of a sequence set sequentially during an episode… 
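The symbol-filling game described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract is truncated and does not specify the player or the reward, so the random player, the single binary (+1/-1) sequence standing in for a sequence set, the `VACANT` marker, and the autocorrelation-based score are all assumptions made here for illustration.

```python
import random

VACANT = 0  # hypothetical marker for a position that has not been filled yet


def peak_sidelobe(seq):
    """Largest |aperiodic autocorrelation| over all nonzero shifts (len >= 2)."""
    n = len(seq)
    return max(
        abs(sum(seq[i] * seq[i + k] for i in range(n - k)))
        for k in range(1, n)
    )


def play_episode(length, choose_symbol):
    """Fill each vacant position once, left to right, then score the result."""
    seq = [VACANT] * length
    for pos in range(length):
        seq[pos] = choose_symbol(seq, pos)  # the player picks +1 or -1
    reward = -peak_sidelobe(seq)  # assumed reward: lower sidelobes score higher
    return seq, reward


def random_player(seq, pos):
    """Stand-in for a learned policy: picks a symbol uniformly at random."""
    return random.choice((-1, 1))


if __name__ == "__main__":
    random.seed(0)
    # Keep the best of 200 random episodes for a length-13 sequence.
    best = max(
        (play_episode(13, random_player) for _ in range(200)),
        key=lambda result: result[1],
    )
    print(best)
```

In this sketch the reward is only available once the episode ends, which is what makes the game episodic. A DRL agent would replace `random_player` with a policy that conditions on the partially filled sequence.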
HpGAN: Sequence Search with Generative Adversarial Networks
TLDR
This article proposes a novel method, called HpGAN, to search for desired sequences algorithmically using generative adversarial networks (GANs). It uses the idea of a zero-sum game to train a generative model that can generate sequences with characteristics similar to the training sequences.
Phase Code Discovery for Pulse Compression Radar: A Genetic Algorithm Approach
TLDR
The developed GA, dubbed GASeq, discovers better phase codes than the state of the art, and enables us to search phase codes with a longer code length, which thwarts existing deep learning-based approaches.
Sensible Artificial Intelligence that plays Go
TLDR
A multiple-komi modification of the AlphaGo Zero/Leela Zero paradigm is proposed, based on self-play games that occasionally branch (with changed komi) when the position is uneven, and reinforcement learning is shown to work on 7×7 Go, obtaining very strong playing agents.
Design of Deterministic Grant-Free Access with Deep Reinforcement Learning
TLDR
A deep reinforcement learning (DRL) based algorithm is put forth to search for IC codes, with carefully designed metrics and reward functions as per the underlying mathematical constraints; results indicate that the algorithm can efficiently discover IC codes and yield a significantly lower failure probability than the random access protocol given the same latency requirements.
SAI a Sensible Artificial Intelligence that plays Go
TLDR
A multiple-komi modification of the AlphaGo Zero/Leela Zero paradigm is proposed, based on self-play games that occasionally branch (with changed komi) when the position is uneven, and reinforcement learning is shown to work on 7×7 Go, obtaining very strong playing agents.
Deep Spiking Neural Network with Neural Oscillation and Spike-Phase Information
TLDR
Inspired by biological neural networks, a Spike-Level-Dependent Back-Propagation (SLDBP) learning algorithm for DSNNs is proposed, and a new spiking neuron model, the Resonate Spiking Neuron (RSN), is put forward.
New Transceiver Designs for Interleaved Frequency-Division Multiple Access
TLDR
This paper puts forth a class of new transceiver designs for interleaved frequency division multiple access (IFDMA) systems that are significantly less complex than conventional IFDMA transceivers.
Sporadic Ultra-Time-Critical Crowd Messaging in V2X
TLDR
This paper adopts an override network architecture whereby warning messages are delivered on the spectrum of the ordinary vehicular messages, and employs advanced channel access techniques to ensure reliable message delivery within an ultra-short time on the order of 10 ms.

References

Mastering the game of Go with deep neural networks and tree search
TLDR
Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.
Mastering the game of Go without human knowledge
TLDR
An algorithm based solely on reinforcement learning is introduced, without human data, guidance or domain knowledge beyond game rules, that achieves superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.
Deep Reinforcement Learning: An Overview
TLDR
This work discusses core RL elements, including value function, in particular, Deep Q-Network (DQN), policy, reward, model, planning, and exploration, and important mechanisms for RL, including attention and memory, unsupervised learning, transfer learning, multi-agent RL, hierarchical RL, and learning to learn.
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
TLDR
This paper generalises the approach into a single AlphaZero algorithm that can achieve, tabula rasa, superhuman performance in many challenging domains, and convincingly defeated a world-champion program in each case.
Human-level control through deep reinforcement learning
TLDR
This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
The On-Line Encyclopedia of Integer Sequences
  • N. Sloane
  • Computer Science
    Electron. J. Comb.
  • 1994
TLDR
The On-Line Encyclopedia of Integer Sequences (or OEIS) is a database of some 130000 number sequences which serves as a dictionary, to tell the user what is known about a particular sequence and is widely used.
Markov Decision Processes: Discrete Stochastic Dynamic Programming
  • M. Puterman
  • Computer Science
    Wiley Series in Probability and Statistics
  • 1994
TLDR
Markov Decision Processes covers recent research advances in such areas as countable state space models with average reward criterion, constrained models, and models with risk sensitive optimality criteria, and explores several topics that have received little or no attention in other books.
What can be used instead of a Barker sequence?
  • Contemporary Mathematics
TLDR
A classical problem of digital sequence design is to determine long binary sequences for which the absolute values of the aperiodic autocorrelations are collectively as small as possible, but there is overwhelming evidence that no Barker sequence of length greater than 13 exists.
A Survey of Monte Carlo Tree Search Methods
TLDR
A survey of the literature to date on Monte Carlo tree search, intended to provide a snapshot of the state of the art after the first five years of MCTS research, outlines the core algorithm's derivation, imparts some structure on the many variations and enhancements that have been proposed, and summarizes the results from the key game and nongame domains.
Crosscorrelation properties of pseudorandom and related sequences
TLDR
This paper presents a survey of recent results and provides several new results on the periodic and aperiodic crosscorrelation functions for pairs of m-sequences and for pairs of related (but not maximal-length) binary shift register sequences.