• Corpus ID: 28409722

Sequence stacking using dual encoder Seq2Seq recurrent networks

Alessandro Bay, Biswa Sengupta
A widely studied NP-hard (non-deterministic polynomial-time hard) problem lies in finding a route between two nodes of a graph. Meta-heuristic algorithms such as $A^{*}$ are often employed on graphs with a large number of nodes. Here, we propose a deep recurrent neural network architecture based on the Sequence-2-Sequence model, which is widely used, for instance, in text translation. In particular, we illustrate that utilising a context vector that has been learned from two different recurrent networks enables increased… 
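For orientation, the kind of graph search the abstract names as a baseline can be sketched as below. This is a minimal illustration of $A^{*}$, not the authors' code: the adjacency-list format, the function name `a_star`, the `heuristic` parameter, and the toy graph are all assumptions made for the example. With the default zero heuristic the search reduces to Dijkstra's algorithm.

```python
import heapq


def a_star(graph, start, goal, heuristic=lambda node: 0.0):
    """Return (cost, path) for a cheapest route, or (inf, []) if none exists.

    graph: dict mapping node -> list of (neighbour, edge_cost) pairs
           (an assumed format, chosen for this sketch).
    heuristic: admissible estimate of the remaining cost to the goal.
    """
    # Frontier entries are (estimated_total, cost_so_far, node, path).
    frontier = [(heuristic(start), 0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found
        for neighbour, edge_cost in graph.get(node, []):
            new_cost = cost + edge_cost
            if new_cost < best_cost.get(neighbour, float("inf")):
                best_cost[neighbour] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost + heuristic(neighbour), new_cost,
                     neighbour, path + [neighbour]),
                )
    return float("inf"), []


# Hypothetical toy graph, not taken from the paper.
graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
print(a_star(graph, "A", "D"))  # (4.0, ['A', 'B', 'C', 'D'])
```

A learned model such as the one proposed in the paper would be trained to reproduce node sequences like the path returned here; how the training data are actually generated is not stated in the truncated abstract.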


GeoSeq2Seq: Information Geometric Sequence-to-Sequence Networks
This work proposes the information geometric Seq2Seq (GeoSeq2Seq) network, which bridges the gap between deep recurrent neural networks and information geometry, and uses it to predict the shortest routes between two nodes of a graph by learning the adjacency matrix using the GeoSeq2Seq formalism.


Approximating meta-heuristics with homotopic recurrent neural networks
This work demonstrates that it is possible to approximate solutions generated by a meta-heuristic algorithm using a deep recurrent neural network, and argues that a sequence-to-sequence network gives better approximation quality than other recurrent networks.
Sequence to Sequence Learning with Neural Networks
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
Pointer Networks
This work introduces Pointer Networks (Ptr-Nets), a neural architecture that learns the conditional probability of an output sequence whose elements are discrete tokens corresponding to positions in an input sequence. Built on a recently proposed neural attention mechanism, Ptr-Nets improve over sequence-to-sequence with input attention and also generalize to variable-size output dictionaries.
Training Recurrent Neural Networks by Diffusion
This work presents a new algorithm for training recurrent neural networks, derived from a theory in nonconvex optimization related to the diffusion equation, that can achieve a level of generalization accuracy similar to SGD in far fewer epochs.
Hybrid computing using a neural network with dynamic external memory
This work introduces a machine learning model called a differentiable neural computer (DNC), which consists of a neural network that can read from and write to an external memory matrix, analogous to the random-access memory in a conventional computer.
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches
It is shown that neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase.
Long Short-Term Memory
A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Adam: A Method for Stochastic Optimization
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Neural networks for shortest path computation and routing in computer networks
This work proposes an efficient neural network shortest path algorithm, an improved version of previously suggested Hopfield models, that enables the routing algorithm to be implemented in real time and to adapt to changes in link costs and network topology.
A note on two problems in connexion with graphs
E. Dijkstra, Numerische Mathematik, 1959 (Mathematics, Computer Science)
The paper considers a graph in which at least one path exists between any two nodes and the length of each branch is given; a tree is a graph with one and only one path between every two nodes.