An attentional mechanism has lately been used to improve neural machine translation (NMT) by selectively focusing on parts of the source sentence during translation. However, there has been little work exploring useful architectures for attention-based NMT. This paper examines two simple and effective classes of attentional mechanism: a global approach …
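As an illustration of the global attention idea described above, here is a minimal NumPy sketch: the current decoder state attends over all encoder states via a dot-product score followed by a softmax, and the context vector is the resulting weighted sum. Dot-product scoring is only one of several scoring variants; the shapes and names here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def global_attention(decoder_state, encoder_states):
    """Minimal global attention sketch (dot-product scoring).

    decoder_state:  (d,)   current decoder hidden state
    encoder_states: (T, d) hidden states for all T source positions
    Returns the context vector and the attention weights.
    """
    scores = encoder_states @ decoder_state     # (T,) alignment scores
    weights = np.exp(scores - scores.max())     # numerically stable softmax
    weights /= weights.sum()
    context = weights @ encoder_states          # (d,) weighted sum of sources
    return context, weights

# Toy usage: 5 source positions, hidden size 4.
rng = np.random.default_rng(0)
ctx, w = global_attention(rng.normal(size=4), rng.normal(size=(5, 4)))
print(w.sum())  # attention weights sum to 1
```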
Recent work in learning bilingual representations tends to be tailored towards achieving good performance on bilingual tasks, most often the crosslingual document classification (CLDC) evaluation, but to the detriment of preserving the clustering structure of word representations monolingually. In this work, we propose a joint model to learn word representations …
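The abstract is cut off before the model details, so the following is only a generic sketch of the kind of joint objective it describes: one monolingual term per language plus a weighted crosslingual alignment term. The specific form of each term, the translation-pair alignment, and the weight alpha are assumptions for illustration, not the paper's actual model.

```python
import numpy as np

def cross_alignment(E_src, E_tgt, pairs):
    """Illustrative crosslingual term: squared distance between
    embeddings of known translation pairs (i, j)."""
    return sum(float(np.sum((E_src[i] - E_tgt[j]) ** 2)) for i, j in pairs)

def joint_loss(loss_mono_src, loss_mono_tgt, cross_term, alpha=1.0):
    """Hypothetical joint objective: keep each language's monolingual
    structure while pulling translation pairs together; alpha trades off
    the crosslingual term against the monolingual ones."""
    return loss_mono_src + loss_mono_tgt + alpha * cross_term

rng = np.random.default_rng(0)
E_src, E_tgt = rng.normal(size=(100, 16)), rng.normal(size=(100, 16))
print(joint_loss(0.9, 1.1, cross_alignment(E_src, E_tgt, [(0, 3), (5, 5)])))
```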
This paper presents a framework to tackle combinatorial optimization problems using neural networks and reinforcement learning. We focus on the traveling salesman problem (TSP) and train a recurrent neural network that, given a set of city coordinates, predicts a distribution over different city permutations. Using negative tour length as the reward signal, …
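To make the reward signal concrete, here is a small sketch: given city coordinates and a sampled permutation, the reward is the negative length of the closed tour, which a policy-gradient update would then use to score sampled tours. The uniform random permutation below is a stand-in for a sample from the recurrent policy network, which is not shown.

```python
import numpy as np

def tour_length(coords, perm):
    """Total length of the closed tour visiting cities in order `perm`."""
    ordered = coords[perm]
    diffs = ordered - np.roll(ordered, -1, axis=0)  # wrap back to the start
    return np.sqrt((diffs ** 2).sum(axis=1)).sum()

def reward(coords, perm):
    """Negative tour length: shorter tours get higher reward."""
    return -tour_length(coords, perm)

rng = np.random.default_rng(0)
coords = rng.random((10, 2))   # 10 random cities in the unit square
perm = rng.permutation(10)     # stand-in for a tour sampled from the policy
print(reward(coords, perm))
```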
We propose a novel approach to learning distributed representations of variable-length text sequences in multiple languages simultaneously. Unlike previous work, which often derives representations of multi-word sequences as weighted sums of individual word vectors, our model learns distributed representations for phrases and sentences as a whole. Our work …
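For contrast with the whole-sequence representations proposed here, the weighted-sum baseline the abstract mentions fits in a few lines; the uniform default weights are an assumption, since prior work varies in how the weights are chosen.

```python
import numpy as np

def weighted_sum_phrase(word_vectors, weights=None):
    """Baseline phrase representation: a (weighted) average of word vectors.

    word_vectors: (n_words, d); weights default to uniform averaging."""
    if weights is None:
        weights = np.full(len(word_vectors), 1.0 / len(word_vectors))
    return weights @ word_vectors

phrase = weighted_sum_phrase(np.random.default_rng(0).normal(size=(3, 50)))
print(phrase.shape)  # (50,): one vector for the whole phrase
```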
The past few years have witnessed a growth in the size and computational requirements of training and inference with neural networks. Currently, a common approach to addressing these requirements is to use a heterogeneous distributed environment with a mixture of hardware devices such as CPUs and GPUs. Importantly, the decision of placing parts of the neural network …
Motivation and contribution. Recurrent neural networks (RNN) with long short-term memory (LSTM) have recently been proposed to model sequences without prior domain knowledge [3, 6]. In these works, the authors empirically observed that RNN-LSTMs trained with vanilla optimization algorithms, such as stochastic gradient descent (SGD) with a simple learning rate …
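For reference, a "vanilla" optimizer of the kind mentioned, plain SGD with a simple step-decay learning-rate schedule, looks like the sketch below; the decay factor and schedule are illustrative assumptions, not the settings from the cited works.

```python
def sgd_step(params, grads, lr):
    """One plain SGD update: params <- params - lr * grads."""
    return [p - lr * g for p, g in zip(params, grads)]

def step_decay(lr0, epoch, drop=0.5, every=4):
    """Illustrative schedule: halve the learning rate every `every` epochs."""
    return lr0 * (drop ** (epoch // every))

for epoch in range(10):
    lr = step_decay(1.0, epoch)
    # params = sgd_step(params, grads, lr)  # would run inside the training loop
    print(epoch, lr)
```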
Today we will talk about algorithms for finding shortest paths in a graph. We will also talk about algorithms for finding the diameter of a graph. There are two types of shortest-path problems: single-source shortest paths and all-pairs shortest paths. Single-source shortest paths. In the single-source shortest paths problem (SSSP), we are given a graph G = (V, E) …
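The notes are truncated before naming specific algorithms, but the usual textbook choice for SSSP with non-negative edge weights is Dijkstra's algorithm; a compact binary-heap version, running in O((V + E) log V), is sketched below. The diameter mentioned above is then the maximum over all-pairs distances, obtainable for instance by running this from every source.

```python
import heapq

def dijkstra(graph, source):
    """Single-source shortest paths for non-negative edge weights.

    graph: {u: [(v, w), ...]} adjacency list; returns {v: distance}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry, skip it
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

g = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
print(dijkstra(g, "a"))  # {'a': 0, 'b': 1, 'c': 3}
```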
This paper describes an evolutionary strategy called PSOGA-NN, which uses a Neural Network (NN) for self-adaptive control of a hybrid Particle Swarm Optimization and Adaptive Plan system with Genetic Algorithm (PSO-APGA) to solve large-scale problems and constrained real-parameter optimization. This approach combines the search ability of all optimization …
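As background for the PSO component of the hybrid, a minimal particle swarm loop with the standard inertia/cognitive/social velocity update is sketched below; the GA crossover and mutation stage and the NN-based self-adaptive parameter control that PSOGA-NN adds on top are not shown, and all coefficient values are generic defaults rather than the paper's settings.

```python
import numpy as np

def pso_minimize(f, dim, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO: standard velocity/position updates only (no GA, no NN)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))      # particle positions
    v = np.zeros_like(x)                            # particle velocities
    pbest = x.copy()                                # per-particle best positions
    pbest_val = np.apply_along_axis(f, 1, x)
    gbest = pbest[pbest_val.argmin()].copy()        # global best position
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        vals = np.apply_along_axis(f, 1, x)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

best_x, best_val = pso_minimize(lambda z: np.sum(z ** 2), dim=3)
print(best_val)  # should approach 0 on the sphere function
```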