• Corpus ID: 11512678

A Convolutional Architecture for Word Sequence Prediction

  title={A Convolutional Architecture for Word Sequence Prediction},
  author={Mingxuan Wang and Zhengdong Lu and Hang Li and Wenbin Jiang and Qun Liu},
We propose a convolutional neural network, named genCNN, for word sequence prediction. Different from previous work on neural networkbased language modeling and generation (e.g., RNN or LSTM), we choose not to greedily summarize the history of words as a fixed length vector. Instead, we use a convolutional neural network to predict the next word with the history of words of variable length. Also different from the existing feedforward networks for language modeling, our model can effectively… 

Figures and Tables from this paper

An Empirical Study of Language CNN for Image Captioning

This paper introduces a language CNN model which is suitable for statistical language modeling tasks and shows competitive performance in image captioning, and is competitive with the state-of-the-art methods.

Recurrent neural network language models for automatic speech recognition

A novel pre-training method for RNNLMs is presented, in which the output weights of a feed-forward neural network language model (NNLM) are shared with the RNNLM, and the context-enhancement of RNNlMs is enhanced using prosody and syntactic features.

Recurrent Highway Networks with Language CNN for Image Captioning

This model can naturally exploit the hierarchical and temporal structure of history words, which are critical for image caption generation, and is competitive with the state-of-the-art methods.

Character-Level Quantum Mechanical Approach for a Neural Language Model

A character-level neural language model (NLM) that is based on quantum theory that integrates a convolutional neural network that isbased on network-in-network (NIN) and the quantum semantic space model.

Learning Semantically Coherent and Reusable Kernels in Convolution Neural Nets for Sentence Classification

Experimental results show the usefulness of the core ideas of learning semantically coherent kernels and leveraging reusable kernels for efficient learning on several benchmark datasets by achieving performance close to the state-of-the-art methods but with semantic and reusable properties.

Locally Smoothed Neural Networks

A novel locally smoothed neural network (LSNN) is proposed, which is to represent the weight matrix of the locally connected layer as the product of the kernel and the smoother, where the kernel is shared over different local receptive fields, and the smoothness is for determining the importance and relations of differentLocal receptive fields.

Topic Augmented Neural Network for Short Text Conversation

The extensive evaluation of TANN, using large human annotated data sets, shows that TANN outperforms simple neutral network methods, while beating other typical matching models with a large margin.

Response selection with topic clues for retrieval-based chatbots

Deep Learning applied to NLP

This paper will try to explain the basics of CNNs, its different variations and how they have been applied to NLP.

Dialogue Systems & Dialogue Management

This document provides a general overview of the key issues surrounding the development of SDSs, providing an engaging spoken interface to high-level information fusion software.



A Convolutional Neural Network for Modelling Sentences

A convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) is described that is adopted for the semantic modelling of sentences and induces a feature graph over the sentence that is capable of explicitly capturing short and long-range relations.

Neural Machine Translation by Jointly Learning to Align and Translate

It is conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.

Convolutional Neural Network Architectures for Matching Natural Language Sentences

Convolutional neural network models for matching two sentences are proposed, by adapting the convolutional strategy in vision and speech and nicely represent the hierarchical structures of sentences with their layer-by-layer composition and pooling.

A Neural Probabilistic Language Model

This work proposes to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences.

Recurrent Continuous Translation Models

We introduce a class of probabilistic continuous translation models called Recurrent Continuous Translation Models that are purely based on continuous representations for words, phrases and sentences

Decoding with Large-Scale Neural Language Models Improves Translation

This work develops a new model that combines the neural probabilistic language model of Bengio et al., rectified linear units, and noise-contrastive estimation, and incorporates it into a machine translation system both by reranking k-best lists and by direct integration into the decoder.

Improving deep neural networks for LVCSR using rectified linear units and dropout

Modelling deep neural networks with rectified linear unit (ReLU) non-linearities with minimal human hyper-parameter tuning on a 50-hour English Broadcast News task shows an 4.2% relative improvement over a DNN trained with sigmoid units, and a 14.4% relative improved over a strong GMM/HMM system.

Domain Adaptation via Pseudo In-Domain Data Selection

The results show that more training data is not always better, and that best results are attained via proper domain-relevant data selection, as well as combining in- and general-domain systems during decoding.

Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition

The proposed CNN architecture is applied to speech recognition within the framework of hybrid NN-HMM model to use local filtering and max-pooling in frequency domain to normalize speaker variance to achieve higher multi-speaker speech recognition performance.

ImageNet classification with deep convolutional neural networks

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.