A Latent Variable Recurrent Neural Network for Discourse-Driven Language Models

Yangfeng Ji, Gholamreza Haffari, Jacob Eisenstein
This paper presents a novel latent variable recurrent neural network architecture for jointly modeling sequences of words and (possibly latent) discourse relations between adjacent sentences. A recurrent neural network generates individual words, thus reaping the benefits of discriminatively-trained vector representations. The discourse relations are represented with a latent variable, which can be predicted or marginalized, depending on the task. The resulting model can therefore employ a… 
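The abstract describes a discrete latent discourse relation that can either be marginalized out (for language modeling) or predicted (for relation classification). The following toy sketch illustrates that idea only; it is not the authors' implementation, and the relation inventory, prior, and word probabilities are all made-up assumptions standing in for what the RNN would compute.

```python
# Hypothetical toy sketch of a sentence model conditioned on a discrete latent
# discourse relation z. For language modeling, z is marginalized out; for
# relation classification, we take the argmax of the (unnormalized) posterior.
# All tables below are illustrative stand-ins for the RNN's outputs.

RELATIONS = ["Contrast", "Expansion"]          # assumed toy relation inventory
PRIOR = {"Contrast": 0.4, "Expansion": 0.6}    # p(z), assumed known here

# p(word | z): in the real model this would come from an RNN whose output
# layer is modulated by the relation; here it is a fixed toy table.
WORD_GIVEN_Z = {
    "Contrast":  {"but": 0.5, "and": 0.1, "rain": 0.4},
    "Expansion": {"but": 0.1, "and": 0.5, "rain": 0.4},
}

def sentence_likelihood(words, z):
    """p(words | z), under a crude word-independence assumption."""
    p = 1.0
    for w in words:
        p *= WORD_GIVEN_Z[z][w]
    return p

def marginal_likelihood(words):
    """p(words) = sum_z p(z) * p(words | z): used when relations are latent."""
    return sum(PRIOR[z] * sentence_likelihood(words, z) for z in RELATIONS)

def predict_relation(words):
    """argmax_z p(z) * p(words | z): used when the relation is the target."""
    return max(RELATIONS, key=lambda z: PRIOR[z] * sentence_likelihood(words, z))

print(marginal_likelihood(["but", "rain"]))   # 0.4*0.5*0.4 + 0.6*0.1*0.4 = 0.104
print(predict_relation(["but", "rain"]))      # Contrast
```

The same per-relation likelihoods serve both tasks, which is the structural point the abstract makes about predicting versus marginalizing the latent variable.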


A Systematic Study of Neural Discourse Models for Implicit Discourse Relation

To the authors' surprise, the best-configured feedforward architecture outperforms LSTM-based models in most cases despite thorough tuning, suggesting that feedforward models can actually be more effective for this task.

Variational Neural Discourse Relation Recognizer

This paper introduces neural discourse relation models to approximate the prior and posterior distributions of the latent variable, and employs these approximate distributions to optimize a reparameterized variational lower bound, allowing VarNDRR to be trained with standard stochastic gradient methods.
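The summary above hinges on the reparameterization trick: writing the latent sample as z = mu + sigma * eps with external noise eps makes a Monte Carlo estimate of the variational lower bound differentiable in the variational parameters. The sketch below illustrates only that generic trick with a scalar Gaussian latent; it is not VarNDRR's code, and the likelihood function is an assumed toy.

```python
import math
import random

def elbo_estimate(mu, log_sigma, log_lik, n_samples=1000, seed=0):
    """Monte Carlo estimate of E_q[log p(x|z)] - KL(q(z) || N(0,1)),
    where q(z) = N(mu, sigma^2) and samples are reparameterized as
    z = mu + sigma * eps with eps ~ N(0,1)."""
    rng = random.Random(seed)
    sigma = math.exp(log_sigma)
    # KL between N(mu, sigma^2) and the standard normal prior, closed form:
    kl = 0.5 * (mu**2 + sigma**2 - 1.0) - log_sigma
    recon = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        z = mu + sigma * eps   # noise is external, so gradients flow to mu, sigma
        recon += log_lik(z)
    return recon / n_samples - kl

# Toy likelihood peaked at z = 0: log p(x|z) = -0.5 * z^2 (up to a constant).
toy_log_lik = lambda z: -0.5 * z * z
print(elbo_estimate(0.0, 0.0, toy_log_lik))   # near -0.5 (KL term is 0 here)
```

Moving mu away from the prior mean lowers the estimate, since both the KL penalty and the reconstruction term worsen; in a real model, gradient ascent on such an estimate drives the training the summary describes.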

A Dual-Attention Hierarchical Recurrent Neural Network for Dialogue Act Classification

Experimental results show that by modelling topic as an auxiliary task, the proposed dual-attention hierarchical recurrent neural network can significantly improve DA classification, yielding better or comparable performance to the state-of-the-art method on three public datasets.

Probabilistic Word Association for Dialogue Act Classification with Recurrent Neural Networks

A novel probabilistic method of utterance representation, generated from keywords selected for their frequency of association with particular dialogue acts, is presented, and an RNN sentence model for out-of-context DA classification is described.

The Context-Dependent Additive Recurrent Neural Net

Experimental results on public datasets for dialogue modeling, contextual language modeling, and question answering show that the novel CARNN-based architectures outperform previous methods.

Neural Net Models of Open-domain Discourse Coherence

This work describes domain-independent neural models of discourse coherence that are capable of measuring multiple aspects of coherence in existing sentences and can maintain coherence while generating new sentences, marking an initial step in generating coherent texts given discourse contexts.

Tag-Enhanced Tree-Structured Neural Networks for Implicit Discourse Relation Classification

This work employs the Tree-LSTM and Tree-GRU models, which are based on the tree structure, to encode the arguments of a relation, and further leverages constituent tags to control the semantic composition process in these tree-structured neural networks.

A Parallel Recurrent Neural Network for Language Modeling with POS Tags

A parallel structure containing two recurrent neural networks, one for word-sequence modeling and another for part-of-speech-sequence modeling, is proposed; it achieves better perplexity than a traditional recurrent network and is better at reranking machine translation outputs.

A Recurrent Neural Model with Attention for the Recognition of Chinese Implicit Discourse Relations

An attention-based Bi-LSTM for Chinese implicit discourse relations is introduced and it is demonstrated that modeling argument pairs as a joint sequence can outperform word order-agnostic approaches.

Context dependent recurrent neural network language model

This paper improves recurrent neural network language model performance by providing, alongside each word, a contextual real-valued input vector that conveys information about the sentence being modeled; the vector is obtained by performing Latent Dirichlet Allocation on a block of preceding text.
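The idea summarized above, conditioning each word's input on a topic summary of preceding text, can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the toy "topic model" is a fixed keyword table, not real LDA, and the function names are invented.

```python
# Illustrative sketch (not the paper's code): summarize preceding text as a
# topic-proportion vector and append it to each word's input vector, so a
# recurrent model would see document-level context at every step.

TOPICS = {                      # assumed toy topic-word weights (not real LDA)
    "finance": {"bank": 2.0, "loan": 2.0, "rain": 0.1},
    "weather": {"rain": 2.0, "sun": 2.0, "bank": 0.1},
}

def topic_vector(preceding_words):
    """Normalized topic proportions for a block of preceding text."""
    scores = {t: sum(weights.get(w, 0.0) for w in preceding_words)
              for t, weights in TOPICS.items()}
    total = sum(scores.values()) or 1.0
    return [scores[t] / total for t in sorted(TOPICS)]  # fixed topic order

def contextual_input(word_embedding, preceding_words):
    """Concatenate a word's embedding with the document-context vector."""
    return list(word_embedding) + topic_vector(preceding_words)

# A 2-d word embedding becomes a 4-d input once context is appended.
x = contextual_input([0.1, 0.2], ["rain", "sun"])
print(len(x))   # 4
```

In the paper's setting the context vector would come from LDA inference over preceding text and be fed to the RNN alongside the word; the sketch only shows the concatenation pattern.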

Recurrent Convolutional Neural Networks for Discourse Compositionality

The discourse model coupled to the sentence model obtains state of the art performance on a dialogue act classification experiment and is able to capture both the sequentiality of sentences and the interaction between different speakers.

Hierarchical Recurrent Neural Network for Document Modeling

A novel hierarchical recurrent neural network language model (HRNNLM) for document modeling is proposed; it integrates sentence history information into the word-level RNN to predict the word sequence using cross-sentence contextual information.

Larger-Context Language Modelling

It is found that content words, including nouns, adjectives, and verbs, benefit most from an increasing number of context sentences; this analysis suggests that the larger-context language model improves over the unconditional language model by better and more easily capturing the theme of a document.

Document Context Language Models

A set of multi-level recurrent neural network language models, called Document-Context Language Models (DCLM), which incorporate contextual information both within and beyond the sentence, are presented and empirically evaluated.

Improving the Inference of Implicit Discourse Relations via Classifying Explicit Discourse Connectives

This work investigates the interaction between discourse connectives and discourse relations, and proposes criteria for selecting discourse connectives that can be dropped independently of the context without changing the interpretation of the discourse.

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

A Sentiment Treebank with fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences is introduced; it presents new challenges for sentiment compositionality, which are addressed with the Recursive Neural Tensor Network.

Representation Learning for Text-level Discourse Parsing

A representation learning approach, in which surface features are transformed into a latent space that facilitates RST discourse parsing, which obtains substantial improvements over the previous state-of-the-art in predicting relations and nuclearity on the RST Treebank.

Automatic sense prediction for implicit discourse relations in text

We present a series of experiments on automatically identifying the sense of implicit discourse relations, i.e. relations that are not marked with a discourse connective such as "but" or "because".

Recursive Deep Models for Discourse Parsing

This paper proposes a recursive model for discourse parsing that jointly models distributed representations for clauses, sentences, and entire discourses, obtaining performance comparable to current state-of-the-art systems on standard discourse parsing evaluations.