Evaluating Discourse in Structured Text Representations

@article{Ferracane2019EvaluatingDI,
  title={Evaluating Discourse in Structured Text Representations},
  author={Elisa Ferracane and Greg Durrett and Junyi Jessy Li and Katrin Erk},
  journal={arXiv preprint arXiv:1906.01472},
  year={2019}
}
Discourse structure is integral to understanding a text and is helpful in many NLP tasks. Learning latent representations of discourse is an attractive alternative to acquiring expensive labeled discourse data. Liu and Lapata (2018) propose a structured attention mechanism for text classification that derives a tree over a text, akin to an RST discourse tree. We examine this model in detail, and evaluate on additional discourse-relevant tasks and datasets, in order to assess whether the…
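The structured attention of Liu and Lapata (2018) rests on Kirchhoff's Matrix-Tree Theorem (in the form of Koo et al., 2007): the marginal probability of every head-modifier edge under a distribution over non-projective dependency trees can be computed with a single matrix inverse, and those marginals serve as attention weights. A minimal NumPy sketch of that marginal computation (the function name and sanity check are mine, not the paper's):

```python
import numpy as np

def tree_marginals(edge_scores, root_scores):
    """Edge and root marginals over non-projective dependency trees via
    the Matrix-Tree Theorem (Koo et al., 2007), the device behind the
    structured attention of Liu and Lapata (2018).

    edge_scores: (n, n), edge_scores[h, m] = score of head h -> modifier m
    root_scores: (n,), score of node m being the root
    """
    n = edge_scores.shape[0]
    A = np.exp(edge_scores) * (1.0 - np.eye(n))  # edge potentials, no self-loops
    r = np.exp(root_scores)                      # root potentials
    L = np.diag(A.sum(axis=0)) - A               # graph Laplacian
    L_hat = L.copy()
    L_hat[0, :] = r                              # fold root scores into row 0
    L_inv = np.linalg.inv(L_hat)

    not_first = (np.arange(n) != 0).astype(float)
    # P(h -> m) = A[h,m] * ([m != 0] * L_inv[m,m] - [h != 0] * L_inv[m,h])
    marg = A * (not_first[None, :] * np.diag(L_inv)[None, :]
                - not_first[:, None] * L_inv.T)
    root_marg = r * L_inv[:, 0]                  # P(m is the root)
    return marg, root_marg

# Sanity check: every node gets exactly one head (a word or the root).
rng = np.random.default_rng(0)
marg, root_marg = tree_marginals(rng.normal(size=(5, 5)), rng.normal(size=5))
assert np.allclose(marg.sum(axis=0) + root_marg, 1.0)
```

Because the marginals are differentiable in the scores, they can act directly as attention weights, keeping the whole document encoder trainable end to end.

Citations
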
Representation learning in discourse parsing: A survey
This survey covers text-level discourse parsing, shallow discourse parsing, and coherence assessment, and traces a trend of discourse-structure-aware representation learning: exploiting discourse structures or discourse objectives to learn sentence and document representations, either for specific applications or for general-purpose use.
Unsupervised Learning of Discourse Structures using a Tree Autoencoder
Proposes a new strategy for generating tree structures in a task-agnostic, unsupervised fashion by extending a latent tree induction framework with an auto-encoding objective; the model infers general tree structures of natural text across multiple domains and shows promising results on a diverse set of tasks.
From neural discourse parsing to content structuring: towards a large-scale data-driven approach to discourse processing
Extends the work to natural language generation, demonstrating that a novel content structuring system using silver-standard discourse structures outperforms text-only systems on the proposed task of elementary discourse unit ordering, a significantly more difficult version of the sentence ordering task.
Predicting Discourse Structure using Distant Supervision from Sentiment
Proposes a novel approach that uses distant supervision on an auxiliary task (sentiment classification) to generate abundant data for RST-style discourse structure prediction, combining a neural variant of multiple-instance learning under document-level supervision with an optimal CKY-style tree generation algorithm (sketched below).
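The CKY-style step can be viewed as a textbook dynamic program: given a score for every candidate span of discourse units, find the binary tree whose constituents maximize the total score. A simplified, self-contained sketch, with the span-score table standing in for the paper's learned, model-derived scores:

```python
def best_binary_tree(span_scores):
    """CKY-style search for the highest-scoring binary tree over n leaves.

    span_scores[i][j]: score of a constituent covering leaves i..j
    (inclusive); here a plain table, in the paper a model prediction.
    Returns (tree, score), with leaves as indices, internal nodes as pairs.
    """
    n = len(span_scores)
    best = [[0.0] * n for _ in range(n)]
    split = [[None] * n for _ in range(n)]
    for i in range(n):
        best[i][i] = span_scores[i][i]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            # Pick the split point that maximizes the two subtrees' scores.
            score, k = max((best[i][k] + best[k + 1][j], k) for k in range(i, j))
            best[i][j] = score + span_scores[i][j]
            split[i][j] = k

    def build(i, j):
        if i == j:
            return i
        k = split[i][j]
        return (build(i, k), build(k + 1, j))

    return build(0, n - 1), best[0][n - 1]
```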
Predicting Discourse Trees from Transformer-based Neural Summarizers
Experiments across models and datasets reveal that the summarizer learns both dependency- and constituency-style discourse information, typically encoded in a single head, covering long- and short-distance discourse dependencies.
Centering-based Neural Coherence Modeling with Hierarchical Discourse Segments
Proposes a coherence model that takes discourse structural information into account without relying on human annotations, approximating a linguistic theory of coherence, Centering theory, to track changes of focus between discourse segments (illustrated below).
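Centering theory classifies how the focus of attention moves between adjacent segments. The model above learns a soft version of that bookkeeping; the toy function below shows only the symbolic intuition, under this sketch's assumption that each segment arrives as a salience-ranked entity list:

```python
def center_transitions(segments):
    """Classify Centering-theory transitions between adjacent segments.

    segments: list of entity lists, each ordered by salience (the Cf list).
    A CONTINUE keeps the same center in focus; SHIFTs signal focus changes
    and, under the theory, lower coherence.
    """
    labels, prev_cb = [], None
    for prev, cur in zip(segments, segments[1:]):
        # Backward-looking center Cb: the highest-ranked entity of the
        # previous segment that is realized in the current one.
        cb = next((e for e in prev if e in cur), None)
        cp = cur[0] if cur else None          # preferred center of `cur`
        if cb is None:
            labels.append("NO-CB")
        elif cb == cp:
            labels.append("CONTINUE" if prev_cb in (None, cb) else "SMOOTH-SHIFT")
        else:
            labels.append("RETAIN" if prev_cb in (None, cb) else "ROUGH-SHIFT")
        prev_cb = cb
    return labels

# "John" stays the center of the first two segments, then focus shifts.
print(center_transitions([["John", "shop"], ["John", "coat"], ["coat", "price"]]))
# -> ['CONTINUE', 'SMOOTH-SHIFT']
```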
MEGA RST Discourse Treebanks with Structure and Nuclearity from Scalable Distant Sentiment Supervision
Presents a novel, scalable methodology for automatically generating discourse treebanks using distant supervision from sentiment-annotated datasets, creating and publishing MEGA-DT, a new large-scale discourse-annotated corpus.
Towards Domain-Independent Text Structuring Trainable on Large Discourse Treebanks
Proposes the new, domain-independent NLG task of structuring and ordering a (possibly large) set of EDUs, and presents a solution that combines neural dependency tree induction with pointer networks and can be trained on large discourse treebanks that have only recently become available.
StructSum: Incorporating Latent and Explicit Sentence Dependencies for Single Document Summarization
Presents an extensive analysis of the generated summaries, showing that modeling document structure reduces the copying of long sequences and incorporates richer content from the source document, while maintaining comparable summary lengths and an increased degree of abstraction.

References

Showing 1-10 of 28 references.
Learning Structured Text Representations
Proposes a model (Liu and Lapata, 2018) that encodes a document while automatically inducing rich structural dependencies, embedding a differentiable non-projective parsing algorithm into a neural model and using attention mechanisms to incorporate the structural biases.
Discourse Structure in Machine Translation Evaluation
First designs discourse-aware similarity measures that use all-subtree kernels to compare discourse parse trees in accordance with Rhetorical Structure Theory (RST), then shows that a simple linear combination with these measures improves the correlation of various existing machine translation evaluation metrics with human judgments (a kernel sketch follows below).
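The all-subtree kernels here follow the convolution kernel of Collins and Duffy (2002): two trees are compared by implicitly counting their shared subtrees. A compact sketch over plain constituency-style trees; the article applies this to RST discourse trees with relation and nuclearity labels, which this simplified version omits:

```python
def production(node):
    """A node's production: its label plus its children's labels.
    Trees are nested tuples (label, child, ...) with string leaves."""
    return (node[0], tuple(c if isinstance(c, str) else c[0] for c in node[1:]))

def subtree_kernel(t1, t2, lam=0.5):
    """All-subtrees convolution kernel (Collins and Duffy, 2002).
    lam < 1 downweights contributions from larger subtrees."""
    def nodes(t):
        out = [t]
        for c in t[1:]:
            if not isinstance(c, str):
                out.extend(nodes(c))
        return out

    def common(n1, n2):
        if production(n1) != production(n2):
            return 0.0
        kids1 = [c for c in n1[1:] if not isinstance(c, str)]
        kids2 = [c for c in n2[1:] if not isinstance(c, str)]
        if not kids1:                 # matching preterminals
            return lam
        score = lam
        for a, b in zip(kids1, kids2):
            score *= 1.0 + common(a, b)
        return score

    return sum(common(a, b) for a in nodes(t1) for b in nodes(t2))

t = ("S", ("NP", "they"), ("VP", "agree"))
print(subtree_kernel(t, t))   # self-similarity of a tiny tree: 2.125
```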
Do latent tree learning models identify meaningful structure in sentences?
Replicates two latent tree learning models in a shared codebase and finds that only one of them outperforms conventional tree-structured models on sentence classification, and that its parsing strategies are not especially consistent across random restarts.
Single Document Summarization as Tree Induction
Designs a new iterative refinement algorithm that induces a multi-root dependency tree while predicting the output summary in single-document extractive summarization, performing competitively against state-of-the-art methods.
Neural Discourse Structure for Text Categorization
Uses a recursive neural network and a newly proposed attention mechanism to compute a representation of the text that focuses on salient content, from the perspective of both RST and the task.
Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints
Proposes a discriminative model for single-document summarization that integrally combines compression and anaphoricity constraints, outperforming prior work on both ROUGE and human judgments of linguistic quality.
A Linear-Time Bottom-Up Discourse Parser with Constraints and Post-Editing
Presents a much faster discourse parser whose time complexity is linear in the number of sentences, applying two linear-chain CRFs in cascade as local classifiers; a novel post-editing step, which modifies a fully built tree using information from constituents at upper levels, further improves accuracy.
Learning to Compose Words into Sentences with Reinforcement Learning
Uses reinforcement learning to learn tree-structured neural networks for computing representations of natural language sentences, showing that while the models discover some linguistically intuitive structures, these differ from conventional English syntactic structures.
Linguistically-Informed Self-Attention for Semantic Role Labeling
LISA is a neural network model that combines multi-head self-attention with multi-task learning across dependency parsing, part-of-speech tagging, predicate detection, and SRL, and can incorporate syntax using merely raw tokens as input (one attention head is trained toward the parse, as sketched below).
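The way LISA incorporates syntax can be caricatured in a few lines: one self-attention head carries an auxiliary loss pushing each token to attend to its syntactic head. The function below illustrates that loss with hypothetical names and shapes; it is not the authors' code:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def syntax_attention_loss(attn_logits, gold_heads):
    """Auxiliary loss nudging one self-attention head toward the parse.

    attn_logits: (n, n) pre-softmax scores of the designated head,
                 row i = token i's attention over all tokens.
    gold_heads:  (n,) index of each token's syntactic head.
    """
    probs = softmax(attn_logits)
    n = attn_logits.shape[0]
    # Cross-entropy against a one-hot distribution on the gold head.
    return -np.log(probs[np.arange(n), gold_heads] + 1e-12).mean()

# Toy usage: 4 tokens whose heads are all token 1 (e.g. the main verb).
loss = syntax_attention_loss(np.random.randn(4, 4), np.array([1, 1, 1, 1]))
```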
Modeling Local Coherence: An Entity-Based Approach
Re-conceptualizes coherence assessment as a learning task and shows that the proposed entity-grid representation of discourse is well suited for ranking-based generation and text classification tasks (a minimal grid sketch follows below).
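An entity grid represents a text as a matrix of entities by sentences, each cell recording the entity's grammatical role; coherence features are the distribution of role transitions down each column. A minimal sketch; the toy grid is hypothetical, and in practice roles come from parsing and coreference resolution:

```python
from collections import Counter
from itertools import product

def entity_grid_features(grid, history=2):
    """Transition features from an entity grid (Barzilay and Lapata, 2008).

    grid: entity -> list of per-sentence roles, each one of
          'S' (subject), 'O' (object), 'X' (other), '-' (absent).
    Returns the probability of every role transition of length `history`.
    """
    counts, total = Counter(), 0
    for column in grid.values():
        for i in range(len(column) - history + 1):
            counts[tuple(column[i:i + history])] += 1
            total += 1
    return {t: counts[t] / total for t in product("SOX-", repeat=history)}

# Hypothetical grid for a three-sentence document with two entities.
grid = {
    "Microsoft": ["S", "O", "S"],
    "market":    ["-", "X", "-"],
}
feats = entity_grid_features(grid)
print(feats[("S", "O")])   # 0.25: one of four observed transitions
```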