• Corpus ID: 222341644

# Summarize, Outline, and Elaborate: Long-Text Generation via Hierarchical Supervision from Extractive Summaries

@article{Sun2020SummarizeOA,
title={Summarize, Outline, and Elaborate: Long-Text Generation via Hierarchical Supervision from Extractive Summaries},
author={Xiaofei Sun and Chun Fan and Zijun Sun and Yuxian Meng and Fei Wu and Jiwei Li},
journal={ArXiv},
year={2020},
volume={abs/2010.07074}
}
• Published 14 October 2020 • Computer Science • ArXiv
Long-text generation remains a challenge. The difficulty of generating coherent long texts lies in the fact that existing models overwhelmingly focus on the task of local word prediction and cannot make high-level plans about what to generate, nor capture the high-level discourse dependencies between chunks of text. Inspired by how humans write, where a list of bullet points or a catalog is first outlined and each bullet point is then expanded to form the whole article, we propose *SOE*, a…
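The outline-then-elaborate idea described in the abstract can be illustrated with a minimal toy sketch. All function names, the lead-sentence extraction heuristic, and the stub expander below are illustrative assumptions for exposition, not the SOE authors' actual models:

```python
# Toy sketch of a summarize-outline-elaborate pipeline.
# The lead-sentence heuristic stands in for a learned extractive
# summarizer, and elaborate() stands in for a conditional decoder.

def extract_summary(paragraph: str) -> str:
    """Toy extractive summarizer: keep the paragraph's first sentence."""
    return paragraph.split(". ")[0].rstrip(".") + "."

def outline(paragraphs: list[str]) -> list[str]:
    """Stage 1: produce one bullet point per target paragraph."""
    return [extract_summary(p) for p in paragraphs]

def elaborate(bullet: str) -> str:
    """Stage 2 stub: a real system would condition a decoder on the
    bullet (and its neighbors) to generate the full paragraph."""
    return bullet + " <expanded text conditioned on this bullet>"

def generate_long_text(reference_paragraphs: list[str]) -> str:
    """Outline first, then expand each bullet into a paragraph."""
    bullets = outline(reference_paragraphs)
    return "\n\n".join(elaborate(b) for b in bullets)
```

In training, the extractive summaries of the gold paragraphs would supervise the outline stage, giving the hierarchical supervision the title refers to.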

## 2 Citations

Plot Writing From Pre-Trained Language Models
• Computer Science • ArXiv • 2022
This work proposes generating story plots using off-the-shelf PLMs while maintaining the benefit of content planning to generate cohesive and contentful stories.
DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization
• Computer Science • ACL • 2022
DYLE jointly trains an extractor and a generator, treating the extracted text snippets as a latent variable and allowing dynamic snippet-level attention weights during decoding; the proposed dynamic weights are shown to provide interpretability of the generation process.
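The dynamic snippet-level weighting DYLE describes can be sketched in miniature. This is a conceptual illustration only; the scoring model and snippet representation are assumptions, and a real system would compute scores from learned extractor and generator states:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def snippet_attention(step_scores):
    """Toy dynamic snippet-level attention: one normalized weight
    distribution over the extracted snippets per decoding step.
    `step_scores` is a list (one entry per step) of per-snippet
    raw relevance scores."""
    return [softmax(s) for s in step_scores]
```

Because the weights are recomputed at every decoding step, the generator can shift its attention across snippets as the output unfolds, which is where the claimed interpretability comes from.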

## References

Showing 1-10 of 78 references
Sentence-Level Content Planning and Style Specification for Neural Text Generation
• Computer Science • EMNLP • 2019
This work presents an end-to-end trained two-step generation model, where a sentence-level content planner first decides on the keyphrases to cover as well as a desired language style, followed by a surface realization decoder that generates relevant and coherent text.
Progressive Generation of Long Text
• Computer Science • ArXiv • 2020
This work proposes a simple but effective method of generating text in a progressive manner, inspired by generating images from low to high resolution, and significantly improves upon the fine-tuned GPT-2 in terms of domain-specific quality and sample efficiency.
Order-Planning Neural Text Generation From Structured Data
• Computer Science • AAAI • 2018
This paper proposes an order-planning text generation model to capture the relationship between different fields and use such relationship to make the generated text more fluent and smooth.
Bottom-Up Abstractive Summarization
• Computer Science • EMNLP • 2018
This work explores the use of data-efficient content selectors to over-determine phrases in a source document that should be part of the summary, and shows that this approach improves the ability to compress text, while still generating fluent summaries.
Extractive Summarization as Text Matching
• Computer Science • ACL • 2020
This paper formulates the extractive summarization task as a semantic text matching problem, in which a source document and candidate summaries are matched in a semantic space, yielding a semantic matching framework.
Towards Generating Long and Coherent Text with Multi-Level Latent Variable Models
• Computer Science • ACL • 2019
Extensive experimental results demonstrate that the proposed multi-level VAE model produces more coherent and less repetitive long text compared to baselines as well as can mitigate the posterior-collapse issue.
Long Text Generation via Adversarial Training with Leaked Information
• Computer Science • AAAI • 2018
The discriminative net is allowed to leak its own high-level extracted features to the generative net to provide additional guidance; without any supervision, LeakGAN is able to implicitly learn sentence structures solely through the interaction between Manager and Worker.
A Hierarchical Neural Autoencoder for Paragraphs and Documents
• Computer Science • ACL • 2015
This paper introduces an LSTM model that hierarchically builds an embedding for a paragraph from embeddings for sentences and words, then decodes this embedding to reconstruct the original paragraph. Evaluating the reconstructed paragraphs with standard metrics shows that neural models are able to encode texts in a way that preserves syntactic, semantic, and discourse coherence.
Step-by-Step: Separating Planning from Realization in Neural Data-to-Text Generation
• Computer Science • NAACL • 2019
The results demonstrate that decoupling text planning from neural realization indeed improves the system's reliability and adequacy while maintaining fluent output, with improvements observed both in BLEU scores and in manual evaluations.
Data-to-Text Generation with Content Selection and Planning
• Computer Science • AAAI • 2019
This work presents a neural network architecture which incorporates content selection and planning without sacrificing end-to-end training and shows that this model outperforms strong baselines improving the state-of-the-art on the recently released RotoWire dataset.