Earlier Isn't Always Better: Sub-aspect Analysis on Corpus and System Biases in Summarization

@inproceedings{Jung2019EarlierIA,
  title={Earlier Isn't Always Better: Sub-aspect Analysis on Corpus and System Biases in Summarization},
  author={Taehee Jung and Dongyeop Kang and Lucas Mentch and Eduard Hovy},
  booktitle={EMNLP/IJCNLP},
  year={2019}
}
Despite recent developments in neural summarization systems, the underlying logic behind their improvements and their corpus dependency remain largely unexplored. The position of a sentence in the original text, for example, is a well-known bias for news summarization. Following the claim that summarization is a combination of sub-functions, we define three sub-aspects of summarization: position, importance, and diversity, and conduct an extensive analysis of the…
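As a rough illustration of the position sub-aspect defined above, the sketch below scores an extractive summary by how early its selected sentences appear in the source document. The function names and the normalized-position scoring are assumptions for illustration, not the paper's implementation.

# Illustrative sketch (not the paper's code): quantify the "position"
# sub-aspect of an extractive summary as the mean normalized position of
# its selected sentences. A score near 0 indicates the summary is drawn
# from the beginning of the document, i.e., strong lead bias.

from typing import List, Sequence

def position_score(selected: Sequence[int], num_sentences: int) -> float:
    """Mean normalized position of selected sentence indices (0 = first, 1 = last)."""
    if not selected or num_sentences <= 1:
        return 0.0
    return sum(i / (num_sentences - 1) for i in selected) / len(selected)

def lead_k(num_sentences: int, k: int = 3) -> List[int]:
    """Indices of a Lead-k baseline summary (the first k sentences)."""
    return list(range(min(k, num_sentences)))

if __name__ == "__main__":
    doc_len = 20
    print(round(position_score(lead_k(doc_len), doc_len), 2))  # 0.05 -> lead-biased
    print(round(position_score([0, 9, 19], doc_len), 2))       # 0.49 -> spread out

Averaging such a score over a corpus's reference summaries gives a quick check of how strongly that corpus rewards lead-biased systems.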
Attend to the beginning: A study on using bidirectional attention for extractive summarization
TLDR
This work proposes attending to the beginning of a document to improve the performance of extractive summarization models applied to forum discussion data, making use of the tendency to introduce important information early in the text by attending to the first few sentences in generic textual data.
Make Lead Bias in Your Favor: Zero-shot Abstractive News Summarization
TLDR
This work proposes a self-supervised method to pre-train abstractive news summarization models on large-scale unlabeled news corpora, and shows that this approach can dramatically improve summarization quality and achieve state-of-the-art results for zero-shot news summarization without any fine-tuning.
Corpora Evaluation and System Bias detection in Multi Document Summarization
TLDR
An attempt to quantify the quality of summarization corpora, prescribing a list of points to consider when proposing a new MDS corpus and analyzing why no MDS system achieves superior performance across all corpora.
Leveraging Lead Bias for Zero-shot Abstractive News Summarization
TLDR
This work proposes a simple and effective self-supervised way to pre-train abstractive news summarization models on large-scale unlabeled news corpora: predicting the leading sentences of an article from the rest of it.
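Both lead-bias pre-training papers above rely on the same self-supervised construction: the leading sentences of an unlabeled article serve as a pseudo-summary and the remainder serves as the model input. A minimal sketch of that pair construction, assuming naive sentence splitting and a lead size of k = 3 (both illustrative choices, not the papers' exact setup):

# Minimal sketch (illustrative): build self-supervised (source, target)
# pairs from unlabeled news, treating the first k sentences as the
# pseudo-summary target and the rest of the article as the source.

from typing import List, Tuple

def split_sentences(article: str) -> List[str]:
    # Naive period-based splitting; a real pipeline would use a proper
    # sentence tokenizer (e.g., NLTK or spaCy).
    return [s.strip() + "." for s in article.split(".") if s.strip()]

def make_lead_bias_pair(article: str, k: int = 3) -> Tuple[str, str]:
    """Return (source, target): the article body maps to its own lead."""
    sents = split_sentences(article)
    return " ".join(sents[k:]), " ".join(sents[:k])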
SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization
TLDR
This work proposes SUPERT, which rates the quality of a summary by measuring its semantic similarity with a pseudo-reference summary, i.e., salient sentences selected from the source documents, using contextualized embeddings and soft token alignment techniques.
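As a rough, heavily simplified sketch of the idea (real SUPERT selects salient sentences and aligns contextualized token embeddings; here the pseudo-reference is just the lead sentences of each document and similarity is a single cosine over whole-text embeddings via the sentence-transformers package, whose model name below is only an example):

# Heavily simplified reference-free scoring sketch in the spirit of
# SUPERT. Assumes the sentence-transformers package is installed; the
# pseudo-reference construction and similarity measure are coarse
# stand-ins for SUPERT's salient-sentence selection and soft token
# alignment.

from sentence_transformers import SentenceTransformer, util

def pseudo_reference(source_docs, n_lead=5):
    """Concatenate the first n_lead sentences of each source document."""
    sents = []
    for doc in source_docs:
        sents.extend(doc.split(". ")[:n_lead])
    return " ".join(sents)

def reference_free_score(summary, source_docs, model_name="all-MiniLM-L6-v2"):
    """Score a summary by cosine similarity to the pseudo-reference."""
    model = SentenceTransformer(model_name)
    emb = model.encode([summary, pseudo_reference(source_docs)],
                       convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()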
Facet-Aware Evaluation for Extractive Summarization
TLDR
This paper demonstrates that facet-aware evaluation exhibits better correlation with human judgment than ROUGE, enables fine-grained evaluation as well as comparative analysis, and reveals valuable insights into state-of-the-art summarization methods.
Dynamic Sliding Window for Meeting Summarization
  • Zhengyuan Liu, Nancy F. Chen
  • Computer Science
  • ArXiv
  • 2021
Abstractive spoken-language summarization has recently attracted growing research interest, and neural sequence-to-sequence approaches have brought significant performance improvements. However, summarizing…
SummVis: Interactive Visual Analysis of Models, Data, and Evaluation for Text Summarization
TLDR
SummVis, an open-source tool for visualizing abstractive summaries, is introduced; it enables fine-grained analysis of the models, data, and evaluation metrics associated with text summarization.
How well do you know your summarization datasets?
TLDR
This study manually analyses 600 samples from three popular summarization datasets using a six-class typology that captures different noise types (missing facts, entities) and degrees of summarization difficulty (extractive, abstractive).
News Editorials: Towards Summarizing Long Argumentative Texts
TLDR
This paper presents Webis-EditorialSum-2020, a corpus of 1330 carefully curated summaries for 266 news editorials, i.e., opinionated articles with a well-defined argumentation structure, and evaluates these summaries using a tailored annotation scheme.

References

Showing 1-10 of 55 references
Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization
TLDR
A novel abstractive model is proposed which is conditioned on the article’s topics and based entirely on convolutional neural networks, outperforming an oracle extractive system and state-of-the-art abstractive approaches when evaluated automatically and by humans.
Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints
TLDR
A discriminative model for single-document summarization that integrally combines compression and anaphoricity constraints, outperforming prior work on both ROUGE and human judgments of linguistic quality.
Bottom-Up Abstractive Summarization
TLDR
This work explores the use of data-efficient content selectors to over-determine phrases in a source document that should be part of the summary, and shows that this approach improves the ability to compress text, while still generating fluent summaries.
Multi-Document Abstractive Summarization Using ILP Based Multi-Sentence Compression
TLDR
The proposed approach identifies the most important document in the multi-document set, generates K-shortest paths from the sentences in each cluster using a word-graph structure, and selects sentences from the set of shortest paths generated from all clusters using a novel integer linear programming model.
Content Selection in Deep Learning Models of Summarization
TLDR
This work suggests that creating a summarizer for a new domain is easier than previous work implies, and calls into question the benefit of deep learning summarization models even for domains that do have massive datasets.
Neural Summarization by Extracting Sentences and Words
TLDR
This work develops a general framework for single-document summarization composed of a hierarchical document encoder and an attention-based extractor that allows for different classes of summarization models which can extract sentences or words.
Jointly Learning to Extract and Compress
TLDR
A joint model of sentence extraction and compression for multi-document summarization whose jointly extracted and compressed summaries outperform both unlearned baselines and the authors' learned extraction-only system on both ROUGE and Pyramid, without a drop in judged linguistic quality.
Get To The Point: Summarization with Pointer-Generator Networks
TLDR
A novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways, using a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information, while retaining the ability to produce novel words through the generator.
Improving the Estimation of Word Importance for News Multi-Document Summarization
TLDR
A supervised model for ranking word importance, incorporating a rich set of features, is proposed; it is superior to prior approaches for identifying words used in human summaries, and an extractive summarizer that includes this word-importance estimation produces summaries comparable to the state of the art under automatic evaluation.
A Neural Attention Model for Abstractive Sentence Summarization
TLDR
This work proposes a fully data-driven approach to abstractive sentence summarization by utilizing a local attention-based model that generates each word of the summary conditioned on the input sentence.