Sentence Centrality Revisited for Unsupervised Summarization

Hao Zheng, Mirella Lapata
Single document summarization has enjoyed renewed interest in recent years thanks to the popularity of neural network models and the availability of large-scale datasets. In this paper we develop an unsupervised approach arguing that it is unrealistic to expect large-scale and high-quality training data to be available or created for different types of summaries, domains, or languages. We revisit a popular graph-based ranking algorithm and modify how node (aka sentence) centrality is computed… 
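As a point of reference, the generic idea of ranking sentences by centrality in a similarity graph can be sketched as follows. This is a minimal illustration only: it uses plain undirected degree centrality over bag-of-words cosine similarity, which is an assumption made here and not the modified centrality the paper proposes (the truncated abstract does not spell that out). The function names `cosine` and `rank_by_degree_centrality` are hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_degree_centrality(sentences):
    """Return sentence indices ordered from most to least central.

    Assumption: the centrality of a sentence is the sum of its similarities
    to every other sentence (degree centrality on a fully connected graph).
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    centrality = [
        sum(cosine(bags[i], bags[j]) for j in range(n) if j != i)
        for i in range(n)
    ]
    return sorted(range(n), key=lambda i: centrality[i], reverse=True)


if __name__ == "__main__":
    docs = ["a b c", "a b", "b c", "x y"]
    print(rank_by_degree_centrality(docs))
```

An extractive summary would then take the top-ranked indices and emit those sentences in document order. A real system would likely replace the bag-of-words vectors with stronger sentence representations.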

Improving Unsupervised Extractive Summarization with Facet-Aware Modeling

Experimental results show that the novel facet-aware centrality-based ranking model consistently outperforms strong baselines, especially in long- and multi-document scenarios, and even performs comparably to some supervised models.

HipoRank: Incorporating Hierarchical and Positional Information into Graph-based Unsupervised Long Document Extractive Summarization

This work proposes a novel graph-based ranking model for unsupervised extractive summarization of long documents that leverages positional and hierarchical information grounded in discourse structure to augment a document's graph representation with hierarchy and directionality.

Centrality Meets Centroid: A Graph-based Approach for Unsupervised Document Summarization

This paper proposes a graph-based unsupervised approach for extractive document summarization that works at a summary-level by utilizing graph centrality and centroid.

Unsupervised Extractive Text Summarization with Distance-Augmented Sentence Graphs

An unsupervised approach to extractive text summarization is proposed, which uses a sentence graph automatically constructed from each document to select salient sentences for summarization based on both the similarities and the relative distances in the neighborhood of each sentence.

Incorporating External Knowledge into Unsupervised Graph Model for Document Summarization

This work mainly focuses on improving the performance of the popular unsupervised TextRank algorithm, which requires no labeled training data for extractive summarization, and incorporates external knowledge from open-source knowledge graphs into the model by entity linking.

Discourse-Aware Unsupervised Summarization for Long Scientific Documents

This work proposes an unsupervised graph-based ranking model for extractive summarization of long scientific documents, and suggests that patterns in the discourse structure are a strong signal for determining importance in scientific articles.

Unsupervised Extractive Summarization with Heterogeneous Graph Embeddings for Chinese Document

This paper is the first to propose an unsupervised extractive summarization method with heterogeneous graph embeddings (HGEs) for Chinese documents, with results demonstrating that the method consistently outperforms the strong baseline on three summarization datasets.

SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization

This work proposes SUPERT, which rates the quality of a summary by measuring its semantic similarity with a pseudo reference summary, i.e. selected salient sentences from the source documents, using contextualized embeddings and soft token alignment techniques.

Tweet-aware News Summarization with Dual-Attention Mechanism

This paper focuses on the unsupervised summarization problem by exploring news and readers’ comments in linking tweets, i.e., tweets with URLs linking to the news, and proposes position-dependent word salience, which reflects the effect of local context.


TED, a transformer-based unsupervised summarization system pretrained on large-scale data, is proposed. It leverages the lead bias in news articles to pretrain the model on large-scale corpora, and is fine-tuned on target domains through theme modeling and a denoising autoencoder to enhance the quality of summaries.



LexRank: Graph-based Lexical Centrality as Salience in Text Summarization

A new approach, LexRank, for computing sentence importance based on the concept of eigenvector centrality in a graph representation of sentences is considered; the LexRank with threshold method outperforms the other degree-based techniques, including continuous LexRank.

Unsupervised Neural Multi-document Abstractive Summarization

The proposed end-to-end neural model architecture for unsupervised abstractive summarization is applied to the summarization of business and product reviews, and the generated summaries are shown to be fluent, relevant in terms of word overlap, representative of the average sentiment of the input documents, and highly abstractive compared to baselines.

An Unsupervised Multi-Document Summarization Framework Based on Neural Document Model

A document-level reconstruction framework named DocRebuild is proposed, which reconstructs the documents with summary sentences through a neural document model and selects summary sentences to minimize the reconstruction error.

An Exploration of Document Impact on Graph-Based Multi-Document Summarization

A document-based graph model is proposed to incorporate the document-level information and the sentence-to-document relationship into the graph-based ranking process and the results show the robustness of the proposed model.

Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization

A novel abstractive model is proposed which is conditioned on the article’s topics and based entirely on convolutional neural networks, outperforming an oracle extractive system and state-of-the-art abstractive approaches when evaluated automatically and by humans.

Multi-document summarization using cluster-based link analysis

Experimental results on the DUC2001 and DUC2002 datasets demonstrate the effectiveness of the proposed summarization models and show that the ClusterCMRW model is more robust than the ClusterHITS model with respect to different cluster numbers.

Automatic Text Summarization of Newswire: Lessons Learned from the Document Understanding Conference

An overview of the achieved results in the different types of summarization tasks, comparing both the broader classes of baselines, systems and humans, as well as individual pairs of summarizers (both human and automatic).

Topical Coherence for Graph-based Extractive Summarization

We present an approach for extractive single-document summarization. Our approach is based on a weighted graphical representation of documents obtained by topic modeling. We optimize importance,

Optimizing Sentence Modeling and Selection for Document Summarization

This paper attempts to build a strong summarizer, DivSelect+CNNLM, by presenting new algorithms to optimize both sentence modeling and sentence selection. It proposes CNNLM, a novel neural network language model (NNLM) based on a convolutional neural network (CNN), to project sentences into dense distributed representations, and then models sentence redundancy by cosine similarity.

Neural Summarization by Extracting Sentences and Words

This work develops a general framework for single-document summarization composed of a hierarchical document encoder and an attention-based extractor that allows for different classes of summarization models which can extract sentences or words.