A Supervised Approach to Extractive Summarisation of Scientific Papers

@inproceedings{Collins2017ASA,
  title={A Supervised Approach to Extractive Summarisation of Scientific Papers},
  author={Edward Collins and Isabelle Augenstein and Sebastian Riedel},
  booktitle={CoNLL},
  year={2017}
}
Automatic summarisation is a popular approach to reduce a document to its main arguments. Recent research in the area has focused on neural approaches to summarisation, which can be very data-hungry. However, few large datasets exist and none for the traditionally popular domain of scientific publications, which opens up challenging research avenues centered on encoding large, complex documents. In this paper, we introduce a new dataset for summarisation of computer science publications by… 
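To make the setting concrete, supervised extractive summarisation is often framed as scoring each sentence and selecting the top-ranked ones. The sketch below is purely illustrative: the hand-set feature weights stand in for learned parameters and are not the model described in this paper.

```python
# Minimal sketch of extractive summarisation as supervised sentence
# scoring: each sentence gets a score from simple features (position,
# keyword overlap, length) and the top-k sentences are returned in
# document order. The weights are illustrative stand-ins for learned
# parameters, not the paper's actual model.

def extract_summary(sentences, keywords, k=2):
    def score(i, sent):
        words = set(sent.lower().split())
        position = 1.0 / (i + 1)            # earlier sentences score higher
        overlap = len(words & keywords)     # overlap with salient keywords
        length = min(len(words), 20) / 20   # mild preference for fuller sentences
        return 2.0 * overlap + 1.0 * position + 0.5 * length

    ranked = sorted(range(len(sentences)), key=lambda i: -score(i, sentences[i]))
    chosen = sorted(ranked[:k])             # restore document order
    return [sentences[i] for i in chosen]

doc = [
    "We propose a supervised model for extractive summarisation.",
    "The weather during the conference was pleasant.",
    "Experiments show the model outperforms strong baselines.",
]
keywords = {"summarisation", "model", "supervised"}
print(extract_summary(doc, keywords, k=2))
```

In a learned system the scoring function would be replaced by a trained classifier or regressor over richer sentence representations; the selection step stays the same.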

Extractive Summarization of Long Documents by Combining Global and Local Context
TLDR: A novel neural single-document extractive summarization model for long documents, incorporating both the global context of the whole document and the local context within the current topic; it outperforms previous work, both extractive and abstractive models.

A Divide-and-Conquer Approach to the Summarization of Academic Articles
TLDR: A novel divide-and-conquer method for the summarization of long documents that processes the input in parts and generates a corresponding summary, achieving state-of-the-art results on two publicly available datasets of academic articles.

From Standard Summarization to New Tasks and Beyond: Summarization with Manifold Information
TLDR: This paper surveys these new summarization tasks and approaches in the real-world application of text summarization algorithms.

Data-driven Summarization of Scientific Articles
TLDR: This work generates two novel multi-sentence summarization datasets from scientific articles and tests the suitability of a wide range of existing extractive and abstractive neural network-based summarization approaches, demonstrating that scientific papers are suitable for data-driven text summarization.

Scientific Document Summarization for LaySumm ’20 and LongSumm ’20
TLDR: This paper distinguishes between two types of summaries: a very short summary that captures the essence of the research paper in layman’s terms, avoiding overly specific technical jargon, and a much longer detailed summary aimed at providing specific insights into the various ideas touched upon in the paper.

TalkSumm: A Dataset and Scalable Annotation Method for Scientific Paper Summarization Based on Conference Talks
TLDR: This paper proposes a novel method that automatically generates summaries for scientific papers by utilizing videos of talks at scientific conferences, hypothesizing that such talks constitute a coherent and concise description of the papers’ content and can form the basis for good summaries.

Structured Summarization of Academic Publications
TLDR: SUSIE is proposed, a novel summarization method that can work with state-of-the-art summarization models to produce structured scientific summaries for academic articles; the proposed method improves the performance of all models by as much as 4 ROUGE points.

A Hierarchical Neural Extractive Summarizer for Academic Papers
TLDR: This paper collects academic papers available from PubMed Central, builds training data suited for supervised machine-learning-based extractive summarization, and proposes a tree-structure-based scoring method to steer the model toward correct sentences.

Summaformers @ LaySumm 20, LongSumm 20
TLDR: This paper distinguishes between two types of summaries: a very short summary that captures the essence of the research paper in layman’s terms and a much longer detailed summary aimed at providing specific insights into the various ideas touched upon in the paper.

Summarization for LaySumm ’20 and LongSumm ’20 (2020)

References

Showing 1–10 of 50 references
Neural Summarization by Extracting Sentences and Words
TLDR: This work develops a general framework for single-document summarization composed of a hierarchical document encoder and an attention-based extractor, allowing for different classes of summarization models that can extract sentences or words.

Learning Summary Prior Representation for Extractive Summarization
TLDR: A novel summarization system called PriorSum is developed, which applies enhanced convolutional neural networks to capture summary prior features derived from length-variable phrases under a regression framework, concatenated with document-dependent features for sentence ranking.

A Neural Attention Model for Abstractive Sentence Summarization
TLDR: This work proposes a fully data-driven approach to abstractive sentence summarization, utilizing a local attention-based model that generates each word of the summary conditioned on the input sentence.

Using Machine Learning Methods and Linguistic Features in Single-Document Extractive Summarization
TLDR: A novel sentence-ranking methodology based on the similarity score between a candidate sentence and benchmark summaries is introduced; the popular linear regression model achieved the best results on all evaluated datasets.

Event-Based Extractive Summarization
TLDR: The experimental results indicate that event-based features not only offer an improvement in summary quality over words as features, but that this effect is more pronounced for more sophisticated summarization methods that avoid redundancy in the output.

Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond
TLDR: This work proposes several novel models that address critical problems in summarization not adequately handled by the basic architecture, such as modeling keywords, capturing the hierarchy of sentence-to-word structure, and emitting words that are rare or unseen at training time.

A compositional context sensitive multi-document summarizer: exploring the factors that influence summarization
TLDR: The research shows that a frequency-based summarizer can achieve performance comparable to that of state-of-the-art systems, but only with a good composition function; context sensitivity improves performance and significantly reduces repetition.

SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents
We present SummaRuNNer, a Recurrent Neural Network (RNN) based sequence model for extractive summarization of documents, and show that it achieves performance better than or comparable to…

MUSEEC: A Multilingual Text Summarization Tool
TLDR: This paper provides an overview of MUSEEC’s methods and overall architecture, and recommends an unsupervised extension of POLY that compiles a document summary from compressed sentences.

Extractive Summarization by Maximizing Semantic Volume
TLDR: This work embeds each sentence in a semantic space and constructs a summary by choosing a subset of sentences whose convex hull maximizes volume in that space, and provides a greedy algorithm based on the Gram–Schmidt process to efficiently perform volume maximization.
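The greedy volume-maximisation idea can be sketched in a few lines: repeatedly pick the sentence whose component orthogonal to the span of already-chosen sentences is largest (a Gram–Schmidt step). The toy embeddings and tie-breaking below are illustrative assumptions, not details from that paper.

```python
# Hedged sketch of greedy volume maximisation: embed each sentence as a
# vector, then repeatedly select the sentence with the largest residual
# norm after projecting out the span of the already-selected sentences.
# The 3-d embeddings here are toy stand-ins for a real semantic space.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return dot(v, v) ** 0.5

def residual(v, basis):
    # Subtract projections of v onto each (orthonormal) basis vector.
    r = list(v)
    for b in basis:
        c = dot(r, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return r

def greedy_volume(vectors, k):
    basis, chosen = [], []
    for _ in range(k):
        # Pick the unchosen vector with the largest orthogonal residual.
        best = max((i for i in range(len(vectors)) if i not in chosen),
                   key=lambda i: norm(residual(vectors[i], basis)))
        chosen.append(best)
        r = residual(vectors[best], basis)
        n = norm(r)
        if n > 0:
            basis.append([x / n for x in r])  # extend the orthonormal basis
    return chosen

embeddings = [
    [1.0, 0.0, 0.0],   # sentence 0
    [0.9, 0.1, 0.0],   # sentence 1: nearly redundant with sentence 0
    [0.0, 1.0, 0.0],   # sentence 2: a genuinely new direction
]
print(greedy_volume(embeddings, k=2))
```

The greedy step prefers sentence 2 over the near-duplicate sentence 1, which is exactly the redundancy-avoiding behaviour that volume maximisation is meant to capture.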