Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies

@article{Grusky2018NewsroomAD,
  title={Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies},
  author={Max Grusky and M. Naaman and Yoav Artzi},
  journal={ArXiv},
  year={2018},
  volume={abs/1804.11283}
}
  • Max Grusky, M. Naaman, Yoav Artzi
  • Published 2018
  • Computer Science
  • ArXiv
  • We present NEWSROOM, a summarization dataset of 1.3 million articles and summaries written by authors and editors in newsrooms of 38 major news publications. [...] We analyze the extraction strategies used in NEWSROOM summaries against other datasets to quantify the diversity and difficulty of our new data, and train existing methods on the data to evaluate its utility and challenges. The dataset is available online at summari.es.
    144 Citations

    BIGPATENT: A Large-Scale Dataset for Abstractive and Coherent Summarization
    • 28 citations
    • Highly Influenced
    Exploring Content Selection in Summarization of Novel Chapters
    • 1 citation
    Multi-News: a Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model
    • 58 citations
    • Highly Influenced
    Jointly Extracting and Compressing Documents with Summary State Representations
    • 15 citations
    • Highly Influenced
    WikiHow: A Large Scale Text Summarization Dataset
    • 41 citations
    Enhancing a Text Summarization System with ELMo
    • 1 citation
    • Highly Influenced
    The Summary Loop: Learning to Write Abstractive Summaries Without Examples
    • 7 citations
    Unsupervised Reference-Free Summary Quality Evaluation via Contrastive Learning
    • 1 citation
    • Highly Influenced
    A Hierarchy Transformer Network for Extractive Summaries
