Corpus ID: 14102322

The Impact of Frequency on Summarization

@inproceedings{Nenkova2005TheIO,
  title={The Impact of Frequency on Summarization},
  author={A. Nenkova and Lucy Vanderwende},
  year={2005}
}
Most multi-document summarizers utilize term frequency related features to determine sentence importance. No empirical studies, however, have been carried out to isolate the contribution made by frequency information from that of other features. Here, we examine the impact of frequency on various aspects of summarization and the role of frequency in the design of a summarization system. We describe SumBasic, a summarization system that exploits frequency exclusively to create summaries…
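To make the idea of a frequency-only summarizer concrete, here is a minimal Python sketch of greedy sentence selection driven purely by word frequency. It is an illustration in the spirit of SumBasic, not the authors' implementation: the abstract above is truncated and does not give the procedure, so the tokenizer, the scoring rule (average word probability per sentence), the squaring update for already-covered words, and the names `tokenize` and `frequency_summary` are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase alphanumeric tokens; a deliberately crude tokenizer (assumption)."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def frequency_summary(sentences, max_sentences=2):
    """Greedily pick sentences using word frequency alone (illustrative sketch)."""
    tokenized = [tokenize(s) for s in sentences]
    counts = Counter(w for toks in tokenized for w in toks)
    total = sum(counts.values())
    if total == 0:
        return []

    # Unigram probability of each word over the whole input.
    prob = {w: c / total for w, c in counts.items()}

    def score(i):
        # Assumed scoring rule: average probability of the sentence's words.
        toks = tokenized[i]
        return sum(prob[w] for w in toks) / len(toks)

    chosen = []
    candidates = [i for i, toks in enumerate(tokenized) if toks]
    while candidates and len(chosen) < max_sentences:
        best = max(candidates, key=score)  # ties go to the earliest sentence
        chosen.append(best)
        candidates.remove(best)
        # Assumed update: square the probability of words already covered,
        # so later picks favour content not yet in the summary.
        for w in set(tokenized[best]):
            prob[w] = prob[w] ** 2

    return [sentences[i] for i in chosen]
```

For the input `["the cat sat", "the cat ran", "dogs bark loudly"]` with `max_sentences=2`, this sketch returns the first and third sentences: once "the cat sat" is chosen, the down-weighting of "the" and "cat" makes the near-duplicate second sentence score lower than the unrelated third one.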
Multi-Document Summarization by Maximizing Informative Content-Words
We show that a simple procedure based on maximizing the number of informative content-words can produce some of the best reported results for multi-document summarization. We first assign a score to…
Significance of Sentence Ordering in Multi Document Summarization
The significance of sentence ordering in multi-document summarization is discussed, and experimental results on the DUC2002 dataset show summary quality before sentence ordering is applied and the improvement after it.
Content selection in multi-document summarization
It is shown that a modular extractive summarizer using the estimates of word importance can generate summaries comparable to the state-of-the-art systems, and a new framework of system combination for multi-document summarization is presented.
The elements of automatic summarization
This thesis is about automatic summarization, with experimental results on multi-document news topics: how to choose a series of sentences that best represents a collection of articles about one topic, using an objective function for summarization that is called "maximum coverage".
A Novel Contextual Topic Model for Query-Focused Multi-document Summarization
  • Guangbing Yang
  • Computer Science
  • 2014 IEEE 26th International Conference on Tools with Artificial Intelligence
  • 2014
This study proposes a novel approach based on well-known hierarchical Bayesian topic models that can determine the relevance of sentences more effectively, and recognize latent topics and arrange them hierarchically as well.
Topic-Focused Multi-Document Summarization Using an Approximate Oracle Score
An "oracle" score, based on the probability distribution of unigrams in human summaries, is introduced and it is demonstrated that with the oracle score, extracts are generated which score, on average, better than the human summary, when evaluated with ROUGE.
Summarization Approaches Based on Document Probability Distributions
The research shows that the above summarizer, which is light and simple, can deliver good summaries comparable to other state-of-the-art systems.
Extractive Multi-Document Summaries Should Explicitly Not Contain Document Specific Content
A sentence selection objective for extractive summarization in which sentences are penalized for containing content that is specific to the documents they were extracted from is presented.
An Extractive Text Summarizer Based on Significant Words
A new quantification measure for word significance used in natural language processing (NLP) tasks is proposed and successfully applied to an extractive text summarization approach, achieving a state-of-the-art performance.
Beyond SumBasic: Task-focused summarization with sentence simplification and lexical expansion
This paper details the design of a generic extractive summarization system, which ranked first out of 22 systems in terms of overall mean Pyramid score; and in the human evaluation of summary responsiveness to the topic, the system ranked third out of 35 systems.

References

Experiments in Multidocument Summarization
This paper describes a multidocument summarizer built upon research into the detection of new information. The summarizer uses several new strategies to select interesting and informative sentences,…
Using N-Grams To Understand the Nature of Summaries
Empirically characterize human-written summaries provided in a widely used summarization corpus and suggest that extraction-based techniques which have been successful for single-document summarization may not be sufficient when summarizing multiple documents.
The use of MMR, diversity-based reranking for reordering documents and producing summaries
This paper presents a method for combining query-relevance with information-novelty in the context of text retrieval and summarization, and preliminary results indicate some benefits for MMR diversity ranking in document retrieval and in single document summarization.
Evaluation Challenges in Large-Scale Document Summarization
A large-scale meta evaluation of eight evaluation measures for both single-document and multi-document summarizers is presented, showing the strengths and drawbacks of all evaluation methods and how they rank the different summarizers.
A trainable document summarizer
The trends in the results are in agreement with those of Edmundson who used a subjectively weighted combination of features as opposed to training the feature weights using a corpus, which suggests that even shorter extracts may be useful indicative summaries.
ROUGE: A Package for Automatic Evaluation of Summaries
Four different ROUGE measures are introduced: ROUGE-N, ROUGE-L, ROUGE-W, and ROUGE-S, included in the ROUGE summarization evaluation package, along with their evaluations.
LexRank: Graph-based Centrality as Salience in Text Summarization
We introduce a stochastic graph-based method for computing relative importance of textual units for Natural Language Processing. We test the technique on the problem of Text Summarization (TS). …
Left-Brain / Right-Brain Multi-Document Summarization
Since we began participating in DUC in 2001, our summarizer has been based on an HMM (Hidden Markov Model) for sentence selection within a document and a pivoted QR algorithm to generate a…
Vocabulary Agreement Among Model Summaries And Source Documents
Analysis of 9000 manually-written summaries of newswire stories provided to participants in four Document Understanding Conferences indicates that no more than 55% of the vocabulary items they employ…
Evaluating Content Selection in Summarization: The Pyramid Method
It is argued that the method presented is reliable, predictive and diagnostic, thus improves considerably over the shortcomings of the human evaluation method currently used in the Document Understanding Conference.