Multi-document summarization using cluster-based link analysis

@inproceedings{Wan2008MultidocumentSU,
  title={Multi-document summarization using cluster-based link analysis},
  author={Xiaojun Wan and Jianwu Yang},
  booktitle={SIGIR '08},
  year={2008}
}
The Markov Random Walk model has recently been exploited for multi-document summarization by making use of the link relationships between sentences in the document set, under the assumption that all the sentences are indistinguishable from each other. [...] Experimental results on the DUC2001 and DUC2002 datasets demonstrate the effectiveness of our proposed summarization models.
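As a rough illustration of the sentence-level Markov Random Walk baseline the abstract refers to, the sketch below ranks sentences by a damped random walk over a cosine-similarity graph. It is not the paper's cluster-based model: the tokenizer, the raw term-frequency vectors, the damping factor of 0.85, and the function names (`tokenize`, `cosine`, `rank_sentences`) are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_sentences(sentences, damping=0.85, tol=1e-8, max_iter=200):
    """Score sentences with a damped random walk over a similarity graph.

    Every sentence is treated identically, which is the "indistinguishable
    sentences" assumption described in the abstract.
    """
    n = len(sentences)
    if n == 0:
        return []
    vectors = [Counter(tokenize(s)) for s in sentences]
    # Affinity matrix: pairwise cosine similarity, with no self-links.
    weights = [
        [0.0 if i == j else cosine(vectors[i], vectors[j]) for j in range(n)]
        for i in range(n)
    ]
    # Row-normalise the affinities into transition probabilities.
    transition = []
    for row in weights:
        total = sum(row)
        if total > 0:
            transition.append([w / total for w in row])
        else:
            # A sentence with no similar neighbours jumps uniformly.
            transition.append([1.0 / n] * n)
    # Power iteration until the scores stop changing.
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new_scores = [
            (1 - damping) / n
            + damping * sum(scores[j] * transition[j][i] for j in range(n))
            for i in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


if __name__ == "__main__":
    docs = [
        "The cat sat on the mat.",
        "A cat sat on a mat.",
        "The cat was on the mat.",
        "Stock prices fell sharply today.",
    ]
    for sentence, score in zip(docs, rank_sentences(docs)):
        print(f"{score:.3f}  {sentence}")
```

The scores form a probability distribution over sentences, so a summary could be built by taking the top-scoring sentences. A cluster-aware variant would additionally weight sentences by the theme cluster they belong to, which this sketch leaves out.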
Ranking Through Clustering: An Integrated Approach to Multi-Document Summarization
  • X. Cai, Wenjie Li
  • Computer Science
  • IEEE Transactions on Audio, Speech, and Language Processing
  • 2013
TLDR
A novel approach that directly generates clusters integrated with ranking is proposed; its effectiveness is demonstrated by both the cluster quality analysis and the summarization evaluation conducted on the DUC 2004-2007 datasets.
Generic Multi-Document Summarization Using Topic-Oriented Information
TLDR
The topic-oriented PageRank (ToPageRank) model, in which topic information is fully incorporated, is proposed, and the topic-oriented HITS (ToHITS) model is designed to compare the influence of different graph-based algorithms.
A spectral analysis approach to document summarization: Clustering and ranking sentences simultaneously
TLDR
This paper proposes a novel approach, based on spectral analysis, to clustering and ranking sentences simultaneously, and demonstrates the improvement of the proposed approach over other existing clustering-based approaches.
Multi-document Summarization using Probabilistic Topic-based Network Models
TLDR
An integrated approach considering both probabilistic topic models and network models for multi-document summarization shows that the PLSA-based network approach outperforms the TF-IDF baseline on all datasets.
Topic based Summarization of Multiple Documents using Semantic Analysis and Clustering
TLDR
Experimental results show that combining agglomerative hierarchical clustering with Latent Semantic Analysis yields a substantial performance improvement and produces better summaries than other state-of-the-art techniques.
Multi-document Summarization via LDA and Density Peaks Based Sentence-Level Clustering
TLDR
This paper presents a novel unsupervised extractive multi-document summarization method that achieves the best performance on the DUC2004 dataset, outperforming state-of-the-art methods such as DUC 2004 Best, R2N2_ILP, and WCS.
Query-focused multi-document summarization using hypergraph-based ranking
TLDR
A novel hypergraph-based vertex-reinforced random walk framework for multi-document summarization exploits the Hierarchical Dirichlet Process (HDP) topic model to learn a word-topic probability distribution in sentences; a time-variant random walk algorithm for hypergraphs is developed to rank sentences, ensuring sentence diversity in summaries through vertex reinforcement.
Co-clustering Sentences and Terms for Multi-document Summarization
TLDR
This paper presents a co-clustering based multi-document summarization method that makes full use of the diverse and redundant content within topically-related articles to generate a multi-document summary.
Multi Document summarization using EM Clustering
TLDR
A new technique for cluster identification based on EM (Expectation Maximization), which helps to identify the unobserved latent variables from the sentences and uses manifold ranking based on relevance propagation via mutual reinforcement between sentences and clusters.
Multi-document Summarization Exploiting Semantic Analysis Based on Tag Cluster
TLDR
This work proposes a novel multi-document summarization technique that employs tag clusters on Flickr, a kind of folksonomy system, to detect key sentences from multiple documents, and creates a word frequency table for analyzing the semantics and contribution of words using the HITS algorithm.

References

Respect my authority!: HITS without hyperlinks, utilizing cluster-based language models
TLDR
It is found that the cluster-document graphs presented give rise to much better retrieval performance than previously proposed document-only graphs do, and that computing authority scores for clusters constitutes an effective method for identifying clusters containing a large percentage of relevant documents.
Topic themes for multi-document summarization
TLDR
This paper presents eight different methods of generating MDS and evaluates each of these methods on a large set of topics used in past DUC workshops, showing a significant improvement in the quality of summaries based on topic themes over MDS methods that use other alternative topic representations.
Improved Affinity Graph Based Multi-Document Summarization
This paper describes an affinity graph based approach to multi-document summarization. We incorporate a diffusion process to acquire semantic relationships between sentences, and then compute [...]
Cross-document summarization by concept classification
TLDR
A Cross Document Summarizer, XDoX, is designed specifically to summarize large document sets (50-500 documents and more); examples of summaries obtained in tests as well as from the first Document Understanding Conference (DUC) are shown.
LexPageRank: Prestige in Multi-Document Text Summarization
TLDR
The results show that the LexPageRank approach outperforms centroid-based summarization and is quite successful compared to other summarization systems.
Combining a mixture language model and Naive Bayes for multi-document summarisation
TLDR
The TNO system for multi-document summarisation is based on an extraction approach that combined two statistical methods for sentence selection with a variant of the MMR algorithm to yield a more reliable salience score.
Centroid-based summarization of multiple documents
TLDR
A multi-document summarizer, MEAD, is presented, which generates summaries using cluster centroids produced by a topic detection and tracking system, and an evaluation scheme based on sentence utility and subsumption is applied.
Summarizing text documents: sentence selection and evaluation metrics
TLDR
An analysis of news-article summaries generated by sentence selection, using a normalized version of precision-recall curves with a baseline of random sentence selection to evaluate features; empirical results show the importance of corpus-dependent baseline summarization standards, compression ratios, and carefully crafted long queries.
Summarizing Similarities and Differences Among Related Documents
TLDR
The approach described here exploits the results of recent progress in information extraction to represent salient units of text and their relationships to represent meaningful relations between units based on an analysis of text cohesion and the context in which the comparison is desired.
Towards Multidocument Summarization by Reformulation: Progress and Prospects
TLDR
The evaluation of system components shows that learning over multiple extracted linguistic features is more effective than information retrieval approaches at identifying similar text units for summarization and that it is possible to generate a fluent summary that conveys similarities among documents even when full semantic interpretations of the input text are not available.