Scan Statistics on Enron Graphs
A theory of scan statistics on graphs is introduced, and the ideas are applied to the problem of anomaly detection in a time series of Enron email graphs.
A Repository of State of the Art and Competitive Baseline Summaries for Generic News Summarization
A corpus of summaries produced by several state-of-the-art extractive summarization systems or by popular baseline systems is presented to facilitate future research on generic summarization. The work motivates the need for more sensitive evaluation measures and for approaches to system combination in summarization.
Left-Brain / Right-Brain Multi-Document Summarization
Since we began participating in DUC in 2001, our summarizer has been based on an HMM (Hidden Markov Model) for sentence selection within a document and a pivoted QR algorithm to generate a…
Text summarization via hidden Markov models
This work presents an approach to generating a sentence-extract summary of a document: a hidden Markov model that judges the likelihood that each sentence should be included in the summary.
Topic-Focused Multi-Document Summarization Using an Approximate Oracle Score
An "oracle" score, based on the probability distribution of unigrams in human summaries, is introduced, and it is demonstrated that extracts generated with the oracle score rate, on average, better than the human summaries when evaluated with ROUGE.
An Assessment of the Accuracy of Automatic Evaluation in Summarization
An assessment of the automatic evaluations used for multi-document summarization of news is presented, along with recommendations about how any evaluation, manual or automatic, should be used to find statistically significant differences between summarization systems.
Fast Approximate Quadratic Programming for Graph Matching
This work presents a graph matching algorithm, the Fast Approximate Quadratic assignment algorithm, and empirically demonstrates that, compared with the previous state of the art, the algorithm is faster and achieves a lower objective value on over 80% of the QAPLIB benchmark library.
CLASSY 2011 at TAC: Guided and Multi-lingual Summaries and Evaluation Metrics
We present CLASSY’s guided summarization as well as multi-lingual methods as submitted to TAC 2011. In addition, we describe improved metrics submitted to the AESOP task at TAC.
QCS: A system for querying, clustering and summarizing documents
A novel integrated information retrieval system, the Query, Cluster, Summarize (QCS) system, is presented. It is portable and modular, and it permits experimentation with different instantiations of each of the constituent text analysis components. The work demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.
MultiLing 2015: Multilingual Summarization of Single and Multi-Documents, On-line Fora, and Call-center Conversations
An overview of MultiLing 2015, a special session at SIGdial 2015, is presented. MultiLing is a community-driven initiative that pushes the state of the art in automatic summarization by providing data sets and fostering further research and development of summarization systems.