Automatic Acquisition of Hyponyms from Large Text Corpora
TLDR
A set of lexico-syntactic patterns is identified that are easily recognizable, occur frequently and across text genre boundaries, and indisputably indicate the lexical relation of interest.
TextTiling: Segmenting Text into Multi-paragraph Subtopic Passages
TLDR
The algorithm is fully implemented and is shown to produce segmentation that corresponds well to human judgments of the subtopic boundaries of 12 texts, which should be useful for many text analysis tasks, including information retrieval and summarization.
Why phishing works
TLDR
This paper provides the first empirical evidence about which malicious strategies are successful at deceiving general users by analyzing a large set of captured phishing attacks and developing a set of hypotheses about why these strategies might work.
Search User Interfaces
TLDR
This book summarizes developments in the state of the art of search interface design, both in academic research and in deployment in commercial systems, and presents the most broadly acceptable ideas, those that make their way into major web search engines.
Multi-Paragraph Segmentation of Expository Text
TLDR
TextTiling, an algorithm for partitioning expository texts into coherent multi-paragraph discourse units which reflect the subtopic structure of the texts, is described and shown to produce segmentation that corresponds well to human judgments of the major subtopic boundaries of thirteen lengthy texts.
Faceted metadata for image search and browsing
TLDR
An alternative based on enabling users to navigate along conceptual dimensions that describe the images is presented, which makes use of hierarchical faceted metadata and dynamically generated query previews.
A Simple Algorithm for Identifying Abbreviation Definitions in Biomedical Text
TLDR
This paper shows that the problem of identifying abbreviations' definitions can be solved with a much simpler algorithm than that proposed by other research efforts, and achieves 96% precision and 82% recall on a standard test collection, which is at least as good as existing approaches.
Reexamining the cluster hypothesis: scatter/gather on retrieval results
TLDR
This work systematically evaluates Scatter/Gather in this context, finds significant improvements over similarity search ranking alone, and provides evidence validating the cluster hypothesis, which states that relevant documents tend to be more similar to each other than to non-relevant documents.
The state of the art in automating usability evaluation of user interfaces
TLDR
The survey analyzes existing techniques, identifies which aspects of usability evaluation automation are likely to be of use in future research, and suggests new ways to expand existing approaches to better support usability evaluation.
A Critique and Improvement of an Evaluation Metric for Text Segmentation
TLDR
A simple modification to the Pk metric is proposed, called WindowDiff, which moves a fixed-sized window across the text and penalizes the algorithm whenever the number of boundaries within the window does not match the true number of boundaries for that window of text.
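
The sliding-window comparison summarized above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's reference implementation. It assumes each segmentation is given as a list of 0/1 flags, one per gap between text units, with 1 marking a boundary. The function name `window_diff`, the window size `k`, and the choice to normalize by the number of window positions are assumptions made for this sketch.

```python
def window_diff(reference, hypothesis, k):
    """Fraction of window positions where the boundary counts disagree.

    reference, hypothesis: equal-length lists of 0/1 flags (1 = boundary).
    k: window size, in number of gaps (an assumed parameterization).
    """
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must have the same length")
    n = len(reference)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the segmentation length")

    windows = n - k + 1  # number of positions the window can occupy
    errors = 0
    for start in range(windows):
        ref_count = sum(reference[start:start + k])
        hyp_count = sum(hypothesis[start:start + k])
        # Penalize once per window whose boundary counts differ.
        if ref_count != hyp_count:
            errors += 1
    return errors / windows


reference = [0, 0, 1, 0, 0, 0, 1, 0]
near_miss = [0, 1, 0, 0, 0, 0, 1, 0]  # first boundary placed one gap early
no_bounds = [0, 0, 0, 0, 0, 0, 0, 0]  # hypothesis with no boundaries at all

print(window_diff(reference, reference, 3))  # identical segmentations
print(window_diff(reference, near_miss, 3))  # small penalty for a near miss
print(window_diff(reference, no_bounds, 3))  # larger penalty for misses
```

In this sketch, a score of 0 means the two segmentations agree in every window, and a boundary that is off by one gap costs less than a boundary that is missing entirely.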