Adam L. Berger

Learn More
The concept of maximum entropy can be traced back along multiple threads to Biblical times. Only recently, however, have computers become powerful enough to permit the widescale application of this concept to real world problems in statistical estimation and pattern recognition. In this paper, we describe a method for statistical modeling based on maximum(More)
This paper introduces a new statistical approach to automatically partitioning text into coherent segments. The approach is based on a technique that incrementally builds an exponential model to extract features that are correlated with the presence of boundaries in labeled training text. The models use two classes of features: topicality features that use(More)
This paper investigates whether a machine can automatically learn the task of finding, within a large collection of candidate responses, the answers to questions. The learning process consists of inspecting a collection of answered questions and characterizing the relation between question and answer with a statistical model. For the purpose of learning(More)
We introduce OCELOT, a prototype system for automatically generating the “gist” of a web page by summarizing it. Although most text summarization research to date has focused on the task of news articles, web pages are quite different in both structure and content. Instead of coherent text with a well-defined discourse structure, they are more(More)
This paper introduces a statistical model for query-relevant summarization: succinctly characterizing the relevance of a document to a query. Learning parameter values for the proposed model requires a large collection of summarized documents, which we do not have, but as a proxy, we use a collection of FAQ (frequently-asked question) documents. Taking a(More)
Doug Beeferman Adam Berger John La erty School of Computer Science Carnegie Mellon University Pittsburgh PA 15213 dougb,aberger,la erty@cs.cmu.edu ABSTRACT This paper describes a lightweight method for the automatic insertion of intra-sentence punctuation into text. Despite the intuition that pauses in an acoustic stream are a positive indicator for some(More)