Learn More
We introduce a stochastic graph-based method for computing relative importance of textual units for Natural Language Processing. We test the technique on the problem of Text Summarization (TS). Extractive TS relies on the concept of sentence salience to identify the most important sentences in a document or set of documents. Salience is typically defined in(More)
We consider the problem of question-focused sentence retrieval from complex news articles describing multi-event stories published over time. Annotators generated a list of questions central to understanding each story in our corpus. Because of the dynamic nature of the stories, many questions are time-sensitive (e.g. " How many victims have been found? ")(More)
Multidocument extractive summarization relies on the concept of sentence centrality to identify the most important sentences in a document. Central-ity is typically defined in terms of the presence of particular important words or in terms of similarity to a centroid pseudo-sentence. We are now considering an approach for computing sentence importance based(More)
We introduce a relation extraction method to identify the sentences in biomedical text that indicate an interaction among the protein names mentioned. Our approach is based on the analysis of the paths between two protein names in the dependency parse trees of the sentences. Given two dependency trees, we define two separate similarity functions (kernels)(More)
We present Biased LexRank, a method for semi-supervised passage retrieval in the context of question answering. We represent a text as a graph of passages linked based on their pairwise lexical similarity. We use traditional passage retrieval techniques to identify passages that are likely to be relevant to a user's natural language question. We then(More)
The old Asian legend about the blind men and the elephant comes to mind when looking at how different authors of scientific papers describe a piece of related prior work. It turns out that different citations to the same paper often focus on different aspects of that paper and that neither provides a full description of its full set of contributions. In(More)
Multiple sequence alignment techniques have recently gained popularity in the Natural Language community, especially for tasks such as machine translation, text generation, and paraphrase identification. Prior work falls into two categories, depending on the type of input used: (a) parallel corpora (e.g., multiple translations of the same text) or (b)(More)
This interactive presentation describes LexNet, a graphical environment for graph-based NLP developed at the University of Michigan. LexNet includes LexRank (for text summarization), biased LexRank (for passage retrieval), and TUMBL (for binary classification). All tools in the collection are based on random walks on lexical graphs, that is graphs where(More)
This year we participated in the Novelty track. To find the relevant sentences, we combine sentence salience features that are inherited from text summarization domain with other heuristic features based on topic statements. We propose a novel method to extract the new sentences based on the graph-based ranking of the similarity relation between the(More)
  • 1