The Fall 2010 issue of AI Magazine includes an article on "Building Watson: An Overview of the DeepQA Project," written by the IBM Watson Research Team, led by David Ferrucci. Read about this exciting project in the most detailed technical article available. We hope you will also take a moment to read through the archives of AI Magazine (../issues.php) and …
An invaluable portion of scientific data occurs naturally in text form. Given a large unlabeled document collection, it is often helpful to organize this collection into clusters of related documents. By using a vector space model, text data can be treated as high-dimensional but sparse numerical data vectors. It is a contemporary challenge to efficiently …
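As an illustration of the vector space model mentioned in this abstract, the following is a minimal sketch, not the paper's method. It assumes raw term-frequency weights, a simple lowercase word tokenizer, and dictionaries as the sparse representation; the function names `to_sparse_vector` and `cosine` and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def to_sparse_vector(text):
    """Map a document to a unit-length sparse term-frequency vector.

    Only terms that occur in the document are stored, so the dict stays
    small even though the full vocabulary (the vector space) may be huge.
    """
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {term: c / norm for term, c in counts.items()}


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors (a dot product)."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    return sum(weight * v[term] for term, weight in u.items() if term in v)


# Hypothetical sample documents, for illustration only.
docs = [
    "sparse vectors represent text documents",
    "text documents become sparse vectors",
    "protein folding simulation results",
]
vectors = [to_sparse_vector(d) for d in docs]
```

Documents that share vocabulary score close to 1, while documents with no terms in common score 0; a clustering procedure could then group documents by this similarity.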
A traditional goal of Artificial Intelligence research has been a system that can read unrestricted natural language texts on a given topic, build a model of that topic and reason over the model. Natural Language Processing advances in syntax and semantics have made it possible to extract a limited form of meaning from sentences. Knowledge Representation …
A source expansion algorithm automatically extends a given text corpus with related content from large external sources such as the Web. The expanded corpus is not intended for human consumption but can be used in question answering (QA) and other information retrieval or extraction tasks to find more relevant information and supporting evidence. We propose …
Most existing Question Answering (QA) systems adopt a type-and-generate approach to candidate generation that relies on a pre-defined domain ontology. This paper describes a type-independent search and candidate generation paradigm for QA that leverages Wikipedia characteristics. This approach is particularly useful for adapting QA systems to domains where …
As part of the ongoing Project Halo, our goal is to build a system capable of answering questions posed by novice users to a formal knowledge base. In our current context, the knowledge base covers selected topics in physics, chemistry, and biology, and our question set consists of AP (advanced high-school) level examination questions. The task is …
David Ferrucci¹, Eric Nyberg², James Allan³, Ken Barker⁴, Eric Brown¹, Jennifer Chu-Carroll¹, Arthur Ciccolo¹, Pablo Duboue¹, James Fan¹, David Gondek¹, Eduard Hovy⁵, Boris Katz⁶, Adam Lally¹, Michael McCord¹, Paul Morarescu¹, Bill Murdock¹, Bruce Porter⁴, John Prager¹, Tomek Strzalkowski⁷, Chris Welty¹, Wlodek Zadrozny¹. Affiliations: ¹IBM Research Division, Thomas J. …
This paper describes a novel approach to the semantic relation detection problem. Instead of relying only on the training instances for a new relation, we leverage the knowledge learned from previously trained relation detectors. Specifically, we detect a new semantic relation by projecting the new relation's training instances onto a lower-dimensional topic …
In this paper, we present a manifold model for medical relation extraction. Our model is built upon a medical corpus containing 80M sentences (11 GB of text) and designed to accurately and efficiently detect the key medical relations that can facilitate clinical decision making. Our approach integrates domain-specific parsing and typing systems, and can …