Adapting Open Information Extraction to Domain-Specific Relations


dent processes and domain-specific knowledge. Until recently, information extraction has leaned heavily on domain knowledge, which requires either manual engineering or manual tagging of examples (Miller et al. 1998; Soderland 1999; Culotta, McCallum, and Betz 2006). Semisupervised approaches (Riloff and Jones 1999; Agichtein and Gravano 2000; Rosenfeld and Feldman 2007) need only a small amount of hand-annotated training data, but they need it for every relation of interest. This still presents a knowledge engineering bottleneck when one considers the unbounded number of relations in a diverse corpus such as the web. Shinyama and Sekine (2006) explored unsupervised relation discovery using a clustering algorithm that achieved good precision but limited scalability. The KnowItAll research group is a pioneer of a new paradigm, Open IE (Banko et al. 2007; Banko and Etzioni 2008), that operates in a fully domain-independent manner and at web scale. An Open IE system makes a single pass over its corpus and extracts a diverse set of relational tuples without requiring any relation-specific human input. Open IE is ideally suited to corpora such as the web, where the target relations are not known in advance and their number is massive.
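To make the idea of a single-pass, relation-independent extractor concrete, here is a minimal toy sketch in Python. It is not the method used by the systems cited above; it assumes, purely for illustration, that arguments are capitalized phrases and that the one to four lowercase words between two arguments form the relation phrase. The names `ARG`, `TUPLE_PATTERN`, and `extract_tuples` are hypothetical.

```python
import re

# Assumption for illustration only: an argument is a run of capitalized words.
ARG = r"[A-Z][A-Za-z]*(?: [A-Z][A-Za-z]*)*"

# Hypothetical toy pattern: argument, 1-4 lowercase words (the relation
# phrase), argument. No relation names are listed anywhere in the pattern.
TUPLE_PATTERN = re.compile(rf"({ARG}) ((?:[a-z]+ ){{1,4}})({ARG})")


def extract_tuples(corpus):
    """Make one pass over the sentences and collect (arg1, relation, arg2)."""
    tuples = []
    for sentence in corpus:
        for arg1, relation, arg2 in TUPLE_PATTERN.findall(sentence):
            tuples.append((arg1, relation.strip(), arg2))
    return tuples


corpus = [
    "Einstein was born in Ulm.",
    "Google acquired YouTube.",
]
print(extract_tuples(corpus))
# [('Einstein', 'was born in', 'Ulm'), ('Google', 'acquired', 'YouTube')]
```

The relation phrases ("was born in", "acquired") come from the text itself rather than from a predefined list, which is the property the paragraph above attributes to Open IE. A surface pattern this crude would miss most real sentences.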


Cite this paper

@article{Soderland2010AdaptingOI,
  title={Adapting Open Information Extraction to Domain-Specific Relations},
  author={Stephen Soderland and Brendan Roof and Bo Qin and Shi Xu and Mausam and Oren Etzioni},
  journal={AI Magazine},
  year={2010},
  volume={31},
  pages={93-102}
}