Unsupervised named-entity extraction from the Web: An experimental study
Web-scale information extraction in KnowItAll: (preliminary results)
KnowItAll, a system that aims to automate the tedious process of extracting large collections of facts from the web in an autonomous, domain-independent, and scalable manner, is introduced.
Large-scale Semantic Parsing via Schema Matching and Lexicon Extension
A semantic parser for Freebase is developed based on a reduction to standard supervised training algorithms, schema matching, and pattern learning that is capable of parsing questions with an F1 that improves by 0.42 over a purely-supervised learning algorithm.
TextRunner: Open Information Extraction on the Web
- A. Yates, Michele Banko, M. Broadhead, Michael J. Cafarella, Oren Etzioni, S. Soderland
- Computer Science
- North American Chapter of the Association for…
- 23 April 2007
The TextRunner system demonstrates a new kind of information extraction, called Open Information Extraction (OIE), in which the system makes a single, data-driven pass over the entire corpus and extracts a large set of relational tuples, without requiring any human input.
Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability
- Ana-Maria Popescu, Alex Armanasu, Oren Etzioni, David Ko, A. Yates
- Computer Science
- International Conference on Computational…
- 23 August 2004
The paper shows how a strong semantic model coupled with "light re-training" enables PRECISE to overcome parser errors and correctly map parsed questions to the corresponding SQL queries.
Re-ranking for joint named-entity recognition and linking
A joint model for NER and EL, called NEREL, is presented; it takes a large set of candidate mentions from typical NER systems and a large set of candidate entity links from EL systems, and ranks the candidate mention-entity pairs together to make joint predictions.
Methods for Domain-Independent Information Extraction from the Web: An Experimental Comparison
- Oren Etzioni, Michael J. Cafarella, A. Yates
- Computer Science
- AAAI Conference on Artificial Intelligence
- 25 July 2004
Three distinct ways to improve KNOWITALL's recall and extraction rate without sacrificing precision are presented and evaluated.
Distributional Representations for Handling Sparsity in Supervised Sequence-Labeling
It is demonstrated that distributional representations of word types, trained on unannotated text, can be used to improve performance on rare words and reduce the sample complexity of sequence labeling.
Semantic Parsing Freebase: Towards Open-domain Semantic Parsing
This paper introduces FreeParser, a system that trains on one domain and one set of predicate and constant symbols, and then can parse sentences for any new domain, including sentences that refer to symbols never seen during training.
To buy or not to buy: mining airfare data to minimize ticket purchase price
- Oren Etzioni, R. Tuchinda, Craig A. Knoblock, A. Yates
- Business
- Knowledge Discovery and Data Mining
- 24 August 2003
A pilot study in the domain of airline ticket prices suggests that mining of price data available over the web has the potential to save consumers substantial sums of money per annum.