• Publications
Unsupervised named-entity extraction from the Web: An experimental study
The KnowItAll system aims to automate the tedious process of extracting large collections of facts (e.g., names of scientists or politicians) from the Web in an unsupervised, domain-independent, and...
  • 1,145
  • 92
Web-scale information extraction in KnowItAll: (preliminary results)
Manually querying search engines in order to accumulate a large body of factual information is a tedious, error-prone process of piecemeal search. Search engines retrieve and rank potentially...
  • 849
  • 60
Large-scale Semantic Parsing via Schema Matching and Lexicon Extension
Supervised training procedures for semantic parsers produce high-quality semantic parsers, but they have difficulty scaling to large databases because of the sheer number of logical constants for...
  • 228
  • 25
TextRunner: Open Information Extraction on the Web
Traditional information extraction systems have focused on satisfying precise, narrow, pre-specified requests from small, homogeneous corpora. In contrast, the TextRunner system demonstrates a new...
  • 293
  • 19
Re-ranking for joint named-entity recognition and linking
Recognizing names and linking them to structured data is a fundamental task in text analysis. Existing approaches typically perform these two steps using a pipeline architecture: they use a...
  • 92
  • 14
Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability
Natural Language Interfaces to Databases (NLIs) can benefit from the advances in statistical parsing over the last fifteen years or so. However, statistical parsers require training on a massive,...
  • 168
  • 9
Methods for Domain-Independent Information Extraction from the Web: An Experimental Comparison
Our KNOWITALL system aims to automate the tedious process of extracting large collections of facts (e.g., names of scientists or politicians) from the Web in an autonomous, domain-independent, and...
  • 146
  • 7
Distributional Representations for Handling Sparsity in Supervised Sequence-Labeling
Supervised sequence-labeling systems in natural language processing often suffer from data sparsity because they use word types as features in their prediction tasks. Consequently, they have...
  • 95
  • 7
Unsupervised Methods for Determining Object and Relation Synonyms on the Web
The task of identifying synonymous relations and objects, or synonym resolution, is critical for high-quality information extraction. This paper investigates synonym resolution in the context of...
  • 110
  • 5
Semantic Parsing Freebase: Towards Open-domain Semantic Parsing
Existing semantic parsing research has steadily improved accuracy on a few domains and their corresponding databases. This paper introduces FreeParser, a system that trains on one domain and one set...
  • 38
  • 5