• Publications
  • Influence
Open Information Extraction from the Web
Traditionally, Information Extraction (IE) has focused on satisfying precise, narrow, pre-specified requests from small homogeneous corpora (e.g., extract the location and time of seminars from a set…
  • 2,044
  • 258
Identifying Relations for Open Information Extraction
Open Information Extraction (IE) is the task of extracting assertions from massive corpora without requiring a pre-specified vocabulary. This paper shows that the output of state-of-the-art Open IE…
  • 1,089
  • 178
Named Entity Recognition in Tweets: An Experimental Study
People tweet more than 100 Million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The…
  • 1,121
  • 162
Open Language Learning for Information Extraction
Open Information Extraction (IE) systems extract relational tuples from text, without requiring a pre-specified vocabulary, by identifying relation phrases and associated arguments in arbitrary…
  • 547
  • 116
Web document clustering: a feasibility demonstration
Users of Web search engines are often forced to sift through the long ordered list of document "snippets" returned by the engines. The IR community has explored document clustering as an alternative…
  • 1,249
  • 105
Extracting Product Features and Opinions from Reviews
Consumers are often forced to wade through many on-line reviews in order to make an informed product choice. This paper introduces Opine, an unsupervised information-extraction system which mines…
  • 1,950
  • 101
Unsupervised named-entity extraction from the Web: An experimental study
The KnowItAll system aims to automate the tedious process of extracting large collections of facts (e.g., names of scientists or politicians) from the Web in an unsupervised, domain-independent, and…
  • 1,139
  • 92
Web-scale information extraction in KnowItAll: (preliminary results)
Manually querying search engines in order to accumulate a large body of factual information is a tedious, error-prone process of piecemeal search. Search engines retrieve and rank potentially…
  • 848
  • 60
The Tradeoffs Between Open and Traditional Relation Extraction
Traditional Information Extraction (IE) takes a relation name and hand-tagged examples of that relation as input. Open IE is a relation-independent extraction paradigm that is tailored to massive and…
  • 388
  • 54
Open Information Extraction: The Second Generation
How do we scale information extraction to the massive size and unprecedented heterogeneity of the Web corpus? Beginning in 2003, our KnowItAll project has sought to extract high-quality knowledge…
  • 371
  • 51