• Publications
Named Entity Recognition in Tweets: An Experimental Study
People tweet more than 100 Million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The ...
  • 1,114
  • 162
Open Language Learning for Information Extraction
Open Information Extraction (IE) systems extract relational tuples from text, without requiring a pre-specified vocabulary, by identifying relation phrases and associated arguments in arbitrary ...
  • 543
  • 115
Open Information Extraction: The Second Generation
How do we scale information extraction to the massive size and unprecedented heterogeneity of the Web corpus? Beginning in 2003, our KnowItAll project has sought to extract high-quality knowledge ...
  • 368
  • 51
Adversarial classification
Essentially all data mining algorithms assume that the data-generating process is independent of the data miner's activities. However, in many domains, including spam detection, intrusion detection, ...
  • 654
  • 48
Open domain event extraction from twitter
Tweets are the most up-to-date and inclusive stream of information and commentary on current events, but they are also fragmented and noisy, motivating the need for systems that can extract, ...
  • 526
  • 42
When is Temporal Planning Really Temporal?
While even STRIPS planners must search for plans of unbounded length, temporal planners must also cope with the fact that actions may start at any point in time. Most temporal planners cope with this ...
  • 167
  • 16
Generating Coherent Event Schemas at Scale
Chambers and Jurafsky (2009) demonstrated that event schemas can be automatically induced from text corpora. However, our analysis of their schemas identifies several weaknesses, e.g., some schemas ...
  • 74
  • 13
A Latent Dirichlet Allocation Method for Selectional Preferences
The computation of selectional preferences, the admissible argument values for a relation, is a well-known NLP task with broad applicability. We present LDA-SP, which utilizes LinkLDA (Erosheva et ...
  • 180
  • 12
Towards Coherent Multi-Document Summarization
This paper presents G-FLOW, a novel system for coherent extractive multi-document summarization (MDS). Where previous work on MDS considered sentence selection and ordering separately, G-FLOW ...
  • 95
  • 11
Topological Value Iteration Algorithms
Value iteration is a powerful yet inefficient algorithm for Markov decision processes (MDPs) because it puts the majority of its effort into backing up the entire state space, which turns out to be ...
  • 44
  • 11