• Publications
Extracting Product Features and Opinions from Reviews
tl;dr
We introduce Opine, an unsupervised information-extraction system which mines reviews in order to build a model of important product features, their evaluation by reviewers, and their relative quality across products.
  • 1,955
  • 101
  • Open Access
Unsupervised named-entity extraction from the Web: An experimental study
tl;dr
The KnowItAll system aims to automate the tedious process of extracting large collections of facts (e.g., names of scientists or politicians) from the Web in an unsupervised, domain-independent, scalable manner.
  • 1,143
  • 92
  • Open Access
Web-scale information extraction in KnowItAll: (preliminary results)
tl;dr
This paper introduces KnowItAll, a system that aims to automate the tedious process of extracting large collections of facts from the web in an autonomous, domain-independent, and scalable manner.
  • 848
  • 60
  • Open Access
Towards a theory of natural language interfaces to databases
tl;dr
The need for Natural Language Interfaces (NLIs) to databases has become increasingly acute as more nontechnical people access information through their web browsers, PDAs and cell phones.
  • 418
  • 50
  • Open Access
A Machine Learning Approach to Twitter User Classification
tl;dr
We automatically infer the values of user attributes such as political orientation or ethnicity by leveraging observable information such as user behavior, network structure and the linguistic content of the user’s Twitter feed.
  • 507
  • 39
  • Open Access
Web-Scale Distributional Similarity and Entity Set Expansion
tl;dr
We propose a large-scale term similarity algorithm, based on distributional similarity, implemented in the MapReduce framework and deployed over a 200 billion word crawl of the Web.
  • 263
  • 23
  • Open Access
Euler Spiral for Shape Completion
tl;dr
In this paper we address the curve completion problem, e.g., the geometric continuation of boundaries of objects which are temporarily interrupted by occlusion.
  • 164
  • 13
  • Open Access
Common Sense Based Joint Training of Human Activity Recognizers
tl;dr
We show how to combine personal sensor data with object-use data from dense sensors to get accurate activity recognition with little labeling and feature selection overhead.
  • 148
  • 12
  • Open Access
Detecting controversial events from Twitter
tl;dr
This paper addresses the task of identifying controversial events using Twitter as a starting point: we propose 3 models for this task and report encouraging initial results.
  • 201
  • 11
  • Open Access
Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability
tl;dr
We report on the PRECISE NLI, which uses a statistical parser as a "plug in".
  • 166
  • 9
  • Open Access