Extracting Product Features and Opinions from Reviews
TLDR
This paper introduces Opine, an unsupervised information-extraction system that mines reviews to build a model of important product features, their evaluation by reviewers, and their relative quality across products.
Unsupervised named-entity extraction from the Web: An experimental study
TLDR
This paper presents an overview of KnowItAll's novel architecture and design principles, emphasizing its distinctive ability to extract information without any hand-labeled training examples, and presents and evaluates three distinct ways to address this challenge.
Web-scale information extraction in KnowItAll: (preliminary results)
TLDR
This paper introduces KnowItAll, a system that aims to automate the tedious process of extracting large collections of facts from the web in an autonomous, domain-independent, and scalable manner.
Towards a theory of natural language interfaces to databases
TLDR
This paper proves that, for a broad class of semantically tractable natural language questions, Precise is guaranteed to map each question to the corresponding SQL query, and shows that Precise compares favorably with Mooney's learning NLI and with Microsoft's English Query product.
A Machine Learning Approach to Twitter User Classification
TLDR
This paper uses a machine learning approach to automatically infer the values of user attributes, such as political orientation or ethnicity, from observable information such as user behavior, network structure, and the linguistic content of the user's Twitter feed.
Web-Scale Distributional Similarity and Entity Set Expansion
TLDR
This work applies the learned similarity matrix to the task of automatic set expansion and presents a large empirical study to quantify the effect on expansion performance of corpus size, corpus quality, seed composition and seed size.
Euler Spiral for Shape Completion
TLDR
This paper analytically derives an optimal solution in the class of biarc curves, which is then used as the initial curve; the resulting interpolations across gaps and occlusions are intuitive and extensible, in contrast to the scale-invariant version of elastica.
Common Sense Based Joint Training of Human Activity Recognizers
TLDR
By synchronizing personal sensor data with object-use data, it is possible to use easily specified commonsense models to minimize labeling overhead, and combining a generative commonsense model of activity with a discriminative model of actions can automate feature selection.
Detecting controversial events from Twitter
TLDR
This paper addresses the task of identifying controversial events using Twitter as a starting point: it proposes 3 models for this task and reports encouraging initial results.
Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability
TLDR
The paper shows how a strong semantic model coupled with "light re-training" enables PRECISE to overcome parser errors and correctly map parsed questions to the corresponding SQL queries.