Named Entity Recognition in Tweets: An Experimental Study
The novel T-ner system doubles F1 score compared with the Stanford NER system, and leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision.
Open Language Learning for Information Extraction
Open Information Extraction (IE) systems extract relational tuples from text, without requiring a pre-specified vocabulary, by identifying relation phrases and associated arguments in arbitrary…
Open Information Extraction: The Second Generation
- Oren Etzioni, Anthony Fader, Janara Christensen, S. Soderland, Mausam
- Computer Science, IJCAI
- 16 July 2011
The second generation of Open IE systems is described; these systems rely on a novel model of how relations and their arguments are expressed in English sentences to double precision/recall compared with previous systems such as TEXTRUNNER and WOE.
- Nilesh N. Dalvi, Pedro M. Domingos, Mausam, Sumit K. Sanghai, D. Verma
- Computer Science, KDD
- 22 August 2004
This paper views classification as a game between the classifier and the adversary and produces a classifier that is optimal given the adversary's optimal strategy; experiments show that this approach can greatly outperform a classifier learned in the standard way.
Open domain event extraction from twitter
TwiCal, the first open-domain event-extraction and categorization system for Twitter, is described, and a novel approach for discovering important event categories and classifying extracted events based on latent variable models is presented.
When is Temporal Planning Really Temporal?
A complete state-space temporal planning algorithm is designed, which the authors hope will be able to achieve high performance by leveraging the heuristics that power decision epoch planners.
Towards Coherent Multi-Document Summarization
G-FLOW is evaluated on Mechanical Turk, and it is found that it generates dramatically better summaries than an extractive summarizer based on a pipeline of state-of-the-art sentence selection and reordering components, underscoring the value of the joint model.
Generating Coherent Event Schemas at Scale
This work presents a novel approach to inducing open-domain event schemas that overcomes limitations of Chambers and Jurafsky's (2009) schemas and uses co-occurrence statistics of semantically typed relational triples, which it calls Rel-grams (relational n-grams).
A Latent Dirichlet Allocation Method for Selectional Preferences
LDA-SP, which utilizes LinkLDA to model selectional preferences, combines the benefits of previous approaches: like traditional class-based approaches, it produces human-interpretable classes describing each relation's preferences, but it is competitive with non-class-based methods in predictive power.
An analysis of open information extraction based on semantic role labeling
This work investigates the use of semantic role labeling techniques for the task of Open IE and compares SRL-based open extractors with TextRunner, an open extractor which uses shallow syntactic analysis but is able to analyze many more sentences in a fixed amount of time and thus exploit corpus-level statistics.