Learn More
Several recent works on relation extraction have been applying the distant supervision paradigm: instead of relying on annotated text to learn how to predict relations, they employ existing knowledge bases (KBs) as source of supervision. Crucially, these approaches are trained based on the assumption that each sentence which mentions the two related(More)
Topic models provide a powerful tool for analyzing large text collections by representing high dimensional data in a low dimensional subspace. Fitting a topic model given a set of training documents requires approximate inference techniques that are computationally expensive. With today's large-scale, constantly expanding document collections, it is useful(More)
This paper addresses several key issues in the ArnetMiner system, which aims at extracting and mining academic social networks. Specifically, the system focuses on: 1) Extracting researcher profiles automatically from the Web; 2) Integrating the publication data into the network from existing digital libraries; 3) Modeling the entire academic network; and(More)
Traditional relation extraction predicts relations within some fixed and finite target schema. Machine learning approaches to this task require either manual annotation or, in the case of distant supervision, existing struc-tured sources of the same schema. The need for existing datasets can be avoided by using a universal schema: the union of all involved(More)
We explore unsupervised approaches to relation extraction between two named entities; for instance, the semantic bornIn relation between a person and location entity. Concretely , we propose a series of generative probabilistic models, broadly similar to topic models, each which generates a corpus of observed triples of entity mention pairs and the surface(More)
We present a novel approach to relation extraction that integrates information across documents , performs global inference and requires no labelled text. In particular, we tackle relation extraction and entity identification jointly. We use distant supervision to train a factor graph model for relation extraction based on an existing knowledge base(More)
The taxonomy of scopulariopsis-like fungi, comprising numerous human opportunistic species, has recently been reassessed with delineation of the genera Microascus, Pithoascus, Pseudoscopulariopsis, and Scopulariopsis, using morphological data and multilocus sequence analysis based on four loci (ITS, LSU, EF-1α, and TUB). In this study, the same genetic(More)