Ajay Nagesh

Generic rule-based systems for Information Extraction (IE) have been shown to work reasonably well out-of-the-box, and achieve state-of-the-art accuracy with further domain customization. However, it is generally recognized that manually building and customizing rules is a complex and labor-intensive process. In this paper, we discuss an approach that …
Distant supervision, a paradigm of relation extraction where training data is created by aligning facts in a database with a large unannotated corpus, is an attractive approach for training relation extractors. Various models have been proposed in the recent literature to align the facts in the database to their mentions in the corpus. In this paper, we discuss and …
We describe a novel max-margin learning approach to optimize non-linear performance measures for distantly-supervised relation extraction models. Our approach can be generally used to learn latent variable models under multivariate non-linear performance measures, such as the Fβ-score. Our approach interleaves the Concave-Convex Procedure (CCCP) for populating …
This paper is an attempt to raise pertinent questions and act as a platform to generate fruitful discussions within the AKBC community about the need for a large-scale dataset for relation extraction. For proper training and evaluation of relation extraction tasks, the weaknesses of datasets used so far need to be tackled: mainly the size (too small) and the …
Information Extraction (IE) has become an indispensable tool in our quest to handle the data deluge of the information age. IE can broadly be classified into Named-entity Recognition (NER) and Relation Extraction (RE). In this thesis, we view the task of IE as finding patterns in unstructured data, which can either take the form of features and/or be …
Building relational models for the structured output classification problem of sequence labeling has been recently explored in a few research works. The models built in such a manner are interpretable and capture much more information about the domain (than models built directly from basic attributes), resulting in accurate predictions. On the other hand, …