In this paper, we present the evaluation of our CLIR system, carried out as part of our participation in FIRE 2008. We participated in the Hindi-to-English, Marathi-to-English, and English-to-Hindi bilingual tasks and in the English, Hindi, and Marathi monolingual tasks. We take a query-translation-based approach using bilingual dictionaries. Query words not found in the...
We describe a novel max-margin learning approach to optimize non-linear performance measures for distantly-supervised relation extraction models. Our approach can be generally used to learn latent variable models under multivariate non-linear performance measures, such as Fβ-score. Our approach interleaves Concave-Convex Procedure (CCCP) for populating...
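For reference, the Fβ-score named in the abstract above has the following standard definition. The symbols P (precision), R (recall), and the counts TP, FP, FN are notation introduced here for illustration and do not come from the abstract. Because the score is a ratio of corpus-level counts, it does not decompose into a sum of per-instance losses, which is presumably the sense in which the abstract calls it multivariate and non-linear.

```latex
F_\beta = \frac{(1 + \beta^2)\, P\, R}{\beta^2 P + R},
\qquad
P = \frac{TP}{TP + FP},
\qquad
R = \frac{TP}{TP + FN}
```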
Generic rule-based systems for Information Extraction (IE) have been shown to work reasonably well out-of-the-box, and achieve state-of-the-art accuracy with further domain customization. However, it is generally recognized that manually building and customizing rules is a complex and labor-intensive process. In this paper, we discuss an approach that...
Distant supervision, a paradigm of relation extraction where training data is created by aligning facts in a database with a large unannotated corpus, is an attractive approach for training relation extractors. Various models have been proposed in the recent literature to align the facts in the database to their mentions in the corpus. In this paper, we discuss and...
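The alignment step described in the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible heuristic, plain substring matching of both entity names in a sentence, and is not the model the paper discusses. The function name `align_facts`, the fact tuple layout, and the toy sentences are all hypothetical.

```python
def align_facts(facts, sentences):
    """Label every sentence that mentions both entities of a known fact.

    facts: iterable of (subject, relation, object) tuples (hypothetical layout).
    sentences: iterable of raw text strings from the unannotated corpus.
    Returns a list of (sentence, subject, object, relation) training examples.
    """
    examples = []
    for sentence in sentences:
        for subj, relation, obj in facts:
            # Naive assumption: co-occurrence of both names implies the relation.
            if subj in sentence and obj in sentence:
                examples.append((sentence, subj, obj, relation))
    return examples


# Toy database and corpus, made up for illustration only.
facts = [("Barack Obama", "born_in", "Honolulu")]
sentences = [
    "Barack Obama was born in Honolulu.",
    "Barack Obama visited Honolulu last year.",
    "Honolulu is the capital of Hawaii.",
]

training = align_facts(facts, sentences)
for example in training:
    print(example)
```

In this toy run the second sentence is also labeled `born_in` even though it does not express that relation. Noisy labels of this kind are the usual motivation for more careful alignment models.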
Discovering relational structure between input features in sequence labeling models has been shown to improve their accuracy in several problem settings. The problem of learning relational structure for sequence labeling can be posed as learning Markov Logic Networks (MLN) for sequence labeling, which we abbreviate as Markov Logic Chains (MLC). This objective...
Fast algorithms for a transform, such as fast Fourier transform (FFT) algorithms, are based on different decomposition techniques. It is shown that these decomposition techniques can also be applied to the computation of the discrete Hartley transform (DHT) for a real-valued sequence. Recently, an efficient decomposition technique for radix-3 decimation-in-time...
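As background for the abstract above, the sketch below computes the DHT directly from its textbook definition, H[k] = Σ x[n]·(cos(2πnk/N) + sin(2πnk/N)), and checks it against the DFT through the identity H[k] = Re(F[k]) − Im(F[k]). This is the plain O(N²) definition, not the radix-3 decomposition the paper refers to. The function names are chosen here for illustration.

```python
import cmath
import math


def dht(x):
    """Direct O(N^2) discrete Hartley transform of a real-valued sequence."""
    n = len(x)
    out = []
    for k in range(n):
        acc = 0.0
        for t in range(n):
            angle = 2.0 * math.pi * t * k / n
            # cas(angle) = cos(angle) + sin(angle) is the Hartley kernel.
            acc += x[t] * (math.cos(angle) + math.sin(angle))
        out.append(acc)
    return out


def idht(h):
    """Inverse DHT: the transform is its own inverse up to a factor 1/N."""
    n = len(h)
    return [value / n for value in dht(h)]


def dht_from_dft(x):
    """Same transform obtained from the DFT: H[k] = Re(F[k]) - Im(F[k])."""
    n = len(x)
    out = []
    for k in range(n):
        f_k = sum(x[t] * cmath.exp(-2j * math.pi * t * k / n) for t in range(n))
        out.append(f_k.real - f_k.imag)
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
hartley = dht(signal)
recovered = idht(hartley)
```

Because the DHT maps real input to real output and shares this close relation with the DFT, decomposition techniques developed for the FFT can be carried over to it, which is the point the abstract makes.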
Building relational models for the structured output classification problem of sequence labeling has recently been explored in a few studies. The models built in such a manner are interpretable and capture much more information about the domain (than models built directly from basic attributes), resulting in accurate predictions. On the other hand,...
We describe the UMass IESL relation extraction system for TAC KBP 2016. One of the main challenges in TAC 2016 is to extract relations from multiple languages, including those with relatively low resources like Spanish. To mitigate the problem, we integrate multilingual and compositional universal schema from Verga et al. (2016) into our slot filling and...
Automatic short answer grading (ASAG) techniques are designed to automatically assess short natural-language answers ranging in length from a few words to a few sentences. In this paper, we report an intriguing finding that the short answers to a question collectively share significant lexical commonalities. Based on this finding, we propose...
Information Extraction (IE) has become an indispensable tool in our quest to handle the data deluge of the information age. IE can broadly be classified into Named-entity Recognition (NER) and Relation Extraction (RE). In this thesis, we view the task of IE as finding patterns in unstructured data, which can either take the form of features and/or be...