While most approaches to automatically recognizing entailment relations have used classifiers employing hand-engineered features derived from complex natural language processing pipelines, in practice their performance has been only slightly better than bag-of-words pair classifiers using only lexical similarity. The only attempt so far to build an ...
MOTIVATION: The accurate identification of chemicals in text is important for many applications, including computer-assisted reconstruction of metabolic networks or retrieval of information about substances in drug development. But due to the diversity of naming conventions and traditions for such molecules, this task is highly complex and should be ...
Named entity recognition (NER) systems are often based on machine learning techniques to reduce the labor-intensive development of hand-crafted extraction rules and domain-dependent dictionaries. Nevertheless, time-consuming feature engineering is often needed to achieve state-of-the-art performance. In this study, we investigate the impact of such ...
Many current natural language processing applications for social media rely on representation learning and utilize pre-trained word embeddings. There currently exist several publicly available, pre-trained sets of word embeddings, but they contain few or no emoji representations even as emoji usage in social media has increased. In this paper we release ...
Stance detection is the task of classifying the attitude expressed in a text towards a target. Previous work has assumed that either the target is mentioned in the text or that training data for every target is given. This paper considers the more challenging version of this task, where targets are not always mentioned and no training data is available for the test targets. We experiment with ...
Matrix factorization of knowledge bases in universal schema has facilitated accurate distantly-supervised relation extraction. This factorization encodes dependencies between textual patterns and structured relations using low-dimensional vectors defined for each entity pair; although these factors are effective at combining evidence for an entity pair, ...
Existing modeling languages lack the expressiveness or efficiency to support many modern and successful machine learning (ML) models such as structured prediction or matrix factorization. We present WOLFE, a probabilistic programming language that enables practitioners to develop such models. Most ML approaches can be formulated in terms of scalar ...
The ability to reason with natural language is a fundamental prerequisite for many NLP tasks such as information extraction, machine translation and question answering. To quantify this ability, systems are commonly tested on whether they can recognize textual entailment, i.e., whether one sentence can be inferred from another. However, in most NLP ...
Descriptions of genetic variations and their effect are widely spread across the biomedical literature. However, finding all mentions of a specific variation, or all mentions of variations in a specific gene, is difficult to achieve due to the many ways such variations are described. Here, we describe SETH, a tool for the recognition of ...
Methods based on representation learning currently hold the state-of-the-art in many natural language processing and knowledge base inference tasks. Yet, a major challenge is how to efficiently incorporate commonsense knowledge into such models. A recent approach regularizes relation and entity representations by propositionalization of first-order logic ...