Benjamin Roth

Learn More
This paper presents a probabilistic model for sense disambiguation which chooses the best sense based on the conditional probability of sense paraphrases given a context. We use a topic model to decompose this conditional probability into two conditional probabilities with latent variables. We propose three different instantiations of the model for solving(More)
Spoken Language Systems at Saarland University (LSV) participated this year with 5 runs at the TAC KBP English slot filling track. Effective algorithms for all parts of the pipeline, from document retrieval to relation prediction and response post-processing, are bundled in a modular end-to-end relation extraction system called RelationFactory. The main run(More)
Knowledge base (KB) completion adds new facts to a KB by making inferences from existing facts, for example by inferring with high likelihood nationality(X,Y) from bornIn(X,Y). Most previous methods infer simple one-hop relational synonyms like this, or use as evidence a multi-hop relational path treated as an atomic feature, like bornIn(X,Z)→(More)
This paper presents a survey on the role of negation in sentiment analysis. Negation is a very common linguistic construction that affects polarity and, therefore, needs to be taken into consideration in sentiment analysis. We will present various computational approaches modeling negation in sentiment analysis. We will, in particular, focus on aspects,(More)
For the slot filling task of TAC KBP 2010 we developed as a system a simple pipeline architecture whose main components are a two-stage retrieval module and a relation extraction module. We use word-cluster features in the system as a method of achieving generalization by exploiting raw text. In the relation extraction module we use distant supervision in(More)
Universal schema builds a knowledge base (KB) of entities and relations by jointly embedding all relation types from input KBs as well as textual patterns observed in raw text. In most previous applications of universal schema, each textual pattern is represented as a single embedding, preventing generalization to unseen patterns. Recent work employs a(More)
We address relation classification in the context of slot filling, the task of finding and evaluating fillers like “Steve Jobs” for the slot X in “X founded Apple”. We propose a convolutional neural network which splits the input sentence into three parts according to the relation arguments and compare it to state-ofthe-art and traditional approaches of(More)
We survey recent approaches to noise reduction in distant supervision learning for relation extraction. We group them according to the principles they are based on: at-least-one constraints, topic-based models, or pattern correlations. Besides describing them, we illustrate the fundamental differences and attempt to give an outlook to potentially fruitful(More)
Distant supervision is a scheme to generate noisy training data for relation extraction by aligning entities of a knowledge base with text. In this work we combine the output of a discriminative at-least-one learner with that of a generative hierarchical topic model to reduce the noise in distant supervision data. The combination significantly increases the(More)
We present RelationFactory, a highly effective open source relation extraction system based on shallow modeling techniques. RelationFactory emphasizes modularity, is easily configurable and uses a transparent pipelined approach. The interactive demo allows the user to pose queries for which RelationFactory retrieves and analyses contexts that contain(More)