Modeling Relational Data with Graph Convolutional Networks
It is shown that factorization models for link prediction, such as DistMult, can be significantly improved by an R-GCN encoder that accumulates evidence over multiple inference steps in the graph, yielding a 29.8% improvement on FB15k-237 over a decoder-only baseline.
Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling
A version of graph convolutional networks (GCNs), a recent class of neural networks operating on graphs, is proposed for modeling syntactic dependency graphs; GCN layers are observed to be complementary to LSTM ones.
Modeling online reviews with multi-grain topic models
This paper presents a novel framework for extracting ratable aspects of objects from online user reviews. It argues that multi-grain models are better suited to this task, since standard topic models tend to produce topics corresponding to global properties of objects rather than to the aspects users actually rate.
Inducing Crosslingual Distributed Representations of Words
This work induces distributed representations for a pair of languages jointly and shows that these representations are informative by using them for crosslingual document classification, where classifiers trained on them substantially outperform strong baselines when applied to a new language.
Context-Aware Neural Machine Translation Learns Anaphora Resolution
A context-aware neural machine translation model is introduced, designed so that the flow of information from the extended context into the translation model can be controlled and analyzed.
A Joint Model of Text and Aspect Ratings for Sentiment Summarization
A statistical model is proposed that discovers corresponding topics in text and extracts textual evidence from reviews supporting each aspect rating, addressing a fundamental problem in aspect-based sentiment summarization.
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
The most important and confident heads are found to play consistent and often linguistically interpretable roles. When heads are pruned using a method based on stochastic gates and a differentiable relaxation of the L0 penalty, these specialized heads are the last to be pruned.
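The stochastic gates in this pruning method follow the hard-concrete relaxation of the L0 penalty: each head gets a learnable parameter whose sampled gate can reach exactly 0 (head pruned) or 1 (head kept) while remaining differentiable in between. A minimal sketch of the gate sampling, with hypothetical per-head parameters and standard stretch constants:

```python
import numpy as np

rng = np.random.default_rng(0)

def hard_concrete_gate(log_alpha, beta=2/3, gamma=-0.1, zeta=1.1):
    """Sample gates in [0, 1] from a hard-concrete distribution: a
    binary-concrete sample is stretched to (gamma, zeta) and clipped,
    so exact zeros (pruned heads) occur with nonzero probability."""
    u = rng.uniform(1e-6, 1 - 1e-6, size=np.shape(log_alpha))
    s = 1.0 / (1.0 + np.exp(-(np.log(u) - np.log(1 - u) + log_alpha) / beta))
    s_bar = s * (zeta - gamma) + gamma   # stretch beyond [0, 1]
    return np.clip(s_bar, 0.0, 1.0)     # rectify: exact 0 or 1 possible

# Hypothetical per-head parameters: large log_alpha keeps a head open,
# very negative log_alpha drives its gate toward exactly zero.
log_alpha = np.array([5.0, 5.0, -8.0, -8.0])
gates = hard_concrete_gate(log_alpha)
```

During training, the expected L0 norm of the gates is added to the loss, so the model is explicitly rewarded for closing gates on heads it can do without.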
When a Good Translation is Wrong in Context: Context-Aware Machine Translation Improves on Deixis, Ellipsis, and Lexical Cohesion
A human study on an English-Russian subtitles dataset identifies deixis, ellipsis, and lexical cohesion as the three main sources of contextual inconsistency. A model suitable for this scenario is introduced and demonstrates major gains over a context-agnostic baseline on new benchmarks, without sacrificing performance as measured with BLEU.
Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation
It is argued that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics; this bottleneck is overcome via language-specific components and deeper NMT architectures.
Graph Convolutional Encoders for Syntax-aware Neural Machine Translation
We present a simple and effective approach to incorporating syntactic structure into neural attention-based encoder-decoder models for machine translation, relying on graph-convolutional networks.
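The core operation shared by these GCN-based encoders is simple: each word's representation is updated from its own (self-loop) transform plus transforms of its neighbours in the dependency graph, so stacking k layers mixes information k hops along the tree. A minimal sketch, with a hypothetical 5-word sentence and untyped edges (the actual models use direction- and label-specific weights):

```python
import numpy as np

rng = np.random.default_rng(0)
num_words, dim = 5, 8

# Word vectors and a dependency graph as directed (head -> dependent) edges.
X = rng.normal(size=(num_words, dim))
edges = [(1, 0), (1, 3), (3, 2), (3, 4)]

W = rng.normal(size=(dim, dim)) * 0.1       # shared edge weight (illustrative)
W_self = rng.normal(size=(dim, dim)) * 0.1  # self-loop weight

def gcn_layer(X, edges):
    """One graph-convolutional layer: self-loop transform plus messages
    passed along dependency edges, followed by a ReLU nonlinearity."""
    H = X @ W_self
    for head, dep in edges:
        H[dep] = H[dep] + X[head] @ W       # message from head to dependent
    return np.maximum(H, 0.0)

H1 = gcn_layer(X, edges)                    # stack more layers for wider context
```

In the translation model, such layers sit on top of a recurrent or convolutional encoder, and the decoder attends to the syntax-enriched states.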