Publications
A Domain-Agnostic Approach to Spam-URL Detection via Redirects
TLDR: This work proposes a new approach for detecting spam URLs on the Web that leverages the properties of URL redirections widely deployed by spammers, and combines the redirect chains into a redirection graph that reveals the underlying infrastructure in which the spammers operate.
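As a rough illustration of the redirection-graph idea, here is a minimal sketch, not the paper's implementation: crawled redirect chains are folded into one directed graph so that chains sharing a hop merge. The example chains, the domain-level node granularity, and the use of weakly connected components are all illustrative assumptions.

```python
# Minimal sketch: fold redirect chains into a single redirection graph.
# Assumptions (not from the paper): nodes are domains, and shared
# infrastructure is read off as weakly connected components.
import networkx as nx
from urllib.parse import urlparse

# Hypothetical redirect chains: each is the ordered list of hops a
# crawler followed from an initial URL to its final landing page.
chains = [
    ["http://a.example/x", "http://hub.example/r", "http://land1.example/"],
    ["http://b.example/y", "http://hub.example/r", "http://land2.example/"],
]

graph = nx.DiGraph()
for chain in chains:
    hops = [urlparse(u).netloc for u in chain]   # collapse URLs to domains
    graph.add_edges_from(zip(hops, hops[1:]))    # one edge per redirect hop

# Chains that share an intermediate hop merge into one component,
# exposing infrastructure reused across otherwise unrelated spam URLs.
for component in nx.weakly_connected_components(graph):
    print(sorted(component))
```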
Repurposing Entailment for Multi-Hop Question Answering Tasks
TLDR: This work introduces Multee, a general architecture that effectively uses entailment models for multi-hop QA tasks; with an entailment function pre-trained on NLI datasets, Multee outperforms both QA models trained only on the target QA datasets and the OpenAI transformer models.
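To sketch how a sentence-level entailment scorer can be repurposed for multi-hop QA: score each context sentence against the answer hypothesis, then aggregate. The toy scorer and the softmax weighting below are placeholder assumptions, not Multee's actual multi-level attention architecture.

```python
# Illustrative aggregation of sentence-level entailment scores for
# multi-hop QA; both the scorer and the weighting are stand-ins.
import numpy as np

def entailment_score(premise: str, hypothesis: str) -> float:
    # Placeholder for a pre-trained NLI model's entailment probability;
    # here: fraction of hypothesis words covered by the premise.
    hyp = set(hypothesis.lower().split())
    return len(hyp & set(premise.lower().split())) / max(len(hyp), 1)

def answer_score(sentences: list[str], hypothesis: str) -> float:
    # Weight each sentence's entailment score with softmax weights,
    # an attention-like stand-in for learned sentence relevance.
    scores = np.array([entailment_score(s, hypothesis) for s in sentences])
    weights = np.exp(scores) / np.exp(scores).sum()
    return float(weights @ scores)

context = ["The moon orbits the earth.", "The earth orbits the sun."]
print(answer_score(context, "The moon indirectly orbits the sun."))
```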
Recognizing cultural events in images: A study of image categorization models
TLDR: This work poses cultural event recognition as an image categorization problem and finds that Regularized Max Pooling with CNN, SIFT, and color features achieves the best performance.
Modeling Label Semantics for Predicting Emotional Reactions
TLDR: This work explicitly models label classes via label embeddings, adds mechanisms that track label-label correlations during both training and inference, and introduces a new semi-supervision strategy that regularizes for these correlations on unlabeled data.
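A minimal numpy sketch of the label-embedding idea, assuming a simple dot-product scorer: because every label's score is computed against the same text representation, labels with similar embeddings receive correlated scores. The dimensions and the scorer are illustrative; the paper's correlation-tracking mechanisms are richer than this.

```python
# Sketch: scoring emotional-reaction labels via learned label embeddings.
# All sizes and the dot-product scorer are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
num_labels, dim = 8, 16                          # e.g., 8 reaction labels
label_emb = rng.normal(size=(num_labels, dim))   # learned label embeddings
text_repr = rng.normal(size=dim)                 # encoder output for one text

# Per-label scores compare the text to each label embedding, so labels
# with similar embeddings naturally get correlated predictions.
scores = label_emb @ text_repr
probs = 1 / (1 + np.exp(-scores))                # one sigmoid per label
print(probs.round(3))
```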
Modeling Preconditions in Text with a Crowd-sourced Dataset
TLDR: This work introduces PeKo, a crowd-sourced annotation of preconditions between event pairs in newswire that is an order of magnitude larger than prior text annotations; generation results show that fine-tuning an LM on PeKo yields better conditional relations than training on raw text or temporally ordered corpora.
Controlling Information Aggregation for Complex Question Answering
TLDR: Empirical evaluation on an elementary science exam benchmark shows that the proposed methods enable effective aggregation even over larger graphs, and demonstrates the complementary value of information aggregation for answering complex questions.
Fake Sentence Detection as a Training Task for Sentence Encoding
TLDR: A BiLSTM trained on fake sentence detection learns much faster from smaller amounts of data (1M instead of 64M sentences), and the learned representations capture many of the syntactic and semantic properties expected from good sentence representations.
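For illustration, the sketch below builds training pairs for the fake-sentence task, assuming the common corruption operations of shuffling a sentence's words or dropping one word; the paper's exact corruption scheme and the BiLSTM classifier itself are not reproduced here.

```python
# Sketch: constructing (sentence, label) pairs for fake sentence
# detection. The two corruption operations are assumptions.
import random

def make_fake(sentence: str, rng: random.Random) -> str:
    """Corrupt a real sentence into a 'fake' negative example by
    shuffling its words or dropping one word."""
    words = sentence.split()
    if rng.random() < 0.5 and len(words) > 1:
        rng.shuffle(words)                        # word-order corruption
    else:
        words.pop(rng.randrange(len(words)))      # word-drop corruption
    return " ".join(words)

rng = random.Random(0)
real = "the cat sat on the mat"
# Label 1 = real sentence, label 0 = corrupted fake.
pairs = [(real, 1), (make_fake(real, rng), 0)]
print(pairs)
```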
Cross Sentence Inference for Process Knowledge
TLDR: This work extends standard within-sentence joint inference to inference across multiple sentences, which promotes role assignments that are compatible across different descriptions of the same process.
Generating Narrative Text in a Switching Dynamical System
TLDR: This work introduces a new model that integrates explicit narrative structure with neural language models, formalizing narrative modeling as a Switching Linear Dynamical System (SLDS); the model can be learned in a semi-supervised fashion from both labeled and unlabeled data.
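A generic SLDS factorizes roughly as below; this is a standard textbook sketch of the model class the title names, with illustrative notation rather than the paper's exact parameterization.

```latex
% Requires amsmath. s_t: discrete switch (narrative state),
% z_t: continuous latent state, x_t: the t-th generated sentence.
\begin{align*}
  s_t &\sim p(s_t \mid s_{t-1})
      && \text{narrative-state transition} \\
  z_t &= A_{s_t} z_{t-1} + b_{s_t} + \epsilon_t, \quad
        \epsilon_t \sim \mathcal{N}(0, \Sigma_{s_t})
      && \text{switching linear dynamics} \\
  x_t &\sim p_{\mathrm{LM}}(x_t \mid z_t)
      && \text{neural language model emission}
\end{align*}
```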
Toward Diverse Precondition Generation
TLDR: This work proposes DiP, the Diverse Precondition generation system, which produces unique and diverse preconditions, a task not always suitable for standard seq2seq frameworks, and automatically generates control codes without training on diverse examples.