Knowledge Enhanced Contextual Word Representations
TLDR
We propose a general method to embed multiple knowledge bases (KBs) into large scale models, and thereby enhance their representations with structured, human-curated knowledge.
  • 161
  • 33
AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts
TLDR
Using AutoPrompt, we show that masked language models (MLMs) have an inherent capability to perform sentiment analysis and natural language inference without additional parameters or finetuning.
  • 23
  • 6
Barack's Wife Hillary: Using Knowledge-Graphs for Fact-Aware Language Modeling
TLDR
We introduce the knowledge graph language model (KGLM), a neural language model with mechanisms for selecting and copying facts from a knowledge graph that are relevant to the context.
  • 70
  • 3
Detecting COVID-19 Misinformation on Social Media
The ongoing pandemic has heightened the need for developing tools to flag COVID-19-related misinformation on the internet, specifically on social media such as Twitter. However, due to novel language...
  • 9
Detecting conversation topics in primary care office visits from transcripts of patient-provider interactions
TLDR
We investigate the effectiveness of machine learning methods for automated annotation of medical topics in patient-provider dialog transcripts.
  • 10
Multimodal Attribute Extraction
TLDR
In this paper, we provide a dataset containing mixed-media data for over 2 million product items, along with 7 million attribute-value pairs describing the items, which can be used to train attribute extractors in a weakly supervised manner.
  • 7
PoMo: Generating Entity-Specific Post-Modifiers in Context
TLDR
We introduce PoMo, a post-modifier dataset created automatically from news articles reflecting a journalistic need for incorporating entity information that is relevant to a particular news event.
  • 6
Easy, Reproducible and Quality-Controlled Data Collection with Crowdaq
TLDR
We introduce CROWDAQ, an open-source platform that standardizes the data collection pipeline with customizable user-interface components, automated annotator qualification, and saved pipelines in a re-usable format.
  • 4
Active Bayesian Assessment for Black-Box Classifiers
TLDR
We develop inference strategies to quantify uncertainty for common assessment metrics (accuracy, misclassification cost, expected calibration error), using this uncertainty to guide efficient selection of instances for labeling.
  • 1
Bayesian Evaluation of Black-Box Classifiers
There is an increasing need for accurate quantitative assessment of the performance of prediction models (such as deep neural networks), out-of-sample, e.g., in new environments after they have been...