• Publications
  • Influence
Analysis of protein-coding genetic variation in 60,706 humans
Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of…
  • 5,700
  • 457
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings
The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger is facing us with word embedding, a popular framework to represent text data as vectors…
  • 798
  • 128
Genome-wide Chromatin State Transitions Associated with Developmental and Environmental Cues
Differences in chromatin organization are key to the multiplicity of cell states that arise from a single genetic background, yet the landscapes of in vivo tissues remain largely uncharted. Here, we…
  • 445
  • 25
Interpretation of Neural Networks is Fragile
In order for machine learning to be deployed and trusted in many applications, it is crucial to be able to reliably explain why the machine learning algorithm makes certain predictions. For example,…
  • 172
  • 20
Locus-specific editing of histone modifications at endogenous enhancers using programmable TALE-LSD1 fusions
Mammalian gene regulation is dependent on tissue-specific enhancers that can act across large distances to influence transcriptional activity. Mapping experiments have identified hundreds of…
  • 320
  • 14
Epigenome-wide association studies without the need for cell-type composition
In epigenome-wide association studies, cell-type composition often differs between cases and controls, yielding associations that simply tag cell type rather than reveal fundamental biology. Current…
  • 191
  • 14
Priors for Diversity in Generative Latent Variable Models
Probabilistic latent variable models are one of the cornerstones of machine learning. They offer a convenient and coherent way to specify prior distributions over unobserved structure in data, so…
  • 88
  • 9
Data Shapley: Equitable Valuation of Data for Machine Learning
As data becomes the fuel driving technological and economic growth, a fundamental challenge is how to quantify the value of data in algorithmic predictions and decisions. For example, in healthcare…
  • 58
  • 9
Contrastive Learning Using Spectral Methods
In many natural settings, the analysis goal is not to characterize a single data set in isolation, but rather to understand the difference between one set of observations and another. For example,…
  • 58
  • 4
Strategic Voting Behavior in Doodle Polls
Finding a common time slot for a group event is a daily conundrum and illustrates key features of group decision-making. It is a complex interplay of individual incentives and group dynamics. A…
  • 29
  • 4