• Publications
Unified rational protein engineering with sequence-based deep representation learning
TLDR
We apply deep learning to unlabeled amino-acid sequences to distill the fundamental features of a protein into a statistical representation that is semantically rich and structurally, evolutionarily and biophysically grounded.
  • 95
  • 14
EpiFactors: a comprehensive database of human epigenetic factors and complexes
TLDR
Epigenetics refers to stable and long-term alterations of cellular traits that are not caused by changes in the DNA sequence per se.
  • 112
  • 8
Unified rational protein engineering with sequence-only deep representation learning
TLDR
We apply deep learning to unlabelled amino acid sequences to distill the fundamental features of a protein into a statistical representation that is semantically rich and structurally, evolutionarily, and biophysically grounded.
  • 34
  • 5
PERFECTOS-APE - Predicting Regulatory Functional Effect of SNPs by Approximate P-value Estimation
TLDR
We present novel software, PERFECTOS-APE, to predict how different alleles of SNVs or SNPs may alter the affinity of transcription factor binding sites modelled by basic and advanced approaches.
  • 23
  • 1
Low-N protein engineering with data-efficient deep learning
TLDR
We introduce a machine learning-guided paradigm that can use as few as 24 functionally assayed mutant sequences to build an accurate virtual fitness landscape and screen ten million sequences via in silico directed evolution.
  • 26
Negative selection maintains transcription factor binding motifs in human cancer
Background: Somatic mutations in cancer cells affect various genomic elements, disrupting important cell functions. In particular, mutations in DNA binding sites recognized by transcription factors can…
  • 12
Low-N protein engineering with data-efficient deep learning.
Protein engineering has enormous academic and industrial potential. However, it is limited by the lack of experimental assays that are consistent with the design goal and sufficiently high throughput…