Corpus ID: 4644790

Feedback GAN (FBGAN) for DNA: a Novel Feedback-Loop Architecture for Optimizing Protein Functions

@article{Gupta2018FeedbackG,
  title={Feedback GAN (FBGAN) for DNA: a Novel Feedback-Loop Architecture for Optimizing Protein Functions},
  author={Anvita Gupta and James Zou},
  journal={ArXiv},
  year={2018},
  volume={abs/1804.01694}
}
Generative Adversarial Networks (GANs) represent an attractive and novel approach to generate realistic data, such as genes, proteins, or drugs, in synthetic biology. [...] The proposed architecture also has the advantage that the analyzer need not be differentiable. We apply the feedback-loop mechanism to two examples: 1) generating synthetic genes coding for antimicrobial peptides, and 2) optimizing synthetic genes for the secondary structure of their resulting peptides.
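
The feedback-loop idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: `toy_generator` and `toy_analyzer` are hypothetical stand-ins (random DNA strings and a GC-content score), no GAN is actually trained, and the policy of evicting the oldest sequences from the discriminator's pool is an assumption of this sketch rather than something stated in the abstract. It only shows why the analyzer can be a black box: its scores are used to select sequences, never to compute gradients.

```python
import random
from collections import deque

BASES = "ACGT"


def toy_generator(rng, n, length=12):
    """Hypothetical stand-in for a trained GAN generator: random DNA strings."""
    return ["".join(rng.choice(BASES) for _ in range(length)) for _ in range(n)]


def toy_analyzer(seq):
    """Hypothetical black-box scorer: GC fraction. No gradients are needed."""
    return sum(base in "GC" for base in seq) / len(seq)


def feedback_step(real_pool, generated, analyzer, threshold):
    """Add generated sequences scoring at or above the threshold to the pool
    that the discriminator would treat as real data. Because real_pool is a
    bounded deque, the oldest entries are evicted (an assumed policy)."""
    accepted = [s for s in generated if analyzer(s) >= threshold]
    real_pool.extend(accepted)
    return accepted


rng = random.Random(0)
real_pool = deque(toy_generator(rng, 50), maxlen=50)
for epoch in range(5):
    batch = toy_generator(rng, 20)
    accepted = feedback_step(real_pool, batch, toy_analyzer, threshold=0.6)
    # A real system would run GAN training updates on real_pool here.
```

Since the analyzer only filters sequences, it could in principle be any scoring function, including one that is not differentiable.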

Expanding functional protein sequence space using generative adversarial networks
TLDR
ProteinGAN, a specialised variant of the generative adversarial network that is able to ‘learn’ natural protein sequence diversity and enables the generation of functional protein sequences, demonstrates the potential of artificial intelligence to rapidly generate highly diverse novel functional proteins within the allowed biological constraints of the sequence space.
HelixGAN: A bidirectional Generative Adversarial Network with search in latent space for generation under constraints
TLDR
This work applies Wasserstein bi-directional Generative Adversarial Networks to generate full atom helical structures and introduces a novel Markov Chain Monte Carlo search mechanism with the encoder to allow the design according to structural constraints.
Synthetic Promoter Design in Escherichia coli based on Generative Adversarial Network
TLDR
This work provides insight into automatically navigating novel functional promoter sequence space and into speeding up the evolution of naturally existing promoters, indicating the potential for deep generative models to be applied to genetic element design in the future.
MutaGAN: A Seq2seq GAN Framework to Predict Mutations of Evolving Protein Populations
TLDR
A novel machine learning framework using generative adversarial networks (GANs) with recurrent neural networks (RNNs) to accurately predict genetic mutations and evolution of future biological populations is developed.
Learning to Design RNA
TLDR
Comprehensive empirical results on two widely used RNA Design benchmarks show that the proposed LEARNA approach achieves new state-of-the-art performance on the former while also being orders of magnitude faster in reaching the previous state-of-the-art performance.
Generative models for protein structure: A comparison between Generative Adversarial and Autoregressive networks
TLDR
Both the implemented generative models are able to generate protein sequences that are statistically similar to the ones in the original dataset of their family and AR networks show higher potential to capture the three-dimensional folded structure of the protein family.
Model-based reinforcement learning for biological sequence design
TLDR
A model-based variant of PPO, DyNA-PPO, is proposed to improve sample efficiency and performs significantly better than existing methods in settings in which modeling is feasible, while still not performing worse in situations in which a reliable model cannot be learned.
De Novo Protein Design for Novel Folds using Guided Conditional Wasserstein Generative Adversarial Networks (gcWGAN)
TLDR
GcWGAN explores uncharted sequence space to design proteins by learning from current sequence-structure data and is found to have comparable or better fold accuracy yet much more sequence diversity and novelty than cVAE.
Understanding Transcriptional Regulatory Logic Using Convolutional and Generative Deep Learning Models
TLDR
Deep learning approaches are applied to MPRA datasets to understand the individual contribution to gene expression activity of transcription factor (TF) motifs and the relationships between them, and it is shown that convolutional neural networks (CNNs) are capable of learning both aspects of cis-regulatory logic.

References

SHOWING 1-10 OF 29 REFERENCES
Generating and designing DNA with deep generative models
TLDR
It is shown that these tools capture important structures of the data and, when applied to designing probes for protein binding microarrays, allow us to generate new sequences whose properties are estimated to be superior to those found in the training data.
Generative adversarial networks uncover epidermal regulators and predict single cell perturbations
TLDR
This work applies a new generative deep learning approach called Generative Adversarial Networks (GAN) to biological data and shows that it is possible to integrate diverse skin (epidermal) datasets and in doing so, the model is able to simulate realistic scRNA-seq data that covers the full diversity of cell types.
Molecular de-novo design through deep reinforcement learning
TLDR
A method to tune a sequence-based generative model for molecular de novo design that through augmented episodic likelihood can learn to generate structures with certain specified desirable properties is introduced.
Generative Recurrent Networks for De Novo Drug Design
TLDR
This paper presents a method for molecular de novo design that utilizes generative recurrent neural networks (RNN) containing long short‐term memory (LSTM) cells that captured the syntax of molecular representation in terms of SMILES strings with close to perfect accuracy.
Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks
TLDR
This work shows that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing, and demonstrates that the properties of the generated molecules correlate very well with those of the molecules used to train the model.
GANs for Biological Image Synthesis
TLDR
By interpolating across the latent space, GANs can mimic the known changes in protein localization that occur through time during the cell cycle, allowing us to predict temporal evolution from static images.
Dilated Convolutions for Modeling Long-Distance Genomic Dependencies
TLDR
It is shown that dilated convolutions are effective at modeling the locations of regulatory markers in the human genome, such as transcription factor binding sites, histone modifications, and DNase hypersensitivity sites.
Real-valued (Medical) Time Series Generation with Recurrent Conditional GANs
TLDR
This work proposes a Recurrent GAN (RGAN) and Recurrent Conditional GAN (RCGAN) to produce realistic real-valued multi-dimensional time series, with an emphasis on their application to medical data.
Recurrent Neural Network Model for Constructive Peptide Design
TLDR
The ability of LSTM RNNs to construct new amino acid sequences within the applicability domain of the model is showcased to motivate their prospective application to peptide and protein design without the need for the exhaustive enumeration of sequence libraries.
CytoGAN: Generative Modeling of Cell Images
TLDR
When evaluated for their ability to group cell images responding to treatment by chemicals of known classes, it is found that adversarially learned representations are superior to autoencoder-based approaches.