A Survey of Multi‐task Learning Methods in Chemoinformatics

@article{Sosnin2019ASO,
  title={A Survey of Multi‐task Learning Methods in Chemoinformatics},
  author={Sergey B. Sosnin and M. V. Vashurina and Michael Withnall and Pavel Karpov and Maxim V. Fedorov and Igor V. Tetko},
  journal={Molecular Informatics},
  year={2019},
  volume={38}
}
Despite the increasing volume of available data, the proportion of experimentally measured data remains small compared to the virtual chemical space of possible chemical structures. Therefore, there is a strong interest in simultaneously predicting different ADMET and biological properties of molecules, which are frequently strongly correlated with one another. Such joint data analyses can increase the accuracy of models by exploiting their common representation and identifying common features… 
Multitask Learning On Graph Neural Networks Applied To Molecular Property Predictions
TLDR
This work presents a new state-of-the-art multitask prediction method based on existing graph neural network models and clearly demonstrates that multitask learning can improve model performance.
Comparative Study of Multitask Toxicity Modeling on a Broad Chemical Space
TLDR
This research reveals that multitask learning can be very useful to improve the quality of acute toxicity modeling and raises a discussion about the usage of multitask approaches for regulation purposes.
Representation of molecules for drug response prediction
TLDR
This review is dedicated to the application of machine learning in drug response prediction and focuses on molecular representations, which are a crucial element in the success of drug response prediction and other chemistry-related prediction tasks.
Multitask CapsNet: An Imbalanced Data Deep Learning Method for Predicting Toxicants
TLDR
It is found that multitask CapsNet excelled in toxicity prediction and outperformed many other computational approaches using the multitask strategy and would provide a novel, accurate, and efficient approach for predicting the toxicities of compounds.
Building Attention and Edge Convolution Neural Networks for Bioactivity and Physical-Chemical Property Prediction
TLDR
Attention and Edge Memory schemes are introduced to the existing Message Passing Neural Network framework for graph convolution, and the need to introduce a priori knowledge of the task and chemical descriptor calculation is removed by using only fundamental graph-derived properties.
Multitask Modeling with Confidence Using Matrix Factorization and Conformal Prediction
TLDR
Class-conditional (Mondrian) conformal predictors built on underlying Macau models are presented as a novel approach for large-scale bioactivity prediction that significantly improves performance on imbalanced data sets.
Improving Compound Activity Classification via Deep Transfer and Representation Learning
TLDR
TAc learns to generate effective molecular features that generalize well from one domain to another and increase classification performance in the target domain. TAc-fc is also found to be a strong model, with competitive or even better performance on a notable number of target tasks.
Transferable Multi-level Attention Neural Network for Accurate Prediction of Quantum Chemistry Properties via Multi-task Learning
The development of efficient models for predicting specific properties through machine learning is of great importance for the innovation of chemistry and material science. However, predicting global
graphDelta: MPNN Scoring Function for the Affinity Prediction of Protein–Ligand Complexes
TLDR
Graph-convolutional neural networks are presented for the prediction of binding constants of protein–ligand complexes using multi-task learning, and they achieve better correlation coefficient and RMSE values than the recently developed 3D convolutional network model Kdeep.
Building attention and edge message passing neural networks for bioactivity and physical–chemical property prediction
TLDR
This work removes the need to introduce a priori knowledge of the task and chemical descriptor calculation by using only fundamental graph-derived properties, and sets a new standard on sparse multi-task virtual screening targets.

References

SHOWING 1-10 OF 35 REFERENCES
Transfer and Multi-task Learning in QSAR Modeling: Advances and Challenges
TLDR
This review presents the main features of transfer and multi-task learning studies, as well as some applications and their potential in drug design projects.
Deep Architectures and Deep Learning in Chemoinformatics: The Prediction of Aqueous Solubility for Drug-Like Molecules
TLDR
A brief overview of deep learning methods is presented and in particular how recursive neural network approaches can be applied to the problem of predicting molecular properties, by considering an ensemble of recursive neural networks associated with all possible vertex-centered acyclic orientations of the molecular graph.
Inferring multi-target QSAR models with taxonomy-based multi-task learning
TLDR
Two different multi-task algorithms from the field of transfer learning that can exploit the similarity between several targets to transfer knowledge between the target specific QSAR models for lead optimization are presented.
Application of Bioactivity Profile-Based Fingerprints for Building Machine Learning Models
TLDR
Comparisons of in-house HTSFPs combined with multitask deep learning against the single-task support vector machine method show that the two fingerprints yielded similar performance and diverse hits with very little overlap, thus demonstrating the orthogonality of bioactivity profile-based descriptors with structural descriptors.
Effect of missing data on multitask prediction methods
TLDR
This work provides a first approximation to assess how much data is required to produce good performance in multitask prediction exercises, using two complete data sets to simulate sparseness by removing data from the training set.
DeepTox: Toxicity Prediction using Deep Learning
TLDR
DeepTox had the highest performance of all computational methods, winning the grand challenge, the nuclear receptor panel, the stress response panel, and six single assays (team "Bioinf@JKU").
BIGCHEM: Challenges and Opportunities for Big Data Analysis in Chemistry
TLDR
It is shown that the efficient exploration of billions of molecules requires the development of smart strategies, and the importance of education in “Big Data” for further progress of this area is highlighted.
Stargate GTM: Bridging Descriptor and Activity Spaces
TLDR
The "Stargate" version of the Generative Topographic Mapping approach, in which two different multidimensional spaces are linked through a common 2D latent space, outperforms conventional GTM for individual activities and performs similarly to the Lasso multitask learning algorithm, although it is still slightly less accurate than the Random Forest method.
A renaissance of neural networks in drug discovery
TLDR
This review discusses traditional and newly emerging neural network approaches to drug discovery, focusing on backpropagation neural networks and their variants, self-organizing maps and associated methods, and a relatively new technique, deep learning.
Deep Learning Based Regression and Multiclass Models for Acute Oral Toxicity Prediction with Automatic Chemical Feature Extraction
TLDR
An improved molecular graph encoding convolutional neural networks (MGE-CNN) architecture is developed to construct three types of high-quality AOT models: regression model, multiclassification model, and multitask model (deepAOT-CR), which highly outperformed previously reported models.