Supervision and Source Domain Impact on Representation Learning: A Histopathology Case Study

Milad Sikaroudi, Amir Safarpoor, Benyamin Ghojogh, Sobhan Shafiei, Mark Crowley, Hamid R. Tizhoosh
2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)
As many algorithms depend on a suitable representation of data, learning unique features is considered a crucial task. Although supervised techniques using deep neural networks have boosted the performance of representation learning, the need for large sets of labeled data limits the application of such methods. As an example, producing high-quality delineations of regions of interest in pathology is a tedious and time-consuming task due to the large image dimensions. In this work, we…

Spectral, Probabilistic, and Deep Metric Learning: Tutorial and Survey

This is a tutorial and survey paper on metric learning that starts with the definition of a distance metric, the Mahalanobis distance, and the generalized Mahalanobis distance, and introduces multi-modal deep metric learning, geometric metric learning by neural networks, and few-shot metric learning.

Attention-Based Dynamic Subspace Learners for Medical Image Analysis

This work proposes to dynamically exploit multiple learners by removing the need to know the number of learners a priori and by aggregating new subspace learners during training. It also provides an attention map, generated directly during inference, to illustrate the visual interpretability of the embedding features.

Hospital-Agnostic Image Representation Learning in Digital Pathology

A domain generalization technique is leveraged in this study to improve the generalization capability of a Deep Neural Network to an unseen histopathology image set (i.e., from an unseen hospital/trial site) in the presence of domain shift.

Learning to Predict RNA Sequence Expressions from Whole Slide Images with Applications for Search and Classification

The proposed tRNAsformer can serve as a computational pathology tool to facilitate a new generation of search and classification methods by combining the tissue morphology and the molecular fingerprint of the biopsy samples.

Towards better understanding and better generalization of few-shot classification in histology images with contrastive learning

This work facilitates the study of few-shot learning in histology images by setting up three cross-domain tasks that simulate real clinical problems, and shows the superiority of contrastive learning (CL) over supervised learning in terms of generalization for such data.

Classification of Microscopy Images of Breast Tissue: Region Duplication based Self-Supervision vs. Off-the Shelf Deep Representations

A novel self-supervision pretext task is proposed to train a convolutional neural network (CNN) and extract domain-specific features. Results indicated that the best performance, 99% sensitivity, was achieved with deep features extracted using ResNet50 with concatenation of patch-level embeddings.

Batch-Incremental Triplet Sampling for Training Triplet Networks Using Bayesian Updating Theorem

Experimental results on two public datasets, namely MNIST and histopathology colorectal cancer, substantiate the effectiveness of the proposed triplet mining method with Bayesian updating and conjugate priors.

Offline versus Online Triplet Mining based on Extreme Distances of Histopathology Patches

It is found that offline and online mining approaches have comparable performances for a specific architecture, such as ResNet-18 in this study.



Multi-class texture analysis in colorectal cancer histology

A new dataset of 5,000 histological images of human colorectal cancer including eight different types of tissue is presented and an optimal classification strategy is found that markedly outperformed traditional methods, improving the state of the art for tumour-stroma separation and setting a new standard for multiclass tissue separation.

Deep Residual Learning for Image Recognition

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.

Learning with Less Data Via Weakly Labeled Patch Classification in Digital Pathology

It is shown that features learned from such weakly labeled datasets are indeed transferable and allow us to achieve highly competitive patch classification results on the colorectal cancer dataset and the PatchCamelyon (PCam) dataset while using an order of magnitude less labeled data.

Few Shot Learning in Histopathological Images: Reducing the Need of Labeled Data on Biological Datasets

This work validates that few-shot learning techniques can transfer knowledge from a well-defined source domain of colon tissue to a more generic domain composed of colon, lung, and breast tissue using very few training images.

Self-Supervised Similarity Learning for Digital Pathology

This work proposes a self-supervised method for feature extraction by similarity learning on whole slide images (WSI) that is simple to implement and allows the creation of robust and compact image descriptors. It shows that the method yields better retrieval results than existing ImageNet-based and generic self-supervised feature extraction methods.

A Tutorial on Distance Metric Learning: Mathematical Foundations, Algorithms and Software

A Python package is presented that collects a set of 17 distance metric learning techniques explained in this paper, with some experiments to evaluate the performance of the different algorithms.

Meta-Learning: A Survey

This chapter provides an overview of the state of the art in meta-learning, the science of systematically observing how different machine learning approaches perform on a wide range of learning tasks and then learning from this experience, or meta-data, to learn new tasks much faster than otherwise possible.

Tile2Vec: Unsupervised representation learning for spatially distributed data

Geospatial analysis lacks methods like the word vector representations and pre-trained networks that significantly boost performance across a wide range of natural language and computer vision tasks.

Diagnostic Assessment of Deep Learning Algorithms for Detection of Lymph Node Metastases in Women With Breast Cancer

In the setting of a challenge competition, some deep learning algorithms achieved better diagnostic performance than a panel of 11 pathologists participating in a simulation exercise designed to mimic routine pathology workflow; algorithm performance was comparable with an expert pathologist interpreting whole-slide images without time constraints.

UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance.