Contrastive Cross-Modal Pre-Training: A General Strategy for Small Sample Medical Imaging

Gongbo Liang, Connor Greenwell, Yu Zhang, Xiaoqin Wang, Ramakanth Kavuluru, Nathan Jacobs
IEEE Journal of Biomedical and Health Informatics
A key challenge in training neural networks for a given medical imaging task is the difficulty of obtaining a sufficient number of manually labeled examples. In contrast, textual imaging reports are often readily available in medical records and contain rich but unstructured interpretations written by experts as part of standard clinical practice. We propose using these textual reports as a form of weak supervision to improve the image interpretation performance of a neural network without… 
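The abstract describes pairing images with their free-text reports as weak supervision. A common instantiation of such cross-modal pre-training is a symmetric contrastive (InfoNCE) objective that pulls each image embedding toward the embedding of its own report and pushes it away from the other reports in the batch. The sketch below is a minimal, hypothetical illustration of that objective in NumPy; it is not the paper's exact loss, and the function name and temperature value are assumptions for illustration.

```python
import numpy as np

def info_nce_loss(img_emb, txt_emb, temperature=0.1):
    """Symmetric contrastive loss over a batch of N (image, report) pairs.

    img_emb, txt_emb: arrays of shape (N, D); row i of each is a matched pair.
    Matched pairs sit on the diagonal of the similarity matrix.
    """
    # L2-normalize so the dot product is cosine similarity.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (N, N) similarity matrix

    def xent_diag(l):
        # Cross-entropy where the correct "class" for row i is column i.
        l = l - l.max(axis=1, keepdims=True)          # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (xent_diag(logits) + xent_diag(logits.T))
```

In practice the two embeddings would come from an image encoder and a text encoder trained jointly; the loss is low when each image is most similar to its own report and high when the pairing is scrambled.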


Development of CNN models for the enteral feeding tube positioning assessment on a small scale data set

A CNN model for feeding tube positioning assessment is built by pre-training in a weakly supervised fashion on large quantities of radiographs; it achieves higher prediction accuracy and better-calibrated prediction confidence than the non-pre-trained model and other baseline models.

A Mutation-based Text Generation for Adversarial Machine Learning Applications

This work proposes and evaluates several mutation-based text generation approaches and shows examples of mutation operators; it can be extended in many respects, such as proposing new text-based mutation operators suited to the nature of the application.

Neural Network Decision-Making Criteria Consistency Analysis via Inputs Sensitivity

This work evaluates the decision-making criteria of NNs via input sensitivity, using feature-attribution explanation methods in combination with computational and clustering analysis, and finds that decision-making criteria are easily distinguishable between training trials of the same architecture and task.

Cross-modal Contrastive Attention Model for Medical Report Generation

This paper proposes a novel Cross-modal Contrastive Attention (CMCA) model to capture both visual and semantic information from similar cases, with two main modules: a Visual Contrastive Attention Module for refining the unique abnormal regions relative to the retrieved case images, and a Cross-modal Attention Module for matching the positive semantic information from the case reports.

Beware the Black-Box of Medical Image Generation: an Uncertainty Analysis by the Learned Feature Space

  • Yunni Qu, David Yan, G. Liang
  • 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), 2022
It is demonstrated that the learned feature spaces of multiple U-Net architectures for image generation tasks are easily separable between different training trials of the same architecture with the same hyperparameter setting, indicating that the models use different criteria for the same tasks.

Deep Neural … Flickr to Map Phenological Trends

Lung Disease Classification in CXR Images Using Hybrid Inception-ResNet-v2 Model and Edge Computing

A combination of the synthetic minority over-sampling technique (SMOTE) and weighted class balancing is used to alleviate the effects of class imbalance, and a hybrid Inception-ResNet-v2 transfer learning model coupled with data augmentation and image enhancement gives the best accuracy.



Bimodal Network Architectures for Automatic Generation of Image Annotation from Text

This work proposes two separate deep neural network architectures for automatically marking the region of interest (ROI) on an image that best represents a finding location, given a textual report or a set of keywords. For a variety of findings in chest X-ray images, both proposed architectures learn to estimate the ROI, as validated by clinical annotations.

Automatic Radiology Report Generation based on Multi-view Image Fusion and Medical Concept Enrichment

A generative encoder-decoder model is proposed; medical concepts are extracted from the radiology reports in the training data, and the encoder is fine-tuned to extract the most frequent medical concepts from the X-ray images.

CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison

CheXpert is a large dataset containing 224,316 chest radiographs of 65,240 patients; a labeler is designed to automatically detect the presence of 14 observations in the accompanying radiology reports, capturing uncertainties inherent in radiograph interpretation.

MIMIC-CXR: A large publicly available database of labeled chest radiographs

MIMIC-CXR-JPG is derived entirely from the MIMIC-CXR database and aims to provide a convenient processed version of MIMIC-CXR, as well as a standard reference for data splits and image labels.

Weakly supervised one-stage vision and language disease detection using large scale pneumonia and pneumothorax studies

A challenging new set of radiologist-paired bounding box and natural language annotations on the publicly available MIMIC-CXR dataset, especially focused on pneumonia and pneumothorax, is presented.

Preparing Medical Imaging Data for Machine Learning.

Fundamental steps for preparing medical imaging data in AI algorithm development are described, current limitations to data curation are explained, and new approaches to address the problem of data availability are explored.

GANai: Standardizing CT Images using Generative Adversarial Network with Alternative Improvement

A new GAN model called GANai is presented to mitigate the differences in radiomic features across CT images captured using non-standard imaging protocols; it significantly outperforms existing state-of-the-art image synthesis algorithms on CT image standardization.

Dermatologist-level classification of skin cancer with deep neural networks

This work demonstrates an artificial intelligence capable of classifying skin cancer with a level of competence comparable to dermatologists, trained end-to-end from images directly, using only pixels and disease labels as inputs.

Automatic Bounding Box Annotation of Chest X-Ray Data for Localization of Abnormalities

This paper proposes an automatic approach for labeling chest X-ray images with findings and locations by leveraging radiology reports; this "silver" bounding box dataset is used to train an opacity detection model with a RetinaNet architecture, obtaining localization results on par with the state of the art.

Adversarial Representation Learning for Text-to-Image Matching

TIMAM, a Text-Image Modality Adversarial Matching approach, is introduced; it learns modality-invariant feature representations using adversarial and cross-modal matching objectives, and it is demonstrated that BERT, a publicly available language model that extracts word embeddings, can be successfully applied in the text-to-image matching domain.