Self-Supervised Representation Learning for Ultrasound Video

  • Jianbo Jiao, Richard Droste, Lior Drukker, Aris T. Papageorghiou, Julia Alison Noble
  • Published 28 February 2020
  • Medicine, Computer Science, Engineering
  • 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI)
Recent advances in deep learning have achieved promising performance for medical image analysis, while in most cases ground-truth annotations from human experts are necessary to train the deep model. In practice, such annotations are expensive to collect and can be scarce for medical imaging applications. Therefore, there is significant interest in learning representations from unlabelled raw data. In this paper, we propose a self-supervised learning approach to learn meaningful and… 
Self-supervised Contrastive Video-Speech Representation Learning for Ultrasound
This paper designs a framework to model the correspondence between video and audio without any human annotations, and introduces cross-modal contrastive learning and an affinity-aware self-paced learning scheme to enhance correlation modelling.
Effective Sample Pair Generation for Ultrasound Video Contrastive Representation Learning
A US semi-supervised contrastive learning (USCL) method that effectively learns feature representations of US images, with a new sample pair generation (SPG) scheme to tackle the high similarity among US images extracted from videos.
Self-supervised learning methods and applications in medical imaging analysis: A survey
The article covers a set of the most recent self-supervised learning methods from the computer vision field as they apply to medical imaging analysis, and categorizes them as predictive, generative and contrastive approaches.
Self-Supervised Representation Learning for Detection of ACL Tear Injury in Knee MRI
Experiments on the pretext task show that the proposed approach enables the model to learn spatial-context-invariant features, which helps achieve reliable and explainable performance in downstream tasks such as classification of ACL tear injury from knee MRI.
ContIG: Self-supervised Multimodal Contrastive Learning for Medical Imaging with Genetics
This work proposes ContIG, a self-supervised method that can learn from large datasets of unlabeled medical images and genetic data, and designs its method to integrate multiple modalities of each individual person in the same model end-to-end, even when the available modalities vary across individuals.
Transforming obstetric ultrasound into data science using eye tracking, voice recording, transducer motion and ultrasound video
The PULSE (Perception Ultrasound by Learning Sonographer Experience) project is an interdisciplinary multi-modal imaging study aiming to offer clinical sonography insights and transform the process of obstetric ultrasound acquisition and image analysis by applying deep learning to large-scale multi-modal clinical data.
Principled Ultrasound Data Augmentation for Classification of Standard Planes
It is shown that principled data augmentation for medical image model training can lead to significant improvements in ultrasound standard plane detection, with an average F1-score improvement of 7.0% overall over naive data augmentation strategies in ultrasound fetal standard plane classification.


Ultrasound Image Representation Learning by Modeling Sonographer Visual Attention
It is demonstrated that transferable representations of images can be learned without manual annotations by modeling human visual attention: a convolutional neural network is trained to predict gaze on ultrasound video frames through visual saliency prediction or gaze-point regression.
SonoNet: Real-Time Detection and Localisation of Fetal Standard Scan Planes in Freehand Ultrasound
A novel method based on convolutional neural networks is proposed, which can automatically detect 13 fetal standard views in freehand 2-D ultrasound data as well as provide a localization of the fetal structures via a bounding box while providing optimal output for the localization task.
Multi-task SonoEyeNet: Detection of Fetal Standardized Planes Assisted by Generated Sonographer Attention Maps
We present a novel multi-task convolutional neural network called Multi-task SonoEyeNet (M-SEN) that learns to generate clinically relevant visual attention maps using sonographer gaze tracking data
Squeeze-and-Excitation Networks
This work proposes a novel architectural unit, termed the “Squeeze-and-Excitation” (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels, and shows that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets.
End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography
A convolutional neural network performs automated prediction of malignancy risk of pulmonary nodules in chest CT scan volumes and improves accuracy of lung cancer screening.
Colorful Image Colorization
This paper proposes a fully automatic approach to colorization that produces vibrant and realistic colorizations and shows that colorization can be a powerful pretext task for self-supervised feature learning, acting as a cross-channel encoder.
Unsupervised Learning of Visual Representations using Videos
This is a review of unsupervised learning applied to videos with the aim of learning visual representations. We look at different realizations of the notion of temporal coherence across various
Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network
It is demonstrated that an end-to-end deep learning approach can classify a broad range of distinct arrhythmias from single-lead ECGs with high diagnostic performance similar to that of cardiologists.
What Do Different Evaluation Metrics Tell Us About Saliency Models?
This paper provides an analysis of 8 different evaluation metrics and their properties, and makes recommendations for metric selections under specific assumptions and for specific applications.