Progressive Transformer-Based Generation of Radiology Reports

Farhad Nooralahzadeh, Nicolas Andres Perez Gonzalez, Thomas Frauenfelder, Koji Fujimoto, M. Krauthammer
Inspired by Curriculum Learning, we propose a consecutive (i.e., image-to-text-to-text) generation framework where we divide the problem of radiology report generation into two steps. Contrary to generating the full radiology report from the image at once, the model generates global concepts from the image in the first step and then reforms them into finer and coherent texts using a transformer architecture. We follow the transformer-based sequence-to-sequence paradigm at each step. We improve…
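The two-step pipeline described in the abstract can be sketched structurally as follows. This is a minimal illustration only: the function names, placeholder outputs, and concept vocabulary are hypothetical stand-ins, not the authors' actual models.

```python
# Structural sketch of the "image-to-text-to-text" framework: step 1 maps an
# image to high-level concept tokens; step 2 rewrites those concepts into a
# fluent report. Both steps would be transformer sequence-to-sequence models
# in the paper; here they are stubbed out as plain functions.

def extract_global_concepts(image_features):
    """Step 1 (image-to-text): a trained image encoder plus transformer
    decoder would emit high-level concept tokens. Stubbed output below."""
    return ["cardiomegaly", "no pleural effusion"]

def refine_to_report(concepts):
    """Step 2 (text-to-text): a trained transformer rewrites the concept
    sequence into a coherent report. Stubbed output below."""
    return ("The heart is enlarged, consistent with cardiomegaly. "
            "No pleural effusion is seen.")

def generate_report(image_features):
    # Consecutive generation: concepts first, then the refined report.
    concepts = extract_global_concepts(image_features)
    return refine_to_report(concepts)
```

The point of the decomposition is that each sub-task (concept extraction, surface realization) is easier to learn than direct image-to-report generation.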


Retrieval-Based Chest X-Ray Report Generation Using a Pre-trained Contrastive Language-Image Model

With compression, the model maintains similar performance while producing reports 70% faster than the best generative model; the approach can be broadly useful in improving the diagnostic performance and generalizability of report generation models and enabling their use in clinical workflows.

Reinforced Cross-modal Alignment for Radiology Report Generation

This paper proposes an approach with reinforcement learning over a cross-modal memory (CMM) to better align visual and textual features for radiology report generation and conducts human evaluation and case study which confirm the validity of the reinforced algorithm in this approach.

Improving Chest X-Ray Report Generation by Leveraging Warm-Starting

The experimental investigation demonstrates that the Convolutional vision Transformer (CvT) ImageNet-21K and the Distilled Generative Pre-trained Transformer 2 (DistilGPT2) checkpoints are best for warm-starting the encoder and decoder, respectively.

Knowledge Matters: Radiology Report Generation with General and Specific Knowledge

Experimental results on two publicly available datasets, IU-Xray and MIMIC-CXR, show that the proposed knowledge-enhanced approach outperforms state-of-the-art image-captioning-based methods.

Methods for automatic generation of radiological reports of chest radiographs: a comprehensive survey

A comprehensive survey of all such methods specifically developed for chest radiographs, classified and discussed in detail, consolidates information about standard chest X-ray datasets, state-of-the-art report generation methods, evaluation metrics, and their results.

Transformers in Medical Imaging: A Survey

This survey reviews the use of Transformers in medical image segmentation, detection, classification, reconstruction, synthesis, registration, clinical report generation, and other tasks, and develops a taxonomy for each application.

CheXPrune: sparse chest X-ray report generation model using multi-attention and one-shot global pruning

  • Navdeep Kaur, Ajay Mittal
  • Computer Science, Materials Science
    Journal of Ambient Intelligence and Humanized Computing
  • 2022
CheXPrune is a multi-attention-based sparse radiology report generation method that uses an encoder-decoder architecture equipped with visual and semantic attention mechanisms. Empirical results on the OpenI dataset confirm the accuracy of the sparse model.

Self adaptive global-local feature enhancement for radiology report generation

A novel framework, AGFNet, is proposed to dynamically fuse global and anatomy-region features to generate multi-grained radiology reports. Experiments demonstrate that the model achieves state-of-the-art performance on two benchmark datasets, IU X-Ray and MIMIC-CXR.

Learning to Generate Clinically Coherent Chest X-Ray Reports

This work develops a radiology report generation model based on the transformer architecture that produces superior reports, as measured by both standard language generation and clinical coherence metrics, compared to competitive baselines. It also develops a method to differentiably extract clinical information from generated reports.

On the Automatic Generation of Medical Imaging Reports

This work builds a multi-task learning framework which jointly performs the prediction of tags and the generation of paragraphs, proposes a co-attention mechanism to localize regions containing abnormalities and generate narrations for them, and develops a hierarchical LSTM model to generate long paragraphs.

When Radiology Report Generation Meets Knowledge Graph

Experimental results on a publicly accessible dataset of chest radiographs (IU-RR) demonstrate the superior performance of the methods integrated with the proposed graph embedding module over previous approaches, using both the conventional evaluation metrics commonly adopted for image captioning and the newly proposed ones.

Clinically Accurate Chest X-Ray Report Generation

A domain-aware automatic chest X-ray radiology report generation system which first predicts what topics will be discussed in the report, then conditionally generates sentences corresponding to these topics, and is fine-tuned using reinforcement learning.

Learning to Read Chest X-Rays: Recurrent Neural Cascade Model for Automated Image Annotation

A deep learning model is presented to efficiently detect a disease from an image and annotate its context (e.g., location, severity, and the affected organs), along with a novel approach that uses the weights of an already-trained CNN/RNN pair on the domain-specific image/text dataset to infer joint image/text contexts for composite image labeling.

Meshed-Memory Transformer for Image Captioning

The architecture improves both the image encoding and the language generation steps: it learns a multi-level representation of the relationships between image regions integrating learned a priori knowledge, and uses a mesh-like connectivity at decoding stage to exploit low- and high-level features.

Show, Describe and Conclude: On Exploiting the Structure Information of Chest X-ray Reports

This work proposes a novel framework which exploits the structure information between and within report sections for generating CXR imaging reports and designs a novel co-operative multi-agent system that implicitly captures the imbalanced distribution between abnormality and normality.

Progressive Generation of Long Text

This work proposes a simple but effective method of generating text in a progressive manner, inspired by generating images from low to high resolution, and significantly improves upon the fine-tuned GPT-2 in terms of domain-specific quality and sample efficiency.

CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison

A labeler is designed to automatically detect the presence of 14 observations in radiology reports, capturing uncertainties inherent in radiograph interpretation, in CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients.

Show and tell: A neural image caption generator

This paper presents a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image.