Self adaptive global-local feature enhancement for radiology report generation

Yuhao Wang, Kai Wang, Xiaohong Liu, Tianrun Gao, Jingyue Zhang, Guangyu Wang
Automated radiology report generation aims at automatically generating a detailed description of medical images, which can greatly alleviate the workload of radiologists and provide better medical services to remote areas. Most existing works pay attention to the holistic impression of medical images, failing to utilize important anatomical information. However, in actual clinical practice, radiologists usually locate important anatomical structures, and then look for signs of abnormalities in…


Knowledge Matters: Radiology Report Generation with General and Specific Knowledge

Experimental results on two publicly available datasets, IU-Xray and MIMIC-CXR, show that the proposed knowledge-enhanced approach outperforms state-of-the-art image-captioning-based methods.

Cross-modal Memory Networks for Radiology Report Generation

Cross-modal memory networks (CMN) are proposed to enhance the encoder-decoder framework for radiology report generation, where a shared memory is designed to record the alignment between images and texts so as to facilitate interaction and generation across modalities.
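The shared-memory idea described above can be sketched in simplified NumPy form: a memory matrix that both visual and textual features query via scaled dot-product attention. This is an illustrative toy, not the CMN paper's actual implementation; all names and sizes here are assumptions, and in a real model the memory and projections would be learned.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class SharedMemory:
    """Toy shared memory queried by both modalities (illustrative sketch only)."""
    def __init__(self, num_slots, dim, seed=0):
        rng = np.random.default_rng(seed)
        # In a trained model this matrix would be a learned parameter.
        self.memory = rng.standard_normal((num_slots, dim))

    def read(self, queries):
        # queries: (n, dim); attend over memory slots, return blended slot vectors
        attn = softmax(queries @ self.memory.T / np.sqrt(self.memory.shape[1]))
        return attn @ self.memory

mem = SharedMemory(num_slots=8, dim=16)
visual = np.random.default_rng(1).standard_normal((4, 16))   # image patch features
textual = np.random.default_rng(2).standard_normal((5, 16))  # report token features
v_read = mem.read(visual)   # both modalities read from the SAME memory,
t_read = mem.read(textual)  # which is what couples their representations
```

Because both reads attend over the same slots, gradients from image and text losses would update a common set of memory vectors, encouraging cross-modal alignment.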

Reinforced Cross-modal Alignment for Radiology Report Generation

This paper proposes an approach with reinforcement learning over a cross-modal memory (CMM) to better align visual and textual features for radiology report generation, and conducts a human evaluation and case study that confirm the validity of the reinforced algorithm.

AnaXNet: Anatomy Aware Multi-label Finding Classification in Chest X-ray

A novel multi-label chest X-ray classification model is proposed that accurately classifies image findings and localizes them to their correct anatomical regions, providing precise location information.

Exploring and Distilling Posterior and Prior Knowledge for Radiology Report Generation

A Posterior-and-Prior Knowledge Exploring-and-Distilling approach (PPKED) to imitate the working patterns of radiologists, who first examine the abnormal regions and assign disease topic tags to the abnormal areas, and then rely on years of accumulated medical knowledge and working experience to write reports.

AlignTransformer: Hierarchical Alignment of Visual Regions and Disease Tags for Medical Report Generation

An AlignTransformer framework, comprising the Align Hierarchical Attention (AHA) and Multi-Grained Transformer (MGT) modules, achieves results competitive with state-of-the-art methods on the public IU-Xray and MIMIC-CXR datasets.

On the Automatic Generation of Medical Imaging Reports

This work builds a multi-task learning framework which jointly performs the prediction of tags and the generation of paragraphs, proposes a co-attention mechanism to localize regions containing abnormalities and generate narrations for them, and develops a hierarchical LSTM model to generate long paragraphs.

Generating Radiology Reports via Memory-driven Transformer

This paper proposes to generate radiology reports with a memory-driven Transformer, where a relational memory is designed to record key information of the generation process and a memory-driven conditional layer normalization is applied to incorporate the memory into the Transformer decoder.
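Memory-conditioned layer normalization can be sketched as an ordinary LayerNorm whose scale and shift are predicted from a memory vector rather than being fixed parameters. The NumPy sketch below is a simplified illustration of that idea under assumed shapes, not the paper's exact formulation; `W_gamma` and `W_beta` stand in for what would be learned projection matrices.

```python
import numpy as np

def conditional_layer_norm(x, memory, W_gamma, W_beta, eps=1e-5):
    """LayerNorm whose scale/shift are predicted from a memory vector.

    A simplified sketch of memory-conditioned LN (hypothetical shapes):
      x:       (batch, dim) decoder hidden states
      memory:  (mem_dim,)   summary of the relational memory
      W_gamma, W_beta: (mem_dim, dim) assumed learned projections
    """
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mu) / np.sqrt(var + eps)
    gamma = 1.0 + memory @ W_gamma  # memory modulates the scale around 1
    beta = memory @ W_beta          # and predicts the shift
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8))          # decoder hidden states
m = rng.standard_normal(4)               # relational-memory summary vector
W_g = 0.01 * rng.standard_normal((4, 8))
W_b = 0.01 * rng.standard_normal((4, 8))
y = conditional_layer_norm(x, m, W_g, W_b)
```

With zero projection weights this reduces to plain LayerNorm, so the memory acts as a learned perturbation of the normalization statistics rather than replacing them.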

Memory-aligned Knowledge Graph for Clinically Accurate Radiology Image Report Generation

S. Yan, 2022
A Memory-aligned Knowledge Graph (MaKG) of clinical abnormalities is introduced to better learn the visual patterns of abnormalities and their relationships by integrating it into a deep model architecture for report generation.

Chest ImaGenome Dataset for Clinical Reasoning

Inspired by the Visual Genome effort in the computer vision community, the first Chest ImaGenome dataset is constructed with a scene graph data structure to describe 242,072 images, and local annotations are automatically produced using a joint rule-based natural language processing and atlas-based bounding box detection pipeline.