Multimodal Recurrent Model with Attention for Automated Radiology Report Generation

@inproceedings{Xue2018MultimodalRM,
  title={Multimodal Recurrent Model with Attention for Automated Radiology Report Generation},
  author={Yuan Xue and Tao Xu and L. Rodney Long and Zhiyun Xue and Sameer Kiran Antani and George R. Thoma and Xiaolei Huang},
  booktitle={MICCAI},
  year={2018}
}
Radiologists routinely examine medical images such as X-ray, CT, or MRI scans and write reports summarizing their descriptive findings and conclusive impressions. The proposed model combines Convolutional Neural Networks (CNNs) with Long Short-Term Memory (LSTM) networks in a recurrent fashion. It is capable not only of generating high-level conclusive impressions, but also of generating detailed descriptive findings sentence by sentence to support the conclusion.
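The attention step at the heart of such a CNN+LSTM pipeline can be sketched as follows. This is a minimal illustration using additive (Bahdanau-style) attention over CNN region features, with made-up weight matrices and dimensions; it is not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def attend(regions, hidden, W_r, W_h, v):
    """Additive attention over CNN region features.

    regions: (n_regions, d_r) image region features from the CNN
    hidden:  (d_h,) current LSTM decoder hidden state
    Returns the attended context vector and the attention weights.
    """
    # score each region against the decoder state, then softmax
    scores = np.tanh(regions @ W_r + hidden @ W_h) @ v   # (n_regions,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                             # softmax over regions
    context = weights @ regions                          # (d_r,) weighted sum
    return context, weights

# Illustrative dimensions: 49 spatial regions (a 7x7 feature map), 512-d features
n_regions, d_r, d_h, d_a = 49, 512, 256, 128
regions = rng.standard_normal((n_regions, d_r))
hidden = rng.standard_normal(d_h)
W_r = rng.standard_normal((d_r, d_a)) * 0.01
W_h = rng.standard_normal((d_h, d_a)) * 0.01
v = rng.standard_normal(d_a)

context, weights = attend(regions, hidden, W_r, W_h, v)
print(context.shape, round(float(weights.sum()), 6))  # (512,) 1.0
```

At each decoding step the context vector is fed to the LSTM alongside the previous word, so the sentence being generated can attend to different image regions.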

Automatic Radiology Report Generation based on Multi-view Image Fusion and Medical Concept Enrichment

TLDR
A generative encoder-decoder model is proposed that extracts medical concepts from the radiology reports in the training data and fine-tunes the encoder to extract the most frequent medical concepts from the X-ray images.

Deep learning in generating radiology reports: A survey

When Radiology Report Generation Meets Knowledge Graph

TLDR
Experimental results demonstrate the superior performance of methods integrated with the proposed graph embedding module on a publicly accessible dataset of chest radiographs (IU-RR) compared with previous approaches, evaluated both with the conventional metrics commonly adopted for image captioning and with the newly proposed ones.

Confidence-Guided Radiology Report Generation

TLDR
The experimental results demonstrate that the proposed method for model uncertainty characterization and estimation provides more reliable confidence scores for radiology report generation, and that the proposed uncertainty-weighted losses achieve more comprehensive model optimization, resulting in state-of-the-art performance on a public radiology report dataset.

Automatic Generation of Structured Radiology Reports for Volumetric Computed Tomography Images Using Question-Specific Deep Feature Extraction and Learning

TLDR
An automatic structured radiology report generation system based on deep learning methods that develops volume-level and question-specific deep features using DNNs, demonstrating the effectiveness of the proposed system on the ImageCLEF2015 liver computed tomography annotation task.

Auxiliary signal-guided knowledge encoder-decoder for medical report generation

TLDR
This work proposes an Auxiliary Signal-Guided Knowledge Encoder-Decoder (ASGK) to mimic radiologists' working patterns, and confirms that auxiliary-signal-driven Transformer-based models can outperform previous approaches on both medical terminology classification and paragraph generation metrics.

MedSkip: Medical Report Generation Using Skip Connections and Integrated Attention

TLDR
A novel architecture based on a modified HRNet with added skip connections and convolutional block attention modules (CBAM) is proposed, establishing a new state of the art on PEIR Gross while giving competitive results on IU X-Ray.

Trust It or Not: Confidence-Guided Automatic Radiology Report Generation

TLDR
Experimental results demonstrate that the proposed method for model uncertainty characterization and estimation produces more reliable confidence scores for radiology report generation, and that the modified loss function, which takes the uncertainties into account, leads to better model performance on two public radiology report datasets.

Prior Knowledge Enhances Radiology Report Generation

TLDR
This work proposes to mine and represent the associations among medical findings in an informative knowledge graph and incorporate this prior knowledge with radiology report generation to help improve the quality of generated reports.
...

References

Showing 1-10 of 23 references

On the Automatic Generation of Medical Imaging Reports

TLDR
This work builds a multi-task learning framework which jointly performs the prediction of tags and the generation of paragraphs, proposes a co-attention mechanism to localize regions containing abnormalities and generate narrations for them, and develops a hierarchical LSTM model to generate long paragraphs.

Learning to Read Chest X-Rays: Recurrent Neural Cascade Model for Automated Image Annotation

TLDR
A deep learning model is presented that efficiently detects a disease from an image and annotates its contexts (e.g., location, severity, and the affected organs), together with a novel approach that uses the weights of the already trained CNN/RNN pair on the domain-specific image/text dataset to infer joint image/text contexts for composite image labeling.

MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network

TLDR
This paper proposes MDNet to establish a direct multimodal mapping between medical images and diagnostic reports that can read images, generate diagnostic reports, retrieve images by symptom descriptions, and visualize attention, to provide justifications of the network diagnosis process.

Recurrent Topic-Transition GAN for Visual Paragraph Generation

TLDR
A semi-supervised paragraph generative framework that is able to synthesize diverse and semantically coherent paragraph descriptions by reasoning over local semantic regions and exploiting linguistic knowledge is investigated.

DenseCap: Fully Convolutional Localization Networks for Dense Captioning

TLDR
A Fully Convolutional Localization Network (FCLN) architecture is proposed that processes an image with a single, efficient forward pass, requires no external region proposals, and can be trained end-to-end with a single round of optimization.

A Hierarchical Approach for Generating Descriptive Image Paragraphs

TLDR
A model that decomposes both images and paragraphs into their constituent parts is developed, detecting semantic regions in images and using a hierarchical recurrent neural network to reason about language.

Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks

TLDR
An approach that exploits hierarchical Recurrent Neural Networks to tackle the video captioning problem, i.e., generating one or multiple sentences to describe a realistic video, significantly outperforms the current state-of-the-art methods.

Preparing a collection of radiology examinations for distribution and retrieval

OBJECTIVE: Clinical documents made available for secondary use play an increasingly important role in discovery of clinical knowledge, development of research methods, and education. An important step …

Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning

TLDR
This paper proposes a novel adaptive attention model with a visual sentinel that sets the new state-of-the-art by a significant margin on image captioning.

Deep Residual Learning for Image Recognition

TLDR
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.