Unifying Relational Sentence Generation and Retrieval for Medical Image Report Composition
@article{Wang2022UnifyingRS, title={Unifying Relational Sentence Generation and Retrieval for Medical Image Report Composition}, author={Fuyu Wang and Xiaodan Liang and Lin Xu and Liang Lin}, journal={IEEE Transactions on Cybernetics}, year={2022}, volume={52}, pages={5015-5025} }
Beyond generating long and topic-coherent paragraphs in traditional captioning tasks, the medical image report composition task poses more task-oriented challenges by requiring both the highly accurate medical term diagnosis and multiple heterogeneous forms of information, including impression and findings. Current methods often generate the most common sentences due to dataset bias for the individual case, regardless of whether the sentences properly capture key entities and relationships…
3 Citations
Attention-based CNN-GRU Model For Automatic Medical Images Captioning: ImageCLEF 2021
- Computer Science, CLEF
- 2021
This work addresses the challenge of medical image captioning by combining a CNN encoder model with an attention-based GRU language generator model, while a multi-label CNN classifier is used for the concept detection task.
Contrastive Attention for Automatic Chest X-ray Report Generation
- Computer Science, FINDINGS
- 2021
The Contrastive Attention (CA) model is proposed, which can help existing models better attend to the abnormal regions and provide more accurate descriptions, which are crucial for an interpretable diagnosis.
Egocentric Image Captioning for Privacy-Preserved Passive Dietary Intake Monitoring
- Computer Science, ArXiv
- 2021
This paper proposes a privacy-preserved secure solution for dietary assessment with passive monitoring, which unifies food recognition, volume estimation, and scene understanding; a novel transformer-based architecture is designed to caption egocentric dietary images.
References
Knowledge-driven Encode, Retrieve, Paraphrase for Medical Image Report Generation
- Computer Science, AAAI
- 2019
Experiments show that the proposed KERP approach generates structured and robust reports supported with accurate abnormality description and explainable attentive regions, achieving the state-of-the-art results on two medical report benchmarks, with the best medical abnormality and disease classification accuracy and improved human evaluation performance.
Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation
- Computer Science, NeurIPS
- 2018
A novel Hybrid Retrieval-Generation Reinforced Agent (HRGR-Agent) is proposed, which reconciles traditional retrieval-based approaches populated with human prior knowledge with modern learning-based approaches to achieve structured, robust, and diverse report generation.
On the Automatic Generation of Medical Imaging Reports
- Medicine, ACL
- 2018
This work builds a multi-task learning framework which jointly performs the prediction of tags and the generation of paragraphs, proposes a co-attention mechanism to localize regions containing abnormalities and generate narrations for them, and develops a hierarchical LSTM model to generate long paragraphs.
Aligning where to see and what to tell: image caption with region-based attention and scene factorization
- Computer Science, ArXiv
- 2015
This paper proposes an image caption system that exploits the parallel structures between images and sentences and makes another novel modeling contribution by introducing scene-specific contexts that capture higher-level semantic information encoded in an image.
Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting
- Computer Science, ACL
- 2018
An accurate and fast summarization model that first selects salient sentences and then rewrites them abstractively to generate a concise overall summary is proposed, which achieves the new state-of-the-art on all metrics on the CNN/Daily Mail dataset, as well as significantly higher abstractiveness scores.
A Hierarchical Approach for Generating Descriptive Image Paragraphs
- Computer Science, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017
A model that decomposes both images and paragraphs into their constituent parts is developed, detecting semantic regions in images and using a hierarchical recurrent neural network to reason about language.
CIDEr: Consensus-based image description evaluation
- Computer Science, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2015
A novel paradigm for evaluating image descriptions that uses human consensus is proposed and a new automated metric that captures human judgment of consensus better than existing metrics across sentences generated by various sources is evaluated.
Video Captioning With Attention-Based LSTM and Semantic Consistency
- Computer Science, IEEE Transactions on Multimedia
- 2017
A novel end-to-end framework named aLSTMs, an attention-based LSTM model with semantic consistency, is proposed to translate videos into natural sentences, with competitive or even better results than the state-of-the-art baselines for video captioning in both BLEU and METEOR.
Multi-Attention and Incorporating Background Information Model for Chest X-Ray Image Report Generation
- Computer Science, IEEE Access
- 2019
A new hierarchical model with multi-attention that considers background information is proposed; it outperforms all baselines, achieving state-of-the-art performance in terms of accuracy.
Know More Say Less: Image Captioning Based on Scene Graphs
- Computer Science, IEEE Transactions on Multimedia
- 2019
A framework based on scene graphs for image captioning that leverages both visual features and semantic knowledge in structured scene graphs and introduces a hierarchical-attention-based module to learn discriminative features for word generation at each time step.