Ontology-Aware Clinical Abstractive Summarization

Sean MacAvaney, Sajad Sotudeh, Arman Cohan, Nazli Goharian, Ish A. Talati, Ross W. Filice
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

Automatically generating accurate summaries from clinical reports could save clinicians' time, improve summary coverage, and reduce errors. We propose a sequence-to-sequence abstractive summarization model augmented with domain-specific ontological information to enhance content selection and summary generation. We apply our method to a dataset of radiology reports and show that it significantly outperforms the current state of the art on this task in terms of ROUGE scores. Extensive human…
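
Since the abstract reports its gains in ROUGE scores, a minimal sketch of ROUGE-N may help readers who have not met the metric. It counts n-gram overlap between a generated summary and a reference summary. The whitespace tokenization, the lowercasing, and the function names `ngrams` and `rouge_n` are assumptions made for illustration; standard ROUGE toolkits add stemming and other preprocessing, and this is not the evaluation setup used in the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list (empty if the list is shorter than n)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, F1) of n-gram overlap.

    Assumption: text is lowercased and split on whitespace; real ROUGE
    implementations typically apply further normalization.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical radiology-style impression strings, for illustration only.
    generated = "no acute cardiopulmonary abnormality"
    reference = "no acute cardiopulmonary process"
    print(rouge_n(generated, reference, n=1))
    print(rouge_n(generated, reference, n=2))
```

Because the score only rewards surface overlap, two summaries that differ in a single clinically important word can still score highly, which is one reason several of the papers listed below study factual correctness separately.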

Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization

This paper approaches the content selection problem for clinical abstractive summarization by augmenting salient ontological terms into the summarizer, and shows that this model statistically significantly boosts state-of-the-art results in terms of ROUGE metrics.

Exploring optimal granularity for extractive summarization of unstructured health records: Analysis of the largest multi-institutional archive of health records in Japan

This study defines clinical segments, which aim to express the smallest medically meaningful concepts, and finds that clinical segments yield higher accuracy than sentences and clauses, indicating that summarization of inpatient records demands finer granularity than sentence-oriented processing.

Improving the Factual Accuracy of Abstractive Clinical Text Summarization using Multi-Objective Optimization

This study proposes a framework for improving the factual accuracy of abstractive summarization of clinical text using knowledge-guided multi-objective optimization, and experiments with three transformer encoder-decoder architectures to demonstrate that optimizing different loss functions leads to improved performance in terms of entity-level factual accuracy.

Towards Clinical Encounter Summarization: Learning to Compose Discharge Summaries from Prior Notes

Two new measures, faithfulness and hallucination rate, are introduced for evaluation in this task, which complement existing measures for fluency and informativeness.

What’s in a Summary? Laying the Groundwork for Advances in Hospital-Course Summarization

This work constructs an English, text-to-text dataset of 109,000 hospitalizations and their corresponding summary proxy: the clinician-authored “Brief Hospital Course” paragraph written as part of a discharge note, and identifies multiple implications for modeling this complex, multi-document summarization task.

Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports

A general framework where the factual correctness of a generated summary is evaluated by fact-checking it automatically against its reference using an information extraction module, and a training strategy which optimizes a neural summarization model with a factual correctness reward via reinforcement learning is proposed.

Hierarchical Annotation for Building A Suite of Clinical Natural Language Processing Tasks: Progress Note Understanding

Yanjun Gao, Dmitriy Dligach, M. Afshar. Computer Science. LREC (International Conference on Language Resources & Evaluation), 2022.
This work introduces a hierarchical annotation schema with three stages to address clinical text understanding, clinical reasoning, and summarization, and creates an annotated corpus based on an extensive collection of publicly available daily progress notes.

QIAI at MEDIQA 2021: Multimodal Radiology Report Summarization

Preliminary results show that taking advantage of the visual features from the x-rays associated with the radiology reports leads to higher evaluation metrics compared to a text-only baseline system.

IBMResearch at MEDIQA 2021: Toward Improving Factual Correctness of Radiology Report Abstractive Summarization

This work proposes a novel approach to improve factual correctness of a summarization system by re-ranking the candidate summaries based on a factual vector of the summary.

BDKG at MEDIQA 2021: System Report for the Radiology Report Summarization Task

This paper presents the winning system at the Radiology Report Summarization track of the MEDIQA 2021 shared task, which is built upon a pre-trained Transformer encoder-decoder architecture with an additional domain adaptation module designed specifically to handle transfer and generalization issues.



Domain-Aware Abstractive Text Summarization for Medical Documents

This work proposes a deep-reinforced, abstractive summarization model that is capable of reading biomedical publication abstracts and producing summaries in the form of a one-sentence headline, or title, and introduces novel reinforcement learning reward metrics based on biomedical expert tools.

Revisiting Summarization Evaluation for Scientific Articles

It is shown that, contrary to common belief, ROUGE is not very reliable in evaluating scientific summaries, and an alternative metric is proposed which is based on the content relevance between a system-generated summary and the corresponding human-written summaries.

Learning to Summarize Radiology Findings

This work proposes to automate the generation of radiology impressions with neural sequence-to-sequence learning and proposes a customized neural model for this task which learns to encode the study background information and use this information to guide the decoding process.

A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents

This work proposes the first model for abstractive summarization of single, longer-form documents (e.g., research papers), consisting of a new hierarchical encoder that models the discourse structure of a document, and an attentive discourse-aware decoder to generate the summary.

A Neural Attention Model for Abstractive Sentence Summarization

This work proposes a fully data-driven approach to abstractive sentence summarization by utilizing a local attention-based model that generates each word of the summary conditioned on the input sentence.

Bottom-Up Abstractive Summarization

This work explores the use of data-efficient content selectors to over-determine phrases in a source document that should be part of the summary, and shows that this approach improves the ability to compress text, while still generating fluent summaries.

Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond

This work proposes several novel models that address critical problems in summarization that are not adequately modeled by the basic architecture, such as modeling key-words, capturing the hierarchy of sentence-to-word structure, and emitting words that are rare or unseen at training time.

Using Latent Semantic Analysis in Text Summarization and Summary Evaluation

This work proposes a generic text summarization method that uses latent semantic analysis (LSA) to identify semantically important sentences, together with two new LSA-based evaluation methods that measure content similarity between an original document and its summary.

Mind the Gap: Dangers of Divorcing Evaluations of Summary Content from Linguistic Quality

This paper proposes a method to maximize the strength of current automatic evaluations by using the method of canonical correlation, and applies this new evaluation method, which is called ROSE (ROUGE Optimal Summarization Evaluation), to find the optimal linear combination of ROUGE scores to maximize correlation with human responsiveness.

Challenges in Data-to-Document Generation

A new, large-scale corpus of data records paired with descriptive documents is introduced, a series of extractive evaluation methods for analyzing performance are proposed, and baseline results are obtained using current neural generation methods.