Corpus ID: 220496407

GGPONC: A Corpus of German Medical Text with Rich Metadata Based on Clinical Practice Guidelines

@article{Borchert2020GGPONCAC,
  title={GGPONC: A Corpus of German Medical Text with Rich Metadata Based on Clinical Practice Guidelines},
  author={Florian Borchert and C. Lohr and Luise Modersohn and T. Langer and M. Follmann and Jan Sachs and U. Hahn and M. Schapranow},
  journal={ArXiv},
  year={2020},
  volume={abs/2007.06400}
}
  • Florian Borchert, C. Lohr, Luise Modersohn, T. Langer, M. Follmann, Jan Sachs, U. Hahn, M. Schapranow
  • Published 2020
  • Computer Science
  • ArXiv
  • The lack of publicly available text corpora is a major obstacle for progress in clinical natural language processing, for non-English speaking countries in particular. In this work, we present GGPONC (German Guideline Program in Oncology NLP Corpus), a freely distributable German language corpus based on clinical practice guidelines in the field of oncology. The corpus is one of the largest corpora of German medical text to date. It does not contain any patient-related data and can therefore be…
