Integrating cancer diagnosis terminologies based on logical definitions of SNOMED CT concepts


In oncology, the reuse of data is hampered by the heterogeneity of terminologies, which therefore need to be semantically integrated. Using a third terminology as a support is a conventional approach for integrating two terminologies that are not highly structured. The aim of our study was to use SNOMED CT for integrating ICD-10 and ICD-O3. We used two complementary resources, the mapping tables provided by SNOMED CT and the NCI Metathesaurus, to find mappings between ICD-10 or ICD-O3 concepts and SNOMED CT concepts. We used the SNOMED CT structure to filter out inconsistent mappings and to disambiguate multiple mappings. Based on the remaining mappings, we used semantic relations from SNOMED CT to establish links between ICD-10 and ICD-O3. Overall, the coverage of ICD-O3 and ICD-10 codes was over 88%. Finally, we obtained an integration of 24% (203/852) of ICD-10 concepts with 86% (888/1032) of ICD-O3 morphology concepts combined with 39% (127/330) of ICD-O3 topography concepts. Comparing our results with the 23,684 ICD-O3 pairs mapped to ICD-10 concepts in the SEER conversion file, we found 17,447 pairs of ICD-O3 concepts in common, of which 11,932 pairs were integrated with the same ICD-10 concept as in the SEER conversion file. The automated process leverages the logical definitions of SNOMED CT concepts. While the low quality of some of these definitions negatively impacted the integration process, identifying such situations made it possible to indirectly audit the structure of SNOMED CT.
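The pivot-based linking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: all identifiers below (`ICD10:A`, `M:1`, `T:1`, `SCT:...`) are made-up placeholders rather than real codes, the `Definition` structure is a simplifying assumption that reduces a SNOMED CT logical definition to one morphology and one finding site, and the filtering and disambiguation steps, as well as any use of the SNOMED CT hierarchy, are left out.

```python
from typing import Dict, NamedTuple, Set, Tuple


class Definition(NamedTuple):
    """Assumed, simplified logical definition of a SNOMED CT disorder."""
    morphology: str    # placeholder for the defining morphology concept
    finding_site: str  # placeholder for the defining finding-site concept


Mapping = Dict[str, Set[str]]


def invert(mapping: Mapping) -> Mapping:
    """Turn a source -> targets mapping into a target -> sources mapping."""
    inverted: Mapping = {}
    for source, targets in mapping.items():
        for target in targets:
            inverted.setdefault(target, set()).add(source)
    return inverted


def integrate(
    icd10_to_sct: Mapping,
    morph_to_sct: Mapping,
    topo_to_sct: Mapping,
    definitions: Dict[str, Definition],
) -> Dict[str, Set[Tuple[str, str]]]:
    """Link each ICD-10 code to (morphology, topography) ICD-O3 pairs
    through the definition of the SNOMED CT concept it is mapped to."""
    morph_by_sct = invert(morph_to_sct)
    topo_by_sct = invert(topo_to_sct)
    links: Dict[str, Set[Tuple[str, str]]] = {}
    for icd10, sct_ids in icd10_to_sct.items():
        for sct_id in sct_ids:
            definition = definitions.get(sct_id)
            if definition is None:
                continue  # no usable logical definition for this concept
            for morph in morph_by_sct.get(definition.morphology, ()):
                for topo in topo_by_sct.get(definition.finding_site, ()):
                    links.setdefault(icd10, set()).add((morph, topo))
    return links


# Toy data: every identifier is a placeholder, not a real code.
icd10_to_sct = {"ICD10:A": {"SCT:disorder-1"}, "ICD10:B": {"SCT:disorder-2"}}
morph_to_sct = {"M:1": {"SCT:morph-1"}, "M:2": {"SCT:morph-2"}}
topo_to_sct = {"T:1": {"SCT:site-1"}}
definitions = {
    "SCT:disorder-1": Definition("SCT:morph-1", "SCT:site-1"),
    "SCT:disorder-2": Definition("SCT:morph-2", "SCT:site-9"),  # site unmapped
}

links = integrate(icd10_to_sct, morph_to_sct, topo_to_sct, definitions)
print(links)
```

In this toy run only `ICD10:A` is integrated, because the finding site in the definition of `SCT:disorder-2` has no ICD-O3 topography mapped to it. This mirrors, in a simplified way, how incomplete mappings or definitions reduce the share of concepts that can be linked.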

DOI: 10.1016/j.jbi.2017.08.013

Cite this paper

@article{Nikiema2017IntegratingCD, title={Integrating cancer diagnosis terminologies based on logical definitions of SNOMED CT concepts}, author={Jean No{\"{e}}l Nikiema and Vianney Jouhet and Fleur Mougin}, journal={Journal of biomedical informatics}, year={2017}, volume={74}, pages={46-58} }