• Corpus ID: 1419166

Syntactically-Informed Semantic Category Recognizer for Discharge Summaries

Tawanda C. Sibanda, Tian He, Peter Szolovits, Özlem Uzuner
Semantic category recognition (SCR) contributes to document understanding. Most approaches to SCR fail to make use of syntax. We hypothesize that syntax, if represented appropriately, can improve SCR. We present a statistical semantic category (SC) recognizer trained with syntactic and lexical contextual clues, as well as ontological information from UMLS, to identify eight semantic categories in discharge summaries. Some of our categories, e.g., test results and findings, include complex… 

TEXT2TABLE: Medical Text Summarization System Based on Named Entity Recognition and Modality Identification
An SVM-based classifier using syntactic information is proposed, and experimental results demonstrate empirically that syntactic information can contribute to the method's accuracy.
Leveraging Sublanguage Features for the Semantic Categorization of Clinical Terms
A method for the semantic categorization of clinical terms based on their surface form is presented and it is found that features based on sublanguage properties can provide valuable cues for the classification of term variants.
Does negation really matter?
It is concluded that due to the nature of the machine learning classifiers, information on the polarity of phrase-level assertions does not improve performance on data in a multilabel document-level classification task.
Semantic relations for problem-oriented medical records
Three Approaches to Automatic Assignment of ICD-9-CM Codes to Radiology Reports
A simple hand-crafted rule-based system with lexical elements and semantic information can outperform algorithmically more complex systems when these systems base their ICD-9-CM predictions only upon individual words, n-grams, or s-grams.
Using contextual and lexical features to restructure and validate the classification of biomedical concepts
Background: Biomedical ontologies are critical for integration of data from diverse sources and for use by knowledge-based biomedical applications, especially natural language processing as well as
Using Local Grammar for Entity Extraction from Clinical Reports
The work presented in this paper uses a local grammar approach, a natural language processing (NLP) technique, to extract medical named entities from French patient clinical reports.
Research and applications: Automatic lymphoma classification with sentence subgraph mining from pathology reports
A novel framework is designed that translates sentences into graph representations, automatically mines sentence subgraphs, reduces redundancy in mined subgraphs, and automatically generates subgraph features for subsequent classification tasks, and highlights how unsupervised relation features may provide meaningful insights into lymphoma classification.
Semantic tagging of medical narratives using SNOMED CT
This research began with an investigation of using SNOMED CT® for identification of concepts in medical narratives which resulted in the derivation of a tag set which was used to develop three gold standard datasets.
Groundtruth budgeting: a novel approach to semi-supervised relation extraction in medical language
A novel technique for weakly-supervised bootstrapping of a classifier for this task: Groundtruth Budgeting, which sets aside a random portion of the groundtruth at the beginning of bootstrapping which will be gradually added as the classifier is bootstrapped.


Indexing UMLS Semantic Types for Medical Question-Answering
It is shown, using statistical studies, that strategies for using these new tags in a QA context should take into account the individual origin of documents.
Extracting Diagnoses from Discharge Summaries
A program for extracting the diagnoses and procedures from the past medical history and discharge diagnoses in the discharge summary of a case and coding these using SNOMED-CT in the UMLS using a limited amount of natural language processing.
Context-Sensitive Medical Information Retrieval
The algorithm presented significantly improves the performance of information retrieval done on medical narratives and improves precision from about 60%, when using context-insensitive retrieval, to nearly 100%.
Research Paper: A General Natural-language Text Processor for Clinical Radiology
Development of a general natural-language processor that identifies clinical information in narrative reports and maps that information into a structured representation containing clinical terms, using radiology as the test domain.
Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program
The UMLS Metathesaurus, the largest thesaurus in the biomedical domain, provides a representation of biomedical knowledge consisting of concepts classified by semantic type and both hierarchical and
Parsing English with a Link Grammar
This work has encoded English grammar into a new formal grammatical system called a link grammar, and written a program (based on new algorithms) for efficiently parsing with this very complex grammar.
Machine Learning of Text Analysis Rules for Clinical Records
The BADGER text analysis system, which identifies concepts contained in a text based on linguistic context using the CRYSTAL dictionary induction system that automatically learns text analysis rules from a set of training documents.
AMIA 2006 Symposium Proceedings