This paper analyzes a Question & Answer site for programmers, Stack Overflow, that dramatically improves on the utility and performance of Q&A systems for technical domains. Over 92% of Stack Overflow questions about expert topics are answered, in a median time of 11 minutes. Using a mixed methods approach that combines statistical data analysis...
Agreement measures are used frequently in reliability studies that involve categorical data. Simple measures like observed agreement and specific agreement can reveal a good deal about the sample. Chance-corrected agreement in the form of the kappa statistic is used frequently based on its correspondence to an intraclass correlation coefficient and the ease...
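The measures named in this abstract can be illustrated with a minimal sketch for two raters and a binary category. The function name, the cell labels `a`, `b`, `c`, `d`, and the example counts are assumptions made for illustration; they do not come from the paper, and the kappa shown is the standard two-rater Cohen's kappa.

```python
def agreement_measures(a, b, c, d):
    """Agreement measures for a 2x2 table from two raters (illustrative).

    a = both raters positive, b = only rater 1 positive,
    c = only rater 2 positive, d = both raters negative.
    Assumes the table is non-degenerate (no zero denominators).
    """
    n = a + b + c + d
    # Observed agreement: fraction of cases on which the raters agree.
    observed = (a + d) / n
    # Specific agreement, computed separately for each category.
    positive_specific = 2 * a / (2 * a + b + c)
    negative_specific = 2 * d / (2 * d + b + c)
    # Agreement expected by chance, from the raters' marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    # Cohen's kappa: agreement beyond chance, scaled by its maximum.
    kappa = (observed - expected) / (1 - expected)
    return {
        "observed": observed,
        "positive_specific": positive_specific,
        "negative_specific": negative_specific,
        "kappa": kappa,
    }


# Hypothetical counts, chosen only to exercise the function.
result = agreement_measures(a=20, b=5, c=10, d=15)
print(result)
```

With these hypothetical counts the raters agree on 70% of cases, but half of that agreement is expected by chance, so kappa comes out near 0.4.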
Temporal information is crucial in electronic medical records and biomedical information systems. Processing temporal information in medical narrative data is a very challenging area. It lies at the intersection of temporal representation and reasoning (TRR) in artificial intelligence and medical natural language processing (MLP). Some fundamental concepts...
Information retrieval studies that involve searching the Internet or marking phrases usually lack a well-defined number of negative cases. This prevents the use of traditional interrater reliability metrics like the kappa statistic to assess the quality of expert-generated gold standards. Such studies often quantify system performance as precision, recall,...
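As a rough illustration of why precision and recall remain usable when negatives are undefined, the sketch below computes them, together with the standard F-measure, from counts that never include a true-negative cell. The function name, the `beta` parameter, and the example counts are assumptions for illustration, not taken from the paper.

```python
def precision_recall_f(tp, fp, fn, beta=1.0):
    """Precision, recall, and F-measure from counts without true negatives.

    tp = items marked by both the system and the gold standard,
    fp = items marked only by the system,
    fn = items marked only by the gold standard.
    Assumes tp > 0 so that no denominator is zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    # Weighted harmonic mean of precision and recall.
    f_measure = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_measure


# Hypothetical counts, chosen only to exercise the function.
p, r, f = precision_recall_f(tp=8, fp=2, fn=4)
print(p, r, f)
```

With `beta=1` the F-measure reduces to `2*tp / (2*tp + fp + fn)`, which has the same form as positive specific agreement between two raters, so it can be computed even when the number of negative cases is unknown.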
BACKGROUND: Patient-based similarity metrics are important case-based reasoning tools which may assist with research and patient care applications. Ontology and information content principles may be potentially helpful tools for similarity metric development. METHODS: Patient cases from 1989 through 2003 from the Columbia University Medical Center data...
Evaluating natural language processing (NLP) systems in the clinical domain is a difficult task that is important for the advancement of the field. A number of NLP systems have been reported that extract information from free-text clinical reports, but not many of the systems have been evaluated. Those that were evaluated noted good performance measures but...
C.F. is named in a patent held by Columbia University for the natural language processor described in this report. PURPOSE: To evaluate translation of chest radiographic reports by using natural language processing and to compare the findings with those in the literature. MATERIALS AND METHODS: A natural language processor coded 10 years of narrative...