A JDI (Journal Descriptor Indexing) tool has been developed at NLM that automatically categorizes biomedical text given as input, returning a ranked list, with scores between 0 and 1, of either JDs (Journal Descriptors, corresponding to biomedical disciplines) or STs (UMLS Semantic Types). Possible applications include WSD (Word Sense Disambiguation) and retrieval [...]
Unicode is an industry standard allowing computers to consistently represent and manipulate text expressed in most of the world's writing systems. It is widely used in multilingual NLP (natural language processing) projects. On the other hand, some NLP projects still deal only with ASCII characters. This paper describes methods of utilizing [...]
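The abstract above is truncated, so the paper's own methods are not visible here. As a generic illustration of the gap between Unicode input and ASCII-only tools, the following minimal Python sketch folds Unicode text to ASCII using the standard `unicodedata` module. The function name `to_ascii` and the choice to replace unmappable characters with `?` are assumptions made for this example, not the approach described in the paper.

```python
import unicodedata


def to_ascii(text: str) -> str:
    """Fold Unicode text to ASCII (illustrative helper, hypothetical name).

    Steps:
      1. NFKD-decompose, so 'é' becomes 'e' + combining accent and
         compatibility characters such as the 'fi' ligature expand.
      2. Drop the combining marks left by the decomposition.
      3. Replace anything still outside ASCII with '?'.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.encode("ascii", errors="replace").decode("ascii")


if __name__ == "__main__":
    print(to_ascii("naïve café"))
    print(to_ascii("\ufb01brosis"))  # starts with the 'fi' ligature
```

Characters with no decomposition to ASCII, such as Greek letters, come out as `?` in this sketch; a real pipeline would more likely map them through a curated table.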
1. Introduction. The demand for natural language processing (NLP) in medicine has grown significantly in recent years. This growth is expected to accelerate due to the continuing adoption of electronic medical records (EMRs). Medical language processing (MLP) seeks to analyze linguistic patterns found not only in electronic medical records, but also in [...]
The SPECIALIST Lexicon has been distributed annually by the National Library of Medicine (NLM) since 1994. Lexical records are used for Part-of-Speech (POS) tagging, indexing, information retrieval, concept mapping, etc. in many Natural Language Processing (NLP) projects, such as Lexical Tools, MetaMap, SemRep, UMLS Metathesaurus, and ClinicalTrials.gov. [...]
It is always a challenge to present Web applications at a facility with no Internet connection. Traditional presentation methods such as transparencies or slides are inadequate for demonstrating dynamic Web applications. Currently, virtual-live demonstrations of Web applications are created with static HTML (Hypertext Markup Language) files. However, [...]
Journal Descriptor Indexing (JDI) is a vector-based text classification system developed at NLM (National Library of Medicine), originally in Lisp and now as a Java tool. Consequently, a testing suite was developed to verify training set data and results of the JDI tool. A methodology was developed and implemented to compare two sets of JD vectors, [...]
Multiwords are vital to better precision and recall in NLP applications. The Lexical Systems Group (LSG) developed an effective approach to adding multiwords to the SPECIALIST Lexicon from the MEDLINE n-gram set. This paper describes a frequency analysis of LexMultiwords (LMWs) and acronym expansions based on the word count (WC) in MEDLINE. Results show most [...]
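To make the idea of a frequency analysis by word count concrete, here is a minimal Python sketch that groups a list of terms by their number of whitespace-separated words and reports the share of each group. The function name `word_count_distribution` and the sample terms are hypothetical; the sketch does not reproduce the paper's data, its tokenization rules, or its results.

```python
from collections import Counter


def word_count_distribution(terms):
    """Return {word_count: fraction_of_terms}, sorted by word count.

    Word count here is simply the number of whitespace-separated tokens,
    which is an assumption of this sketch.
    """
    counts = Counter(len(term.split()) for term in terms)
    total = sum(counts.values())
    return {wc: n / total for wc, n in sorted(counts.items())}


if __name__ == "__main__":
    # Toy input, invented for illustration only.
    sample_terms = [
        "aspirin",
        "heart attack",
        "blood pressure",
        "magnetic resonance imaging",
    ]
    for wc, share in word_count_distribution(sample_terms).items():
        print(f"WC={wc}: {share:.2f}")
```

An empty input yields an empty dictionary, since the comprehension never divides when there are no counts.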