Corpus ID: 16791757

Multiword Frequency Analysis Based on the MEDLINE N-gram Set

C. J. Lu, Destinee L. Tormey, Lynn McCreedy, A. Browne
Multiwords are vital to better precision and recall in NLP applications. The Lexical Systems Group (LSG) developed an effective approach for adding multiwords to the SPECIALIST Lexicon from the MEDLINE n-gram set. This paper describes a frequency analysis of LexMultiwords (LMWs) and acronym expansions (e.g., blood pressure for BP) based on word count (WC) in MEDLINE. Results show that most LMWs fall in the low WC range, with better precision and F1 score. Introduction: LMWs are terms in Lexical…
1 Citation
Enhanced LexSynonym Acquisition for Effective UMLS Concept Mapping
The LSG has developed a new system for element synonym acquisition, based on enhanced requirements and design for better performance. The results show a 36.71-fold growth of synonyms in the Lexicon (lexSynonym) in the 2017 release.


Generating the MEDLINE N-Gram Set
This work processed 2.6 billion single words from 22.4 million MEDLINE documents (titles and abstracts) to generate MEDLINE n-grams (n = 1 to 5) with terms appearing at least 30 times and having fewer than 50 characters for the 2014 release to resolve the Java limitation issue.
Word Frequency Distributions
This paper presents a meta-modelling framework for estimating the randomness of word frequency distributions using a variety of non-parametric and parametric models.
Word Frequency Distributions. 1st Edition
  • 2001
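
The filtering described in the "Generating the MEDLINE N-Gram Set" entry above (n = 1 to 5, at least 30 occurrences, fewer than 50 characters) can be illustrated with a minimal Python sketch. This is not the LSG's implementation: the function names `ngram_counts` and `filter_ngrams`, the lowercasing, the whitespace tokenization, and counting every occurrence rather than every document are all assumptions made for illustration.

```python
from collections import Counter


def ngram_counts(documents, max_n=5):
    """Count word n-grams (n = 1..max_n) over an iterable of text strings.

    Assumption: tokens are lowercased and split on whitespace; the real
    MEDLINE n-gram tokenizer is not described in the listing above.
    """
    counts = Counter()
    for doc in documents:
        tokens = doc.lower().split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return counts


def filter_ngrams(counts, min_count=30, max_chars=50):
    """Keep terms seen at least min_count times and shorter than max_chars."""
    return {
        term: count
        for term, count in counts.items()
        if count >= min_count and len(term) < max_chars
    }


if __name__ == "__main__":
    # Toy corpus: one phrase repeated 30 times, another only 5 times.
    docs = ["blood pressure was measured"] * 30 + ["heart rate"] * 5
    kept = filter_ngrams(ngram_counts(docs))
    for term in sorted(kept):
        print(term, kept[term])
```

On the toy corpus, every n-gram drawn from the repeated phrase meets the threshold of 30, while "heart rate" and its single words are dropped because they occur only 5 times.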