PubMed Phrases, an open set of coherent phrases for searching biomedical literature
@article{Kim2018PubMedPA, title={PubMed Phrases, an open set of coherent phrases for searching biomedical literature}, author={Sun Kim and Lana Yeganova and Donald C. Comeau and W. John Wilbur and Zhiyong Lu}, journal={Scientific Data}, year={2018}, volume={5} }
In biomedicine, key concepts are often expressed by multiple words (e.g., ‘zinc finger protein’). Previous work has shown that treating a sequence of words as a meaningful unit, where applicable, is not only important for human understanding but also beneficial for automatic information seeking. Here we present a collection of PubMed® Phrases that are beneficial for information retrieval and human comprehension. We define these phrases as coherent chunks that are logically connected. To collect the…
16 Citations
Towards a unified search: Improving PubMed retrieval with full text
- Computer Science
- J. Biomed. Informatics
- 2022
PMCVec: Distributed phrase representation for biomedical text processing
- Computer Science
- J. Biomed. Informatics X
- 2019
A novel MEDLINE topic indexing method using image presentation
- Computer Science
- J. Vis. Commun. Image Represent.
- 2019
PubMed Author-assigned Keyword Extraction (PubMedAKE) Benchmark
- Economics, Computer Science
- CIKM
- 2022
Experimental results using state-of-the-art baseline methods illustrate the need for developing automatic keyword extraction methods for biomedical literature.
MeSH-based dataset for measuring the relevance of text retrieval
- Computer Science
- BioNLP
- 2018
This work selects a suitable subset of MeSH terms as queries, utilizes MeSH term assignments as pseudo-relevance rankings for retrieval evaluation, and uses the proposed retrieval evaluation framework to better understand how to combine heterogeneous sources of textual information.
Robust Representation Learning of Biomedical Names
- Computer Science
- ACL
- 2019
The idea behind the approach is to consider and encode contextual meaning, conceptual meaning, and the similarity between synonyms during the representation learning process, resulting in high practical utility in real-world applications.
A reference set of curated biomedical data and metadata from clinical case reports
- Medicine
- Scientific Data
- 2018
A standardized metadata template and MACCR set are developed that render CCRs more findable, accessible, interoperable, and reusable while serving as valuable resources for key user groups, including researchers, physician investigators, clinicians, data scientists, and those shaping government policies for clinical trials.
A graph-based method for reconstructing entities from coordination ellipsis in medical text
- Medicine
- J. Am. Medical Informatics Assoc.
- 2020
RECEEM improves concept normalization for medical coordinated elliptical expressions in a variety of biomedical corpora, outperforms existing methods, and significantly enhances the performance of 2 notable NLP systems for mapping coordination ellipses in the evaluation.
Fast searches of large collections of single cell data using scfind
- Biology
- bioRxiv
- 2019
Using transcriptome data from mouse cell atlases, scfind can be used to evaluate marker genes, to perform in silico gating, and to identify both cell-type specific and housekeeping genes, and a subquery optimization routine is developed to ensure that long and complex queries return meaningful results.
Epione application: An integrated web-toolkit of clinical genomics and personalized medicine in systemic lupus erythematosus
- Biology
- International journal of molecular medicine
- 2022
The Epione application is presented, an integrated bioinformatics web-toolkit designed to assist medical experts and researchers in more accurately diagnosing SLE; it may facilitate early-stage diagnosis by comparing the patient's genomic profile against the list of the most predictable candidate gene variants related to SLE.
References
SHOWING 1-10 OF 41 REFERENCES
Identifying well-formed biomedical phrases in MEDLINE® text
- Computer Science
- J. Biomed. Informatics
- 2012
How to interpret PubMed queries and why it matters
- Computer Science
- J. Assoc. Inf. Sci. Technol.
- 2009
An automated retrieval evaluation method is developed, based on machine learning techniques, that enables us to evaluate and compare various retrieval outcomes and shows that the class of records that contain all the search terms, but not the phrase, qualitatively differs from the class of records containing the phrase.
Summarizing Topical Contents from PubMed Documents Using a Thematic Analysis
- Computer Science
- EMNLP
- 2015
A method is proposed that finds sub-topics, referred to as themes, and computes representative titles based on the set of documents in each theme; it outperformed both LDA and MeSH® terms.
Extracting noun phrases for all of MEDLINE
- Computer Science
- AMIA
- 1999
The extraction of noun phrases from MEDLINE is discussed, using a general parser not tuned specifically for any medical domain, and it is claimed that a generic parser can effectively extract all the different phrases across the entire medical literature.
Corpus-based statistical screening for phrase identification.
- Computer Science
- Journal of the American Medical Informatics Association: JAMIA
- 2000
Statistical scoring methods provide a promising approach to the extraction of useful phrases from a natural language database for the purpose of indexing or providing hyperlinks in text.
Understanding PubMed® user search behavior through log analysis
- Computer Science
- Database J. Biol. Databases Curation
- 2009
This investigation was conducted through the analysis of one month of log data, consisting of more than 23 million user sessions and more than 58 million user queries, which provided insight into PubMed users’ needs and their behavior.
Bridging the gap: Incorporating a semantic similarity measure for effectively mapping PubMed queries to documents
- Computer Science
- J. Biomed. Informatics
- 2017
Meshable: searching PubMed abstracts by utilizing MeSH and MeSH-derived topical terms
- Computer Science
- Bioinform.
- 2016
A web interface is introduced which allows users to enter queries to find MeSH terms closely related to the queries and can be effectively used to find full names of abbreviations and to disambiguate user queries.
Retro: concept-based clustering of biomedical topical sets
- Computer Science
- Bioinform.
- 2014
Retro is a novel clustering algorithm that extracts meaningful clusters, along with concise and descriptive titles, from small and homogeneous document collections, and is superior to existing methods in terms of cluster quality.
Click-words: learning to predict document keywords from a user perspective
- Computer Science
- Bioinform.
- 2010
This model is able to accurately predict the words likely to appear in user queries that lead to document clicks, and suggests that click-words tend to be biomedical entities, to exist in article titles, and to occur repeatedly in article abstracts.