Rezarta Islamaj Doğan

A great deal of information on the molecular genetics and biochemistry of model organisms has been reported in the scientific literature. However, this information is typically described in free-text form and is not readily amenable to computational analysis. To address this, the BioGRID database systematically curates the biomedical literature for genetic and protein ...
This paper reports the use of BioC to address a common challenge in processing biomedical text: the frequent abbreviation of biomedical entity names. We selected three different abbreviation definition identification modules and used the publicly available BioC code to convert these independent modules into BioC-compatible components that ...
Term normalization is frequently used in information retrieval tasks to reduce variant word forms to a common form. The most general term normalization technique used in practice is stemming; however, it has been found to be not completely reliable. Here we present PubTermVariants, a high-quality, data-driven resource of term variant pairs that can improve ...