Advancing Science through Mining Libraries, Ontologies, and Communities*
@article{Evans2011AdvancingST,
  title={Advancing Science through Mining Libraries, Ontologies, and Communities*},
  author={James A. Evans and A. Rzhetsky},
  journal={The Journal of Biological Chemistry},
  year={2011},
  volume={286},
  pages={23659--23666}
}
Life scientists today cannot hope to read everything relevant to their research. Emerging text-mining tools can help by identifying topics and distilling statements from books and articles with increased accuracy. Researchers often organize these statements into ontologies, consistent systems of reality claims. Like scientific thinking and interchange, however, text-mined information (even when accurately captured) is complex, redundant, sometimes incoherent, and often contradictory: it is…
14 Citations
Exploiting Latent Features of Text and Graphs
- Computer Science
- 2020
This dissertation focuses on information available within biomedical science, including human-written abstracts of scientific papers and machine-generated graphs of biomedical entity relationships, and presents the Moliere system and a deep-learning approach to hypothesis generation.
MOLIERE: Automatic Biomedical Hypothesis Generation System
- Computer Science
- KDD
- 2017
This work models hypotheses using Latent Dirichlet Allocation applied to abstracts found near shortest paths discovered within a multi-modal and multi-relational network of biomedical objects extracted from several heterogeneous datasets from the National Center for Biotechnology Information (NCBI).
Predicting research trends with semantic and neural networks with an application in quantum physics
- Computer Science
- Proceedings of the National Academy of Sciences
- 2020
The development of a semantic network for quantum physics, denoted SemNet, is demonstrated using 750,000 scientific papers and knowledge from books and Wikipedia, which is used to predict future trends in research and to inspire personalized and surprising seeds of ideas in science.
Supersemantics for Knowledge Extraction
- Computer Science
- 2015
This thesis introduces Supersemantics as an approach to integrate different linguistic and other adjacent fields and bridges the boundaries between typical units of linguistics as well as external knowledge.
Data Mining Approach for Extraction of Useful Information About Biologically Active Compounds from Publications
- Biology, Computer Science
- J. Chem. Inf. Model.
- 2019
This study developed and validated a data mining approach for extracting text fragments that describe bioassays, used it to evaluate compounds and their biological activity reported in scientific publications, and found that papers can be categorized as relevant or irrelevant by machine learning analysis of their abstracts.
Text mining applications in psychiatry: a systematic literature review
- Psychology, Medicine
- International journal of methods in psychiatric research
- 2016
Text mining approaches are becoming essential to facilitate the automated extraction of useful biomedical information from unstructured text, and it is demonstrated that TM can contribute to complex research tasks in psychiatry.
Tradition and Innovation in Scientists’ Research Strategies
- Economics
- ArXiv
- 2013
By studying prizewinners in biomedicine and chemistry, it is shown that occasional gambles for extraordinary impact are a compelling explanation for observed levels of risky innovation.
Data analysis and data mining: current issues in biomedical informatics.
- Medicine, Computer Science
- Methods of information in medicine
- 2011
Biomedical informatics represents a natural framework to properly and effectively apply data analysis and data mining methods in a decision-making context; it will be necessary to preserve the inclusive nature of the field and to foster increased sharing of data and methods between researchers.
References
Strategic Reading, Ontologies, and the Future of Scientific Publishing
- Computer Science
- Science
- 2009
How scientists use new forms of the “literature” and how the ascendance of novel computing technologies will combine to revolutionize the way scientific data is accessed, synthesized, and turned to practical use are reviewed.
Rutabaga by any other name: extracting biological names
- Computer Science
- J. Biomed. Informatics
- 2002
Using statistical and knowledge-based approaches for literature-based discovery
- Biology
- J. Biomed. Informatics
- 2006
Infotopia: How Many Minds Produce Knowledge
- Economics
- 2006
This book explores the human potential to pool widely dispersed information, and to use that knowledge to improve both our institutions and our lives. Various methods for aggregating information are…
A translation approach to portable ontology specifications
- Computer Science
- 1993
This paper describes a mechanism for defining ontologies that are portable over representation systems, basing Ontolingua itself on an ontology of domain-independent, representational idioms.
Microparadigms: chains of collective reasoning in publications about molecular interactions.
- Biology
- Proceedings of the National Academy of Sciences of the United States of America
- 2006
It is found that published statements, regardless of their verity, tend to interfere with interpretation of the subsequent experiments and, therefore, can act as scientific "microparadigms," similar to dominant scientific theories.
How citation distortions create unfounded authority: analysis of a citation network
- Medicine
- BMJ: British Medical Journal
- 2009
Citation is both an impartial scholarly method and a powerful form of social communication that can be used to generate information cascades resulting in unfounded authority of claims.
GENIES: a natural-language processing system for the extraction of molecular pathways from journal articles
- Biology, Computer Science
- ISMB
- 2001
A system is presented that extracts and structures information about cellular pathways from the biological literature in accordance with a knowledge model that was developed earlier and implemented by modifying an existing medical natural language processing system.
Biomedical Discovery Acceleration, with Applications to Craniofacial Development
- Computer Science
- PLoS Comput. Biol.
- 2009
A novel computational approach to this challenge, a knowledge-based system that combines reading, reasoning, and reporting methods to facilitate analysis of experimental data, is described and demonstrated on a large-scale gene expression array dataset relevant to craniofacial development.
The outcomes of pathway database computations depend on pathway ontology
- Biology
- Nucleic acids research
- 2006
KEGG and BioCyc pathways are compared using genome context methods, which determine the functional relatedness of pairs of genes, supporting the conclusion that the BioCyc pathway conceptualization is closer to a single conserved biological process than is that of KEGG.