BioExcom: Detection and Categorization of Speculative Sentences in Biomedical Literature

Biological research papers are replete with speculative sentences. We present BioExcom, a rule-based system that detects speculations in biomedical literature. It also automatically distinguishes between prior and new speculations in the analyzed paper. BioExcom is based on Contextual Exploration processing (a hierarchical search for linguistic surface markers, run on the EXCOM computational platform). To accomplish this task, BioExcom also uses specific linguistic resources… 
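The two-stage idea in the abstract (find a surface marker of speculation first, then look in the same sentence for clues that separate prior from new speculation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the marker lists, the `classify` function, and the label strings are hypothetical and are not BioExcom's actual rules or linguistic resources, which the abstract does not give.

```python
import re

# Hypothetical indicator list (stage 1): surface markers that may signal speculation.
INDICATORS = re.compile(
    r"\b(?:may|might|could|possibly|suggest(?:s|ed)?|hypothesi[sz]e[sd]?)\b",
    re.IGNORECASE,
)

# Hypothetical contextual clues (stage 2): an author-year citation or wording
# that attributes the idea to earlier work.
PRIOR_CLUES = re.compile(
    r"\([^()]*\b(?:19|20)\d{2}\)|\bpreviously\b|\bhas been (?:proposed|suggested)\b",
    re.IGNORECASE,
)

# Hypothetical contextual clues (stage 2): first-person wording that attributes
# the idea to the current paper.
NEW_CLUES = re.compile(r"\b(?:we|our)\b", re.IGNORECASE)


def classify(sentence: str) -> str:
    """Label a sentence using an indicator first, then contextual clues."""
    if not INDICATORS.search(sentence):
        return "not speculative"
    if PRIOR_CLUES.search(sentence):
        return "prior speculation"
    if NEW_CLUES.search(sentence):
        return "new speculation"
    return "speculation"


if __name__ == "__main__":
    for s in [
        "The protein binds DNA.",
        "It has been suggested that X regulates Y (Smith et al., 2005).",
        "Our results suggest that X may regulate Y.",
    ]:
        print(classify(s), "|", s)
```

A real rule-based system would need far richer markers and would have to handle cases this sketch ignores, such as negated or non-speculative uses of words like "may".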
Constructing Synthesized Sheets by Mining Scientific Research Papers: Application to the Biological Domain
This chapter presents a text mining tool for scientific publications that allows the extraction of textual segments from large corpora according to a set of semantic categories (results, methods, hypotheses, etc.).
Scientific Monitoring by Mining Scientific Papers
This paper presents a method that exploits linguistic ontologies to generate semantic annotations of the corpus, and identifies fifteen key points around the definition to represent the current issues.
Discursive Mining Viewpoints in Building Multi-Document Synthesized Sheets
Multi-document sheets are viewed as semantically structured representations of textual documents that allow the annotation of segments (each of which can be a title, a paragraph, a sentence or a clause) according to a given discursive mining viewpoint.
Management of Big Textual Data in Qualitative Research: Organizing the Relationships in a Typology based on Logical Properties
A structured set of relationships is proposed, based on a linguistic model, the Applicative and Cognitive Grammar (ACG) of Jean-Pierre Descles, to better design the knowledge structures underlying the concepts of a field.
Towards an automatic analyze and standardization of unstructured data in the context of big and linked data
The goal of the research described here is to present an approach for automating the detection and extraction of meaning from the unstructured Web using its normalized part: the Web of Data and Linked Open Data (LOD), such as RDF WordNet, DBpedia, etc.


Automatic annotation of speculation in biomedical texts: new perspectives and large-scale evaluation
This work highlights a more restrictive way to consider speculations, viewed as a source of knowledge, and discusses the criteria used to determine whether a sentence is speculative.
Detecting Speculations and their Scopes in Scientific Text
This work introduces an approach that identifies speculative sentence fragments by solving two sub-problems, using a diverse set of linguistic features that represent the contexts of the keywords.
Recognizing speculative language in biomedical research articles: a linguistically motivated perspective
The results demonstrate that speculative language can be recognized successfully with a linguistically motivated approach and confirm that the selection of hedging devices affects the speculative strength of the sentence, which can be captured reasonably well by weighting the hedging cues.
The BioScope corpus: annotation for negation, uncertainty and their scope in biomedical texts
A corpus annotation project that has produced the BioScope corpus, a freely available resource for research on handling negation and uncertainty in biomedical texts. The corpus consists of medical free texts, biological full papers and biological scientific abstracts.
Hedge Classification in Biomedical Texts with a Weakly Supervised Selection of Keywords
This paper demonstrates the importance of hedge classification experimentally in two real-life scenarios, namely the ICD9-CM coding of radiology reports and gene name Entity Extraction from scientific texts, and develops a maxent-based solution for both the free text and scientific text processing tasks.
Learning the Scope of Hedge Cues in Biomedical Texts
It is shown that the same scope finding approach can be applied to both negation and hedging, and the system is tested on the three subcorpora of the BioScope corpus that represent different text types.
Exploring hedge identification in biomedical literature
The Language of Bioscience: Facts, Speculations, and Statements In Between
Results from a manual annotation experiment suggest that the notion of a speculative sentence can be reliably annotated by humans, and an experiment with automated methods suggests that reliable automated methods might also be developed.
Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy
