Automatic recognition of multi-word terms: the C-value/NC-value method
Technical terms (henceforth called terms) are important elements for digital libraries. In this paper we present a domain-independent method for the automatic extraction of multi-word …
  • 769
  • 87
brat: a Web-based Tool for NLP-Assisted Text Annotation
We introduce the brat rapid annotation tool (BRAT), an intuitive web-based tool for text annotation supported by Natural Language Processing (NLP) technology. BRAT has been developed for rich …
  • 784
  • 75
Developing a Robust Part-of-Speech Tagger for Biomedical Text
This paper presents a part-of-speech tagger which is specifically tuned for biomedical text. We have built the tagger with maximum entropy modeling and a state-of-the-art tagging algorithm. The …
  • 502
  • 35
Extracting Nested Collocations
This paper provides an approach to the semi-automatic extraction of collocations from corpora using statistics. The growing availability of large textual corpora, and the increasing number of …
  • 149
  • 29
Distributional Semantics Resources for Biomedical Text Processing
The openly available biomedical literature contains over 5 billion words in publication abstracts and full texts. Recent advances in unsupervised language processing methods have made it possible to …
  • 345
  • 26
Stochastic Gradient Descent Training for L1-regularized Log-linear Models with Cumulative Penalty
Stochastic gradient descent (SGD) uses approximate gradients estimated from subsets of the training data and updates the parameters in an online fashion. This learning framework is attractive because …
  • 197
  • 25
Identifying Sections in Scientific Abstracts using Conditional Random Fields
OBJECTIVE: The prior knowledge about the rhetorical structure of scientific abstracts is useful for various text-mining tasks such as information extraction, information retrieval, and automatic …
  • 147
  • 24
A Neural Layered Model for Nested Named Entity Recognition
Entity mentions embedded in longer entity mentions are referred to as nested entities. Most named entity recognition (NER) systems deal only with the flat entities and ignore the inner nested ones, …
  • 70
  • 24
The C-value/NC-value Method of Automatic Recognition for Multi-Word Terms
Technical terms (henceforth called simply terms) are important elements for digital libraries. In this paper we present a domain-independent method for the automatic extraction of multi-word terms, …
  • 198
  • 22
Text Mining for Biology And Biomedicine
Introduction to Text Mining for Biology. Levels of Natural Language Processing for Text Mining. Lexical, Terminological and Ontological Resources for Biological Text Mining. Automatic Terminology …
  • 362
  • 20