Sophia Ananiadou

Technical terms (henceforth called terms) are important elements for digital libraries. In this paper we present a domain-independent method for the automatic extraction of multi-word terms from machine-readable special language corpora. The method (C-value/NC-value) combines linguistic and statistical information. The first part, C-value, enhances...
We introduce the brat rapid annotation tool (BRAT), an intuitive web-based tool for text annotation supported by Natural Language Processing (NLP) technology. BRAT has been developed for rich structured annotation for a variety of NLP tasks and aims to support manual curation efforts and increase annotator productivity using NLP techniques. We discuss...
This paper presents a part-of-speech tagger which is specifically tuned for biomedical text. We have built the tagger with maximum entropy modeling and a state-of-the-art tagging algorithm. The tagger was trained on a corpus containing newspaper articles and biomedical documents so that it would work well on various types of biomedical text. Experimental...
OBJECTIVE: Prior knowledge of the rhetorical structure of scientific abstracts is useful for various text-mining tasks such as information extraction, information retrieval, and automatic summarization. This paper presents a novel approach to categorizing sentences in scientific abstracts into four sections: objective, methods, results, and...
Stochastic gradient descent (SGD) uses approximate gradients estimated from subsets of the training data and updates the parameters in an online fashion. This learning framework is attractive because it often requires much less training time in practice than batch training algorithms. However, L1-regularization, which is becoming popular in natural language...
Text mining is defined by Hearst (1999) as the automatic discovery of new, previously unknown information from unstructured textual data. This is often seen as comprising three major tasks: information retrieval (gathering relevant documents), information extraction (extracting information of interest from these documents), and data mining (discovering...
This paper provides an approach to the semi-automatic extraction of collocations from corpora using statistics. The growing availability of large textual corpora, and the increasing number of applications of collocation extraction, has given rise to various approaches on the topic. In this paper, we address the problem of 'nested...
The topic of automatic term recognition (ATR) is of great interest, especially with the growth of NLP systems, which are passing from the development stage to the application stage. The application of NLP technology involves customising systems towards specific needs, particularly in specialised domains (sublanguages) which form the main target of the...