Anita Alicante

This paper proposes and discusses the use of text mining techniques for extracting information from clinical records written in Italian. However, because annotated material is very difficult and expensive to obtain for languages other than English, we consider only unsupervised approaches, which require no annotated training set. We therefore…
The aim of this paper is to contribute to the debate on the issues raised by Morphologically Rich Languages and, more precisely, to investigate from a cross-paradigm perspective how constituent order influences the data-driven parsing of one such language, Italian. It therefore presents new evidence from experiments on Italian, a language…
In this work we consider the problem of extracting concepts and the relations between them from documents, with the aim of constructing an index for a more semantically oriented search engine. While the assessment is performed on a biomedical application, the proposed solutions can also be applied to other domains. With the distributed architecture proposed, we…
This paper discusses the application of an unsupervised text mining technique for extracting information from clinical records in Italian. The approach consists of two steps. First, a metathesaurus is used together with natural language processing tools to extract the domain entities. Then, clustering is applied to explore relations between…
English. A crucial point for the applicability of sentiment analysis over Twitter is represented by the degree of manual intervention necessary to adapt the approach to the considered domain. In this work we propose a new sentiment polarity classifier exploiting barrier features, originally introduced for the classification of textual data. Empirical tests…
English. We propose a system to extract entities and relations from a set of clinical records in Italian based on two preceding works (Alicante et al., 2016b) and (Alicante et al., 2016a). This approach does not require annotated data and is based on existing domain lexical resources and unsupervised machine learning techniques. Italian. We propose a…