This paper describes a new, freely available, highly multilingual named entity resource for person and organisation names that has been compiled over seven years of large-scale multilingual news analysis combined with Wikipedia mining, resulting in 205,000 person and organisation names plus about the same number of spelling variants written in over 20 …
Recent years have brought a significant growth in the volume of research in sentiment analysis, mostly on highly subjective text types (movie or product reviews). The main difference these texts have with news articles is that their target is clearly defined and unique across the text. Following different annotation efforts and the analysis of the issues …
We present a fully automatic live online system (accessible at http://langtech.jrc.it/SocNet) that produces monolingual or mixed-language social network graphs showing which groups of persons are being mentioned together in the world news of the last few hours. The system is based on name mentions extracted automatically from an average of …
In May 2011, an outbreak of enterohemorrhagic Escherichia coli (EHEC) occurred in northern Germany. The Shiga toxin-producing strain O104:H4 infected several thousand people, frequently leading to haemolytic uremic syndrome (HUS) and gastroenteritis (GI). First reports about the outbreak appeared in the German media on Saturday 21st of May 2011; the media …
The mission of the JRC-IPSC is to provide research results and to support EU policy-makers in their effort towards global security and towards protection of European citizens from accidents, deliberate attacks, fraud and illegal actions against EU policies. The European Centre for Disease Prevention and Control (ECDC) was established in 2005. It is an EU …
In this paper we present an approach to large-scale coreference resolution for an ample set of human languages, with a particular emphasis on time performance and precision. One of the distinctive features of our approach is the use of a mature multilingual named entity repository (persons and organizations) gradually compiled over the past few years. Our …
This paper presents an effort to build a real-time event extraction system for border security-related intelligence gathering from online news. First, the background and motivation behind the presented work are given. Next, the paper describes the event extraction processing chain, the specifics of the domain, i.e., illegal migration and …
The Europe Media Monitor (EMM) family of applications is a set of multilingual tools that gather, cluster and classify news, currently in fifty languages, and that extract named entities and quotations (reported speech) from twenty languages. In this paper, we describe the recent effort of adding the African Bantu language Swahili to EMM. EMM is designed in …