Jenya Belyaeva

This paper describes a new, freely available, highly multilingual named entity resource for person and organisation names that has been compiled over seven years of large-scale multilingual news analysis combined with Wikipedia mining, resulting in 205,000 person and organisation names plus about the same number of spelling variants written in over 20…
Recent years have brought a significant growth in the volume of research in sentiment analysis, mostly on highly subjective text types (movie or product reviews). The main difference these texts have with news articles is that their target is clearly defined and unique across the text. Following different annotation efforts and the analysis of the issues…
In May 2011, an outbreak of enterohemorrhagic Escherichia coli (EHEC) occurred in northern Germany. The Shiga toxin-producing strain O104:H4 infected several thousand people, frequently leading to haemolytic uremic syndrome (HUS) and gastroenteritis (GI). First reports about the outbreak appeared in the German media on Saturday 21st of May 2011; the media…
The mission of the European Centre for Disease Prevention and Control (ECDC) is to identify, assess and communicate current and emerging infectious threats to human health within the European Union (EU). The identification of threats is based on the collection and analysis of information from established communicable disease surveillance…
In this paper we present an approach to large-scale coreference resolution for an ample set of human languages, with a particular emphasis on time performance and precision. One of the distinctive features of our approach is the use of a mature multilingual named entity repository (persons and organizations) gradually compiled over the past few years. Our…
This paper presents an effort to build a real-time event extraction system for border security-related intelligence gathering from online news. First, the background and motivation behind the presented work are given. Next, the paper describes the event extraction processing chain, the specifics of the domain, i.e., illegal migration and…
The Europe Media Monitor (EMM) family of applications is a set of multilingual tools that gather, cluster and classify news in currently fifty languages and that extract named entities and quotations (reported speech) from twenty languages. In this paper, we describe the recent effort of adding the African Bantu language Swahili to EMM. EMM is designed in…