Diana Maynard

In this paper we present GATE, a framework and graphical development environment which enables users to develop and deploy language engineering components and resources in a robust fashion. The GATE architecture has enabled us not only to develop a number of successful applications for various language processing tasks (such as Information Extraction), but …
Twitter is the largest source of microblog text, responsible for gigabytes of human discourse every day. Processing microblog text is difficult: the genre is noisy, documents have little context, and utterances are very short. As such, conventional NLP tools fail when faced with tweets and other microblog text. We present TwitIE, an open-source NLP pipeline …
In this paper we present recent work on GATE, a widely-used framework and graphical development environment for creating and deploying Language Engineering components and resources in a robust fashion. The GATE architecture has facilitated the development of a number of successful applications for various language processing tasks (such as Information …
The evaluation of the quality of ontological classification is an important part of semantic web technology. Because this area is under constant development, it requires improvement and standardisation. This paper discusses existing evaluation metrics, and proposes a new method for evaluating the ontology population task, which is general enough to be used …
Applying natural language processing for mining and intelligent information access to tweets (a form of microblog) is a challenging, emerging research area. Unlike carefully authored news text and other longer content, tweets pose a number of new challenges, due to their short, noisy, context-dependent, and dynamic nature. Information extraction from tweets …
In this article we examine shallow methods for anaphora resolution and for building coreference chains, which we developed as modules of the ANNIE information extraction system. The "orthomatcher" module handles orthographic coreference of proper names, and the anaphora resolution module handles the …
Current research in Information Extraction tends to be focused on application-specific systems tailored to a particular domain. The Muse system is a multi-purpose Named Entity recognition system which aims to reduce the need for costly and time-consuming adaptation of systems to new applications, with its capability for processing texts from widely …
Ontology generation and population is a crucial part of knowledge base construction and maintenance that enables us to relate text to ontologies, providing a rich and customised ontology related to the data and domain with which we are concerned. SPRAT combines aspects from traditional named entity recognition, ontology-based information extraction and …
Business Intelligence (BI) requires the acquisition and aggregation of key pieces of knowledge from multiple sources in order to provide valuable information to customers or feed statistical BI models and tools. The massive amount of information available to business analysts makes information extraction and other natural language processing tools key …