Much progress has been made in the provision of reusable data resources for Natural Language Engineering, such as grammars, lexicons, and thesauruses. Although a number of projects have addressed the provision of reusable algorithmic resources (or 'tools'), take-up of these resources has been relatively slow. This paper describes GATE, a General Architecture for …
A web-services-based architecture for Language Resources, using existing technology such as XML, SOAP, WSDL and UDDI, is presented. The web services architecture creates a pervasive information infrastructure that enables straightforward access to two kinds of Language Resources: traditional information sources and language processing resources. Details …
The evaluation of ontologies is vital for the growth of the Semantic Web. We consider a number of problems in evaluating a knowledge artifact like an ontology. We propose in this paper that one approach to ontology evaluation should be corpus or data driven. A corpus is the most accessible form of knowledge and its use allows a measure to be derived of the …
course(2) fulfils this criterion, course is modified by hilly which expects a noun of type nonmovable solid. However, course(2) is marked abstract, which does not comply with this restriction. Therefore, assuming that run is being used in its second sense leads to a situation in which there is no set of senses which comply with all …
The University of Sheffield NLP group took part in MUC-7 using the LaSIE-II system, an evolution of the LaSIE (Large Scale Information Extraction) system first created for participation in MUC-6 [9] and part of a larger research effort into information extraction underway in our group. LaSIE-II was used to carry out all five of the MUC-7 tasks and was, in fact, …
Machine readable dictionaries (MRDs) contain knowledge about language and the world essential for tasks in natural language processing (NLP). However, this knowledge, collected and recorded by lexicographers for human readers, is not presented in a manner for MRDs to be used directly for NLP tasks. What is badly needed are machine tractable dictionaries …
The process of document annotation for the Semantic Web is complex and time consuming, as it requires a great deal of manual annotation. Information extraction from texts (IE) is a technology used by some very recent systems for reducing the burden of annotation. The integration of IE systems in annotation tools is quite a new development and there is still …
In this paper we propose a methodology to learn to automatically annotate domain-specific information from large repositories (e.g. Web sites) with minimum user intervention. The methodology is based on a combination of information extraction, information integration and machine learning techniques. Learning is seeded by extracting information from …