Corpus ID: 26358322

Evaluating Tools for Automatic Concept Extraction: a Case Study from the Musicology Domain

Scott Songlin Piao, Jamie Forth, Ricardo Gacitúa, Jon Whittle, Geraint A. Wiggins
…extraction algorithms have various applications in Digital Economy research with the rise of online sources. This paper reports on an evaluation of five term extraction algorithms for automatic concept extraction in the musicology domain, which is carried out in the context of the RCUK-funded SerenA Project. Our focus here is to identify the algorithms that are most suitable for the task of concept extraction. In our evaluation, the C-value algorithm produced the best result, while others…
Concept extraction and e-commerce applications
The experimental results demonstrate that ICE significantly outperforms ACE and also outperforms KEA in concept extraction; ICE is then used to showcase two e-commerce applications: product matching and topic-based opinion mining.
An Information Extraction System for English Ontology Identifier Names
I describe a system, Txt2ids, that uses a series of regular expressions to extract suggestions for ontology identifier names from English text and classify them as (i) class names, (ii) individual…
An ontology-based recommender system using scholar's background knowledge
Using knowledge items instead of keywords for profiling, as well as transforming the knowledge items with DBpedia, can significantly improve recommendation performance, and the domain-specific reference ontology can effectively capture scholars' full knowledge, which results in more accurate profiling.
A reference ontology for profiling scholar's background knowledge in recommender systems
A method for integrating multiple domain taxonomies to build a reference ontology for profiling scholars' background knowledge is proposed, and the empirical results show an improvement over the existing reference ontologies in terms of completeness, richness, and coverage.
Capturing scholar's knowledge from heterogeneous resources for profiling in recommender systems
This work first models the scholars' academic behavior and extracts different knowledge items diffused over the Web, including mediated profiles in digital libraries, and then integrates those heterogeneous knowledge items using Wikipedia.
This paper describes the method and development of an earthquake ontology for prevention and prediction. In our previous work, we developed ontologies for organization knowledge such as university or…
Context ontology for humanitarian assistance in crisis response
The paper presents a method which merges ontologies and logic rules to represent the humanitarian needs and recommend appropriate humanitarian responses automatically so that the decision makers are not overwhelmed with massive and unrelated information and can focus more on implementing the solutions.
SerenA: A Multi-site Pervasive Agent Environment That Supports Serendipitous Discovery in Research
SerenA, a multi-site, pervasive, agent environment that supports serendipitous discovery in research, attempts to assist researchers by presenting them with information that they did not know they needed to know about their research.


Glossary extraction and utilization in the information search and delivery system for IBM Technical Support
This paper proposes a number of enhancements to the existing glossary extraction process, including focusing the glossary on a selected domain context, providing support for multidomain glossaries, and importing domain-specific dictionaries.
A Comparative Evaluation of Term Recognition Algorithms
This paper evaluated the six approaches using two different corpora and showed that the voting algorithm performs best on one corpus and less well on the Genia corpus, indicating that the choice and design of corpus has a major impact on the evaluation of term recognition algorithms.
University of Surrey Participation in TREC8: Weirdness Indexing for Logical Document Extrapolation and Retrieval (WILDER)
This paper describes the development of a prototype document retrieval system based on frequency calculations and corpora comparison techniques, and uses term identification and extraction techniques for identifying topics discussed in a given text.
Automatic recognition of multi-word terms: the C-value/NC-value method
This paper presents a domain-independent method for the automatic extraction of multi-word terms from machine-readable special language corpora using C-value/NC-value, which enhances the common statistical measure of frequency of occurrence for term extraction, making it sensitive to a particular type of multi-word terms, the nested terms.
On the Effectiveness of Abstraction Identification in Requirements Engineering
A new technique for the identification of single- and multi-word abstractions, named Relevance-driven Abstraction Identification (RAI), is proposed, and a corresponding tool implementation is presented, along with an experiment comparing the effectiveness of the technique with human judgement and with a different technique proposed in the literature.
TermExtractor: a Web Application to Learn the Shared Terminology of Emergent Web Communities
A high-performing technique to automatically extract the shared terminology from available documents in a given domain is designed and submitted for large-scale evaluation in the domain of enterprise interoperability by the members of the INTEROP network of excellence.
Information Retrieval
  • A. Dekhtyar
  • Medicine, Computer Science
  • Lecture Notes in Computer Science
  • 1968
A novel method to efficiently represent the behaviors of query reformulation through a translating embedding from the original query to its reformulated query, utilizing a two-stage training algorithm to make the learning of multilevel intention representations more adequate.
Making Tacit Requirements Explicit
  • Ricardo Gacitúa, Lin Ma, +6 authors H. Yang
  • Computer Science, Engineering
  • 2009 Second International Workshop on Managing Requirements Knowledge
  • 2009
A number of techniques are described that offer analysts the means to reason about the effect of tacit knowledge and improve the quality of requirements and their management.
100 million words of English
A description of the background, nature and prospects of the British National Corpus project
100 Million Words of English: The British National Corpus (BNC)
100 Million Words of English: The British National Corpus (BNC) is a celebration of 100 million words of English.