Corpus ID: 13909378

Flexible UIMA Components for Information Retrieval Research

@inproceedings{Mller2008FlexibleUC,
  title={Flexible UIMA Components for Information Retrieval Research},
  author={Christof M{\"u}ller and Torsten Zesch and M. M{\"u}ller and Delphine Bernhard and Kateryna Ignatova and Iryna Gurevych and M. M{\"u}hlh{\"a}user},
  year={2008}
}
In this paper, we present a suite of flexible UIMA-based components for information retrieval research which have been successfully used (and re-used) in several projects in different application domains. Implementing the whole system as UIMA components is beneficial for configuration management, component reuse, implementation costs, analysis and visualization. 
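As an aside on what such component-based assembly looks like in practice, the following is a minimal sketch (not taken from the paper) of chaining two UIMA analysis engines over a single CAS, assuming the uimaFIT convenience API rather than the paper's original descriptor-based setup. The WhitespaceCounter and ConsoleConsumer components and the PARAM_PREFIX parameter are hypothetical stand-ins, not components from the paper's retrieval suite.

import org.apache.uima.analysis_engine.AnalysisEngineDescription;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.descriptor.ConfigurationParameter;
import org.apache.uima.fit.factory.AnalysisEngineFactory;
import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.fit.pipeline.SimplePipeline;
import org.apache.uima.jcas.JCas;

public class UimaPipelineSketch {

    // Hypothetical first component: a trivial whitespace "tokenizer" that only
    // reports a token count instead of creating annotations, to keep the sketch short.
    public static class WhitespaceCounter extends JCasAnnotator_ImplBase {
        @Override
        public void process(JCas jcas) {
            int tokens = jcas.getDocumentText().trim().split("\\s+").length;
            System.out.println("Tokens seen: " + tokens);
        }
    }

    // Hypothetical second component: stands in for an IR-oriented consumer,
    // e.g. a component that would write documents to a retrieval index.
    public static class ConsoleConsumer extends JCasAnnotator_ImplBase {
        public static final String PARAM_PREFIX = "prefix";
        @ConfigurationParameter(name = PARAM_PREFIX, defaultValue = ">>")
        private String prefix;

        @Override
        public void process(JCas jcas) {
            System.out.println(prefix + " " + jcas.getDocumentText());
        }
    }

    public static void main(String[] args) throws Exception {
        // Components are addressed only through their engine descriptions,
        // so they can be swapped or reconfigured without touching the rest.
        AnalysisEngineDescription counter =
                AnalysisEngineFactory.createEngineDescription(WhitespaceCounter.class);
        AnalysisEngineDescription consumer =
                AnalysisEngineFactory.createEngineDescription(
                        ConsoleConsumer.class, ConsoleConsumer.PARAM_PREFIX, "[doc]");

        JCas jcas = JCasFactory.createJCas();
        jcas.setDocumentText("UIMA components can be chained into an IR pipeline.");
        jcas.setDocumentLanguage("en");

        // Run both analysis engines over the CAS in sequence.
        SimplePipeline.runPipeline(jcas, counter, consumer);
    }
}

Because each component is configured and invoked only through its engine description, swapping, reordering, or reconfiguring components does not require changes elsewhere in the pipeline, which is the kind of configuration-management and reuse benefit the abstract refers to.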
Citations

An architecture to support intelligent user interfaces for Wikis by means of Natural Language Processing
An architecture for integrating a set of Natural Language Processing (NLP) techniques with a wiki platform is presented; it entails support for adding, organizing, and finding content in the wiki, and an intelligent interface that provides suggestions.
Information Extraction with the Darmstadt Knowledge Processing Software Repository (Extended Abstract)
The DKPro repository consists of several main parts created to serve the purposes of different NLP application areas, including a highly flexible, scalable and easy-to-use toolkit that allows rapid creation of complex NLP pipelines for semantic information processing on demand.
Between Platform and APIs: Kachako API for Developers
Reusing the existing NLP platform Kachako, an API-oriented NLP system is created that loosely couples rich high-end functions, including annotation visualizations, statistical evaluations, annotation searching, etc.
Understanding the Information Needs of Web Archive Users
A complete characterization of web archive users must respond to three questions: why, what and how do users search? This study focuses on the first two: what are the user intents and which topics …
An Approach For Evaluation Of Semantic Performance Of Search Engines: Google, Yahoo, Msn And Hakia
Comparison of web document retrieval performance and calculation of relative precision for the two sets of data show maximum relative precision for the Hakia search engine, followed by Yahoo and Google (which exchange places), while the lowest relative precision is shown by Msn.
Combining Answers from heterogeneous Web Documents for Question Answering
The design and implementation of a question answering system is described that generates a summarized answer for open-domain natural language queries and aims to increase the quality of existing systems by using heterogeneous documents from Wikipedia, Yahoo! Answers and Frequently Asked Questions.
Information extraction for the geospatial domain
New approaches to toponym recognition were implemented as prototypes and evaluated; the results showed that machine-learning-based classifiers perform well for resolving the geo/non-geo ambiguity.
DKPro-UGD: A Flexible Data-Cleansing Approach to Processing User-Generated Discourse
The five-stage data cleansing approach proposed here offers a maximum of flexibility in identifying problematic artifacts, deciding how to deal with them, and analysing the cleansed data, and it provides reusable UIMA-based components for the actual data cleansing and for mapping annotations created on the clean data back to the original representation.
Terminology Evolution Module for Web Archives in the LiWA Context
More and more national libraries and institutes are archiving the web as a part of the cultural heritage. As with all long term archives, these archives contain text and language that evolves over …
An Empirical Evaluation on Semantic Search Performance of Keyword-Based and Semantic Search Engines: Google, Yahoo, Msn and Hakia
It was found that semantic search performance was high for both the keyword-based search engines and the semantic search engine, whereas Google turned out to be the best search engine in terms of normalized recall ratio.

References

What to be? - Electronic Career Guidance Based on Semantic Relatedness
A study is presented that investigates the use of semantic information in a novel NLP application, Electronic Career Guidance (ECG), in German, and evaluates the performance of SR measures intrinsically on the tasks of computing SR and solving Reader’s Digest Word Power questions.
Using the Structure of a Conceptual Network in Computing Semantic Relatedness
The method relies solely on the structure of a conceptual network and eliminates the need for performing additional corpus analysis; it can be easily applied to compute semantic relatedness based on alternative conceptual networks, e.g. in the domain of life sciences.
Retrieval Models and Q and A Learning with FAQ Files
The issue of paraphrase recognition has been receiving attention in question-answering research as a way to fill the gap between words in a question and those in an answer; the primary focus is on finding an FAQ question which is similar to the user query/question, that is, a Q-to-Q match.
Retrieving answers from frequently asked questions pages on the web
We address the task of answering natural language questions by using the large number of Frequently Asked Questions (FAQ) pages available on the web. The task involves three steps: (1) fetching FAQ …
Learning Question Paraphrases for QA from Encarta Logs
A method is proposed that exploits Encarta logs to automatically identify question paraphrases and extract templates, which can evidently outperform the unsupervised method.
Probabilistic part-of-speech tagging using decision trees
In this paper, a new probabilistic tagging method is presented which avoids problems that Markov Model based taggers face when they have to estimate transition probabilities from sparse data. In …