Corpus ID: 457325

Wikulu: An Extensible Architecture for Integrating Natural Language Processing Techniques with Wikis

Daniel Bär, Nicolai Erbs, Torsten Zesch, Iryna Gurevych
We present Wikulu, a system that supports wiki users in their everyday tasks by means of an intelligent interface. Wikulu is implemented as an extensible architecture which transparently integrates natural language processing (NLP) techniques with wikis. It is designed to be deployed with any wiki platform, and the current prototype integrates a wide range of NLP algorithms such as keyphrase extraction, link discovery, text segmentation, summarization, and text similarity…
A hybrid English to Malayalam machine translator for Malayalam content creation in Wikis
This paper considers the integration of a hybrid machine translation service, a combination of a statistical machine translator and a translation memory, with a wiki system for automated content generation in Malayalam, in order to speed up content creation in wikis and to bridge the language barrier that restricts knowledge dissemination on the web.
Approaches to Automatic Text Structuring
Two prototypes of text structuring systems are presented, which integrate techniques for automatic text structuring in a wiki setting and in an e-learning setting with eBooks; the effect of senses on computing similarities is also analyzed.
The quality of content in open online collaboration platforms: approaches to NLP-supported information quality management in Wikipedia
A comprehensive article quality model is defined that consolidates both the quality of writing and the quality criteria defined in multiple Wikipedia guidelines and policies into a single model; an approach for automatically identifying quality flaws in Wikipedia articles is also presented.
A composite model for computing similarity between texts
This thesis presents the implementation of a text similarity system which composes a multitude of text similarity measures along multiple text dimensions using a machine learning classifier, and proposes a classification of text similarity measures into compositional and non-compositional according to their inherent properties.
Dedicated Support for Experience Sharing in Distributed Software Projects
Preliminary evaluation with a prototype shows that dedicated tool support and automated heuristic critiques can increase both the willingness to submit experience reports and their quality, provided the tooling is easily accessible and integrated into a trustworthy experience engineering process.


Connecting wikis and natural language processing systems
A number of practical application examples are provided, including index generation, question answering, and automatic summarization, which demonstrate the practicability and usefulness of integrating wiki systems with automated natural language processing techniques.
Semantic MediaWiki
The software is already used on a number of productive installations world-wide, but the main target remains to establish “Semantic Wikipedia” as an early adopter of semantic technologies on the web.
UIMA: an architectural approach to unstructured information processing in the corporate research environment
A general introduction to UIMA is given, focusing on the design points of its analysis engine architecture, and how UIMA is helping to accelerate research and technology transfer is discussed.
GPX: Ad-Hoc Queries and Automated Link Discovery in the Wikipedia
A simplification is described whereby the score of each node is computed directly, doing away with the score propagation mechanism; results indicate slightly improved performance in the INEX 2007 evaluation.
Computing Semantic Relatedness Using Wikipedia-based Explicit Semantic Analysis
This work proposes Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a high-dimensional space of concepts derived from Wikipedia that results in substantial improvements in correlation of computed relatedness scores with human judgments.
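The core idea of ESA can be illustrated with a toy sketch (not the authors' implementation): each text is mapped to a vector of similarities against a set of "concept" documents — in real ESA, Wikipedia articles — and relatedness is the cosine between those concept vectors. The concept labels and texts below are invented placeholders.

```python
from collections import Counter
import math

# Toy "concept space": in real ESA these are full Wikipedia articles.
CONCEPTS = {
    "music": "guitar piano melody song band concert",
    "computing": "software computer program code algorithm data",
    "sports": "football goal team match player score",
}

def tf_cosine(a, b):
    """Cosine similarity between two texts using raw term-frequency vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    num = sum(ca[w] * cb[w] for w in set(ca) & set(cb))
    den = (math.sqrt(sum(v * v for v in ca.values()))
           * math.sqrt(sum(v * v for v in cb.values())))
    return num / den if den else 0.0

def esa_vector(text):
    """Map a text into the concept space: one similarity score per concept."""
    return {c: tf_cosine(text, doc) for c, doc in CONCEPTS.items()}

def esa_relatedness(t1, t2):
    """Semantic relatedness = cosine of the two concept vectors."""
    v1, v2 = esa_vector(t1), esa_vector(t2)
    num = sum(v1[c] * v2[c] for c in CONCEPTS)
    den = (math.sqrt(sum(x * x for x in v1.values()))
           * math.sqrt(sum(x * x for x in v2.values())))
    return num / den if den else 0.0
```

With this toy concept space, "guitar song" and "piano melody" come out highly related (both project onto the music concept) even though they share no words — the effect ESA achieves at scale with Wikipedia's concept inventory.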
What to be? - Electronic Career Guidance Based on Semantic Relatedness
A study is presented that investigates the use of semantic information in a novel NLP application, Electronic Career Guidance (ECG), in German, and evaluates the performance of semantic relatedness (SR) measures intrinsically on the tasks of computing SR and solving Reader's Digest Word Power questions.
KEA: practical automatic keyphrase extraction
This paper describes the Kea system, which is simple, robust, and publicly available, and uses a large test corpus to evaluate its effectiveness in terms of how many author-assigned keyphrases are correctly identified.
Intranet wikis
Historically, when business organizations realized that web technology could also be used internally, the development of intra-webs took off rapidly. In 2002, several studies showed that 75% of web…
University of Waterloo at INEX2007: Adhoc and Link-the-Wiki Tracks
University of Waterloo's baseline approaches to the Adhoc, Book, and Link-the-Wiki tracks are described; results indicate that the baseline approaches work best, although other approaches have room for improvement.
LexRank: Graph-based Lexical Centrality as Salience in Text Summarization
A new approach, LexRank, for computing sentence importance based on the concept of eigenvector centrality in a graph representation of sentences is presented; the LexRank with threshold method outperforms other degree-based techniques, including continuous LexRank.
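The LexRank idea can be sketched in a few lines (a minimal toy version, not the paper's implementation; the threshold and damping values are illustrative): sentences become graph nodes, term-frequency cosine similarity above a threshold becomes an edge, and eigenvector centrality computed by power iteration scores each sentence.

```python
from collections import Counter
import math

def cosine(a, b):
    """Term-frequency cosine similarity between two sentences."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    num = sum(ca[w] * cb[w] for w in set(ca) & set(cb))
    den = (math.sqrt(sum(v * v for v in ca.values()))
           * math.sqrt(sum(v * v for v in cb.values())))
    return num / den if den else 0.0

def lexrank(sentences, threshold=0.1, damping=0.85, iters=50):
    """Score sentences by eigenvector centrality on the similarity graph."""
    n = len(sentences)
    # Unweighted adjacency: an edge wherever similarity exceeds the threshold.
    adj = [[1.0 if i != j and cosine(sentences[i], sentences[j]) > threshold
            else 0.0 for j in range(n)] for i in range(n)]
    # Row-normalize into a stochastic matrix (uniform row for isolated nodes).
    for row in adj:
        s = sum(row)
        for j in range(n):
            row[j] = row[j] / s if s else 1.0 / n
    # Power iteration with damping (PageRank-style) until scores stabilize.
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [(1 - damping) / n
                  + damping * sum(scores[i] * adj[i][j] for i in range(n))
                  for j in range(n)]
    return scores
```

Sentences that are similar to many other sentences accumulate centrality mass and rank highest, which is the salience signal an extractive summarizer would use to pick them.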