Aleksander Smywinski-Pohl

In this paper we try to answer the question of how cross-lingual evidence may improve matching between different classification schemas. We concentrate specifically on the task of mapping between Wikipedia categories and Cyc terms, as well as the classification of Wikipedia articles to the Cyc taxonomy, and show how this process may be improved by consuming the evidence…
This document describes improvements to the Wikipedia Miner word sense disambiguation algorithm. The original algorithm performs very well in detecting key terms in documents and disambiguating them against Wikipedia articles. By replacing the original Normalized Google Distance-inspired measure with a Jaccard coefficient-inspired measure and taking into…
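The two kinds of relatedness measure named in this abstract can be contrasted with a minimal sketch. It assumes each article is represented by the set of articles linking to it; the function names, the `total_articles` parameter, and the exact formulas are illustrative assumptions, not the paper's implementation, which the truncated abstract does not specify.

```python
import math


def jaccard_relatedness(links_a, links_b):
    """Jaccard coefficient of two link sets: |A & B| / |A | B|."""
    a, b = set(links_a), set(links_b)
    union = a | b
    if not union:
        return 0.0  # two empty link sets carry no evidence
    return len(a & b) / len(union)


def ngd_relatedness(links_a, links_b, total_articles):
    """Relatedness derived from a Normalized Google Distance style formula.

    Assumed form: distance = (log max(|A|,|B|) - log |A & B|)
                             / (log N - log min(|A|,|B|)),
    relatedness = 1 - distance, clipped at 0.
    """
    a, b = set(links_a), set(links_b)
    common = a & b
    if not common:
        return 0.0  # no shared links: treat as unrelated
    larger = max(len(a), len(b))
    smaller = min(len(a), len(b))
    denominator = math.log(total_articles) - math.log(smaller)
    if denominator <= 0:
        return 0.0  # degenerate case: a link set as large as the collection
    distance = (math.log(larger) - math.log(len(common))) / denominator
    return max(0.0, 1.0 - distance)


if __name__ == "__main__":
    # Hypothetical article ids, used only for illustration.
    a = {1, 2, 3, 4}
    b = {3, 4, 5, 6}
    print(jaccard_relatedness(a, b))
    print(ngd_relatedness(a, b, total_articles=100))
```

Unlike the distance-based score, the Jaccard variant needs no estimate of the total collection size, which is one plausible reason to prefer it; the abstract itself does not state the motivation.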
We investigate whether language models used in automatic speech recognition (ASR) should be trained on speech transcripts rather than on written texts. By calculating the log-likelihood statistic for part-of-speech (POS) n-grams, we show that there are significant differences between written texts and speech transcripts. We also test the performance of language…
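A log-likelihood comparison of n-gram frequencies across two corpora can be sketched as follows. This uses the common two-term form of the statistic for corpus comparison; whether the paper uses exactly this formulation is an assumption, and the function name, parameter names, and counts are hypothetical.

```python
import math


def log_likelihood(count_a, count_b, size_a, size_b):
    """Log-likelihood statistic for one n-gram observed in two corpora.

    count_a, count_b: occurrences of the n-gram in corpus A and corpus B.
    size_a, size_b:   total number of n-grams in each corpus.
    """
    total = count_a + count_b
    # Expected counts if the n-gram were equally frequent in both corpora.
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    score = 0.0
    # A zero observed count contributes nothing (limit of x * ln x as x -> 0).
    if count_a > 0:
        score += count_a * math.log(count_a / expected_a)
    if count_b > 0:
        score += count_b * math.log(count_b / expected_b)
    return 2.0 * score


if __name__ == "__main__":
    # Hypothetical counts for a single POS trigram in written vs. spoken data.
    print(log_likelihood(100, 50, 10_000, 10_000))
```

Under the usual chi-squared approximation with one degree of freedom, a score above 3.84 corresponds to a difference significant at the 0.05 level.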
The research presented in this article aims to map English Wikipedia categories to OpenCyc types. The mapping algorithm is heuristic and takes into account structural similarities between the categories and the corresponding types. The achieved mapping precision ranges from 82% to 92% (depending on the evaluation scheme), recall…