Corpus ID: 17067139

Wikipedia Search as Effective Entity Linking Algorithm

@article{Dojchinovski2013WikipediaSA,
  title={Wikipedia Search as Effective Entity Linking Algorithm},
  author={Milan Dojchinovski and Ivo Lasek and Tom{\'a}{\vs} Kliegr and Ondrej Sv{\'a}b-Zamazal},
  journal={Text Analysis Conference (TAC)},
  year={2013}
}
This paper reports on the participation of the LKD team in the English entity linking task at TAC KBP 2013. We evaluated various modifications and combinations of Most-Frequent-Sense (MFS) based linking, Entity Co-occurrence based linking (ECC), and Explicit Semantic Analysis (ESA) based linking. We employed our two Wikipedia-based NER systems, Entityclassifier.eu and SemiTags. Additionally, two Lucene-based entity linking systems were developed. For the competition we…
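The MFS-based linking evaluated in the paper can be illustrated with a minimal sketch. The `anchor_counts` table below is hypothetical (real systems mine these frequencies from Wikipedia's internal links); this is not the LKD team's implementation:

```python
# Hypothetical anchor-text statistics: how often each surface form
# links to a given Wikipedia article. Real systems mine these
# counts from Wikipedia's internal links.
anchor_counts = {
    "jaguar": {"Jaguar": 120, "Jaguar Cars": 310, "Jacksonville Jaguars": 95},
    "paris":  {"Paris": 5400, "Paris, Texas": 60, "Paris Hilton": 400},
}

def mfs_link(mention):
    """Most-Frequent-Sense linking: map a mention to the article it
    most often links to, ignoring the surrounding context entirely."""
    senses = anchor_counts.get(mention.lower())
    if senses is None:
        return None  # NIL: surface form unknown to the knowledge base
    return max(senses, key=senses.get)
```

Mentions with no candidate article would be treated as NIL in the TAC KBP setting; the combinations evaluated in the paper layer context-sensitive signals (ECC, ESA) on top of such a baseline.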
LHD 2.0: A text mining approach to typing entities in knowledge graphs
A novel technique for type inference is introduced that extracts types from the free-text description of an entity, combining lexico-syntactic pattern analysis with supervised classification.
Towards Linked Hypernyms Dataset 2.0: complementing DBpedia with hypernym discovery
A statistical type inference algorithm for ontology alignment is presented, which assigns DBpedia entities a new type (class); a byproduct of the mapping process is a set of 11,000 mappings from DBpedia resources to DBpedia Ontology classes with associated confidence values.
Linked hypernyms: Enriching DBpedia with Targeted Hypernym Discovery
The Linked Hypernyms Dataset (LHD) provides entities described by Dutch, English, and German Wikipedia articles with types in the DBpedia namespace, obtained using Hearst pattern matching over part-of-speech annotated text and disambiguated to DBpedia concepts.
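As a rough illustration of the Hearst-pattern approach used in these works, here is a minimal sketch assuming a single hypothetical "X is a Y" pattern over plain text; the actual systems match over part-of-speech annotated text and extract the head noun:

```python
import re

# One Hearst-style pattern: "<Entity> is a(n) <hypernym> ...".
# Real systems (e.g. Targeted Hypernym Discovery) run over
# POS-tagged text to find the head noun; this sketch naively
# takes the first lowercase word after the article.
PATTERN = re.compile(
    r"(?P<entity>[A-Z][A-Za-z ]+?) (?:is|was) an? (?P<hypernym>[a-z]+)"
)

def extract_hypernym(sentence):
    """Return (entity, hypernym) if the pattern matches, else None."""
    m = PATTERN.search(sentence)
    return (m.group("entity"), m.group("hypernym")) if m else None
```

The extracted hypernym string would then be disambiguated to a DBpedia concept in a separate step.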

References

Semantic Relatedness for Named Entity Disambiguation Using a Small Wikipedia
An evaluation of state-of-the-art methods for Basque NE disambiguation based on the Basque Wikipedia is presented, and a new normalization factor for ESA is proposed to minimise the effect of knowledge base size.
Overview of the TAC 2010 Knowledge Base Population Track
An overview of the task definition and annotation challenges associated with KBP 2010 is provided, and the evaluation results and lessons learned are discussed based on detailed analysis.
Entityclassifier.eu: Real-Time Classification of Entities in Text with Wikipedia
Targeted Hypernym Discovery performs unsupervised classification of entities appearing in text using a hypernym mined from the free text of the Wikipedia article describing the entity, enriched with additional types from the DBpedia and YAGO knowledge bases to provide Semantic Web interoperability.
An Evaluation of Technologies for Knowledge Base Population
The motivation and design of the tasks are explained, along with the language resources developed for this evaluation and comparisons to previous community evaluations; the performance obtained by participating systems is briefly summarized.
Various approaches to text representation for named entity disambiguation
Results show that the method can be successfully applied to languages other than English, and that the structured co-occurrence representation provides the best disambiguation results.
Learning to link with wikipedia
This paper explains how machine learning can be used to identify significant terms within unstructured text and enrich it with links to the appropriate Wikipedia articles; the approach performs very well, with recall and precision of almost 75%.
Computing Semantic Relatedness Using Wikipedia-based Explicit Semantic Analysis
This work proposes Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a high-dimensional space of concepts derived from Wikipedia, and results in substantial improvements in correlation of computed relatedness scores with human judgments.
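A minimal sketch of the ESA idea, assuming a toy set of hypothetical concept articles and a simplified TF-IDF weighting (not the authors' implementation): each text is mapped to a vector of concept weights, and relatedness is the cosine between vectors.

```python
import math
from collections import Counter

# Toy stand-ins for Wikipedia concept articles (hypothetical text).
concepts = {
    "Jaguar (animal)": "jaguar big cat predator jungle animal",
    "Jaguar Cars": "jaguar car vehicle engine british manufacturer",
    "Panthera": "cat lion tiger jaguar predator animal genus",
}

def esa_vector(text):
    """Represent `text` in concept space: each dimension is the summed
    TF-IDF-style weight of the text's words in that concept's article."""
    words = text.lower().split()
    n = len(concepts)
    # Document frequency of each word across the concept articles.
    df = {w: sum(1 for doc in concepts.values() if w in doc.split())
          for w in set(words)}
    vec = {}
    for name, doc in concepts.items():
        tf = Counter(doc.split())
        vec[name] = sum(tf[w] * math.log((n + 1) / (df[w] + 1))
                        for w in words if w in tf)
    return vec

def cosine(u, v):
    """Cosine similarity between two concept-weight vectors."""
    dot = sum(u[k] * v[k] for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

sim = cosine(esa_vector("jaguar predator animal"),
             esa_vector("big cat jungle"))
```

Both texts activate the animal-related concepts, so `sim` is positive, while a text about engines would share no weighted concepts with them.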
Insights into explicit semantic analysis
A general probabilistic model for term weights is employed which reveals how ESA actually works, and provides a theoretical grounding for how the size and composition of the index collection affect the ESA-based computation of similarity values for texts.
Algorithms for Scoring Coreference Chains
This paper presents several different scoring algorithms, details their respective strengths and weaknesses for varying classes of processing, and demonstrates that tasks like information extraction have very different needs from information retrieval in terms of how to score the performance of coreference annotation.