Learn More
Applying natural language processing for mining and intelligent information access to tweets (a form of microblog) is a challenging, emerging research area. Unlike carefully authored news text and other longer content, tweets pose a number of new challenges, due to their short, noisy, context-dependent, and dynamic nature. Information extraction from tweets(More)
Named entity recognition and disambiguation are of primary importance for extracting information and for populating knowledge bases. Detecting and classifying named entities has traditionally been taken on by the natural language processing community, whilst linking of entities to external resources, such as those in DBpedia, has been tackled by the(More)
Microposts are small fragments of social media content and a popular medium for sharing facts, opinions and emotions. Collectively, they comprise a wealth of data that is increasing exponentially, and which therefore presents new challenges for the Information Extraction community, among others. This paper describes the Making Sense of Microposts(More)
Named Entity Extraction is a mature task in the NLP field that has yielded numerous services gaining popularity in the Semantic Web community for extracting knowledge from web documents. These services are generally organized as pipelines, using dedicated APIs and different taxonomy for extracting, classifying and disambiguating named entities. Integrating(More)
The Web of data promotes the idea that more and more data are interconnected. A step towards this goal is to bring more structured annotations to existing documents using common vocabularies or ontolo-gies. Semi-structured texts such as scientific, medical or news articles as well as forum and archived mailing list threads or (micro-)blog posts can hence be(More)
We have often heard that data is the new oil. In particular, extracting information from semi-structured textual documents on the Web is key to realize the Linked Data vision. Several attempts have been proposed to extract knowledge from textual documents, extracting named entities, classifying them according to pre-defined taxonomies and disam-biguating(More)
We present GERBIL, an evaluation framework for semantic entity annotation. The rationale behind our framework is to provide developers, end users and researchers with easy-to-use interfaces that allow for the agile, fine-grained and uniform evaluation of annotation tools on multiple datasets. By these means, we aim to ensure that both tool developers and(More)
Microposts are small fragments of social media content and a popular medium for sharing facts, opinions and emotions. They comprise a wealth of data which is increasing exponentially, and which therefore presents new challenges for the information extraction community, among others. This paper describes the 'Making Sense of Microposts' (#Microposts2014)(More)
Microposts shared on social platforms instantaneously report facts, opinions or emotions. In these posts, entities are often used but they are continuously changing depending on what is currently trending. In such a scenario, recognising these named entities is a challenging task, for which off-the-shelf approaches are not well equipped. We propose NERD-ML,(More)
In this paper, we present NERD, an evaluation framework we have developed that records and analyzes ratings of Named Entity (NE) extraction and disambiguation tools working on English plain text articles performed by human beings. NERD enables the comparison of different popular Linked Data entity extractors which expose APIs such as AlchemyAPI, DBPedia(More)