Nowadays, many decisions are based on information found on the Web. For the most part, the disseminating sources are not certified, and hence an assessment of the quality and credibility of Web content has become more important than ever. With <i>factual density</i> we present a simple statistical quality measure that is based on facts extracted from Web ...
Classification and categorization are common tasks in data mining and knowledge discovery. Visualizations of classification models can create understanding and trust in data mining models. However, existing visualizations are often complex or restricted to specific classifiers and attributes. In this work, we propose an intuitive visualization system to ...
The study explores the citedness of research data, its distribution over time and how it is related to the availability of a DOI (Digital Object Identifier) in Thomson Reuters' DCI (Data Citation Index). We investigate whether cited research data "impact" the (social) web, reflected by altmetrics scores, and if there is any relationship between the number of ...
In this study, we explore the citedness of research data, its distribution over time and its relation to the availability of a digital object identifier (DOI) in the Thomson Reuters database Data Citation Index (DCI). We investigate if cited research data “impacts” the (social) web, reflected by altmetrics scores, and if there is any relationship between ...
The information obtained from the Web is increasingly important for decision making and for our everyday tasks. Due to the growth of uncertified sources, the blogosphere, comments in social media and automatically generated texts, the need to measure the quality of text information found on the Internet is becoming crucial. It has been ...
In this paper, we outline the experiments we carried out for the TREC Microblog Track 2011. Our system is based on a plain text index extracted from Tweets crawled from twitter.com. This index was used to retrieve candidate Tweets for the given topics. The resulting Tweets were post-processed and then analyzed using three different approaches: (i) a burst ...
People use weblogs to express thoughts, present ideas and share knowledge. However, weblogs can also be misused to influence and manipulate readers. Therefore, the credibility of a blog has to be validated before the available information is used for analysis. The credibility of a blog entry is derived from the content, the credibility of the author or ...
Introduction: We are currently witnessing a change in scholarly communication. Next to the paper, complementary materials, such as research data, source code, and images are regarded as important outcomes that should be shared and built upon (Kraker et al., 2011). In this new ecosystem, many archives have been established that cater to the needs of a digital ...
In this work we present APA Labs, a generic framework for visualizing the news article domain. APA Labs is a web-based platform enabling retrieval and analysis of news repositories provided by the Austrian Press Agency. APA Labs is designed as a rich internet application combined with a modular system of interactive visualizations. News articles are ...