Heinz-Peter Lang

This paper presents the Media Watch on Climate Change, a public Web portal that captures and aggregates large archives of digital content from multiple stakeholder groups. Each week it assesses the domain-specific relevance of millions of documents and user comments from news media, blogs, Web 2.0 platforms such as Facebook, Twitter and YouTube, the Web …
Organizations require tools that can assess their online reputations as well as the impact of their marketing and public outreach activities. The Media Watch on Climate Change is a Web intelligence and online collaboration platform that addresses this requirement. It aggregates large archives of digital content from multiple stakeholder groups and enables …
Web pages contain not only main content but also other elements such as navigation panels, advertisements and links to related documents. Furthermore, overview pages (summarization pages and entry points) duplicate and aggregate parts of articles and thereby create redundancies. The noise elements in Web pages as well as overview pages affect the …
Knowledge capture approaches in the age of massive Web data require robust and scalable mechanisms to acquire, consolidate and pre-process large amounts of heterogeneous data, both unstructured and structured. This paper addresses this requirement by introducing the Extensible Web Retrieval Toolkit (eWRT), a modular Python API for retrieving social data …
The webLyzard media monitoring and Web intelligence platform (www.webLyzard.com) presented in this paper is a generic tool for assessing the strategic positioning of an organization and the effectiveness of its communication strategies. The platform captures and aggregates large archives of digital content from multiple stakeholder groups. Each week …