What Computers Should Know, Shouldn't Know, and Shouldn't Believe

@article{Weikum2017WhatCS,
  title={What Computers Should Know, Shouldn't Know, and Shouldn't Believe},
  author={Gerhard Weikum},
  journal={Proceedings of the 26th International Conference on World Wide Web Companion},
  year={2017}
}
  • G. Weikum
  • Published 3 April 2017
  • Computer Science
  • Proceedings of the 26th International Conference on World Wide Web Companion
Automatically constructed knowledge bases (KB's) are a powerful asset for search, analytics, recommendations and data integration, with intensive use at big industrial stakeholders. Examples are the knowledge graphs for search engines (e.g., Google, Bing, Baidu) and social networks (e.g., Facebook), as well as domain-specific KB's (e.g., Bloomberg, Walmart). These achievements are rooted in academic research and community projects. The largest general-purpose KB's with publicly accessible…

Leveraging Wikipedia Table Schemas for Knowledge Graph Augmentation

TLDR
An alternative solution is investigated that leverages the patterns occurring in the schemas of a large corpus of Wikipedia tables; it can extract more than 1.7M facts with an estimated accuracy of 0.81, even from tables that do not expose any fact already present in the KG.
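
As a rough illustration of this style of table-to-KG extraction, the following Python sketch maps pairs of table-column headers to KG relations and emits candidate facts. The header-to-relation patterns, relation names, and the example table are hypothetical placeholders, not the paper's learned schema patterns.

# Minimal sketch (not the paper's pipeline): map Wikipedia-table column
# headers to KG relations and emit candidate facts. The mapping and the
# example table below are invented for illustration.

# Hypothetical header-pair patterns assumed to recur across table schemas.
SCHEMA_PATTERNS = {
    ("player", "club"): "playsFor",
    ("film", "director"): "directedBy",
}

def extract_facts(header, rows):
    """Return (subject, relation, object) triples for header pairs
    that match a known schema pattern."""
    header = [h.strip().lower() for h in header]
    facts = []
    for i, col_s in enumerate(header):
        for j, col_o in enumerate(header):
            relation = SCHEMA_PATTERNS.get((col_s, col_o))
            if relation is None:
                continue
            for row in rows:
                facts.append((row[i], relation, row[j]))
    return facts

if __name__ == "__main__":
    header = ["Player", "Club", "Goals"]
    rows = [("Lionel Messi", "FC Barcelona", "672"),
            ("Andrea Pirlo", "Juventus", "16")]
    for fact in extract_facts(header, rows):
        print(fact)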

Un-polarizing news in social media platform

TLDR
The main emphasis is that, by showing various news documents from diverse perspectives, a person gets the opportunity to identify and discard misinformation and to break out of his or her own echo chamber through exposure to the "other sides".

Neural Network Architecture for Credibility Assessment of Textual Claims

TLDR
A novel approach called Credibility Outcome (CREDO) is proposed, which aims at scoring the credibility of an article in an open-domain setting; experiments on the Snopes dataset reveal that CREDO outperforms state-of-the-art approaches based on linguistic features.

Collaborative Filtering for Binary, Positive-Only Data

TLDR
This survey provides an overview of existing work on collaborative filtering with binary, positive-only data from an innovative perspective that reveals surprising commonalities and key differences.
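
For readers unfamiliar with this setting, the following Python sketch shows the simplest flavor of collaborative filtering on binary, positive-only data: item-based neighborhood scoring over implicit feedback. The toy interaction data is invented, and the method is a generic baseline rather than one taken from the survey.

# Minimal sketch of item-based collaborative filtering on binary,
# positive-only data (e.g. clicks): score unseen items for a user by
# summing cosine similarities to the items the user already consumed.
from collections import defaultdict
from math import sqrt

interactions = {            # user -> set of consumed items (toy data)
    "u1": {"a", "b", "c"},
    "u2": {"a", "c"},
    "u3": {"b", "d"},
}

# Invert to item -> set of users.
item_users = defaultdict(set)
for user, items in interactions.items():
    for item in items:
        item_users[item].add(user)

def cosine(i, j):
    inter = len(item_users[i] & item_users[j])
    if inter == 0:
        return 0.0
    return inter / (sqrt(len(item_users[i])) * sqrt(len(item_users[j])))

def recommend(user, k=2):
    seen = interactions[user]
    scores = {
        cand: sum(cosine(cand, i) for i in seen)
        for cand in item_users if cand not in seen
    }
    return sorted(scores.items(), key=lambda x: -x[1])[:k]

print(recommend("u2"))   # ranks "b" above "d" for user u2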

Propaganda Barometer: A Supportive Tool to Improve Media Literacy Towards Building a Critically Thinking Society

The tool aims to help users smartly consume a huge and constantly growing volume of information, identify fake news and resist propaganda in the context of Information Warfare, and improve personal critical thinking.

Detecting Fake News on Social Media

  • Kai Shu, Huan Liu
  • Sociology, Computer Science
    Synthesis Lectures on Data Mining and Knowledge Discovery
  • 2019
TLDR
This research highlights the need to understand more fully the role that social media plays in the development of media literacy and how it can be leveraged for social media-enabled media literacy.

Fake News Detection on Social Media: A Data Mining Perspective

TLDR
This survey presents a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets, and future research directions for fake news detection on social media.

References

Showing 1-10 of 29 references

Distilling Task Knowledge from How-To Communities

TLDR
This paper presents a method for automatically constructing a formal knowledge base on tasks and task-solving steps, by tapping the contents of online communities such as WikiHow, and employs Open-IE techniques to extract noisy candidates for tasks, steps and the required tools and other items.
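
As a toy illustration of the Open-IE-style candidate extraction mentioned above, the Python sketch below pulls noisy task, step, and tool candidates out of how-to text with shallow regular expressions. The patterns and sample sentences are illustrative assumptions, not the paper's actual extraction rules.

# Minimal sketch of harvesting noisy task/step/tool candidates from
# how-to style text with shallow patterns (a stand-in for the Open-IE
# stage). The regexes and the sample input are invented.
import re

TASK_PATTERN = re.compile(r"^how to (.+)$", re.IGNORECASE)
TOOL_PATTERN = re.compile(r"\b(?:using|with) (?:a|an|the) ([a-z ]+)", re.IGNORECASE)

def extract_candidates(title, steps):
    task_match = TASK_PATTERN.match(title.strip())
    task = task_match.group(1).lower() if task_match else title.lower()
    candidates = []
    for step in steps:
        tools = [t.strip() for t in TOOL_PATTERN.findall(step)]
        candidates.append({"task": task, "step": step, "tools": tools})
    return candidates

sample = extract_candidates(
    "How to repot a plant",
    ["Loosen the soil using a small trowel.",
     "Lift the plant out and place it in the new pot."])
for cand in sample:
    print(cand)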

Predicting Completeness in Knowledge Bases

TLDR
This work investigates different signals to identify the areas where a knowledge base is complete and combines these signals in a rule mining approach, which makes it possible to predict where facts may be missing.
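
A minimal Python sketch of the signal-combination idea follows. The two signals used here (typical relation cardinality and entity popularity), the threshold, and the toy KB entries are hypothetical; they are only meant to convey how such rules can be combined, not to reproduce the paper's mined rules.

# Minimal sketch of combining completeness signals with simple rules.
# Signals, thresholds, and the toy KB below are assumptions.

TYPICAL_CARDINALITY = {"hasParent": 2, "hasCitizenship": 1}  # assumed priors

def predict_complete(entity, relation, kb, popularity):
    """Heuristically decide whether kb[entity][relation] is complete."""
    facts = kb.get(entity, {}).get(relation, [])
    # Cardinality signal: relation reached its typical number of objects.
    if len(facts) >= TYPICAL_CARDINALITY.get(relation, float("inf")):
        return True
    # Popularity signal: for well-curated entities, an empty relation is
    # more likely genuinely empty than incomplete.
    if not facts and popularity.get(entity, 0) > 1000:
        return True
    return False

kb = {"Alice": {"hasParent": ["Bob", "Carol"], "hasCitizenship": []}}
popularity = {"Alice": 50}

print(predict_complete("Alice", "hasParent", kb, popularity))       # True
print(predict_complete("Alice", "hasCitizenship", kb, popularity))  # False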

As Time Goes By: Comprehensive Tagging of Textual Phrases with Temporal Scopes

TLDR
This paper develops methods for detecting temponyms, inferring their temporal scopes, and mapping them to events in a knowledge base if present there, along with a family of Integer Linear Programs for jointly inferring temponym mappings to the timeline and the knowledge base.

Commonsense in Parts: Mining Part-Whole Relations from the Web and Image Tags

TLDR
A new method is presented for automatically acquiring part-whole commonsense from Web contents and image tags at an unprecedented scale, yielding many millions of assertions, while specifically addressing four shortcomings of prior work.

WebChild: harvesting and organizing commonsense knowledge from the web

TLDR
A method is presented for automatically constructing a large commonsense knowledge base, called WebChild, from Web contents; it is based on semi-supervised label propagation over graphs of noisy candidate assertions, with seeds derived automatically from WordNet and by pattern matching over Web text collections.
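
The following Python sketch illustrates plain semi-supervised label propagation with clamped seeds on a tiny similarity graph. The graph, node names, edge weights, and seed scores are invented, and the averaging update is a generic formulation rather than WebChild's exact model.

# Minimal sketch of label propagation: seed nodes keep their labels,
# all other nodes repeatedly average their neighbours' scores.

graph = {                    # node -> {neighbour: edge weight} (toy data)
    "hot_temperature": {"warm_temperature": 1.0},
    "warm_temperature": {"hot_temperature": 1.0, "hot_spicy": 0.3},
    "hot_spicy": {"warm_temperature": 0.3},
}
seeds = {"hot_temperature": 1.0, "hot_spicy": 0.0}   # 1.0 = correct sense

def propagate(graph, seeds, iterations=20):
    scores = {n: seeds.get(n, 0.5) for n in graph}
    for _ in range(iterations):
        for node, nbrs in graph.items():
            if node in seeds:            # seeds are clamped
                continue
            total = sum(nbrs.values())
            scores[node] = sum(w * scores[m] for m, w in nbrs.items()) / total
    return scores

print(propagate(graph, seeds))   # warm_temperature ends up close to the hot seed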

Coupling Label Propagation and Constraints for Temporal Fact Extraction

TLDR
This paper develops a methodology that combines label propagation with constraint reasoning for temporal fact extraction, in which an Integer Linear Program is used to clean out false hypotheses that violate temporal constraints.
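
A minimal sketch of the ILP-based cleaning step follows, assuming the third-party pulp package as a generic solver. The hypotheses, confidence values, and the non-overlap constraint on "marriedTo" intervals are illustrative assumptions rather than the paper's actual model.

# Minimal sketch: keep the highest-confidence temporal fact hypotheses
# while forbidding overlapping "marriedTo" intervals for the same person.
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary

# (id, subject, object, begin_year, end_year, extractor confidence) -- toy data
hypotheses = [
    ("h1", "Carla", "Nicolas", 2008, 2012, 0.9),
    ("h2", "Carla", "Mick",    2010, 2011, 0.4),   # overlaps h1
    ("h3", "Carla", "Raphael", 2000, 2007, 0.7),
]

prob = LpProblem("temporal_fact_cleaning", LpMaximize)
keep = {h[0]: LpVariable(h[0], cat=LpBinary) for h in hypotheses}

# Objective: maximise total confidence of the kept hypotheses.
prob += lpSum(h[5] * keep[h[0]] for h in hypotheses)

# Constraint: a person cannot be married to two people at overlapping times.
for i, a in enumerate(hypotheses):
    for b in hypotheses[i + 1:]:
        if a[1] == b[1] and a[3] < b[4] and b[3] < a[4]:
            prob += keep[a[0]] + keep[b[0]] <= 1

prob.solve()
print([h for h in keep if keep[h].value() == 1])   # expected: ['h1', 'h3']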

Latent credibility analysis

TLDR
A new approach to information credibility, Latent Credibility Analysis (LCA), is introduced, constructing strongly principled, probabilistic models where the truth of each claim is a latent variable and the credibility of a source is captured by a set of model parameters.
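
To give a flavor of this family of models, the Python sketch below jointly estimates claim belief and source credibility with a simple fixed-point iteration, using a noisy-or over supporting sources. It is a toy stand-in for LCA's probabilistic model, and the source/claim data is invented.

# Minimal sketch: corroborated claims raise their sources' credibility,
# and credible sources raise the belief in their claims.

claims_by_source = {          # toy data
    "site_a": {"c1", "c2"},   # asserts two claims
    "site_b": {"c1"},         # corroborates c1
    "site_c": {"c3"},
}

def estimate(iterations=10):
    credibility = {s: 0.5 for s in claims_by_source}
    claims = set().union(*claims_by_source.values())
    belief = {c: 0.5 for c in claims}
    for _ in range(iterations):
        for c in claims:
            supporters = [s for s, cs in claims_by_source.items() if c in cs]
            # Noisy-or: belief grows with independent credible supporters.
            miss = 1.0
            for s in supporters:
                miss *= 1.0 - credibility[s]
            belief[c] = 1.0 - miss
        for s, cs in claims_by_source.items():
            credibility[s] = sum(belief[c] for c in cs) / len(cs)
    return belief, credibility

belief, credibility = estimate()
print(belief)        # c1 (two supporters) scores highest
print(credibility)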

Disinformation on the Web: Impact, Characteristics, and Detection of Wikipedia Hoaxes

TLDR
This paper studies false information on Wikipedia by focusing on the hoax articles that have been created throughout its history, and assesses the real-world impact of hoax articles by measuring how long they survive before being debunked, how many pageviews they receive, and how heavily they are referred to by documents on the Web.

Knowlywood: Mining Activity Knowledge From Hollywood Narratives

TLDR
A pipeline for semantic parsing and knowledge distillation is developed, to systematically compile semantically refined activity frames, mined from about two million scenes of movies, TV series, and novels.

Where the Truth Lies: Explaining the Credibility of Emerging Claims on the Web and Social Media

TLDR
This paper automatically assesses the credibility of emerging claims with sparse presence in web sources and generates suitable explanations from judiciously selected sources, showing that the methods work well for early detection of emerging claims as well as for claims with limited presence on the web and social media.