Who With Whom And How?: Extracting Large Social Networks Using Search Engines

Authors: Stefan Siersdorfer, Philipp Kemkes, Hanno Ackermann, Sergej Zerr
Published in: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management
Social network analysis is leveraged in a variety of applications such as identifying influential entities, detecting communities with special interests, and determining the flow of information and innovations. However, existing approaches for extracting social networks from unstructured Web content do not scale well and are only feasible for small graphs. In this paper, we introduce novel methodologies for query-based search engine mining, enabling efficient extraction of social networks from… 
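To make the idea of query-based search engine mining concrete, here is a minimal sketch of a common co-occurrence baseline. It is not necessarily the method introduced in this paper. It assumes a hypothetical `hit_count` function that returns the number of results for a query string. The names `jaccard_strength`, `extract_edges`, `FAKE_COUNTS`, and `fake_hit_count`, the threshold value, and all counts are illustrative assumptions.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple


def jaccard_strength(count_a: int, count_b: int, count_both: int) -> float:
    """Jaccard coefficient over hit counts: |A and B| / |A or B|."""
    union = count_a + count_b - count_both
    return count_both / union if union > 0 else 0.0


def extract_edges(
    names: List[str],
    hit_count: Callable[[str], int],
    threshold: float = 0.1,
) -> List[Tuple[str, str, float]]:
    """Return (name_a, name_b, strength) for every pair at or above the threshold."""
    # One query per name.
    single = {name: hit_count(f'"{name}"') for name in names}
    edges = []
    # One query per pair of names, so the query count grows quadratically.
    for a, b in combinations(names, 2):
        both = hit_count(f'"{a}" "{b}"')
        strength = jaccard_strength(single[a], single[b], both)
        if strength >= threshold:
            edges.append((a, b, strength))
    return edges


# Toy stand-in for a search engine; these counts are made up for illustration.
FAKE_COUNTS: Dict[str, int] = {
    '"Alice"': 100,
    '"Bob"': 80,
    '"Carol"': 50,
    '"Alice" "Bob"': 40,
    '"Alice" "Carol"': 2,
    '"Bob" "Carol"': 0,
}


def fake_hit_count(query: str) -> int:
    """Look up a hypothetical hit count; unknown queries return 0."""
    return FAKE_COUNTS.get(query, 0)


edges = extract_edges(["Alice", "Bob", "Carol"], fake_hit_count)
```

With the toy counts, only the Alice and Bob pair clears the threshold. The sketch issues one query per pair of names, so the number of queries grows quadratically with the number of entities. This is one reason pairwise baselines of this kind become costly for large graphs.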


Cobwebs from the Past and Present: Extracting Large Social Networks using Internet Archive Data

Methods for constructing large social graphs from extracted relations and an interface to study their temporal evolution are described and introduced.

Social Network Extraction Unsupervised

An unsupervised stream of methods for extracting social networks from information sources is described, namely simplifying, enriching, and emphasizing the results.

Timeline Summarization for Event-Related Discussions on a Chinese Social Media Platform

An approach that automatically generates timeline summaries for sub-event discussions related to a query event without supervised learning. It uses a two-stage method to extract representative entity terms from the event-related discussions and to filter out most sentences that are semantically unrelated to the query event.

Whom to appease and whom to circumvent: analyzing knowledge sharing with social networks

This is the first comprehensive SNA to decipher the knowledge sharing pattern among researchers and offers to reduce redundant research by delineating the possible avenues in the area of knowledge sharing.

Timeline Summarization for Event-relate Facts and Public Issues on a Chinese Social Media Platform

An approach to automatically generate timeline summarization for sub-event discussions related to a query event with supervised learning is proposed.

Learning a Fully Convolutional Network for Object Recognition using very few Data

This paper proposes a system for object recognition that is trained with only 15 examples per class on average, which considerably reduces the required amount of labeled data, and demonstrates good performance on the recognition of traffic signs for cyclists as well as their localization in maps.


This document analyzes some of the challenges of analyzing data about the different activities of modern cities, explains their value, and describes some of the advanced solutions they have developed.

The Extraction of Social Networks from Web Using Search Engines

A method is introduced that lets researchers specify a topic of interest in a particular field and then observe and extract the social network of concepts related to that topic.



Superficial Method for Extracting Social Network for Academics Using Web Snippets

This paper demonstrates the possibility of exploiting features in Web snippets returned by search engines for disambiguating entities and building relations among entities during the process of extracting social networks.

POLYPHONET: an advanced social network extraction system from the web

A social network extraction system called POLYPHONET is proposed, which employs several advanced techniques to extract relations of persons, detect groups of people, and obtain keywords for a person using Google.

Efficient Entity Relation Discovery on Web

With the popularization of the Web, there are billions of pages on the Web, which contain abundant information about real-world entities and their relations. Therefore, much research focuses on named entity…

ArnetMiner: extraction and mining of academic social networks

The architecture and main features of the ArnetMiner system, which aims at extracting and mining academic social networks, are described and a unified modeling approach to simultaneously model topical aspects of papers, authors, and publication venues is proposed.

A system to extract social networks based on the processing of information obtained from Internet

An automatic system to extract social networks is presented: software designed to generate social networks by exploiting information that is already available on the Internet through common search engines such as Google or Yahoo.

Identifying comparable entities on the web

This work presents an initial step of mining comparable entities from sources of information available to a large-scale Web search engine, namely, search query logs and documents from a Web crawl, and generates a diverse set of comparables consisting of entities from a broad class of categories.

Robust Estimation of Google Counts for Social Network Extraction

A novel algorithm that estimates the Google count robustly is proposed, which uses the co-occurrence of terms as evidence to estimate the occurrence of a given word, and integrates multiple evidence for robust estimation.

Learning influence probabilities in social networks

This paper proposes models and algorithms for learning the model parameters and for testing the learned models to make predictions, and develops techniques for predicting the time by which a user may be expected to perform an action.

Sopra: a new social personalized ranking function for improving web search

A new ranking function called SoPRa that considers the social dimension of the Web, which is any social information that surrounds documents along with the social context of users, is proposed.

Learning open-domain comparable entity graphs from user search queries

This paper proposes a novel solution, known as Comparable Entity Graph Mining (CEGM), to learn an open-domain comparable entity graph from user search queries, which covers 73.4% of queries in the top 50 million unique queries of a commercial search engine.