• Corpus ID: 13987132

Data Acquisition in Social Networks: Issues and Proposals

Claudia Canali, Michele Colajanni, Riccardo Lancellotti
The amount of information that can be gathered from social networks may be useful in contexts ranging from marketing to intelligence. In this paper, we describe the three main techniques for data acquisition in social networks, the conditions under which they can be applied, and the open problems. We then focus on the main issues that crawlers have to address for getting data from social networks, and we propose a novel solution that exploits the cloud computing paradigm for…

Scalability Issues in Online Social Networks
A comprehensive study of social networks and their significant characteristics that categorizes social network architectures into three broad categories, (a) centralized, (b) decentralized, and (c) hybrid, and highlights various scalability issues faced by social network architectures.
Defending against large-scale crawls in online social networks
Genie, a system that OSN operators can deploy to defend against crawlers in large-scale OSNs, is proposed; it exploits the fact that the browsing patterns of honest users and crawlers are very different.
Probabilistic graphical models in modern social network analysis
Directed and undirected probabilistic graphical models (PGMs) are described and recent applications in modern SNA are highlighted, including the estimation and quantification of importance, propagation of influence, trust (and distrust), link and profile prediction, privacy protection, and news spread through microblogging.
Privacy in microblogging online social networks: issues and metrics
Samia Oukemeni (Internet Interdisciplinary Institute (IN3), Universitat Oberta de Catalunya; CYBERCAT Center for Cybersecurity Research of Catalonia, Barcelona, Spain; soukemeni@uoc.edu), Helena Rifà-Pous
A Survey of Sentiment Analysis from Social Media Data
The process of capturing data from social media over the years is addressed, along with similarity detection based on users' similar choices in social networks.
Prominent microblog users prediction during crisis events : using phase-aware and temporal modeling of users behavior. (Prédiction des utilisateurs primordiaux des microblogs durant les situations de crise : modélisation temporelle des comportements des utilisateurs en fonction des phases des évènem
The proposed approaches are detailed: on one hand, they predict the prominent users likely to share the targeted relevant and exclusive information; on the other, they give emergency responders real-time access to the required information in all formats.
Understanding & controlling user privacy in social media via exposure
This thesis proposes a new model that captures users' privacy requirements and is a significant improvement over access control, and investigates the effectiveness of the model in protecting users' privacy in three real-world scenarios.
Strategies for assessing sentiment are reviewed, categorized, and compared, and their limitations are identified in the expectation that this will give scope for better research in the future.
INRISCO: INcident monitoRing in Smart COmmunities
In the context of smart communities, the INRISCO proposal intends for the early detection of abnormal situations in cities and the analysis of whether, according to their impact, those incidents are really adverse for the community.


A quantitative methodology to identify relevant users in social networks
Which users may play a key role in the content dissemination and how users may be affected by different dissemination strategies are identified and used to identify relevant users for marketing on the popular YouTube network.
Network level footprints of facebook applications
This is the first and only known study of popular third-party applications on OSNs at this depth, and insights help provide guidelines for OSNs and application developers.
A measurement-driven analysis of information propagation in the flickr social network
Analysis of large-scale traces of information dissemination in the Flickr social network finds that even popular photos do not spread widely throughout the network, and the role of word-of-mouth exchanges between friends in the overall propagation of information in the network is questioned.
Unveiling facebook: a measurement study of social network based applications
A large-scale measurement study of the usage characteristics of online social network based applications finds that a small fraction of users account for the majority of activity within Facebook applications, and examines the existence of 'communities' with a high degree of interaction within a community and limited interaction outside it.
Analysis of topological characteristics of huge online social networking services
Cyworld, MySpace, and orkut, each with more than 10 million users, are compared and it is shown that they deviate from close-knit online social networks which show a similar degree correlation pattern to real-life social networks.
A measure of Online Social Networks
B. Krishnamurthy. 2009 First International Communication Systems and Networks and Workshops, 2009.
Online Social Networks (OSN) command a user base of about half a billion users on the Internet. Although the traffic contribution in bytes by OSNs is significantly less than earlier applications…
Characterizing social cascades in flickr
This work uses real traces of 1,000 popular photos and a social network collected from Flickr, and a theoretical framework borrowed from epidemiology, to show that social cascades are an important factor in the dissemination of content.
User interactions in social networks and their implications
This paper proposes the use of interaction graphs to impart meaning to online social links by quantifying user interactions, and uses both types of graphs to validate two well-known social-based applications (RE and SybilGuard).
Social Information Processing in News Aggregation
Through mathematical modeling, it's possible to describe how collaborative document rating emerges from the independent decisions users make, and reproduce observed ratings that actual stories on Digg have received.
Measurement and analysis of online social networks
This paper examines data gathered from four popular online social networks: Flickr, YouTube, LiveJournal, and Orkut, and reports that the indegree of user nodes tends to match the outdegree; that the networks contain a densely connected core of high-degree nodes; and that this core links small groups of strongly clustered, low-degree nodes at the fringes of the network.