Corpus ID: 42416536

Should We Be Confident in Peer Effects Estimated from Partial Crawls of Social Networks?

@inproceedings{Yang2017ShouldWB,
  title={Should We Be Confident in Peer Effects Estimated from Partial Crawls of Social Networks?},
  author={Jiasen Yang and Bruno Ribeiro and Jennifer Neville},
  year={2017}
}
Research in social network analysis and statistical relational learning has produced a number of methods for learning relational models from large-scale network data. Unfortunately, these methods have been developed under the unrealistic assumption of full data access. In practice, however, the data are often collected by crawling the network, due to proprietary access, limited resources, and privacy concerns. While prior studies have examined the impact of network crawling on the structural…

Stochastic Gradient Descent for Relational Logistic Regression via Partial Network Crawls
TLDR
This work extends the methodology to learning relational logistic regression models via stochastic gradient descent from partial network crawls, and shows that the proposed method yields accurate parameter estimates and confidence intervals.
Simulating systematic bias in attributed social networks and its effect on rankings of minority nodes
TLDR
The implications of systematic bias in edge data depend on an interplay between network topology and the type of systematic error. This emphasises the need for an error model framework such as the one developed here, which provides a first step towards studying the effects of systematic edge uncertainty on various network analysis tasks.

References

Inference in OSNs via Lightweight Partial Crawls
TLDR
Estimation techniques based on short crawls that have proven statistical guarantees are proposed and an adaptive crawler is provided that makes the method parameter-free, significantly improving the statistical guarantees.
A Walk in Facebook: Uniform Sampling of Users in Online Social Networks
TLDR
This paper develops a practical framework for obtaining a uniform sample of users in an online social network by crawling its social graph. It considers and compares several candidate crawling techniques and introduces formal online convergence diagnostics to assess sample quality during the data collection process.
Network Sampling Designs for Relational Classification
TLDR
Different sampling methods are presented, and the results indicate that the choice of sampling method can impact classification performance and consequently affect the accuracy of evaluation.
Identifying User Survival Types via Clustering of Censored Social Network Data
TLDR
This paper proposes a decision-tree-based algorithm that uses a global normalization of $p$-values to identify clusters with significantly different survival distributions and shows that this model outperforms competing methods.
Sampling from large graphs
TLDR
The best-performing methods are the ones based on random walks and "forest fire"; they match both static and evolutionary graph patterns very accurately, with sample sizes down to about 15% of the original graph.
Classification in Networked Data: a Toolkit and a Univariate Case Study
TLDR
The results demonstrate that very simple network-classification models perform quite well, well enough that they should be used regularly as baseline classifiers for studies of learning with networked data.
Simple estimators for relational Bayesian classifiers
TLDR
This work examines bias and variance tradeoffs over a range of data sets and shows that INDEPVAL's ability to model more multiset information results in lower bias estimates and contributes to its superior performance.
Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues
TLDR
This book describes the development of Markov models for discrete-time Carlo simulation and some of the models used in this study had problems with regard to consistency and ergodicity.
Bootstrap Methods: Another Look at the Jackknife
The NBER patent citation data file: Lessons, insights and methodological tools (2001)