Measuring large-scale distributed systems: case of BitTorrent Mainline DHT

@article{Wang2013MeasuringLD,
  title={Measuring large-scale distributed systems: case of BitTorrent Mainline DHT},
  author={Liang Wang and Jussi Kangasharju},
  journal={IEEE P2P 2013 Proceedings},
  year={2013},
  pages={1--10}
}
Peer-to-peer networks have been measured quite thoroughly over the past years; however, the BitTorrent Mainline DHT has received very little attention even though, as our results show, it is by far the largest of the currently active overlay systems. Because Mainline DHT differs from other systems, existing measurement methodologies are not appropriate for studying it. In this paper we present an efficient methodology for estimating the number of active users in the network…

Inference on BitTorrent Mainline DHT from Network Evolution Time Series
TLDR
By using the proposed algorithm, the system successfully discovers and clusters countries with similar user behavior, captures anomalies such as Sybil attacks and other real-world events with high accuracy, and provides an interesting alternative view on the global usage pattern in the BitTorrent system.
Inference on the Network Evolution in BitTorrent Mainline DHT
TLDR
It is shown that the Fourier transform can extract frequency features from time series data, which can then be used to characterize user behaviors and detect system anomalies in a peer-to-peer system automatically, without resorting to visual comparisons.
Study of Network Size Measurement Algorithm for P2P System
TLDR
Simulation results show that both algorithms can efficiently measure the size of P2P networks when topology changes and comparison of performance indexes shows that the binary tree algorithm can give a better estimate at the maintenance cost.
4-th IEEE International Conference on Peer-to-Peer Computing: 100 Million DHT Replies
TLDR
This paper uses the same mechanism as peers bootstrapping into the DHT to discover more than 20 Million peers in less than 2 hours, and exploits the caches present at bootstrap servers to collect information on peers which are currently connected.
100 Million DHT replies
TLDR
This paper uses the same mechanism as peers bootstrapping into the DHT to discover more than 20 Million peers in less than 2 hours, and discovers more than twice as many peers as BitMon, which crawls a subset of the DHT and then extrapolates.
Determining the Hop Count in Kademlia-type Systems
TLDR
This paper introduces the first comprehensive formal model of the routing for the entire family of Kademlia-type systems and shows that several of the recent improvements to the protocol in fact have been counterproductive with regard to routing efficiency.
Adaptive Search Radius for BitTorrent Swarms
TLDR
An existing locality algorithm for the eDonkey/eMule protocol, Adaptive Search Radius (ASR), is taken and made compatible with the BitTorrent protocol and the test results show that it is possible for ASR to lower the average distance between peers by 2 hops while having the possibility of making downloads faster.
Were You There? Bridging the Gap to Unveil Users' Online Sessions in Networked, Distributed Systems
TLDR
This paper proposes a methodology to correct ill-collected snapshots and build more accurate traces from them, and uses ground-truth data to assess the effectiveness of this methodology.
A Lightweight Approach for Improving the Lookup Performance in Kademlia-type Systems
TLDR
A new, yet backward-compatible, neighbour selection scheme that attempts to maximize the aforementioned diversity of neighbours' identifiers within each routing table bucket and measures the performance gain enabled by a partial deployment for the scheme in the real KAD system.
Locust: Highly Concurrent DHT Experimentation Framework for Security Evaluations
TLDR
Locust is a novel, highly concurrent DHT experimentation framework written in Elixir and designed for security evaluations; it allows running experiments with a full DHT implementation and around 4,000 nodes on a single machine, including an adjustable churn rate.

References

BitMON: A Tool for Automated Monitoring of the BitTorrent DHT
TLDR
BitMON is a Java-based out-of-the-box platform for monitoring the BitTorrent DHT. It tracks the DHT's size in peers as well as the peers' IP addresses, port numbers, countries of origin and session lengths, and it can display the long-term evolution of these indicators graphically.
On blind mice and the elephant: understanding the network impact of a large distributed system
TLDR
A comprehensive view of BitTorrent is presented, using data from a representative set of 500,000 users sampled over a two-year period, that captures unseen trends and reveals several unexpected features of the largest peer-to-peer system.
Long Term Study of Peer Behavior in the KAD DHT
TLDR
This work has been crawling a representative subset of KAD every five minutes for six months and obtained information about geographical distribution of peers, session times, daily usage, and peer lifetime, and found that session times are Weibull distributed.
Large-scale monitoring of DHT traffic
TLDR
This paper proposes the idea of minimally visible monitors to capture the traffic at a large number of peers with minimum disruption to the DHT, and implements and validate the proposed technique, called Montra, on the Kad DHT.
Modeling and analysis of bandwidth-inhomogeneous swarms in BitTorrent
TLDR
A model of a swarm in BitTorrent where peers have arbitrary upload and download bandwidths is presented, which captures the effects of BitTorrent's well-known ‘tit-for-tat’ mechanism in bandwidth-inhomogeneous swarms and provides an accurate mathematical description of the resulting dynamics.
Exploiting KAD: possible uses and misuses
TLDR
This paper relates some of the findings, points out how KAD can be used and misused, and explains why mounting a Sybil attack is very easy in KAD.
Profiling a million user DHT
TLDR
Measurements of the Azureus BitTorrent client's DHT are reported, offering a glimpse into the implementation challenges associated with making structured overlays work in practice and driving the design of a modified DHT lookup algorithm that reduces median DHT lookup time by an order of magnitude for a nominal increase in overhead.
Understanding churn in peer-to-peer networks
TLDR
The understanding of churn is advanced by improving accuracy, comparing different P2P file sharing/distribution systems, and exploring new aspects of churn.
Clustering and sharing incentives in BitTorrent systems
TLDR
This paper presents the first detailed experimental investigation of the peer selection strategy in the popular BitTorrent protocol, and observes that BitTorrent's modified choking algorithm in seed state provides uniform service to all peers, and that an underprovisioned initial seed leads to an absence of peer clustering and less effective sharing incentives.
Unraveling the BitTorrent Ecosystem
TLDR
This work develops a high-performance tracker crawler and, over a narrow window of 12 hours, crawls essentially all of the public BitTorrent Ecosystem's trackers, obtaining peer lists for all referenced torrents.