Sequential hashing: A flexible approach for unveiling significant patterns in high speed networks

Abstract

Identification of significant patterns in network traffic, such as IPs or flows that contribute a large volume (heavy hitters) or that introduce large changes in volume (heavy changers), has many applications in accounting and network anomaly detection. As network speeds and the number of flows grow rapidly, identifying heavy hitters/changers by tracking per-IP or per-flow statistics becomes infeasible because of both the computational overhead and the memory requirements. In this paper, we propose SeqHash, a novel sequential hashing scheme that supports fast and accurate recovery of heavy hitters/changers while requiring memory only slightly above the theoretical lower bound. SeqHash monitors data traffic using a sketch data structure that can flexibly trade off memory usage against computational overhead over a wide range, which different computer architectures can exploit to optimize overall performance. In addition, we propose statistically efficient algorithms for estimating the values of heavy hitters/changers. Using both mathematical analysis and experimental studies of Internet traces, we demonstrate that SeqHash achieves the same accuracy as existing methods while using much less memory and computational overhead. © 2010 Elsevier B.V. All rights reserved.
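To make the two definitions in the abstract concrete, the following minimal sketch computes heavy hitters and heavy changers by exact per-key counting. This is the naive per-IP baseline that the abstract describes as infeasible at high speed, not SeqHash itself. The function names, the `(key, size)` packet representation, and the thresholds are assumptions chosen for illustration.

```python
from collections import Counter

def total_volume(packets):
    """Sum packet sizes per key (e.g. per source IP) for one interval."""
    volume = Counter()
    for key, size in packets:
        volume[key] += size
    return volume

def heavy_hitters(packets, threshold):
    """Keys whose total volume in the interval is at least `threshold`."""
    volume = total_volume(packets)
    return {k: v for k, v in volume.items() if v >= threshold}

def heavy_changers(prev_packets, curr_packets, threshold):
    """Keys whose volume changed by at least `threshold` between intervals."""
    prev = total_volume(prev_packets)
    curr = total_volume(curr_packets)
    changes = {}
    for key in set(prev) | set(curr):
        delta = curr[key] - prev[key]  # Counter returns 0 for unseen keys
        if abs(delta) >= threshold:
            changes[key] = delta
    return changes

# Hypothetical traffic: (source IP, bytes) pairs for two measurement intervals.
interval1 = [("10.0.0.1", 500), ("10.0.0.2", 40), ("10.0.0.1", 700), ("10.0.0.3", 60)]
interval2 = [("10.0.0.1", 100), ("10.0.0.2", 50), ("10.0.0.3", 900)]

hitters = heavy_hitters(interval1, threshold=1000)
changers = heavy_changers(interval1, interval2, threshold=800)
```

Here the memory used grows with the number of distinct keys, which is the cost a sketch-based scheme is meant to avoid.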

DOI: 10.1016/j.comnet.2010.06.018

Cite this paper

@article{Bu2010SequentialHA,
  title={Sequential hashing: A flexible approach for unveiling significant patterns in high speed networks},
  author={Tian Bu and Jin Cao and Aiyou Chen and Patrick P. C. Lee},
  journal={Computer Networks},
  year={2010},
  volume={54},
  pages={3309-3326}
}