# Wiki-Watchdog: Anomaly Detection in Wikipedia Through a Distributional Lens

@article{Arackaparambil2011WikiWatchdogAD,
title={Wiki-Watchdog: Anomaly Detection in Wikipedia Through a Distributional Lens},
author={Chrisil Arackaparambil and Guanhua Yan},
journal={2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology},
year={2011},
volume={1},
pages={257-264}
}
• Published 22 August 2011
• Computer Science
• 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology
Wikipedia has become a standard source of reference online, and many people (some unknowingly) now trust this corpus of knowledge as an authority to fulfil their information requirements. In doing so they task the human contributors of Wikipedia with maintaining the accuracy of articles, a job that these contributors have been performing admirably. We study the problem of monitoring the Wikipedia corpus with the goal of *automated, online* anomaly detection. We present Wiki-watchdog, an…
5 Citations

## Citations

### Detecting Change in News Feeds Using a Context Based Graph

A graph-cut model that leverages context, content, and sentiment information is proposed and empirically evaluated, and results that improve upon baseline methods in terms of precision, recall, F1, and accuracy are presented.

### Anomaly detection in network streams through a distributional lens

• Computer Science
• 2011
This thesis provides a unified distribution-based methodology for online detection of anomalies in network traffic streams, which regards the traffic stream as a time series of distributions (histograms), and monitors metrics of distributions in the time series.
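
The idea of treating a stream as a time series of histograms and monitoring a metric of each one can be illustrated with a minimal sketch. It is not the thesis's algorithm: the choice of Shannon entropy as the metric, the fixed window size, and the trailing-mean threshold rule (`history`, `threshold`, `min_sigma`) are all assumptions made here for illustration.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def shannon_entropy(items):
    """Entropy (in bits) of the empirical distribution of `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_series(stream, window):
    """Split `stream` into consecutive windows; return one entropy per window."""
    return [
        shannon_entropy(stream[i:i + window])
        for i in range(0, len(stream) - window + 1, window)
    ]


def flag_anomalies(series, history=3, threshold=3.0, min_sigma=0.05):
    """Flag indices that deviate from the trailing mean by more than
    `threshold` standard deviations (a hypothetical rule, assumed here)."""
    flagged = []
    for t in range(history, len(series)):
        past = series[t - history:t]
        # Floor sigma so a perfectly flat history does not flag tiny noise.
        sigma = max(pstdev(past), min_sigma)
        if abs(series[t] - mean(past)) > threshold * sigma:
            flagged.append(t)
    return flagged


# Four windows of evenly spread values, then one window dominated by one value.
normal = ["a", "b", "c", "d"] * 4
skewed = ["a"] * 4
series = entropy_series(normal + skewed, window=4)
anomalies = flag_anomalies(series)
```

In this toy stream the first four windows each have an entropy of 2 bits and the last window collapses to 0 bits, so only the last window is flagged. Other distribution metrics could be substituted for `shannon_entropy` without changing the surrounding structure.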

### Scalable Algorithms for Mining Dynamic Graphs and Hypergraphs with Applications to Anomaly Detection

This dissertation develops anomaly detection algorithms for increasingly complex graph edge stream models and shows their effectiveness, both theoretically and empirically.

### On Tuning the Knobs of Distribution-Based Methods for Detecting VoIP Covert Channels

• Computer Science
2012 45th Hawaii International Conference on System Sciences
• 2012
A probabilistic model is developed to explain how tuning the knobs affects the rates of false positives and false negatives when popular entropy-based anomaly detection is used to detect covert channels in Voice over IP (VoIP) traffic.

### Anomaly detection in dynamic networks: a survey

• Computer Science
• 2015
Historically, the focus has been on anomaly detection in static graphs, which do not change and are capable of representing only a single snapshot of data, but as real‐world networks are constantly changing, there has been a shift in focus to dynamic graphs, which evolve over time.

## References

SHOWING 1-10 OF 18 REFERENCES

### Mining anomalies using traffic feature distributions

• Computer Science
SIGCOMM '05
• 2005
It is argued that the distributions of packet features observed in flow traces reveal both the presence and the structure of a wide range of anomalies, and that, using feature distributions, anomalies naturally fall into distinct and meaningful clusters that can be used to automatically classify anomalies and to uncover new anomaly types.

### An empirical evaluation of entropy-based traffic anomaly detection

• Computer Science
IMC '08
• 2008
This work considers two classes of distributions: flow-header features (IP addresses, ports, and flow sizes) and behavioral features (degree distributions measuring the number of distinct destination/source IPs that each host communicates with). It observes that the time series of entropy values of the address and port distributions are strongly correlated with each other and provide very similar anomaly detection capabilities.

### The Evolution of Wikipedia

• Computer Science
• 2013
It is proposed that not only the degree of the destination node but also its PageRank score can be used to explain the preferential generative process of graph edges, and the effectiveness of PageRank as a predictor of edge destination is evaluated.

### Anomaly detection: A survey

• Computer Science
CSUR
• 2009
This survey tries to provide a structured and comprehensive overview of the research on anomaly detection by grouping existing techniques into different categories based on the underlying approach adopted by each technique.

### OddBall: Spotting Anomalies in Weighted Graphs

• Computer Science
PAKDD
• 2010
Several new rules in density, weights, ranks, and eigenvalues that seem to govern the so-called “neighborhood sub-graphs” are discovered, and it is shown how to use these rules for anomaly detection.

### GraphScope: parameter-free mining of large time-evolving graphs

• Computer Science
KDD '07
• 2007
The efficiency and effectiveness of GraphScope, which is designed to operate on large graphs in a streaming fashion, are demonstrated on real datasets from several diverse domains; it produces meaningful time-evolving patterns that agree with human intuition.

### A signal analysis of network traffic anomalies

• Computer Science
IMW '02
• 2002
This paper reports results of signal analysis of four classes of network traffic anomalies: outages, flash crowds, attacks and measurement failures, and shows that wavelet filters are quite effective at exposing the details of both ambient and anomalous traffic.

### Dynamics of large networks

• Computer Science
• 2008
This thesis analyzes the world's largest social and communication network of Microsoft Instant Messenger with 240 million people and 255 billion conversations and makes interesting and counterintuitive observations about network community structure that suggest that only small network clusters exist, and that they merge and vanish as they grow.

### Distribution‐based anomaly detection in 3G mobile networks: from theory to practice

• Computer Science
Int. J. Netw. Manag.
• 2010
A statistics-based change detection algorithm for identifying deviations in distribution time series is proposed, along with a novel methodology, based on semi‐synthetic traces, for tuning and performance assessment of the anomaly detection (AD) algorithm.

### Information-theoretic measures for anomaly detection

• Computer Science
Proceedings 2001 IEEE Symposium on Security and Privacy. S&P 2001
• 2001
This work proposes to use several information-theoretic measures, namely entropy, conditional entropy, relative conditional entropy, information gain, and information cost, for anomaly detection in protection mechanisms against novel attacks.
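
Three of the measures named above have standard textbook definitions that can be sketched briefly. The code below is an illustrative sketch using those general definitions, not the formulation from the referenced paper; the function names and the toy data are assumptions. Relative conditional entropy and information cost are left out because the summary does not define them.

```python
import math
from collections import Counter


def entropy(xs):
    """H(X) in bits for the empirical distribution of `xs`."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())


def conditional_entropy(xs, ys):
    """H(X | Y) = sum over y of P(y) * H(X | Y = y)."""
    n = len(xs)
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return sum((len(g) / n) * entropy(g) for g in groups.values())


def information_gain(xs, ys):
    """Reduction in uncertainty about X once Y is known: H(X) - H(X | Y)."""
    return entropy(xs) - conditional_entropy(xs, ys)


# Toy data (hypothetical): labels for four events and two candidate features.
labels = ["ok", "ok", "bad", "bad"]
informative = ["low", "low", "high", "high"]    # determines the label exactly
uninformative = ["low", "high", "low", "high"]  # independent of the label

gain_informative = information_gain(labels, informative)
gain_uninformative = information_gain(labels, uninformative)
```

Here the first feature removes all uncertainty about the label (a gain of 1 bit), while the second removes none (a gain of 0 bits).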