Corpus ID: 220525904

# Quantifying and Reducing Bias in Maximum Likelihood Estimation of Structured Anomalies

@inproceedings{Chitra2021QuantifyingAR,
title={Quantifying and Reducing Bias in Maximum Likelihood Estimation of Structured Anomalies},
author={Uthsav Chitra and Kimberly Ding and Benjamin J. Raphael},
booktitle={ICML},
year={2021}
}
• Published in ICML 2021
• Computer Science, Mathematics
Anomaly estimation, or the problem of finding a subset of a dataset that differs from the rest of the dataset, is a classic problem in machine learning and data mining. In both theoretical work and in applications, the anomaly is assumed to have a specific structure defined by membership in an $\textit{anomaly family}$. For example, in temporal data the anomaly family may be time intervals, while in network data the anomaly family may be connected subgraphs. The most prominent approach for…

#### References

Near-Optimal and Practical Algorithms for Graph Scan Statistics with Connectivity Constraints
• Computer Science
• ACM Trans. Knowl. Discov. Data
• 2019
This work proposes a framework for designing algorithms that optimize a large class of scan statistics for networks, subject to connectivity constraints. The algorithms run in time that scales linearly in the size of the graph and depends on a parameter the authors call the “effective solution size,” while providing rigorous approximation guarantees.
Optimal Sparse Segment Identification With Application in Copy Number Variation Analysis
• Computer Science, Medicine
• Journal of the American Statistical Association
• 2010
An efficient likelihood ratio selection (LRS) procedure for identifying the segments is developed, and the asymptotic optimality of this method is presented in the sense that the LRS can separate the signal segments from the noise as long as the signals are in the identifiable regions.
Non-parametric scan statistics for event detection and forecasting in heterogeneous social media graphs
• Computer Science
• KDD
• 2014
This paper presents Non-Parametric Heterogeneous Graph Scan (NPHGS), a new approach that considers the entire heterogeneous network for event detection and efficiently maximizes a nonparametric scan statistic over connected subgraphs to identify the most anomalous network clusters.
Spatial Scan Statistic
• Computer Science
• Encyclopedia of Social Network Analysis and Mining
• 2014
The Flexibly Shaped Spatial Scan Statistic is used to map the subdistricts of Surabaya City that are detected as pockets of malnourished children under five, so that the subdistricts to prioritize in handling under-five GWNBR cases can be identified.
Near-optimal Anomaly Detection in Graphs using Lovasz Extended Scan Statistic
• Computer Science, Mathematics
• NIPS
• 2013
This work develops from first principles the generalized likelihood ratio test for determining whether there is a well-connected region of activation over the vertices of a graph in Gaussian noise, and provides a relaxation, called the Lovasz extended scan statistic (LESS), that uses submodularity to approximate the intractable generalized likelihood ratios.
Power comparisons for disease clustering tests
• Mathematics, Computer Science
• Comput. Stat. Data Anal.
• 2003
A collection of 1,220,000 simulated benchmark data sets, generated under 51 different cluster models and the null hypothesis, is presented for use in power evaluations and is used to compare the power of the spatial scan statistic, the maximized excess events test, and the nonparametric M statistic.
Computing All Small Cuts in Undirected Networks
• Mathematics, Computer Science
• ISAAC
• 1994
It is shown that all cuts of weights less than $k\lambda(N)$ can be enumerated in $O(mn^3 + n^{2k+2})$ time without using the maximum flow algorithm.
A Multiscale Scan Statistic for Adaptive Submatrix Localization
• Mathematics, Computer Science
• KDD
• 2019
An optimization framework based on a multiscale scan statistic is established, and algorithms to approach the optimizer are developed. The resulting estimator is shown to have superior performance compared to other estimators that do not require prior submatrix knowledge, while being comparatively faster to compute.
Graph Anomaly Detection Based on Steiner Connectivity and Density
• Computer Science
• Proceedings of the IEEE
• 2018
This work provides a survey of the various formulations of anomaly detection in dynamic networks with a focus on “window-based” methods, and describes two classes of techniques: 1) generalizations of Steiner connectivity; and 2) dense subgraph mining.
Graph Scan Statistics With Uncertainty
• Computer Science
• AAAI
• 2018
This paper develops the first systematic approach to incorporating uncertainty in scan statistics using two formulations, one based on the sample average approximation and the other using a max-min objective.