Visualizing Big Data Outliers Through Distributed Aggregation

@article{Wilkinson2018VisualizingBD,
  title={Visualizing Big Data Outliers Through Distributed Aggregation},
  author={Leland Wilkinson},
  journal={IEEE Transactions on Visualization and Computer Graphics},
  year={2018},
  volume={24},
  pages={256--266}
}
Visualizing outliers in massive datasets requires statistical pre-processing to reduce the scale of the problem to a size amenable to rendering systems like D3 or Plotly, or analytic systems like R or SAS. This paper presents a new algorithm, called hdoutliers, for detecting multidimensional outliers. It is unique for a) dealing with a mixture of categorical and continuous variables, b) dealing with big-p (many columns of data), c) dealing with big-…

