Searching for effects in big data: Why p-values are not advised and what to use instead

Abstract

P-values from null hypothesis significance testing have long been the standard and decisive measure of inferential statistics. For decades, however, leading statistical methodologists have argued that focusing on p-values is not conducive to science and that these tests are regularly misunderstood. The standard replacement, or at least complement, that these critics propose for p-values is the combination of confidence intervals and statistical effect sizes. Regrettably, analyzing and comparing huge data sets (from data mining or simulation-based data farming) with two measures is awkward. As a single-value measure for a first interpretation when scanning Big Data, this article proposes statistically secured effect sizes, based either on exact, mathematically sophisticated confidence intervals for effect sizes or on simplified approximations. It is further argued that simplified secured effect sizes are among the most instructive single measures of statistical interpretation and are completely perspicuous to the layman.
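The abstract does not spell out how a secured effect size is computed, so the following is only a minimal sketch under stated assumptions: the effect size is Cohen's d with a pooled standard deviation, the confidence interval uses a common large-sample normal approximation of the standard error of d, and the "secured" value is taken to be the interval bound closest to zero (or zero when the interval covers zero). The function names `cohens_d` and `secured_effect_size` are hypothetical and are not taken from the paper.

```python
import math
from statistics import NormalDist, mean, variance


def cohens_d(a, b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def secured_effect_size(a, b, confidence=0.95):
    """Assumed definition: the confidence bound of d that lies closest to zero.

    Returns 0.0 when the approximate interval covers zero, i.e. when no
    effect direction is statistically secured at the chosen confidence level.
    """
    n1, n2 = len(a), len(b)
    d = cohens_d(a, b)
    # Large-sample approximation of the standard error of d (an assumption;
    # an exact interval would use the noncentral t distribution instead).
    se = math.sqrt((n1 + n2) / (n1 * n2) + d * d / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    lower, upper = d - z * se, d + z * se
    if lower > 0:
        return lower
    if upper < 0:
        return upper
    return 0.0
```

Under this reading, many pairwise comparisons from a data farming run could be ranked by the absolute value of a single number per comparison, instead of by a p-value or by an effect size and an interval side by side.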

Cite this paper

@article{Hofmann2015SearchingFE, title={Searching for effects in big data: Why p-values are not advised and what to use instead}, author={Marko A. Hofmann}, journal={2015 Winter Simulation Conference (WSC)}, year={2015}, pages={725-736} }