# Information divergence is more χ2-distributed than the χ2-statistics

@article{Harremos2012InformationDI, title={Information divergence is more $\chi^2$-distributed than the $\chi^2$-statistics}, author={P. Harremo{\"e}s and G. Tusn{\'a}dy}, journal={2012 IEEE International Symposium on Information Theory Proceedings}, year={2012}, pages={533-537} }

For testing goodness of fit it is very popular to use either the χ<sup>2</sup>-statistic or the G<sup>2</sup>-statistic (information divergence). Asymptotically both are χ<sup>2</sup>-distributed, so an obvious question is which of the two statistics has a distribution that is closest to the χ<sup>2</sup>-distribution. Surprisingly, when there is only one degree of freedom it seems like the distribution of information divergence is much better approximated by a χ<sup>2</sup>-distribution than…
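As an illustrative sketch (not taken from the paper), the two statistics compared in the abstract can be computed for a k-cell table with observed counts O<sub>i</sub> and expected counts E<sub>i</sub> = n·p<sub>i</sub>; the two-cell example at the end is hypothetical:

```python
import math

def pearson_chi2(observed, expected):
    """Pearson's chi^2 statistic: sum over cells of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def g2(observed, expected):
    """G^2 statistic: 2 * sum over cells of O * ln(O / E), i.e. 2n times the
    information divergence between the empirical and null distributions.
    Cells with O = 0 contribute 0 (the limit of x * ln(x) as x -> 0)."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

# Hypothetical two-cell example (one degree of freedom): n = 50 trials,
# null probabilities (0.3, 0.7), observed counts (20, 30).
observed = [20, 30]
expected = [50 * 0.3, 50 * 0.7]
print(pearson_chi2(observed, expected))  # ≈ 2.381
print(g2(observed, expected))
```

Under the null hypothesis both values would be referred to a χ<sup>2</sup>-distribution with one degree of freedom; the paper's question is which statistic's exact distribution that reference approximates better.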

#### 25 Citations

Mutual information of contingency tables and related inequalities

- Mathematics, Computer Science
- 2014 IEEE International Symposium on Information Theory
- 2014

The signed log-likelihood is introduced and it is demonstrated that its distribution function can be related to the distribution function of a standard Gaussian by inequalities, and a general conjecture about how close the signed log-likelihood is to a standard Gaussian is formulated.

Some Refinements of Large Deviation Tail Probabilities

- 2012

If μ is close to the mean of X<sub>1</sub> one would usually approximate P<sub>n,μ</sub> by a tail probability of a Gaussian random variable. If μ is far from the mean of X<sub>1</sub> the tail probability can be estimated using…

Multinomial Concentration in Relative Entropy at the Ratio of Alphabet and Sample Sizes

- Mathematics
- 2019

We show that the moment generating function of the Kullback-Leibler divergence between the empirical distribution of $n$ independent samples from a distribution $P$ over a finite alphabet of size $k$… Expand

Bayesian Learning of Relationships

- Computer Science
- 2017

This dissertation develops new statistical techniques for learning the relationships inside data from a Bayesian perspective, including a novel hypothesis testing tool to probe a general relationship, either linear or nonlinear.

How Many Samples Required in Big Data Collection: A Differential Message Importance Measure

- Computer Science, Mathematics
- ArXiv
- 2018

It is proved that the change of DMIM can describe the gap between the distribution of a set of sample values and a theoretical distribution, and that the empirical distribution approaches the real distribution as the DMIM deviation decreases.

Bounds on tail probabilities for negative binomial distributions

- Computer Science, Mathematics
- Kybernetika
- 2016

Various new inequalities are presented for tail probabilities for distributions that are elements of the most important exponential families, which include the Poisson distributions, the Gamma distributions, the binomial distributions, the negative binomial distributions and the inverse Gaussian distributions.

Bounds on Tail Probabilities in Exponential families

- Mathematics
- 2016

In this paper we present various new inequalities for tail probabilities for distributions that are elements of the most important exponential families. These families include the Poisson…

Differential Message Importance Measure: A New Approach to the Required Sampling Number in Big Data Structure Characterization

- Computer Science, Mathematics
- IEEE Access
- 2018

A new approach to the required sampling number is proposed: the DMIM deviation is constructed to characterize the process of collecting message importance, and the connection between message importance and distribution goodness of fit is established, which verifies that it is feasible to take message importance into account when analyzing data collection.

Algorithmic Fairness Revisited

- 2015

We study fairness in algorithmic decision making, with the goal of proposing formal and robust definitions and measures of an algorithm’s bias towards sensitive user features. The main contribution…

Statistical Methods for Analyzing Time Series Data Drawn from Complex Social Systems

- Computer Science
- 2015

This thesis develops a non-parametric method for smoothing time series data corrupted by serially correlated noise, and demonstrates that user behavior, while quite complex, belies simple underlying computational structures.

#### References

χ2 and classical exact tests often wildly misreport significance; the remedy lies in computers

- 2011

If a discrete probability distribution in a model being tested for goodness-of-fit is not close to uniform, then forming the Pearson χ2 statistic can involve division by nearly zero. This often leads…
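To make the near-zero-division point concrete, here is a small hypothetical illustration (not from the cited paper): when a rare cell has an expected count of only 0.1, a single observation landing there inflates Pearson's χ2 far above the likelihood-ratio statistic G<sup>2</sup>:

```python
import math

def pearson_chi2(observed, expected):
    """Pearson's chi^2: sum of (O - E)^2 / E; a small E in the
    denominator can blow the statistic up."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def g2(observed, expected):
    """G^2 (likelihood-ratio) statistic: 2 * sum of O * ln(O / E)."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

# Hypothetical: n = 100 draws under null probabilities (0.001, 0.999);
# one draw lands in the rare cell, so expected counts are (0.1, 99.9).
observed = [1, 99]
expected = [0.1, 99.9]
print(pearson_chi2(observed, expected))  # ≈ 8.11, dominated by (1 - 0.1)^2 / 0.1
print(g2(observed, expected))            # ≈ 2.81, far more moderate
```

Referred to a χ<sup>2</sup>-distribution with one degree of freedom, the first value looks highly significant while the second does not, which is the kind of misreporting the reference describes.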

On the Bahadur-Efficient Testing of Uniformity by Means of the Entropy

- Computer Science, Mathematics
- IEEE Transactions on Information Theory
- 2008

It is proved that the information divergence statistic in this problem is more efficient in the Bahadur sense than any power divergence statistic of order, which means that the entropy provides the most efficient way of characterizing the uniformity of a distribution.

Large deviations of divergence measures on partitions

- Mathematics
- 2001

We discuss Chernoff-type large deviation results for the total variation, the I-divergence errors, and the χ2-divergence errors on partitions. In contrast to the total variation and the…

EFFICIENCIES OF CHI-SQUARE AND LIKELIHOOD RATIO GOODNESS-OF-FIT TESTS

- Mathematics
- 1985

The classical problem of choice of number of classes in testing goodness of fit is considered for a class of alternatives, for the chi-square and likelihood ratio statistics. Pitman and Bahadur…

Accurate Methods for the Statistics of Surprise and Coincidence

- Computer Science
- Comput. Linguistics
- 1993

The basis of a measure based on likelihood ratios that can be applied to the analysis of text is described, and in cases where traditional contingency table methods work well, the likelihood ratio tests described here are nearly identical.

Information-theoretic methods in testing the goodness of fit

- Mathematics
- 2000 IEEE International Symposium on Information Theory (Cat. No.00CH37060)
- 2000

We present a new approach to evaluating the efficiency of information-divergence-type statistics for testing the goodness of fit. Since the Pitman approach is too weak to detect sufficiently sharply…

Efficiency of entropy testing

- Computer Science, Mathematics
- 2008 IEEE International Symposium on Information Theory
- 2008

It is shown that in a certain sense Shannon entropy is more efficient than Rényi entropy for α ∈ [0, 1].

Testing Statistical Hypotheses

- Mathematics
- 1959

The General Decision Problem.- The Probability Background.- Uniformly Most Powerful Tests.- Unbiasedness: Theory and First Applications.- Unbiasedness: Applications to Normal Distributions.-…

Limiting distribution of the G statistics

- Mathematics
- 2008

The G statistic and its local version have been used extensively in spatial data analysis. The paper proves the asymptotic normality of the G statistic. Theorems in this paper imply that the regular…