# How not to lie with statistics: the correct way to summarize benchmark results

@article{Fleming1986HowNT, title={How not to lie with statistics: the correct way to summarize benchmark results}, author={Philip J. Fleming and John J. Wallace}, journal={Commun. ACM}, year={1986}, volume={29}, pages={218-221} }

Using the arithmetic mean to summarize normalized benchmark results leads to mistaken conclusions that can be avoided by using the preferred method: the geometric mean.
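The abstract's claim can be illustrated with a minimal sketch. The run times below are hypothetical numbers invented for this example (they are not taken from the paper), and the helper names `normalized` and `faster_by` are likewise assumptions. With these numbers, the arithmetic mean of normalized times names a different winner depending on which machine is used as the reference, while the geometric mean names the same winner either way.

```python
from math import prod

# Hypothetical run times in seconds for two benchmarks (invented for illustration).
TIMES = {
    "A": [10.0, 100.0],
    "B": [20.0, 40.0],
}


def arithmetic_mean(xs):
    return sum(xs) / len(xs)


def geometric_mean(xs):
    # n-th root of the product of n positive values.
    return prod(xs) ** (1.0 / len(xs))


def normalized(times, reference):
    """Divide each machine's times by the reference machine's times, per benchmark."""
    ref = times[reference]
    return {m: [t / r for t, r in zip(ts, ref)] for m, ts in times.items()}


def faster_by(mean_fn, reference, times=TIMES):
    """Return the machine with the smallest mean normalized time (lower is faster)."""
    norm = normalized(times, reference)
    return min(norm, key=lambda m: mean_fn(norm[m]))


if __name__ == "__main__":
    for reference in TIMES:
        print(
            f"reference={reference}: "
            f"arithmetic mean picks {faster_by(arithmetic_mean, reference)}, "
            f"geometric mean picks {faster_by(geometric_mean, reference)}"
        )
```

The geometric mean is consistent here because the geometric mean of a set of ratios equals the ratio of the geometric means, so changing the reference machine rescales every summary by the same factor and cannot reorder them.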

#### 386 Citations

The geometric mean?

- Mathematics
- 2020

The sample geometric mean (SGM), introduced by Cauchy in 1821, is a measure of central tendency with many applications in the natural and social sciences including environmental monitoring, scientom...

How to assess and report the performance of a stochastic algorithm on a benchmark problem: mean or best result on a number of runs?

- Computer Science
- Optim. Lett.
- 2007

This short note analyzes and refutes the main argument brought in favor of the claim that reporting the best result obtained by a stochastic algorithm in a number of runs is more meaningful than reporting some central statistic.

The harmonic or geometric mean: does it really matter?

- Computer Science
- CARN
- 2006

It is concluded that for the SPEC CPU2000 benchmark suite, the choice of averaging has very little influence on the relative standing of different machines, and the decision to purchase one system rather than another should not be influenced by the type of averaging used.

Issues in Benchmark Metric Selection

- Computer Science
- TPCTC
- 2009

The case of the TPC-D metric, which used the much debated geometric mean for the single-stream test, confirms that the "real" measure for a decision-support benchmark is the arithmetic mean.

Characterizing computer performance with a single number

- Computer Science
- CACM
- 1988

The controversy surrounding single number performance reduction is examined and solutions are suggested through a comparison of measures.

Fast Sampling of Perfectly Uniform Satisfying Assignments

- Mathematics, Computer Science
- SAT
- 2018

An algorithm for perfectly uniform sampling of satisfying assignments, based on the exact model counter sharpSAT and reservoir sampling, is presented, which is faster than the state of the art by 10 to over 100,000 times.

A compumetrical approach to summarize benchmark results

- Computer Science
- Proceedings of the 5th Jerusalem Conference on Information Technology, 1990. 'Next Decade in Information Technology'
- 1990

The authors suggest a metric-based approach that emphasizes the necessity for analyzing computer performance variables and shows how to normalize the performance variables to a known machine and…

The precision of the arithmetic mean, geometric mean and percentiles for citation data: An experimental simulation modelling approach

- Mathematics, Computer Science
- J. Informetrics
- 2016

The results show that the geometric mean citation count is the most precise, closely followed by the percentage of a country's articles in the top 50% most cited articles for a field, year and document type.

Assessing Probabilistic Inference by Comparing the Generalized Mean of the Model and Source Probabilities

- Mathematics, Computer Science
- Entropy
- 2017

An approach to the assessment of probabilistic inference is described which quantifies the performance on the probability scale by plotting the reported model probabilities versus the histogram calculated source probabilities.

Performance variation across benchmark suites

- Computer Science
- CARN
- 1990

The performance ratio between two systems tends to vary across different benchmarks. Here we study this variation as a "signature" or "fingerprint" of the systems under consideration. This…

#### References


Re-evaluation of the RISC I

- Computer Science
- CARN
- 1984

This paper hopes to more completely evaluate the Reduced Instruction Set Computer, a relatively new concept in computer architecture, by removing extraneous factors and re-evaluating the RISC I.

A VLSI RISC

- Computer Science
- Computer
- 1982

The hypothesis is that by reducing the instruction set one can design a suitable VLSI architecture that uses scarce resources more effectively than a CISC, and expects this approach to reduce design time, design errors, and the execution time of individual instructions.


Functional Equations

- A comprehensive textbook on functional equations
- 1966

Authors' present addresses: Philip J. Fleming, AT&T Information Systems, 1100 East Warrenville Road, Naperville, IL; The Foxboro Company, Foxboro, MA 02035; electronic mail: foxvax5!jjw.

[Performance of Systems]: measurement techniques, performance attributes. General Terms: Measurement, Performance. Additional Key Words and Phrases: benchmarking, geometric mean. Received 5/65.