Not the First Digit! Using Benford's Law to Detect Fraudulent Scientific Data

  title={Not the First Digit! Using Benford's Law to Detect Fraudulent Scientific Data},
  author={Andreas Diekmann},
  journal={Journal of Applied Statistics},
  pages={321 - 329}
  • A. Diekmann
  • Published 1 April 2007
  • Mathematics
  • Journal of Applied Statistics
Abstract Digits in statistical data produced by natural or social processes are often distributed in a manner described by ‘Benford's law’. Recently, a test against this distribution was used to identify fraudulent accounting data. This test is based on the supposition that first, second, third, and other digits in real data follow the Benford distribution while the digits in fabricated data do not. Is it possible to apply Benford tests to detect fabricated or falsified scientific data as well… 
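The digit probabilities the abstract refers to can be written down directly. The following is a minimal Python sketch, not the paper's own procedure: the function names are hypothetical, and the chi-square statistic is just one common goodness-of-fit choice assumed here for illustration.

```python
import math
from collections import Counter


def benford_first_digit_probs():
    """Benford probability of each first significant digit d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def benford_second_digit_probs():
    """Benford probability of each second significant digit d = 0..9,
    obtained by summing over all possible first digits k = 1..9."""
    return {
        d: sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
        for d in range(10)
    }


def first_digit(x):
    """First significant digit of a nonzero number, read off its
    scientific-notation representation."""
    return int(f"{abs(x):.15e}"[0])


def chi_square_first_digit(values):
    """Pearson chi-square statistic of the observed first digits
    against the Benford first-digit distribution (8 degrees of freedom)."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_first_digit_probs().items()
    )
```

For example, `chi_square_first_digit([2 ** k for k in range(500)])` stays well below 15.51, the 5% critical value of a chi-square distribution with 8 degrees of freedom, while a sample whose first digits are uniform over 1 to 9 exceeds it. The same pattern applies to later digits by swapping in `benford_second_digit_probs()` and extracting the second digit instead.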

Two digit testing for Benford's law

Benford’s law has been used by auditors to help reveal data manipulation not only in the context of tax audits and corporate accounting, but also election fraud. The principal idea behind Benford’s

Benford's Law as an Instrument for Fraud Detection in Surveys Using the Data of the Socio-Economic Panel (SOEP)

This paper focuses on fraud detection in surveys, using Socio-Economic Panel (SOEP) data as an example for testing the new methods proposed here. It develops a measure that reflects the plausibility of the digit distribution in interviewer clusters and shows that, in several SOEP subsamples, Benford's Law holds for the available continuous data.

Benford’s Law as an Instrument for Fraud Detection in Surveys Using the Data of the Socio-Economic Panel (SOEP)

Summary This paper focuses on fraud detection in surveys, using Socio-Economic Panel (SOEP) data as an example for testing the new methods proposed here. A statistical theorem referred to as Benford’s

The First Digits Analysis Until the Fifth Benford Law in Financial Statement

This research aims to explore whether there is fraud in a financial statement, using Benford’s law, which states that the first through fifth digits follow a declining distribution in which lower digits occur more often.

Testing for Benford's Law: A Monte Carlo Comparison of Methods

Testing data for conformity to Benford's law is used not only by auditors exploiting a numerical phenomenon to detect fraudulently reported data. Operationally goodness-of-fit tests are used to

Does Benford’s Law hold in economic research and forecasting?

First and higher order digits in data sets of natural and socio-economic processes often follow a distribution called Benford’s law. This phenomenon has been used in business and scientific

Testing for Benford’s Law in very small samples: Simulation study and a new test proposal

Benford’s Law defines a statistical distribution for the first and higher order digits in many datasets. Under very general conditions, numbers are expected to naturally conform to the theorized

Benford’s Law as an Indicator of Fraud in Economics

Abstract Contrary to intuition, first digits of randomly selected data are not uniformly distributed but follow a logarithmically declining pattern, known as Benford’s law. This law is increasingly

Fraud Detection in Financial Statements Applying Benford's Law with Monte Carlo Simulation

This research confirms the hypothesis that financial statement frauds are usually conducted using the second digit, and tests Benford’s Law for detecting fraud in financial statements.

Sensitivity to statistical regularities: People (largely) follow Benford’s law

The results suggest that Benford’s law is a product of the way people generate responses, rather than sensitivity to the relationship itself, which is a key part of adaptive approaches to decision making such as that of Gigerenzer, et al.



A Statistical Derivation of the Significant-Digit Law

If distributions are selected at random (in any "unbiased" way) and random samples are then taken from each of these distributions, the significant digits of the combined sample will converge to the logarithmic (Benford) distribution.

Data fabrication: Can people generate random digits?

Many people have difficulty in generating random numbers. This difficulty suggests that potentially fabricated numbers encountered in investigations of scientific misconduct be examined for nonrandom


A century-old observation concerning an unexpected pattern in the first digits of many tables of numerical data has recently been discovered to also hold for the stock market, census statistics, and

Automatic Identification of Faked and Fraudulent Interviews in the German SOEP

Two new tools for the identification of faked interviews in surveys are presented, one method is based on Benford's Law, and the other exploits the empirical observation that fakers most often produce answers with less variability than could be expected from the whole survey.

Identification, Characteristics and Impact of Faked Interviews in Surveys: An Analysis by Means of Genuine Fakes in the Raw Data of SOEP

To the best of our knowledge, most of the few methodological studies which analyze the impact of faked interviews on survey results are based on “artificial fakes” generated by project students in a

Patterns in Listings of Failure-Rate & MTTF Values and Listings of Other Data

It has been observed that the decimal parts of failure-rate and MTTF values as listed in tables tend to have a logarithmic distribution. A possible explanation for this phenomenon is given. When such

Base-Invariance Implies Benford's Law

A derivation of Benford's Law or the First-Digit Phenomenon is given assuming only base-invariance of the underlying law. The only base-invariant distributions are shown to be convex combinations of

Characteristics and impact of faked interviews in surveys – An analysis of genuine fakes in the raw data of SOEP

Summary: Panel data offers a unique opportunity to identify data that interviewers clearly faked by comparing data waves. In the German Socio-Economic Panel (SOEP), only 0.5 percent of all records of

The Peculiar Distribution of First Digits

The law of anomalous numbers

  • on Reliability,
  • 1938