# Stupid Data Miner Tricks

@inproceedings{Leinweber2007StupidDM, title={Stupid Data Miner Tricks}, author={David Leinweber}, year={2007} }

This article originated over ten years ago as a set of joke slides showing silly spurious correlations. These statistically appealing relationships between the stock market and dairy products and third-world livestock populations have been cited often, in Business Week, the Wall Street Journal, the book “A Mathematician Looks at the Stock Market,” and elsewhere. Students from Bill Sharpe's classes at Stanford seem to be familiar with them. The slides were expanded to include some actual content…

#### 78 Citations

Predicting Financial Markets with Google Trends and Not so Random Keywords

- Economics
- 2013

We check the claims that data from Google Trends contain enough data to predict future financial index returns. We first discuss the many subtle (and less subtle) biases that may affect the backtest…

Big Data Mining: An Overview

- Engineering
- 2015

Big data is an evolving term that describes any voluminous amount of structured, semi-structured and unstructured data that has the potential to be mined for information. Although big data doesn't…

Event-Driven Trading and the “New News”

- Economics
- The Journal of Portfolio Management
- 2011

Two information revolutions are underway in trading and investing. Most headlines focus on structured quantitative market information at ever higher frequencies, but the other technology revolution…

Critical Questions for Big Data

- Sociology
- 2012

The era of Big Data has begun. Computer scientists, physicists, economists, mathematicians, political scientists, bio-informaticists, sociologists, and other scholars are clamoring for access to the…

The Golden Dilemma

- Economics
- 2013

While gold objects have existed for thousands of years, gold's role in diversified portfolios is not well understood. We critically examine popular stories such as 'gold is an inflation hedge'. We…

Do Google Trend Data Contain More Predictability than Price Returns?

- Economics, Physics
- 2014

Using non-linear machine learning methods and a proper backtest procedure, we critically examine the claim that Google Trends can predict future price returns. We first review the many potential…

Gold, the Golden Constant, and Déjà Vu

- Art
- 2020

Currently, the real, or inflation-adjusted, price of gold is almost as high as it was in January 1980 and August 2011. Since 1975, periods of high real gold prices have occurred during periods of…

Data, Data, Everywhere

- Computer Science
- 2011

The amount of data available combined with the number of variables that need to be considered is of a scale far beyond what is amenable to manual inspection, and automated and semi-automated data analysis is thus essential to sieve through the data for meaningful conclusions.

What Big Data May Mean for Surveys

- Engineering
- 2014

Two converging trends raise questions about the future of large-scale probability surveys conducted by or for National Statistical Institutes (NSIs). First, increasing costs and rising rates of…

Automated algorithmic trading: machine learning and agent-based modelling in complex adaptive financial markets

- Economics, Computer Science
- 2016

The paper proposes an autonomous system that uses novel machine learning techniques to predict the price return over well-documented seasonal events and uses these predictions to develop a profitable trading strategy. It also proposes an adaptation of the system for predicting the price impact of order book events.

#### References

Data Mining: Statistics and More?

- Computer Science
- 1998

Data mining is a new discipline lying at the interface of statistics, database technology, pattern recognition, machine learning, and other areas. It is concerned with the secondary analysis…

Data-Snooping Biases in Tests of Financial Asset Pricing Models

- Economics
- 1989

We investigate the extent to which tests of financial asset pricing models may be biased by using properties of the data to construct the test statistics. Specifically, we focus on tests using…

Selection Models and the File Drawer Problem

- Mathematics
- 1988

Meta-analysis consists of quantitative methods for combining evidence from different studies about a particular issue. A frequent criticism of meta-analysis is that it may be based on a biased sample…

Maximizing Predictability in the Stock and Bond Markets

- Economics
- 1995

We construct portfolios of stocks and of bonds that are maximally predictable with respect to a set of ex ante observable economic variables, and show that these levels of predictability are…

A Note on Screening Regression Equations

- Mathematics
- 1983

Consider developing a regression model in a context where substantive theory is weak. To focus on an extreme case, suppose that in fact there is no relationship between the dependent…

The Theory and Practice of Econometrics

- Computer Science
- 1985

The Classical Inference Approach for the General Linear Model, Statistical Decision Theory and Biased Estimation, and the Bayesian Approach to Inference are reviewed.

Behind the Smoke and Mirrors: Gauging the Integrity of Investment Simulations

- Business
- 1992

Fund sponsors and others who must evaluate simulated investment results should carefully question the simulation process. In particular, they should ask about the data base used, the portfolio…

Specification Searches

- 1978