# Stupid Data Miner Tricks

@inproceedings{Leinweber2007StupidDM, title={Stupid Data Miner Tricks}, author={David Leinweber}, year={2007} }

This article originated over ten years ago as a set of joke slides showing silly spurious correlations. These statistically appealing relationships between the stock market and dairy products and third-world livestock populations have been cited often, in Business Week, the Wall Street Journal, the book “A Mathematician Looks at the Stock Market,” and elsewhere. Students from Bill Sharpe's classes at Stanford seem to be familiar with them. The slides were expanded to include some actual content…

## 86 Citations

### Big Data Mining: An Overview

- Computer Science
- 2015

A HACE theorem is presented that characterizes the features of the Big Data revolution and enables companies to "drill down" into summary information to view detail transactional data.

### The Golden Dilemma

- Economics
- 2013

Although gold has been around for thousands of years, its role in diversified portfolios is not well understood. The authors critically examined such popular stories as “gold is an inflation hedge.”…

### Event-Driven Trading and the “New News”

- Computer Science
- The Journal of Portfolio Management
- 2011

In this article, Leinweber and Sisk include event studies and show U.S. portfolio simulation results for “pure news” signals applied over the period 2006–2009 as well as a true out-of-sample period in 2010, which indicates alpha in excess of 10% a year.

### CRITICAL QUESTIONS FOR BIG DATA

- Political Science
- 2012

The era of Big Data has begun. Computer scientists, physicists, economists, mathematicians, political scientists, bio-informaticists, sociologists, and other scholars are clamoring for access to the…

### Gold, the Golden Constant, and Déjà Vu

- Economics
- 2020

Currently, the real, or inflation-adjusted, price of gold is almost as high as it was in January 1980 and August 2011. Since 1975, periods of high real gold prices have occurred during periods of…

### What Big Data May Mean for Surveys

- Geology
- 2014

Two converging trends raise questions about the future of large-scale probability surveys conducted by or for National Statistical Institutes (NSIs). First, increasing costs and rising rates of…

### Automated algorithmic trading: machine learning and agent-based modelling in complex adaptive financial markets

- Computer Science
- 2016

An autonomous system is proposed that uses novel machine learning techniques to predict the price return over well-documented seasonal events and uses these predictions to develop a profitable trading strategy. An adaptation of the system for predicting the price impact of order book events is also proposed.

### A Perceptron Based Neural Network Data Analytics Architecture for the Detection of Fraud in Credit Card Transactions in Financial Legacy Systems

- Computer Science
- WSEAS Transactions on Systems and Control
- 2021

The paper examines the feasibility and practicality of implementing a proof-of-concept Perceptron-based Artificial Neural Network (ANN) architecture that can be directly plugged into a legacy paradigm financial system platform that has been trained on specific fraudulent patterns.

### Big Data Techniques and Applications

- Computer Science
- 2014

In this chapter, past and current research on big data techniques and their applications is reviewed.

## References

### Data Mining: Statistics and More?

- Computer Science
- 1998

Data mining is a new discipline lying at the interface of statistics, database technology, pattern recognition, machine learning, and other areas. It is concerned with the secondary analysis…

### Data-Snooping Biases in Tests of Financial Asset Pricing Models

- Economics
- 1989

We investigate the extent to which tests of financial asset pricing models may be biased by using properties of the data to construct the test statistics. Specifically, we focus on tests using…

### Selection Models and the File Drawer Problem

- Biology
- 1988

This paper uses selection models, or weighted distributions, to deal with one source of bias, namely the failure to report studies that do not yield statistically significant results, and applies selection models to two approaches that have been suggested for correcting the bias.

### MAXIMIZING PREDICTABILITY IN THE STOCK AND BOND MARKETS

- Economics
- Macroeconomic Dynamics
- 1997

We construct portfolios of stocks and bonds that are maximally predictable with respect to a set of ex-ante observable economic variables, and show that these levels of predictability are…

### A Note on Screening Regression Equations

- Philosophy
- 1983

Consider developing a regression model in a context where substantive theory is weak. To focus on an extreme case, suppose that in fact there is no relationship between the dependent…

### The Theory and Practice of Econometrics

- Mathematics, Economics
- 1985

The Classical Inference Approach for the General Linear Model, Statistical Decision Theory and Biased Estimation, and the Bayesian Approach to Inference are reviewed.

### Behind the Smoke and Mirrors: Gauging the Integrity of Investment Simulations

- Economics
- 1992

Fund sponsors and others who must evaluate simulated investment results should carefully question the simulation process. In particular, they should ask about the data base used, the portfolio…