Applying Machine Learning Techniques to Baseball Pitch Prediction

Michael Hamilton, Phuong Hoang, Lori Layne, Joseph Murray, David Padget, Corey Stafford, Hien Tran
Major League Baseball, a professional baseball league in the US and Canada, is one of the most popular sports leagues in the world. We apply several common machine learning classification methods to PITCHf/x data to classify pitches by type. We then extend the classification task to prediction by utilizing features only known before a pitch is thrown. By performing significant feature analysis and introducing a novel approach for feature selection, moderate improvement over former results is…
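
To make the idea of predicting from pre-pitch information concrete, here is a minimal sketch of a naive frequency baseline that uses only the ball-strike count, which is known before the pitch is thrown. It is not the authors' method. The records are invented toy data rather than real PITCHf/x rows, the function names are hypothetical, and the grouping of pitch codes into "fastball" is an assumption.

```python
from collections import defaultdict

# Hypothetical toy records: (balls, strikes, pitch_type). Not real PITCHf/x data.
TRAIN = [
    (0, 0, "FF"), (0, 0, "FF"), (0, 0, "SL"),
    (3, 0, "FF"), (3, 0, "FF"),
    (0, 2, "CU"), (0, 2, "SL"), (0, 2, "FF"),
]

# Assumed grouping of pitch codes treated as fastballs for the binary task.
FASTBALL_TYPES = {"FF", "FT", "SI", "FC"}


def fit_count_baseline(records):
    """Estimate the fastball rate for each (balls, strikes) count."""
    totals = defaultdict(int)
    fastballs = defaultdict(int)
    for balls, strikes, pitch_type in records:
        key = (balls, strikes)
        totals[key] += 1
        if pitch_type in FASTBALL_TYPES:
            fastballs[key] += 1
    # Overall fastball rate, used as a fallback for counts never seen in training.
    overall = sum(fastballs.values()) / sum(totals.values())
    rates = {key: fastballs[key] / totals[key] for key in totals}
    return rates, overall


def predict_fastball(rates, overall, balls, strikes):
    """Return True (fastball) when the estimated rate is at least 0.5."""
    return rates.get((balls, strikes), overall) >= 0.5


rates, overall = fit_count_baseline(TRAIN)
print(predict_fastball(rates, overall, 3, 0))
print(predict_fastball(rates, overall, 0, 2))
```

A baseline like this only gives a reference point: any classifier trained on a richer set of pre-pitch features should be compared against it before its accuracy is considered an improvement.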

A Dynamic Feature Selection Based LDA Approach to Baseball Pitch Prediction

This work extends pitch classification to pitch prediction (fastball or nonfastball) by restricting the analysis to pre-pitch features, performing significant feature analysis, and introducing a novel approach for feature selection.

Machine Learning Applications in Baseball: A Systematic Literature Review

It is postulated that the recent proliferation of neural networks in general machine learning research will soon carry over into baseball analytics. Two algorithms dominate the literature: Support Vector Machines for classification problems and k-nearest neighbors for both classification and regression problems.

A Survey of Baseball Machine Learning: A Technical Report

It is speculated that the current popularity of neural networks in general machine learning literature will soon carry over into baseball analytics, although relatively few existing articles utilizing this approach were found when compiling the survey.

An Input Support System for Customized Scouting Charts of Baseball Games

An input support system for customized scouting charts of baseball games, developed in Unity and C#, enables users to create pitch combination records easily and improves their readability.

Sport Analytics: Science or Alchemy?

Sport analytics promises to use Big Data and sophisticated statistical methods to identify effective strategies in sports—“the Moneyball moment.” However, much like alchemy, sport analytics is

Can Machine Learning with IMUs Be Used to Detect Different Throws and Estimate Ball Velocity in Team Handball?

An investigation of whether an inertial measurement unit (IMU) and machine learning techniques could be used to detect different types of team handball throws and predict ball velocity found a practical and automated method for quantifying throw counts and classifying the throw and approach types adopted by handball players.

Gender classification of full-body biological motion of aperiodic actions using machine learning

This work proposes gender classification for aperiodic dynamic overarm throwing motion actions in an unrestricted environment using support vector machine, k-nearest neighbor, and decision tree with and without AdaBoost.

Ball 3D Trajectory Reconstruction without Preliminary Temporal and Geometrical Camera Calibration

A method for reconstructing 3D ball trajectories using multiple temporally and geometrically uncalibrated cameras; it first detects the ball and then estimates the temporal difference between cameras.



Keeping the Hitter Off Balance: Mixed Strategies in Baseball

Pitch-level data from Major League Baseball games is used to test whether pitchers mix their pitches optimally. The study finds that pitchers mix optimally for success on the first pitch of the plate appearance, but the null hypothesis of optimal play is rejected for the plate appearance outcome.

Predicting the Next Pitch

A machine-learning based predictor of the next pitch type that incorporates information that is available to a batter such as the count, the current game state, the pitcher’s tendency to throw a particular type of pitch, etc.

Slugging Percentage in Differing Baseball Counts

The objective of this research is to compare average slugging percentages between each of the various types of counts in baseball. Data is collected from 1260 MLB games played between March 20, 2008

Pattern Recognition, Fourth Edition

This edition includes many more worked examples and diagrams to help give greater understanding of the methods and their application, including semi-supervised learning, combining clustering algorithms, and relevance feedback.

An introduction to ROC analysis

Wikipedia glossary of baseball

  • Retrieved July, 2013 from http://en.wikipedia.org/wiki/Glossary_of_baseball.
  • 2013

Major league baseball attendance records

  • Retrieved June 19, 2013 from http://espn.go.com/mlb/attendance/_/year/2012. Pitchf/x (2013). MLB pitch f/x data. Retrieved July, 2013 from http://www.mlb.com.
  • 2012