- Published 2012

Chess was one of the first problems studied by the AI community. While chess-playing programs currently perform very well using primarily search-based algorithms to decide the best move to make, in this project I apply machine-learning algorithms to the problem. Specifically, instead of choosing which move is best, I want to produce a function that estimates the probability that a player will win from a given chess position. Note that while search-based chess engines also produce a score for each position, this score represents the engine's own belief (in a Bayesian sense) that it will win the game given the position, whereas our goal is to estimate the actual probability of a win given human players. The main possible application of such a classifying function would be as a heuristic in an A*-like search-based chess engine. Additionally, the structure of the classifier could offer insights into the nature of the game as a whole.

I have acquired training examples from actual games played by humans. I decided to use the FICS games database, which contains over 100 million games played over the internet over a period of years. This dataset consists of games in PGN (Portable Game Notation) format, which encodes the game as a whole rather than as a sequence of positions. Since the goal of this project is to classify positions, I needed to convert these PGN games to position sequences, and used a Python script to do so. This presented a technical challenge because a sequence of positions is several orders of magnitude larger (in terms of memory consumption) than the PGN-encoded games. By using specialized solvers, such as the stochastic subgradient method, I was able to avoid storing all the positions in memory at once. Since these games are played by humans, and humans have a wide distribution of skill levels and play styles, any results from these data will depend on how the data are filtered.
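To illustrate why a stochastic subgradient method avoids holding every position in memory, here is a minimal sketch that consumes examples one at a time from a generator. The hinge loss, the L2 regularizer, the Pegasos-style step size 1/(λt), and the synthetic four-feature data are all assumptions made for illustration; the source does not specify the loss or the features, and `stream_examples` is a hypothetical stand-in for a generator that yields positions extracted from PGN files.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Example = Tuple[List[float], int]  # (feature vector, label in {-1, +1})


def stream_examples(n: int, seed: int = 0) -> Iterator[Example]:
    """Hypothetical stand-in for a position stream: yields one example at a time.

    The data are synthetic: the label is the sign of the first feature.
    """
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        y = 1 if x[0] > 0 else -1
        yield x, y


def dot(w: List[float], x: List[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def stochastic_subgradient(examples: Iterable[Example], dim: int,
                           lam: float = 0.01) -> List[float]:
    """Minimize lam/2 * ||w||^2 + hinge loss, one example per step.

    Only the weight vector and the current example are held in memory.
    """
    w = [0.0] * dim
    for t, (x, y) in enumerate(examples, start=1):
        eta = 1.0 / (lam * t)              # assumed Pegasos-style step size
        margin = y * dot(w, x)             # computed with the current weights
        w = [(1.0 - eta * lam) * wi for wi in w]   # regularizer subgradient
        if margin < 1.0:                   # hinge loss is active for this example
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w: List[float], x: List[float]) -> int:
    return 1 if dot(w, x) > 0 else -1


if __name__ == "__main__":
    weights = stochastic_subgradient(stream_examples(5000, seed=1), dim=4)
    held_out = list(stream_examples(500, seed=2))
    accuracy = sum(predict(weights, x) == y for x, y in held_out) / len(held_out)
    print("held-out accuracy:", accuracy)
```

Because the training loop touches each example once and then discards it, the memory cost is independent of the number of positions, which is the property the text relies on.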
For this project, I am only pre-filtering these data by excluding (1) fast games in which the amount of time remaining for each player would be a confounding factor for the classifier, (2) games in which either player forfeited on time, (3) games in which either player forfeited due to network disconnection, (4) extremely short games, and (5) games resulting in a draw. This last exclusion is made in order to use a binary classifier for this problem; however, my approach could be extended to include drawn games.

Formally, we can express this as an ML problem as follows: our input x(i) is a legal chess position from a game played by humans, and our label y(i) is the outcome of that game (a win or loss by the player-to-move). We are trying to predict the expected value of the outcome of the game given the position. Note here that, because humans are playing these games, the result of the game is not a mathematical function of the board state. Furthermore, the nature of chess is such that the vast majority of positions encountered by the algorithm will not clearly favor either color, so they will be of little use as training examples for the classifier and will also raise the error rate when the classifier is tested. Because of these factors, it will be impossible for any classifier for this problem to produce a near-zero error rate over this dataset. The approach requires representing the board state as a multidimensional vector of features. After investigating several possible feature sets, I settled on representing the board
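The five exclusion rules and the player-to-move labeling described above can be sketched as follows. This is an illustration under stated assumptions, not the author's script: the `GameRecord` fields, the `termination` values, and both thresholds (`MIN_BASE_SECONDS`, `MIN_PLIES`) are hypothetical, since the source gives no cutoffs. Only the result strings "1-0", "0-1", and "1/2-1/2" come from the PGN standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GameRecord:
    """Hypothetical per-game summary extracted from a PGN file."""
    result: str        # PGN Result tag: "1-0", "0-1", or "1/2-1/2"
    base_seconds: int  # initial clock time per player
    termination: str   # assumed values: "normal", "time", "disconnect"
    num_plies: int     # number of half-moves played


MIN_BASE_SECONDS = 300  # assumed cutoff for "fast" games
MIN_PLIES = 20          # assumed cutoff for "extremely short" games


def keep_game(g: GameRecord) -> bool:
    """Apply the five exclusions: fast, time forfeit, disconnect, short, draw."""
    return (
        g.base_seconds >= MIN_BASE_SECONDS
        and g.termination == "normal"      # drops time and disconnect forfeits
        and g.num_plies >= MIN_PLIES
        and g.result in ("1-0", "0-1")     # drops draws
    )


def labels_for_game(g: GameRecord) -> List[int]:
    """Label the position before each ply: +1 if the player to move went on
    to win the game, -1 otherwise. White moves on even plies (0, 2, ...)."""
    white_won = g.result == "1-0"
    labels = []
    for ply in range(g.num_plies):
        white_to_move = ply % 2 == 0
        labels.append(1 if white_to_move == white_won else -1)
    return labels
```

Note that every position in a decisive game receives a label, so the labels alternate in sign from ply to ply; this is one reason balanced positions early in a game contribute noise rather than signal.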

@inproceedings{Sa2012ClassifyingCP,
title={Classifying Chess Positions},
author={Christopher De Sa},
year={2012}
}