DSLOB: A Synthetic Limit Order Book Dataset for Benchmarking Forecasting Algorithms under Distributional Shift

Defu Cao, Yousef El-Laham, Loc Trinh, Svitlana Vyetrenko, Y. Liu
In electronic trading markets, limit order books (LOBs) provide information about pending buy/sell orders at various price levels for a given security. Recently, there has been a growing interest in using LOB data for resolving downstream machine learning tasks (e.g., forecasting). However, dealing with out-of-distribution (OOD) LOB data is challenging since distributional shifts are unlabeled in current publicly available LOB datasets. Therefore, it is critical to build a synthetic LOB dataset… 

Benchmark dataset for mid-price forecasting of limit order book data with machine learning methods

This paper describes the first publicly available benchmark dataset of high-frequency limit order markets for mid-price prediction, extracting normalized data representations of time series data for five stocks from the NASDAQ Nordic stock market for a time period of ten consecutive days, leading to a dataset of ~4,000,000 time series samples in total.

Get real: realism metrics for robust limit order book market simulations

This paper surveys the literature to collect a set of reference metrics, applies them to real market data and simulation output, and provides a comprehensive catalog of these metrics, including mathematical formulations where appropriate.

AdaRNN: Adaptive Learning and Forecasting of Time Series

This paper proposes Adaptive RNNs (AdaRNN) to tackle the temporal covariate shift (TCS) problem by building an adaptive model that generalizes well to unseen test data, and proposes Temporal Distribution Characterization to better characterize the distribution information in the time series.

Shifts: A Dataset of Real Distributional Shift Across Multiple Large-Scale Tasks

This paper proposes the Shifts Dataset, a standardized large-scale dataset of tasks across a range of modalities affected by distributional shifts, which will enable researchers to meaningfully evaluate the plethora of recently developed uncertainty quantification methods, as well as assessment criteria and state-of-the-art baselines.

A Fine-Grained Analysis on Distribution Shift

This work introduces a framework that enables fine-grained analysis of various distribution shifts and finds that progress has been made over a standard ERM baseline; in particular, pretraining and augmentations offer large gains in many cases.

OoD-Bench: Quantifying and Understanding Two Dimensions of Out-of-Distribution Generalization

  • Nanyang Ye, Kaican Li, Jun Zhu
  • Computer Science
    2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2022
This work first identifies and measures two distinct kinds of distribution shift that are ubiquitous in various datasets, and then compares OoD generalization algorithms across two groups of benchmarks, revealing their strengths on one shift as well as their limitations on the other.

S&P 500 Index Additions and Earnings Expectations

Prior studies of stocks added to the S&P 500 Index report that Index inclusion is associated with a permanent increase in stock price. This result has been interpreted to mean that demand curves for

Optimal execution for portfolio transactions

In my thesis I explore the problem of optimizing trading strategies for complex portfolio transitions. Institutional investors run into this issue during periodic portfolio rebalancing or transition

Deep learning for limit order books

A new neural network architecture for modeling spatial distributions (i.e., distributions on R^d) is developed that is more computationally efficient than a traditional fully connected feedforward architecture and yields a low-dimensional model of price movements deep into the limit order book, allowing more effective use of information from deep in the limit order book.

Price Pressure on the NYSE and NASDAQ: Evidence from S&P 500 Index Changes

Using additions of NYSE- and Nasdaq-listed firms to the S&P 500 between 1989 and 2000, we explore the price effects of noninformation-related demand shocks. After controlling for various firm