Corpus ID: 235166844

Can we imitate stock price behavior to reinforcement learn option price?

  • Xin Jin
  • Published 2021
  • Computer Science, Economics
  • ArXiv
This paper presents a framework for imitating the price behavior of the underlying stock in order to reinforcement learn the option price. We use accessible features of equities pricing data to construct a non-deterministic Markov decision process that models stock price behavior as driven by a principal investor's decision making. However, the low signal-to-noise ratio and instability that appear immanent in equity markets pose challenges for determining the state transition (price change) after executing an…
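To make the abstract's setup concrete, here is a minimal sketch of a non-deterministic Markov decision process over discretized price moves. The states, actions, and transition probabilities are hypothetical placeholders for illustration only; the paper's actual feature construction is not shown in this snippet.

```python
import numpy as np

rng = np.random.default_rng(42)
states = ["down", "flat", "up"]     # discretized price change (illustrative)
actions = ["sell", "hold", "buy"]   # principal investor's decision (illustrative)

# P[a][s] is a probability distribution over next states. The same
# (state, action) pair can lead to different outcomes, which is what
# makes the process non-deterministic. Numbers are made up.
P = {
    "buy":  np.array([[0.2, 0.3, 0.5]] * 3),
    "hold": np.array([[0.3, 0.4, 0.3]] * 3),
    "sell": np.array([[0.5, 0.3, 0.2]] * 3),
}

def step(s_idx, action):
    """Sample the index of the next state given current state and action."""
    return rng.choice(len(states), p=P[action][s_idx])

s = 1  # start in "flat"
for a in ["buy", "buy", "sell"]:
    s = step(s, a)
```

Repeated runs from the same state and action yield different successors, so any learner must estimate the transition distribution rather than a deterministic rule.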

QLBS: Q-Learner in the Black-Scholes (-Merton) Worlds
This paper presents a discrete-time option pricing model rooted in Reinforcement Learning (RL), specifically the well-known Q-Learning method, and suggests that RL may provide efficient data-driven, model-free methods for optimal pricing and hedging of options.
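For readers unfamiliar with the method the QLBS paper builds on, the following is a generic tabular Q-learning update, not the QLBS model itself; the state and action spaces here are arbitrary placeholders, and the transitions are random noise purely to exercise the update rule.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 3
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.99  # learning rate, discount factor

def q_update(Q, s, a, r, s_next):
    """One Q-learning step: move Q[s, a] toward the bootstrapped target
    r + gamma * max_a' Q[s', a']."""
    target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

# Drive the update with synthetic random transitions.
for _ in range(100):
    s = rng.integers(n_states)
    a = rng.integers(n_actions)
    r = rng.normal()
    s_next = rng.integers(n_states)
    Q = q_update(Q, s, a, r, s_next)
```

In QLBS the reward would instead come from hedging-portfolio performance, which is what ties the generic update to option pricing.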
Option pricing when underlying stock returns are discontinuous
The validity of the classic Black-Scholes option pricing formula depends on the ability of investors to follow a dynamic portfolio strategy in the stock that replicates the payoff…
Maximum Entropy Inverse Reinforcement Learning
A probabilistic approach based on the principle of maximum entropy is developed that provides a well-defined, globally normalized distribution over decision sequences while offering the same performance guarantees as existing methods.
Learning Deep Mean Field Games for Modeling Large Population Behavior
This work achieves a synthesis of mean field games (MFG) and Markov decision processes (MDP) by showing that a special MFG is reducible to an MDP, which broadens the scope of mean field game theory and enables inferring MFG models of large real-world systems via deep inverse reinforcement learning.
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization
This work explores how inverse optimal control (IOC) can be used to learn behaviors from demonstrations, with applications to torque control of high-dimensional robotic systems, and develops an efficient sample-based approximation for MaxEnt IOC.
Reinforcement Learning: An Introduction
This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, ranging from the field's intellectual foundations to the most recent developments and applications.
A Closed-Form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options
I use a new technique to derive a closed-form solution for the price of a European call option on an asset with stochastic volatility. The model allows arbitrary correlation between volatility and…
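For context, the stochastic-volatility dynamics underlying this model (the standard Heston specification, stated here from common convention rather than from the snippet above) are:

```latex
dS_t = \mu S_t\,dt + \sqrt{v_t}\,S_t\,dW_t^1, \qquad
dv_t = \kappa(\theta - v_t)\,dt + \sigma\sqrt{v_t}\,dW_t^2, \qquad
dW_t^1\,dW_t^2 = \rho\,dt
```

Here $v_t$ is the instantaneous variance, $\kappa$ the mean-reversion speed, $\theta$ the long-run variance, $\sigma$ the volatility of volatility, and $\rho$ the correlation the abstract refers to.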
State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms
This paper surveys models and algorithms dealing with partially observable Markov decision processes. A partially observable Markov decision process (POMDP) is a generalization of a Markov decision process…
Options, Futures, and Other Derivatives
Contents: Introduction. Futures Markets and the Use of Futures for Hedging. Forward and Futures Prices. Interest Rate Futures. Swaps. Options Markets. Properties of Stock Option Prices. Trading…
Calibrating Multivariate Lévy Processes with Neural Networks
Deep neural networks are used and found to be robust: they can capture sharp transitions in the Lévy density and perform favorably compared to piecewise linear functions and radial basis functions.