InBEDE: Integrating Contextual Bandit with TD Learning for Joint Pricing and Dispatch of Ride-Hailing Platforms

Haipeng Chen, Yan Jiao, Zhiwei Qin, Xiaocheng Tang, Hao Li, Bo An, Hongtu Zhu, Jieping Ye
2019 IEEE International Conference on Data Mining (ICDM)

For both the traditional street-hailing taxi industry and the recently emerged on-line ride-hailing, it has been a major challenge to improve the ride-hailing marketplace efficiency due to spatio-temporal imbalance between the supply and demand, among other factors. Despite the numerous approaches to improve marketplace efficiency using pricing and dispatch strategies, they usually optimize pricing or dispatch separately. In this paper, we show that these two processes are in fact intrinsically… 

Learn to Earn: Enabling Coordination Within a Ride-Hailing Fleet

This work introduces explainability into existing supply-repositioning approaches by establishing the need for coordination between the drivers at specific locations and times, and provides envy-free recommendations, i.e., drivers at the same location and time do not envy one another’s expected future earnings.

Learning Model Predictive Controllers for Real-Time Ride-Hailing Vehicle Relocation and Pricing Decisions

The resulting machine-learning model then serves as the optimization proxy and predicts its optimal solutions, making it possible to use the MPC at higher spatial-temporal fidelity, since the optimizations can be solved and learned offline.

Joint Pricing and Matching for City-Scale Ride-Pooling

This paper creates a framework for batched pricing and matching in which pricing is seen as a meta-level optimization over different possible matching decisions, and develops a variant of the revenue-maximizing auction corresponding to the meta-level optimization problem.

Reinforcement learning for ridesharing: An extended survey

Dynamic pricing or not? — Pricing models of Finnish taxi dispatch centers under the Act on Transport Services

Business, 2020
On 1 July 2018, the Finnish government liberalized the Finnish taxi markets to create possibilities to introduce new technology, digitalization and new business models into the transport sector. Also allowing usage

Integrated Optimization of Bipartite Matching and Its Stochastic Behavior: New Formulation and Approximation Algorithm via Min-cost Flow Optimization

An optimization problem for determining the values of the control variables so as to maximize the expected value of matching weights is formulated and an approximation algorithm via a minimum-cost flow algorithm is constructed that can find 3-approximation solutions rapidly.

Reinforcement Learning for Ridesharing: A Survey

A comprehensive, in-depth survey of the literature on reinforcement learning approaches to ridesharing problems is presented and a number of challenges and opportunities for reinforcement learning research on this important domain are discussed.



Deep Reinforcement Learning with Knowledge Transfer for Online Rides Order Dispatching

This work models the ride dispatching problem as a Markov Decision Process and proposes learning solutions based on deep Q-networks with action search to optimize the dispatching policy for drivers on ride-sharing platforms.

Large-Scale Order Dispatch in On-Demand Ride-Hailing Platforms: A Learning and Planning Approach

A novel order dispatch algorithm in large-scale on-demand ride-hailing platforms that is designed to provide a more efficient way to optimize resource utilization and user experience in a global and more farsighted view is presented.

Dynamic Pricing and Matching in Ride-Hailing Platforms

Ride-hailing platforms such as Uber, Lyft and DiDi have achieved explosive growth and reshaped urban transportation. The theory and technologies behind these platforms have become one of the most

A Deep Value-network Based Approach for Multi-Driver Order Dispatching

This work proposes a deep reinforcement learning-based solution for order dispatching and conducts large-scale online A/B tests on DiDi's ride-dispatching platform to show that the proposed method achieves significant improvement on both total driver income and user-experience-related metrics.

Dynamic pricing and matching in ride‐hailing platforms

This work provides a review of matching and dynamic pricing (DP) techniques in ride‐hailing, shows that they are critical for providing an experience with low waiting time for both riders and drivers, and links the two levers together by studying a pool‐matching mechanism that varies rider waiting and walking before dispatch.

Efficient Ridesharing Order Dispatching with Mean Field Multi-Agent Reinforcement Learning

This paper addresses the order dispatching problem using multi-agent reinforcement learning (MARL), which follows the distributed nature of the peer-to-peer ridesharing problem and possesses the ability to capture the stochastic demand-supply dynamics in large-scale ride-sharing scenarios.

Spatio-Temporal Pricing for Ridesharing Platforms

An empirical analysis conducted in simulation suggests that the Spatio-Temporal Pricing (STP) mechanism can achieve significantly higher social welfare than a myopic pricing mechanism, and an impossibility result is proved: there can be no dominant-strategy mechanism with the same economic properties.

Pricing in Ride-Share Platforms: A Queueing-Theoretic Approach

A queueing-theoretic economic model is built that lets platforms realize the benefits of optimal static pricing, even with imperfect knowledge of system parameters, and shows that dynamic pricing is much more robust than static pricing to fluctuations in system parameters.

Pricing in Ride-Sharing Platforms: A Queueing-Theoretic Approach

This work shows that profit under any dynamic pricing strategy cannot exceed profit under the optimal static pricing policy (i.e., one which is agnostic of stochastic fluctuations in the system load), and explains the apparent paradox.

Optimal Vehicle Dispatching Schemes via Dynamic Pricing

This paper uses a so-called "ironing" technique to convert the problem into an equivalent convex optimization one via a clean Markov decision process (MDP) formulation and defines the optimal solution of the MDP by a primal-dual analysis of a corresponding convex program.