Learning to Optimize Industry-Scale Dynamic Pickup and Delivery Problems

  title={Learning to Optimize Industry-Scale Dynamic Pickup and Delivery Problems},
  author={Xijun Li and Weilin Luo and Mingxuan Yuan and Jun Wang and Jiawen Lu and Jie Wang and Jinhu Lu and Jia Zeng},
  journal={2021 IEEE 37th International Conference on Data Engineering (ICDE)},
  • Xijun Li, Weilin Luo, Jia Zeng
  • Published 1 April 2021
  • Computer Science
  • 2021 IEEE 37th International Conference on Data Engineering (ICDE)
The Dynamic Pickup and Delivery Problem (DPDP) aims to dynamically schedule vehicles among multiple sites so as to minimize cost when delivery orders are not known a priori. Although DPDP plays an important role in modern logistics and supply chain management, state-of-the-art DPDP algorithms are still limited in solution quality and efficiency. In practice, they fail to provide a scalable solution as the numbers of vehicles and sites become large. In this paper, we propose a…

A Hierarchical Reinforcement Learning Based Optimization Framework for Large-scale Dynamic Pickup and Delivery Problems

This paper designs an upper-level agent that dynamically partitions the DPDP into a series of sub-problems of different scales to optimize vehicle routes towards globally better solutions, and proposes a novel hierarchical optimization framework to better solve large-scale DPDPs.

Deep Reinforcement Learning for Demand Driven Services in Logistics and Transportation Systems: A Survey

This survey comprehensively introduces the existing DRL solutions for DDS, highlights common applications and the important decision/control problems within them, and introduces open simulation environments for the development and evaluation of DDS applications.

Introduction to The Dynamic Pickup and Delivery Problem Benchmark - ICAPS 2021 Competition

A new benchmark drawn from real business scenarios is introduced for the ICAPS 2021 Dynamic Pickup and Delivery Problem competition, together with a simulator that supports dynamic evaluation.

Off-line approximate dynamic programming for the vehicle routing problem with a highly variable customer basis and stochastic demands

A Q-learning algorithm called DecQN, which features state-of-the-art acceleration techniques such as Replay Memory and Double Q Network, is developed to solve a stochastic variant of the vehicle routing problem (VRP) arising in the context of domestic donor collection services.

Adaptive Task Planning for Large-Scale Robotized Warehouses

A new task planning problem called TPRW is proposed, which aims to minimize the end-to-end makespan that incorporates the entire item distribution pipeline, known as a fulfilment cycle, and adopts a series of efficient optimizations on both time and memory to handle large-scale item throughput.

Efficient Large-Scale Fleet Management via Multi-Agent Deep Reinforcement Learning

This paper proposes a contextual multi-agent reinforcement learning framework including two concrete algorithms, namely contextual deep Q-learning and contextual multi-agent actor-critic, to achieve explicit coordination among a large number of agents adaptive to different contexts.

Deep Reinforcement Learning with Knowledge Transfer for Online Rides Order Dispatching

This work models the ride dispatching problem as a Markov Decision Process and proposes learning solutions based on deep Q-networks with action search to optimize the dispatching policy for drivers on ride-sharing platforms.

An Exact Algorithm for the Multiple Vehicle Pickup and Delivery Problem

This work develops an alternative optimization approach for the multiple vehicle pickup and delivery problem (MVPDP) that does not require these constraints to be tight, and optimally solves problem instances of up to 5 vehicles and 17 customers on problems without clusters.

Recent Models and Algorithms for One-to-One Pickup and Delivery Problems

This chapter contains two main sections devoted to single vehicle and multi-vehicle problems, respectively; each section is subdivided into two parts, one on exact algorithms and one on heuristics.

A Cooperative Multi-Agent Reinforcement Learning Framework for Resource Balancing in Complex Logistics Network

This paper introduces an innovative cooperative mechanism for state and reward design resulting in more effective and efficient transportation and shows that this approach can give rise to a significant improvement in terms of both performance and stability.

Waiting and Buffering Strategies for the Dynamic Pickup and Delivery Problem with Time Windows

Comparisons of the solution quality provided by these strategies to a more conventional approach were performed on randomly generated instances with static and dynamic travel times and different degrees of dynamism, and the results indicate the advantages of the strategies both in terms of lost requests and number of vehicles.

A Taxi Order Dispatch Model based On Combinatorial Optimization

This paper presents a novel system that attempts to optimally dispatch taxis to serve multiple bookings, along with a method to predict a user's destination once the taxi-booking app is started; both are deployed in online systems and lead to an enhanced user experience.