Neural Approximate Dynamic Programming for the Ultra-fast Delivery Problem
Same-day delivery (SDD) services require high operational efficiency, especially in the two sub-tasks of order assignment (dispatching) and vehicle routing. We consider the ultra-fast delivery variant of SDD, in which tight delivery deadlines make real-time decision-making particularly challenging. We propose a neural approximate dynamic programming (NeurADP) approach for this problem, a hybrid methodology that combines approximate dynamic programming (ADP) and deep reinforcement learning (DRL). This work constitutes the first application of NeurADP beyond its original context, the ride-pooling problem, and demonstrates its versatility in solving real-world, dynamic, discrete optimization problems. NeurADP efficiently integrates neural network-based value function approximations into the integer programming models solved within the ADP algorithm via a two-step decomposition approach, and it performs a more general Bellman update through its connection to DRL. Our extensive numerical experiments demonstrate the benefits of the NeurADP approach: the generated policies outperform various DRL and myopic policies. A detailed sensitivity analysis further confirms the effectiveness of the NeurADP policies under varying operational constraints.
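To make the two-step decomposition idea concrete, the following is a minimal Python sketch of one plausible reading of it, not the implementation used in this work. In step 1, each feasible courier-action pair is scored on its own as immediate reward plus a discounted value estimate; in step 2, a joint assignment is chosen so that no order is served twice. Everything named here is an assumption made for illustration: the function names, the discount factor, the unit reward per order, the linear stand-in for the trained network, and the brute-force enumeration that takes the place of the integer program.

```python
from itertools import product

GAMMA = 0.9  # discount factor (assumed for illustration)


def value_estimate(remaining_capacity):
    """Hypothetical stand-in for a trained neural value function."""
    return 0.5 * remaining_capacity


def score_actions(couriers, candidate_actions):
    """Step 1: score every feasible (courier, action) pair independently."""
    scores = {}
    for courier, capacity in couriers.items():
        for orders in candidate_actions[courier]:
            immediate = float(len(orders))  # assumed: one unit of reward per order
            future = value_estimate(capacity - len(orders))
            scores[(courier, orders)] = immediate + GAMMA * future
    return scores


def assign(couriers, candidate_actions, scores):
    """Step 2: choose one action per courier, serving each order at most once.

    Exhaustive enumeration stands in for the integer program here; it is
    only practical for toy instances.
    """
    names = list(couriers)
    best_total, best_plan = float("-inf"), None
    for combo in product(*(candidate_actions[c] for c in names)):
        served = [order for orders in combo for order in orders]
        if len(served) != len(set(served)):
            continue  # some order is assigned to two couriers: infeasible
        total = sum(scores[(c, a)] for c, a in zip(names, combo))
        if total > best_total:
            best_total, best_plan = total, dict(zip(names, combo))
    return best_plan, best_total


# Toy instance: courier -> free capacity, and each courier's feasible actions
# (tuples of order ids; the empty tuple means "take no new order").
couriers = {"c1": 2, "c2": 1}
candidate_actions = {
    "c1": [(), ("o1",), ("o2",)],
    "c2": [(), ("o2",)],
}

scores = score_actions(couriers, candidate_actions)
plan, total = assign(couriers, candidate_actions, scores)
print(plan, total)
```

In this toy instance both couriers can serve order o2, so the joint step has to resolve the conflict rather than let each courier act greedily. Because the learned values enter only as fixed coefficients computed in step 1, the joint selection problem stays linear in the assignment decisions.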