Team Orienteering Coverage Planning with Uncertain Reward (2021)
Bo Liu, Xuesu Xiao, and Peter Stone
Many municipalities and large organizations have fleets of vehicles that need to be coordinated for tasks such as garbage collection or infrastructure inspection. Motivated by this need, this paper focuses on the common subproblem in which a team of vehicles needs to plan coordinated routes to patrol an area over iterations while minimizing temporally and spatially dependent costs. In particular, at a specific location (e.g., a vertex on a graph), we assume the cost accumulates over time and its growth rate is a random variable with a fixed but unknown mean, and the cost is reset to zero whenever any vehicle visits the vertex (representing the robot ``servicing" the vertex). We formulate this problem in graph terminology and call it Team Orienteering Coverage Planning with Uncertain Reward (TOCPUR). We propose to solve TOCPUR by simultaneously estimating the accumulated cost at every vertex on the graph and solving a novel variant of the Team Orienteering Problem (TOP) iteratively, which we call the Team Orienteering Coverage Problem (TOCP). We provide the first mixed integer programming formulation for the TOCP, as a significant adaptation of the original TOP. We introduce a new benchmark consisting of hundreds of randomly generated graphs for comparing different methods. We show the proposed solution outperforms both the exact TOP solution and a greedy algorithm. In addition, we provide a demo of our method on a team of three physical robots in a real-world environment. The code is publicly available at

Peter Stone Faculty pstone [at] cs utexas edu