AI4TSP

Track 1: Online Supervised Learning (surrogates)


Last updated 3 years ago


Problem: the time-dependent orienteering problem with stochastic weights and time windows (TD-OPSWTW) [1]. Given a single instance, a set of previously tried routes, and the rewards those routes obtained, the goal is to learn a model that predicts the reward of a new route. An optimizer then searches for the route with the highest predicted reward under that model. That route is evaluated, which yields a new data point, and the model is updated with it. This loop repeats for a fixed number of steps.
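
The loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the competition's code or baseline: `toy_reward` is a hypothetical stand-in for the real stochastic evaluator, routes are simplified to full permutations of the nodes (time windows and partial routes are ignored), the surrogate is a plain k-nearest-neighbour average, and the optimizer is random search. All function names and parameters here are assumptions made for the example.

```python
import random


def toy_reward(route, rng):
    # Hypothetical stand-in for the stochastic route evaluator:
    # routes that place node i near position i score higher, plus noise.
    mismatch = sum(abs(pos - node) for pos, node in enumerate(route))
    return -mismatch + rng.gauss(0, 0.1)


def distance(a, b):
    # Number of positions at which two routes differ.
    return sum(x != y for x, y in zip(a, b))


def predict(history, route, k=3):
    # Surrogate model: mean reward of the k most similar evaluated routes.
    nearest = sorted(history, key=lambda item: distance(item[0], route))[:k]
    return sum(reward for _, reward in nearest) / len(nearest)


def propose(history, n_nodes, rng, n_candidates=200):
    # Optimizer: random search for the untried route the surrogate likes best.
    tried = {route for route, _ in history}
    best, best_score = None, float("-inf")
    for _ in range(n_candidates):
        cand = tuple(rng.sample(range(n_nodes), n_nodes))
        if cand in tried:
            continue
        score = predict(history, cand)
        if score > best_score:
            best, best_score = cand, score
    if best is None:
        # Every sampled candidate was already tried; fall back to a random route.
        best = tuple(rng.sample(range(n_nodes), n_nodes))
    return best


def surrogate_loop(n_nodes=6, n_init=5, n_steps=20, seed=0):
    rng = random.Random(seed)
    history = []  # list of (route, observed reward) pairs
    # Previously tried routes and their rewards.
    for _ in range(n_init):
        route = tuple(rng.sample(range(n_nodes), n_nodes))
        history.append((route, toy_reward(route, rng)))
    # Fixed number of iterations: optimize on the model, evaluate, update.
    for _ in range(n_steps):
        route = propose(history, n_nodes, rng)
        reward = toy_reward(route, rng)
        # For this k-NN surrogate, "updating the model" is just storing the point.
        history.append((route, reward))
    return max(history, key=lambda item: item[1])


if __name__ == "__main__":
    best_route, best_reward = surrogate_loop()
    print(best_route, round(best_reward, 2))
```

In a real entry, the surrogate and the optimizer would be replaced by whatever model and search method the participant chooses; the structure of the loop (predict, optimize, evaluate, update) is the part taken from the description above.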

[1] C. Verbeeck, Pieter Vansteenwegen, and E.-H. Aghezzaf. Solving the stochastic time-dependent orienteering problem with time windows. European Journal of Operational Research, 255(3):699–718, 2016.