XML100 Instances for the Capacitated Vehicle Routing Problem

Link: https://openreview.net/forum?id=yHiMXKN6nTl

Instance format: XML100_ABCD_EF

  • A: Depot (1: random, 2: centered, 3: cornered)

  • B: Customers (1: random, 2: clustered, 3: random-clustered)

  • C: Demand (1: unitary, 2: small values with large CV, 3: small values with small CV, 4: large values with large CV, 5: large values with small CV, 6: depending on quadrant, 7: many small values and few large values)

  • D: Route size (1: very short, 2: short, 3: medium, 4: long, 5: very long, 6: ultra long)

EF: instance ID within group ABCD.
Ex: XML100_2216_01 → centered depot, clustered customers, unitary demand, ultra-long routes, instance 01.
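
The naming scheme above can be decoded mechanically. The following is a minimal sketch, not part of the dataset tooling: the function and class names are hypothetical, and reading the number after "XML" as the instance size is an assumption based on the file names shown on this page.

```python
import re
from dataclasses import dataclass

# Code tables copied from the format description above.
DEPOT = {1: "random", 2: "centered", 3: "cornered"}
CUSTOMERS = {1: "random", 2: "clustered", 3: "random-clustered"}
DEMAND = {
    1: "unitary",
    2: "small values with large CV",
    3: "small values with small CV",
    4: "large values with large CV",
    5: "large values with small CV",
    6: "depending on quadrant",
    7: "many small values and few large values",
}
ROUTE_SIZE = {
    1: "very short", 2: "short", 3: "medium",
    4: "long", 5: "very long", 6: "ultra long",
}

# XML<size>_ABCD_EF, with each digit restricted to its documented range.
NAME_RE = re.compile(r"^XML(\d+)_([1-3])([1-3])([1-7])([1-6])_(\d{2})$")


@dataclass(frozen=True)
class InstanceInfo:  # hypothetical helper type
    size: int          # assumed: the number following "XML"
    depot: str
    customers: str
    demand: str
    route_size: str
    instance_id: int


def parse_instance_name(name: str) -> InstanceInfo:
    """Decode a name such as 'XML100_2216_01' into its attributes."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"not a valid XML instance name: {name!r}")
    size, a, b, c, d, ef = (int(g) for g in m.groups())
    return InstanceInfo(size, DEPOT[a], CUSTOMERS[b], DEMAND[c], ROUTE_SIZE[d], ef)


info = parse_instance_name("XML100_2216_01")
print(info)
```

For the example name, this yields a centered depot, clustered customers, unitary demand, ultra long routes, and instance ID 1, matching the description above.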

378 groups = 3×3×7×6 combinations; the first 172 groups have 27 instances, the rest have 26.
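
The counts above can be checked with a few lines of Python. This sketch assumes that "the first 172 groups" refers to lexicographic order of the ABCD code; the page does not state the ordering, so treat that part as illustrative.

```python
from itertools import product

# Every ABCD combination: 3 depot types x 3 customer types x 7 demand types x 6 route sizes.
groups = [
    f"{a}{b}{c}{d}"
    for a, b, c, d in product(range(1, 4), range(1, 4), range(1, 8), range(1, 7))
]

# Assumption: groups are ordered lexicographically by their ABCD code.
instances_per_group = {
    code: 27 if index < 172 else 26 for index, code in enumerate(groups)
}

total_instances = sum(instances_per_group.values())
print(len(groups), total_instances)  # 378 groups, 172*27 + 206*26 = 10000 instances
```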

Ex: python generator.py 200 2 1 4 5 1 0 → XML200_2145_01.vrp
(Run python generator.py for help)
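
When generating many training instances, it can help to build the command line and the expected output name programmatically. The sketch below only assembles the arguments; it does not run anything. The argument order (size, depot, customers, demand, route size, instance ID, then a final value) is inferred from the single example above, and treating that final value as a random seed is an assumption; the generator's own help output is the authoritative reference. The function names and the default script name are hypothetical.

```python
def generator_command(size, depot, customers, demand, route_size,
                      instance_id, seed, script="generator.py"):
    """Return the argv list for one generator call (order inferred from the example)."""
    args = (size, depot, customers, demand, route_size, instance_id, seed)
    return ["python", script, *(str(a) for a in args)]


def expected_filename(size, depot, customers, demand, route_size, instance_id):
    """Return the output name following the XML<size>_ABCD_EF convention."""
    return f"XML{size}_{depot}{customers}{demand}{route_size}_{instance_id:02d}.vrp"


cmd = generator_command(200, 2, 1, 4, 5, 1, 0)
name = expected_filename(200, 2, 1, 4, 5, 1)
print(" ".join(cmd), "->", name)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the argument order has been confirmed against the generator's help text.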

Complete ZIP

Optimal solutions

All 10,000 solutions have been proven optimal. The vast majority of these optimal solutions were found by VRPSolver (Pessoa et al., 2020; vrpsolver.math.u-bordeaux.fr), sometimes using customized parameterizations. Eighteen instances could only be solved using the cluster branching of Marcos P Silva et al. (2024), and seventeen instances were solved by a Branch-and-Cut (BC) algorithm, as in Lysgaard et al. (2004) and Subramanian et al. (2011). Both the VRPSolver and BC approaches used the very good initial bounds provided by HGS-CVRP (Vidal, 2022; github.com/vidalt/HGS-CVRP).

Suggested Experimental Guidelines

1. The instance generator, provided in Python, can be used for generating as many training instances as desired.

2. One of the publicly available state-of-the-art heuristics can be used for generating very good solutions for the training instances (using the known exact methods for solving hundreds of thousands of instances would be too time-consuming).

3. The 10,000-instance dataset can be used for final testing. The potential advantages of following these guidelines are:

• The provided generator is the same one used to create the X instances (Uchoa et al. 2017). It was carefully designed to produce a highly diversified dataset that mimics the features of real-world problems. The X instances have already been used in hundreds of published works and are now the most widely used CVRP benchmark. It is desirable for the community not to go back to more simplistic ways of generating instances.

• It is recommended to test different methods on exactly the same instances, not merely on instances generated in a similar way. This was strongly advised by Johnson (1999) as a way of eliminating a source of noise in comparisons. The testing dataset is large enough to produce statistically significant results and to make overfitting unlikely (the 10,000-instance dataset should not be used for training!).

• Finally, the existence of optimal solution values for the 10,000-instance dataset allows measuring absolute errors, that is, gaps to the true optimum, which is certainly better than measuring relative errors with respect to a reference method.
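
As an illustration of the last two points, the sketch below computes per-instance percentage gaps to the optimal values and compares two methods on exactly the same instances. The function names, the dictionary-based data layout, and the toy numbers are all hypothetical; they are not taken from the dataset.

```python
from statistics import mean


def gap_percent(cost, optimal):
    """Percentage gap of a solution cost with respect to the proven optimum."""
    if optimal <= 0:
        raise ValueError("optimal value must be positive")
    return 100.0 * (cost - optimal) / optimal


def mean_gap(costs, optima):
    """Average gap over all test instances; every instance must have a result."""
    missing = optima.keys() - costs.keys()
    if missing:
        raise KeyError(f"no result for instances: {sorted(missing)}")
    return mean(gap_percent(costs[name], optima[name]) for name in optima)


def paired_gap_differences(costs_a, costs_b, optima):
    """Per-instance gap difference (A minus B), computed on identical instances."""
    return {
        name: gap_percent(costs_a[name], optima[name])
        - gap_percent(costs_b[name], optima[name])
        for name in optima
    }


# Toy values for illustration only (not real dataset results).
optima = {"inst_a": 1000, "inst_b": 2000}
method_a = {"inst_a": 1010, "inst_b": 2000}
method_b = {"inst_a": 1000, "inst_b": 2040}

print(mean_gap(method_a, optima), mean_gap(method_b, optima))
print(paired_gap_differences(method_a, method_b, optima))
```

Because both methods are evaluated on the same instances, the per-instance differences can be fed directly into a paired statistical test of the user's choice.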