Authors: Andrei Lissovoi, Pietro S. Oliveto and John Alasdair Warwicker.
Accepted for the Genetic and Evolutionary Computation Conference.
Selection hyper-heuristics are randomised search methodologies which choose and execute heuristics from a set of low-level heuristics. Recent time complexity analyses for the LeadingOnes benchmark function have shown that the standard simple random, permutation, random gradient, greedy and reinforcement learning selection mechanisms show no effects of learning. The idea behind the learning mechanisms is to continue to exploit the currently selected heuristic as long as it is successful. However, the probability that a promising heuristic is successful in the next step is relatively low when perturbing a reasonable solution to a combinatorial optimisation problem. In this paper we generalise the classical selection-perturbation mechanisms so that success can be measured over a fixed period of length τ, rather than in a single iteration. We present a benchmark function where it is necessary to learn to exploit a particular low-level heuristic, rigorously proving that it makes the difference between an efficient and an inefficient algorithm. For LeadingOnes we prove that the generalised random gradient mechanism approaches optimal performance, while generalised greedy, although not as fast, still outperforms random local search. An experimental analysis shows that combining the two generalised mechanisms leads to even better performance.
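As an illustration of the generalised random gradient mechanism described above, the following Python sketch applies it to LeadingOnes. The abstract does not specify the low-level heuristics, the acceptance rule, or how the period is reset, so the sketch assumes two heuristics that flip exactly one or exactly two distinct bits chosen uniformly at random, accepts only strict improvements, and restarts the period of length τ after each success. All function names and the `max_iters` cap are hypothetical and chosen for illustration only.

```python
import random


def leading_ones(x):
    """Number of consecutive 1-bits at the start of the bit string x."""
    count = 0
    for bit in x:
        if bit != 1:
            break
        count += 1
    return count


def flip_k_bits(x, k, rng):
    """Return a copy of x with exactly k distinct, uniformly chosen bits flipped."""
    y = list(x)
    for i in rng.sample(range(len(x)), k):
        y[i] ^= 1
    return y


def generalised_random_gradient(n, tau, rng, max_iters=10**6):
    """Sketch of a generalised random gradient hyper-heuristic on LeadingOnes.

    Assumptions (not taken from the abstract): the low-level heuristics are
    1-bit and 2-bit flips, only strict improvements are accepted, and a
    success restarts the period of length tau with the same heuristic.
    """
    heuristics = [1, 2]  # number of bits each low-level heuristic flips
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = leading_ones(x)
    iters = 0
    while fx < n and iters < max_iters:
        k = rng.choice(heuristics)  # pick a heuristic uniformly at random
        remaining = tau             # iterations left in the current period
        while remaining > 0 and fx < n and iters < max_iters:
            y = flip_k_bits(x, k, rng)
            fy = leading_ones(y)
            iters += 1
            remaining -= 1
            if fy > fx:
                x, fx = y, fy
                remaining = tau     # success: keep exploiting this heuristic
    return x, fx, iters


if __name__ == "__main__":
    rng = random.Random(0)
    x, fx, iters = generalised_random_gradient(n=50, tau=50, rng=rng)
    print(f"fitness {fx} after {iters} fitness evaluations")
```

Setting `tau=1` in this sketch recovers the single-iteration notion of success used by the classical mechanism, so comparing runs with different values of `tau` gives a rough empirical feel for the effect of the learning period.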