Carleton University
Technical Report TR-01-04
July 2001

A Formal Analysis of Why Heuristics Work

B. John Oommen & Luis G. Rueda

Abstract

Many optimization problems in computer science have been proven to be NP-hard, and hence there are no general polynomial-time algorithms to solve them. Alternatively, they are solved using heuristics, providing a sub-optimal solution that, hopefully, is arbitrarily close to the optimal one. Such problems are found in a wide range of applications, including articial intelligence, game theory, graph partitioning, database query optimization, etc. Given two heuristics, the question of determining which is superior, has typically demanded a yes/no answer which is often substantiated based on empirical evidence. We have solved the problem of deciding on the superior heuristic by using Pattern Classication Techniques (PCT). We prove the following assertion: Given two heuristics H1 and H2 used in determining the goal of a particular problem, if the accuracy in obtaining the optimal solution by H1 is greater than that of H2, then H1 has a higher probability of leading to the optimal solution than H2. To the best of our knowledge, this is an open problem; this unproven conjecture has been the basis for designing numerous algorithms such as the A* algorithm, and its variants. By formulating the problem from a Pattern Recognition perspective, we use PCT to present a mathematical, rigorous proof of this fact, and show some uniqueness results. The corresponding database query optimization problem has been open for at least two decades, the diculty of which has partially been due to the hurdles involved in the formulation itself. To validate our proofs, we report empirical results on database query optimization techniques involving a few well-known histogram estimation methods.

TR-01-04.pdf