Carleton University
Technical Report TR-99-05
February 1999
The Case for the Rectangular Attribute Cardinality Map in Query Optimization: Modeling, Prototype Validation and Testing*
Abstract
Current business database systems utilize histograms to approximate frequency distributions of attribute values of relations. These are used to efficiently estimate query result sizes and access plan costs and thus minimize the query response time for business (and non-commercial) database systems. In a recent work [12] we proposed a new form of histogram-like technique called the Rectangular Attribute Cardinality Map (RACM) that gives much smaller estimation errors than the traditional equi-width and equi-depth histograms currently being used by many commercial database systems. We also provided a fairly extensive mathematical analysis for its average and worst case errors for its frequency estimates which was verified for synthetic data.
This paper demonstrates the earlier claim that the R-ACM is indeed a viable tool for query optimization. It, first of all, presents the R-ACM model and reports a prototype validation for the R-ACM for query optimization in real-world database systems. By investigating the performance of the scheme on an extensive set of experiments using real-life data [l, 2), we demonstrate that the R-AC!\I scheme is much more accurate than the traditional histograms for query result size estimation. We anticipate that, due to its high accuracy and low construction costs, it could become an invaluable tool for query optimization in the future database systems.