Abstract
We consider the index selection problem. Given either a fixed query workload or an unknown probability distribution on possible future queries, and a bound B on how much space is available to build indices, we seek to build a collection of indices for which the average query response time is minimized. We give strong negative and positive peformance bounds. Let m be the number of queries in the workload. We show how to obtain with high probability a collection of indices using space O(B ln m) for which the average query cost is optB the optimal performance possible for indices using at most B total space. Moreover, this space relaxation is necessary: unless N P ⊆ no(log log n), no polynomial time algorithm can guarantee average query cost less than M1-∈ optB using space αB, for any constant α, where M is the size of the dataset. We quantify the error in performance introduced by running the algorithm on a sample drawn from a query distribution.
Original language | English (US) |
---|---|
Pages | 244-251 |
Number of pages | 8 |
DOIs | |
State | Published - 2003 |
Externally published | Yes |
Event | Twenty second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2003 - San Diego, CA, United States Duration: Jun 9 2003 → Jun 11 2003 |
Other
Other | Twenty second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2003 |
---|---|
Country/Territory | United States |
City | San Diego, CA |
Period | 6/9/03 → 6/11/03 |
ASJC Scopus subject areas
- Software
- Information Systems
- Hardware and Architecture