Abstract
We consider the index selection problem. Given either a fixed query workload or an unknown probability distribution on possible future queries, and a bound B on how much space is available to build indices, we seek to build a collection of indices for which the average query response time is minimized. We give strong negative and positive performance bounds. Let m be the number of queries in the workload. We show how to obtain, with high probability, a collection of indices using space O(B ln m) for which the average query cost is opt_B, the optimal performance possible for indices using at most B total space. Moreover, this space relaxation is necessary: unless NP ⊆ n^{o(log log n)}, no polynomial-time algorithm can guarantee average query cost less than M^{1-ε} · opt_B using space αB, for any constant α, where M is the size of the dataset. We quantify the error in performance introduced by running the algorithm on a sample drawn from a query distribution.
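To make the objective concrete, the following is a minimal sketch of what opt_B measures on a toy instance. It is not the algorithm from the paper: it is an exhaustive search, exponential in the number of candidate indices, and the cost model is an assumption made for illustration (each query has a fixed cost under each index that can serve it, falls back to a hypothetical `scan_cost` otherwise, and is answered by the cheapest available option). The names `average_cost`, `opt_b`, `SIZES`, and `QUERIES` are likewise hypothetical.

```python
from itertools import combinations

# Hypothetical toy instance: index name -> space it occupies.
SIZES = {"a": 2, "b": 2, "c": 3}

# Each query maps the indices that can serve it to its cost under that index.
QUERIES = [{"a": 1}, {"b": 1}, {"c": 1, "a": 4}]


def average_cost(chosen, queries, scan_cost):
    """Average cost over the workload; each query uses its cheapest option."""
    total = 0.0
    for q in queries:
        options = [scan_cost] + [q[i] for i in chosen if i in q]
        total += min(options)
    return total / len(queries)


def opt_b(sizes, queries, scan_cost, budget):
    """Brute-force the best index set whose total size is at most `budget`.

    Returns (best average cost, tuple of chosen index names).
    """
    best_cost, best_set = average_cost((), queries, scan_cost), ()
    names = sorted(sizes)
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            if sum(sizes[i] for i in subset) > budget:
                continue  # violates the space bound B
            cost = average_cost(subset, queries, scan_cost)
            if cost < best_cost:
                best_cost, best_set = cost, subset
    return best_cost, best_set


if __name__ == "__main__":
    print(opt_b(SIZES, QUERIES, scan_cost=10, budget=4))
    print(opt_b(SIZES, QUERIES, scan_cost=10, budget=7))
```

On this instance, relaxing the budget from 4 to 7 lets all three indices fit and lowers the best achievable average cost, which is the kind of space-for-cost trade-off the O(B ln m) relaxation in the abstract refers to.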
| Original language | English (US) |
|---|---|
| Pages | 244-251 |
| Number of pages | 8 |
| State | Published - 2003 |
| Externally published | Yes |
| Event | Twenty second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2003 - San Diego, CA, United States; Duration: Jun 9 2003 → Jun 11 2003 |
Other
| Other | Twenty second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2003 |
|---|---|
| Country/Territory | United States |
| City | San Diego, CA |
| Period | 6/9/03 → 6/11/03 |
ASJC Scopus subject areas
- Software
- Information Systems
- Hardware and Architecture