Efficient bulk-loading of gridfiles

Scott T. Leutenegger, David M. Nicol

Research output: Contribution to journal, Article, peer-review

Abstract

This paper considers the problem of bulk-loading large data sets for the gridfile multiattribute indexing technique. We propose a rectilinear partitioning algorithm that heuristically seeks to minimize the size of the gridfile needed to ensure that no bucket overflows. Empirical studies on both synthetic data sets and data sets drawn from computational fluid dynamics applications demonstrate that our algorithm is very efficient and can handle large data sets. In addition, we present an algorithm for bulk-loading data sets too large to fit in main memory. By sorting the entire data set, this algorithm creates a gridfile without incurring any overflows.
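To make the "no bucket overflows" constraint concrete, here is a minimal Python sketch for 2-D points. It is not the authors' heuristic: it is a naive stand-in that places equal-count cuts in each dimension and grows a k-by-k grid until every cell fits within a bucket. The names `quantile_cuts`, `max_bucket_load`, `naive_grid`, and the `capacity` parameter are all hypothetical, introduced only for illustration.

```python
from bisect import bisect_right
from collections import Counter


def quantile_cuts(values, parts):
    """Return parts-1 cut points splitting the values into roughly equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // parts] for i in range(1, parts)]


def max_bucket_load(points, x_cuts, y_cuts):
    """Largest number of points that fall into any single cell of the grid."""
    loads = Counter(
        (bisect_right(x_cuts, x), bisect_right(y_cuts, y)) for x, y in points
    )
    return max(loads.values(), default=0)


def naive_grid(points, capacity):
    """Grow a k-by-k equal-count grid until no cell exceeds capacity.

    Assumption: this brute-force loop only illustrates the overflow
    constraint; it is NOT the partitioning heuristic from the paper.
    """
    if not points:
        return [], []
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    for k in range(1, len(points) + 1):
        x_cuts = quantile_cuts(xs, k)
        y_cuts = quantile_cuts(ys, k)
        if max_bucket_load(points, x_cuts, y_cuts) <= capacity:
            return x_cuts, y_cuts
    raise ValueError("no overflow-free grid found (too many duplicate points?)")


if __name__ == "__main__":
    pts = [(0, 0), (1, 1), (2, 2), (3, 3)]
    print(naive_grid(pts, capacity=2))
```

On diagonal data such as the sample points, this naive approach needs many cuts, because most cells of a k-by-k grid stay empty. Skewed data of that kind is the case where a heuristic that chooses cut positions more carefully could yield a smaller grid.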

Original language: English (US)
Pages (from-to): 410-420
Number of pages: 11
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 9
Issue number: 3
State: Published - 1997
Externally published: Yes

Keywords

  • Bulk loading
  • Databases
  • Dynamic programming
  • Gridfile
  • Multidimensional indexing
  • Rectilinear partitioning

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics
