TY - GEN
T1 - Grip
T2 - 28th ACM International Conference on Information and Knowledge Management, CIKM 2019
AU - Zhang, Minjia
AU - He, Yuxiong
N1 - We thank Junhua Wang, Jason Li, Shi Zhang, and Han Zhang from Search & AI and DLVS for their support of GRIP. We thank David Andersen, Conglong Li, Olatunji Ruwase, Samyam Rajbhandari, Wenhan Wang, Qi Chen, and Mingqin Li for their helpful discussions. We thank our anonymous reviewers for their valuable feedback that helps improve the quality of the work.
PY - 2019/11/3
Y1 - 2019/11/3
N2 - This paper presents GRIP, an approximate nearest neighbor (ANN) search algorithm for building vector search engine which makes heavy use of the algorithm. GRIP is designed to retrieve documents at large-scale based on their semantic meanings in a scalable way. It is both fast and capacity-optimized. GRIP combines new algorithmic and system techniques to collaboratively optimize the use of memory, storage, and computation. The contributions include: (1) The first hybrid memory-storage ANN algorithm that allows ANN to benefit from both DRAM and SSDs simultaneously; (2) The design of a highly optimized indexing scheme that provides both memory-efficiency and high performance; (3) A cost analysis and a cost function for evaluating the capacity improvements of ANN algorithms. GRIP achieves an order of magnitude improvements on overall system efficiency, significantly reducing the cost of vector search, while attaining equal or higher accuracy, compared with the state-of-the-art.
AB - This paper presents GRIP, an approximate nearest neighbor (ANN) search algorithm for building vector search engine which makes heavy use of the algorithm. GRIP is designed to retrieve documents at large-scale based on their semantic meanings in a scalable way. It is both fast and capacity-optimized. GRIP combines new algorithmic and system techniques to collaboratively optimize the use of memory, storage, and computation. The contributions include: (1) The first hybrid memory-storage ANN algorithm that allows ANN to benefit from both DRAM and SSDs simultaneously; (2) The design of a highly optimized indexing scheme that provides both memory-efficiency and high performance; (3) A cost analysis and a cost function for evaluating the capacity improvements of ANN algorithms. GRIP achieves an order of magnitude improvements on overall system efficiency, significantly reducing the cost of vector search, while attaining equal or higher accuracy, compared with the state-of-the-art.
KW - Approximate nearest neighbor search
KW - Information retrieval
KW - SSD
UR - https://www.scopus.com/pages/publications/85075440293
UR - https://www.scopus.com/pages/publications/85075440293#tab=citedBy
U2 - 10.1145/3357384.3357938
DO - 10.1145/3357384.3357938
M3 - Conference contribution
AN - SCOPUS:85075440293
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 1673
EP - 1682
BT - CIKM 2019 - Proceedings of the 28th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
Y2 - 3 November 2019 through 7 November 2019
ER -