Sungsoo Kim's Blog

Awesome Vector Database

11 November 2023


Awesome Vector Database

A curated list of awesome works related to high-dimensional structure/vector search and databases

Services

Libraries & Engines

Multidimensional data / Vectors

Texts

Others

  • SimSIMD: Efficient Alternative to scipy.spatial.distance and numpy.inner

Benchmarks & Databases

📚 Books

Conferences & Workshops

Publications

Survey

  • Pan, James Jie, Jianguo Wang, and Guoliang Li. “Survey of Vector Database Management Systems.” arXiv preprint arXiv:2310.14021 (2023). [Paper]
  • Nearest neighbor search: the old, the new, and the impossible. Andoni, Alexandr. [Paper]

Quantization

Source: A survey of product quantization.

  • PQ: Product quantization for nearest neighbor search. Jegou, Herve, Matthijs Douze, and Cordelia Schmid. [Paper, Code, Julia Code, nanopq]
  • k-selection on GPU: Billion-scale similarity search with gpus. Johnson, Jeff, Matthijs Douze, and Hervé Jégou [Paper, Code]
  • A survey of product quantization. Matsui, Yusuke, Yusuke Uchida, Hervé Jégou, and Shin’ichi Satoh [Paper]
  • OPQ: Optimized Product Quantization. Ge, Tiezheng, Kaiming He, Qifa Ke, and Jian Sun [Homepage, Paper, Code, nanopq]
  • Quicker adc: Unlocking the hidden potential of product quantization with simd. André, Fabien, Anne-Marie Kermarrec, and Nicolas Le Scouarnec [Paper, Code]
    • Accelerated nearest neighbor search with quick adc. André, Fabien, Anne-Marie Kermarrec, and Nicolas Le Scouarnec [Paper].
    • Cache locality is not enough: High-performance nearest neighbor search with product quantization fast scan. Fabien André, Anne-Marie Kermarrec, Nicolas Le Scouarnec [Paper]
  • ScaNN: Accelerating Large-Scale Inference with Anisotropic Vector Quantization. Guo, Ruiqi, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar [Paper, Python/C++ Inference, Julia Training/Inference]
  • The inverted multi-index. Babenko, Artem, and Victor Lempitsky [Paper, Code]
  • Are We There Yet? Product Quantization and its Hardware Acceleration. Fernandez-Marques, Javier, Ahmed F. AbouElhamayed, Nicholas D. Lane, and Mohamed S. Abdelfattah. [Paper]
  • LibVQ: A Toolkit for Optimizing Vector Quantization and Efficient Neural Retrieval. Li, Chaofan, Zheng Liu, Shitao Xiao, Yingxia Shao, Defu Lian, and Zhao Cao. [Paper, Code]

Graph-based Methods

  • A comprehensive survey and experimental comparison of graph-based approximate nearest neighbor search. Wang, Mengzhao, Xiaoliang Xu, Qiang Yue, and Yuxiang Wang. [Paper, Code]
  • HNSW: Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. Malkov, Yu A., and Dmitry A. Yashunin. [Paper, Code, Rust Version]
  • Scaling Graph-Based ANNS Algorithms to Billion-Size Datasets: A Comparative Analysis. Dobson, Magdalen, Zheqi Shen, Guy E. Blelloch, Laxman Dhulipala, Yan Gu, Harsha Vardhan Simhadri, and Yihan Sun. [Paper]
  • FINGER: Fast Inference for Graph-based Approximate Nearest Neighbor Search. Chen, Patrick, Wei-Cheng Chang, Jyun-Yu Jiang, Hsiang-Fu Yu, Inderjit Dhillon, and Cho-Jui Hsieh [Paper, Video]
  • NSG: Navigating Spread-out Graph For Approximate Nearest Neighbor Search. Fu, Cong, Chao Xiang, Changxu Wang, and Deng Cai. [Paper, Code]
  • EFANNA: Extremely Fast Approximate Nearest Neighbor Search Algorithm Based on kNN Graph. Cong Fu, Deng Cai. [Paper, Code]

Hashing

  • Awesome Papers on Learning to Hash
  • A survey on learning to hash. Wang, Jingdong, Ting Zhang, Nicu Sebe, and Heng Tao Shen [Paper]
  • A survey on deep hashing methods. Luo, Xiao, Haixin Wang, Daqing Wu, Chong Chen, Minghua Deng, Jianqiang Huang, and Xian-Sheng Hua. [Paper]
  • Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval. Gong, Yunchao, Svetlana Lazebnik, Albert Gordo, and Florent Perronnin [Paper, Python code, Matlab code]

Evaluation & Metrics

  • Which BM25 do you mean? A large-scale reproducibility study of scoring variants. Kamphuis, Chris, Arjen P. de Vries, Leonid Boytsov, and Jimmy Lin [Paper]

📰 Articles & Talks

