Embedding | Data & ETL | Free

FAISS

Facebook AI Similarity Search: a fast library for efficient similarity search and clustering of dense vectors.

Installs: 130k
Rating: ★ 4.7
Reviews: 43
Source repository

FAISS (Facebook AI Similarity Search) is a widely used C++ library, with Python bindings, for nearest-neighbor search over high-dimensional embedding vectors. It commonly serves as the vector retrieval layer in RAG systems, recommendation engines, and semantic search applications.

Key Features

  • Multiple index types — Flat (exact), IVF (inverted file), HNSW, PQ (product quantization), and hybrids
  • GPU support — single and multi-GPU acceleration via CUDA for billion-scale search
  • Compression — Product Quantization and Scalar Quantization reduce memory by 4–64×
  • Billion-scale — benchmarked on datasets with 1B+ vectors
  • Python & C++ APIs — use from Python for prototyping, C++ for production integration
  • LangChain / LlamaIndex integration — FAISS vector store is a first-class integration in both frameworks
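
As a rough check on the compression figures above, the following sketch works out bytes per vector for a few encodings. It is plain arithmetic, not a FAISS call: the helper name bytes_per_vector and the settings (d = 128, 8-bit scalar quantization, PQ with 8 or 32 sub-quantizers at 8 bits each) are assumptions chosen for illustration, and codebook and index overhead are ignored.

```python
def bytes_per_vector(d: int, encoding: str, m: int = 8, nbits: int = 8) -> int:
    """Storage for one encoded vector, ignoring codebooks and index overhead.

    Hypothetical helper for illustration only; not part of the FAISS API.
    """
    if encoding == "float32":
        return 4 * d                   # 4 bytes per dimension
    if encoding == "sq8":
        return d                       # 8-bit scalar quantization: 1 byte per dimension
    if encoding == "pq":
        return (m * nbits + 7) // 8    # m sub-quantizer codes of nbits each
    raise ValueError(f"unknown encoding: {encoding}")


d = 128
raw = bytes_per_vector(d, "float32")
sizes = {
    "float32 (uncompressed)": raw,
    "scalar quantization, 8-bit": bytes_per_vector(d, "sq8"),
    "PQ, m=32, 8 bits": bytes_per_vector(d, "pq", m=32),
    "PQ, m=8, 8 bits": bytes_per_vector(d, "pq", m=8),
}
for label, size in sizes.items():
    print(f"{label}: {size} bytes/vector, {raw // size}x smaller than float32")
```

Under these assumptions, 8-bit scalar quantization gives about 4× and PQ with 8 one-byte codes gives about 64×, which matches the two ends of the range quoted above. Stronger compression generally costs recall, so the right setting depends on the dataset.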

Quick Start

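pip install faiss-cpu  # CPU build; GPU (CUDA) packages are distributed separately

import faiss
import numpy as np

d = 128     # vector dimension
nb = 10000  # database size

# FAISS expects float32 arrays of shape (n, d)
xb = np.random.random((nb, d)).astype("float32")
index = faiss.IndexFlatL2(d)  # exact (brute-force) L2 index; no training step
index.add(xb)

xq = np.random.random((5, d)).astype("float32")  # 5 query vectors
# D: squared L2 distances, I: row indices into xb; both have shape (5, 5)
D, I = index.search(xq, k=5)  # 5 nearest neighbors per query
print(I)  # indices of nearest neighbors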

Install via ai-supply

npx ai-supply add faiss-vector-search

This listing is a curated mirror of the open-source FAISS project (MIT license). To install the upstream version, use the source repository.
