Name: DuckDB
Availability: InStock
Author: ai-supply

DuckDB

DuckDB is a fast, in-process analytical SQL database. It runs embedded inside your application (Python, R, Java, Node.js, Rust, C++) with no separate server process, making it ideal for local analytics, data science workflows, and ETL pipelines.

Key Features

In-process execution: Embedded OLAP engine that queries Parquet, CSV, JSON, and Arrow data in place, with no separate import step
Columnar-vectorized engine: Extremely fast aggregations and scans even on large files
SQL completeness: Window functions, CTEs, PIVOT, ASOF joins, and broad coverage of standard SQL
Zero-copy Arrow integration: Hand off DuckDB results to PyArrow, Polars, or Pandas through Arrow, avoiding copies where the target library supports it
Persistent or in-memory: Use as a file-based database or pure in-memory for ephemeral pipelines
Extensions: HTTP/S3 reader, JSON, spatial (GEOMETRY), Iceberg, Delta Lake, and more

Quick Start

pip install duckdb pandas  # pandas is needed for the .df() call below

import duckdb

# Query every Parquet file matching the glob directly, with no separate loading step
result = duckdb.sql("""
    SELECT category, COUNT(*) as n, AVG(price) as avg_price
    FROM 'data/*.parquet'
    GROUP BY category
    ORDER BY n DESC
    LIMIT 10
""").df()

Add to ai-supply

npx ai-supply add duckdb-analytics-engine

Curated mirror of the open-source DuckDB (MIT). Get it from the source.
