Pipeline · Data & ETL · Free
dlt (data load tool)
Open-source Python library for building self-maintaining, schema-evolving data pipelines with minimal code.
Installs: 165k
Rating: ★ 4.6
Reviews: 55
dlt — data load tool
dlt is an open-source Python library for loading data from a wide range of sources into a wide range of destinations with little code, in a form intended to hold up in production. It handles schema inference and evolution, normalization of nested data, and incremental loading automatically, and it includes built-in secrets management.
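To make "schema inference and evolution" concrete, here is a minimal plain-Python sketch of the idea. It is an illustration only and not dlt's implementation: the helper names `infer_type` and `evolve_schema` and the type labels are assumptions made up for this example.

```python
from typing import Any, Dict, List


def infer_type(value: Any) -> str:
    """Map a Python value to a coarse column type name (illustrative labels)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "bool"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    return "text"


def evolve_schema(schema: Dict[str, str], rows: List[Dict[str, Any]]) -> List[str]:
    """Add a column for every unseen key and return the newly added names."""
    added = []
    for row in rows:
        for key, value in row.items():
            if key not in schema and value is not None:
                schema[key] = infer_type(value)
                added.append(key)
    return added


schema: Dict[str, str] = {}
# First batch: the schema is inferred from scratch.
first = evolve_schema(schema, [{"id": 1, "name": "alice"}])
# Second batch: the source gained a "score" field, so the schema grows.
second = evolve_schema(schema, [{"id": 2, "name": "bob", "score": 9.5}])
print(first)   # ['id', 'name']
print(second)  # ['score']
```

The point of the sketch is that no schema is declared up front: columns appear as the data introduces them, and existing columns are left untouched.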
Key Features
- Zero-schema setup: dlt infers schema from your data automatically and evolves it as your source changes
- Nested normalization: JSON arrays and objects are flattened into normalized relational tables
- Incremental loading: Built-in cursor-based loading with append and merge strategies, with state tracked between runs
- 100+ verified sources: REST APIs, databases, SaaS tools, files, and cloud storage via the dlt Hub
- Destination-agnostic: DuckDB, BigQuery, Snowflake, Redshift, Postgres, Delta Lake, and more
- Secrets management: Native integration with .env, Vault, AWS Secrets Manager, and GCP Secret Manager
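The nested normalization feature listed above can be pictured with a small plain-Python sketch. It is a simplified illustration, not dlt's code: the `flatten` and `normalize` helpers and the `_row_id`, `_parent_id`, and `_list_idx` column names are invented for this example, and dlt uses its own key columns and naming rules.

```python
from itertools import count

_ids = count(1)  # hypothetical surrogate key generator


def flatten(record, prefix=""):
    """Split a record into scalar columns and list fields.

    Nested dicts become columns joined with a double underscore.
    """
    columns, lists = {}, {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            sub_columns, sub_lists = flatten(value, prefix=f"{name}__")
            columns.update(sub_columns)
            lists.update(sub_lists)
        elif isinstance(value, list):
            lists[name] = value
        else:
            columns[name] = value
    return columns, lists


def normalize(table, record, tables, parent_id=None, list_idx=None):
    """Write one row to `table` and one child row per list item."""
    columns, lists = flatten(record)
    row_id = next(_ids)
    row = {"_row_id": row_id, **columns}
    if parent_id is not None:
        row["_parent_id"] = parent_id  # link back to the parent row
        row["_list_idx"] = list_idx    # position in the original list
    tables.setdefault(table, []).append(row)
    for name, items in lists.items():
        for idx, item in enumerate(items):
            # Wrap bare scalars so every child row is a dict
            child = item if isinstance(item, dict) else {"value": item}
            normalize(f"{table}__{name}", child, tables, row_id, idx)


tables = {}
event = {"id": 7, "actor": {"login": "octocat"}, "labels": ["bug", "ui"]}
normalize("events", event, tables)
for name, rows in tables.items():
    print(name, rows)
```

Here the nested `actor` object ends up as an `actor__login` column on the `events` table, while the `labels` array becomes a separate `events__labels` table whose rows point back to their parent.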
Quick Start
pip install "dlt[duckdb]"
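import dlt
import requests


@dlt.resource
def github_events():
    # Fetch the most recent public events from the GitHub API
    response = requests.get("https://api.github.com/events", timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    yield from response.json()


# Load into a local DuckDB database, under the "events" dataset
pipeline = dlt.pipeline(
    pipeline_name="github",
    destination="duckdb",
    dataset_name="events",
)
pipeline.run(github_events())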
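As written, the quick start requests the full events page on every run. The cursor-based incremental loading mentioned under Key Features avoids reprocessing old records by remembering the highest cursor value seen so far. The following plain-Python sketch shows that idea only; it does not use dlt's API, and `load_incremental`, the `updated_at` field, and the JSON state file are assumptions chosen for illustration.

```python
import json
import tempfile
from pathlib import Path


def load_incremental(rows, state_path, cursor_field="updated_at"):
    """Return rows newer than the stored cursor and persist the new cursor."""
    state = json.loads(state_path.read_text()) if state_path.exists() else {}
    last_value = state.get("last_value")
    new_rows = [
        row for row in rows
        if last_value is None or row[cursor_field] > last_value
    ]
    if new_rows:
        # Advance the cursor to the newest value seen in this run
        state["last_value"] = max(row[cursor_field] for row in new_rows)
        state_path.write_text(json.dumps(state))
    return new_rows


# ISO 8601 timestamps in one fixed format compare correctly as strings
source = [
    {"id": 1, "updated_at": "2024-01-01T00:00:00Z"},
    {"id": 2, "updated_at": "2024-01-02T00:00:00Z"},
]
state_file = Path(tempfile.mkdtemp()) / "state.json"

first_run = load_incremental(source, state_file)   # no state yet: all rows
source.append({"id": 3, "updated_at": "2024-01-03T00:00:00Z"})
second_run = load_incremental(source, state_file)  # only the new row
print(len(first_run), [row["id"] for row in second_run])  # 2 [3]
```

Keeping the cursor in persisted state rather than in memory is what lets a later run pick up where the previous one stopped.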
Add to ai-supply
npx ai-supply add dlt-data-load-tool
Curated mirror of the open-source dlt (Apache-2.0). Get it from the source.