Pipeline Orchestration · Free
Dagster
Asset-oriented data orchestration platform for building, testing, and monitoring data pipelines in production.
Installs: 280k
Rating: ★ 4.7
Reviews: 93
Dagster is an orchestration platform for the development, production, and observation of data assets. It lets you define data pipelines as code with a focus on software-defined assets — the data outputs your pipelines produce.
Key Features
- Software-defined assets: Declare data assets and their dependencies; Dagster determines execution order
- Integrated lineage: Automatically track which assets depend on which, with a visual DAG UI
- Type system: Annotate assets with types and get runtime validation for free
- Partitioned assets: Native support for backfills and incremental processing by partition
- Sensors & schedules: Event-driven triggers and cron-based scheduling
- First-class testing: Unit-test individual ops and assets without running full pipelines
- Ecosystem integrations: dbt, Spark, Snowflake, Databricks, Fivetran, and 100+ others
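To make the first feature concrete, here is a toy sketch of the idea behind "Dagster determines execution order": each function names its upstream dependencies as parameters, and a topological sort yields a valid run order. This is a standard-library illustration only, not Dagster's implementation; the helpers `build_graph`, `execution_order`, and `run_all` are hypothetical names invented for this example.

```python
import inspect
from graphlib import TopologicalSorter


def build_graph(funcs):
    """Map each function name to the upstream functions named by its parameters."""
    by_name = {f.__name__: f for f in funcs}
    graph = {
        name: [p for p in inspect.signature(f).parameters if p in by_name]
        for name, f in by_name.items()
    }
    return by_name, graph


def execution_order(funcs):
    """Return function names ordered so that dependencies come first."""
    _, graph = build_graph(funcs)
    return list(TopologicalSorter(graph).static_order())


def run_all(funcs):
    """Run every function once, passing upstream results in by parameter name."""
    by_name, graph = build_graph(funcs)
    results = {}
    for name in TopologicalSorter(graph).static_order():
        kwargs = {dep: results[dep] for dep in graph[name]}
        results[name] = by_name[name](**kwargs)
    return results


# Plain functions standing in for assets (no Dagster decorator here).
def raw_data():
    return [1, 2, 3, 4, 5]


def processed_data(raw_data):
    return [x * 2 for x in raw_data]


# Functions are passed out of order on purpose; the sort fixes the order.
order = execution_order([processed_data, raw_data])
results = run_all([processed_data, raw_data])
print(order)
print(results["processed_data"])
```

In Dagster itself the `@asset` decorator plays the role of this wiring, and the resulting graph is what the lineage UI displays.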
Quick Start
pip install dagster dagster-webserver
dagster dev
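from dagster import Definitions, asset

@asset
def raw_data():
    # Upstream asset: a small in-memory list.
    return [1, 2, 3, 4, 5]

@asset
def processed_data(raw_data):
    # Depends on raw_data via the parameter name; doubles each value.
    return [x * 2 for x in raw_data]

# Register both assets so Dagster tooling can load them.
defs = Definitions(assets=[raw_data, processed_data])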
Add to ai-supply
npx ai-supply add dagster-data-orchestrator
Curated mirror of the open-source Dagster (Apache-2.0). Get it from the source.