Ethereum ETL — On-Chain Data Pipeline
Extracts blocks, transactions, token transfers, receipts, and logs from Ethereum into CSV/JSON or warehouses for on-chain analytics.
Ethereum ETL
Ethereum ETL is a widely used set of Python tools for extracting, transforming, and loading Ethereum blockchain data into analytics-ready formats. It converts raw JSON-RPC data into structured tables of blocks, transactions, ERC-20/ERC-721 token transfers, receipts, logs, and contracts, and is the pipeline behind Google BigQuery's public Ethereum dataset.
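The "raw JSON-RPC data into structured tables" step can be pictured with a minimal sketch. It assumes a block object shaped like an eth_getBlockByNumber response, where numeric fields arrive as hex strings. The helper names (hex_to_int, block_to_row, rows_to_csv), the column selection, and the sample block are hypothetical illustrations, not Ethereum ETL's own code or schema.

```python
import csv
import io


def hex_to_int(value):
    """Convert a JSON-RPC hex quantity such as '0x1b4' to an int; None stays None."""
    return None if value is None else int(value, 16)


def block_to_row(block):
    """Flatten one JSON-RPC block object into a flat, table-friendly dict."""
    return {
        "number": hex_to_int(block.get("number")),
        "hash": block.get("hash"),
        "parent_hash": block.get("parentHash"),
        "miner": block.get("miner"),
        "gas_limit": hex_to_int(block.get("gasLimit")),
        "gas_used": hex_to_int(block.get("gasUsed")),
        "timestamp": hex_to_int(block.get("timestamp")),
        "transaction_count": len(block.get("transactions", [])),
    }


def rows_to_csv(rows):
    """Serialize a non-empty list of row dicts to CSV text, header first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample block for illustration only (not real chain data).
sample_block = {
    "number": "0x1b4",
    "hash": "0x" + "ab" * 32,
    "parentHash": "0x" + "cd" * 32,
    "miner": "0x" + "11" * 20,
    "gasLimit": "0x1388",
    "gasUsed": "0x0",
    "timestamp": "0x5f5e100",
    "transactions": ["0x" + "01" * 32, "0x" + "02" * 32],
}

print(rows_to_csv([block_to_row(sample_block)]))
```

Converting hex quantities to integers at this stage is what lets downstream SQL compare and aggregate block numbers, gas values, and timestamps directly.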
Key features
- Exports blocks, transactions, receipts, logs, and contracts
- Decodes ERC-20 and ERC-721 token transfers
- Exports to CSV/JSON files and streams to Postgres or message queues such as Google Pub/Sub, from which data can be loaded into BigQuery
- Extends to other EVM-compatible chains via the broader blockchain-ETL project
- Scales from a single node to production data-warehouse ingestion
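To make the token-transfer decoding in the feature list concrete, here is a small sketch of how a Transfer event can be read out of a raw log. It relies on two facts from the ERC-20 and ERC-721 standards: both emit Transfer with the same event signature hash in topics[0], and the indexed arguments occupy 32-byte words. The function and field names below are hypothetical and this is not presented as Ethereum ETL's own extractor.

```python
# keccak256("Transfer(address,address,uint256)"), shared by ERC-20 and ERC-721.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def split_words(hex_data):
    """Split a 0x-prefixed hex string into 32-byte (64 hex character) words."""
    body = hex_data[2:] if hex_data.startswith("0x") else hex_data
    return [body[i:i + 64] for i in range(0, len(body), 64)]


def word_to_address(word):
    """An address is the low 20 bytes of a left-padded 32-byte word."""
    return "0x" + word[-40:].lower()


def decode_transfer(log):
    """Return a transfer dict for a Transfer log, or None for anything else.

    ERC-20 puts from/to in topics and the amount in data; ERC-721 puts
    from/to/tokenId all in topics. Concatenating the words handles both.
    """
    topics = log.get("topics", [])
    if not topics or topics[0].lower() != TRANSFER_TOPIC:
        return None
    words = [w for topic in topics[1:] for w in split_words(topic)]
    words += split_words(log.get("data", "0x"))
    if len(words) != 3:
        return None  # not a standard-shaped Transfer event
    return {
        "token_address": log["address"].lower(),
        "from_address": word_to_address(words[0]),
        "to_address": word_to_address(words[1]),
        "value": int(words[2], 16),  # amount (ERC-20) or token id (ERC-721)
    }


# Made-up ERC-20 style log for illustration only.
example_log = {
    "address": "0x" + "AA" * 20,
    "topics": [
        TRANSFER_TOPIC,
        "0x" + "0" * 24 + "11" * 20,
        "0x" + "0" * 24 + "22" * 20,
    ],
    "data": "0x" + format(1000, "064x"),
}

print(decode_transfer(example_log))
```

Because the signature hash is identical for both standards, telling an ERC-20 amount from an ERC-721 token id needs extra context, such as which contract emitted the log.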
Point it at an Ethereum JSON-RPC endpoint and run ethereumetl export_all for a batch export of a block range, or ethereumetl stream for continuous ingestion, to build a queryable on-chain dataset for wallet analytics, DeFi research, or compliance investigations.
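As a rough illustration of the batch path, the sketch below assembles an export_all invocation from Python. The helper build_export_all_command is hypothetical, and the endpoint http://localhost:8545 (the conventional local node port), the block range, and the output directory are placeholder assumptions. The flag names follow the project's documentation as best recalled here and should be confirmed with ethereumetl export_all --help before use.

```python
import shlex


def build_export_all_command(provider_uri, start_block, end_block, output_dir="output"):
    """Build the argument list for a batch export of an inclusive block range.

    Flag names are assumed from the project's docs; verify them against
    `ethereumetl export_all --help` for the installed version.
    """
    if start_block < 0 or end_block < start_block:
        raise ValueError("expected 0 <= start_block <= end_block")
    return [
        "ethereumetl", "export_all",
        "--start", str(start_block),
        "--end", str(end_block),
        "--provider-uri", provider_uri,
        "--output-dir", output_dir,
    ]


# Placeholder endpoint and range; this only prints the command, it does not run it.
command = build_export_all_command("http://localhost:8545", 0, 99999)
print(shlex.join(command))
```

With the tool installed and a node reachable, the list can be handed to subprocess.run(command, check=True). Passing an argument list rather than a shell string avoids quoting problems in endpoint URLs.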
Curated mirror of the open-source Ethereum ETL project, released under the MIT license. Get it from the upstream source repository.