DeepSpeed
Microsoft's deep learning optimization library — train trillion-parameter models with ZeRO memory parallelism.
Installs: 420k
Rating: ★ 4.7
Reviews: 140
DeepSpeed is Microsoft's open-source deep learning optimization library for distributed training and inference of large models. Its ZeRO (Zero Redundancy Optimizer) technology partitions optimizer states, gradients, and parameters across data-parallel processes instead of replicating them on every GPU. Removing that redundancy is what enables training of models with hundreds of billions to trillions of parameters on standard GPU clusters.
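To make the memory saving concrete, here is a rough back-of-the-envelope sketch. It assumes mixed-precision training with Adam, where each parameter costs about 2 bytes for the fp16 weight, 2 bytes for the fp16 gradient, and 12 bytes of fp32 optimizer state (the accounting used in the ZeRO paper). The function name `zero_bytes_per_gpu` is made up for this illustration and is not part of DeepSpeed. Activations and temporary buffers are ignored.

```python
def zero_bytes_per_gpu(num_params: float, num_gpus: int, stage: int) -> float:
    """Approximate model-state memory per GPU in bytes (mixed precision + Adam)."""
    params = 2 * num_params   # fp16 weights
    grads = 2 * num_params    # fp16 gradients
    optim = 12 * num_params   # fp32 weights, Adam momentum, Adam variance

    # Each ZeRO stage partitions one more component across the data-parallel group.
    if stage >= 1:
        optim /= num_gpus
    if stage >= 2:
        grads /= num_gpus
    if stage >= 3:
        params /= num_gpus
    return params + grads + optim


if __name__ == "__main__":
    # Example: a 7.5B-parameter model on 64 GPUs.
    for stage in range(4):
        gb = zero_bytes_per_gpu(7.5e9, 64, stage) / 1e9
        print(f"stage {stage}: {gb:.1f} GB per GPU")
    # stage 0: 120.0 GB per GPU
    # stage 1: 31.4 GB per GPU
    # stage 2: 16.6 GB per GPU
    # stage 3: 1.9 GB per GPU
```

Stage 0 here stands for plain data parallelism with everything replicated. Real usage will be higher once activations, communication buffers, and fragmentation are counted.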
Key Features
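- ZeRO Optimizer (Stage 1/2/3): Stage 1 partitions optimizer states across GPUs, Stage 2 also partitions gradients, and Stage 3 also partitions parameters
- ZeRO-Infinity: offload model states to CPU memory and NVMe storage to train models larger than aggregate GPU memory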
- 3D Parallelism: combine data, tensor, and pipeline parallelism
- DeepSpeed-Chat: end-to-end RLHF training pipeline
- Inference optimization: INT4/INT8 quantization, kernel fusion, custom CUDA kernels
- MoE (Mixture of Experts) training support
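The ZeRO stage and the ZeRO-Infinity offload targets from the list above are selected in the `zero_optimization` section of the DeepSpeed config. The sketch below builds that section as a plain dictionary. The helper `zero_section` and the path `/local_nvme` are hypothetical, chosen for illustration. The keys (`stage`, `offload_optimizer`, `offload_param`, `device`, `nvme_path`, `pin_memory`) follow the DeepSpeed configuration documentation as I understand it, so check them against the version you install.

```python
import json


def zero_section(stage, optimizer_device=None, param_device=None, nvme_path=None):
    """Build a `zero_optimization` config section (illustrative helper, not a DeepSpeed API)."""
    if stage not in (0, 1, 2, 3):
        raise ValueError("ZeRO stage must be 0, 1, 2, or 3")
    section = {"stage": stage}

    for key, device in (("offload_optimizer", optimizer_device),
                        ("offload_param", param_device)):
        if device is None:
            continue
        if device not in ("cpu", "nvme"):
            raise ValueError("offload device must be 'cpu' or 'nvme'")
        if stage == 0:
            raise ValueError("offload requires a ZeRO stage of at least 1")
        # Parameter offload and any NVMe offload are documented as stage 3 only.
        if (key == "offload_param" or device == "nvme") and stage != 3:
            raise ValueError(f"{key} to {device} requires ZeRO stage 3")
        entry = {"device": device, "pin_memory": True}
        if device == "nvme":
            if nvme_path is None:
                raise ValueError("nvme_path is required for NVMe offload")
            entry["nvme_path"] = nvme_path
        section[key] = entry
    return section


if __name__ == "__main__":
    # Stage 3 with optimizer state on CPU and parameters on NVMe.
    config = {"zero_optimization": zero_section(3, "cpu", "nvme", "/local_nvme")}
    print(json.dumps(config, indent=2))
```

Offloading trades GPU memory for PCIe and disk bandwidth, so it is usually worth trying only after a pure GPU configuration no longer fits.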
Quick Start
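import deepspeed

# `model` is any torch.nn.Module and `data_loader` yields batches it accepts.
# `args` holds the parsed command-line arguments of the training script.
# Batch size, optimizer, and ZeRO stage are read from ds_config.json.
model_engine, optimizer, _, _ = deepspeed.initialize(
    args=args,
    model=model,
    model_parameters=model.parameters(),
    config="ds_config.json",
)

for batch in data_loader:
    outputs = model_engine(batch)  # forward pass through the wrapped model
    loss = outputs.loss            # assumes the model output exposes a .loss field
    model_engine.backward(loss)    # backward pass with loss scaling and gradient reduction
    model_engine.step()            # optimizer step and gradient zeroing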
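The Quick Start reads its settings from `ds_config.json`, which the snippet does not show. A minimal sketch of generating one follows. The helper `write_ds_config` and all default values are assumptions picked for illustration, not recommended settings. The keys used are standard DeepSpeed config fields, and DeepSpeed derives the global batch size from the micro-batch size, the accumulation steps, and the number of GPUs.

```python
import json
from pathlib import Path


def write_ds_config(path="ds_config.json", micro_batch=4, grad_accum=8,
                    lr=2e-5, zero_stage=2):
    """Write a minimal DeepSpeed config file and return it as a dict (illustrative helper)."""
    config = {
        # Per-GPU batch size for one forward/backward pass.
        "train_micro_batch_size_per_gpu": micro_batch,
        # Micro-batches accumulated before each optimizer step.
        "gradient_accumulation_steps": grad_accum,
        # Optimizer that DeepSpeed constructs from model_parameters.
        "optimizer": {"type": "AdamW", "params": {"lr": lr}},
        # Mixed-precision training.
        "fp16": {"enabled": True},
        # ZeRO stage: 1, 2, or 3 (0 disables ZeRO).
        "zero_optimization": {"stage": zero_stage},
    }
    Path(path).write_text(json.dumps(config, indent=2))
    return config


if __name__ == "__main__":
    write_ds_config()
```

With the file in place, a training script is typically started through the `deepspeed` launcher, for example `deepspeed train.py`, where `train.py` is a placeholder name for the script that contains the loop above.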
Install via ai-supply
npx ai-supply add deepspeed-distributed-training
Curated mirror of the open-source DeepSpeed (Apache-2.0). Get it from the source.