PEFT
Hugging Face library for parameter-efficient fine-tuning — LoRA, QLoRA, IA3, and more. Fine-tune billion-parameter models on a single consumer GPU.
PEFT — Parameter-Efficient Fine-Tuning
PEFT is the Hugging Face library for fine-tuning large language models without updating all of their parameters. With techniques such as LoRA, QLoRA, and IA3, you can adapt a 7B+ parameter model on a single consumer GPU, often within hours, and reach quality that is frequently comparable to full fine-tuning on downstream tasks.
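To make the single-GPU claim concrete, here is a back-of-the-envelope sketch of how much memory the base weights alone occupy at different precisions. The helper `weight_memory_gb` is a hypothetical name, and the estimate deliberately ignores activations, adapter weights, optimizer state, and quantization overhead, so real usage will be higher.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate memory for the model weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Round parameter counts used purely for illustration.
for name, n in [("7B", 7_000_000_000), ("70B", 70_000_000_000)]:
    fp16 = weight_memory_gb(n, 16)  # half-precision weights
    int4 = weight_memory_gb(n, 4)   # 4-bit quantized weights
    print(f"{name}: {fp16:.1f} GB at 16-bit, {int4:.1f} GB at 4-bit")
```

Under these assumptions a 7B model needs about 14 GB for 16-bit weights but only about 3.5 GB at 4-bit, which is why quantizing the frozen base model makes such a difference on a single GPU.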
Key features
- LoRA: inject low-rank adapter matrices and train only a small fraction of the parameters (around 0.1% in typical configurations)
- QLoRA: quantize the base model to 4-bit, then add LoRA adapters; this can bring fine-tuning of 70B-class models within reach of a single 48 GB GPU, depending on sequence length and batch size
- Prompt tuning & prefix tuning: learn soft prompt or prefix vectors instead of updating the model weights
- IA3: learn rescaling vectors for selected activations; typically even fewer trainable parameters than LoRA
- Merge & unload: merge LoRA adapters back into the base weights so inference carries no added latency
- Seamless HF ecosystem: works with transformers, trl, accelerate, and bitsandbytes
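The LoRA and merge bullets above can be illustrated with a toy, dependency-free sketch. It assumes the common LoRA formulation in which a frozen weight W is augmented by a scaled low-rank product, (alpha / r) * B A. The matrix values and helper names are made up for illustration, and this is not PEFT's internal code.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_scaled(w, delta, scale):
    """Return w + scale * delta, elementwise."""
    return [[wi + scale * di for wi, di in zip(wr, dr)] for wr, dr in zip(w, delta)]


# Toy sizes: a 4x4 frozen weight W and a rank-2 update B @ A.
d, r, alpha = 4, 2, 4
W = [[float(i == j) for j in range(d)] for i in range(d)]  # frozen base weight (identity)
B = [[0.5, 0.0], [0.0, 0.5], [1.0, 0.0], [0.0, 1.0]]        # d x r, trainable
A = [[1.0, 0.0, 2.0, 0.0], [0.0, 1.0, 0.0, 2.0]]            # r x d, trainable
scale = alpha / r

x = [[1.0], [2.0], [3.0], [4.0]]  # input as a column vector

# Adapter path: y = W x + scale * B (A x), with W left untouched.
y_adapter = add_scaled(matmul(W, x), matmul(B, matmul(A, x)), scale)

# Merged path: fold the update into the weight once, then do a single matmul.
W_merged = add_scaled(W, matmul(B, A), scale)
y_merged = matmul(W_merged, x)

print(y_adapter == y_merged)  # True: both paths give the same output
```

Because the merged weight produces the same output with a single matrix multiply, merging removes the extra adapter computation at inference time. With real floating-point weights the two paths agree only up to rounding error.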
Quick start
npx ai-supply add peft-parameter-efficient-finetuning
# Or install directly
pip install peft transformers torch
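from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base model (downloads the weights on first use)
base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# LoRA settings: rank-16 adapters on the attention query and value projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Wrap the base model; only the adapter weights remain trainable
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
# Prints trainable vs. total parameter counts; for this config, roughly 6.8M trainable
# out of about 7.2B (around 0.09%)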
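As a rough cross-check on what `print_trainable_parameters()` reports, the LoRA parameter count can be estimated by hand. The sketch below assumes Mistral-7B-style dimensions (hidden size 4096, 32 layers, 8 key/value heads of dimension 128). These numbers are assumptions written into the code, not values read from the model, and `lora_params` is a hypothetical helper.

```python
def lora_params(in_features: int, out_features: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is r x in, B is out x r."""
    return r * (in_features + out_features)


# Assumed Mistral-7B-style shapes (not read from the checkpoint).
hidden_size = 4096
num_layers = 32
kv_dim = 8 * 128  # 8 key/value heads x head dimension 128
r = 16

per_layer = (
    lora_params(hidden_size, hidden_size, r)  # q_proj: 4096 -> 4096
    + lora_params(hidden_size, kv_dim, r)     # v_proj: 4096 -> 1024
)
trainable = per_layer * num_layers
print(f"{trainable:,} trainable LoRA parameters")  # 6,815,744
```

Doubling `r` doubles the adapter size, and adding more entries to `target_modules` adds one term per targeted projection, so the trainable share stays easy to reason about before any model is loaded.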
Curated mirror of the open-source PEFT project (Apache-2.0). Install upstream from the repository.