Model · Audio & Speech · Free
WeSpeaker
Speaker embedding, verification, and diarization toolkit with pretrained ECAPA-TDNN, ResNet, and CAM++ models.
A production-oriented toolkit for speaker embedding, verification, and diarization. It provides trainable and pretrained speaker-representation models (x-vector/TDNN, ResNet, ECAPA-TDNN, CAM++, ReDimNet) plus the full pipeline around them: voice activity detection, embedding extraction, back-end scoring, and clustering to answer "who spoke when." Built by the WeNet community, it emphasizes runtime deployment through ONNX and C++ runtime export alongside research reproducibility.
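To make the "back-end scoring" step concrete, here is a minimal sketch of cosine scoring between two speaker embeddings, a common back-end for verification. It is not WeSpeaker's own code: the function names `cosine_score` and `same_speaker`, the toy 4-dimensional vectors, and the 0.5 threshold are all assumptions chosen for illustration. Real embeddings typically have a few hundred dimensions, and the threshold would be tuned on held-out trials.

```python
import math


def cosine_score(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(enroll, test, threshold=0.5):
    """Accept the trial if the score reaches the (hypothetical) threshold."""
    return cosine_score(enroll, test) >= threshold


# Toy embeddings, made up for this example only.
enroll = [0.9, 0.1, 0.0, 0.4]
test_same = [0.8, 0.2, 0.1, 0.5]
test_other = [-0.3, 0.9, 0.7, -0.1]

print(same_speaker(enroll, test_same))   # nearly parallel vectors: accepted
print(same_speaker(enroll, test_other))  # negative similarity: rejected
```

In a diarization setting the same pairwise score can serve as the affinity that a clustering step groups on.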
Key features
- Pretrained speaker-embedding models: TDNN x-vector, ResNet, ECAPA-TDNN, CAM++, ReDimNet
- End-to-end diarization: VAD, embedding, scoring and spectral/agglomerative clustering
- Speaker verification recipes with standard trial-based EER/minDCF evaluation
- ONNX and C++ runtime export for on-device and server deployment
- Recipes for VoxCeleb, CN-Celeb and other benchmark corpora
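The trial-based EER evaluation mentioned in the list above can be sketched as follows. This is an illustrative, standard-library-only approximation, not the scoring script shipped with the toolkit: `compute_eer`, the trial scores, and the labels are hypothetical. It sweeps the decision threshold over the observed scores and reports the operating point where the false-accept and false-reject rates are closest, without interpolating between points.

```python
def compute_eer(scores, labels):
    """Return (eer, threshold). labels: 1 = target trial, 0 = non-target trial."""
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    if n_target == 0 or n_nontarget == 0:
        raise ValueError("need at least one target and one non-target trial")

    # Sort trials by descending score; a trial is accepted if score >= threshold.
    trials = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    accepted_targets = 0
    false_accepts = 0
    best_gap, eer, eer_threshold = float("inf"), 1.0, trials[0][0]

    for i, (score, label) in enumerate(trials):
        if label == 1:
            accepted_targets += 1
        else:
            false_accepts += 1
        # Only evaluate once all trials sharing this score have been counted.
        if i + 1 < len(trials) and trials[i + 1][0] == score:
            continue
        far = false_accepts / n_nontarget               # false-accept rate
        frr = (n_target - accepted_targets) / n_target  # false-reject rate
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer, eer_threshold = gap, (far + frr) / 2, score
    return eer, eer_threshold


# Hypothetical trial list: higher score means "more likely the same speaker".
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
eer, threshold = compute_eer(scores, labels)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With real trial lists of thousands of pairs the discrete sweep is usually close to an interpolated EER; minDCF additionally requires a target prior and miss/false-alarm costs, which this sketch leaves out.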
Curated mirror of the open-source WeSpeaker (Apache-2.0). Get it from the source.