SpeechBrain
All-in-one conversational AI toolkit for ASR, speaker recognition, speech enhancement, and language identification.
SpeechBrain is an open-source, all-in-one conversational AI platform developed at Mila and Université de Montréal. A single, modular codebase covers automatic speech recognition, speaker recognition and diarisation, speech enhancement and separation, language identification, and spoken language understanding.
Key Features
- 200+ pretrained models on the Hugging Face Hub across the supported speech tasks
- Modular Brain class: compose any pipeline from reusable blocks
- State-of-the-art ASR with Transformer, Conformer, and hybrid CTC/attention
- Speaker verification and identification (ECAPA-TDNN, x-vectors)
- Speech enhancement (MetricGAN+, SEGAN) and source separation (ConvTasNet)
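Speaker verification with embedding models such as ECAPA-TDNN or x-vectors usually comes down to comparing two fixed-length embeddings, one per utterance. The following is a minimal sketch of that scoring step in plain Python, not SpeechBrain's own implementation; the function names `cosine_similarity` and `same_speaker` and the 0.25 threshold are illustrative assumptions, and a real threshold would be tuned on held-out trials.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(
    a: Sequence[float], b: Sequence[float], threshold: float = 0.25
) -> bool:
    """Accept the pair when the score exceeds the (assumed) threshold."""
    return cosine_similarity(a, b) > threshold
```

In practice the two vectors would come from a pretrained speaker encoder rather than being written by hand; the decision rule stays the same.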
Quick Start
pip install speechbrain
# Load a pretrained Conformer ASR model from the Hugging Face Hub
from speechbrain.inference.ASR import EncoderDecoderASR

asr_model = EncoderDecoderASR.from_hparams(
    source="speechbrain/asr-conformer-transformerlm-librispeech",
    savedir="pretrained_models/asr-conformer-transformerlm-librispeech",
)

# Transcribe a local audio file and print the recognized text
result = asr_model.transcribe_file("audio.wav")
print(result)
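To transcribe more than one recording, the single-file call can be wrapped in a small loop. The helper below is a sketch under stated assumptions: `transcribe_directory` is a hypothetical name, not part of SpeechBrain, and it only assumes a callable that takes a file path and returns text, as `transcribe_file` does in the Quick Start.

```python
from pathlib import Path
from typing import Callable, Dict


def transcribe_directory(
    transcribe: Callable[[str], str],
    directory: str,
    pattern: str = "*.wav",
) -> Dict[str, str]:
    """Apply `transcribe` to every matching file, keyed by file name."""
    transcripts: Dict[str, str] = {}
    # Sort so the output order is stable across runs and platforms.
    for path in sorted(Path(directory).glob(pattern)):
        transcripts[path.name] = transcribe(str(path))
    return transcripts
```

With a loaded model, a call might look like `transcribe_directory(asr_model.transcribe_file, "recordings")`, where `recordings` is a placeholder directory name. Passing the callable in keeps the helper independent of any particular model class.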
npx ai-supply add speechbrain-audio-toolkit
Curated mirror of the open-source SpeechBrain (Apache-2.0). Get it from the source.