Name: faster-whisper
Availability: InStock
Rating: 4.9 (140 reviews)
Author: ai-supply

faster-whisper

faster-whisper is a reimplementation of OpenAI's Whisper built on CTranslate2, a fast inference engine for Transformer models. It transcribes up to 4× faster than the original Whisper at the same accuracy, or runs at the same speed with about half the memory, which makes it suitable for real-time and batch production workloads.

Key Features

Up to 4× faster inference: CTranslate2 with INT8 quantisation on CPU and FP16 on GPU
Lower memory: serve large-v3 on a single consumer GPU that couldn't fit the original
Word-level timestamps: accurate per-word start/end times with minimal overhead
VAD filter: integrated Silero VAD to skip silence and reduce hallucinations on quiet audio
Batched inference: process multiple audio chunks in parallel for maximum GPU utilisation
Drop-in compatible: same model names as the original Whisper; models download automatically from the Hugging Face Hub
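
The word-level timestamp and VAD features above correspond to keyword arguments of transcribe. The following is a minimal sketch, assuming the word_timestamps and vad_filter parameters and the per-segment words list found in current faster-whisper releases. The helper names format_words and transcribe_words are hypothetical and not part of the library.

```python
def format_words(segments):
    """Render each word as '[start -> end] word' from segment-like objects."""
    lines = []
    for segment in segments:
        # segment.words is only populated when word_timestamps=True
        for word in segment.words or []:
            lines.append(
                f"[{word.start:.2f}s -> {word.end:.2f}s] {word.word.strip()}"
            )
    return lines


def transcribe_words(path, model_size="large-v3"):
    """Hypothetical helper: transcribe a file and return per-word lines."""
    # Imported lazily so format_words works without the package installed
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cuda", compute_type="float16")
    segments, _info = model.transcribe(
        path,
        word_timestamps=True,  # request per-word start/end times
        vad_filter=True,       # skip silent stretches using Silero VAD
    )
    return format_words(segments)
```

Calling transcribe_words("audio.mp3") would return one formatted line per word. Keeping the formatting separate from the model call makes it possible to check the output shape without a GPU or a model download.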

Quick Start

pip install faster-whisper

from faster_whisper import WhisperModel

model = WhisperModel("large-v3", device="cuda", compute_type="float16")
segments, info = model.transcribe("audio.mp3", beam_size=5)

for segment in segments:
    print(f"[{segment.start:.2f}s → {segment.end:.2f}s] {segment.text}")
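
On a machine without a GPU, the same call can run with INT8 quantisation on CPU. The sketch below assumes device="cpu" with compute_type="int8", and relies on segments being a generator, so decoding only happens while it is iterated. The helper names pick_model_args and transcribe_to_text are hypothetical.

```python
def pick_model_args(use_gpu):
    """Return WhisperModel keyword arguments for GPU (FP16) or CPU (INT8)."""
    if use_gpu:
        return {"device": "cuda", "compute_type": "float16"}
    return {"device": "cpu", "compute_type": "int8"}


def transcribe_to_text(path, use_gpu=False, model_size="large-v3"):
    """Hypothetical helper: transcribe a file and return the joined text."""
    # Imported lazily so pick_model_args works without the package installed
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, **pick_model_args(use_gpu))
    segments, info = model.transcribe(path, beam_size=5)
    print(f"Detected language: {info.language} "
          f"(probability {info.language_probability:.2f})")
    # segments is a generator: the audio is decoded during this join
    return " ".join(segment.text.strip() for segment in segments)
```

Because the generator is consumed once, collect the segments into a list first if they need to be read more than once.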

npx ai-supply add faster-whisper-optimized-transcription

Curated mirror of the open-source faster-whisper (MIT). Get it from the source.
