Model · Audio & Speech · Free
OpenVoice
Instant voice cloning with precise style control — tone, emotion, accent, and rhythm from a single short reference clip.
Installs: 430k
Rating: ★ 4.7
Reviews: 143
OpenVoice is a versatile instant voice cloning model from MyShell AI. A single short reference clip is enough to clone a voice and apply it to any text, with independent control over emotion, accent, rhythm, pauses, and intonation. Commercial and research use is permitted under the MIT license.
Key Features
- Instant cloning: no fine-tuning required; clone a voice from a reference clip as short as 5 seconds
- Granular style control: independently adjust emotion (happy, sad, angry, surprised), accent, and speaking speed
- Cross-lingual: supports English, Spanish, French, Chinese, Japanese, Korean, and more
- Two-stage pipeline: a base speaker TTS model generates speech in the desired style and language; a tone color converter then transfers the reference speaker's voice characteristics onto that audio
- OpenVoice V2: improved naturalness and accent accuracy over V1; model checkpoints are available on Hugging Face
- MeloTTS integration: pairs with MeloTTS, MyShell's high-quality multilingual TTS, as the base speaker
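The feature list cites about 5 seconds as the shortest usable reference clip. A minimal sketch of a pre-flight check follows, assuming the reference is an uncompressed WAV file (the standard-library `wave` module does not read MP3) and treating 5 seconds as a soft threshold taken from this page, not a limit known to be enforced by OpenVoice. The helper names `clip_duration_seconds` and `is_long_enough` are hypothetical.

```python
import wave

# Threshold taken from the feature list above; an assumption, not an OpenVoice constant.
MIN_REFERENCE_SECONDS = 5.0


def clip_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(path, minimum=MIN_REFERENCE_SECONDS):
    """Return True when the clip lasts at least `minimum` seconds."""
    return clip_duration_seconds(path) >= minimum
```

Calling `is_long_enough('reference.wav')` before extraction gives an early warning for clips that are probably too short to capture the speaker's tone color.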
Quick Start
git clone https://github.com/myshell-ai/OpenVoice.git
cd OpenVoice && pip install -e .
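from openvoice import se_extractor
from openvoice.api import ToneColorConverter
# Assumes the V2 checkpoints are unpacked into checkpoints_v2/. The converter is built from its config file, then the weights are loaded.
tcc = ToneColorConverter('checkpoints_v2/converter/config.json', device='cuda')
tcc.load_ckpt('checkpoints_v2/converter/checkpoint.pth')
# vad=True segments the clip with voice activity detection before the tone color embedding is extracted.
reference_se, _ = se_extractor.get_se('reference.mp3', tcc, vad=True)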
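Once the reference embedding is extracted, the two-stage pipeline listed under Key Features can be completed roughly as follows. This is a sketch modeled on the V2 demo in the OpenVoice repository, not a definitive recipe. It assumes MeloTTS is installed separately, the V2 checkpoints are unpacked under `checkpoints_v2/`, and the base speaker embeddings live in `checkpoints_v2/base_speakers/ses/`. The helper names `base_speaker_se_path` and `clone_to_file`, the `EN-US` speaker key, and the intermediate file name are illustrative assumptions.

```python
def base_speaker_se_path(speaker_key, root="checkpoints_v2/base_speakers/ses"):
    """Map a MeloTTS speaker key such as 'EN_INDIA' to its embedding file (assumed layout)."""
    name = speaker_key.lower().replace("_", "-")
    return f"{root}/{name}.pth"


def clone_to_file(text, reference_se, tcc, out_path,
                  language="EN", speaker_key="EN-US", speed=1.0, device="cuda"):
    """Synthesize `text` with a base speaker, then apply the reference tone color."""
    # Heavy dependencies are imported lazily so the path helper works without them.
    import torch
    from melo.api import TTS

    base_tts = TTS(language=language, device=device)
    speaker_id = base_tts.hps.data.spk2id[speaker_key]

    # Stage 1: base speaker audio that carries the text, language, and speed.
    base_path = out_path + ".base.wav"
    base_tts.tts_to_file(text, speaker_id, base_path, speed=speed)

    # Stage 2: replace the base speaker's tone color with the reference speaker's.
    source_se = torch.load(base_speaker_se_path(speaker_key), map_location=device)
    tcc.convert(audio_src_path=base_path, src_se=source_se,
                tgt_se=reference_se, output_path=out_path)
    return out_path
```

With `tcc` and `reference_se` from the Quick Start, a call such as `clone_to_file('Hello there.', reference_se, tcc, 'cloned.wav')` would write the cloned speech to `cloned.wav`. Keeping the base audio as a separate file makes it easy to compare the output of the two stages.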
Install via ai-supply
npx ai-supply add openvoice-instant-voice-clone
Curated mirror of the open-source OpenVoice (MIT). Get it from the source.