
NVIDIA NeMo — Scalable Speech & LLM Training Framework

NVIDIA's modular framework for training, fine-tuning, and deploying speech recognition, TTS, and large language models at scale.

Publisher: @ai-supply
Installs: 170k
Rating: ★ 4.6
Reviews: 57

NVIDIA NeMo

NeMo is NVIDIA's end-to-end framework for developing and deploying state-of-the-art conversational AI, large language models, and speech models. It is built on PyTorch Lightning and supports distributed training across thousands of GPUs with tensor, pipeline, and data parallelism.
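
To make the three parallelism modes concrete, here is a minimal sketch of how a GPU budget is usually split between them. It assumes the common Megatron-style convention that the total GPU count equals tensor-parallel size × pipeline-parallel size × data-parallel size. The function name `data_parallel_size` is hypothetical and is not part of NeMo's API.

```python
def data_parallel_size(world_size: int, tensor_parallel: int, pipeline_parallel: int) -> int:
    """Return how many data-parallel replicas fit in `world_size` GPUs.

    Assumption: world_size = tensor_parallel * pipeline_parallel * data_parallel.
    """
    if min(world_size, tensor_parallel, pipeline_parallel) < 1:
        raise ValueError("all sizes must be positive integers")
    # GPUs needed to hold a single copy of the model
    model_parallel = tensor_parallel * pipeline_parallel
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tensor_parallel * pipeline_parallel = {model_parallel}"
        )
    # The remaining factor is the number of model replicas trained in parallel
    return world_size // model_parallel


# 64 GPUs, each model copy sharded over 4 (tensor) x 2 (pipeline) GPUs
print(data_parallel_size(64, tensor_parallel=4, pipeline_parallel=2))  # 8
```

In this example each model replica occupies 8 GPUs, so 64 GPUs hold 8 replicas that each process a different slice of the batch.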

Key Features

  • ASR: Conformer, Citrinet, FastConformer (state-of-the-art word error rates)
  • TTS: FastPitch, HiFi-GAN, Mixer-TTS for natural speech synthesis
  • NLP/LLM: GPT-style training, instruction tuning (SFT), RLHF, parameter-efficient fine-tuning
  • Multimodal: vision-language alignment pipelines
  • Collections: modular model collections for ASR, NLP, TTS, Vision
  • Megatron-LM integration for ultra-large-scale training
  • Deployment: export paths for NVIDIA TensorRT-LLM and Triton Inference Server
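
The ASR entry above is measured in word error rate (WER). As a rough illustration of the metric, the sketch below computes WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It is a plain-Python sketch that splits on whitespace with no text normalization, and the function name `word_error_rate` is chosen here for illustration. It is not NeMo's own implementation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution (sat -> sit) and one deletion (the) over 6 reference words
print(round(word_error_rate("the cat sat on the mat", "the cat sit on mat"), 3))  # 0.333
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.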

Quick Start

import nemo.collections.asr as nemo_asr

# Load a pre-trained ASR model (the checkpoint is downloaded on first use)
asr_model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained(
    model_name="stt_en_conformer_ctc_large"
)

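# Transcribe a list of audio file paths; one result is returned per input file.
# Depending on the NeMo version, each result is a string or a hypothesis object.
transcriptions = asr_model.transcribe(["podcast.wav"])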
print(transcriptions[0])
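
Before transcribing, it can help to confirm that the input file matches what the model expects. The sketch below uses only Python's standard `wave` module and assumes the checkpoint expects 16 kHz mono WAV input, which is what English Conformer checkpoints are commonly documented to accept. Verify this against the model card for the checkpoint you load. `check_wav` is a hypothetical helper, not part of NeMo.

```python
import wave


def check_wav(path: str, expected_rate: int = 16000, expected_channels: int = 1) -> list[str]:
    """Return a list of mismatches; an empty list means the file matches.

    Assumption: 16 kHz mono defaults, which may differ for other checkpoints.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != expected_channels:
        problems.append(f"file has {channels} channel(s), expected {expected_channels}")
    return problems
```

Calling `check_wav("podcast.wav")` returns an empty list when the file is already 16 kHz mono. Otherwise it lists what to resample or downmix before passing the file to `transcribe`.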

Install via ai-supply

npx ai-supply add nemo-speech-and-llm-framework

Curated mirror of the open-source NeMo (Apache-2.0). Get it from the source.
