BentoML

Build, ship, and scale AI services — unified framework from local development to production Kubernetes.

Publisher: @ai-supply
Installs: 230k
Rating: ★ 4.7
Reviews: 77

BentoML is an open-source, unified model-serving framework for building AI services on top of models from most major ML frameworks and deploying them across a range of infrastructure. It covers the lifecycle from packaging models into reproducible Bentos to running autoscaled Kubernetes deployments with adaptive batching.

Key Features

  • Framework agnostic: PyTorch, TensorFlow, Keras, XGBoost, scikit-learn, LLMs, diffusion models
  • Adaptive micro-batching: automatically batch requests for optimal GPU throughput
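  • Service composition: modular multi-service pipelines with independent scaling (the Runners API in BentoML 1.1 and earlier; bentoml.depends() in the 1.2+ service API shown in the Quick Start)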
  • Bento packaging: reproducible bundles with model, code, dependencies, Dockerfile
  • BentoCloud integration: one-command deployment to managed inference infrastructure
  • Built-in OpenTelemetry, Prometheus metrics, and gRPC support
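
The micro-batching feature listed above can be illustrated with a small, framework-free sketch. This is not BentoML's implementation: it uses a fixed batch size and latency budget, where an adaptive scheduler would tune those limits at runtime. The names collect_batch and predict_batch are hypothetical and exist only for this illustration.

```python
import queue
import time


def collect_batch(requests, max_batch_size=8, max_latency_s=0.01):
    """Wait for one request, then gather more until the batch is full
    or the latency budget is spent."""
    batch = [requests.get()]  # block until at least one request arrives
    deadline = time.monotonic() + max_latency_s
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # latency budget used up: ship what we have
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break  # no more requests arrived in time
    return batch


def predict_batch(texts):
    # Hypothetical stand-in for a model: one call handles the whole batch.
    return [len(t) for t in texts]


if __name__ == "__main__":
    pending = queue.Queue()
    for text in ["a", "bb", "ccc", "dddd", "eeeee"]:
        pending.put(text)
    while not pending.empty():
        batch = collect_batch(pending, max_batch_size=3, max_latency_s=0.05)
        print(batch, predict_batch(batch))
```

The trade-off is between throughput and tail latency: a larger batch size or latency budget keeps the accelerator busier, while each request may wait longer before it is processed.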

Quick Start

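import bentoml

@bentoml.service
class SentimentAnalyzer:
    # Reference to the stored model; assumes it was saved with bentoml.sklearn.save_model
    model_ref = bentoml.models.get("sentiment:latest")

    def __init__(self) -> None:
        # Load the actual estimator once per worker rather than on every request
        self.model = bentoml.sklearn.load_model(self.model_ref)

    @bentoml.api
    def classify(self, text: str) -> str:
        # predict() takes a batch, so wrap the single input and unwrap the result
        return str(self.model.predict([text])[0])

# Serve locally (assumes the class above lives in sentiment_service.py)
bentoml serve sentiment_service:SentimentAnalyzer

# Build + containerize (assumes the built Bento is tagged sentiment:latest)
bentoml build && bentoml containerize sentiment:latest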
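
Once the service is running locally, it can be called over HTTP. The sketch below uses only the Python standard library and assumes the default `bentoml serve` address of http://localhost:3000, with the `classify` method exposed as a POST endpoint at /classify that accepts a JSON body keyed by parameter name. The helper names build_classify_request and classify are hypothetical.

```python
import json
import urllib.request


def build_classify_request(text, base_url="http://localhost:3000"):
    """Build the POST request for the classify endpoint without sending it."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/classify",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(text, base_url="http://localhost:3000"):
    """Send the request and return the response body as text.

    Requires a running service; raises urllib.error.URLError otherwise.
    """
    request = build_classify_request(text, base_url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server started, `classify("I loved this film")` should return whatever label the model predicts. BentoML also provides its own HTTP client classes, which may be more convenient than raw requests.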

Install via ai-supply

npx ai-supply add bentoml-model-serving-framework

Curated mirror of the open-source BentoML (Apache-2.0). Get it from the source.
