Name: Grounded SAM — Open-Vocabulary Detection + Segmentation
Availability: InStock
Author: ai-supply

Grounded Segment Anything

Grounded SAM combines Grounding DINO (open-vocabulary detection) with the Segment Anything Model (SAM) into a pipeline that detects and segments objects described in free-form text, with no fixed class list and no task-specific training.

Key Features

Text-prompt detection via Grounding DINO: "a cat", "all vehicles", "the red cup"
Pixel-level segmentation masks via SAM for each detected box
Extensions: Stable Diffusion inpainting, Recognize Anything (RAM / RAM++) auto-tagging
Grounded SAM 2 variant using SAM 2 for video object tracking and segmentation
REST API and Gradio demo included
Batch processing support for large image datasets
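
When a prompt names several categories, Grounding DINO's documentation recommends separating them with periods. The helper below is a minimal sketch of building such a caption from a list of labels. The name `build_caption` is hypothetical and not part of either library; the lowercasing and trailing period mirror what the library's own caption preprocessing is understood to do.

```python
def build_caption(labels):
    """Join class names into one period-separated Grounding DINO text prompt.

    Hypothetical helper for illustration; not part of groundingdino.
    """
    # Normalize each label: trim whitespace, lowercase, drop any trailing period.
    cleaned = [label.strip().lower().rstrip(".") for label in labels if label.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty label is required")
    # Separate categories with " . " and end the prompt with a period.
    return " . ".join(cleaned) + " ."


caption = build_caption(["Cat", "red cup ", "vehicle"])
print(caption)  # cat . red cup . vehicle .
```

The resulting string can be passed as the `caption` argument in the Quick Start; the returned `phrases` then indicate which part of the prompt each box matched.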

Quick Start

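import torch
from torchvision.ops import box_convert
from groundingdino.util.inference import load_model, load_image, predict
from segment_anything import sam_model_registry, SamPredictor

# 1) Detect with a text prompt
model = load_model("groundingdino/config/GroundingDINO_SwinT.py", "weights/groundingdino_swint.pth")
# "cat.jpg" is a placeholder path; load_image returns an RGB array (for SAM) and a normalized tensor (for Grounding DINO)
image_np, image = load_image("cat.jpg")
boxes, logits, phrases = predict(model, image, caption="a cat", box_threshold=0.3, text_threshold=0.25)

# 2) Convert boxes from normalized (cx, cy, w, h) to pixel (x1, y1, x2, y2)
h, w = image_np.shape[:2]
boxes_xyxy = box_convert(boxes * torch.tensor([w, h, w, h]), in_fmt="cxcywh", out_fmt="xyxy")

# 3) Segment the detected boxes
sampredictor = SamPredictor(sam_model_registry["vit_h"](checkpoint="weights/sam_vit_h.pth"))
sampredictor.set_image(image_np)
boxes_sam = sampredictor.transform.apply_boxes_torch(boxes_xyxy, image_np.shape[:2])  # rescale to SAM's input size
masks, _, _ = sampredictor.predict_torch(point_coords=None, point_labels=None, boxes=boxes_sam, multimask_output=False)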
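
One detail worth spelling out when chaining the two models: Grounding DINO's `predict` returns boxes as normalized (cx, cy, w, h), while SAM's box prompt expects pixel (x1, y1, x2, y2). The dependency-free sketch below shows the arithmetic for a single box. The function name is hypothetical, and clamping to the image bounds is an added assumption rather than something either library requires.

```python
def cxcywh_norm_to_xyxy_pixels(box, image_width, image_height):
    """Convert one normalized (cx, cy, w, h) box to pixel (x1, y1, x2, y2).

    Hypothetical helper for illustration; real pipelines would usually do
    this on tensors (e.g. with torchvision.ops.box_convert).
    """
    cx, cy, w, h = box
    # Scale the center and size from [0, 1] to pixels, then take the corners.
    x1 = (cx - w / 2) * image_width
    y1 = (cy - h / 2) * image_height
    x2 = (cx + w / 2) * image_width
    y2 = (cy + h / 2) * image_height
    # Clamp to the image bounds (an assumption; boxes can slightly overshoot).
    return (
        max(0.0, x1),
        max(0.0, y1),
        min(float(image_width), x2),
        min(float(image_height), y2),
    )


print(cxcywh_norm_to_xyxy_pixels((0.5, 0.5, 0.5, 0.25), 640, 480))
# (160.0, 180.0, 480.0, 300.0)
```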

Install via ai-supply

npx ai-supply add grounded-segment-anything-pipeline

Curated mirror of the open-source Grounded-Segment-Anything (Apache-2.0). Get it from the source.
