Vision QA bot on a budget: Segment Anything + supervision, both free on the catalog
I've been building a visual quality-assurance tool for a small electronics manufacturer — they want to flag obvious PCB defects before boards go into final assembly. Commercial CV APIs were quoted at $0.004 per image, which adds up fast at 5,000 boards a day. I went looking for a free path and found it in the catalog.
The two free listings
- segment-anything-model — Meta's SAM; zero-shot segmentation, no class-specific training required
- supervision-vision-toolkit — Roboflow's annotation and detection utilities; great for drawing detections and computing IoU metrics
Both free to install, both security-scanned with no issues. The listing security tabs show the scan report clearly — for ML model files that's especially useful since the scanner checks for pickle-based exploits and hidden execution in the model format layer.
The setup
import cv2
import numpy as np
import supervision as sv
import torch
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Load SAM (vit_b checkpoint, ~375 MB) and move it to the GPU when one is available
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
sam.to(device="cuda" if torch.cuda.is_available() else "cpu")

mask_generator = SamAutomaticMaskGenerator(
    model=sam,
    points_per_side=16,
    pred_iou_thresh=0.88,
    stability_score_thresh=0.95,
)

def inspect_board(image_path):
    img = cv2.imread(image_path)
    if img is None:  # cv2.imread returns None instead of raising on a bad path
        raise FileNotFoundError(image_path)
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    masks = mask_generator.generate(img_rgb)

    # Flag unusually small segments as potential defects. The generator already
    # drops masks with stability_score < 0.95, so a `stability_score < 0.7`
    # check here would never match; lower stability_score_thresh above if
    # unstable masks should be flagged too.
    defects = [m for m in masks if m["area"] < 200]

    # Annotate with supervision (draws every mask, not only the flagged ones)
    detections = sv.Detections.from_sam(masks)
    annotator = sv.MaskAnnotator()
    annotated = annotator.annotate(scene=img_rgb.copy(), detections=detections)
    return annotated, len(defects)

result_img, defect_count = inspect_board("board_0042.jpg")
print(f"Potential defects: {defect_count}")
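Since the annotated image shows every mask SAM returns, it can help to pull the flagged masks out first and annotate only those. Below is a minimal sketch of that filtering step. The `select_defects` helper and its `max_area` / `min_stability` parameters are hypothetical names, not part of either library, and the sample records are stand-ins that carry only the `area` and `stability_score` keys found in SAM's mask records.

```python
from typing import Any, Dict, List, Optional


def select_defects(
    masks: List[Dict[str, Any]],
    max_area: int = 200,
    min_stability: Optional[float] = None,
) -> List[Dict[str, Any]]:
    """Return the mask records that look suspicious (hypothetical helper)."""
    flagged = []
    for m in masks:
        too_small = m["area"] < max_area
        # The stability check is optional; it only matters if min_stability is
        # set above the generator's own stability_score_thresh.
        unstable = min_stability is not None and m["stability_score"] < min_stability
        if too_small or unstable:
            flagged.append(m)
    return flagged


# Stand-in records with only the two keys the filter reads
sample = [
    {"area": 150, "stability_score": 0.97},
    {"area": 5200, "stability_score": 0.99},
    {"area": 900, "stability_score": 0.96},
]
flagged = select_defects(sample)
print(len(flagged), "of", len(sample), "masks flagged")
```

With real SAM output the returned records keep their `segmentation` and `bbox` keys, so the filtered list can be handed to `sv.Detections.from_sam` in place of the full one. It is probably worth skipping the annotation step when nothing is flagged.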
Numbers after two weeks of testing
- Defect detection rate: ~73% on a labelled test set of 800 images (roughly 27% of defects are still missed, which is too many to rely on it alone, but good as a first-pass filter)
- False positive rate: ~18% — acceptable for "flag for human review"
- Throughput: ~1.1 images/second on a single T4 GPU (rented at $0.35/hr only during batch runs)
- Effective cost per image: < $0.0001
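Compared to the commercial API quote ($0.004 × 5,000 boards a day, or about $600 over a 30-day month), GPU time comes to roughly $13/month, so we're saving close to $590/month at current volume.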
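The per-image and monthly figures follow from the throughput and the hourly GPU rate. Here is a quick sketch of that arithmetic, assuming a 30-day month and a GPU that is billed only while a batch is running. The function names are illustrative, and the inputs are the numbers quoted in this post.

```python
def cost_per_image(gpu_rate_per_hour: float, images_per_second: float) -> float:
    """GPU cost attributed to one image at a steady throughput."""
    images_per_hour = images_per_second * 3600
    return gpu_rate_per_hour / images_per_hour


def monthly_cost(per_image: float, images_per_day: int, days: int = 30) -> float:
    """Total spend for a month of daily batches (30 days is an assumption)."""
    return per_image * images_per_day * days


self_hosted = cost_per_image(gpu_rate_per_hour=0.35, images_per_second=1.1)
api_quote = 0.004  # commercial quote, dollars per image

api_monthly = monthly_cost(api_quote, images_per_day=5000)
gpu_monthly = monthly_cost(self_hosted, images_per_day=5000)

print(f"Self-hosted cost per image: ${self_hosted:.6f}")
print(f"Commercial API per month:   ${api_monthly:.2f}")
print(f"GPU time per month:         ${gpu_monthly:.2f}")
print(f"Monthly saving:             ${api_monthly - gpu_monthly:.2f}")
```

This leaves out storage, engineering time, and any idle GPU hours, so treat it as a lower bound on the self-hosted cost.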
The catalog's vision category has more listings building on this foundation — if you're doing any CV work, it's worth a browse before reaching for a paid API.