Name: PaddleOCR
Availability: InStock
Author: ai-supply

PaddleOCR

PaddleOCR is PaddlePaddle's comprehensive OCR toolkit, covering the complete pipeline from text detection through recognition to structured layout analysis. It supports 80+ languages, handles rotated and curved text, and ships production-ready builds for server and edge deployment.

Key Features

End-to-end pipeline: DB text detection → SVTR/PP-OCRv4 recognition in one call, with PP-Structure layout recovery available as a further stage
80+ languages: Latin, Chinese, Japanese, Korean, Arabic (RTL), Devanagari, and more
PP-Structure: table recognition, form extraction, and document layout analysis — outputs structured Excel/HTML
PP-ChatOCR: multimodal RAG pipeline that answers questions about document images
Model compression: INT8 quantisation and pruning for edge deployment (Raspberry Pi, phones)
Benchmark: PP-OCRv4 is reported to rank among the top open-source OCR models on ICDAR and other scene-text benchmarks
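
The model-compression item above mentions INT8 quantisation. As a rough illustration of the general idea only, here is a minimal sketch of symmetric per-tensor quantisation in plain Python. It is not PaddleOCR's actual implementation; the function names `quantise_int8` and `dequantise` and the sample weights are hypothetical.

```python
from typing import List, Tuple


def quantise_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantisation: map floats onto integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale factor for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantise(q: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.0, 1.27]  # made-up values for illustration
q, scale = quantise_int8(weights)
restored = dequantise(q, scale)
print(q)
```

Each restored value differs from the original by at most half a quantisation step (`scale / 2`), which is the accuracy cost paid for storing one byte per weight instead of four.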

Quick Start

pip install paddlepaddle paddleocr

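from paddleocr import PaddleOCR

# Snippet follows the PaddleOCR 2.x API; later releases may change these arguments.
# use_angle_cls=True loads the angle classifier so upside-down text lines are handled.
ocr = PaddleOCR(use_angle_cls=True, lang='en')
result = ocr.ocr('invoice.jpg', cls=True)

# One entry per input image; the entry may be None when no text is detected.
for box, (text, score) in result[0] or []:
    print(text)  # recognised text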
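
Each line in `result[0]` pairs a bounding box with a `(text, confidence)` tuple, so a small post-processing step can drop low-confidence lines and restore reading order. The sketch below assumes that shape; the helper `confident_text`, the `min_score` threshold, and the sample data are hypothetical, and the sample is hand-written rather than real OCR output.

```python
from typing import List, Optional, Sequence, Tuple

Box = Sequence[Sequence[float]]        # four [x, y] corner points
Line = Tuple[Box, Tuple[str, float]]   # (box, (text, confidence))


def confident_text(page: Optional[Sequence[Line]], min_score: float = 0.8) -> List[str]:
    """Return recognised strings top to bottom, dropping low-confidence lines."""
    if not page:  # None or empty: nothing was detected on this image
        return []
    kept = [(box, text) for box, (text, score) in page if score >= min_score]
    # Sort by the first corner point of each box: y first, then x.
    kept.sort(key=lambda item: (item[0][0][1], item[0][0][0]))
    return [text for _, text in kept]


# Hand-written sample in the same shape as result[0]; not real OCR output.
sample_page = [
    ([[10, 50], [200, 50], [200, 70], [10, 70]], ("Total: 42.00", 0.97)),
    ([[10, 10], [200, 10], [200, 30], [10, 30]], ("INVOICE", 0.99)),
    ([[10, 90], [200, 90], [200, 110], [10, 110]], ("s1gnature", 0.41)),
]

print(confident_text(sample_page))  # ['INVOICE', 'Total: 42.00']
```

With real output, the call would be `confident_text(result[0])`. Sorting by the first corner is a simplification that suits single-column documents; multi-column layouts would need layout analysis first.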

npx ai-supply add paddleocr-multilingual-text-recognition

Curated mirror of the open-source PaddleOCR (Apache-2.0). Get it from the source.
