Skill · Legal & Compliance · Free

Blackstone — Legal NER & Text Categorizer

spaCy-based NLP pipeline for English legal text: named entity recognition for cases, legislation, and provisions, plus text categorization.

Installs: 21k
Rating: ★ 4.5
Reviews: 7
Source repository

Blackstone is an Apache-2.0-licensed spaCy NLP pipeline developed by ICLR&D, the research lab of the Incorporated Council of Law Reporting for England and Wales, and trained on English case law. Its named entity recognizer is purpose-built for legal text: it identifies cases, legislation, provisions, instruments, neutral citations, and court references. It also ships a text categorizer that classifies sentence types in legal documents (issue, ratio, legal test, etc.).

Key Features

  • Custom NER for legal entities: CASENAME, CITATION, INSTRUMENT (legislation and other instruments), PROVISION, COURT, JUDGE
  • Sentence-level text categorizer trained on case law
  • Span-level co-reference resolution for legal citations
  • Abbreviation detection for statute shorthand
  • Built on spaCy 2.x; its components can be added to a standard spaCy pipeline
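
In spaCy, a text categorizer writes its scores to the `doc.cats` dictionary (label to score), so picking the most likely sentence type comes down to taking the highest-scoring key. The sketch below shows one way to do that on a plain dictionary. The helper name `top_category`, the label names, and the scores are all assumptions made for illustration, not output from the Blackstone model.

```python
def top_category(cats):
    """Return (label, score) for the highest-scoring category, or None if empty.

    `cats` is expected to look like spaCy's doc.cats: a dict of label -> score.
    """
    if not cats:
        return None
    label = max(cats, key=cats.get)
    return label, cats[label]


# Hand-written scores for illustration only (hypothetical labels and values).
example_cats = {"ISSUE": 0.12, "LEGAL_TEST": 0.81, "CONCLUSION": 0.05, "UNCAT": 0.02}

print(top_category(example_cats))  # ('LEGAL_TEST', 0.81)
```

With a loaded model, the same helper could be called as `top_category(doc.cats)` or, sentence by sentence, on each sentence processed as its own document.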

Quick Start

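# Install the package and download the model
pip install blackstone
python -m blackstone.pipeline.download

import spacy
# Optional abbreviation-detection component (imported here, not used in this snippet)
from blackstone.pipeline.abbreviations import AbbreviationDetector

# Load the Blackstone model installed by the commands above
nlp = spacy.load("en_blackstone_proto")
text = "The court in Donoghue v Stevenson [1932] AC 562 held that a duty of care existed."
doc = nlp(text)

# Print one line per detected entity: its text and its label
for ent in doc.ents:
    print(ent.text, ent.label_)
# Expect a CASENAME for "Donoghue v Stevenson" and a CITATION for "[1932] AC 562";
# exact span boundaries depend on the model version.

# Or install through the catalog CLI
npx ai-supply add blackstone-legal-nlp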
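
Once entities are extracted, a common next step is to group them by label, for example to list every case name found in a judgment. The following is a minimal sketch of that step. `group_entities` and `entities_from_text` are hypothetical helper names, the sample pairs are hand-written rather than real model output, and the model name is taken from the Quick Start.

```python
from collections import defaultdict


def group_entities(pairs):
    """Group (text, label) pairs into a dict of label -> list of texts."""
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[label].append(text)
    return dict(grouped)


def entities_from_text(text, model_name="en_blackstone_proto"):
    """Run the model and return (text, label) pairs from doc.ents.

    spaCy is imported lazily so group_entities works without it installed.
    """
    import spacy

    nlp = spacy.load(model_name)
    return [(ent.text, ent.label_) for ent in nlp(text).ents]


# Hand-written sample pairs for illustration (not real model output).
sample = [
    ("Donoghue v Stevenson", "CASENAME"),
    ("[1932] AC 562", "CITATION"),
    ("Caparo Industries plc v Dickman", "CASENAME"),
]

grouped = group_entities(sample)
for label, texts in grouped.items():
    print(label, texts)
```

With the model installed, `group_entities(entities_from_text(text))` would apply the same grouping to real output.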

Curated mirror of the open-source Blackstone (Apache-2.0). Get it from the source.

More from @ai-supply

  • llama.cpp (Model): Pure C/C++ LLM inference library that runs quantized models on CPU, Metal, CUDA and more. Installs: 900k, rating: 4.9
  • vLLM (Connector): High-throughput, memory-efficient LLM inference engine with PagedAttention and continuous batching. Installs: 820k, rating: 4.9
  • MetaGPT (Agent): Multi-agent framework that assigns GPT roles (PM, engineer, QA) to solve complex software tasks end-to-end. Installs: 820k, rating: 4.8
  • NLTK (Skill): The Natural Language Toolkit, Python's foundational NLP library for tokenization, POS tagging, parsing, and corpora. Installs: 760k, rating: 4.7