Legal-BERT — Legal Domain Language Model
BERT pre-trained on 12 GB of diverse English legal text (case law, contracts, legislation, and EU/US court documents) for legal NLP tasks.
Legal-BERT is a family of BERT models pre-trained on 12 GB of diverse English legal corpora, including EU legislation, US court opinions, contracts, and legal journals. It consistently outperforms general-purpose BERT on legal NLP tasks such as contract clause classification, legal judgement prediction, and named entity recognition in legal documents.
Key Features
- Three variants: `legal-bert-base-uncased`, `legal-bert-small-uncased`, and `bert-base-uncased` fine-tuned on legal text
- Pre-trained on EU legislation, US court opinions, contracts, and ECHR cases
- Drop-in replacement for `bert-base-uncased`: use the same Hugging Face API
- State-of-the-art performance on EURLEX, ECHR, and contract analysis benchmarks
- Supports fine-tuning with minimal labeled data (few-shot settings)
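
Because the checkpoint loads through the same API as `bert-base-uncased`, attaching a task head for something like contract clause classification should only require changing the model name. The following is a minimal sketch under that assumption, not the authors' training setup: the helper names `build_label_maps` and `load_clause_classifier` and the example labels are hypothetical, and the classification head is freshly initialized, so it still needs fine-tuning on labeled data before its predictions mean anything.

```python
from typing import Dict, List, Tuple

MODEL_NAME = "nlpaueb/legal-bert-base-uncased"


def build_label_maps(labels: List[str]) -> Tuple[Dict[int, str], Dict[str, int]]:
    """Build the id<->label mappings stored in the model config."""
    id2label = dict(enumerate(labels))
    label2id = {label: idx for idx, label in id2label.items()}
    return id2label, label2id


def load_clause_classifier(labels: List[str], model_name: str = MODEL_NAME):
    """Load Legal-BERT with an untrained sequence-classification head.

    Hypothetical helper: downloads weights from the Hugging Face Hub on first use.
    """
    # Imported lazily so the label helper above works without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    id2label, label2id = build_label_maps(labels)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model
```

Calling `load_clause_classifier(["indemnification", "termination"])` would return a tokenizer and model that can be passed to a standard fine-tuning loop. Swapping `model_name` for `bert-base-uncased` gives a general-purpose baseline to compare against.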
Quick Start
```python
from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and encoder weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("nlpaueb/legal-bert-base-uncased")
model = AutoModel.from_pretrained("nlpaueb/legal-bert-base-uncased")

# Tokenize a single clause and run it through the encoder
inputs = tokenizer("The party shall indemnify the other party against all losses.", return_tensors="pt")
outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # [1, seq_len, 768]
```
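
The Quick Start output holds one 768-dimensional vector per token. To get a single vector per clause, one common option is masked mean pooling, which averages the token vectors while ignoring padding positions. The sketch below assumes that pooling strategy; it is not something the Legal-BERT authors prescribe, and `mean_pool` and `embed_clauses` are hypothetical helper names.

```python
import torch


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sequence, skipping padded positions.

    last_hidden_state: [batch, seq_len, hidden]
    attention_mask:    [batch, seq_len], 1 for real tokens and 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # [batch, seq_len, 1]
    summed = (last_hidden_state * mask).sum(dim=1)                   # [batch, hidden]
    counts = mask.sum(dim=1).clamp(min=1.0)                          # guard against division by zero
    return summed / counts


def embed_clauses(texts, model_name: str = "nlpaueb/legal-bert-base-uncased") -> torch.Tensor:
    """Return one pooled vector per input text (downloads the model on first use)."""
    # Imported lazily so mean_pool can be used with torch alone.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    return mean_pool(outputs.last_hidden_state, inputs["attention_mask"])
```

Two pooled vectors can then be compared with `torch.nn.functional.cosine_similarity`. Without task-specific fine-tuning, such similarities are only a rough signal.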
npx ai-supply add legal-bert-base-uncased
Curated mirror of the open-source Legal-BERT (CC-BY-SA-4.0). Get it from the source.