Sumy — Automatic Text Summarization
Python library with 7 extractive summarization algorithms (including LSA, Luhn, LexRank, and TextRank) for documents and HTML pages.
Installs: 58k
Rating: ★ 4.6
Reviews: 19
Sumy is a Python module and CLI for extractive text summarization, implementing seven established algorithms: LSA, Luhn, Edmundson, LexRank, TextRank, SumBasic, and KL-Sum. It works on raw text, HTML, or plain URLs, and selects sentences from the source document rather than generating new text, so no LLM is required.
Key features
- 7 summarization algorithms; easily swap to compare quality
- Supports HTML page input (strips boilerplate automatically)
- Multi-language support via NLTK tokenizers (30+ languages)
- CLI for quick prototyping; Python API for pipelines
- Zero API calls — fully local, no rate limits or cost
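To make "extractive" concrete, here is a minimal standard-library sketch of the general idea: score each sentence by the average frequency of its content words and keep the top-scoring sentences in document order. This is an illustration only and is not how Sumy implements any of its algorithms; the stop-word list, the naive sentence splitter, and the names `summarize` and `SAMPLE` are assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not Sumy's list).
STOP_WORDS = {"a", "an", "and", "are", "at", "in", "is", "of", "on", "the", "to", "was"}

SAMPLE = (
    "Cats sleep most of the day. Cats hunt mice at night. "
    "The weather was cold. Mice fear cats."
)


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    # Lowercased words with stop words removed.
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]


def summarize(text, sentences_count=2):
    sentences = split_sentences(text)
    # Document-wide frequency of every content word.
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, then restore original document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:sentences_count])]


summary = summarize(SAMPLE, sentences_count=2)
for sentence in summary:
    print(sentence)
```

Every sentence in the output is copied verbatim from the input, which is the defining property of extractive methods. Sumy's algorithms differ in how they score sentences (graph centrality, latent semantic analysis, and so on), not in this basic select-and-copy shape.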
Quick start
pip install sumy
# Summarize a URL with LexRank in 5 sentences (the CLI takes the sentence count via --length)
sumy lex-rank --length=5 --url=https://en.wikipedia.org/wiki/Artificial_intelligence
from sumy.parsers.html import HtmlParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lex_rank import LexRankSummarizer

# Fetch the page and split it into sentences (needs NLTK's sentence tokenizer data)
parser = HtmlParser.from_url("https://example.com/article", Tokenizer("english"))
summarizer = LexRankSummarizer()

# Print the 5 highest-ranked sentences
for sentence in summarizer(parser.document, sentences_count=5):
    print(sentence)
npx ai-supply add sumy-text-summarization
Curated mirror of the open-source Sumy (Apache-2.0). Get it from the source.