Skill · Language & NLP · Free
NLTK
The Natural Language Toolkit — Python's foundational NLP library for tokenization, POS tagging, parsing, and corpora.
Installs: 760k
Rating: ★ 4.7
Reviews: 253
NLTK — Natural Language Toolkit
NLTK is the de facto standard Python library for classical NLP. It provides easy-to-use interfaces to over 50 corpora and lexical resources (including WordNet), along with tokenizers, stemmers, taggers, parsers, semantic reasoners, and a suite of NLP utilities used in education and research worldwide.
Key Features
- Tokenization — word, sentence, and regexp tokenizers, with pretrained Punkt sentence models for multiple languages
- POS tagging — perceptron and Brill taggers; trained on Penn Treebank
- Parsing — CFG, PCFG, dependency, and chart parsers with tree visualization
- Stemming & lemmatization — Porter, Snowball, Lancaster stemmers; WordNet lemmatizer
- Corpora — bundled access to Reuters, Brown, CoNLL, WordNet, and 50+ more via nltk.download()
- Semantic reasoning — first-order logic, lambda calculus, and discourse representation structures
- Classification — Naive Bayes, MaxEnt, Decision Tree classifiers for text
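To make the tokenization and stemming features above concrete, here is a minimal sketch that needs no nltk.download() call. The sample sentence and the `stemmers` dictionary are illustrative choices, not part of NLTK itself. It splits a sentence with a regexp tokenizer and runs the three stemmers side by side so their differing aggressiveness is visible.

```python
from nltk.stem import LancasterStemmer, PorterStemmer, SnowballStemmer
from nltk.tokenize import RegexpTokenizer

# Regexp tokenizer: keep runs of word characters, drop punctuation.
tokenizer = RegexpTokenizer(r"\w+")
tokens = tokenizer.tokenize("The runners were running quickly.")

# Illustrative mapping of label -> stemmer instance (the names are our own).
stemmers = {
    "porter": PorterStemmer(),
    "snowball": SnowballStemmer("english"),
    "lancaster": LancasterStemmer(),
}

# Print each stemmer's output for the same token list.
for name, stemmer in stemmers.items():
    print(name, [stemmer.stem(tok) for tok in tokens])
```

The WordNet lemmatizer is left out here because it requires downloading the WordNet corpus first.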
Quick Start
pip install nltk
import nltk
nltk.download('punkt_tab')
nltk.download('averaged_perceptron_tagger_eng')
from nltk.tokenize import word_tokenize
from nltk.tag import pos_tag
text = "NLTK powers thousands of NLP research papers."
tokens = word_tokenize(text)
print(pos_tag(tokens))
# [('NLTK', 'NNP'), ('powers', 'VBZ'), ...]
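A common next step after tagging is shallow parsing (chunking) of the (word, tag) pairs. The sketch below assumes a hand-written tagged sentence in the same Penn Treebank style that pos_tag returns, so it runs without any downloaded models. The noun-phrase grammar is a simple illustrative pattern, not a recommended production rule.

```python
from nltk import RegexpParser

# Hand-written (word, tag) pairs in the Penn Treebank style.
# These tags are illustrative assumptions, not captured tagger output.
tagged = [
    ("The", "DT"),
    ("quick", "JJ"),
    ("parser", "NN"),
    ("builds", "VBZ"),
    ("small", "JJ"),
    ("trees", "NNS"),
]

# NP = optional determiner, any number of adjectives, one or more nouns.
chunker = RegexpParser("NP: {<DT>?<JJ>*<NN.*>+}")
tree = chunker.parse(tagged)

# Collect the words under every NP subtree as plain strings.
noun_phrases = [
    " ".join(word for word, _tag in subtree.leaves())
    for subtree in tree.subtrees(lambda t: t.label() == "NP")
]
print(noun_phrases)
```

In practice the `tagged` list would come from pos_tag(tokens) as in the Quick Start, and the grammar would likely need more rules to cover real text.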
Install via ai-supply
npx ai-supply add nltk-natural-language-toolkit
Curated mirror of the open-source NLTK (Apache-2.0). Get it from the source.