Category
Language & NLP
Translation, summarization, extraction.
34 listings
◐ Model
llama.cpp
Pure C/C++ LLM inference library — run quantized models on CPU, Metal, CUDA and more.
ai-supply
↓ 900k★ 4.9
◆ Skill
NLTK
The Natural Language Toolkit — Python's foundational NLP library for tokenization, POS tagging, parsing, and corpora.
ai-supply
↓ 760k★ 4.7
⠿ Embedding
Sentence Transformers
State-of-the-art sentence and text embeddings — compute semantic similarity, clustering, and dense retrieval.
ai-supply
↓ 750k★ 4.9
▣ Dataset
Hugging Face Datasets
Fast, memory-mapped dataset library for NLP and ML — 50,000+ datasets, streaming, and Arrow-backed processing.
ai-supply
↓ 680k★ 4.8
◐ Model
GPT4All
Run powerful open-source LLMs locally — privacy-first desktop app and Python SDK, no internet required.
ai-supply
↓ 520k★ 4.7
◆ Skill
Hugging Face Tokenizers
Ultra-fast tokenizer library with a Rust core, supporting BPE, WordPiece, and Unigram models; tokenizes a gigabyte of text in under 20 seconds on a server CPU.
ai-supply
↓ 520k★ 4.8
◆ Skill
spaCy
Industrial-strength NLP library for Python with pre-trained pipelines for tokenization, NER, parsing, and more.
ai-supply
↓ 460k★ 4.8
◆ Skill
Hugging Face Accelerate
Run PyTorch training scripts on any hardware — single GPU, multi-GPU, TPU — with minimal code changes.
ai-supply
↓ 450k★ 4.8
⊜ Fine-tuning
DeepSpeed
Microsoft's deep learning optimization library: train trillion-parameter models with ZeRO (Zero Redundancy Optimizer) memory optimization.
ai-supply
↓ 420k★ 4.7
⊜ Fine-tuning
bitsandbytes
8-bit and 4-bit quantization for PyTorch — run and fine-tune LLMs on consumer GPUs with minimal quality loss.
ai-supply
↓ 390k★ 4.7
⊜ Fine-tuning
TRL (Transformer Reinforcement Learning)
Fine-tune LLMs with RLHF, PPO, DPO, SFT, and GRPO — the standard library for aligning language models.
ai-supply
↓ 380k★ 4.8
⊜ Fine-tuning
LLaMA-Factory
Unified fine-tuning framework for 100+ LLMs — SFT, RLHF, DPO, LoRA, QLoRA via Web UI or CLI.
ai-supply
↓ 340k★ 4.8
◐ Model
Mistral-7B-v0.1
Apache-licensed 7B language model from Mistral AI — beats Llama 2 13B on most benchmarks at half the size.
ai-supply
↓ 320k★ 4.8
◐ Model
Transformers
Hugging Face Transformers: state-of-the-art pre-trained models for NLP, vision, audio, and multimodal tasks.
ai-supply
↓ 295k★ 4.9
◐ Model
fastText — Fast Text Classification & Word Vectors
Facebook AI's library for efficient text classification and word representation learning — trains in seconds on millions of documents.
ai-supply
↓ 265k★ 4.6
❝ Prompt
Awesome ChatGPT Prompts
The largest open prompt collection — thousands of curated system prompts for personas, coding, writing, education, and creative tasks. CC0 licensed.
ai-supply
↓ 248k★ 4.9
⠿ Embedding
BGE-large-en-v1.5
MIT-licensed SOTA English embedding model from BAAI — top MTEB leaderboard performer, commercial-friendly.
ai-supply
↓ 230k★ 4.8
◐ Model
Phi-3-mini-4k-instruct
Microsoft's MIT-licensed 3.8B SLM — instruction-tuned, runs on CPU/mobile, punches far above its weight class.
ai-supply
↓ 210k★ 4.7
⠿ Embedding
all-MiniLM-L6-v2
384-dimensional sentence embeddings with tens of millions of downloads — fast, compact, and remarkably accurate for semantic search and clustering.
ai-supply
↓ 210k★ 4.9
❝ Prompt
Prompt Engineering Guide
MIT-licensed comprehensive guide and prompt library — techniques, examples, and templates for every major LLM prompting method.
ai-supply
↓ 187k★ 4.8
◆ Skill
BERTopic
Modular topic modeling framework using transformer embeddings and c-TF-IDF for interpretable, coherent topics.
ai-supply
↓ 180k★ 4.7
◐ Model
Qwen2.5-7B-Instruct
Alibaba's Apache-2.0 7B instruction model — top multilingual performance, 128k context, strong at coding and math.
ai-supply
↓ 175k★ 4.8
◆ Skill
Flair — State-of-the-Art NLP Framework
Simple NLP library with SOTA models for NER, POS tagging, chunking, text classification, and contextual string embeddings.
ai-supply
↓ 145k★ 4.7
⠿ Embedding
nomic-embed-text-v1
Apache-2.0 text embedding model with 8192-token context — the first open, auditable, long-context embedding.
ai-supply
↓ 145k★ 4.7
⊜ Fine-tuning
PEFT
Hugging Face library for parameter-efficient fine-tuning — LoRA, QLoRA, IA3, and more. Fine-tune billion-parameter models on a single consumer GPU.
ai-supply
↓ 98k★ 4.8
▣ Dataset
OpenOrca
MIT-licensed 4.2M instruction dataset — GPT-4/3.5 augmented CoT traces that power top open-source fine-tunes.
ai-supply
↓ 98k★ 4.6
⬡ Pipeline
Haystack
Production-ready NLP pipeline framework for building search, RAG, and question-answering systems with any LLM.
ai-supply
↓ 95k★ 4.6
⊜ Fine-tuning
Axolotl
Apache-2.0 fine-tuning framework — train any HuggingFace model with LoRA/QLoRA/full-fine-tune via a single YAML config.
ai-supply
↓ 93k★ 4.7
▣ Dataset
Databricks Dolly-15k
CC-BY-SA-3.0 instruction dataset of 15k human-written prompt-response pairs, released by Databricks as the first open, human-generated instruction dataset licensed for commercial use.
ai-supply
↓ 84k★ 4.5
◆ Skill
Stanza — Stanford NLP Python Toolkit
Stanford NLP's Python library for tokenisation, sentence segmentation, NER, dependency parsing, and coreference across 70+ languages.
ai-supply
↓ 78k★ 4.6
△ Eval
RAGAS
Apache-2.0 RAG evaluation framework — faithfulness, answer relevancy, context recall, and more in one pip install.
ai-supply
↓ 58k★ 4.6
⬡ Pipeline
txtai
All-in-one semantic search, RAG, and LLM workflow engine — embeddings, vector DB, and pipelines in one library.
ai-supply
↓ 55k★ 4.6
◆ Skill
Argilla — Collaborative Data Annotation for LLMs
Open-source annotation platform for building high-quality fine-tuning and RLHF datasets; integrates with Hugging Face Hub.
ai-supply
↓ 48k★ 4.5
⊕ Plugin
spacy-llm — LLMs in spaCy NLP Pipelines
Integrates LLMs as spaCy pipeline components for NER, classification, lemmatisation, and relation extraction with zero/few-shot prompting.
ai-supply
↓ 14k★ 4.4