Category

Research

Web, papers, knowledge graphs, citation.

17 listings

Every Research listing on ai-supply is a free, open-source AI capability that we scan and grade A–D for security before it appears here, so you can adopt it with confidence rather than relying on a star count alone. This category spans 4 Agents, 3 Evals, 3 MCP servers, 2 Pipelines, 2 Connectors, 1 Workflow, 1 Dataset, and 1 Template.

Security posture: 6 of 17 scanned listings rated safe · average score 84/100.


LM Evaluation Harness

EleutherAI's MIT-licensed unified benchmark suite — the de-facto standard for evaluating language models across 200+ tasks.

✓ A · 100 · ⟳ 2mo ago · guards:2

Autonomous AI research agent that searches the web and produces detailed, cited research reports in minutes.

! B · 75 · ⟳ 10d ago · shell · network · fs

MIT-licensed framework for evaluating LLMs and AI systems — build custom evals, run model comparisons, log results.

✓ A · 100 · ⟳ 3mo ago · guards:4

Stanford's LLM-powered knowledge curation agent that researches any topic and generates a full Wikipedia-style article.

! B · 75 · ⟳ 1y ago · network · fs

Microsoft's graph-based RAG: build knowledge graphs from documents for global, multi-hop reasoning beyond vector search.

! B · 75 · ⟳ 10d ago · secrets · shell · network

Tavily Search MCP Server

Official Tavily MCP server giving agents real-time web search, content extraction, site mapping, and crawling.

✓ A · 100 · ⟳ 3d ago · network · fs

Firecrawl MCP Server

Official Firecrawl MCP server — web search, scraping to clean data, and autonomous deep research.

! B · 88 · ⟳ 10mo ago · shell · network · fs

AI agent that retrieves, reads, and synthesises answers from scientific PDFs with citation-level accuracy.

! B · 75 · ⟳ 4mo ago · network · fs

AI-powered literature discovery and review engine for medical and scientific papers, built on txtai semantic search.

✓ A · 100 · ⟳ 1y ago · secrets · network · fs

HELM — Holistic Evaluation of Language Models

Stanford CRFM's reproducible, multi-metric benchmark framework for evaluating any foundation model.

! B · 75 · ⟳ 2mo ago · guards:7

MMLU: 57-subject multiple-choice benchmark for broad LLM knowledge and reasoning.

✓ A · 100 · ⟳ 3y ago

arxiv.py — arXiv API Python Wrapper

Pythonic client for the arXiv API: search, download, and stream 2M+ preprints by query, author, or ID.

✓ A · 95 · ⟳ 2mo ago · network · fs

Fast retrieval-augmented generation framework that fuses knowledge-graph structure with vector retrieval.

! D · 58 · ⟳ 15d ago · secrets · shell · install-hook

Prompt-driven web scraping: describe the data you want and LLM graph pipelines extract clean structured JSON from any page.

! B · 75 · ⟳ 8d ago · shell · network · fs

arxiv-sanity-lite

Self-hostable web app to tag arXiv papers and get recommendations of similar papers using SVMs over tf-idf abstract features.

! B · 88 · ⟳ 3y ago

semanticscholar

Unofficial Python client for the Semantic Scholar Academic Graph APIs: papers, authors, citations, and references.

! B · 75 · ⟳ 4mo ago · network · fs

Connects AI assistants to Exa web search, code search, and company/deep research.

! B · 75 · ⟳ 4d ago · network · fs
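
The security posture summary near the top of this page can be checked against the badges listed above. The sketch below is only an illustration: the grades and scores are copied from the 17 badges in page order, while the function name `summarize_posture` and the rule that "rated safe" means grade A (the ✓ marker) are assumptions, not something the page states.

```python
from typing import List, Tuple

# (grade, score) for each of the 17 listings, in the order they appear on this page.
BADGES: List[Tuple[str, int]] = [
    ("A", 100), ("B", 75), ("A", 100), ("B", 75), ("B", 75), ("A", 100),
    ("B", 88), ("B", 75), ("A", 100), ("B", 75), ("A", 100), ("A", 95),
    ("D", 58), ("B", 75), ("B", 88), ("B", 75), ("B", 75),
]


def summarize_posture(badges: List[Tuple[str, int]]) -> Tuple[int, int, int]:
    """Return (safe_count, total, rounded_average_score).

    Assumption: a listing counts as "safe" when its grade is "A".
    """
    total = len(badges)
    safe = sum(1 for grade, _ in badges if grade == "A")
    average = round(sum(score for _, score in badges) / total)
    return safe, total, average


if __name__ == "__main__":
    safe, total, average = summarize_posture(BADGES)
    print(f"{safe} of {total} scanned rated safe · avg score {average}/100")
```

Under that assumption the result matches the published summary: 6 of 17 listings are grade A, and the scores sum to 1429, which averages to about 84.06 and rounds to 84.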