
MarkItDown

Microsoft's universal document-to-Markdown converter: PDF, DOCX, PPTX, XLSX, HTML, images, audio, and ZIP — all to clean Markdown.

@ai-supply
Installs: 145k
Rating: ★ 4.7
Reviews: 48
↗ Source repository

MarkItDown is Microsoft's open-source utility that converts a wide range of file formats to clean Markdown text, so that documents can be ingested by LLMs and RAG pipelines. It handles PDFs, Word documents, PowerPoint presentations, Excel spreadsheets, HTML pages, images (with OCR/LLM description), audio files (via Whisper), and ZIP archives.

Key Features

  • Universal input — PDF, DOCX, PPTX, XLSX, XLS, HTML, EPUB, MSG, CSV, JSON, XML, WAV, MP3, PNG, JPEG, ZIP
  • LLM-enhanced — optionally use a vision model to describe images embedded in documents
  • Audio transcription — integrates with Whisper for audio-to-text within document pipelines
  • MCP server — official markitdown-mcp lets agents convert files via tool calls
  • CLI + Python API — use from the command line or as a library in pipelines
  • Structure preservation — tables, headings, lists, and code blocks are faithfully converted
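Because headings survive conversion, the Markdown output can be split into sections before it is indexed in a RAG pipeline. The following is a minimal sketch, assuming the converted text uses ATX-style (`#`) headings; `split_by_heading` is a hypothetical helper written for illustration and is not part of MarkItDown.

```python
import re

# Matches ATX headings such as "# Title" or "### Subsection".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_heading(markdown: str) -> list[tuple[str, str]]:
    """Split Markdown into (heading, body) pairs.

    Text before the first heading is returned with an empty heading.
    Caveat: a line starting with '#' inside a fenced code block is
    also treated as a heading; this sketch does not track fences.
    """
    sections: list[tuple[str, str]] = []
    title, body = "", []

    def flush() -> None:
        # Emit the current section unless it is completely empty.
        if title or any(line.strip() for line in body):
            sections.append((title, "\n".join(body).strip()))

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title, body = match.group(2).strip(), []
        else:
            body.append(line)
    flush()
    return sections
```

Each pair can then be embedded or stored separately, with the heading kept as metadata for retrieval.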

Quick Start

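# Install (shell); the quotes stop shells such as zsh from expanding the brackets
pip install 'markitdown[all]'

# Python API
from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("report.pdf")
# Print the first 500 characters of the converted Markdown
print(result.text_content[:500])

# CLI usage (shell)
markitdown presentation.pptx > output.md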
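For pipeline use, the single-file call above can be wrapped in a small batch loop. The sketch below is one possible shape, not an official recipe: `convert_tree` and `markitdown_converter` are hypothetical helper names, the suffix list is an assumption, and the converter is passed in as a plain function so the loop can be exercised without MarkItDown installed.

```python
from pathlib import Path
from typing import Callable, Iterable


def convert_tree(
    src: Path,
    dst: Path,
    to_markdown: Callable[[Path], str],
    suffixes: Iterable[str] = (".pdf", ".docx", ".pptx", ".xlsx", ".html"),
) -> list[Path]:
    """Convert every matching file under src and mirror the tree under dst.

    Output files keep the original name plus ".md" (report.pdf -> report.pdf.md)
    so that report.pdf and report.docx do not overwrite each other.
    """
    wanted = {s.lower() for s in suffixes}
    written: list[Path] = []
    for path in sorted(src.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in wanted:
            continue
        rel = path.relative_to(src)
        out = dst / rel.parent / (rel.name + ".md")
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(to_markdown(path), encoding="utf-8")
        written.append(out)
    return written


def markitdown_converter() -> Callable[[Path], str]:
    """Build a converter backed by MarkItDown (imported lazily)."""
    from markitdown import MarkItDown  # requires: pip install 'markitdown[all]'

    md = MarkItDown()
    return lambda path: md.convert(str(path)).text_content
```

A typical call would be `convert_tree(Path("docs"), Path("markdown"), markitdown_converter())`. Injecting the converter also makes it easy to add per-file error handling later.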

Install via ai-supply

npx ai-supply add markitdown-document-converter

Curated mirror of the open-source MarkItDown project (MIT). Install upstream from the repository.
