loaditout.ai

pdf-rag-mcp

MCP Tool

MBaranekTech/pdf-rag-mcp

MCP server for RAG over messy PDFs — semantic search, OCR, table extraction. Works with Claude Desktop, Claude Code, Cursor, and any MCP client.

Install

$ npx loaditout add MBaranekTech/pdf-rag-mcp

Platform-specific configuration:

.claude/settings.json
{
  "mcpServers": {
    "pdf-rag-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "pdf-rag-mcp"
      ]
    }
  }
}

Add the config above to .claude/settings.json under the mcpServers key.
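If `.claude/settings.json` already exists, the entry has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge, not part of the project: the helper names `add_mcp_server` and `update_settings_file` are hypothetical, and the server entry is copied from the config shown above.

```python
import json
from pathlib import Path

# Server entry copied from the config shown above.
PDF_RAG_ENTRY = {"command": "npx", "args": ["-y", "pdf-rag-mcp"]}


def add_mcp_server(settings: dict, name: str, entry: dict) -> dict:
    """Return a copy of `settings` with `entry` registered under mcpServers[name].

    Other top-level keys and other servers are preserved; the input is not mutated.
    """
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_settings_file(path: Path) -> None:
    """Read the JSON file at `path` (if it exists), add the server, and write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    updated = add_mcp_server(current, "pdf-rag-mcp", PDF_RAG_ENTRY)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

Calling `update_settings_file(Path(".claude/settings.json"))` from the project root would apply the change in place.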

About

PDF RAG MCP Server

> MCP server for RAG over messy PDFs — extract, chunk, embed, and search scanned, multi-column, and table-heavy documents.


---

<p align="center"> <br/> <em>All 6 tools running in the MCP Inspector</em> </p>

What is RAG?

RAG (Retrieval-Augmented Generation) is a technique that grounds an AI assistant's answers in your own documents. Instead of relying only on its training data, the assistant first *retrieves* relevant chunks from your files, then uses them as context when generating an answer, which reduces the risk of hallucination.

Traditional AI:  User Question → LLM → Answer (may hallucinate)
RAG:             User Question → Search Your Docs → LLM + Context → Accurate Answer

This MCP server is the "Search Your Docs" part: it ingests PDFs, breaks them into searchable chunks, and lets any MCP-compatible AI assistant retrieve the chunks most relevant to a question.
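The retrieve-then-generate flow can be illustrated with a small, self-contained sketch. It is not this server's implementation: the function names are hypothetical, and word-count vectors stand in for the neural embeddings a real semantic search would use, so this toy version only matches on shared words.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Split text into consecutive chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, chunks: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Return the `top_k` chunks most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:top_k]


def build_prompt(question: str, hits: list[tuple[float, str]]) -> str:
    """Assemble the 'LLM + Context' input from the retrieved chunks."""
    context = "\n".join(f"- {chunk}" for _score, chunk in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

A caller would chunk each document once, then run `search` per question and pass `build_prompt(question, hits)` to the language model.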

Why This Server?

Most PDF tools choke on real-world documents — scanned pages, multi-column layouts, embedded tables. This MCP server handles them all:

  • Scanned PDFs — Automatic OCR via Tesseract when text extraction fails
  • Multi-column layouts — Layout-preserving block sorting with PyMuPDF
  • Tables — Detected and extracted as clean markdown via pdfplumber
  • Semantic search — Find information by meaning, not just keywords
  • 100% local — Embeddings run on your machine. No data leaves your system.
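The first three bullets can be sketched in plain Python. These are illustrative assumptions rather than the server's actual code: the helper names and the `min_chars` threshold are hypothetical, the block tuples follow the leading fields of what PyMuPDF's `page.get_text("blocks")` returns, and the table rows follow the list-of-rows shape that pdfplumber's `extract_table()` returns (cells may be `None`). The column sort assumes a simple two-column page.

```python
def needs_ocr(extracted_text: str, min_chars: int = 20) -> bool:
    """Heuristic: treat a page as scanned when direct extraction yields almost no text."""
    return len(extracted_text.strip()) < min_chars


def sort_blocks_two_column(blocks, page_width):
    """Order blocks left column first, then right column, top to bottom within each.

    Each block is (x0, y0, x1, y1, text). Assumes exactly two columns split at mid-page.
    """
    mid = page_width / 2

    def reading_key(block):
        x0, y0, x1, _y1, _text = block
        column = 0 if (x0 + x1) / 2 < mid else 1
        return (column, y0, x0)

    return sorted(blocks, key=reading_key)


def rows_to_markdown(rows):
    """Render a table (first row is the header) as a markdown table."""

    def cell(value):
        # Empty cells become blanks; newlines and pipes would break the table syntax.
        if value is None:
            return ""
        return str(value).replace("\n", " ").replace("|", "\\|").strip()

    header, *body = rows
    lines = [
        "| " + " | ".join(cell(c) for c in header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in body:
        lines.append("| " + " | ".join(cell(c) for c in row) + " |")
    return "\n".join(lines)
```

Real documents often mix full-width headings with columns, so a production sort would need to detect column boundaries per page instead of splitting at the midpoint.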
Demo

Ingest a PDF and search it

<p align="center"> <em>Ingesting a PDF: extracts text, chunks it, generates e</em> </p>

Tags

ai, document-processing, fastmcp, llm, mcp, mcp-server, model-conte, nlp, ocr, pdf, pymupdf, python, rag, semantic-search


Quality Signals

Installs: 0
Last updated: 20 days ago
Security: A
README

Safety

Risk Level: medium
Data Access: read
Network Access: none

Details

Source: github-crawl
Last commit: 3/28/2026

Embed Badge

[![Loaditout](https://loaditout.ai/api/badge/MBaranekTech/pdf-rag-mcp)](https://loaditout.ai/skills/MBaranekTech/pdf-rag-mcp)