pdf-rag-mcp

MCP Tool

MBaranekTech/pdf-rag-mcp

MCP server for RAG over messy PDFs — semantic search, OCR, table extraction. Works with Claude Desktop, Claude Code, Cursor, and any MCP client.

Install

$ npx loaditout add MBaranekTech/pdf-rag-mcp

Platform-specific configuration:

.claude/settings.json

{
  "mcpServers": {
    "pdf-rag-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "pdf-rag-mcp"
      ]
    }
  }
}

Add the config above to .claude/settings.json under the mcpServers key.
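
If you prefer to script this step, the following is a minimal sketch of merging the entry into an existing settings file without overwriting other keys. The helper name `add_server` is hypothetical and not part of pdf-rag-mcp or loaditout; the server entry is copied from the config above.

```python
import json
from pathlib import Path

# Server entry copied from the config shown above.
SERVER_ENTRY = {"command": "npx", "args": ["-y", "pdf-rag-mcp"]}


def add_server(settings_path, name="pdf-rag-mcp", entry=SERVER_ENTRY):
    """Merge one MCP server entry into a settings file (hypothetical helper).

    Existing top-level keys and other servers are left untouched.
    """
    path = Path(settings_path)
    # Start from the existing settings if the file is already there.
    settings = json.loads(path.read_text()) if path.exists() else {}
    # Create the mcpServers key if missing, then add or replace this server.
    settings.setdefault("mcpServers", {})[name] = dict(entry)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings


if __name__ == "__main__":
    add_server(".claude/settings.json")
```

Running the script from a project root creates or updates `.claude/settings.json` in that directory.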

About

PDF RAG MCP Server

> MCP server for RAG over messy PDFs — extract, chunk, embed, and search scanned, multi-column, and table-heavy documents.


---

All 6 tools running in the MCP Inspector

What is RAG?

RAG (Retrieval-Augmented Generation) is a technique that gives AI assistants access to your own documents. Instead of relying only on training data, the assistant first *retrieves* relevant chunks from your files, then uses them as context to generate answers grounded in those files.

Traditional AI:  User Question → LLM → Answer (may hallucinate)
RAG:             User Question → Search Your Docs → LLM + Context → Accurate Answer

This MCP server is the "Search Your Docs" part: it ingests PDFs, breaks them into searchable chunks, and lets any MCP-compatible AI assistant retrieve the relevant passages.
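
To make the chunk-then-search flow concrete, here is a toy sketch using only the Python standard library. It is not the server's implementation: word-count vectors stand in for real embeddings (so matching is keyword-based, not semantic), and the function names and the chunk `size`/`overlap` defaults are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_words(text, size=40, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, chunks, top_k=3):
    """Return up to top_k (score, chunk) pairs, best match first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks that share no tokens with the query.
    return [(s, c) for s, c in scored[:top_k] if s > 0]
```

In a real RAG pipeline the count vectors would be replaced by dense embeddings from a model, and the retrieved chunks would be passed to the LLM as context.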

Why This Server?

Most PDF tools choke on real-world documents — scanned pages, multi-column layouts, embedded tables. This MCP server handles them all:

Scanned PDFs — Automatic OCR via Tesseract when text extraction fails
Multi-column layouts — Layout-preserving block sorting with PyMuPDF
Tables — Detected and extracted as clean markdown via pdfplumber
Semantic search — Find information by meaning, not just keywords
100% local — Embeddings run on your machine. No data leaves your system.
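
The OCR fallback in the first item can be sketched as a simple decision: use the PDF's text layer when it yields enough text, otherwise run OCR. This is a sketch under assumptions, not the server's code. The function name, the `min_chars` threshold, and the returned source labels are all hypothetical; the two callables are injected so the logic can be shown without depending on a specific library.

```python
from typing import Callable, Tuple


def extract_page_text(
    extract_text: Callable[[], str],
    run_ocr: Callable[[], str],
    min_chars: int = 20,  # assumed threshold for "text extraction failed"
) -> Tuple[str, str]:
    """Return (text, source), falling back to OCR when the text layer is thin."""
    text = extract_text().strip()
    if len(text) >= min_chars:
        # The embedded text layer looks usable; skip the slower OCR pass.
        return text, "text-layer"
    # Little or no embedded text: treat the page as scanned and run OCR.
    return run_ocr().strip(), "ocr"
```

In practice `extract_text` might wrap PyMuPDF's `page.get_text()` and `run_ocr` might wrap `pytesseract.image_to_string` on a rendered page image; how the server actually wires these together is not shown in this listing.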

Demo

Ingest a PDF and search it

Ingesting a PDF — extracts text, chunks it, generates e

Quality Signals

Installs

Last updated: 20 days ago

Security: A

README

Safety

Risk Level: medium

Data Access: read

Network Access: none

Details

Source: github-crawl

Last commit: 3/28/2026

Embed Badge

[![Loaditout](https://loaditout.ai/api/badge/MBaranekTech/pdf-rag-mcp)](https://loaditout.ai/skills/MBaranekTech/pdf-rag-mcp)
