loaditout.ai

compress-tokens

MCP Tool

Amir-Zecharia/compress-tokens

MCP server that compresses text by removing unnecessary tokens using local LLM surprisal scoring

Install

$ npx loaditout add Amir-Zecharia/compress-tokens

Platform-specific configuration:

.claude/settings.json
{
  "mcpServers": {
    "compress-tokens": {
      "command": "npx",
      "args": [
        "-y",
        "compress-tokens"
      ]
    }
  }
}

Add the config above to .claude/settings.json under the mcpServers key.

About

compress-tokens

An MCP server that compresses text by removing low-information tokens using local LLM surprisal scoring via candle. No API keys. No cloud. Everything runs on your machine.

How it works

Each token in the input is scored by its surprisal — how unexpected it is given all preceding tokens, computed by a local quantized LLM. Tokens with low surprisal (predictable filler) are dropped; tokens with high surprisal (informative content) are kept. The remaining tokens are decoded back to text.
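The keep-ratio step can be sketched in plain Rust. The tokens and surprisal scores below are illustrative; the real server computes surprisal with a local quantized model via candle, which this sketch does not attempt.

```rust
// Hypothetical sketch of keep-ratio filtering over precomputed surprisal
// scores. Not the server's actual code; scores here are made up.
fn compress(tokens: &[&str], surprisal: &[f64], keep_ratio: f64) -> Vec<String> {
    let keep = ((tokens.len() as f64) * keep_ratio).ceil() as usize;
    // Rank token positions by surprisal, highest (most informative) first.
    let mut idx: Vec<usize> = (0..tokens.len()).collect();
    idx.sort_by(|&a, &b| surprisal[b].partial_cmp(&surprisal[a]).unwrap());
    // Keep the top fraction, then restore original order so the text reads.
    let mut kept: Vec<usize> = idx.into_iter().take(keep).collect();
    kept.sort();
    kept.into_iter().map(|i| tokens[i].to_string()).collect()
}

fn main() {
    let tokens = ["the", "quick", "brown", "fox", "jumps"];
    let surprisal = [0.5, 3.2, 2.8, 4.1, 1.0];
    // keep_ratio 0.6 keeps ceil(5 * 0.6) = 3 tokens.
    println!("{}", compress(&tokens, &surprisal, 0.6).join(" ")); // quick brown fox
}
```

Dropping the low-surprisal positions while preserving order is what keeps the output readable as text rather than a bag of keywords.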

The primary use case is reducing context window usage: Claude Code can call compress_file on a large file and get back a shorter version that preserves the information-dense parts before reasoning over it.
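For example, a JSON-RPC tools/call request for compress_file might look like the following. The method and params shape follow the MCP spec; keep_ratio and output_path are documented by this server, but the file path, its parameter name, and the id are illustrative assumptions.

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "compress_file",
    "arguments": {
      "path": "./large-notes.md",
      "keep_ratio": 0.7,
      "output_path": "./large-notes.compressed.md"
    }
  }
}
```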

Tools

| Tool | Description |
|---|---|
| compress_text | Compress text with an explicit keep_ratio (fraction of tokens to keep, default 0.7) |
| compress_text_auto | Compress text with an automatic keep ratio via elbow detection on the surprisal curve |
| compress_file | Read a file, compress it, and return the result. Optionally write to output_path. Large files are chunked at 2048 tokens. |
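One simple elbow heuristic is to sort the surprisal scores in descending order and cut at the largest single drop between adjacent scores. This is an illustrative sketch, not necessarily the detection method the server uses:

```rust
// Illustrative elbow heuristic: keep everything before the steepest drop
// in the sorted surprisal curve. The server's actual method may differ.
fn auto_keep_count(surprisal: &[f64]) -> usize {
    let mut sorted = surprisal.to_vec();
    sorted.sort_by(|a, b| b.partial_cmp(a).unwrap());
    if sorted.len() < 2 {
        return sorted.len();
    }
    let mut cut = sorted.len();
    let mut best_drop = 0.0;
    for i in 0..sorted.len() - 1 {
        let drop = sorted[i] - sorted[i + 1];
        if drop > best_drop {
            best_drop = drop;
            cut = i + 1; // keep the scores before the big drop
        }
    }
    cut
}

fn main() {
    // A curve with a clear knee after the third score.
    let s = [5.0, 4.6, 4.4, 1.2, 1.0, 0.9];
    println!("{}", auto_keep_count(&s)); // 3
}
```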

Installation
Prerequisites
  • Rust toolchain (curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh)
  • Claude Code
Build
git clone https://github.com/Amir-Zecharia/compress-tokens
cd compress-tokens
cargo build --release

On macOS, enable Metal GPU acceleration:

cargo build --release --features metal
Register with Claude Code
claude mcp add compress-tokens /path/to/compress-tokens/target/release/compress-tokens --scope user

On first use, the server downloads the default model (~700MB) from HuggingFace and caches it locally. All subsequent starts load from cache.

Memory usage

The server loads the model at startup and exits automatically after 60 seconds of inactivity, freeing the memory the model occupies.
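An inactivity check of this kind can be sketched in std-only Rust. The 60-second figure comes from the text above; everything else here (the function, its names, the watchdog design) is an assumed illustration, not the server's actual code.

```rust
use std::thread;
use std::time::{Duration, Instant};

// Illustrative idle check: a real server would refresh `last_activity`
// on every request, and a background thread would call
// std::process::exit(0) once this returns true.
fn should_exit(last_activity: Instant, idle_limit: Duration) -> bool {
    last_activity.elapsed() > idle_limit
}

fn main() {
    let last_activity = Instant::now();
    thread::sleep(Duration::from_millis(5));
    // Tiny limit so the example is observable without waiting 60 seconds.
    println!("{}", should_exit(last_activity, Duration::from_millis(1))); // true
    println!("{}", should_exit(last_activity, Duration::from_secs(60))); // false
}
```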

Tags

candle, claude, context-compression, gguf, llm, local-ai, mcp, rust


Quality Signals

  • Stars: 1
  • Installs: 0
  • Last updated: 22 days ago
  • Security: A

Safety

  • Risk level: medium
  • Data access: read
  • Network access: none

Details

  • Source: github-crawl
  • Last commit: 3/26/2026
  • View on GitHub →

Embed Badge

[![Loaditout](https://loaditout.ai/api/badge/Amir-Zecharia/compress-tokens)](https://loaditout.ai/skills/Amir-Zecharia/compress-tokens)