Keshab0310/agent-memory
Save 60-90% on LLM token costs with intelligent memory compression for multi-agent systems
Platform-specific configuration:
{
  "mcpServers": {
    "agent-memory": {
      "command": "npx",
      "args": ["-y", "agent-memory"]
    }
  }
}

Add the config above to .claude/settings.json under the mcpServers key.
agent-memory compresses raw LLM tool output into structured observations, shares context across agents via a memory bus, and injects only relevant memory into each prompt — keeping your token budget under control.
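The compression step can be pictured with a small sketch. This is hypothetical logic, not agent-memory's actual implementation: the helper names (`estimate_tokens`, `compress_observation`) and the ~4-characters-per-token estimate are assumptions for illustration; the idea is simply that a structured record (title, narrative, facts) replaces raw tool output.

```python
# Hypothetical sketch of observation compression (not agent-memory's real API).
def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def compress_observation(raw_output: str, title: str, facts: list[str]) -> dict:
    """Keep a title, a one-line narrative, and key facts; drop the rest."""
    narrative = raw_output.splitlines()[0][:200] if raw_output else ""
    return {"title": title, "narrative": narrative, "facts": facts}

raw = "500 Internal Server Error\n" + "stack trace line\n" * 400
obs = compress_observation(raw, "Pagination bug in /users", ["Max page size is 100"])
compressed_text = obs["title"] + obs["narrative"] + " ".join(obs["facts"])
ratio = estimate_tokens(raw) / estimate_tokens(compressed_text)
print(f"compression ratio: {ratio:.0f}x")
```

On verbose tool output like a long stack trace, even this naive version easily exceeds the lower end of the ratios quoted above, since the structured record is a few dozen tokens regardless of how large the raw output was.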
---
Running 5+ concurrent LLM agents burns tokens fast.
agent-memory sits between your agents and their context window:
Raw Tool Output (5,000 tokens)
-> Observation Compression (500 tokens)
-> Shared Memory Bus (SQLite + FTS5)
-> Budget-Controlled Context Injection (8,000 token cap)

Tested results: 66-94% token savings, 3-74x compression ratio.
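The shared memory bus stage above can be sketched with Python's built-in sqlite3 module. The schema here is illustrative, not agent-memory's actual one; the point is that an FTS5 virtual table lets one agent retrieve only the rows relevant to its task, rather than replaying every other agent's output.

```python
# Minimal sketch of a shared memory bus on SQLite + FTS5 (illustrative schema).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE observations USING fts5(agent_id, title, narrative)"
)
conn.execute(
    "INSERT INTO observations VALUES (?, ?, ?)",
    ("researcher-1", "Pagination bug", "500 error when page > 100"),
)
conn.execute(
    "INSERT INTO observations VALUES (?, ?, ?)",
    ("researcher-2", "Auth works", "Login flow verified end to end"),
)

# Full-text search pulls only the memory relevant to the current task.
rows = conn.execute(
    "SELECT agent_id, title FROM observations WHERE observations MATCH ?",
    ("pagination",),
).fetchall()
print(rows)  # -> [('researcher-1', 'Pagination bug')]
```

FTS5's default tokenizer is case-insensitive, so the query "pagination" matches the stored title "Pagination bug" while the unrelated auth observation is filtered out entirely.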
---
pip install agent-memory

from agent_memory import MemoryStore, ContextBuilder, Observation
# Initialize
memory = MemoryStore("./my_project.db")
# Store a compressed observation
memory.store_observation(Observation(
agent_id="researcher-1",
project="my-app",
title="Found pagination bug in /users endpoint",
narrative="The API returns 500 when page > 100 due to missing LIMIT clause",
facts=["Max page size is 100", "No server-side validation"],
concepts=["api", "bug", "pagination"],
))
# Build context for another agent (token-budgeted)
builder = ContextBuilder(memory)
context = builder.build(
project="my-app",
agent_id="coder-1",
task_description="Fix the pagination bug",
)
# -> Returns compressed context within 8000 token budget
# -> Includes researcher-1's findings automatically

# Add the marketplace
claude plugin marketplace add Keshab03
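Behind a call like builder.build(), the token cap can be pictured as a greedy packer. This is a sketch of the general technique, not the library's implementation: `build_context`, the relevance scores, and the len/4 token estimate are all assumptions for illustration.

```python
# Hypothetical sketch of budget-controlled context injection: greedily pack
# the most relevant observations until the token cap is reached.
def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def build_context(observations: list[tuple[float, str]], budget: int = 8000) -> list[str]:
    """observations: (relevance_score, text) pairs; pick high relevance first."""
    picked, used = [], 0
    for _, text in sorted(observations, key=lambda o: -o[0]):
        cost = estimate_tokens(text)
        if used + cost > budget:
            continue  # skip anything that would blow the budget
        picked.append(text)
        used += cost
    return picked

obs = [
    (0.9, "bug report " * 500),   # most relevant, fits the budget
    (0.8, "api notes " * 300),    # relevant but would exceed the cap
    (0.2, "chatter " * 5000),     # low relevance, far too large
]
ctx = build_context(obs, budget=2000)
```

The hard cap means the prompt size stays bounded no matter how much shared memory accumulates; less relevant or oversized observations are simply left out of the injected context.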