# screenread
Your AI agent doesn't need screenshots to read text. ScreenRead gives it the macOS accessibility tree in ~100ms. CLI + MCP server.
MCP configuration for Claude Code:

```json
{
  "mcpServers": {
    "screenread": {
      "command": "npx",
      "args": ["-y", "screenread"]
    }
  }
}
```

Add the config above to `.claude/settings.json` under the `mcpServers` key.
<p align="center"> <picture> <source media="(prefers-color-scheme: dark)" srcset="assets/logo-dark.svg"> <source media="(prefers-color-scheme: light)" srcset="assets/logo.svg"> <img src="assets/logo.svg" alt="ScreenRead logo"> </picture> </p>
<p align="center"> Read what's on screen without taking a screenshot. </p>
ScreenRead gives AI agents access to the macOS accessibility tree — the same structured data that powers VoiceOver and other screen readers. Instead of capturing pixels and feeding them through vision models, your agent gets instant, structured text describing every UI element on screen.
A read takes ~100ms instead of 1-3 seconds, with zero hallucination: it reports what the OS knows, not what a model thinks it sees.
Most AI agent tooling uses screenshots to "see" the screen: capture pixels, feed them through a vision model, and wait for a description.
But ~90% of agent tasks are text-based: "what does the error say?", "is this button visible?", "what's the page title?". Screenshots are overkill.
ScreenRead skips all of that. It asks macOS directly: "What UI elements exist in this window?" and returns structured text instantly.
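That OS-level query can be approximated with Apple's ApplicationServices accessibility API. A minimal sketch of walking an app's AX tree and printing each element as indented text (an assumption for illustration, not ScreenRead's actual implementation; the calling process needs the Accessibility permission, and the pid in the usage comment is hypothetical):

```swift
import Foundation
#if canImport(ApplicationServices)
import ApplicationServices
#endif

// Pure helper: render one element as an indented "role: value" line.
func formatLine(role: String, value: String, depth: Int) -> String {
    String(repeating: "  ", count: depth) + role + (value.isEmpty ? "" : ": " + value)
}

#if canImport(ApplicationServices)
// Copy a single accessibility attribute, or nil if the element lacks it.
func copyAttr(_ element: AXUIElement, _ name: String) -> CFTypeRef? {
    var value: CFTypeRef?
    AXUIElementCopyAttributeValue(element, name as CFString, &value)
    return value
}

// Depth-first walk over the AX tree, printing role and textual value.
func walk(_ element: AXUIElement, depth: Int = 0) {
    let role = copyAttr(element, kAXRoleAttribute) as? String ?? "?"
    let value = copyAttr(element, kAXValueAttribute) as? String ?? ""
    print(formatLine(role: role, value: value, depth: depth))
    if let children = copyAttr(element, kAXChildrenAttribute) as? [AXUIElement] {
        for child in children { walk(child, depth: depth + 1) }
    }
}

// Usage (hypothetical pid): walk(AXUIElementCreateApplication(12345))
#endif
```

Because the data is structured from the start, there is nothing for a vision model to misread; the role and value strings come straight from the toolkit that drew the UI.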
Build from source:

```sh
git clone https://github.com/Bambushu/screenread.git
cd screenread
swift build -c release
cp .build/release/screenread ~/.local/bin/
cp .build/release/screenread-mcp ~/.local/bin/
```

Usage:

```sh
# Read the frontmost app
screenread

# Read a specific app
screenread --app Safari

# Fuzzy-match a window title
screenread --window "inbox"

# Text only (no structure)
screenread --app Warp --text-only

# Shallow read (depth 2)
screenread --app Finder --shallow

# Full text, no truncation
screenread --app Terminal -
```