rouapps/caret
Terminal tool for inspecting and cleaning large LLM training datasets. Handles JSONL, Parquet, and CSV with memory-mapped I/O, near-duplicate detection, token visualization, dataset linting, and an MCP server.
Platform-specific configuration:
{
  "mcpServers": {
    "caret": {
      "command": "npx",
      "args": [
        "-y",
        "caret"
      ]
    }
  }
}

Add the config above to .claude/settings.json under the mcpServers key.
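
If .claude/settings.json already holds other settings, pasting the snippet by hand risks overwriting them. The following is a minimal sketch of merging the entry programmatically, assuming the file is plain JSON with no comments; the helper name `add_mcp_server` and the constant `CARET_ENTRY` are hypothetical and not part of caret.

```python
import json
from pathlib import Path

# Server entry copied from the listing's config; the name CARET_ENTRY is illustrative.
CARET_ENTRY = {"command": "npx", "args": ["-y", "caret"]}


def add_mcp_server(settings_path, name, entry):
    """Merge one server entry under "mcpServers", keeping all other keys intact."""
    path = Path(settings_path)
    settings = {}
    # Reuse the existing settings when the file is present and non-empty.
    if path.exists():
        text = path.read_text(encoding="utf-8")
        if text.strip():
            settings = json.loads(text)
    # Create the "mcpServers" object if missing, then add or replace the entry.
    settings.setdefault("mcpServers", {})[name] = entry
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Calling `add_mcp_server(".claude/settings.json", "caret", CARET_ENTRY)` from the project root would write the entry shown above while leaving any other servers and settings in place.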