NetEase-Media/grps_trtllm
A higher-performance OpenAI-compatible LLM service than vLLM serve: a pure C++ service implemented with GRPS, TensorRT-LLM, and Tokenizers.cpp. It supports chat and function calling, AI agents, distributed multi-GPU inference, multimodal capabilities, and a Gradio chat interface.
Platform-specific configuration:
{
  "mcpServers": {
    "grps_trtllm": {
      "command": "npx",
      "args": [
        "-y",
        "grps_trtllm"
      ]
    }
  }
}

Add the config above to .claude/settings.json under the mcpServers key.
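
If .claude/settings.json already has other entries, the grps_trtllm block needs to be merged in rather than pasted over them. The following is a minimal Python sketch of that merge, not an official installer. The helper names `add_mcp_server` and `update_settings_file` are hypothetical and introduced only for illustration; the entry itself is copied from the configuration shown here, and the sketch assumes the settings file is plain JSON.

```python
import json
from pathlib import Path

# Entry copied from the configuration in this listing.
GRPS_ENTRY = {"command": "npx", "args": ["-y", "grps_trtllm"]}


def add_mcp_server(settings: dict, name: str, entry: dict) -> dict:
    """Return a copy of `settings` with `entry` registered under mcpServers[name].

    Hypothetical helper: existing servers and unrelated keys are preserved,
    and the input dict is not modified.
    """
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_settings_file(path: Path) -> None:
    """Merge the grps_trtllm entry into the JSON file at `path`.

    Hypothetical helper: creates the file (and its parent directory)
    if it does not exist yet.
    """
    current = json.loads(path.read_text()) if path.exists() else {}
    updated = add_mcp_server(current, "grps_trtllm", GRPS_ENTRY)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

Calling `update_settings_file(Path(".claude/settings.json"))` from the project root would apply the merge. Copying the dictionaries instead of mutating them keeps the merge logic easy to check before anything is written to disk.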