Cerebras MCP
MCP server for the Cerebras Inference API — the world's fastest AI inference engine. Run Llama and other open models at 1,800+ tokens/second for latency-sensitive agentic workloads.
MCP: unverified
Transport: stdio
Auth: api-key
Endpoint: npx -y @cerebras/mcp-server
Install
npx -y @cerebras/mcp-server
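Since the server uses stdio transport with API-key auth, a typical MCP client would launch the command above as a subprocess. A minimal configuration sketch in the common `mcpServers` style follows; the `CEREBRAS_API_KEY` environment variable name is an assumption, so check the server's documentation for the exact key it expects.

```json
{
  "mcpServers": {
    "cerebras": {
      "command": "npx",
      "args": ["-y", "@cerebras/mcp-server"],
      "env": {
        "CEREBRAS_API_KEY": "<your-api-key>"
      }
    }
  }
}
```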
01 Run Llama 3 and other open models at ultra-low latency
02 Power latency-sensitive agent loops with 1,800+ tokens/s throughput
03 Route cost-sensitive workloads to fast open-model inference
inference llm llama fast-inference open-models cerebras