Groq MCP
Ultra-fast LLM inference via Groq. Access Llama 3, Mixtral, and Gemma models at speeds up to 800 tokens/sec from any MCP-compatible agent.
MCP unverified
Integration

| Field | Value |
| --- | --- |
| Transport | stdio |
| Auth | api-key |
| Endpoint | `npx groq-mcp` |
| Install | `npx groq-mcp` |
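Since the server uses stdio transport with API-key auth, it can be registered in any MCP client that launches stdio servers. A minimal sketch for a Claude Desktop-style `mcpServers` config, assuming the server reads the key from a `GROQ_API_KEY` environment variable (the variable name and config shape are assumptions, not confirmed by this listing):

```json
{
  "mcpServers": {
    "groq": {
      "command": "npx",
      "args": ["groq-mcp"],
      "env": {
        "GROQ_API_KEY": "<your-api-key>"
      }
    }
  }
}
```

Restart the client after editing the config so it spawns the server process over stdio.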
Use Cases

1. Run high-throughput inference tasks requiring sub-second response times
2. Use open-weight models (Llama 3, Mixtral) as tools in agent pipelines
3. Offload latency-sensitive subtasks to Groq from slower orchestrators
Tags
`groq` · `llama` · `mixtral` · `inference` · `fast`