Models & LLM Access
10 model providers and 6 AI infrastructure tools. Cloud and local inference, all available as MCP tools.
Model Providers
LLM inference accessible as MCP tools. Call models directly from any agent workflow.
| Provider | Models | Speed | Deployment | Auth | Status |
|---|---|---|---|---|---|
| **Replicate MCP**: Official MCP server for Replicate, enabling agents to run ML model inference across thousands of open-source models. | 1000s of open models | variable | cloud | api-key | verified |
| **Perplexity MCP**: MCP server for Perplexity's AI-powered search API, providing sourced answers with citations for agent research tasks. | Sonar (search-augmented) | standard | cloud | api-key | verified |
| **OpenAI MCP**: Access OpenAI models including GPT-4o, o1, DALL-E, and Whisper from MCP-compatible agents. Use OpenAI capabilities as a tool inside any MCP workflow. | GPT-4o, o1, DALL-E, Whisper | standard | cloud | api-key | |
| **Groq MCP**: Ultra-fast LLM inference via Groq. Access Llama 3, Mixtral, and Gemma models at speeds up to 800 tokens/sec from any MCP-compatible agent. | Llama 3, Mixtral, Gemma | 800+ tok/s | cloud | api-key | |
| **HuggingFace MCP**: Access 900,000+ models on HuggingFace Hub. Run inference, search models, retrieve datasets, and interact with Spaces from AI agent workflows. | 900K+ models on Hub | variable | both | api-key | verified |
| **Cohere MCP**: Access Cohere Command and Embed models from AI agents. Generate text, create embeddings, rerank search results, and build RAG pipelines via the Cohere API. | Command R+, Embed, Rerank | standard | cloud | api-key | verified |
| **Mistral MCP**: Access Mistral AI models including Mistral Large, Codestral, and Pixtral from MCP-compatible agents for text generation, code completion, and vision tasks. | Mistral Large, Codestral, Pixtral | standard | cloud | api-key | |
| **Anthropic Claude MCP**: Call Anthropic Claude models as tools within MCP-compatible agent workflows. Access Claude 3.5 Sonnet, Haiku, and Opus for text generation, analysis, and reasoning. | Claude 3.5 Sonnet, Haiku, Opus | standard | cloud | api-key | |
| **Together AI MCP**: Together AI inference API via MCP. Run open-source LLMs (Llama, Mistral, DBRX, and more) with fast parallel inference at scale. | Llama, Mistral, DBRX, Qwen | 800+ tok/s | cloud | api-key | verified |
| **Ollama MCP**: Local LLM inference via Ollama. Run Llama, Mistral, Gemma, and other models locally, with no API keys and no data leaving the machine. | Llama, Mistral, Phi, Qwen (local) | local hw | local | none | verified |
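Each provider above registers with an MCP client the same way. A hypothetical configuration fragment in the `mcpServers` shape used by Claude Desktop and similar hosts; the package name `groq-mcp-server` and its launch command are placeholders, not the provider's actual distribution:

```json
{
  "mcpServers": {
    "groq": {
      "command": "npx",
      "args": ["-y", "groq-mcp-server"],
      "env": { "GROQ_API_KEY": "<your key>" }
    }
  }
}
```

Once registered, the agent sees the provider's models as ordinary MCP tools and can call them like any other tool in the workflow.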
**Local inference:** Ollama MCP is the only local inference provider in this set. No API key, no egress, no token costs; it runs Llama, Mistral, Phi, and Qwen on your own hardware. Requires Ollama installed locally.
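A minimal sketch of what sits behind that local option: Ollama serves a REST API on `localhost:11434`, so its generate endpoint can be called with nothing beyond the standard library. The model name assumes you have already pulled one locally (e.g. `ollama pull llama3`).

```python
import json
import urllib.request

# Ollama's default local generate endpoint
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    """Minimal non-streaming payload for Ollama's generate API."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the completion text."""
    payload = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Requires a running Ollama daemon with the model pulled
    print(generate("llama3", "Explain MCP in one sentence."))
```

Nothing in this path touches the network outside the machine, which is exactly the property the callout above is describing.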
AI Infrastructure
Memory, reasoning, observability, and search tools for building reliable AI systems.
| Tool | Purpose | Protocols | Auth | Status |
|---|---|---|---|---|
| **Context7 MCP**: Live library documentation fetcher for LLMs. Resolves library names to current API docs, preventing hallucinations on outdated APIs. | documentation · libraries | MCP | none | verified |
| **Sequential Thinking MCP**: Structured multi-step reasoning for complex problems. Enables agents to break tasks into explicit thought chains, revise reasoning, and build toward solutions methodically. | reasoning · thinking | MCP | none | verified |
| **Qdrant MCP**: Official Qdrant MCP server for storing, retrieving, and searching vector embeddings in the Qdrant vector database. | qdrant · vector-search | MCP | api-key | verified |
| **Mem0 MCP**: Persistent memory layer for AI agents. Store, search, and retrieve user preferences, conversation history, and learned facts across sessions. Official Mem0 MCP server. | memory · persistence | MCP | api-key | verified |
| **LangSmith MCP**: LangSmith LLM observability via MCP. Trace agent runs, inspect prompts and outputs, evaluate quality, and debug complex chain failures. | langsmith · observability | MCP | api-key | verified |
| **Weights & Biases MCP**: Weights & Biases ML experiment tracking via MCP. Log runs, compare metrics, manage model artifacts, and query training history from agents. | wandb · mlops | MCP | api-key | verified |
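The memory and vector-search tools above (Qdrant, Mem0) all build on the same primitive: embed text, store the vectors, rank by similarity at query time. A toy illustration of that primitive, not any server's actual API; the embeddings here are hand-picked three-dimensional stand-ins for model-generated vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "memory": fact -> embedding (real systems use an embedding model)
memory = {
    "user prefers Python":     [0.9, 0.1, 0.0],
    "user dislikes verbosity": [0.1, 0.9, 0.0],
    "project uses PostgreSQL": [0.0, 0.2, 0.9],
}

def recall(query_vec, store, k=2):
    """Return the k stored facts most similar to the query vector."""
    ranked = sorted(store, key=lambda fact: cosine(query_vec, store[fact]), reverse=True)
    return ranked[:k]

print(recall([0.8, 0.2, 0.1], memory, k=1))  # → ['user prefers Python']
```

A production vector database adds indexing (HNSW and similar), filtering, and persistence on top of this ranking step, but the retrieval contract an agent sees over MCP is the same: vectors in, nearest facts out.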
Other AI Servers
| Name | Protocols | Auth | Status |
|---|---|---|---|
| **Langfuse MCP**: Official MCP server for Langfuse LLM observability. Access and manage prompts, traces, and datasets through the Model Context Protocol. | MCP | api-key | |
| **Opik MCP**: MCP server for Comet Opik, providing unified access to LLM prompts, projects, traces, and evaluation metrics from your IDE. | MCP | api-key | |
| **Arize Phoenix MCP**: MCP server for Arize Phoenix AI observability. Explore projects, traces, spans, prompts, datasets, and experiments via the Model Context Protocol. | MCP | api-key | |
| **Confident AI MCP**: Official MCP server for Confident AI and DeepEval. Run LLM evaluations, manage prompt templates, pull datasets, and trigger cloud evals from your editor. | MCP | api-key | |
| **Braintrust MCP**: MCP server for Braintrust AI evaluation and observability. Access experiments, datasets, scoring functions, and production logs for LLM quality management. | MCP | api-key | |
| **MCP LLM Eval**: Local MCP server that packages LLM evaluation gates as reusable CI/CD primitives. Run datasets against models, score with LLM-as-judge, enforce quality thresholds. | MCP | api-key | |
| **MCP Bench**: Benchmarking framework by Accenture for evaluating LLM tool-use via MCP. End-to-end pipeline assessing how effectively models discover, select, and use tools. | MCP | none | |
| **Promptfoo MCP**: MCP server exposing Promptfoo eval and red-team testing tools to AI agents. Run prompt evaluations, security tests, and quality checks from your IDE. | MCP | none | |