Ollama
Ollama is a popular runtime for running LLMs locally. One command — ollama run llama3 — downloads and starts a model, no GPU drivers or Python environments required. With tool calling support in the API, Ollama models can now invoke MCP tools, bridging the gap between local inference and external data access.
Local Models + Remote MCP Tools
Ollama models run entirely on your machine — Llama 3, Mistral, Gemma, CodeLlama, and dozens more. The inference is local, fast, and private. But local models cannot access live data: they don't know your server status, your latest sales numbers, or your team's calendar.
MCP solves this cleanly. The model runs locally, but when it needs external data, it calls an MCP tool over the network. Your conversation stays private while the model gains real-time capabilities.
Key features:
- One-command setup — ollama pull and ollama run, nothing else
- Model library — Llama, Mistral, Gemma, Phi, CodeLlama, and more
- GPU acceleration — automatic CUDA, Metal, and ROCm detection
- OpenAI-compatible API — works with any client expecting the OpenAI format
- Tool calling — function/tool calling support for MCP integration
- Modelfile — customize model parameters, system prompts, and context length
- Lightweight — minimal resource footprint, no Python dependency
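The Modelfile mentioned above is a plain-text recipe for customizing a model. A minimal sketch (the base model, parameter values, and system prompt are illustrative):

```
FROM llama3
PARAMETER temperature 0.2
PARAMETER num_ctx 8192
SYSTEM "You are a concise assistant with access to external tools."
```

Build and run it with ollama create my-assistant -f Modelfile followed by ollama run my-assistant (my-assistant is an arbitrary name).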
Using MCP with Ollama
1. Create a Token
In Vinkius Cloud, go to your server → Connection Tokens → Create. Copy the URL.
2. Connect via Compatible Client
Most Ollama frontends (Open WebUI, Jan, Msty) support MCP natively. Configure the Vinkius URL in your preferred client.
Or connect directly via the Ollama API with tool calling:
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [{"role": "user", "content": "check server status"}],
  "tools": [...]
}'

3. Local + Remote
Your model runs locally. MCP tool calls go to Vinkius Cloud. Responses merge seamlessly.
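The tools array in the curl example above takes OpenAI-style function schemas. A minimal sketch in Python of a complete request body; the server_status tool name and its parameters are hypothetical stand-ins for whatever tools your Vinkius server actually exposes:

```python
import json

# Hypothetical MCP-backed tool, described in the OpenAI-style function
# schema that Ollama's /api/chat endpoint accepts in its "tools" array.
server_status_tool = {
    "type": "function",
    "function": {
        "name": "server_status",  # hypothetical tool name
        "description": "Fetch live status for a named server via MCP.",
        "parameters": {
            "type": "object",
            "properties": {
                "host": {"type": "string", "description": "Server hostname"},
            },
            "required": ["host"],
        },
    },
}

# Full request body for POST http://localhost:11434/api/chat
payload = {
    "model": "llama3",
    "messages": [{"role": "user", "content": "check server status"}],
    "tools": [server_status_tool],
}

print(json.dumps(payload, indent=2))
```

When the model decides to use the tool, the response's message carries a tool_calls field instead of (or alongside) plain text.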
FAQ
Does Ollama support MCP natively? Ollama supports tool calling in its API. MCP integration works through Ollama-compatible clients (Open WebUI, Jan, etc.) or by implementing a thin MCP bridge.
Which models support tool calling? Llama 3, Mistral, and most instruction-tuned models support tool calling. Check Ollama's model pages for specific support.
Does MCP affect local privacy? Only MCP tool call parameters leave your machine. Conversation context and model inference stay local.
Is Ollama free? Fully open-source. Run any model locally at no cost.
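The "thin MCP bridge" mentioned in the FAQ can be sketched in a few lines: read tool_calls out of an Ollama chat response, dispatch each call to the matching tool, and append the results as tool-role messages for the follow-up request. The handler registry and the check_status function below are hypothetical placeholders for real MCP client calls:

```python
# Sketch of a thin bridge between Ollama tool calls and MCP tools.
# In Ollama's /api/chat responses, each tool call carries a function
# name and an arguments object.

def bridge_tool_calls(tool_calls, handlers):
    """Turn Ollama tool_calls into tool-role messages for the next request."""
    messages = []
    for call in tool_calls:
        fn = call["function"]
        # In a real bridge this would invoke a remote MCP tool.
        result = handlers[fn["name"]](**fn["arguments"])
        messages.append({"role": "tool", "content": str(result)})
    return messages

# Hypothetical local handler standing in for a remote MCP tool call
def check_status(host):
    return {"host": host, "status": "ok"}

tool_calls = [{"function": {"name": "check_status",
                            "arguments": {"host": "web-1"}}}]
replies = bridge_tool_calls(tool_calls, {"check_status": check_status})
```

The returned messages are appended to the conversation and sent back to /api/chat so the model can compose its final answer from the tool output.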