
Ollama

Ollama is the de facto standard runtime for running LLMs locally. One command — ollama run llama3 — downloads and starts a model, with no Python environments or manual driver setup required. With tool-calling support in the API, Ollama models can now invoke MCP tools, bridging the gap between local inference and external data access.


Ollama
Open-Source · ollama.com
TRANSPORT
Streamable HTTP ✓
PLATFORM
Windows · macOS · Linux
MCP VIA
API Tool Calling

Local Models + Remote MCP Tools

Ollama models run entirely on your machine — Llama 3, Mistral, Gemma, CodeLlama, and dozens more. The inference is local, fast, and private. But local models cannot access live data: they don't know your server status, your latest sales numbers, or your team's calendar.

MCP solves this cleanly. The model runs locally, but when it needs external data, it calls an MCP tool over the network. Your conversation stays private while the model gains real-time capabilities.

Key features:

  • One-command setup — ollama pull and ollama run, nothing else
  • Model library — Llama, Mistral, Gemma, Phi, CodeLlama, and more
  • GPU acceleration — automatic CUDA, Metal, and ROCm detection
  • OpenAI-compatible API — works with any client expecting the OpenAI format
  • Tool calling — function/tool calling support for MCP integration
  • Modelfile — customize model parameters, system prompts, and context length
  • Lightweight — minimal resource footprint, no Python dependency
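The Modelfile feature listed above can be sketched in a few lines; the model name, parameter values, and system prompt here are illustrative, not recommendations:

```
# Modelfile — build with: ollama create my-llama -f Modelfile
FROM llama3
PARAMETER num_ctx 8192       # larger context window for tool-heavy chats
PARAMETER temperature 0.2    # more deterministic output
SYSTEM """You are a concise assistant with access to MCP tools."""
```

Running ollama create with this file produces a local model variant that carries these settings on every run.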

Using MCP with Ollama

1. Create a Token

In Vinkius Cloud, go to your server → Connection Tokens → Create. Copy the URL.

2. Connect via Compatible Client

Most Ollama frontends (Open WebUI, Jan, Msty) support MCP natively. Configure the Vinkius URL in your preferred client.

Directly via the Ollama API with tool calling:

```bash
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [{"role": "user", "content": "check server status"}],
  "tools": [...]
}'
```
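The tools field above is elided. A minimal sketch of a complete request body follows, assuming a hypothetical check_server_status tool — the tool name and its parameters are illustrative, not a real Vinkius schema:

```python
import json

# Hypothetical tool definition in Ollama's OpenAI-style tool format.
# "check_server_status" and its "hostname" parameter are examples only.
tools = [
    {
        "type": "function",
        "function": {
            "name": "check_server_status",
            "description": "Return the current status of a monitored server.",
            "parameters": {
                "type": "object",
                "properties": {
                    "hostname": {
                        "type": "string",
                        "description": "Server to check, e.g. web-01",
                    }
                },
                "required": ["hostname"],
            },
        },
    }
]

payload = {
    "model": "llama3",
    "messages": [{"role": "user", "content": "check server status"}],
    "tools": tools,
}

# With a running Ollama, POST this to http://localhost:11434/api/chat,
# e.g. requests.post("http://localhost:11434/api/chat", json=payload)
print(json.dumps(payload, indent=2))
```

The same payload works against any OpenAI-compatible endpoint, which is why most Ollama frontends can forward it unchanged.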

3. Local + Remote

Your model runs locally. When it emits a tool call, the request goes to Vinkius Cloud; the tool's result is fed back into the model's context before it produces its final answer.
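That round trip can be sketched as a small loop: send the chat request, and if the reply contains tool_calls, execute each one and re-ask the model with the results appended. The execute_mcp_tool function below is a stub standing in for a real forwarder to the Vinkius MCP endpoint:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # default Ollama endpoint

def execute_mcp_tool(name, arguments):
    """Stub: a real bridge would forward the call to the Vinkius Cloud
    MCP server over Streamable HTTP and return the tool's result."""
    return {"tool": name, "arguments": arguments, "status": "ok (stubbed)"}

def chat(messages, tools):
    """One non-streaming /api/chat request; returns the assistant message."""
    body = json.dumps({"model": "llama3", "messages": messages,
                       "tools": tools, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]

def run_turn(messages, tools, send=None):
    """Send once; if the model requested tools, run them and ask again."""
    send = send or chat
    msg = send(messages, tools)
    messages.append(msg)
    for call in msg.get("tool_calls", []):
        fn = call["function"]
        result = execute_mcp_tool(fn["name"], fn["arguments"])
        # Feed each tool result back as a role="tool" message.
        messages.append({"role": "tool", "content": json.dumps(result)})
    if msg.get("tool_calls"):
        messages.append(send(messages, tools))
    return messages
```

The send parameter exists so the loop can be exercised without a live server; in normal use you would call run_turn(messages, tools) against a running Ollama.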


FAQ

Does Ollama support MCP natively? Ollama supports tool calling in its API. MCP integration works through Ollama-compatible clients (Open WebUI, Jan, etc.) or by implementing a thin MCP bridge.

Which models support tool calling? Llama 3, Mistral, and most instruction-tuned models support tool calling. Check Ollama's model pages for specific support.

Does MCP affect local privacy? Only MCP tool call parameters leave your machine. Conversation context and model inference stay local.

Is Ollama free? Yes. Ollama is fully open-source, and you can run any model locally at no cost.