
Ollama

Ollama is the de facto standard runtime for running LLMs locally. One command — ollama run llama3 — downloads and starts a model, with no Python environments or manual driver setup required. With tool-calling support in the API, Ollama models can now invoke MCP tools, bridging the gap between local inference and external data access.


Ollama
Open-Source · ollama.com
TRANSPORT
Streamable HTTP ✓
PLATFORM
Windows · macOS · Linux
MCP VIA
API Tool Calling

Local Models + Remote MCP Tools

Ollama models run entirely on your machine — Llama 3, Mistral, Gemma, CodeLlama, and dozens more. The inference is local, fast, and private. But local models cannot access live data: they don't know your server status, your latest sales numbers, or your team's calendar.

MCP solves this cleanly. The model runs locally, but when it needs external data, it calls an MCP tool over the network. Your conversation stays private while the model gains real-time capabilities.

Key features:

  • One-command setup — ollama pull and ollama run, nothing else
  • Model library — Llama, Mistral, Gemma, Phi, CodeLlama, and more
  • GPU acceleration — automatic CUDA, Metal, and ROCm detection
  • OpenAI-compatible API — works with any client expecting the OpenAI format
  • Tool calling — function/tool calling support for MCP integration
  • Modelfile — customize model parameters, system prompts, and context length
  • Lightweight — minimal resource footprint, no Python dependency
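The Modelfile feature listed above can be sketched in a few lines; the model name, parameter values, and system prompt here are illustrative, not recommendations:

```
# Modelfile — build with: ollama create my-llama -f Modelfile
FROM llama3
PARAMETER num_ctx 8192       # larger context window for tool-heavy chats
PARAMETER temperature 0.2    # more deterministic output
SYSTEM """You are a concise assistant with access to MCP tools."""
```

Running ollama create with this file produces a local model variant that carries these settings on every run.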

Using MCP with Ollama

1. Create a Token

In Vinkius Cloud, go to your server → Connection Tokens → Create. Copy the URL.

2. Connect via Compatible Client

Most Ollama frontends (Open WebUI, Jan, Msty) support MCP natively. Configure the Vinkius URL in your preferred client.

Directly via the Ollama API with tool calling:

```bash
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [{"role": "user", "content": "check server status"}],
  "tools": [...]
}'
```
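The tools field above is elided. A minimal sketch of a complete request body follows, assuming a hypothetical check_server_status tool — the tool name and its parameters are illustrative, not a real Vinkius schema:

```python
import json

# Hypothetical tool definition in Ollama's OpenAI-style tool format.
# "check_server_status" and its "hostname" parameter are examples only.
tools = [
    {
        "type": "function",
        "function": {
            "name": "check_server_status",
            "description": "Return the current status of a monitored server.",
            "parameters": {
                "type": "object",
                "properties": {
                    "hostname": {
                        "type": "string",
                        "description": "Server to check, e.g. web-01",
                    }
                },
                "required": ["hostname"],
            },
        },
    }
]

payload = {
    "model": "llama3",
    "messages": [{"role": "user", "content": "check server status"}],
    "tools": tools,
}

# With a running Ollama, POST this to http://localhost:11434/api/chat,
# e.g. requests.post("http://localhost:11434/api/chat", json=payload)
print(json.dumps(payload, indent=2))
```

The same payload works against any OpenAI-compatible endpoint, which is why most Ollama frontends can forward it unchanged.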

3. Local + Remote

Your model runs locally. When it emits a tool call, the request goes to Vinkius Cloud; the tool's result is fed back into the model's context before it produces its final answer.
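That round trip can be sketched as a small loop: send the chat request, and if the reply contains tool_calls, execute each one and re-ask the model with the results appended. The execute_mcp_tool function below is a stub standing in for a real forwarder to the Vinkius MCP endpoint:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # default Ollama endpoint

def execute_mcp_tool(name, arguments):
    """Stub: a real bridge would forward the call to the Vinkius Cloud
    MCP server over Streamable HTTP and return the tool's result."""
    return {"tool": name, "arguments": arguments, "status": "ok (stubbed)"}

def chat(messages, tools):
    """One non-streaming /api/chat request; returns the assistant message."""
    body = json.dumps({"model": "llama3", "messages": messages,
                       "tools": tools, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]

def run_turn(messages, tools, send=None):
    """Send once; if the model requested tools, run them and ask again."""
    send = send or chat
    msg = send(messages, tools)
    messages.append(msg)
    for call in msg.get("tool_calls", []):
        fn = call["function"]
        result = execute_mcp_tool(fn["name"], fn["arguments"])
        # Feed each tool result back as a role="tool" message.
        messages.append({"role": "tool", "content": json.dumps(result)})
    if msg.get("tool_calls"):
        messages.append(send(messages, tools))
    return messages
```

The send parameter exists so the loop can be exercised without a live server; in normal use you would call run_turn(messages, tools) against a running Ollama.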


FAQ

Does Ollama support MCP natively? Ollama supports tool calling in its API. MCP integration works through Ollama-compatible clients (Open WebUI, Jan, etc.) or by implementing a thin MCP bridge.

Which models support tool calling? Llama 3, Mistral, and most instruction-tuned models support tool calling. Check Ollama's model pages for specific support.

Does MCP affect local privacy? Only MCP tool call parameters leave your machine. Conversation context and model inference stay local.

Is Ollama free? Yes. Ollama is fully open-source, and you can run any model locally at no cost.