
Groq Desktop

Connect your Vinkius Cloud MCP server to Groq Desktop, the AI desktop app powered by Groq's LPU (Language Processing Unit), purpose-built hardware that delivers up to 625 tokens per second. Groq Desktop includes a built-in MCP server, and GroqCloud has offered remote MCP integration in beta since September 2025.


Groq Desktop (by Groq · groq.com)
Transport: Streamable HTTP ✓
Platform: Windows · macOS · Linux
MCP via: Built-in MCP Server
Speed: Up to 625 tok/s

Why Groq for MCP

Groq's LPU chip keeps the model-side latency of MCP tool interactions low. When your AI reasons about which tools to call and how to use the results, it generates that reasoning at up to 625 tokens per second, which makes complex multi-tool workflows markedly faster than on GPU-based infrastructure.
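To put the throughput figure in concrete terms, here is a minimal sketch of the arithmetic. It assumes the 625 tokens per second peak quoted above and a hypothetical workflow of five tool-calling steps with about 500 generated tokens each; both the step count and the token counts are illustrative, and network and tool execution time are not included.

```python
# Peak throughput quoted on this page; real throughput depends on the model and load.
PEAK_TOKENS_PER_SECOND = 625


def generation_seconds(tokens: int, tokens_per_second: float = PEAK_TOKENS_PER_SECOND) -> float:
    """Return the time needed to generate `tokens` at a given throughput."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


# Hypothetical workflow: five reasoning steps of roughly 500 generated tokens each.
steps = [500] * 5
total = sum(generation_seconds(t) for t in steps)
print(f"{total:.1f} s of generation time across {len(steps)} steps")
```

At the quoted peak, each 500-token step costs 0.8 s of generation, so the five steps together spend about 4 s generating text.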

What makes Groq unique for MCP:

  • LPU chip: Purpose-built Language Processing Unit with 750 TOPS (INT8), 188 TFLOPS (FP16), 230 MB of SRAM per chip, and 80 TB/s of on-die memory bandwidth
  • Up to 625 tokens/sec: Deterministic, low-latency execution through software-controlled hardware with no external memory bottlenecks
  • Built-in MCP server: Groq Desktop includes a native MCP server for function calling with multimodal input (image support)
  • GroqCloud MCP (beta): Remote MCP server integration on GroqCloud since September 2025, compatible with OpenAI's remote MCP specification
  • 10x energy efficiency: LPUs are up to 10x more energy-efficient than GPUs at the architectural level and use an air-cooled design
  • Free API access: Free API keys with no waitlist and billions of free tokens daily for developers
  • Multi-modal: Text, audio, and vision model support through GroqCloud
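The GroqCloud beta is described above as compatible with OpenAI's remote MCP specification. Under that assumption, a request would declare the MCP server as a tool entry of type `mcp`. The sketch below only builds such a payload and does not send it. The helper name `build_remote_mcp_request`, the `vinkius` label, and the model ID are placeholders, and this page does not confirm which fields GroqCloud accepts.

```python
import json


def build_remote_mcp_request(prompt: str, server_url: str, model: str) -> dict:
    """Build a request body that declares one remote MCP server as a tool.

    The tool entry follows the shape used by OpenAI's remote MCP tool
    (type / server_label / server_url); treating GroqCloud as accepting
    the same shape is an assumption based on the compatibility claim.
    """
    return {
        "model": model,
        "input": prompt,
        "tools": [
            {
                "type": "mcp",
                "server_label": "vinkius",  # hypothetical label
                "server_url": server_url,
            }
        ],
    }


payload = build_remote_mcp_request(
    prompt="Which tools are available?",
    server_url="https://edge.vinkius.com/YOUR_TOKEN/mcp",  # replace YOUR_TOKEN
    model="your-model-id",  # placeholder, not a real model name
)
print(json.dumps(payload, indent=2))
```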

Quick Connect

1. Create a Connection Token

In Vinkius Cloud, open your server's Dashboard → Connection Tokens and create a new token. Copy the generated URL:

https://edge.vinkius.com/{TOKEN}/mcp
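Before pasting the URL anywhere, it can help to check that it matches the pattern above and that the `{TOKEN}` placeholder was actually replaced. The following is a rough sketch; the function name is hypothetical, and it assumes a token is a single path segment, which this page does not state explicitly.

```python
from urllib.parse import urlparse


def looks_like_connection_url(url: str) -> bool:
    """Check a URL against the https://edge.vinkius.com/{TOKEN}/mcp pattern."""
    parsed = urlparse(url.strip())
    # Split the path into non-empty segments: expected [token, "mcp"].
    parts = [p for p in parsed.path.split("/") if p]
    return (
        parsed.scheme == "https"
        and parsed.netloc == "edge.vinkius.com"
        and len(parts) == 2
        and parts[1] == "mcp"
        # Reject the unfilled template placeholder.
        and "{" not in parts[0]
        and "}" not in parts[0]
    )


print(looks_like_connection_url("https://edge.vinkius.com/YOUR_TOKEN/mcp"))
print(looks_like_connection_url("https://edge.vinkius.com/{TOKEN}/mcp"))
```

The first call passes the shape check; the second fails because the placeholder is still in the path.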

2. Connect MCP in Groq Desktop

Open Groq Desktop → MCP Server Settings and add the Vinkius Cloud URL you copied in step 1. The built-in MCP server handles tool discovery and invocation at LPU speed.
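Groq Desktop performs tool discovery for you, so none of this is required. For readers curious about what travels over the Streamable HTTP transport, here is a minimal sketch of the two JSON-RPC messages an MCP client typically starts with, `initialize` and `tools/list`, as defined by the MCP specification. The helper name, the client name, and the protocol version string are assumptions for illustration; the requests are built but not sent.

```python
import json
import urllib.request


def build_jsonrpc_post(url, method, params=None, request_id=1):
    """Build (but do not send) an HTTP POST carrying one JSON-RPC 2.0 request."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return urllib.request.Request(
        url,
        data=json.dumps(message).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients accept a JSON reply or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


MCP_URL = "https://edge.vinkius.com/YOUR_TOKEN/mcp"  # replace YOUR_TOKEN

initialize = build_jsonrpc_post(
    MCP_URL,
    "initialize",
    {
        "protocolVersion": "2025-03-26",  # assumed spec revision
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},  # hypothetical
    },
    request_id=1,
)
list_tools = build_jsonrpc_post(MCP_URL, "tools/list", request_id=2)

# A full client would also send a notifications/initialized message between
# these two calls and echo back any session header the server returns.
print(initialize.data.decode("utf-8"))
print(list_tools.data.decode("utf-8"))
```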


FAQ

Does Groq support MCP? Yes. Groq Desktop includes a built-in MCP server, and GroqCloud has offered remote MCP in beta since September 2025, compatible with OpenAI's remote MCP spec.

How fast is Groq? Up to 625 tokens/sec throughput. Each LPU chip provides 750 TOPS (INT8), 230 MB of SRAM, and 80 TB/s of on-die memory bandwidth.

Is Groq free? GroqCloud offers free API keys with no waitlist and billions of free daily tokens for developers.

What is GroqCloud MCP? A remote MCP server integration in beta since September 2025 that lets AI agents use Groq's LPU-powered inference with tool capabilities.