MCP Defaults
Platform-wide defaults for how tools are exposed to AI clients. These settings apply to every new server — individual servers can override them.
Access via Settings → MCP Settings in Vinkius Cloud.
Tool Grouping
When you deploy an API with 30, 50, or 100+ endpoints, each endpoint becomes an MCP tool. This creates a problem: AI models have limited context windows, and listing all tools at once consumes a significant portion of that budget — sometimes leaving too little room for the actual conversation.
Tool Grouping solves this by collapsing related tools into navigable groups. Instead of seeing 50 individual tools, the AI model sees 5 groups and drills into the one it needs.
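As an illustration of the idea (not the platform's actual implementation), grouping amounts to collapsing tools by a shared tag, typically derived from the API's OpenAPI spec. The `tag` field and helper below are hypothetical:

```python
from collections import defaultdict

def group_tools(tools):
    """Collapse a flat tool list into tag-keyed groups.

    `tools` is a list of dicts with hypothetical "name" and "tag"
    keys; the real platform derives grouping from the API spec.
    """
    groups = defaultdict(list)
    for tool in tools:
        groups[tool["tag"]].append(tool["name"])
    return dict(groups)

# 50 endpoints spread across 5 tags collapse to 5 top-level groups
tools = [{"name": f"op_{i}", "tag": f"tag_{i % 5}"} for i in range(50)]
grouped = group_tools(tools)
print(len(grouped))  # 5
```

The AI client first lists the 5 groups, then expands only the one relevant to the current task, so the other 45 tool descriptions never enter the context window.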
Modes
| Mode | Behavior | When to use |
|---|---|---|
| Flat | Every endpoint is its own MCP tool | APIs with fewer than 20 endpoints |
| Grouped | Tools are always collapsed by tag | APIs where you want to control discoverability |
| Auto (default) | Flat below the threshold, grouped above it | Most APIs — the platform adapts automatically |
The grouping threshold (default: 20) controls when Auto mode switches from flat to grouped. Lower it if your AI model has a small context window; raise it to keep tools individually listed for larger APIs.
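The mode selection above can be sketched as a small decision function. Names are illustrative, not the platform's API; the boundary behavior at exactly the threshold is an assumption:

```python
def effective_layout(mode: str, tool_count: int, threshold: int = 20) -> str:
    """Resolve the tool layout an AI client sees, per the table above."""
    if mode == "flat":
        return "flat"          # every endpoint is its own tool
    if mode == "grouped":
        return "grouped"       # always collapsed by tag
    # Auto: flat below the threshold, grouped at or above it (assumed boundary)
    return "flat" if tool_count < threshold else "grouped"

print(effective_layout("auto", 12))   # flat
print(effective_layout("auto", 50))   # grouped
```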
TOON Token Compression
TOON (Tool Object Optimized Notation) is a compression protocol that rewrites tool descriptions and response payloads into a compact tabular format. In benchmarks, this reduces token consumption by 30-50% per request without losing information.
Instead of verbose JSON schema definitions, TOON encodes tools as structured tables that modern language models parse natively. The AI model receives the same semantic information in fewer tokens.
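The general technique can be illustrated with a toy encoder. This is not the TOON format itself, just the underlying idea: in a JSON array of uniform objects, every key is repeated per object, so replacing them with a single header row removes most of the redundancy:

```python
import json

def tabular_encode(rows):
    """Encode a list of uniform dicts as one header line plus data rows.

    Toy illustration of tabular compression; the real TOON wire
    format is not documented here.
    """
    keys = list(rows[0])
    lines = [",".join(keys)]  # keys appear once, not once per object
    for row in rows:
        lines.append(",".join(str(row[k]) for k in keys))
    return "\n".join(lines)

rows = [{"id": i, "name": f"item{i}", "qty": i * 2} for i in range(20)]
compact = tabular_encode(rows)
verbose = json.dumps(rows)
print(len(compact), "vs", len(verbose))
```

The savings grow with the number of objects, which is why the benefit is largest on list-heavy API responses and long tool catalogs.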
Compatibility
TOON works with Claude 3.5+, GPT-4, and Gemini 2.0+. Older models or clients that expect raw JSON schemas may behave unpredictably — test before enabling in production.
Response Presenters
Response Presenters transform raw API responses into structured, AI-optimized formats. When an API returns deeply nested JSON with metadata, pagination cursors, and wrapper objects, Presenters extract the useful data and flatten it into a format that models reason about more effectively.
The result: higher-quality AI responses because the model receives cleaner input. Disable only if your AI client expects unprocessed responses.
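A minimal sketch of the idea, assuming a typical wrapped list response; the envelope field names (`data`, `items`, `meta`) are hypothetical, not a documented response shape:

```python
def present(response: dict) -> list:
    """Unwrap a {"data": {"items": [...]}, "meta": {...}} envelope.

    Drops the pagination cursor and metadata wrapper so the model
    sees only the records themselves.
    """
    return response.get("data", {}).get("items", [])

raw = {
    "data": {"items": [{"id": 1, "name": "Ada"}]},
    "meta": {"cursor": "abc123", "request_id": "r-42"},
}
print(present(raw))  # [{'id': 1, 'name': 'Ada'}]
```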
Frequently Asked Questions
What does Tool Grouping do?
Tool Grouping controls how your MCP tools are presented to AI clients. For large APIs (30+ endpoints), listing every tool individually consumes a significant portion of the AI model's context window. Grouping collapses related tools into navigable categories, reducing context overhead by an order of magnitude.
Which Tool Grouping mode should I use?
Use Auto (the default) for most APIs. It keeps tools flat when you have fewer than the threshold (default: 20) and automatically groups them above that number. Use Flat for small APIs under 20 tools, and Grouped when you want to manually control tool discoverability.
How much does TOON compression save?
In benchmarks, TOON reduces token consumption by 30-50% per request. It rewrites verbose JSON schema definitions into compact tabular format that modern language models parse natively — the AI model receives the same semantic information in fewer tokens.
Is TOON compression compatible with all AI models?
TOON works reliably with Claude 3.5+, GPT-4, and Gemini 2.0+. Older models or clients that expect raw JSON schemas may behave unpredictably. Always test in your environment before enabling in production.
What are Response Presenters?
Response Presenters transform raw API responses into structured, AI-optimized formats. They strip pagination cursors, metadata wrappers, and deeply nested objects — delivering cleaner input that improves the quality of AI model reasoning. Disable only if your client expects unprocessed responses.
Do MCP Defaults apply to existing servers?
No. These are creation-time defaults. Existing servers keep their current settings. You can update individual servers from their detail page at any time.