Mingly — Multi-LLM Desktop App
Mingly brings Claude, GPT, and locally running Ollama models into a native macOS desktop app. A conversation can switch between cloud and local models without losing context: the history is model-agnostic, and sensitive content can stay on-device. Mingly has been in production since early 2026, is written in TypeScript on Electron, and stores its data in local SQLite.
Architecture highlight — multi-model routing without reload
Multi-model routing sounds simple but is fragile from a UX perspective: most multi-model tools reset the context when switching from cloud to local. We abstracted Mingly's conversation history so that each turn can be serialized independently of any model API. The result: switching mid-conversation from Claude Sonnet to a local Llama-3 works without a reload, because token limits, provider-specific tool calls, and streaming semantics are encapsulated in a model-agnostic layer. Persistence runs on SQLite in WAL mode, local models run via Ollama MCP, and state is managed with Zustand.
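The idea of a provider-independent turn can be sketched roughly as follows. This is a minimal illustration and not Mingly's actual schema: the `Turn` shape, the `ProviderPayload` type, and all function names are hypothetical. It assumes only that some provider APIs take the system prompt as a separate field while others accept it inline as a message, and it leaves out tool calls, token limits, and streaming.

```typescript
// Hypothetical neutral turn shape; the real schema is not described in the text.
type Role = "system" | "user" | "assistant";

interface Turn {
  role: Role;
  text: string;
  createdAt: string; // ISO 8601 timestamp
}

interface ChatMessage {
  role: Role;
  content: string;
}

interface ProviderPayload {
  system?: string; // only set for providers that take the system prompt separately
  messages: ChatMessage[];
}

// Adapter for providers that expect the system prompt outside the message list.
function toSeparateSystemPayload(history: Turn[]): ProviderPayload {
  const system = history
    .filter((t) => t.role === "system")
    .map((t) => t.text)
    .join("\n");
  const messages = history
    .filter((t) => t.role !== "system")
    .map((t) => ({ role: t.role, content: t.text }));
  return system ? { system, messages } : { messages };
}

// Adapter for providers that accept system messages inline.
function toInlineSystemPayload(history: Turn[]): ProviderPayload {
  return { messages: history.map((t) => ({ role: t.role, content: t.text })) };
}

// The stored form is plain JSON with no provider-specific fields,
// so the same history can be replayed to any adapter.
function serializeHistory(history: Turn[]): string {
  return JSON.stringify(history);
}

function deserializeHistory(raw: string): Turn[] {
  return JSON.parse(raw) as Turn[];
}
```

Because the stored history carries no provider-specific fields, switching models in this sketch only means picking a different adapter for the next request; the saved turns themselves do not change.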
Privacy mode — when local, when cloud
Not every task needs GPT-5. Market-data research, brainstorming, and longer text drafts benefit from cloud models. Sensitive content such as contract drafts, HR cases, and internal strategy belongs on local models. Mingly offers an explicit privacy mode that enforces hard routing rules: when it is active, no content reaches the cloud, not even accidentally. This split answers the most common question in our AI-tool workshops: 'How do I use AI without violating compliance?'
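A hard routing rule of this kind could look roughly like the guard below. It is a sketch under stated assumptions, not Mingly's implementation: `ModelTarget`, `resolveTarget`, and the optional local fallback are hypothetical names, and the text does not say whether privacy mode reroutes to a local model or simply blocks the request. The property the sketch aims to show is that, with privacy mode on, the function can only return a local target or throw.

```typescript
type Locality = "local" | "cloud";

// Hypothetical description of a routable model.
interface ModelTarget {
  id: string;
  locality: Locality;
}

// Decide where a request may go. With privacy mode active, the only
// possible outcomes are a local target or a thrown error.
function resolveTarget(
  requested: ModelTarget,
  privacyMode: boolean,
  localFallback?: ModelTarget,
): ModelTarget {
  if (!privacyMode || requested.locality === "local") {
    return requested;
  }
  // Assumption: rerouting to a local model is acceptable when one is configured.
  if (localFallback !== undefined && localFallback.locality === "local") {
    return localFallback;
  }
  // Fail closed: never fall through to the cloud target.
  throw new Error(
    `Privacy mode is active: refusing to route to cloud model "${requested.id}"`,
  );
}
```

Placing the check in a single function that every request passes through, and failing closed when no local model is available, is one way to make "not even accidentally" hold as a property of the code rather than of user discipline.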
Practice proof for our advisory
From running Mingly we know not only which model is good for what, but also where multi-model strategies break in practice. This experience flows directly into our workshops on AI tools in daily work and into our advisory on multi-model strategies for SMEs. Anyone who wants to know how to combine cloud and local models without failing a privacy impact assessment under the Swiss nDSG gets a running product from us as a reference, not a slide.