Product Roadmap
What we're building
A living record of what's shipped, what's in progress, and what's coming. Have thoughts? Every item has a linked Feature Request or GitHub Discussion — we'd love to hear from you.
Windows & Linux
Polyphon is currently macOS-only. Windows and Linux support — x86_64 and arm64 — is planned, bringing the full desktop experience to all major platforms.
Sync
End-to-end encrypted sync of your sessions, compositions, and settings across all your devices. Your data stays private — we can't read it.
Plugins
Extend Polyphon with community and first-party plugins. Add new voice providers, custom session behaviors, output formatters, and more — without touching core.
Extensions
Build and install extensions that hook into the Polyphon UI — custom panels, session widgets, and integrations with external tools and workflows.
Everything that's landed in a release, newest first.
JavaScript / TypeScript SDK
A first-class SDK (@polyphon-ai/js) for connecting to Polyphon from Node.js or any bundler-based project. Manage compositions and sessions, broadcast messages to the full ensemble, and stream voice responses — all over the local TCP API. Published to npm and versioned in sync with the app.
Obsidian Plugin
Bring multi-voice AI conversations into your Obsidian vault. The plugin connects to a running Polyphon instance over the local TCP API, lets you pick a composition, manage sessions, and chat with your ensemble without leaving your notes. Supports @mention routing and automatic session context from your open file.
TCP API Server
A built-in TCP API server that exposes Polyphon's core functionality over a local network interface. Enables programmatic control of sessions, compositions, and voices from external tools, scripts, and CI pipelines.
poly CLI
A command-line tool (@polyphon-ai/poly) for controlling Polyphon from the terminal. List compositions, start sessions, broadcast prompts with live streaming output, and export transcripts — all without touching the UI.
MCP Server
Expose Polyphon as an MCP tool server. Any MCP-compatible agent — Claude Code, Cursor, Codex CLI, GitHub Copilot, Windsurf, Gemini CLI — can list compositions, create sessions, broadcast to the full ensemble, and retrieve transcripts via standard MCP tool calls. Runs over stdio with a Settings toggle to auto-start on launch.
API Voice Filesystem Access & Sandboxing
Give API voices (Claude, GPT, Gemini) the ability to read and write files, list directories, and run shell commands through a set of host-brokered tools. The model decides what to do; Polyphon executes it in the session working directory. The same sequential execution model already used for CLI voices keeps multiple voices from stepping on each other.
Session Export
Export full conversation transcripts in multiple formats — Markdown, JSON, or plain text.
Full Text Search
Search across all your sessions and messages instantly. A global search view finds matches across every session; per-session Cmd+F search highlights matches inline. Powered by FTS5 for fast, accurate results.
Markdown Rendering
Voice responses rendered as formatted Markdown — headings, code blocks, lists, bold, italics, and inline code displayed as intended rather than as raw syntax.
Working Directory
Set a starting directory when you create a session. CLI voices spawn with that directory as their working directory, and all voices receive it as context in the ensemble system prompt — so API voices know what project they're working in too.
Custom Color Picker
Choose any accent color for each voice in your session. Color-coded message bubbles, voice panels, and indicators make it easy to track who said what at a glance.
Auto Update
Polyphon checks for new releases in the background and notifies you when one is available. Install with one click — no manual download required.
YOLO Mode for CLI Voices
Per-voice checkboxes to enable non-interactive / auto-approve mode for CLI providers like Claude Code and Codex. Let them run without confirmation prompts when you trust the session.
Encryption at Rest
Your sessions, compositions, and settings are encrypted on disk. Your data is protected even if your device is compromised.
Tones & System Prompt Templates
Custom tone presets and reusable system prompt templates for fine-grained control over each voice's personality and behavior.
Conductor Profile
Set your name, pronouns, and context once. Every voice in every session knows who they're talking to, injected automatically into the ensemble system prompt.
Custom OpenAI-Compatible Providers
Add your own endpoints — Ollama, LM Studio, vLLM, or any OpenAI-compatible API — and run voices entirely on your machine.
CLI Voices
Use Claude Code, Codex, and GitHub Copilot as voices in your sessions. Spawned as local subprocesses — no extra credentials beyond what you already have.
API Voices
Connect Anthropic Claude, OpenAI GPT, and Google Gemini via API key with full model selection per voice and real-time streaming responses.
Compositions
Save and reuse multi-voice configurations. Build your preferred ensemble once and launch it instantly in any new session.
Multi-Voice Sessions
Orchestrate multiple AI voices simultaneously in a single conversation. Voices hear each other and build on each other's responses in real time.