ClawMux

Multi-agent voice interface. Talk to multiple Claude Code agents from a browser, each with its own voice and personality. All speech processing runs locally on your GPU.

Architecture

  • Whisper STT — speech-to-text runs locally on your GPU via whisper.cpp
  • Kokoro TTS — text-to-speech with multiple voice personas
  • Browser UI — connect from any device on your Tailscale network
  • Multi-agent — multiple Claude Code sessions, each with a unique voice
  • Self-hosted — everything runs on your own hardware, nothing leaves your network

How It Works

ClawMux sits between your browser and Claude Code. When you speak, Whisper transcribes your speech locally. The text is sent to your Claude Code agent, which processes the request and returns a response. Kokoro then synthesizes the response into speech and plays it back in your browser.
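
The round trip described above can be sketched as three stages chained together. This is a minimal illustration, not ClawMux's actual code: `run_turn`, `VoiceTurn`, and the three `fake_*` stand-ins are hypothetical names, and the stand-ins only echo bytes and text so the example runs without Whisper, Kokoro, or Claude Code installed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """One round trip: what was heard, what the agent said, and the audio reply."""
    transcript: str
    reply_text: str
    reply_audio: bytes


def run_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    ask_agent: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> VoiceTurn:
    """Speech in -> text -> agent reply -> speech out."""
    transcript = transcribe(audio_in)     # STT stage (Whisper's role in ClawMux)
    reply_text = ask_agent(transcript)    # agent stage (Claude Code's role)
    reply_audio = synthesize(reply_text)  # TTS stage (Kokoro's role)
    return VoiceTurn(transcript, reply_text, reply_audio)


# Hypothetical stand-in stages: they treat the "audio" as UTF-8 text
# so the sketch is runnable without a GPU or any model.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_agent(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


turn = run_turn(b"list the open files", fake_transcribe, fake_agent, fake_synthesize)
print(turn.reply_text)
```

Passing the stages in as plain callables keeps the flow independent of any particular model binding, so a real transcriber or synthesizer could presumably be swapped in without touching `run_turn`.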

Each agent can have its own voice, personality, and workspace. You can switch between agents or have multiple running simultaneously.
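
One way to picture per-agent settings and switching is a small registry keyed by agent name. The sketch below is an assumption about how such state could be modeled, not a description of ClawMux internals: `AgentProfile`, `AgentRegistry`, the agent names, the voice identifiers, and the workspace paths are all placeholders.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Optional


@dataclass(frozen=True)
class AgentProfile:
    """Settings for one agent (hypothetical fields)."""
    name: str
    voice: str          # placeholder TTS voice identifier
    personality: str    # short description used to shape the agent's replies
    workspace: Path     # directory the agent's session works in


class AgentRegistry:
    """Holds several agents at once and tracks which one is being addressed."""

    def __init__(self) -> None:
        self._agents: Dict[str, AgentProfile] = {}
        self._active: Optional[str] = None

    def add(self, profile: AgentProfile) -> None:
        if profile.name in self._agents:
            raise ValueError(f"agent {profile.name!r} already registered")
        self._agents[profile.name] = profile
        if self._active is None:
            # The first registered agent becomes the active one.
            self._active = profile.name

    def switch_to(self, name: str) -> AgentProfile:
        if name not in self._agents:
            raise KeyError(name)
        self._active = name
        return self._agents[name]

    @property
    def active(self) -> Optional[AgentProfile]:
        return self._agents[self._active] if self._active else None

    def names(self) -> list:
        return sorted(self._agents)


registry = AgentRegistry()
registry.add(AgentProfile("reviewer", "voice-a", "terse and critical", Path("work/review")))
registry.add(AgentProfile("builder", "voice-b", "upbeat and practical", Path("work/build")))

registry.switch_to("builder")
print(registry.active.voice)
```

Switching only changes which agent receives the next utterance; the other profiles stay registered, which matches the idea of several agents running side by side.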

Stack

Component   Technology
STT         whisper.cpp (local GPU)
TTS         Kokoro (local GPU)
Frontend    Browser-based, WebSocket
Backend     Python MCP server
Agent       Claude Code CLI
Network     Tailscale (secure, zero-config)