AI / ML, TypeScript

multimodal-mcp-client

A multimodal MCP client for voice-powered agentic workflows.

Free
N/A rating (0 reviews), 0 installs, 210 GitHub stars
voice control, AI workflows, multimodal

This project is a voice-controlled AI interface built on Google Gemini and the Model Context Protocol (MCP). It lets users drive AI workflows through natural speech and multimodal input. The client supports two kinds of MCP servers: custom servers, which the user must configure, and Systemprompt MCP servers, which can be installed with a Systemprompt API key. Users can also create their own configurations to tailor the client to specific use cases.

Compatible with

Claude Desktop, Cursor

Install

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "multimodal-mcp-client": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-multimodal-mcp-client"
      ]
    }
  }
}
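If claude_desktop_config.json already lists other servers, the entry above has to be merged into the existing "mcpServers" object rather than pasted over it. The following is a minimal sketch of that merge, not part of the project itself: the helper name addServer, the types, and the sample "theme" and "other" entries are all hypothetical and only illustrate the shape of the file.

```typescript
// Shape of one entry under "mcpServers", matching the snippet above.
interface ServerEntry {
  command: string;
  args: string[];
}

// Only "mcpServers" is modelled; any other keys are passed through untouched.
interface ClaudeConfig {
  mcpServers?: Record<string, ServerEntry>;
  [key: string]: unknown;
}

// Hypothetical helper: returns a copy of `config` with `entry` registered
// under `name`, leaving existing servers and unrelated keys in place.
function addServer(config: ClaudeConfig, name: string, entry: ServerEntry): ClaudeConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [name]: entry },
  };
}

// Hypothetical existing config with one unrelated setting and one other server.
const existing: ClaudeConfig = {
  theme: "dark",
  mcpServers: { other: { command: "node", args: ["other.js"] } },
};

const merged = addServer(existing, "multimodal-mcp-client", {
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-multimodal-mcp-client"],
});

console.log(JSON.stringify(merged, null, 2));
```

The helper returns a new object instead of mutating its input, so the original config can still be written back unchanged if anything goes wrong.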

Config File Location

Mac: ~/Library/Application Support/Claude/claude_desktop_config.json

Windows: %APPDATA%\Claude\claude_desktop_config.json

Linux: ~/.config/claude/claude_desktop_config.json
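The three locations above can be expressed as a small lookup, which may be handy in a setup script. This is a sketch under stated assumptions: configPath is a hypothetical helper, the platform strings follow the values Node.js reports in process.platform, and the caller is assumed to supply the home directory and the value of %APPDATA%.

```typescript
// Hypothetical helper: maps a platform name to the config path listed above.
// `home` is the user's home directory; `appData` is the value of %APPDATA%
// and is only used on Windows.
function configPath(platform: string, home: string, appData: string): string {
  switch (platform) {
    case "darwin":
      return `${home}/Library/Application Support/Claude/claude_desktop_config.json`;
    case "win32":
      return `${appData}\\Claude\\claude_desktop_config.json`;
    default:
      // Anything else is treated as Linux, using the path given in the listing.
      return `${home}/.config/claude/claude_desktop_config.json`;
  }
}

console.log(configPath("darwin", "/Users/me", ""));
console.log(configPath("win32", "", "C:\\Users\\me\\AppData\\Roaming"));
console.log(configPath("linux", "/home/me", ""));
```

In a Node.js script the arguments would typically come from process.platform, os.homedir(), and process.env.APPDATA.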

Some servers require additional setup; check the GitHub README for specific instructions.
