multimodal-mcp-client
A multimodal MCP client for voice-powered agentic workflows.
This project is a voice-controlled AI interface that uses Google Gemini and the Model Context Protocol (MCP) to let users work with AI through natural speech and multimodal input. The client supports two kinds of servers: custom MCP servers, which the user must configure, and Systemprompt MCP servers, which can be installed with a Systemprompt API key. Users can also create their own configurations to fit the client to specific workflows.
Install
Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "multimodal-mcp-client": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-multimodal-mcp-client"
      ]
    }
  }
}
Config File Location
Mac: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/claude/claude_desktop_config.json
Some servers require additional setup - check the GitHub README for specific instructions.
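Editing the config by hand can overwrite other servers already listed under "mcpServers". The following is a minimal Python sketch, not part of the project, that picks the config location listed above for the current platform and merges the entry into any existing file. The function names (config_path, add_server, install) are hypothetical, and the Windows fallback to AppData/Roaming when APPDATA is unset is an assumption.

```python
import json
import os
import sys
from pathlib import Path

# Entry copied from the install snippet above.
SERVER_NAME = "multimodal-mcp-client"
SERVER_ENTRY = {
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-multimodal-mcp-client"],
}


def config_path(platform=None, env=None, home=None):
    """Return the claude_desktop_config.json location listed above for a platform."""
    platform = platform or sys.platform
    env = os.environ if env is None else env
    home = Path(home) if home is not None else Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / "Claude" / "claude_desktop_config.json"
    if platform.startswith("win"):
        # Assumption: fall back to the usual Roaming folder if APPDATA is unset.
        appdata = env.get("APPDATA") or str(home / "AppData" / "Roaming")
        return Path(appdata) / "Claude" / "claude_desktop_config.json"
    return home / ".config" / "claude" / "claude_desktop_config.json"


def add_server(config, name=SERVER_NAME, entry=SERVER_ENTRY):
    """Return a copy of config with the server entry added, keeping other keys."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def install(path):
    """Merge the server entry into the config file at path, creating it if needed."""
    path = Path(path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    existing = json.loads(text) if text.strip() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_server(existing), indent=2) + "\n", encoding="utf-8")
    return path
```

Calling install(config_path()) would update the file for the current platform. Back up the existing config first, and restart Claude Desktop afterwards so it rereads the file.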
Permissions