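Cartesia MCP tool for ADK
Supported in ADK: Python, TypeScript
The Cartesia MCP server connects your ADK agent to the Cartesia AI audio platform. With this integration, your agent can generate speech, localize voices into other languages, and create audio content from natural-language requests.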
Use cases
- Text-to-Speech Generation: Convert text into natural-sounding speech using Cartesia's voice library, with control over voice selection and output format.
- Voice Localization: Transform existing voices into different languages while preserving the original speaker's characteristics, which is useful for multilingual content creation.
- Audio Infill: Fill gaps between audio segments to create smooth transitions, useful for podcast editing or audiobook production.
- Voice Transformation: Convert audio clips to sound like different voices from Cartesia's library.
Prerequisites
- Sign up for a Cartesia account
- Generate an API key from the Cartesia playground
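Once you have a key, one option is to read it from an environment variable rather than hard-coding it in the agent file. The sketch below is only an illustration of that idea: `load_cartesia_api_key` is a hypothetical helper, not part of ADK or the Cartesia MCP server, and it assumes you export the key under the same `CARTESIA_API_KEY` name the server expects.

```python
import os


def load_cartesia_api_key(environ=None):
    """Return the Cartesia API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; not part of ADK or Cartesia.
    """
    # Allow a custom mapping to be passed in so the function is easy to test.
    env = os.environ if environ is None else environ
    key = env.get("CARTESIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CARTESIA_API_KEY is not set; generate a key in the Cartesia playground."
        )
    return key
```

The returned value can then stand in for the `"YOUR_CARTESIA_API_KEY"` placeholder when you define the agent.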
Use with agent
Python:

    from google.adk.agents import Agent
    from google.adk.tools.mcp_tool import McpToolset
    from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
    from mcp import StdioServerParameters

    # Replace with your own key from the Cartesia playground.
    CARTESIA_API_KEY = "YOUR_CARTESIA_API_KEY"

    root_agent = Agent(
        model="gemini-2.5-pro",
        name="cartesia_agent",
        instruction="Help users generate speech and work with audio content",
        tools=[
            McpToolset(
                connection_params=StdioConnectionParams(
                    # Launch the Cartesia MCP server as a local subprocess via uvx.
                    server_params=StdioServerParameters(
                        command="uvx",
                        args=["cartesia-mcp"],
                        env={
                            "CARTESIA_API_KEY": CARTESIA_API_KEY,
                            # "OUTPUT_DIRECTORY": "/path/to/output",  # Optional
                        },
                    ),
                    timeout=30,
                ),
            )
        ],
    )
TypeScript:

    import { LlmAgent, MCPToolset } from "@google/adk";

    const CARTESIA_API_KEY = "YOUR_CARTESIA_API_KEY";

    const rootAgent = new LlmAgent({
      model: "gemini-2.5-pro",
      name: "cartesia_agent",
      instruction: "Help users generate speech and work with audio content",
      tools: [
        new MCPToolset({
          type: "StdioConnectionParams",
          serverParams: {
            command: "uvx",
            args: ["cartesia-mcp"],
            env: {
              CARTESIA_API_KEY: CARTESIA_API_KEY,
              // OUTPUT_DIRECTORY: "/path/to/output", // Optional
            },
          },
        }),
      ],
    });

    export { rootAgent };
Available tools
| Tool | Description |
|---|---|
| `text_to_speech` | Convert text to audio using a specified voice |
| `list_voices` | List all available Cartesia voices |
| `get_voice` | Get details about a specific voice |
| `clone_voice` | Clone a voice from audio samples |
| `update_voice` | Update an existing voice |
| `delete_voice` | Delete a voice from your library |
| `localize_voice` | Transform a voice into a different language |
| `voice_change` | Convert an audio file to use a different voice |
| `infill` | Fill gaps between audio segments |
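If an agent should only have access to some of these tools, one approach is to build an allowlist of tool names and pass it to the toolset. The sketch below assumes the `tool_filter` argument of ADK's Python `McpToolset`, which takes a list of tool names; check it against the ADK version you have installed. `tools_except` is a hypothetical helper, and blocking `update_voice` and `delete_voice` is just an illustrative choice.

```python
# Tool names as listed in the table above.
CARTESIA_TOOLS = [
    "text_to_speech",
    "list_voices",
    "get_voice",
    "clone_voice",
    "update_voice",
    "delete_voice",
    "localize_voice",
    "voice_change",
    "infill",
]


def tools_except(blocked):
    """Return the Cartesia tool names with `blocked` removed, preserving order."""
    unknown = set(blocked) - set(CARTESIA_TOOLS)
    if unknown:
        # Catch typos early instead of silently filtering nothing.
        raise ValueError(f"Unknown tool names: {sorted(unknown)}")
    return [name for name in CARTESIA_TOOLS if name not in blocked]


# Illustrative choice: keep the agent from changing or deleting saved voices.
tool_filter = tools_except({"update_voice", "delete_voice"})
# Assumed usage: McpToolset(connection_params=..., tool_filter=tool_filter)
```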
Configuration
The Cartesia MCP server can be configured using environment variables:
| Variable | Description | Required |
|---|---|---|
| `CARTESIA_API_KEY` | Your Cartesia API key | Yes |
| `OUTPUT_DIRECTORY` | Directory to store generated audio files | No |
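As a minimal sketch of how these two variables might be assembled before being handed to the server process, the hypothetical helper below (`build_cartesia_env` is not part of ADK or Cartesia) always sets the required key and adds `OUTPUT_DIRECTORY` only when a directory is given.

```python
import os


def build_cartesia_env(api_key, output_directory=None):
    """Build the env mapping for the Cartesia MCP server process.

    Hypothetical helper for illustration; not part of ADK or Cartesia.
    """
    if not api_key:
        raise ValueError("CARTESIA_API_KEY is required")
    env = {"CARTESIA_API_KEY": api_key}
    if output_directory is not None:
        # os.fspath accepts both str and pathlib.Path values.
        env["OUTPUT_DIRECTORY"] = os.fspath(output_directory)
    return env
```

The resulting dictionary has the same shape as the `env` argument used in the agent definition, so it could be passed there in place of the inline literal.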