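Multi-Agent Systems in ADK
As agentic applications grow in complexity, structuring them as a single, monolithic agent can become challenging to develop, maintain, and reason about. The Agent Development Kit (ADK) supports building sophisticated applications by composing multiple, distinct BaseAgent instances into a Multi-Agent System (MAS).
In ADK, a multi-agent system is an application where different agents, often forming a hierarchy, collaborate or coordinate to achieve a larger goal. Structuring your application this way offers significant advantages, including enhanced modularity, specialization, reusability, maintainability, and the ability to define structured control flows using dedicated workflow agents.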
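You can compose various types of agents derived from BaseAgent to build these systems:
- LLM Agents: Agents powered by large language models. (See LLM Agents)
- Workflow Agents: Specialized agents designed to manage the execution flow of their sub-agents (SequentialAgent, ParallelAgent, LoopAgent). (See Workflow Agents)
- Custom Agents: Your own agents that inherit from BaseAgent and carry specialized, non-LLM logic. (See Custom Agents)
The following sections detail the core ADK primitives, such as the agent hierarchy, workflow agents, and interaction mechanisms, that let you construct and manage these multi-agent systems effectively.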
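1. ADK Primitives for Agent Composition
ADK provides core building blocks, called primitives, that let you structure and manage interactions within your multi-agent system.
Note
The specific parameter or method names for these primitives may vary slightly by SDK language (for example, sub_agents in Python and subAgents in Java). Refer to the language-specific API documentation for details.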
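1.1. Agent Hierarchy (Parent Agent, Sub-Agents)
The foundation of a multi-agent system's structure is the parent-child relationship defined in BaseAgent.
- Establishing the hierarchy: You create a tree structure by passing a list of agent instances to the sub_agents argument when initializing a parent agent. ADK automatically sets the parent_agent attribute on each child during initialization.
- Single-parent rule: An agent instance can be added as a sub-agent only once. Attempting to assign a second parent raises a ValueError.
- Importance: The hierarchy defines the scope for workflow agents and influences the potential targets of LLM-driven delegation. You can navigate the hierarchy with agent.parent_agent, or find descendants with agent.find_agent(name).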
# Conceptual example: defining a hierarchy
from google.adk.agents import LlmAgent, BaseAgent
# Define the individual agents
greeter = LlmAgent(name="Greeter", model="gemini-2.0-flash")
task_doer = BaseAgent(name="TaskExecutor")  # Custom non-LLM agent
# Create the parent agent and assign children via sub_agents
coordinator = LlmAgent(
    name="Coordinator",
    model="gemini-2.0-flash",
    description="I coordinate greetings and tasks.",
    sub_agents=[  # Assign sub_agents here
        greeter,
        task_doer,
    ],
)
# The framework sets these automatically:
# assert greeter.parent_agent == coordinator
# assert task_doer.parent_agent == coordinator
// Conceptual example: defining a hierarchy
import com.google.adk.agents.SequentialAgent;
import com.google.adk.agents.LlmAgent;
// Define the individual agents
LlmAgent greeter = LlmAgent.builder().name("Greeter").model("gemini-2.0-flash").build();
SequentialAgent taskDoer = SequentialAgent.builder().name("TaskExecutor").subAgents(...).build(); // Sequential agent
// Create the parent agent and assign its sub-agents
LlmAgent coordinator = LlmAgent.builder()
    .name("Coordinator")
    .model("gemini-2.0-flash")
    .description("I coordinate greetings and tasks")
    .subAgents(greeter, taskDoer) // Assign sub-agents here
    .build();
// The framework sets these automatically:
// assert greeter.parentAgent().equals(coordinator);
// assert taskDoer.parentAgent().equals(coordinator);
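The single-parent rule and the tree lookup described in this section can be illustrated without ADK installed. The sketch below is a toy model, not ADK's implementation: ToyAgent and its fields are hypothetical names chosen to mirror sub_agents, parent_agent, and find_agent, and it assumes a depth-first search that also matches the agent itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class ToyAgent:
    """Illustrative stand-in for an agent node (hypothetical, not BaseAgent)."""

    name: str
    sub_agents: List["ToyAgent"] = field(default_factory=list)
    parent_agent: Optional["ToyAgent"] = field(default=None, init=False, repr=False)

    def __post_init__(self) -> None:
        # Validate first so a failure leaves no child half-attached.
        for child in self.sub_agents:
            if child.parent_agent is not None:
                raise ValueError(
                    f"{child.name!r} already has parent {child.parent_agent.name!r}"
                )
        for child in self.sub_agents:
            child.parent_agent = self

    def find_agent(self, name: str) -> Optional["ToyAgent"]:
        """Depth-first search over this agent and its descendants."""
        if self.name == name:
            return self
        for child in self.sub_agents:
            found = child.find_agent(name)
            if found is not None:
                return found
        return None


greeter = ToyAgent("Greeter")
task_doer = ToyAgent("TaskExecutor")
coordinator = ToyAgent("Coordinator", sub_agents=[greeter, task_doer])

# Reusing an instance under a second parent is rejected.
try:
    ToyAgent("OtherParent", sub_agents=[greeter])
    second_parent_rejected = False
except ValueError:
    second_parent_rejected = True
```

In this model, reusing the same instance in two places requires creating a second instance, which is the practical consequence of the single-parent rule.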
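1.2. Workflow Agents as Orchestrators
ADK includes specialized agents derived from BaseAgent that do not perform tasks themselves but orchestrate the execution flow of their sub_agents.
- SequentialAgent: Executes its sub_agents one after another, in the order they are listed.
  - Context: Passes the same InvocationContext along sequentially, so agents can easily hand results to one another through shared state.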
# Conceptual example: sequential pipeline
from google.adk.agents import SequentialAgent, LlmAgent
step1 = LlmAgent(name="Step1_Fetch", output_key="data")  # Saves its output to state['data']
step2 = LlmAgent(name="Step2_Process", instruction="Process the data from state key 'data'.")
pipeline = SequentialAgent(name="MyPipeline", sub_agents=[step1, step2])
# When pipeline runs, Step2 can access the state['data'] set by Step1.
// Conceptual example: sequential pipeline
import com.google.adk.agents.SequentialAgent;
import com.google.adk.agents.LlmAgent;
LlmAgent step1 = LlmAgent.builder().name("Step1_Fetch").outputKey("data").build(); // Saves its output to state.get("data")
LlmAgent step2 = LlmAgent.builder().name("Step2_Process").instruction("Process the data from state key 'data'.").build();
SequentialAgent pipeline = SequentialAgent.builder().name("MyPipeline").subAgents(step1, step2).build();
// When pipeline runs, Step2 can access the state.get("data") set by Step1.
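- ParallelAgent: Executes its sub_agents in parallel. Events from the sub-agents may be interleaved.
  - Context: Modifies InvocationContext.branch for each child agent (for example, ParentBranch.ChildName), providing a distinct contextual path, which can help isolate history in some memory implementations.
  - State: Despite the different branches, all parallel children access the same shared session.state, so they can read the initial state and write results (use distinct keys to avoid race conditions).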
# Conceptual example: parallel execution
from google.adk.agents import ParallelAgent, LlmAgent
fetch_weather = LlmAgent(name="WeatherFetcher", output_key="weather")
fetch_news = LlmAgent(name="NewsFetcher", output_key="news")
gatherer = ParallelAgent(name="InfoGatherer", sub_agents=[fetch_weather, fetch_news])
# When gatherer runs, WeatherFetcher and NewsFetcher run concurrently.
# A subsequent agent can read state['weather'] and state['news'].
// Conceptual example: parallel execution
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.ParallelAgent;
LlmAgent fetchWeather = LlmAgent.builder()
    .name("WeatherFetcher")
    .outputKey("weather")
    .build();
LlmAgent fetchNews = LlmAgent.builder()
    .name("NewsFetcher")
    .outputKey("news") // Saves its output under state key "news"
    .build();
ParallelAgent gatherer = ParallelAgent.builder()
    .name("InfoGatherer")
    .subAgents(fetchWeather, fetchNews)
    .build();
// When gatherer runs, WeatherFetcher and NewsFetcher run concurrently.
// A subsequent agent can read state.get("weather") and state.get("news").
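- LoopAgent: Executes its sub_agents sequentially in a loop.
  - Termination: The loop stops when the optional max_iterations is reached, or when any sub-agent returns an Event with escalate=True (in its Event Actions).
  - Context and state: Passes the same InvocationContext on each iteration, so state changes (for example, counters or flags) persist across iterations.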
# Conceptual example: loop with a condition
from google.adk.agents import LoopAgent, LlmAgent, BaseAgent
from google.adk.events import Event, EventActions
from google.adk.agents.invocation_context import InvocationContext
from typing import AsyncGenerator
class CheckCondition(BaseAgent):  # Custom agent that checks state
    async def _run_async_impl(self, ctx: InvocationContext) -> AsyncGenerator[Event, None]:
        status = ctx.session.state.get("status", "pending")
        is_done = (status == "completed")
        yield Event(author=self.name, actions=EventActions(escalate=is_done))  # Escalate if done
process_step = LlmAgent(name="ProcessingStep")  # Agent that might update state['status']
poller = LoopAgent(
    name="StatusPoller",
    max_iterations=10,
    sub_agents=[process_step, CheckCondition(name="Checker")],
)
# When poller runs, it executes process_step and then Checker repeatedly
# until Checker escalates (state['status'] == 'completed') or 10 iterations pass.
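// Conceptual example: loop with a condition
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.InvocationContext;
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.LoopAgent;
import com.google.adk.events.Event;
import com.google.adk.events.EventActions;
import io.reactivex.rxjava3.core.Flowable;
import java.util.List;
// Custom agent that checks state and may escalate
public static class CheckConditionAgent extends BaseAgent {
    public CheckConditionAgent(String name, String description) {
        super(name, description, List.of(), null, null);
    }
    @Override
    protected Flowable<Event> runAsyncImpl(InvocationContext ctx) {
        String status = (String) ctx.session().state().getOrDefault("status", "pending");
        boolean isDone = "completed".equalsIgnoreCase(status);
        // Emit an event that escalates (exits the loop) when the condition is met.
        // If not done, the escalate flag is false or absent and the loop continues.
        Event checkEvent = Event.builder()
            .author(name())
            .id(Event.generateEventId()) // Important: give the event a unique ID
            .actions(EventActions.builder().escalate(isDone).build()) // Escalate if done
            .build();
        return Flowable.just(checkEvent);
    }
}
// Agent that might update the "status" key in state
LlmAgent processingStepAgent = LlmAgent.builder().name("ProcessingStep").build();
// Instance of the custom condition-checking agent
CheckConditionAgent conditionCheckerAgent = new CheckConditionAgent(
    "ConditionChecker",
    "Checks whether status is 'completed'."
);
LoopAgent poller = LoopAgent.builder().name("StatusPoller").maxIterations(10).subAgents(processingStepAgent, conditionCheckerAgent).build();
// When poller runs, it executes processingStepAgent and then conditionCheckerAgent repeatedly
// until conditionCheckerAgent escalates (state.get("status") is "completed") or 10 iterations pass.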
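The two LoopAgent stop conditions (an escalating event, or the iteration cap) can be traced with a small framework-free model. This is a hedged sketch only: ToyEvent, run_loop, and the step functions are hypothetical names, and the real LoopAgent works on Event objects and an InvocationContext rather than plain dictionaries.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, object]


@dataclass
class ToyEvent:
    """Hypothetical event carrying only the field this sketch needs."""

    author: str
    escalate: bool = False


Step = Callable[[State], ToyEvent]


def run_loop(steps: List[Step], state: State, max_iterations: Optional[int] = None) -> int:
    """Run steps in order, repeatedly; return the number of iterations started."""
    iteration = 0
    while max_iterations is None or iteration < max_iterations:
        iteration += 1
        for step in steps:
            event = step(state)  # every step sees the same shared state
            if event.escalate:
                return iteration  # an escalating event ends the loop at once
    return iteration


def processing_step(state: State) -> ToyEvent:
    # Pretend the work finishes on the third attempt.
    state["attempts"] = int(state.get("attempts", 0)) + 1
    if state["attempts"] >= 3:
        state["status"] = "completed"
    return ToyEvent(author="ProcessingStep")


def never_finishes(state: State) -> ToyEvent:
    state["attempts"] = int(state.get("attempts", 0)) + 1
    return ToyEvent(author="NeverFinishes")


def checker(state: State) -> ToyEvent:
    done = state.get("status", "pending") == "completed"
    return ToyEvent(author="Checker", escalate=done)


escalated_state: State = {}
iterations_until_escalation = run_loop([processing_step, checker], escalated_state, max_iterations=10)

capped_state: State = {}
iterations_when_capped = run_loop([never_finishes, checker], capped_state, max_iterations=4)
```

Because the state dictionary is shared across iterations, the attempts counter written in one pass is visible in the next, which is what lets the checker eventually escalate.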
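1.3. Interaction & Communication Mechanisms
Agents within a system often need to exchange data or trigger actions in one another. ADK facilitates this through the following mechanisms:
a) Shared Session State (session.state)
The most fundamental way for agents operating within the same invocation (and therefore sharing the same Session object via the InvocationContext) to communicate passively.
- Mechanism: One agent (or its tool/callback) writes a value (context.state['data_key'] = processed_data), and a subsequent agent reads it (data = context.state.get('data_key')). State changes are tracked via CallbackContext.
- Convenience: The output_key property on LlmAgent automatically saves the agent's final response text (or structured output) to the specified state key.
- Nature: Asynchronous, passive communication. Ideal for pipelines orchestrated by SequentialAgent or for passing data across LoopAgent iterations.
- See also: State Management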
# Conceptual Example: Using output_key and reading state
from google.adk.agents import LlmAgent, SequentialAgent
agent_A = LlmAgent(name="AgentA", instruction="Find the capital of France.", output_key="capital_city")
agent_B = LlmAgent(name="AgentB", instruction="Tell me about the city stored in state key 'capital_city'.")
pipeline = SequentialAgent(name="CityInfo", sub_agents=[agent_A, agent_B])
# AgentA runs, saves "Paris" to state['capital_city'].
# AgentB runs, its instruction processor reads state['capital_city'] to get "Paris".
// Conceptual Example: Using outputKey and reading state
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.SequentialAgent;
LlmAgent agentA = LlmAgent.builder()
.name("AgentA")
.instruction("Find the capital of France.")
.outputKey("capital_city")
.build();
LlmAgent agentB = LlmAgent.builder()
    .name("AgentB")
    .instruction("Tell me about the city stored in state key 'capital_city'.")
    .build(); // No outputKey here: reusing "capital_city" would overwrite AgentA's result
SequentialAgent pipeline = SequentialAgent.builder().name("CityInfo").subAgents(agentA, agentB).build();
// AgentA runs, saves "Paris" to state key "capital_city".
// AgentB runs, its instruction processor reads state.get("capital_city") to get "Paris".
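To make the output_key convenience concrete, here is a minimal sketch of the idea in plain Python. It is an illustration under stated assumptions, not ADK code: agents are modeled as ordinary functions, and run_pipeline, find_capital, and describe_city are hypothetical names. The only behavior it models is "save the step's final response under a key so a later step can read it".

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, str]
ToyAgent = Callable[[State], str]  # reads shared state, returns its final response


def find_capital(state: State) -> str:
    return "Paris"


def describe_city(state: State) -> str:
    # A later step reads what an earlier step stored.
    return f"Facts about {state['capital_city']}"


def run_pipeline(steps: List[Tuple[ToyAgent, Optional[str]]], state: State) -> str:
    """Run steps in order; store each response under its output key, if any."""
    response = ""
    for agent, output_key in steps:
        response = agent(state)
        if output_key is not None:
            state[output_key] = response
    return response


state: State = {}
final_response = run_pipeline(
    [(find_capital, "capital_city"), (describe_city, None)],
    state,
)
```

The second step has no output key, so it leaves the stored capital untouched; giving two steps the same key would make the later one overwrite the earlier result.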
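b) LLM-Driven Delegation (Agent Transfer)
Leverages an LlmAgent's understanding to dynamically route tasks to other suitable agents within the hierarchy.
- Mechanism: The agent's LLM generates a specific function call: transfer_to_agent(agent_name='target_agent_name').
- Handling: The AutoFlow, used by default when sub-agents are present or transfer is not disallowed, intercepts this call. It identifies the target agent using root_agent.find_agent() and updates the InvocationContext to switch execution focus.
- Requirements: The calling LlmAgent needs a clear instruction on when to transfer, and potential target agents need distinct descriptions so the LLM can make informed decisions. The transfer scope (parent, sub-agent, siblings) can be configured on the LlmAgent.
- Nature: Dynamic, flexible routing based on LLM interpretation.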
# Conceptual Setup: LLM Transfer
from google.adk.agents import LlmAgent
booking_agent = LlmAgent(name="Booker", description="Handles flight and hotel bookings.")
info_agent = LlmAgent(name="Info", description="Provides general information and answers questions.")
coordinator = LlmAgent(
name="Coordinator",
model="gemini-2.0-flash",
instruction="You are an assistant. Delegate booking tasks to Booker and info requests to Info.",
description="Main coordinator.",
# AutoFlow is typically used implicitly here
sub_agents=[booking_agent, info_agent]
)
# If coordinator receives "Book a flight", its LLM should generate:
# FunctionCall(name='transfer_to_agent', args={'agent_name': 'Booker'})
# ADK framework then routes execution to booking_agent.
// Conceptual Setup: LLM Transfer
import com.google.adk.agents.LlmAgent;
LlmAgent bookingAgent = LlmAgent.builder()
.name("Booker")
.description("Handles flight and hotel bookings.")
.build();
LlmAgent infoAgent = LlmAgent.builder()
.name("Info")
.description("Provides general information and answers questions.")
.build();
// Define the coordinator agent
LlmAgent coordinator = LlmAgent.builder()
.name("Coordinator")
.model("gemini-2.0-flash") // Or your desired model
.instruction("You are an assistant. Delegate booking tasks to Booker and info requests to Info.")
.description("Main coordinator.")
// AutoFlow will be used by default (implicitly) because subAgents are present
// and transfer is not disallowed.
.subAgents(bookingAgent, infoAgent)
.build();
// If coordinator receives "Book a flight", its LLM should generate:
// FunctionCall.builder().name("transferToAgent").args(ImmutableMap.of("agent_name", "Booker")).build()
// ADK framework then routes execution to bookingAgent.
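The handling step (intercept the transfer call, resolve the target by name, switch focus) can be sketched as a small routing function. This is a simplified illustration, not the AutoFlow implementation: the function call is modeled as a plain dict, AGENTS is a hypothetical flat registry standing in for the agent tree, and raising ValueError for an unknown target is an assumption of this sketch.

```python
from typing import Dict

# Hypothetical registry: agent name -> description. ADK resolves names by
# walking the agent tree; a flat dict keeps this sketch short.
AGENTS: Dict[str, str] = {
    "Coordinator": "Main coordinator.",
    "Booker": "Handles flight and hotel bookings.",
    "Info": "Provides general information and answers questions.",
}


def route(function_call: dict, current_agent: str, agents: Dict[str, str] = AGENTS) -> str:
    """Return the name of the agent that should handle the next step."""
    if function_call.get("name") != "transfer_to_agent":
        return current_agent  # ordinary tool call: focus stays where it is
    target = function_call.get("args", {}).get("agent_name")
    if target not in agents:
        raise ValueError(f"unknown transfer target: {target!r}")
    return target


next_agent = route(
    {"name": "transfer_to_agent", "args": {"agent_name": "Booker"}},
    current_agent="Coordinator",
)
unchanged_agent = route(
    {"name": "get_weather", "args": {"city": "Paris"}},
    current_agent="Coordinator",
)
```

Since the target is chosen by name, distinct agent names and descriptions matter: they are the only signal the model has when deciding where to send a request.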
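c) Explicit Invocation (AgentTool)
Allows an LlmAgent to treat another BaseAgent instance as a callable function or tool.
- Mechanism: Wrap the target agent instance in AgentTool and include it in the parent LlmAgent's tools list. AgentTool generates a corresponding function declaration for the LLM.
- Handling: When the parent LLM generates a function call targeting the AgentTool, the framework executes AgentTool.run_async. This method runs the target agent, captures its final response, forwards any state or artifact changes back to the parent's context, and returns the response as the tool's result.
- Nature: Synchronous (within the parent's flow), explicit, controlled invocation, like any other tool.
- (Note: AgentTool needs to be imported and used explicitly.)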
# Conceptual Setup: Agent as a Tool
from google.adk.agents import LlmAgent, BaseAgent
from google.adk.events import Event
from google.adk.tools import agent_tool
from google.genai import types
# Define a target agent (could be LlmAgent or custom BaseAgent)
class ImageGeneratorAgent(BaseAgent):  # Example custom agent
    name: str = "ImageGen"
    description: str = "Generates an image based on a prompt."
    # ... internal logic ...
    async def _run_async_impl(self, ctx):  # Simplified run logic
        prompt = ctx.session.state.get("image_prompt", "default prompt")
        # ... generate image bytes ...
        image_bytes = b"..."
        # Part.from_bytes takes keyword arguments (data, mime_type)
        yield Event(author=self.name, content=types.Content(parts=[types.Part.from_bytes(data=image_bytes, mime_type="image/png")]))
image_agent = ImageGeneratorAgent()
image_tool = agent_tool.AgentTool(agent=image_agent) # Wrap the agent
# Parent agent uses the AgentTool
artist_agent = LlmAgent(
name="Artist",
model="gemini-2.0-flash",
instruction="Create a prompt and use the ImageGen tool to generate the image.",
tools=[image_tool] # Include the AgentTool
)
# Artist LLM generates a prompt, then calls:
# FunctionCall(name='ImageGen', args={'image_prompt': 'a cat wearing a hat'})
# Framework calls image_tool.run_async(...), which runs ImageGeneratorAgent.
# The resulting image Part is returned to the Artist agent as the tool result.
// Conceptual Setup: Agent as a Tool
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.LlmAgent;
import com.google.adk.tools.AgentTool;
// Example custom agent (could be LlmAgent or custom BaseAgent)
public class ImageGeneratorAgent extends BaseAgent {
public ImageGeneratorAgent(String name, String description) {
super(name, description, List.of(), null, null);
}
// ... internal logic ...
@Override
protected Flowable<Event> runAsyncImpl(InvocationContext invocationContext) { // Simplified run logic
invocationContext.session().state().get("image_prompt");
// Generate image bytes
// ...
Event responseEvent = Event.builder()
.author(this.name())
.content(Content.fromParts(Part.fromText("\b...")))
.build();
return Flowable.just(responseEvent);
}
@Override
protected Flowable<Event> runLiveImpl(InvocationContext invocationContext) {
return null;
}
}
// Wrap the agent using AgentTool
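// Name the agent "ImageGen" so the tool name matches the one the Artist instruction refers to
ImageGeneratorAgent imageAgent = new ImageGeneratorAgent("ImageGen", "Generates an image based on a prompt.");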
AgentTool imageTool = AgentTool.create(imageAgent);
// Parent agent uses the AgentTool
LlmAgent artistAgent = LlmAgent.builder()
.name("Artist")
.model("gemini-2.0-flash")
.instruction(
"You are an artist. Create a detailed prompt for an image and then " +
"use the 'ImageGen' tool to generate the image. " +
"The 'ImageGen' tool expects a single string argument named 'request' " +
"containing the image prompt. The tool will return a JSON string in its " +
"'result' field, containing 'image_base64', 'mime_type', and 'status'."
)
.description("An agent that can create images using a generation tool.")
.tools(imageTool) // Include the AgentTool
.build();
// Artist LLM generates a prompt, then calls (argument name as described in the instruction above):
// FunctionCall(name='ImageGen', args={'request': 'a cat wearing a hat'})
// Framework calls imageTool.runAsync(...), which runs ImageGeneratorAgent.
// The resulting image Part is returned to the Artist agent as the tool result.
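The AgentTool handling steps (run the wrapped agent, keep its final response, forward state changes, return the response as the tool result) can be modeled in a few lines. This is a hedged sketch rather than the AgentTool source: as_tool, image_gen, and the dict-based state are hypothetical, and the agent is modeled as an async generator of text responses instead of Event objects.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, Dict

State = Dict[str, object]
ToyAgent = Callable[[State], AsyncIterator[str]]


async def image_gen(state: State) -> AsyncIterator[str]:
    """Hypothetical agent: emits a progress message, then its final response."""
    state["images_generated"] = int(state.get("images_generated", 0)) + 1
    yield "rendering..."
    yield f"image for: {state['request']}"


def as_tool(agent: ToyAgent) -> Callable[[str, State], Awaitable[str]]:
    """Wrap an agent so a parent can call it like an ordinary tool."""

    async def tool(request: str, parent_state: State) -> str:
        child_state = dict(parent_state)   # the agent works on its own copy
        child_state["request"] = request
        final_response = ""
        async for response in agent(child_state):
            final_response = response      # only the last response is kept
        child_state.pop("request")
        parent_state.update(child_state)   # forward state changes to the parent
        return final_response

    return tool


parent_state: State = {"images_generated": 0}
image_tool = as_tool(image_gen)
result = asyncio.run(image_tool("a cat wearing a hat", parent_state))
```

Unlike a transfer, control returns to the caller here: the parent receives the result as a tool response and continues its own turn.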
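These primitives provide the flexibility to design multi-agent interactions ranging from tightly coupled sequential workflows to dynamic, LLM-driven delegation networks.
2. Common Multi-Agent Patterns using ADK Primitives
By combining ADK's composition primitives, you can implement various established patterns for multi-agent collaboration.
Coordinator/Dispatcher Pattern
- Structure: A central LlmAgent (the coordinator) manages several specialized sub_agents.
- Goal: Route incoming requests to the appropriate specialist agent.
- ADK Primitives Used:
  - Hierarchy: The coordinator lists the specialists in sub_agents.
  - Interaction: Primarily uses LLM-Driven Delegation (which requires clear descriptions on the sub-agents and an appropriate instruction on the coordinator) or Explicit Invocation (AgentTool), where the coordinator includes AgentTool-wrapped specialists in its tools.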
# Conceptual Code: Coordinator using LLM Transfer
from google.adk.agents import LlmAgent
billing_agent = LlmAgent(name="Billing", description="Handles billing inquiries.")
support_agent = LlmAgent(name="Support", description="Handles technical support requests.")
coordinator = LlmAgent(
name="HelpDeskCoordinator",
model="gemini-2.0-flash",
instruction="Route user requests: Use Billing agent for payment issues, Support agent for technical problems.",
description="Main help desk router.",
# Agent transfer is implicit with sub_agents in AutoFlow, unless restricted via
# disallow_transfer_to_parent or disallow_transfer_to_peers
sub_agents=[billing_agent, support_agent]
)
# User asks "My payment failed" -> Coordinator's LLM should call transfer_to_agent(agent_name='Billing')
# User asks "I can't log in" -> Coordinator's LLM should call transfer_to_agent(agent_name='Support')
// Conceptual Code: Coordinator using LLM Transfer
import com.google.adk.agents.LlmAgent;
LlmAgent billingAgent = LlmAgent.builder()
.name("Billing")
.description("Handles billing inquiries and payment issues.")
.build();
LlmAgent supportAgent = LlmAgent.builder()
.name("Support")
.description("Handles technical support requests and login problems.")
.build();
LlmAgent coordinator = LlmAgent.builder()
.name("HelpDeskCoordinator")
.model("gemini-2.0-flash")
.instruction("Route user requests: Use Billing agent for payment issues, Support agent for technical problems.")
.description("Main help desk router.")
.subAgents(billingAgent, supportAgent)
// Agent transfer is implicit with sub agents in the Autoflow, unless specified
// using .disallowTransferToParent or disallowTransferToPeers
.build();
// User asks "My payment failed" -> Coordinator's LLM should call
// transferToAgent(agentName='Billing')
// User asks "I can't log in" -> Coordinator's LLM should call
// transferToAgent(agentName='Support')
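Sequential Pipeline Pattern
- Structure: A SequentialAgent contains sub_agents executed in a fixed order.
- Goal: Implement a multi-step process where the output of one step feeds into the next.
- ADK Primitives Used:
  - Workflow: SequentialAgent defines the order.
  - Communication: Primarily uses Shared Session State. Earlier agents write results (often via output_key), and later agents read those results from context.state.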
# Conceptual Code: Sequential Data Pipeline
from google.adk.agents import SequentialAgent, LlmAgent
validator = LlmAgent(name="ValidateInput", instruction="Validate the input.", output_key="validation_status")
processor = LlmAgent(name="ProcessData", instruction="Process data if state key 'validation_status' is 'valid'.", output_key="result")
reporter = LlmAgent(name="ReportResult", instruction="Report the result from state key 'result'.")
data_pipeline = SequentialAgent(
name="DataPipeline",
sub_agents=[validator, processor, reporter]
)
# validator runs -> saves to state['validation_status']
# processor runs -> reads state['validation_status'], saves to state['result']
# reporter runs -> reads state['result']
// Conceptual Code: Sequential Data Pipeline
import com.google.adk.agents.SequentialAgent;
LlmAgent validator = LlmAgent.builder()
.name("ValidateInput")
.instruction("Validate the input")
.outputKey("validation_status") // Saves its main text output to session.state["validation_status"]
.build();
LlmAgent processor = LlmAgent.builder()
.name("ProcessData")
.instruction("Process data if state key 'validation_status' is 'valid'")
.outputKey("result") // Saves its main text output to session.state["result"]
.build();
LlmAgent reporter = LlmAgent.builder()
.name("ReportResult")
.instruction("Report the result from state key 'result'")
.build();
SequentialAgent dataPipeline = SequentialAgent.builder()
.name("DataPipeline")
.subAgents(validator, processor, reporter)
.build();
// validator runs -> saves to state['validation_status']
// processor runs -> reads state['validation_status'], saves to state['result']
// reporter runs -> reads state['result']
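Parallel Fan-Out/Gather Pattern
- Structure: A ParallelAgent runs multiple sub_agents concurrently, often followed by a later agent (in a SequentialAgent) that aggregates the results.
- Goal: Execute independent tasks simultaneously to reduce latency, then combine their outputs.
- ADK Primitives Used:
  - Workflow: ParallelAgent for concurrent execution (fan-out). Often nested within a SequentialAgent to handle the subsequent aggregation step (gather).
  - Communication: Sub-agents write results to distinct keys in Shared Session State. The subsequent "gather" agent reads multiple state keys.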
# Conceptual Code: Parallel Information Gathering
from google.adk.agents import SequentialAgent, ParallelAgent, LlmAgent
fetch_api1 = LlmAgent(name="API1Fetcher", instruction="Fetch data from API 1.", output_key="api1_data")
fetch_api2 = LlmAgent(name="API2Fetcher", instruction="Fetch data from API 2.", output_key="api2_data")
gather_concurrently = ParallelAgent(
name="ConcurrentFetch",
sub_agents=[fetch_api1, fetch_api2]
)
synthesizer = LlmAgent(
name="Synthesizer",
instruction="Combine results from state keys 'api1_data' and 'api2_data'."
)
overall_workflow = SequentialAgent(
name="FetchAndSynthesize",
sub_agents=[gather_concurrently, synthesizer] # Run parallel fetch, then synthesize
)
# fetch_api1 and fetch_api2 run concurrently, saving to state.
# synthesizer runs afterwards, reading state['api1_data'] and state['api2_data'].
// Conceptual Code: Parallel Information Gathering
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.ParallelAgent;
import com.google.adk.agents.SequentialAgent;
LlmAgent fetchApi1 = LlmAgent.builder()
.name("API1Fetcher")
.instruction("Fetch data from API 1.")
.outputKey("api1_data")
.build();
LlmAgent fetchApi2 = LlmAgent.builder()
.name("API2Fetcher")
.instruction("Fetch data from API 2.")
.outputKey("api2_data")
.build();
ParallelAgent gatherConcurrently = ParallelAgent.builder()
.name("ConcurrentFetcher")
.subAgents(fetchApi2, fetchApi1)
.build();
LlmAgent synthesizer = LlmAgent.builder()
.name("Synthesizer")
.instruction("Combine results from state keys 'api1_data' and 'api2_data'.")
.build();
SequentialAgent overallWorkflow = SequentialAgent.builder()
    .name("FetchAndSynthesize") // Run parallel fetch, then synthesize
    .subAgents(gatherConcurrently, synthesizer)
    .build();
// fetchApi1 and fetchApi2 run concurrently, saving to state.
// synthesizer runs afterwards, reading state['api1_data'] and state['api2_data'].
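The fan-out/gather shape itself is independent of ADK and can be sketched with asyncio. The example below is an illustration under stated assumptions: the fetchers are hypothetical coroutines that simulate latency with asyncio.sleep, and a plain dict stands in for the shared session state.

```python
import asyncio
from typing import Dict

State = Dict[str, str]


async def fetch_api1(state: State) -> None:
    await asyncio.sleep(0.02)              # simulated network latency
    state["api1_data"] = "result from API 1"


async def fetch_api2(state: State) -> None:
    await asyncio.sleep(0.01)
    state["api2_data"] = "result from API 2"


def synthesize(state: State) -> str:
    # Gather step: runs only after both fetchers have written their keys.
    return f"{state['api1_data']} + {state['api2_data']}"


async def fetch_and_synthesize() -> str:
    state: State = {}
    # Fan-out: both fetchers run concurrently and write to distinct keys.
    await asyncio.gather(fetch_api1(state), fetch_api2(state))
    return synthesize(state)


combined = asyncio.run(fetch_and_synthesize())
```

Each fetcher owns its own key, so the order in which they finish does not affect the combined result.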
Hierarchical Task Decomposition
- Structure: A multi-level tree of agents where higher-level agents break down complex goals and delegate sub-tasks to lower-level agents.
- Goal: Solve complex problems by recursively breaking them down into simpler, executable steps.
- ADK Primitives Used:
  - Hierarchy: Multi-level parent_agent/sub_agents structure.
  - Interaction: Primarily LLM-Driven Delegation or Explicit Invocation (AgentTool) used by parent agents to assign tasks to sub-agents. Results are returned up the hierarchy (via tool responses or state).
# Conceptual Code: Hierarchical Research Task
from google.adk.agents import LlmAgent
from google.adk.tools import agent_tool
# Low-level tool-like agents
web_searcher = LlmAgent(name="WebSearch", description="Performs web searches for facts.")
summarizer = LlmAgent(name="Summarizer", description="Summarizes text.")
# Mid-level agent combining tools
research_assistant = LlmAgent(
name="ResearchAssistant",
model="gemini-2.0-flash",
description="Finds and summarizes information on a topic.",
tools=[agent_tool.AgentTool(agent=web_searcher), agent_tool.AgentTool(agent=summarizer)]
)
# High-level agent delegating research
report_writer = LlmAgent(
name="ReportWriter",
model="gemini-2.0-flash",
instruction="Write a report on topic X. Use the ResearchAssistant to gather information.",
tools=[agent_tool.AgentTool(agent=research_assistant)]
# Alternatively, could use LLM Transfer if research_assistant is a sub_agent
)
# User interacts with ReportWriter.
# ReportWriter calls ResearchAssistant tool.
# ResearchAssistant calls WebSearch and Summarizer tools.
# Results flow back up.
// Conceptual Code: Hierarchical Research Task
import com.google.adk.agents.LlmAgent;
import com.google.adk.tools.AgentTool;
// Low-level tool-like agents
LlmAgent webSearcher = LlmAgent.builder()
.name("WebSearch")
.description("Performs web searches for facts.")
.build();
LlmAgent summarizer = LlmAgent.builder()
.name("Summarizer")
.description("Summarizes text.")
.build();
// Mid-level agent combining tools
LlmAgent researchAssistant = LlmAgent.builder()
.name("ResearchAssistant")
.model("gemini-2.0-flash")
.description("Finds and summarizes information on a topic.")
.tools(AgentTool.create(webSearcher), AgentTool.create(summarizer))
.build();
// High-level agent delegating research
LlmAgent reportWriter = LlmAgent.builder()
.name("ReportWriter")
.model("gemini-2.0-flash")
.instruction("Write a report on topic X. Use the ResearchAssistant to gather information.")
.tools(AgentTool.create(researchAssistant))
// Alternatively, could use LLM Transfer if researchAssistant is a subAgent
.build();
// User interacts with ReportWriter.
// ReportWriter calls ResearchAssistant tool.
// ResearchAssistant calls WebSearch and Summarizer tools.
// Results flow back up.
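Review/Critique Pattern (Generator-Critic)
- Structure: Typically involves two agents within a SequentialAgent: a generator and a critic/reviewer.
- Goal: Improve the quality or validity of generated output by having a dedicated agent review it.
- ADK Primitives Used:
  - Workflow: SequentialAgent ensures generation happens before review.
  - Communication: Shared Session State (the generator uses output_key to save its output; the reviewer reads that state key). The reviewer may save its feedback to another state key for subsequent steps.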
# Conceptual Code: Generator-Critic
from google.adk.agents import SequentialAgent, LlmAgent
generator = LlmAgent(
name="DraftWriter",
instruction="Write a short paragraph about subject X.",
output_key="draft_text"
)
reviewer = LlmAgent(
name="FactChecker",
instruction="Review the text in state key 'draft_text' for factual accuracy. Output 'valid' or 'invalid' with reasons.",
output_key="review_status"
)
# Optional: Further steps based on review_status
review_pipeline = SequentialAgent(
name="WriteAndReview",
sub_agents=[generator, reviewer]
)
# generator runs -> saves draft to state['draft_text']
# reviewer runs -> reads state['draft_text'], saves status to state['review_status']
// Conceptual Code: Generator-Critic
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.SequentialAgent;
LlmAgent generator = LlmAgent.builder()
.name("DraftWriter")
.instruction("Write a short paragraph about subject X.")
.outputKey("draft_text")
.build();
LlmAgent reviewer = LlmAgent.builder()
.name("FactChecker")
.instruction("Review the text in state key 'draft_text' for factual accuracy. Output 'valid' or 'invalid' with reasons.")
.outputKey("review_status")
.build();
// Optional: Further steps based on review_status
SequentialAgent reviewPipeline = SequentialAgent.builder()
.name("WriteAndReview")
.subAgents(generator, reviewer)
.build();
// generator runs -> saves draft to state['draft_text']
// reviewer runs -> reads state['draft_text'], saves status to state['review_status']
Iterative Refinement Pattern
- Structure: Uses a LoopAgent containing one or more agents that work on a task over multiple iterations.
- Goal: Progressively improve a result (e.g., code, text, plan) stored in the session state until a quality threshold is met or a maximum number of iterations is reached.
- ADK Primitives Used:
  - Workflow: LoopAgent manages the repetition.
  - Communication: Shared Session State is essential for agents to read the previous iteration's output and save the refined version.
  - Termination: The loop typically ends based on max_iterations or a dedicated checking agent setting escalate=True in the Event Actions when the result is satisfactory.
# Conceptual Code: Iterative Code Refinement
from google.adk.agents import LoopAgent, LlmAgent, BaseAgent
from google.adk.events import Event, EventActions
from google.adk.agents.invocation_context import InvocationContext
from typing import AsyncGenerator
# Agent to generate/refine code based on state['current_code'] and state['requirements']
code_refiner = LlmAgent(
name="CodeRefiner",
instruction="Read state['current_code'] (if exists) and state['requirements']. Generate/refine Python code to meet requirements. Save to state['current_code'].",
output_key="current_code" # Overwrites previous code in state
)
# Agent to check if the code meets quality standards
quality_checker = LlmAgent(
name="QualityChecker",
instruction="Evaluate the code in state['current_code'] against state['requirements']. Output 'pass' or 'fail'.",
output_key="quality_status"
)
# Custom agent to check the status and escalate if 'pass'
class CheckStatusAndEscalate(BaseAgent):
    async def _run_async_impl(self, ctx: InvocationContext) -> AsyncGenerator[Event, None]:
        status = ctx.session.state.get("quality_status", "fail")
        should_stop = (status == "pass")
        yield Event(author=self.name, actions=EventActions(escalate=should_stop))
refinement_loop = LoopAgent(
name="CodeRefinementLoop",
max_iterations=5,
sub_agents=[code_refiner, quality_checker, CheckStatusAndEscalate(name="StopChecker")]
)
# Loop runs: Refiner -> Checker -> StopChecker
# State['current_code'] is updated each iteration.
# Loop stops if QualityChecker outputs 'pass' (leading to StopChecker escalating) or after 5 iterations.
// Conceptual Code: Iterative Code Refinement
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.LoopAgent;
import com.google.adk.events.Event;
import com.google.adk.events.EventActions;
import com.google.adk.agents.InvocationContext;
import io.reactivex.rxjava3.core.Flowable;
import java.util.List;
// Agent to generate/refine code based on state['current_code'] and state['requirements']
LlmAgent codeRefiner = LlmAgent.builder()
.name("CodeRefiner")
.instruction("Read state['current_code'] (if exists) and state['requirements']. Generate/refine Java code to meet requirements. Save to state['current_code'].")
.outputKey("current_code") // Overwrites previous code in state
.build();
// Agent to check if the code meets quality standards
LlmAgent qualityChecker = LlmAgent.builder()
.name("QualityChecker")
.instruction("Evaluate the code in state['current_code'] against state['requirements']. Output 'pass' or 'fail'.")
.outputKey("quality_status")
.build();
BaseAgent checkStatusAndEscalate = new BaseAgent(
"StopChecker","Checks quality_status and escalates if 'pass'.", List.of(), null, null) {
@Override
protected Flowable<Event> runAsyncImpl(InvocationContext invocationContext) {
String status = (String) invocationContext.session().state().getOrDefault("quality_status", "fail");
boolean shouldStop = "pass".equals(status);
EventActions actions = EventActions.builder().escalate(shouldStop).build();
Event event = Event.builder()
.author(this.name())
.actions(actions)
.build();
return Flowable.just(event);
}
};
LoopAgent refinementLoop = LoopAgent.builder()
.name("CodeRefinementLoop")
.maxIterations(5)
.subAgents(codeRefiner, qualityChecker, checkStatusAndEscalate)
.build();
// Loop runs: Refiner -> Checker -> StopChecker
// State['current_code'] is updated each iteration.
// Loop stops if QualityChecker outputs 'pass' (leading to StopChecker escalating) or after 5
// iterations.
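Human-in-the-Loop Pattern
- Structure: Integrates human intervention points within an agent workflow.
- Goal: Allow for human oversight, approval, correction, or tasks that AI cannot perform.
- ADK Primitives Used (Conceptual):
  - Interaction: Can be implemented with a custom tool that pauses execution and sends a request to an external system (for example, a UI or ticketing system), waiting for human input. The tool then returns the human's response to the agent.
  - Workflow: Could use LLM-Driven Delegation (transfer_to_agent) targeting a conceptual "human agent" that triggers the external workflow, or use the custom tool within an LlmAgent.
  - State/Callbacks: State can hold task details for the human; callbacks can manage the interaction flow.
  - Note: ADK does not have a built-in "human agent" type, so this requires custom integration.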
# Conceptual Code: Using a Tool for Human Approval
from google.adk.agents import LlmAgent, SequentialAgent
from google.adk.tools import FunctionTool
# --- Assume external_approval_tool exists ---
# This tool would:
# 1. Take details (e.g., request_id, amount, reason).
# 2. Send these details to a human review system (e.g., via API).
# 3. Poll or wait for the human response (approved/rejected).
# 4. Return the human's decision.
# async def external_approval_tool(amount: float, reason: str) -> str: ...
approval_tool = FunctionTool(func=external_approval_tool)
# Agent that prepares the request
prepare_request = LlmAgent(
name="PrepareApproval",
instruction="Prepare the approval request details based on user input. Store amount and reason in state.",
# ... likely sets state['approval_amount'] and state['approval_reason'] ...
)
# Agent that calls the human approval tool
request_approval = LlmAgent(
name="RequestHumanApproval",
instruction="Use the external_approval_tool with amount from state['approval_amount'] and reason from state['approval_reason'].",
tools=[approval_tool],
output_key="human_decision"
)
# Agent that proceeds based on human decision
process_decision = LlmAgent(
name="ProcessDecision",
instruction="Check state key 'human_decision'. If 'approved', proceed. If 'rejected', inform user."
)
approval_workflow = SequentialAgent(
name="HumanApprovalWorkflow",
sub_agents=[prepare_request, request_approval, process_decision]
)
// Conceptual Code: Using a Tool for Human Approval
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.SequentialAgent;
import com.google.adk.tools.FunctionTool;
// --- Assume external_approval_tool exists ---
// This tool would:
// 1. Take details (e.g., request_id, amount, reason).
// 2. Send these details to a human review system (e.g., via API).
// 3. Poll or wait for the human response (approved/rejected).
// 4. Return the human's decision.
// public boolean externalApprovalTool(float amount, String reason) { ... }
FunctionTool approvalTool = FunctionTool.create(externalApprovalTool);
// Agent that prepares the request
LlmAgent prepareRequest = LlmAgent.builder()
.name("PrepareApproval")
.instruction("Prepare the approval request details based on user input. Store amount and reason in state.")
// ... likely sets state['approval_amount'] and state['approval_reason'] ...
.build();
// Agent that calls the human approval tool
LlmAgent requestApproval = LlmAgent.builder()
.name("RequestHumanApproval")
.instruction("Use the external_approval_tool with amount from state['approval_amount'] and reason from state['approval_reason'].")
.tools(approvalTool)
.outputKey("human_decision")
.build();
// Agent that proceeds based on human decision
LlmAgent processDecision = LlmAgent.builder()
.name("ProcessDecision")
.instruction("Check state key 'human_decision'. If 'approved', proceed. If 'rejected', inform user.")
.build();
SequentialAgent approvalWorkflow = SequentialAgent.builder()
.name("HumanApprovalWorkflow")
.subAgents(prepareRequest, requestApproval, processDecision)
.build();
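The examples above assume that external_approval_tool already exists. One possible shape for it is sketched below, purely as an illustration: ReviewQueue is a hypothetical in-memory stand-in for a real review system (a UI, a ticketing API), make_approval_tool is a hypothetical factory, and returning "timeout" when nobody answers is a design choice of this sketch rather than ADK behavior.

```python
import asyncio
import itertools
from typing import Awaitable, Callable, Dict


class ReviewQueue:
    """Hypothetical in-memory stand-in for an external human review system."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._pending: Dict[int, "asyncio.Future[str]"] = {}

    def submit(self, amount: float, reason: str) -> "asyncio.Future[str]":
        # A real system would send amount and reason to a reviewer here.
        future = asyncio.get_running_loop().create_future()
        self._pending[next(self._ids)] = future
        return future

    def decide_oldest(self, decision: str) -> None:
        """Simulate a human answering the oldest pending request."""
        request_id = next(iter(self._pending))
        self._pending.pop(request_id).set_result(decision)

    def discard(self, future: "asyncio.Future[str]") -> None:
        for request_id, pending in list(self._pending.items()):
            if pending is future:
                del self._pending[request_id]


def make_approval_tool(
    queue: ReviewQueue, timeout_s: float
) -> Callable[[float, str], Awaitable[str]]:
    async def external_approval_tool(amount: float, reason: str) -> str:
        """Wait for a human decision; give up after timeout_s seconds."""
        future = queue.submit(amount, reason)
        try:
            return await asyncio.wait_for(future, timeout=timeout_s)
        except asyncio.TimeoutError:
            queue.discard(future)
            return "timeout"

    return external_approval_tool


async def demo() -> tuple:
    queue = ReviewQueue()
    call = asyncio.create_task(make_approval_tool(queue, 1.0)(250.0, "team offsite"))
    await asyncio.sleep(0)           # let the tool submit its request
    queue.decide_oldest("approved")  # the "human" responds
    answered = await call
    unanswered = await make_approval_tool(queue, 0.01)(10.0, "nobody is reviewing")
    return answered, unanswered


approved, timed_out = asyncio.run(demo())
```

The inner function keeps the (amount, reason) signature from the commented declaration above, so it could be wrapped the same way; a bounded wait keeps the workflow from hanging indefinitely if no reviewer responds.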
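These patterns provide a starting point for structuring your multi-agent systems. You can mix and match them as needed to create the most effective architecture for your specific application.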