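Function Tools
When the prebuilt ADK tools don't meet your needs, you can create custom function tools. Building a function tool lets you add tailored functionality, such as connecting to a proprietary database or implementing a unique algorithm.
For example, a function tool named myfinancetool might be a function that calculates a specific financial metric. ADK also supports long-running functions, so if that calculation takes a while, the agent can continue working on other tasks.
ADK offers several ways to create function tools, each suited to a different level of complexity and control: the Function Tool, the Long Running Function Tool, and Agent-as-a-Tool.
Function Tool
Turning a Python function into a tool is a straightforward way to integrate custom logic into your agent. When you assign a function to an agent's tools list, the framework automatically wraps it as a FunctionTool.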
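How it Works
The ADK framework automatically inspects your Python function's signature, including its name, docstring, parameters, type hints, and default values, to generate a schema. The LLM uses this schema to understand the tool's purpose, when to use it, and which arguments it requires.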
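To make the idea of signature inspection concrete, the sketch below uses Python's standard `inspect` module to collect a function's name, docstring, parameter types, and defaults into a schema-like dictionary. The `describe_function` helper and the dictionary layout are hypothetical illustrations; they are not ADK's actual schema generator or its output format.

```python
import inspect
from typing import Any


def describe_function(func) -> dict[str, Any]:
    """Collect signature details into a schema-like dict (illustrative only)."""
    signature = inspect.signature(func)
    properties = {}
    required = []
    for name, param in signature.parameters.items():
        annotation = param.annotation
        # Use the type's short name when available (e.g. "str", "int").
        type_name = getattr(annotation, "__name__", str(annotation))
        properties[name] = {"type": type_name}
        if param.default is inspect.Parameter.empty:
            # No default value: the caller must supply this argument.
            required.append(name)
        else:
            properties[name]["default"] = param.default
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "properties": properties,
        "required": required,
    }


def search_flights(destination: str, departure_date: str, flexible_days: int = 0):
    """Search for flights."""
    return {"status": "success"}


schema = describe_function(search_flights)
print(schema["name"])      # search_flights
print(schema["required"])  # ['destination', 'departure_date']
```

Parameters with a default value stay out of the `required` list, which mirrors how required and optional parameters are distinguished in the sections that follow.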
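Defining Function Signatures
A well-defined function signature is crucial for the LLM to use your tool correctly.
Parameters
You can define functions with required parameters, optional parameters, and variadic parameters. Here is how each is handled:
Required Parameters
A parameter is considered required if it has a type hint but no default value. The LLM must provide a value for this parameter when it calls the tool.
Example: Required parameters
In this example, city and unit are both required. If the LLM tries to call get_weather without one of them, ADK returns an error to the LLM, prompting it to correct the call.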
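A minimal sketch of such a function is shown here, assuming a placeholder body that returns a canned report rather than calling a real weather service. The final lines show plain Python behavior when an argument is missing (a `TypeError`); inside ADK, the framework reports the problem back to the LLM instead.

```python
def get_weather(city: str, unit: str):
    """
    Retrieves the weather for a city in the specified unit.

    Args:
        city (str): The city name.
        unit (str): The temperature unit, either 'Celsius' or 'Fahrenheit'.
    """
    # Placeholder logic (assumption): a real tool would query a weather service.
    return {"status": "success", "report": f"Weather for {city} is sunny ({unit})."}


result = get_weather("Paris", "Celsius")

# Both parameters lack defaults, so omitting one fails in plain Python.
try:
    get_weather("Paris")
except TypeError as exc:
    missing_error = str(exc)
```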
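Optional Parameters with Default Values
A parameter is considered optional if you provide a default value. This is the standard Python way to define optional parameters. ADK interprets these parameters correctly and does not list them in the required field of the tool schema sent to the LLM.
Example: Optional parameter with a default value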
def search_flights(destination: str, departure_date: str, flexible_days: int = 0):
    """
    Searches for flights.

    Args:
        destination (str): The destination city.
        departure_date (str): The desired departure date.
        flexible_days (int, optional): Number of flexible days for the search. Defaults to 0.
    """
    # ... function logic ...
    if flexible_days > 0:
        return {"status": "success", "report": f"Found flexible flights to {destination}."}
    return {"status": "success", "report": f"Found flights to {destination} on {departure_date}."}
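Here, flexible_days is optional. The LLM can choose to provide it, but it is not required.
Optional Parameters with typing.Optional
You can also mark a parameter as optional with typing.Optional[SomeType] or the | None syntax (Python 3.10+). This signals that the parameter can be None. When combined with a default value of None, it behaves like a standard optional parameter.
Example: typing.Optional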
from typing import Optional


def create_user_profile(username: str, bio: Optional[str] = None):
    """
    Creates a new user profile.

    Args:
        username (str): The user's unique username.
        bio (str, optional): A short biography for the user. Defaults to None.
    """
    # ... function logic ...
    if bio:
        return {"status": "success", "message": f"Profile for {username} created with a bio."}
    return {"status": "success", "message": f"Profile for {username} created."}
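Variadic Parameters (*args and **kwargs)
You can include *args (variable positional arguments) and **kwargs (variable keyword arguments) in your function signature for other purposes, but the ADK framework ignores them when it generates the tool schema for the LLM. The LLM is not aware of them and cannot pass arguments to them. It is best to rely on explicitly defined parameters for all the data you expect from the LLM.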
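The effect can be pictured with the standard `inspect` module, which labels each parameter with a kind. The sketch below is only an illustration: `visible_parameters` and `log_event` are hypothetical names, and the filtering is a stand-in for whatever ADK does internally.

```python
import inspect


def visible_parameters(func) -> list[str]:
    """Return parameter names, skipping *args and **kwargs (illustrative only)."""
    skipped_kinds = (
        inspect.Parameter.VAR_POSITIONAL,  # *args
        inspect.Parameter.VAR_KEYWORD,     # **kwargs
    )
    return [
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind not in skipped_kinds
    ]


def log_event(message: str, level: str = "info", *args, **kwargs):
    """Record an event; the variadic parameters are for internal callers only."""
    return {"status": "success", "message": message, "level": level}


print(visible_parameters(log_event))  # ['message', 'level']
```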
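Return Type
The preferred return type for a function tool is a dictionary in Python or a Map in Java. This lets you structure the response as key-value pairs, giving the LLM context and clarity. If your function returns a type other than a dictionary, the framework automatically wraps it in a dictionary with a single key named "result".
Make your return values as descriptive as possible. For example, instead of returning a numeric error code, return a dictionary with an "error_message" key that contains a human-readable explanation. Remember that the LLM, not a piece of code, needs to understand the result. As a best practice, include a "status" key in your return dictionary to indicate the overall outcome (for example, "success", "error", or "pending"), which gives the LLM a clear signal about the state of the operation.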
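The sketch below illustrates both points under stated assumptions: `wrap_result` imitates the described wrapping of non-dictionary values (it is not the framework's own code), and `lookup_order` is a hypothetical tool backed by made-up in-memory data that returns a descriptive dictionary instead of a bare error code.

```python
from typing import Any


def wrap_result(value: Any) -> dict[str, Any]:
    """Imitate the described wrapping: non-dict values go under a "result" key."""
    if isinstance(value, dict):
        return value
    return {"result": value}


def lookup_order(order_id: str) -> dict[str, Any]:
    """Return a descriptive dict with a "status" key rather than an error code."""
    orders = {"A100": "shipped"}  # hypothetical in-memory data
    if order_id not in orders:
        return {
            "status": "error",
            "error_message": f"No order found with ID '{order_id}'.",
        }
    return {"status": "success", "order_status": orders[order_id]}


print(wrap_result(42))       # {'result': 42}
print(lookup_order("A100"))  # {'status': 'success', 'order_status': 'shipped'}
print(lookup_order("Z999"))
```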
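Docstrings
Passing Data Between Tools
When an agent calls multiple tools in sequence, you may need to pass data from one tool to another. The recommended approach is to use the temp: prefix in session state.
A tool can write data to a temp: variable, and a later tool can read it. This data is available only during the current invocation and is discarded afterwards.
Shared invocation context
All tool calls within a single agent turn share the same InvocationContext. This means they also share the same temporary (temp:) state, which is how data is passed between them.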
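A minimal sketch of this pattern follows. It assumes each tool receives a context object exposing a dictionary-like `state` attribute; in ADK that parameter would typically be typed as `ToolContext`. To keep the sketch self-contained, a `SimpleNamespace` stands in for the injected context, and the tool names and the `temp:user_id` key are hypothetical.

```python
from types import SimpleNamespace


def fetch_user_id(email: str, tool_context) -> dict:
    """First tool: derive a user ID and stash it in temporary state."""
    user_id = "user-" + email.split("@")[0]
    tool_context.state["temp:user_id"] = user_id
    return {"status": "success", "user_id": user_id}


def fetch_orders(tool_context) -> dict:
    """Second tool: read the ID written by the earlier tool call."""
    user_id = tool_context.state.get("temp:user_id")
    if user_id is None:
        return {"status": "error", "error_message": "No user ID found in temp state."}
    return {"status": "success", "orders": [f"{user_id}-order-1"]}


# Stand-in for the context object the framework would inject (assumption).
fake_context = SimpleNamespace(state={})
fetch_user_id("ada@example.com", fake_context)
orders = fetch_orders(fake_context)
print(orders)  # {'status': 'success', 'orders': ['user-ada-order-1']}
```

Returning an error dictionary when the key is missing keeps the second tool usable even if the LLM calls it before the first one.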
Example
??? "Example" {: #example}
=== "Python"
The following tool is a Python function that retrieves the stock price for a given stock ticker/symbol.
<u>Note</u>: Run `pip install yfinance` before using this tool.
```py
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

import yfinance as yf

APP_NAME = "stock_app"
USER_ID = "1234"
SESSION_ID = "session1234"


def get_stock_price(symbol: str):
    """
    Retrieves the current stock price for a given symbol.

    Args:
        symbol (str): The stock symbol (e.g., "AAPL", "GOOG").

    Returns:
        float: The current stock price, or None if an error occurs.
    """
    try:
        stock = yf.Ticker(symbol)
        historical_data = stock.history(period="1d")
        if not historical_data.empty:
            current_price = historical_data['Close'].iloc[-1]
            return current_price
        else:
            return None
    except Exception as e:
        print(f"Error retrieving stock price for {symbol}: {e}")
        return None


stock_price_agent = Agent(
    model='gemini-2.0-flash',
    name='stock_agent',
    instruction= 'You are an agent who retrieves stock prices. If a ticker symbol is provided, fetch the current price. If only a company name is given, first perform a Google search to find the correct ticker symbol before retrieving the stock price. If the provided ticker symbol is invalid or data cannot be retrieved, inform the user that the stock price could not be found.',
    description='This agent specializes in retrieving real-time stock prices. Given a stock ticker symbol (e.g., AAPL, GOOG, MSFT) or the stock name, use the tools and reliable data sources to provide the most up-to-date price.',
    tools=[get_stock_price],  # You can add Python functions directly to the tools list; they will be automatically wrapped as FunctionTools.
)


# Session and Runner
async def setup_session_and_runner():
    session_service = InMemorySessionService()
    session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)
    runner = Runner(agent=stock_price_agent, app_name=APP_NAME, session_service=session_service)
    return session, runner


# Agent Interaction
async def call_agent_async(query):
    content = types.Content(role='user', parts=[types.Part(text=query)])
    session, runner = await setup_session_and_runner()
    events = runner.run_async(user_id=USER_ID, session_id=SESSION_ID, new_message=content)

    async for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("Agent Response: ", final_response)


# Note: In Colab, you can directly use 'await' at the top level.
# If running this code as a standalone Python script, you'll need to use asyncio.run() or manage the event loop.
await call_agent_async("stock price of GOOG")
```
The return value from this tool will be wrapped into a dictionary.
```json
{"result": "$123"}
```
=== "Java"
The following tool retrieves a mocked stock price.
```java
import com.google.adk.agents.LlmAgent;
import com.google.adk.events.Event;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.sessions.Session;
import com.google.adk.tools.Annotations.Schema;
import com.google.adk.tools.FunctionTool;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import io.reactivex.rxjava3.core.Flowable;
import java.util.HashMap;
import java.util.Map;

public class StockPriceAgent {

  private static final String APP_NAME = "stock_agent";
  private static final String USER_ID = "user1234";

  // Mock data for various stocks functionality
  // NOTE: This is a MOCK implementation. In a real Java application,
  // you would use a financial data API or library.
  private static final Map<String, Double> mockStockPrices = new HashMap<>();

  static {
    mockStockPrices.put("GOOG", 1.0);
    mockStockPrices.put("AAPL", 1.0);
    mockStockPrices.put("MSFT", 1.0);
  }

  @Schema(description = "Retrieves the current stock price for a given symbol.")
  public static Map<String, Object> getStockPrice(
      @Schema(description = "The stock symbol (e.g., \"AAPL\", \"GOOG\")",
        name = "symbol")
      String symbol) {

    try {
      if (mockStockPrices.containsKey(symbol.toUpperCase())) {
        double currentPrice = mockStockPrices.get(symbol.toUpperCase());
        System.out.println("Tool: Found price for " + symbol + ": " + currentPrice);
        return Map.of("symbol", symbol, "price", currentPrice);
      } else {
        return Map.of("symbol", symbol, "error", "No data found for symbol");
      }
    } catch (Exception e) {
      return Map.of("symbol", symbol, "error", e.getMessage());
    }
  }

  public static void callAgent(String prompt) {
    // Create the FunctionTool from the Java method
    FunctionTool getStockPriceTool = FunctionTool.create(StockPriceAgent.class, "getStockPrice");

    LlmAgent stockPriceAgent =
        LlmAgent.builder()
            .model("gemini-2.0-flash")
            .name("stock_agent")
            .instruction(
                "You are an agent who retrieves stock prices. If a ticker symbol is provided, fetch the current price. If only a company name is given, first perform a Google search to find the correct ticker symbol before retrieving the stock price. If the provided ticker symbol is invalid or data cannot be retrieved, inform the user that the stock price could not be found.")
            .description(
                "This agent specializes in retrieving real-time stock prices. Given a stock ticker symbol (e.g., AAPL, GOOG, MSFT) or the stock name, use the tools and reliable data sources to provide the most up-to-date price.")
            .tools(getStockPriceTool) // Add the Java FunctionTool
            .build();

    // Create an InMemoryRunner
    InMemoryRunner runner = new InMemoryRunner(stockPriceAgent, APP_NAME);
    // InMemoryRunner automatically creates a session service. Create a session using the service
    Session session = runner.sessionService().createSession(APP_NAME, USER_ID).blockingGet();
    Content userMessage = Content.fromParts(Part.fromText(prompt));

    // Run the agent
    Flowable<Event> eventStream = runner.runAsync(USER_ID, session.id(), userMessage);

    // Stream event response
    eventStream.blockingForEach(
        event -> {
          if (event.finalResponse()) {
            System.out.println(event.stringifyContent());
          }
        });
  }

  public static void main(String[] args) {
    callAgent("stock price of GOOG");
    callAgent("What's the price of MSFT?");
    callAgent("Can you find the stock price for an unknown company XYZ?");
  }
}
```
The return value from this tool will be wrapped into a Map<String, Object>.
For the input `GOOG`:
```json
{"symbol": "GOOG", "price": 1.0}
```
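Best Practices
While you have considerable flexibility in defining your function, remember that simplicity enhances usability for the LLM. Consider these guidelines:
- Fewer parameters are better: Minimize the number of parameters to reduce complexity.
- Simple data types: Favor primitive data types such as str and int over custom classes whenever possible.
- Meaningful names: The function's name and parameter names significantly influence how the LLM interprets and uses the tool. Choose names that clearly reflect the function's purpose and the meaning of its inputs. Avoid generic names like do_stuff() or beAgent().
- Build for parallel execution: Improve function-calling performance when multiple tools run by building for asynchronous operation. For information on enabling parallel execution for tools, see Increase tool performance with parallel execution.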
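Long Running Function Tool
This tool is designed to help you start and manage tasks that are handled outside the operation of your agent workflow and that require a significant amount of processing time, without blocking the agent's execution. It is a subclass of FunctionTool.
When you use a LongRunningFunctionTool, your function can start a long-running operation and optionally return an initial result, such as a long-running operation ID. Once a long-running function tool is invoked, the agent runner pauses the agent run and lets the agent client decide whether to continue or to wait until the long-running operation finishes. The agent client can query the progress of the long-running operation and send back an intermediate or final response. The agent can then continue with other tasks. One example is a human-in-the-loop scenario, where the agent needs human approval before proceeding with a task.
Warning: Execution handling
Long Running Function Tools are designed to help you start and manage long-running tasks as part of your agent workflow, but they do not perform the actual long-running task. For tasks that require significant time to complete, you should implement a separate server to do the work.
Tip: Parallel execution
Depending on the type of tool you are building, designing for asynchronous operation may be a better solution than creating a long-running tool. For more information, see Increase tool performance with parallel execution.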
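How it Works
In Python, you wrap a function with LongRunningFunctionTool. In Java, you pass a method name to LongRunningFunctionTool.create().
- Initiation: When the LLM calls the tool, your function starts the long-running operation.
- Initial updates: Your function can optionally return an initial result (for example, the long-running operation ID). The ADK framework packages the result in a FunctionResponse and sends it back to the LLM. This lets the LLM inform the user (for example, with a status, a percentage complete, or a message). The agent run is then ended or paused.
- Continue or wait: After each agent run completes, the agent client can query the progress of the long-running operation and decide whether to continue the agent run with an intermediate response (to update progress) or to wait until a final response is available. The agent client should send the intermediate or final response back to the agent for the next run.
- Framework handling: The ADK framework manages the execution flow. It passes the intermediate or final FunctionResponse sent by the agent client to the LLM, which generates a user-friendly message.
Creating the Tool
Define your tool function and wrap it with the LongRunningFunctionTool class: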
from typing import Any

from google.adk.tools import LongRunningFunctionTool


# 1. Define the long running function
def ask_for_approval(
    purpose: str, amount: float
) -> dict[str, Any]:
    """Ask for approval for the reimbursement."""
    # create a ticket for the approval
    # Send a notification to the approver with the link of the ticket
    return {'status': 'pending', 'approver': 'Sean Zhou', 'purpose': purpose, 'amount': amount, 'ticket-id': 'approval-ticket-1'}


def reimburse(purpose: str, amount: float) -> dict[str, Any]:
    """Reimburse the amount of money to the employee."""
    # send the reimbursement request to payment vendor
    return {'status': 'ok'}


# 2. Wrap the function with LongRunningFunctionTool
long_running_tool = LongRunningFunctionTool(func=ask_for_approval)
import com.google.adk.agents.LlmAgent;
import com.google.adk.tools.LongRunningFunctionTool;
import java.util.HashMap;
import java.util.Map;

public class ExampleLongRunningFunction {

  // Define your long-running function.
  // Ask for approval for the reimbursement.
  public static Map<String, Object> askForApproval(String purpose, double amount) {
    // Simulate creating a ticket and sending a notification
    System.out.println(
        "Simulating ticket creation for purpose: " + purpose + ", amount: " + amount);

    // Send a notification to the approver with the link of the ticket
    Map<String, Object> result = new HashMap<>();
    result.put("status", "pending");
    result.put("approver", "Sean Zhou");
    result.put("purpose", purpose);
    result.put("amount", amount);
    result.put("ticket-id", "approval-ticket-1");
    return result;
  }

  public static void main(String[] args) throws NoSuchMethodException {
    // Pass the method to LongRunningFunctionTool.create
    LongRunningFunctionTool approveTool =
        LongRunningFunctionTool.create(ExampleLongRunningFunction.class, "askForApproval");

    // Include the tool in the agent
    LlmAgent approverAgent =
        LlmAgent.builder()
            // ...
            .tools(approveTool)
            .build();
  }
}
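Intermediate / Final Result Updates
After receiving an event that contains a long-running function call, the agent client checks the status of the ticket. The agent client can then send an intermediate or final response to update the progress. The framework packages this value (even if it is None) into the content of the FunctionResponse sent back to the LLM.
Note: Long-running function responses with the Resume feature
If your ADK agent workflow is configured with the Resume feature (resuming stopped agents), you must also include the invocation ID (invocation_id) parameter in the long-running function response. The invocation ID you provide must be the same invocation that generated the long-running function request; otherwise the system starts a new invocation with the response. If your agent uses the Resume feature, consider including the invocation ID as a parameter in the long-running function request so that it can be included with the response. For more details on the Resume feature, see Resume stopped agents.
Java ADK only
When passing ToolContext with function tools, make sure that one of the following is true:
- The ToolContext parameter is passed with the Schema annotation in the function signature, for example `@Schema(name = "toolContext") ToolContext toolContext`, or
- The `-parameters` flag is set on the mvn compiler plugin.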
from typing import Optional


# Agent Interaction
async def call_agent_async(query):

    def get_long_running_function_call(event: Event) -> Optional[types.FunctionCall]:
        # Get the long running function call from the event
        if not event.long_running_tool_ids or not event.content or not event.content.parts:
            return None
        for part in event.content.parts:
            if (
                part
                and part.function_call
                and event.long_running_tool_ids
                and part.function_call.id in event.long_running_tool_ids
            ):
                return part.function_call
        return None

    def get_function_response(event: Event, function_call_id: str) -> Optional[types.FunctionResponse]:
        # Get the function response for the function call with the specified id.
        if not event.content or not event.content.parts:
            return None
        for part in event.content.parts:
            if (
                part
                and part.function_response
                and part.function_response.id == function_call_id
            ):
                return part.function_response
        return None

    content = types.Content(role='user', parts=[types.Part(text=query)])
    session, runner = await setup_session_and_runner()

    print("\nRunning agent...")
    events_async = runner.run_async(
        session_id=session.id, user_id=USER_ID, new_message=content
    )

    long_running_function_call, long_running_function_response, ticket_id = None, None, None
    async for event in events_async:
        # Use the helpers to look for the long running function call and its response
        if not long_running_function_call:
            long_running_function_call = get_long_running_function_call(event)
        else:
            _potential_response = get_function_response(event, long_running_function_call.id)
            if _potential_response:  # Only update if we get a non-None response
                long_running_function_response = _potential_response
                ticket_id = long_running_function_response.response['ticket-id']
        if event.content and event.content.parts:
            if text := ''.join(part.text or '' for part in event.content.parts):
                print(f'[{event.author}]: {text}')

    if long_running_function_response:
        # query the status of the corresponding ticket via ticket_id
        # send back an intermediate / final response
        updated_response = long_running_function_response.model_copy(deep=True)
        updated_response.response = {'status': 'approved'}
        async for event in runner.run_async(
            session_id=session.id, user_id=USER_ID, new_message=types.Content(parts=[types.Part(function_response=updated_response)], role='user')
        ):
            if event.content and event.content.parts:
                if text := ''.join(part.text or '' for part in event.content.parts):
                    print(f'[{event.author}]: {text}')
import com.google.adk.agents.LlmAgent;
import com.google.adk.events.Event;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.runner.Runner;
import com.google.adk.sessions.Session;
import com.google.adk.tools.Annotations.Schema;
import com.google.adk.tools.LongRunningFunctionTool;
import com.google.adk.tools.ToolContext;
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import com.google.genai.types.Content;
import com.google.genai.types.FunctionCall;
import com.google.genai.types.FunctionResponse;
import com.google.genai.types.Part;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.Collectors;

public class LongRunningFunctionExample {

  private static String USER_ID = "user123";

  @Schema(
      name = "create_ticket_long_running",
      description = """
          Creates a new support ticket with a specified urgency level.
          Examples of urgency are 'high', 'medium', or 'low'.
          The ticket creation is a long-running process, and its ID will be provided when ready.
      """)
  public static void createTicketAsync(
      @Schema(
              name = "urgency",
              description =
                  "The urgency level for the new ticket, such as 'high', 'medium', or 'low'.")
          String urgency,
      @Schema(name = "toolContext") // Ensures ADK injection
          ToolContext toolContext) {
    System.out.printf(
        "TOOL_EXEC: 'create_ticket_long_running' called with urgency: %s (Call ID: %s)%n",
        urgency, toolContext.functionCallId().orElse("N/A"));
  }

  public static void main(String[] args) {
    LlmAgent agent =
        LlmAgent.builder()
            .name("ticket_agent")
            .description("Agent for creating tickets via a long-running task.")
            .model("gemini-2.0-flash")
            .tools(
                ImmutableList.of(
                    LongRunningFunctionTool.create(
                        LongRunningFunctionExample.class, "createTicketAsync")))
            .build();

    Runner runner = new InMemoryRunner(agent);
    Session session =
        runner.sessionService().createSession(agent.name(), USER_ID, null, null).blockingGet();

    // --- Turn 1: User requests ticket ---
    System.out.println("\n--- Turn 1: User Request ---");
    Content initialUserMessage =
        Content.fromParts(Part.fromText("Create a high urgency ticket for me."));

    AtomicReference<String> funcCallIdRef = new AtomicReference<>();
    runner
        .runAsync(USER_ID, session.id(), initialUserMessage)
        .blockingForEach(
            event -> {
              printEventSummary(event, "T1");
              if (funcCallIdRef.get() == null) { // Capture the first relevant function call ID
                event.content().flatMap(Content::parts).orElse(ImmutableList.of()).stream()
                    .map(Part::functionCall)
                    .flatMap(Optional::stream)
                    .filter(fc -> "create_ticket_long_running".equals(fc.name().orElse("")))
                    .findFirst()
                    .flatMap(FunctionCall::id)
                    .ifPresent(funcCallIdRef::set);
              }
            });

    if (funcCallIdRef.get() == null) {
      System.out.println("ERROR: Tool 'create_ticket_long_running' not called in Turn 1.");
      return;
    }
    System.out.println("ACTION: Captured FunctionCall ID: " + funcCallIdRef.get());

    // --- Turn 2: App provides initial ticket_id (simulating async tool completion) ---
    System.out.println("\n--- Turn 2: App provides ticket_id ---");
    String ticketId = "TICKET-" + UUID.randomUUID().toString().substring(0, 8).toUpperCase();
    FunctionResponse ticketCreatedFuncResponse =
        FunctionResponse.builder()
            .name("create_ticket_long_running")
            .id(funcCallIdRef.get())
            .response(ImmutableMap.of("ticket_id", ticketId))
            .build();
    Content appResponseWithTicketId =
        Content.builder()
            .parts(
                ImmutableList.of(
                    Part.builder().functionResponse(ticketCreatedFuncResponse).build()))
            .role("user")
            .build();

    runner
        .runAsync(USER_ID, session.id(), appResponseWithTicketId)
        .blockingForEach(event -> printEventSummary(event, "T2"));
    System.out.println("ACTION: Sent ticket_id " + ticketId + " to agent.");

    // --- Turn 3: App provides ticket status update ---
    System.out.println("\n--- Turn 3: App provides ticket status ---");
    FunctionResponse ticketStatusFuncResponse =
        FunctionResponse.builder()
            .name("create_ticket_long_running")
            .id(funcCallIdRef.get())
            .response(ImmutableMap.of("status", "approved", "ticket_id", ticketId))
            .build();
    Content appResponseWithStatus =
        Content.builder()
            .parts(
                ImmutableList.of(Part.builder().functionResponse(ticketStatusFuncResponse).build()))
            .role("user")
            .build();

    runner
        .runAsync(USER_ID, session.id(), appResponseWithStatus)
        .blockingForEach(event -> printEventSummary(event, "T3_FINAL"));
    System.out.println("Long running function completed successfully.");
  }

  private static void printEventSummary(Event event, String turnLabel) {
    event
        .content()
        .ifPresent(
            content -> {
              String text =
                  content.parts().orElse(ImmutableList.of()).stream()
                      .map(part -> part.text().orElse(""))
                      .filter(s -> !s.isEmpty())
                      .collect(Collectors.joining(" "));
              if (!text.isEmpty()) {
                System.out.printf("[%s][%s_TEXT]: %s%n", turnLabel, event.author(), text);
              }
              content.parts().orElse(ImmutableList.of()).stream()
                  .map(Part::functionCall)
                  .flatMap(Optional::stream)
                  .findFirst() // Assuming one function call per relevant event for simplicity
                  .ifPresent(
                      fc ->
                          System.out.printf(
                              "[%s][%s_CALL]: %s(%s) ID: %s%n",
                              turnLabel,
                              event.author(),
                              fc.name().orElse("N/A"),
                              fc.args().orElse(ImmutableMap.of()),
                              fc.id().orElse("N/A")));
            });
  }
}
Complete Python example: reimbursement approval simulation
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import asyncio
from typing import Any, Optional

from google.adk.agents import Agent
from google.adk.events import Event
from google.adk.runners import Runner
from google.adk.tools import LongRunningFunctionTool
from google.adk.sessions import InMemorySessionService
from google.genai import types


# 1. Define the long running function
def ask_for_approval(
    purpose: str, amount: float
) -> dict[str, Any]:
    """Ask for approval for the reimbursement."""
    # create a ticket for the approval
    # Send a notification to the approver with the link of the ticket
    return {'status': 'pending', 'approver': 'Sean Zhou', 'purpose': purpose, 'amount': amount, 'ticket-id': 'approval-ticket-1'}


def reimburse(purpose: str, amount: float) -> dict[str, Any]:
    """Reimburse the amount of money to the employee."""
    # send the reimbursement request to payment vendor
    return {'status': 'ok'}


# 2. Wrap the function with LongRunningFunctionTool
long_running_tool = LongRunningFunctionTool(func=ask_for_approval)

# 3. Use the tool in an Agent
file_processor_agent = Agent(
    # Use a model compatible with function calling
    model="gemini-2.0-flash",
    name='reimbursement_agent',
    instruction="""
      You are an agent whose job is to handle the reimbursement process for
      the employees. If the amount is less than $100, you will automatically
      approve the reimbursement.

      If the amount is greater than $100, you will
      ask for approval from the manager. If the manager approves, you will
      call reimburse() to reimburse the amount to the employee. If the manager
      rejects, you will inform the employee of the rejection.
    """,
    tools=[reimburse, long_running_tool]
)

APP_NAME = "human_in_the_loop"
USER_ID = "1234"
SESSION_ID = "session1234"


# Session and Runner
async def setup_session_and_runner():
    session_service = InMemorySessionService()
    session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)
    runner = Runner(agent=file_processor_agent, app_name=APP_NAME, session_service=session_service)
    return session, runner


# Agent Interaction
async def call_agent_async(query):

    def get_long_running_function_call(event: Event) -> Optional[types.FunctionCall]:
        # Get the long running function call from the event
        if not event.long_running_tool_ids or not event.content or not event.content.parts:
            return None
        for part in event.content.parts:
            if (
                part
                and part.function_call
                and event.long_running_tool_ids
                and part.function_call.id in event.long_running_tool_ids
            ):
                return part.function_call
        return None

    def get_function_response(event: Event, function_call_id: str) -> Optional[types.FunctionResponse]:
        # Get the function response for the function call with the specified id.
        if not event.content or not event.content.parts:
            return None
        for part in event.content.parts:
            if (
                part
                and part.function_response
                and part.function_response.id == function_call_id
            ):
                return part.function_response
        return None

    content = types.Content(role='user', parts=[types.Part(text=query)])
    session, runner = await setup_session_and_runner()

    print("\nRunning agent...")
    events_async = runner.run_async(
        session_id=session.id, user_id=USER_ID, new_message=content
    )

    long_running_function_call, long_running_function_response, ticket_id = None, None, None
    async for event in events_async:
        # Use the helpers to look for the long running function call and its response
        if not long_running_function_call:
            long_running_function_call = get_long_running_function_call(event)
        else:
            _potential_response = get_function_response(event, long_running_function_call.id)
            if _potential_response:  # Only update if we get a non-None response
                long_running_function_response = _potential_response
                ticket_id = long_running_function_response.response['ticket-id']
        if event.content and event.content.parts:
            if text := ''.join(part.text or '' for part in event.content.parts):
                print(f'[{event.author}]: {text}')

    if long_running_function_response:
        # query the status of the corresponding ticket via ticket_id
        # send back an intermediate / final response
        updated_response = long_running_function_response.model_copy(deep=True)
        updated_response.response = {'status': 'approved'}
        async for event in runner.run_async(
            session_id=session.id, user_id=USER_ID, new_message=types.Content(parts=[types.Part(function_response=updated_response)], role='user')
        ):
            if event.content and event.content.parts:
                if text := ''.join(part.text or '' for part in event.content.parts):
                    print(f'[{event.author}]: {text}')


# Note: As a standalone Python script, run the coroutine with asyncio.run().
# In Colab or a notebook, use top-level `await call_agent_async(...)` instead.

# Reimbursement that doesn't require approval
asyncio.run(call_agent_async("Please reimburse 50$ for meals"))

# Reimbursement that requires approval
asyncio.run(call_agent_async("Please reimburse 200$ for meals"))
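Key aspects of this example
- LongRunningFunctionTool: Wraps the supplied method/function; the framework sends any progress updates and the final return value as a sequence of FunctionResponses.
- Agent instruction: Directs the LLM to use the tool and to interpret the incoming FunctionResponse stream (progress versus completion) when updating the user.
- Final return: The function returns the final result dictionary, which is sent in the concluding FunctionResponse to indicate completion.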
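Agent-as-a-Tool
This feature lets you draw on the capabilities of other agents in your system by calling them as tools. Agent-as-a-Tool allows you to invoke another agent to perform a specific task, effectively delegating responsibility. Conceptually, this is similar to writing a Python function that calls another agent and uses that agent's response as the function's return value.
Key difference from sub-agents
It is important to distinguish Agent-as-a-Tool from a sub-agent.
- Agent-as-a-Tool: When Agent A calls Agent B as a tool (using Agent-as-a-Tool), Agent B's answer is passed back to Agent A, which then summarizes the answer and generates a response to the user. Agent A retains control and continues to handle future user input.
- Sub-agent: When Agent A calls Agent B as a sub-agent, the responsibility for answering the user is transferred completely to Agent B. Agent A is effectively out of the loop. All subsequent user input is answered by Agent B.
Usage
To use an agent as a tool, wrap the agent with the AgentTool class.
Customization
The AgentTool class provides the following attributes for customizing its behavior:
- skip_summarization: bool: If set to True, the framework bypasses the LLM-based summarization of the tool agent's response. This is useful when the tool's response is already well formatted and requires no further processing.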
??? "Example" {: #example-1}
=== "Python"
```py
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.tools.agent_tool import AgentTool
from google.genai import types

APP_NAME = "summary_agent"
USER_ID = "user1234"
SESSION_ID = "1234"

summary_agent = Agent(
    model="gemini-2.0-flash",
    name="summary_agent",
    instruction="""You are an expert summarizer. Please read the following text and provide a concise summary.""",
    description="Agent to summarize text",
)

root_agent = Agent(
    model='gemini-2.0-flash',
    name='root_agent',
    instruction="""You are a helpful assistant. When the user provides a text, use the 'summarize' tool to generate a summary. Always forward the user's message exactly as received to the 'summarize' tool, without modifying or summarizing it yourself. Present the response from the tool to the user.""",
    tools=[AgentTool(agent=summary_agent, skip_summarization=True)]
)


# Session and Runner
async def setup_session_and_runner():
    session_service = InMemorySessionService()
    session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)
    runner = Runner(agent=root_agent, app_name=APP_NAME, session_service=session_service)
    return session, runner


# Agent Interaction
async def call_agent_async(query):
    content = types.Content(role='user', parts=[types.Part(text=query)])
    session, runner = await setup_session_and_runner()
    events = runner.run_async(user_id=USER_ID, session_id=SESSION_ID, new_message=content)

    async for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("Agent Response: ", final_response)


long_text = """Quantum computing represents a fundamentally different approach to computation,
leveraging the bizarre principles of quantum mechanics to process information. Unlike classical computers
that rely on bits representing either 0 or 1, quantum computers use qubits which can exist in a state of superposition - effectively
being 0, 1, or a combination of both simultaneously. Furthermore, qubits can become entangled,
meaning their fates are intertwined regardless of distance, allowing for complex correlations. This parallelism and
interconnectedness grant quantum computers the potential to solve specific types of incredibly complex problems - such
as drug discovery, materials science, complex system optimization, and breaking certain types of cryptography - far
faster than even the most powerful classical supercomputers could ever achieve, although the technology is still largely in its developmental stages."""

# Note: In Colab, you can directly use 'await' at the top level.
# If running this code as a standalone Python script, you'll need to use asyncio.run() or manage the event loop.
await call_agent_async(long_text)
```
=== "Java"
```java
import com.google.adk.agents.LlmAgent;
import com.google.adk.events.Event;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.sessions.Session;
import com.google.adk.tools.AgentTool;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import io.reactivex.rxjava3.core.Flowable;

public class AgentToolCustomization {

  private static final String APP_NAME = "summary_agent";
  private static final String USER_ID = "user1234";

  public static void initAgentAndRun(String prompt) {

    LlmAgent summaryAgent =
        LlmAgent.builder()
            .model("gemini-2.0-flash")
            .name("summaryAgent")
            .instruction(
                "You are an expert summarizer. Please read the following text and provide a concise summary.")
            .description("Agent to summarize text")
            .build();

    // Define root_agent
    LlmAgent rootAgent =
        LlmAgent.builder()
            .model("gemini-2.0-flash")
            .name("rootAgent")
            .instruction(
                "You are a helpful assistant. When the user provides a text, always use the 'summaryAgent' tool to generate a summary. Always forward the user's message exactly as received to the 'summaryAgent' tool, without modifying or summarizing it yourself. Present the response from the tool to the user.")
            .description("Assistant agent")
            .tools(AgentTool.create(summaryAgent, true)) // Set skipSummarization to true
            .build();

    // Create an InMemoryRunner
    InMemoryRunner runner = new InMemoryRunner(rootAgent, APP_NAME);
    // InMemoryRunner automatically creates a session service. Create a session using the service
    Session session = runner.sessionService().createSession(APP_NAME, USER_ID).blockingGet();
    Content userMessage = Content.fromParts(Part.fromText(prompt));

    // Run the agent
    Flowable<Event> eventStream = runner.runAsync(USER_ID, session.id(), userMessage);

    // Stream event response
    eventStream.blockingForEach(
        event -> {
          if (event.finalResponse()) {
            System.out.println(event.stringifyContent());
          }
        });
  }

  public static void main(String[] args) {
    String longText =
        """
            Quantum computing represents a fundamentally different approach to computation,
            leveraging the bizarre principles of quantum mechanics to process information. Unlike classical computers
            that rely on bits representing either 0 or 1, quantum computers use qubits which can exist in a state of superposition - effectively
            being 0, 1, or a combination of both simultaneously. Furthermore, qubits can become entangled,
            meaning their fates are intertwined regardless of distance, allowing for complex correlations. This parallelism and
            interconnectedness grant quantum computers the potential to solve specific types of incredibly complex problems - such
            as drug discovery, materials science, complex system optimization, and breaking certain types of cryptography - far
            faster than even the most powerful classical supercomputers could ever achieve, although the technology is still largely in its developmental stages.""";

    initAgentAndRun(longText);
  }
}
```
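How it Works
- When root_agent receives the long text, its instruction tells it to use the 'summarize' tool on that text.
- The framework recognizes 'summarize' as an AgentTool that wraps summary_agent.
- Behind the scenes, root_agent calls summary_agent with the long text as input.
- summary_agent processes the text according to its instruction and generates a summary.
- The response from summary_agent is then passed back to root_agent.
- root_agent can then take the summary and formulate its final response to the user (for example, "Here's a summary of the text: ...").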
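The control flow in the steps above can be pictured with plain Python functions standing in for the two agents. This is only a conceptual sketch of the function-call analogy: no LLM or ADK class is involved, the three function names are placeholders, and the "summary" is a crude first-sentence cut.

```python
def summary_agent(text: str) -> str:
    """Stand-in for the tool agent: keeps only the first sentence."""
    first_sentence = text.strip().split(". ")[0]
    return first_sentence.rstrip(".") + "."


def summarize(text: str) -> str:
    """Stand-in for the AgentTool wrapper: delegates and returns the answer."""
    return summary_agent(text)


def root_agent(user_message: str) -> str:
    """Stand-in for the calling agent: it keeps control of the final reply."""
    summary = summarize(user_message)
    return f"Here is a summary of the text: {summary}"


reply = root_agent("Qubits can be in superposition. They can also be entangled.")
print(reply)  # Here is a summary of the text: Qubits can be in superposition.
```

Because `root_agent` receives the summary as a return value, it stays responsible for the reply to the user, which is the point of contrast with a sub-agent handoff.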