Memory: Long-Term Knowledge with `MemoryService`¶

Supported in ADK: Python v0.1.0, Java v0.2.0

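We have seen how a Session tracks the history (events) and temporary data (state) of a single, ongoing conversation. But what if an agent needs to recall information from past conversations? This is where long-term knowledge and the MemoryService come in.

Think of it this way:

Session / State: like your short-term memory during one specific chat.
Long-term knowledge (MemoryService): like a searchable archive or knowledge library the agent can consult, potentially containing information from many past chats or other sources.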

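The `MemoryService` Role¶

BaseMemoryService defines the interface for managing this searchable, long-term knowledge store. Its primary responsibilities are:

Ingesting information (add_session_to_memory): takes the contents of a (usually completed) Session and adds the relevant information to the long-term knowledge store.
Searching information (search_memory): lets an agent (typically through a Tool) query the knowledge store and retrieve relevant snippets or context for a search query.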
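
The two responsibilities can be pictured as a small interface. The sketch below uses only the standard library and is not ADK's actual `BaseMemoryService`: `MemoryServiceSketch`, `ToyMemoryService`, `SimpleSession`, and the plain-string events are hypothetical stand-ins. Only the two method names come from the text above.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class SimpleSession:
    """Hypothetical stand-in for a session: its owner plus its event texts."""
    app_name: str
    user_id: str
    events: list[str] = field(default_factory=list)


class MemoryServiceSketch(ABC):
    """Illustrative interface with the two responsibilities named above."""

    @abstractmethod
    async def add_session_to_memory(self, session: SimpleSession) -> None: ...

    @abstractmethod
    async def search_memory(self, *, app_name: str, user_id: str, query: str) -> list[str]: ...


class ToyMemoryService(MemoryServiceSketch):
    """Keeps event texts in a dict keyed by (app_name, user_id)."""

    def __init__(self) -> None:
        self._store: dict[tuple[str, str], list[str]] = {}

    async def add_session_to_memory(self, session: SimpleSession) -> None:
        key = (session.app_name, session.user_id)
        self._store.setdefault(key, []).extend(session.events)

    async def search_memory(self, *, app_name: str, user_id: str, query: str) -> list[str]:
        # Naive case-insensitive substring match, scoped to one app and one user.
        texts = self._store.get((app_name, user_id), [])
        return [text for text in texts if query.lower() in text.lower()]


async def demo() -> list[str]:
    service = ToyMemoryService()
    await service.add_session_to_memory(
        SimpleSession("demo_app", "user_1", ["My favorite project is Project Alpha."])
    )
    return await service.search_memory(
        app_name="demo_app", user_id="user_1", query="favorite project"
    )


demo_hits = asyncio.run(demo())
```

Keying the store by application and user keeps one user's memories from surfacing in another user's searches, which is why the search call takes both values alongside the query.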

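Choosing the Right Memory Service¶

ADK offers two MemoryService implementations, each suited to different use cases. Use the table below to decide which one best fits your agent.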

Feature	InMemoryMemoryService	VertexAiMemoryBankService
Persistence	None (data is lost on restart)	Yes (Managed by Vertex AI)
Primary Use Case	Prototyping, local development, and simple testing.	Building meaningful, evolving memories from user conversations.
Memory Extraction	Stores full conversation	Extracts meaningful information from conversations and consolidates it with existing memories (powered by LLM)
Search Capability	Basic keyword matching.	Advanced semantic search.
Setup Complexity	None. It's the default.	Low. Requires an Agent Engine instance in Vertex AI.
Dependencies	None.	Google Cloud Project, Vertex AI API
When to use it	When you want to search across multiple sessions’ chat histories for prototyping.	When you want your agent to remember and learn from past interactions.

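In-Memory Memory¶

InMemoryMemoryService stores session information in the application's memory and performs basic keyword matching for searches. It requires no setup and is best suited to prototyping and simple testing scenarios where persistence is not required.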

from google.adk.memory import InMemoryMemoryService
memory_service = InMemoryMemoryService()
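
To give a feel for what "basic keyword matching" can mean, here is a minimal sketch using only the standard library. It is an assumption about the general technique, not the actual InMemoryMemoryService implementation; `extract_words` and `keyword_search` are hypothetical helper names.

```python
import re


def extract_words(text: str) -> set[str]:
    """Lowercase a text and split it into alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_search(stored_texts: list[str], query: str) -> list[str]:
    """Return every stored text that shares at least one word with the query."""
    query_words = extract_words(query)
    return [text for text in stored_texts if query_words & extract_words(text)]


stored = [
    "My favorite project is Project Alpha.",
    "The weather was sunny yesterday.",
]

# Shares the word "project" with the first stored text.
matches = keyword_search(stored, "Which project do I like?")

# Related in meaning, but no word in common, so nothing is found.
no_matches = keyword_search(stored, "What do I enjoy working on?")
```

The second query illustrates the limitation: matching on shared words misses paraphrases, which is the gap semantic search is meant to close.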

Example: Adding and Searching Memory

This example demonstrates the basic flow using InMemoryMemoryService, for simplicity.

Full code

import asyncio
from google.adk.agents import LlmAgent
from google.adk.sessions import InMemorySessionService, Session
from google.adk.memory import InMemoryMemoryService # Import the MemoryService
from google.adk.runners import Runner
from google.adk.tools import load_memory # Tool for querying memory
from google.genai.types import Content, Part

# --- Constants ---
APP_NAME = "memory_example_app"
USER_ID = "mem_user"
MODEL = "gemini-2.0-flash" # Use a valid model

# --- Agent definitions ---
# Agent 1: a simple agent that captures information
info_capture_agent = LlmAgent(
    model=MODEL,
    name="InfoCaptureAgent",
    instruction="Acknowledge the user's statement.",
)

# Agent 2: an agent that can use memory
memory_recall_agent = LlmAgent(
    model=MODEL,
    name="MemoryRecallAgent",
    instruction="Answer the user's question. Use the 'load_memory' tool "
                "if the answer might be in past conversations.",
    tools=[load_memory] # Give the agent the tool
)

# --- Services ---
# Services must be shared across runners to share state and memory
session_service = InMemorySessionService()
memory_service = InMemoryMemoryService() # In-memory service for this demo

async def run_scenario():
    # --- Scenario ---

    # Turn 1: capture some information in a session
    print("--- Turn 1: Capturing Information ---")
    runner1 = Runner(
        # Start with the info capture agent
        agent=info_capture_agent,
        app_name=APP_NAME,
        session_service=session_service,
        memory_service=memory_service # Provide the memory service to the runner
    )
    session1_id = "session_info"
    await runner1.session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=session1_id)
    user_input1 = Content(parts=[Part(text="My favorite project is Project Alpha.")], role="user")

    # Run the agent
    final_response_text = "(No final response)"
    async for event in runner1.run_async(user_id=USER_ID, session_id=session1_id, new_message=user_input1):
        if event.is_final_response() and event.content and event.content.parts:
            final_response_text = event.content.parts[0].text
    print(f"Agent 1 response: {final_response_text}")

    # Fetch the completed session
    completed_session1 = await runner1.session_service.get_session(app_name=APP_NAME, user_id=USER_ID, session_id=session1_id)

    # Add this session's content to the memory service
    print("\n--- Adding Session 1 to Memory ---")
    await memory_service.add_session_to_memory(completed_session1)
    print("Session added to memory.")

    # Turn 2: recall the information in a new session
    print("\n--- Turn 2: Recalling Information ---")
    runner2 = Runner(
        # Use the second agent, which has the memory tool
        agent=memory_recall_agent,
        app_name=APP_NAME,
        session_service=session_service, # Reuse the same service
        memory_service=memory_service   # Reuse the same service
    )
    session2_id = "session_recall"
    await runner2.session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=session2_id)
    user_input2 = Content(parts=[Part(text="What is my favorite project?")], role="user")

    # Run the second agent
    final_response_text_2 = "(No final response)"
    async for event in runner2.run_async(user_id=USER_ID, session_id=session2_id, new_message=user_input2):
        if event.is_final_response() and event.content and event.content.parts:
            final_response_text_2 = event.content.parts[0].text
    print(f"Agent 2 response: {final_response_text_2}")

# To run this example from a script, use:
# asyncio.run(run_scenario())

# In a notebook, where an event loop is already running, use instead:
# await run_scenario()

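Vertex AI Memory Bank¶

VertexAiMemoryBankService connects your agent to Vertex AI Memory Bank, a fully managed Google Cloud service that provides sophisticated, persistent memory capabilities for conversational agents.

How It Works¶

The service handles two key operations:

Generating memories: at the end of a conversation, you can send the session's events to Memory Bank, which intelligently processes the information and stores it as "memories".
Retrieving memories: your agent code can issue a search query against Memory Bank to retrieve relevant memories from past conversations.

Prerequisites¶

Before you can use this feature, you must have:

A Google Cloud project: with the Vertex AI API enabled.
An Agent Engine: create an Agent Engine in Vertex AI; this gives you the Agent Engine ID required for configuration. You do not need to deploy your agent to the Agent Engine runtime to use Memory Bank.
Authentication: make sure your local environment is authenticated to access Google Cloud services. The simplest way is to run: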
```
gcloud auth application-default login
```
Environment variables: the service needs your Google Cloud project ID and location. Set them as environment variables:
```
export GOOGLE_CLOUD_PROJECT="your-gcp-project-id"
export GOOGLE_CLOUD_LOCATION="your-gcp-location"
```
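
Before starting your agent, it can help to confirm that both variables are actually visible to the Python process. The following is a minimal sketch, not part of ADK; `missing_gcp_settings` is a hypothetical helper name.

```python
import os

REQUIRED_VARS = ("GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION")


def missing_gcp_settings(env=None) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_gcp_settings()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Google Cloud project and location are set.")
```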

Configuration¶

To connect your agent to Memory Bank, pass the --memory_service_uri flag when starting the ADK server (adk web or adk api_server). The URI must have the form agentengine://<agent_engine_id>.

adk web path/to/your/agents_dir --memory_service_uri="agentengine://1234567890"
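
To make the URI format concrete, the sketch below validates a value and pulls out the ID with the standard library. ADK does its own parsing of this flag; `parse_agent_engine_uri` is a hypothetical helper that only illustrates the expected shape.

```python
from urllib.parse import urlparse


def parse_agent_engine_uri(uri: str) -> str:
    """Return the Agent Engine ID from an agentengine://<agent_engine_id> URI."""
    parsed = urlparse(uri)
    if parsed.scheme != "agentengine" or not parsed.netloc:
        raise ValueError(f"Expected agentengine://<agent_engine_id>, got {uri!r}")
    return parsed.netloc


engine_id = parse_agent_engine_uri("agentengine://1234567890")
```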

Alternatively, you can configure your agent to use Memory Bank by instantiating VertexAiMemoryBankService manually and passing it to the Runner.

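from google.adk.memory import VertexAiMemoryBankService
from google.adk.runners import Runner

# `agent_engine` is assumed to be the Agent Engine instance you created earlier.
# The Agent Engine ID is the last segment of its resource name.
agent_engine_id = agent_engine.api_resource.name.split("/")[-1]

memory_service = VertexAiMemoryBankService(
    project="PROJECT_ID",
    location="LOCATION",
    agent_engine_id=agent_engine_id
)

runner = Runner(
    ...
    memory_service=memory_service
)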
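
The `split("/")[-1]` step takes the last path segment of the Agent Engine's resource name. Here is a standalone sketch of that step; the resource name below is a made-up example whose shape is an assumption, and `agent_engine_id_from_resource_name` is a hypothetical helper.

```python
def agent_engine_id_from_resource_name(resource_name: str) -> str:
    """Return the last path segment of a slash-separated resource name."""
    return resource_name.rstrip("/").split("/")[-1]


# Hypothetical resource name: the project, location, and ID are invented.
example_name = "projects/my-project/locations/us-central1/reasoningEngines/1234567890"
example_engine_id = agent_engine_id_from_resource_name(example_name)
```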

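Using Memory in Your Agent¶

When a memory service is configured, your agent can use a tool or a callback to retrieve memories. ADK includes two pre-built tools for retrieving memories:

PreloadMemory: always retrieves memory at the start of each turn (similar to a callback).
LoadMemory: retrieves memory only when your agent decides that doing so would help.

Example: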

from google.adk.agents import Agent
from google.adk.tools.preload_memory_tool import PreloadMemoryTool

agent = Agent(
    model=MODEL_ID,
    name='weather_sentiment_agent',
    instruction="...",
    tools=[PreloadMemoryTool()]
)

To extract memories from your session, you need to call add_session_to_memory. For example, you can automate this with a callback:

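from google.adk.agents import Agent
from google.adk.tools.preload_memory_tool import PreloadMemoryTool

async def auto_save_session_to_memory_callback(callback_context):
    # Save the current session to the configured memory service after the agent runs.
    # Note: this reaches into the private _invocation_context attribute.
    await callback_context._invocation_context.memory_service.add_session_to_memory(
        callback_context._invocation_context.session)

agent = Agent(
    model=MODEL,  # MODEL is assumed to be defined elsewhere, e.g. "gemini-2.0-flash"
    name="Generic_QA_Agent",
    instruction="Answer the user's questions",
    tools=[PreloadMemoryTool()],
    after_agent_callback=auto_save_session_to_memory_callback,
)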

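Advanced Concepts¶

How Memory Works in Practice¶

Internally, the memory workflow involves the following steps:

Session interaction: a user interacts with an agent through a Session managed by a SessionService. Events are added, and state may be updated.
Memory ingestion: at some point (often when a session is considered complete or has produced significant information), your application calls memory_service.add_session_to_memory(session). This extracts relevant information from the session's events and adds it to the long-term knowledge store (an in-memory dictionary or Agent Engine Memory Bank).
Later query: in a different (or the same) session, the user may ask a question that requires past context (for example, "What did we discuss about project X last week?").
Agent uses a memory tool: an agent equipped with a memory-retrieval tool (such as the built-in load_memory tool) recognizes the need for past context. It calls the tool with a search query (for example, "discussion project X last week").
Search execution: the tool internally calls memory_service.search_memory(app_name, user_id, query).
Results returned: the MemoryService searches its store (using keyword matching or semantic search) and returns the relevant snippets as a SearchMemoryResponse containing a list of MemoryResult objects (each of which may hold events from a relevant past session).
Agent uses the results: the tool returns these results to the agent, usually as part of the context or a function response. The agent can then use the retrieved information to formulate its final answer to the user.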
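
The steps above can be traced end to end with a toy model. This sketch uses only the standard library and does not call ADK: `DictMemoryStore`, `MemoryEntrySketch`, `SearchResponseSketch`, and `load_memory_sketch` are hypothetical stand-ins, and the word-overlap search is an assumed simplification of keyword matching.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class MemoryEntrySketch:
    """Hypothetical stand-in for one retrieved memory."""
    author: str
    text: str


@dataclass
class SearchResponseSketch:
    """Hypothetical stand-in for a search response holding a list of memories."""
    memories: list[MemoryEntrySketch] = field(default_factory=list)


class DictMemoryStore:
    """Toy long-term store keyed by (app_name, user_id)."""

    def __init__(self) -> None:
        self._entries: dict[tuple[str, str], list[MemoryEntrySketch]] = {}

    async def add_session_to_memory(self, app_name, user_id, events):
        # Memory ingestion: store (author, text) pairs from a finished session.
        bucket = self._entries.setdefault((app_name, user_id), [])
        bucket.extend(MemoryEntrySketch(author, text) for author, text in events)

    async def search_memory(self, *, app_name, user_id, query):
        # Search execution: keep entries sharing at least one word with the query.
        words = set(query.lower().split())
        hits = [
            entry
            for entry in self._entries.get((app_name, user_id), [])
            if words & set(entry.text.lower().split())
        ]
        return SearchResponseSketch(memories=hits)


async def load_memory_sketch(store, app_name, user_id, query):
    """Tool step: wrap the hits in a dict, much like a function response."""
    response = await store.search_memory(app_name=app_name, user_id=user_id, query=query)
    return {"memories": [f"{m.author}: {m.text}" for m in response.memories]}


async def run_flow():
    store = DictMemoryStore()
    # A past session is ingested into long-term memory.
    await store.add_session_to_memory(
        "demo_app", "user_1", [("user", "project x ships in march")]
    )
    # A later question leads the agent to call the tool with a search query.
    return await load_memory_sketch(store, "demo_app", "user_1", "project x schedule")


tool_result = asyncio.run(run_flow())
```

In this toy version the returned dictionary plays the role of the function response that the agent would read before composing its final answer.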

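Can an Agent Access More Than One Memory Service?¶

Through standard configuration: no. The framework (adk web, adk api_server) is designed to be configured with a single memory service at a time via the --memory_service_uri flag. That single service is then provided to the agent and accessed through the built-in self.search_memory() method. From a configuration standpoint, you can choose only one backend (InMemory, VertexAiMemoryBankService) for all agents served by that process.
Within your agent's code: yes, absolutely. Nothing prevents you from manually importing and instantiating another memory service directly inside your agent's code. This lets you access multiple memory sources within a single agent turn.

For example, your agent could use the framework-configured InMemoryMemoryService to recall conversation history, and also manually instantiate a VertexAiMemoryBankService to look up information in a technical manual.

Example: Using Two Memory Services¶

The following conceptual sketch shows how this might look in your agent's code: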

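from google.adk.agents import Agent
from google.adk.memory import InMemoryMemoryService, VertexAiMemoryBankService
from google.genai import types

# Placeholder scope for the searches: search_memory takes app_name, user_id, and query.
APP_NAME = "APP_NAME"
USER_ID = "USER_ID"

class MultiMemoryAgent(Agent):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

        # Stands in for the framework-provided service that holds conversation history
        self.memory_service = InMemoryMemoryService()
        # Manually instantiate a second memory service for document lookups
        self.vertexai_memorybank_service = VertexAiMemoryBankService(
            project="PROJECT_ID",
            location="LOCATION",
            agent_engine_id="AGENT_ENGINE_ID"
        )

    async def run(self, request: types.Content, **kwargs) -> types.Content:
        user_query = request.parts[0].text

        # 1. Search conversation history with the framework-provided memory service
        #    (this would be InMemoryMemoryService if that is what is configured)
        conversation_context = await self.memory_service.search_memory(
            app_name=APP_NAME, user_id=USER_ID, query=user_query)

        # 2. Search the document knowledge base with the manually created service
        document_context = await self.vertexai_memorybank_service.search_memory(
            app_name=APP_NAME, user_id=USER_ID, query=user_query)

        # Combine the context from both sources to produce a better response
        prompt = "From our past conversations, I remember:\n"
        prompt += f"{conversation_context.memories}\n\n"
        prompt += "From the technical manuals, I found:\n"
        prompt += f"{document_context.memories}\n\n"
        prompt += f"Based on all this, here is my answer to '{user_query}':"

        return await self.llm.generate_content_async(prompt)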
