Computer Use Toolset with Gemini

Supported in ADK: Python v1.17.0 (Preview)

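The Computer Use toolset lets an agent operate a computer's user interface, such as a browser, to complete tasks. The tool uses a specific Gemini model and the Playwright testing tool to control a Chromium browser, and it can interact with web pages by taking screenshots, clicking, typing, and navigating.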

For more information about the Computer Use model, see Gemini API Computer Use or Google Cloud Vertex AI API Computer Use.

Preview release

The Computer Use model and tool are a Preview release. For more information, see the launch stage descriptions.

Setup

You must install Playwright and its dependencies, including Chromium, to use the Computer Use toolset.

Recommended: create and activate a Python virtual environment

Create a Python virtual environment:

python -m venv .venv

Activate the Python virtual environment, using the command for your shell:

Windows CMD:

.venv\Scripts\activate.bat

Windows PowerShell:

.venv\Scripts\Activate.ps1

macOS / Linux:

source .venv/bin/activate

Set up the software libraries required by the Computer Use toolset:

Install the Python dependencies:

pip install termcolor==3.1.0
pip install playwright==1.52.0
pip install browserbase==1.3.0
pip install rich

Install the Playwright dependencies, including the Chromium browser:

playwright install-deps chromium
playwright install chromium
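After installation, it can be useful to confirm that the pinned packages are actually present in the active environment. The following is a minimal sketch that uses only the standard library. The helper name `check_versions` and the `PINNED` table are illustrative and not part of ADK. Because `rich` is installed without a version pin above, any installed version of it is accepted.

```python
from importlib import metadata

# Versions pinned by the pip commands above; None means "any version".
PINNED = {
    "termcolor": "3.1.0",
    "playwright": "1.52.0",
    "browserbase": "1.3.0",
    "rich": None,
}


def check_versions(pinned):
    """Return {package: (installed_version_or_None, ok)} for each package."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        ok = installed is not None and (wanted is None or installed == wanted)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions(PINNED).items():
        status = "ok" if ok else "missing or version mismatch"
        print(f"{name}: {installed} ({status})")
```

This only checks the Python packages. It does not verify that the Chromium browser binary was downloaded by `playwright install chromium`.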

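Use the tool

Use the Computer Use toolset by adding it as a tool to your agent. When you configure the tool, you must provide an implementation of the BaseComputer class, which defines the interface the agent uses to operate a computer. The following example uses a PlaywrightComputer class for this purpose. You can find the code for this implementation in the playwright.py file of the computer_use agent sample project.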

from google.adk import Agent
from google.adk.models.google_llm import Gemini
from google.adk.tools.computer_use.computer_use_toolset import ComputerUseToolset
from typing_extensions import override

from .playwright import PlaywrightComputer

root_agent = Agent(
    model='gemini-2.5-computer-use-preview-10-2025',
    name='hello_world_agent',
    description=(
        'Computer Use agent that can operate a browser on a computer to'
        ' complete user tasks'
    ),
    instruction='You are a Computer Use agent',
    tools=[
        ComputerUseToolset(computer=PlaywrightComputer(screen_size=(1280, 936)))
    ],
)

For a complete code example, see the computer_use agent sample project.
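To make the role of the computer implementation more concrete, here is a rough sketch of the general idea: an object that exposes navigate, click, type, and screenshot operations and forwards them to a Playwright page. The class `SketchComputer`, its method names, and the `open_page` helper are hypothetical and chosen for illustration only. They are not the actual `BaseComputer` interface, whose required methods are defined by ADK and implemented in the sample project's playwright.py. The Playwright calls used here (`page.goto`, `page.mouse.click`, `page.keyboard.type`, `page.screenshot`) come from Playwright's sync API.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass
class SketchComputer:
    """Hypothetical wrapper (NOT ADK's BaseComputer) around a browser page."""

    page: Any  # expected to behave like a playwright.sync_api.Page
    screen_size: Tuple[int, int] = (1280, 936)

    def screenshot(self) -> bytes:
        # Playwright returns the image as PNG bytes by default.
        return self.page.screenshot()

    def navigate(self, url: str) -> bytes:
        self.page.goto(url)
        return self.screenshot()

    def click_at(self, x: int, y: int) -> bytes:
        width, height = self.screen_size
        if not (0 <= x < width and 0 <= y < height):
            raise ValueError(f"({x}, {y}) is outside the {width}x{height} screen")
        self.page.mouse.click(x, y)
        return self.screenshot()

    def type_text(self, text: str) -> bytes:
        # Types into whichever element currently has keyboard focus.
        self.page.keyboard.type(text)
        return self.screenshot()


def open_page(screen_size: Tuple[int, int] = (1280, 936)):
    """Start headless Chromium and return (playwright, browser, page).

    The caller is responsible for browser.close() and playwright.stop().
    """
    # Imported lazily so the sketch can be loaded without Playwright installed.
    from playwright.sync_api import sync_playwright

    playwright = sync_playwright().start()
    browser = playwright.chromium.launch(headless=True)
    width, height = screen_size
    page = browser.new_page(viewport={"width": width, "height": height})
    return playwright, browser, page
```

With Playwright and Chromium installed, one possible usage is to call `open_page()`, wrap the returned page in `SketchComputer(page)`, call `navigate("https://example.com")` to get a screenshot back, and then close the browser and stop Playwright. Returning a fresh screenshot after every action reflects the observe-then-act loop described at the top of this page, but the real toolset may structure this differently.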