Ollama model host for ADK agents

Supported in ADK: Python v0.1.0

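Ollama is a tool that lets you host and run open-source models locally. ADK integrates with Ollama-hosted models through the LiteLLM model connector library.

Get started

Use the LiteLLM wrapper to create agents that run on Ollama-hosted models. The following code example shows a basic implementation that uses the Gemma open model in an agent: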

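from google.adk.agents import Agent
from google.adk.models.lite_llm import LiteLlm

# roll_die and check_prime are tool functions assumed to be defined
# elsewhere in the agent module.
root_agent = Agent(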
    model=LiteLlm(model="ollama_chat/gemma3:latest"),
    name="dice_agent",
    description=(
        "hello world agent that can roll a dice of 8 sides and check prime"
        " numbers."
    ),
    instruction="""
      You roll dice and answer questions about the outcome of the dice rolls.
    """,
    tools=[
        roll_die,
        check_prime,
    ],
)
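The agent above references the roll_die and check_prime tools without defining them. The following is a minimal sketch of what they might look like, assuming tools are plain Python functions with type hints and docstrings; the signatures and return strings here are illustrative assumptions, not the original sample's code.

```python
import random


def roll_die(sides: int) -> int:
    """Roll a die with the given number of sides and return the result."""
    return random.randint(1, sides)


def check_prime(nums: list[int]) -> str:
    """Report which of the given numbers are prime."""
    primes = []
    for n in nums:
        n = int(n)
        if n < 2:
            continue  # 0, 1, and negatives are not prime
        # Trial division up to the integer square root of n.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            primes.append(n)
    if not primes:
        return "No prime numbers found."
    return ", ".join(str(p) for p in primes) + " are prime numbers."
```

Defining these before root_agent lets the tools list resolve when the module is imported.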

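Warning: Use the ollama_chat interface

Make sure you set the provider to ollama_chat rather than ollama. Using ollama can cause unexpected behavior, such as infinite tool-call loops and previous context being ignored.

Use the OLLAMA_API_BASE environment variable

Although you can pass the api_base parameter to LiteLLM for generation, as of v1.65.5 the library relies on the environment variable for its other API calls. You should therefore set the OLLAMA_API_BASE environment variable to your Ollama server URL so that all requests are routed correctly.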

export OLLAMA_API_BASE="http://localhost:11434"
adk web
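If exporting the variable in the shell is inconvenient, the same setting can be applied from Python. This is a minimal sketch, assuming the default local URL shown above; placing it at the top of the agent module, before any model call is made, is the conservative choice.

```python
import os

# Keep any value already exported in the shell; otherwise fall back to the
# local default. The fallback URL is an assumption matching the example above.
os.environ.setdefault("OLLAMA_API_BASE", "http://localhost:11434")

print(os.environ["OLLAMA_API_BASE"])
```

Using setdefault rather than direct assignment means a value set in the environment still takes precedence.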

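Model choice

If your agent relies on tools, pick a model from the Ollama website that supports tools, since reliable results depend on it. You can check a model's tool support with the following command: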

ollama show mistral-small3.1
  Model
    architecture        mistral3
    parameters          24.0B
    context length      131072
    embedding length    5120
    quantization        Q4_K_M

  Capabilities
    completion
    vision
    tools
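The capability check can also be scripted. The sketch below parses text in the layout shown above; parse_capabilities and supports_tools are hypothetical helper names, and the parser assumes that sections are separated by blank lines, as in this sample output.

```python
def parse_capabilities(show_output: str) -> list[str]:
    """Return the entries listed under the Capabilities section."""
    caps = []
    in_section = False
    for line in show_output.splitlines():
        stripped = line.strip()
        if stripped == "Capabilities":
            in_section = True
            continue
        if in_section:
            if not stripped:
                if caps:
                    break  # a blank line after entries ends the section
                continue
            caps.append(stripped)
    return caps


def supports_tools(show_output: str) -> bool:
    """True if 'tools' appears among the listed capabilities."""
    return "tools" in parse_capabilities(show_output)


# Sample text copied from the output above (shortened).
SAMPLE = """\
  Model
    architecture        mistral3
    parameters          24.0B

  Capabilities
    completion
    vision
    tools
"""

print(parse_capabilities(SAMPLE))
print(supports_tools(SAMPLE))
```

To run it against a real model, the text could be captured with subprocess.run(["ollama", "show", "mistral-small3.1"], capture_output=True, text=True) and its stdout passed to supports_tools.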

You should see tools listed under Capabilities. You can also inspect the template the model uses and adjust it to your needs.

ollama show --modelfile llama3.2 > model_file_to_modify

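For example, the default template for the model above essentially tells the model to always call a function, which can lead to an infinite loop of function calls.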

Given the following functions, please respond with a JSON for a function call
with its proper arguments that best answers the given prompt.

Respond in the format {"name": function name, "parameters": dictionary of
argument name and its value}. Do not use variables.

You can replace such prompts with a more descriptive one to prevent infinite tool-call loops, for example:

Review the user's prompt and the available functions listed below.

First, determine if calling one of these functions is the most appropriate way
to respond. A function call is likely needed if the prompt asks for a specific
action, requires external data lookup, or involves calculations handled by the
functions. If the prompt is a general question or can be answered directly, a
function call is likely NOT needed.

If you determine a function call IS required: Respond ONLY with a JSON object in
the format {"name": "function_name", "parameters": {"argument_name": "value"}}.
Ensure parameter values are concrete, not variables.

If you determine a function call IS NOT required: Respond directly to the user's
prompt in plain text, providing the answer or information requested. Do not
output any JSON.

You can then create a new model with the following command:

ollama create llama3.2-modified -f model_file_to_modify

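Using the openai provider

Alternatively, you can use openai as the provider name. This approach requires setting the OPENAI_API_BASE=http://localhost:11434/v1 and OPENAI_API_KEY=anything environment variables instead of OLLAMA_API_BASE. Note that the API base value ends with /v1.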

root_agent = Agent(
    model=LiteLlm(model="openai/mistral-small3.1"),
    name="dice_agent",
    description=(
        "hello world agent that can roll a dice of 8 sides and check prime"
        " numbers."
    ),
    instruction="""
      You roll dice and answer questions about the outcome of the dice rolls.
    """,
    tools=[
        roll_die,
        check_prime,
    ],
)

export OPENAI_API_BASE=http://localhost:11434/v1
export OPENAI_API_KEY=anything
adk web

Debugging

You can see the requests sent to the Ollama server by adding the following to your agent code, just after the imports.

import litellm
litellm._turn_on_debug()

Look for a line like the following:

Request Sent from LiteLLM:
curl -X POST \
http://localhost:11434/api/chat \
-d '{"model": "mistral-small3.1", "messages": [{"role": "system", "content": ...