Access Google's generative AI models, including the Gemini family, through the Gemini Developer API or Vertex AI. The Gemini Developer API offers quick setup with just an API key and is well suited to individual developers. Vertex AI provides enterprise-grade features and integrates with Google Cloud Platform. For the latest models, model IDs, capabilities, context windows, and more, visit the Google AI documentation.
Vertex AI consolidation and compatibility: as of langchain-google-genai 4.0.0, the package uses the consolidated google-genai SDK instead of the original google-ai-generativelanguage SDK. This migration supports accessing Gemini models through both the Gemini Developer API and the Gemini API in Vertex AI, replacing some classes from langchain-google-vertexai (such as ChatVertexAI). Read the full announcement and migration guide.
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-2.5-flash")

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = model.invoke(messages)
ai_msg
```
```python
from langchain_google_genai import ChatGoogleGenerativeAI

# Set at instantiation (applies to all calls)
model = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash-image",
    image_config={"aspect_ratio": "16:9"},
)

# Or override per call
response = model.invoke(
    "Generate a photorealistic image of a cuddly cat wearing a hat.",
    image_config={"aspect_ratio": "1:1"},
)
```
By default, image generation models may return both text and images (e.g., "Ok! Here's an image of a…"). You can require the model to return only images by setting the `response_modalities` parameter:
```python
from langchain_google_genai import ChatGoogleGenerativeAI, Modality

model = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash-image",
    response_modalities=[Modality.IMAGE],
)

# All invocations will return only images
response = model.invoke("Generate a photorealistic image of a cuddly cat wearing a hat.")
```
Vertex AI limitation: audio generation models are currently in limited preview on Vertex AI and may require allowlisting. If you encounter an `INVALID_ARGUMENT` error when using a TTS model with `vertexai=True`, your GCP project may need to be allowlisted. For more details, see this Google AI forum discussion.
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-2.5-flash-preview-tts")

response = model.invoke("Please say The quick brown fox jumps over the lazy dog")

# Base64 encoded binary data of the audio
wav_data = response.additional_kwargs.get("audio")
with open("output.wav", "wb") as f:
    f.write(wav_data)
```
```python
from langchain.tools import tool
from langchain.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI


# Define the tool
@tool(description="Get the current weather in a given location")
def get_weather(location: str) -> str:
    return "It's sunny."


# Initialize and bind (potentially multiple) tools to the model
model_with_tools = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview").bind_tools([get_weather])

# Step 1: Model generates tool calls
messages = [HumanMessage("What's the weather in Boston?")]
ai_msg = model_with_tools.invoke(messages)
messages.append(ai_msg)

# Check the tool calls in the response
print(ai_msg.tool_calls)

# Step 2: Execute tools and collect results
for tool_call in ai_msg.tool_calls:
    # Execute the tool with the generated arguments
    tool_result = get_weather.invoke(tool_call)
    messages.append(tool_result)

# Step 3: Pass results back to model for final response
final_response = model_with_tools.invoke(messages)
final_response
```
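When a model is bound to several tools, the execution step becomes a dispatch on the tool name. A minimal stdlib sketch of that loop, assuming tool calls arrive as dicts shaped like `ai_msg.tool_calls` (the `get_weather`/`get_time` registry here is hypothetical, not part of the library):

```python
# Hypothetical tool registry mapping tool names to plain Python callables.
def get_weather(location: str) -> str:
    return "It's sunny."


def get_time(timezone: str) -> str:
    return "12:00"


TOOLS = {"get_weather": get_weather, "get_time": get_time}


def execute_tool_calls(tool_calls: list) -> list:
    """Run each requested tool and pair its output with the originating call id."""
    results = []
    for call in tool_calls:
        fn = TOOLS[call["name"]]          # look up the tool by name
        output = fn(**call["args"])       # invoke with the model-generated arguments
        results.append({"tool_call_id": call["id"], "content": output})
    return results


# Tool calls in the shape LangChain exposes on ai_msg.tool_calls:
calls = [{"name": "get_weather", "args": {"location": "Boston"}, "id": "call_1"}]
print(execute_tool_calls(calls))
# [{'tool_call_id': 'call_1', 'content': "It's sunny."}]
```

In the real flow above, each result would be appended to `messages` as a tool message before re-invoking the model.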
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")

result = model.invoke("Explain the concept of prompt engineering in one sentence.")
print(result.content)
print("\nUsage Metadata:")
print(result.usage_metadata)
```
```
Prompt engineering is the art and science of crafting effective text prompts to elicit desired and accurate responses from large language models.

Usage Metadata:
{'input_tokens': 10, 'output_tokens': 24, 'total_tokens': 34, 'input_token_details': {'cache_read': 0}}
```
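When running many calls, it can be useful to aggregate token counts across responses. A small sketch, assuming each response carries a `usage_metadata` dict shaped like the output above (the `total_usage` helper is hypothetical):

```python
def total_usage(usages: list) -> dict:
    """Sum token counts across several usage_metadata dicts."""
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for usage in usages:
        for key in totals:
            totals[key] += usage.get(key, 0)
    return totals


# usage_metadata dicts collected from two hypothetical invocations
usages = [
    {"input_tokens": 10, "output_tokens": 24, "total_tokens": 34},
    {"input_tokens": 7, "output_tokens": 12, "total_tokens": 19},
]
print(total_usage(usages))
# {'input_tokens': 17, 'output_tokens': 36, 'total_tokens': 53}
```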
```python
from langchain_google_genai import ChatGoogleGenerativeAI

# Gemini 3+: use thinking_level
llm = ChatGoogleGenerativeAI(
    model="gemini-3.1-pro-preview",
    thinking_level="low",
)

response = llm.invoke("How many O's are in Google?")
```
```python
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(
    model="gemini-3.1-pro-preview",
    include_thoughts=True,
)

response = llm.invoke("How many O's are in Google? How did you verify your answer?")

reasoning_tokens = response.usage_metadata["output_token_details"]["reasoning"]
print("Response:", response.content)
print("Reasoning tokens used:", reasoning_tokens)
```
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
model_with_search = model.bind_tools([{"google_search": {}}])

response = model_with_search.invoke("When is the next total solar eclipse in US?")
response.content_blocks
```
```
[{'type': 'text',
  'text': 'The next total solar eclipse visible in the contiguous United States will occur on...',
  'annotations': [{'type': 'citation',
    'id': 'abc123',
    'url': '<url for source 1>',
    'title': '<source 1 title>',
    'start_index': 0,
    'end_index': 99,
    'cited_text': 'The next total solar eclipse...',
    'extras': {'google_ai_metadata': {'web_search_queries': ['next total solar eclipse in US'],
      'grounding_chunk_index': 0,
      'confidence_scores': []}}}],
  ...
```
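To surface the grounding sources to users, you can walk the citation annotations attached to the text blocks. A minimal sketch, assuming content blocks shaped like the output above (the `extract_citations` helper is hypothetical):

```python
def extract_citations(content_blocks: list) -> list:
    """Collect url/title/cited_text triples from citation annotations on text blocks."""
    citations = []
    for block in content_blocks:
        if block.get("type") != "text":
            continue
        for annotation in block.get("annotations", []):
            if annotation.get("type") == "citation":
                citations.append({
                    "url": annotation.get("url"),
                    "title": annotation.get("title"),
                    "cited_text": annotation.get("cited_text"),
                })
    return citations


# A content_blocks list in the shape shown above (values are placeholders)
blocks = [{
    "type": "text",
    "text": "The next total solar eclipse...",
    "annotations": [{
        "type": "citation",
        "url": "https://example.com/eclipse",
        "title": "Eclipse guide",
        "cited_text": "The next total solar eclipse...",
    }],
}]
print(extract_citations(blocks))
# [{'url': 'https://example.com/eclipse', 'title': 'Eclipse guide', 'cited_text': 'The next total solar eclipse...'}]
```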
Some models support grounding with Google Maps. Maps grounding combines Gemini's generative capabilities with real-time location data from Google Maps, enabling applications to deliver accurate, location-aware responses. See the Gemini documentation for details.
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-2.5-pro")
model_with_maps = model.bind_tools([{"google_maps": {}}])

response = model_with_maps.invoke(
    "What are some good Italian restaurants near the Eiffel Tower in Paris?"
)
```
The response will include grounding metadata with location information from Google Maps. You can optionally provide specific location context by passing `lat_lng` via `tool_config`. This is useful when you want queries to be resolved relative to a particular geographic point.
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-2.5-pro")

# Provide location context (latitude and longitude)
model_with_maps = model.bind_tools(
    [{"google_maps": {}}],
    tool_config={
        "retrieval_config": {
            # Eiffel Tower
            "lat_lng": {
                "latitude": 48.858844,
                "longitude": 2.294351,
            }
        }
    },
)

response = model_with_maps.invoke(
    "What Italian restaurants are within a 5 minute walk from here?"
)
```
Context caching lets you store and reuse content (such as PDFs or images) for faster processing. The `cached_content` parameter accepts the name of a cache created through the Google Generative AI API.
Single-file example

This example caches a single file and queries it.
```python
import time

from google import genai
from google.genai import types
from langchain.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

client = genai.Client()

# Upload file
file = client.files.upload(file="path/to/your/file")
while file.state.name == "PROCESSING":
    time.sleep(2)
    file = client.files.get(name=file.name)

# Create cache
model = "gemini-3.1-pro-preview"
cache = client.caches.create(
    model=model,
    config=types.CreateCachedContentConfig(
        display_name="Cached Content",
        system_instruction=(
            "You are an expert content analyzer, and your job is to answer "
            "the user's query based on the file you have access to."
        ),
        contents=[file],
        ttl="300s",
    ),
)

# Query with LangChain
llm = ChatGoogleGenerativeAI(
    model=model,
    cached_content=cache.name,
)

message = HumanMessage(content="Summarize the main points of the content.")
llm.invoke([message])
```
Multi-file example

This example caches two files using `Part` and queries them together.
```python
import time

from google import genai
from google.genai.types import Content, CreateCachedContentConfig, Part
from langchain.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

client = genai.Client()

# Upload files
file_1 = client.files.upload(file="./file1")
while file_1.state.name == "PROCESSING":
    time.sleep(2)
    file_1 = client.files.get(name=file_1.name)

file_2 = client.files.upload(file="./file2")
while file_2.state.name == "PROCESSING":
    time.sleep(2)
    file_2 = client.files.get(name=file_2.name)

# Create cache with multiple files
contents = [
    Content(
        role="user",
        parts=[
            Part.from_uri(file_uri=file_1.uri, mime_type=file_1.mime_type),
            Part.from_uri(file_uri=file_2.uri, mime_type=file_2.mime_type),
        ],
    )
]

model = "gemini-3.1-pro-preview"
cache = client.caches.create(
    model=model,
    config=CreateCachedContentConfig(
        display_name="Cached Contents",
        system_instruction=(
            "You are an expert content analyzer, and your job is to answer "
            "the user's query based on the files you have access to."
        ),
        contents=contents,
        ttl="300s",
    ),
)

# Query with LangChain
llm = ChatGoogleGenerativeAI(
    model=model,
    cached_content=cache.name,
)

message = HumanMessage(
    content="Provide a summary of the key information across both files."
)
llm.invoke([message])
```
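The upload-polling loops above spin indefinitely if a file never leaves the `PROCESSING` state. A bounded-wait variant is one way to guard against that; this is a stdlib-only sketch, with a hypothetical `fetch_state` callable standing in for the `client.files.get(...).state.name` lookup:

```python
import time


def wait_until_active(fetch_state, timeout: float = 60.0, interval: float = 2.0) -> str:
    """Poll fetch_state() until it leaves PROCESSING or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = fetch_state()
        if state != "PROCESSING":
            return state
        time.sleep(interval)  # back off between polls
    raise TimeoutError("file did not finish processing in time")


# Simulated upload states: processing twice, then active.
states = iter(["PROCESSING", "PROCESSING", "ACTIVE"])
print(wait_until_active(lambda: next(states), timeout=30.0, interval=0.01))
# ACTIVE
```

In the caching examples, `fetch_state` would wrap the file-refresh call, and a `TimeoutError` would surface upload problems instead of hanging the script.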