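Amazon Bedrock AgentCore Browser lets agents interact with web pages through a managed Chrome browser. Agents can navigate websites, extract content, fill in forms, click elements, and take screenshots in a secure, managed environment.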

Overview

Integration details

Class | Package | Serializable | JS support | Version
BrowserToolkit | langchain-aws | | | (see PyPI)

Tool features

Returns artifact | Native async | Browser interaction | Pricing
 | | | Pay-per-use (AWS)

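Available tools

The toolkit provides the following browser automation tools:

Tool | Description
navigate_browser | Navigate to a URL
click_element | Click an element using a CSS selector
type_text | Type text into an input field
extract_text | Extract all text content from the page
extract_hyperlinks | Extract all hyperlinks from the page
get_elements | Get the elements matching a CSS selector
current_webpage | Get the URL and title of the current page
navigate_back | Go back to the previous page
take_screenshot | Take a screenshot of the page
scroll_page | Scroll the page in a given direction
wait_for_element | Wait for an element to appear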
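When an agent should only read pages and never click or type, the tool list can be filtered by name before it is handed to the agent. The sketch below assumes each tool exposes a `name` attribute (as LangChain tools do) and that the names match the table above. `READ_ONLY_TOOLS` and `select_tools` are hypothetical helpers for illustration, not part of langchain-aws, and which tools count as "read-only" is a judgment call.

```python
from typing import Any, Iterable

# Names taken from the tool table above; click_element and type_text are left out
# because they change page state.
READ_ONLY_TOOLS = frozenset({
    "navigate_browser", "navigate_back", "current_webpage",
    "extract_text", "extract_hyperlinks", "get_elements",
    "take_screenshot", "scroll_page", "wait_for_element",
})


def select_tools(tools: Iterable[Any], allowed: Iterable[str] = READ_ONLY_TOOLS) -> list:
    """Return only the tools whose .name is in `allowed`, preserving order."""
    allowed_set = set(allowed)
    return [tool for tool in tools if tool.name in allowed_set]
```

With the toolkit from the Instantiation section, `select_tools(browser_tools)` would give the list to pass as `tools=` when creating the agent.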

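Setup

The integration lives in the langchain-aws package. It also requires playwright and beautifulsoup4 for browser automation and HTML parsing. The install command below additionally includes bedrock-agentcore.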
pip install -U langchain-aws bedrock-agentcore playwright beautifulsoup4
playwright install chromium
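A quick way to confirm the installation before creating a toolkit is to check that the packages are importable. This is a minimal sketch: the import names in `REQUIRED_MODULES` are assumptions about how the pip packages above are imported (beautifulsoup4 as `bs4`, bedrock-agentcore as `bedrock_agentcore`), and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Assumed top-level import names for the pip packages listed above.
REQUIRED_MODULES = ("langchain_aws", "bedrock_agentcore", "playwright", "bs4")


def missing_modules(modules=REQUIRED_MODULES) -> list:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]
```

An empty list from `missing_modules()` means every package was found. It does not verify that the Chromium download from `playwright install chromium` succeeded.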

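Credentials

You need AWS credentials configured with permissions for Bedrock AgentCore Browser. See the Amazon Bedrock AgentCore documentation for the required IAM permissions. Setting up LangSmith for best-in-class observability is also recommended, though not required: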
import os

os.environ["LANGSMITH_API_KEY"] = "your-api-key"
os.environ["LANGSMITH_TRACING"] = "true"
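To avoid hard-coding the key, tracing can be switched on only when `LANGSMITH_API_KEY` is already present in the environment. The helper below is a hypothetical convenience, not part of LangSmith. It uses only the two environment variable names shown above.

```python
import os


def enable_langsmith_tracing(env=os.environ) -> bool:
    """Set LANGSMITH_TRACING=true only if an API key is already configured.

    Returns True when tracing was enabled, False otherwise.
    """
    if env.get("LANGSMITH_API_KEY"):
        env["LANGSMITH_TRACING"] = "true"
        return True
    return False
```

Calling `enable_langsmith_tracing()` at startup leaves the environment untouched on machines where no key has been exported.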

Instantiation

Create the toolkit with the factory function:
from langchain_aws.tools import create_browser_toolkit

# Create toolkit and get tools
toolkit, browser_tools = create_browser_toolkit(region="us-west-2")

Invocation

Using tools directly

Get specific tools and invoke them:
# Get tools by name
tools_by_name = toolkit.get_tools_by_name()

# Navigate to a URL (requires config with thread_id)
config = {"configurable": {"thread_id": "session-123"}}

result = tools_by_name["navigate_browser"].invoke(
    {"url": "https://example.com"},
    config=config
)
print(result)

# Extract text from the page
text = tools_by_name["extract_text"].invoke({}, config=config)
print(text)
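Because every call needs a config carrying a `thread_id`, it can help to generate that config in one place. The following is a minimal sketch. `new_session_config` is a hypothetical helper, and the only structure it relies on is the `{"configurable": {"thread_id": ...}}` shape used above.

```python
import uuid


def new_session_config(prefix: str = "session") -> dict:
    """Build a config dict with a fresh, unique thread_id such as 'session-3f2a...'."""
    return {"configurable": {"thread_id": f"{prefix}-{uuid.uuid4().hex}"}}
```

Reusing the returned dict across calls keeps them in the same browser session. Calling the helper again starts a new one.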

Using within an agent

import asyncio
from langgraph.prebuilt import create_react_agent
from langchain.chat_models import init_chat_model
from langchain_aws.tools import create_browser_toolkit

async def main():
    # Create toolkit
    toolkit, browser_tools = create_browser_toolkit(region="us-west-2")

    # Initialize chat model
    llm = init_chat_model(
        "us.anthropic.claude-sonnet-4-20250514-v1:0",
        model_provider="bedrock_converse",
    )

    # Create agent with browser tools
    agent = create_react_agent(
        model=llm,
        tools=browser_tools,
    )

    # Create config with thread_id for session isolation
    config = {"configurable": {"thread_id": "research-session"}}

    # Run the agent
    result = await agent.ainvoke(
        {"messages": [{
            "role": "user",
            "content": "Navigate to https://example.com and tell me the main heading"
        }]},
        config=config
    )
    print(result["messages"][-1].content)

    # Clean up when done
    await toolkit.cleanup()

asyncio.run(main())

Thread-based session isolation

The toolkit maintains a separate browser session for each thread_id, so concurrent use does not cause interference:
# Each thread gets its own browser session
config_user1 = {"configurable": {"thread_id": "user-1"}}
config_user2 = {"configurable": {"thread_id": "user-2"}}

# User 1 navigates to site A
tools_by_name["navigate_browser"].invoke(
    {"url": "https://site-a.com"},
    config=config_user1
)

# User 2 navigates to site B (different browser session)
tools_by_name["navigate_browser"].invoke(
    {"url": "https://site-b.com"},
    config=config_user2
)
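The isolation idea can be pictured as a lookup table keyed by `thread_id`. The class below is a toy model written for illustration only: it is not how langchain-aws implements sessions, and `SessionRegistry` is a hypothetical name. A plain `dict` stands in for a browser session.

```python
class SessionRegistry:
    """Toy model: one lazily created session object per thread_id."""

    def __init__(self, factory):
        self._factory = factory   # callable that creates a new session
        self._sessions = {}       # thread_id -> session

    def get(self, config: dict):
        """Return the session for config's thread_id, creating it on first use."""
        thread_id = config["configurable"]["thread_id"]
        if thread_id not in self._sessions:
            self._sessions[thread_id] = self._factory()
        return self._sessions[thread_id]


registry = SessionRegistry(factory=dict)  # a dict stands in for a browser session
```

Two calls with the same `thread_id` see the same object, while a different `thread_id` gets a fresh one, which is the behavior the example above relies on.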

Browser operations

Navigation

config = {"configurable": {"thread_id": "session-123"}}

# Navigate to URL
tools_by_name["navigate_browser"].invoke({"url": "https://example.com"}, config=config)

# Go back
tools_by_name["navigate_back"].invoke({}, config=config)

# Get current page info
current = tools_by_name["current_webpage"].invoke({}, config=config)
print(current)  # URL and title
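When URLs come from model output or user input, a cheap sanity check before calling `navigate_browser` can catch obviously malformed values. This sketch uses only the standard library. `is_navigable_url` is a hypothetical helper, and whether the tool performs its own validation is not stated here.

```python
from urllib.parse import urlparse


def is_navigable_url(url: str) -> bool:
    """Accept only absolute http(s) URLs that include a host."""
    parsed = urlparse(url)
    return parsed.scheme in {"http", "https"} and bool(parsed.netloc)
```

For example, `is_navigable_url("example.com")` is false because the scheme is missing, so the caller can prepend `https://` or reject the value.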

Interacting with elements

# Click an element
tools_by_name["click_element"].invoke({"selector": "#submit-button"}, config=config)

# Type into an input field
tools_by_name["type_text"].invoke({
    "selector": "input[name='search']",
    "text": "search query"
}, config=config)

# Wait for element to appear
tools_by_name["wait_for_element"].invoke({
    "selector": ".results",
    "timeout": 10000,  # 10 seconds
    "state": "visible"
}, config=config)
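The three calls above are often used together: type a query, click submit, then wait for results. The function below chains them in that order. It is a sketch that assumes the argument names shown above (`selector`, `text`, `timeout`, `state`) and that each tool supports `.invoke(args, config=...)`. `submit_search` and its parameters are hypothetical.

```python
def submit_search(tools_by_name, config, query, *, input_selector,
                  submit_selector, results_selector, timeout_ms=10000):
    """Type `query`, click the submit element, and wait for results to become visible.

    Returns whatever the wait_for_element tool returns.
    """
    tools_by_name["type_text"].invoke(
        {"selector": input_selector, "text": query}, config=config
    )
    tools_by_name["click_element"].invoke(
        {"selector": submit_selector}, config=config
    )
    return tools_by_name["wait_for_element"].invoke(
        {"selector": results_selector, "timeout": timeout_ms, "state": "visible"},
        config=config,
    )
```

All three calls share one `config`, so they run against the same browser session.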

Extracting content

# Extract all text
text = tools_by_name["extract_text"].invoke({}, config=config)

# Extract all hyperlinks
links = tools_by_name["extract_hyperlinks"].invoke({}, config=config)

# Get specific elements
elements = tools_by_name["get_elements"].invoke(
    {"selector": "article h2"},
    config=config
)

Screenshots and scrolling

# Take screenshot of visible viewport (returns base64 image)
screenshot = tools_by_name["take_screenshot"].invoke(
    {"capture_type": "viewport"},
    config=config
)

# Take screenshot of entire scrollable page
full_screenshot = tools_by_name["take_screenshot"].invoke(
    {"capture_type": "full_page"},
    config=config
)

# Scroll the page
tools_by_name["scroll_page"].invoke({
    "direction": "down",
    "amount": 500  # pixels
}, config=config)
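The comment above says the screenshot comes back as a base64 image. Assuming the result is a plain base64 string, possibly prefixed as a `data:` URL, it could be written to disk as sketched below. The exact return shape of `take_screenshot` is an assumption here, and `save_base64_image` is a hypothetical helper. If the tool returns a richer structure, the base64 payload would need to be pulled out of it first.

```python
import base64
from pathlib import Path


def save_base64_image(data: str, path: str) -> int:
    """Decode base64 image data and write it to `path`; return the number of bytes written.

    Accepts either raw base64 or a 'data:image/...;base64,<payload>' URL.
    """
    if data.startswith("data:"):
        # Keep only the payload after the first comma.
        _, _, data = data.partition(",")
    raw = base64.b64decode(data)
    Path(path).write_bytes(raw)
    return len(raw)
```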

Session cleanup

Always clean up browser sessions when you are done to release resources:
# Clean up all browser sessions
await toolkit.cleanup()
Note: although create_browser_toolkit() is synchronous, the cleanup() method is asynchronous and must be awaited.
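To make sure cleanup runs even when an agent call raises, the toolkit can be wrapped in an async context manager. This is a minimal sketch built on the standard library's `contextlib.asynccontextmanager`. `managed_toolkit` is a hypothetical wrapper, and the only toolkit method it relies on is the async `cleanup()` described above.

```python
from contextlib import asynccontextmanager


@asynccontextmanager
async def managed_toolkit(toolkit):
    """Yield the toolkit and always await toolkit.cleanup() on exit, even after an error."""
    try:
        yield toolkit
    finally:
        await toolkit.cleanup()
```

Inside an async function this would be used as `async with managed_toolkit(toolkit) as tk: ...`, with all tool or agent calls placed in the body.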

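Concurrency protection

The toolkit has built-in concurrency protection. Each browser session is bound to a specific thread_id, and attempting to access a session that is already in use raises a RuntimeError. Use a different thread_id for each concurrent operation: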
# Good: Different thread IDs for concurrent operations
config_a = {"configurable": {"thread_id": "task-a"}}
config_b = {"configurable": {"thread_id": "task-b"}}

# These can run concurrently without conflicts
await asyncio.gather(
    agent.ainvoke({"messages": [...]}, config=config_a),
    agent.ainvoke({"messages": [...]}, config=config_b),
)
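If several tasks in one process must share a `thread_id`, one way to avoid overlapping access is to queue them behind a per-thread `asyncio.Lock`. The class below is a sketch of that idea. `ThreadLocks` is a hypothetical helper, it only coordinates callers inside a single event loop, and it does not change the toolkit's own RuntimeError behavior.

```python
import asyncio
from collections import defaultdict


class ThreadLocks:
    """Serialize coroutines that share a thread_id; different thread_ids run concurrently."""

    def __init__(self):
        # One lock per thread_id, created on first use inside the running event loop.
        self._locks = defaultdict(asyncio.Lock)

    async def run(self, config: dict, coro_fn):
        """Await coro_fn(config) while holding the lock for config's thread_id."""
        thread_id = config["configurable"]["thread_id"]
        async with self._locks[thread_id]:
            return await coro_fn(config)
```

A call such as `await locks.run(config_a, lambda cfg: agent.ainvoke(inputs, config=cfg))` would then wait for any earlier call on the same `thread_id` to finish, where `inputs` is the agent's input dict.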

API reference

For detailed documentation of all features and configurations, see: