Scrapeless Crawler integration

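Scrapeless offers flexible, feature-rich data acquisition services with extensive parameter customization and multi-format export. These capabilities let LangChain integrate and use external data more effectively. The core functional modules are:

DeepSerp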

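Google Search: comprehensive extraction of Google SERP data across all result types.
- Select a localized Google domain (for example google.com, google.ad) to retrieve region-specific search results.
- Pagination for retrieving results beyond the first page.
- A search filter toggle that controls whether duplicate or similar content is excluded.
Google Trends: retrieves keyword trend data from Google, including popularity over time, regional interest, and related searches.
- Multi-keyword comparison.
- Multiple data types: interest_over_time, interest_by_region, related_queries, and related_topics.
- Filtering by a specific Google property (Web, YouTube, News, Shopping) for source-specific trend analysis.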

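Universal Scraping

Designed for modern, JavaScript-heavy websites, with support for dynamic content extraction.
- Global premium proxy support for bypassing geo-restrictions and improving reliability.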

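Crawler

Crawl: recursively crawls a website and its linked pages to extract site-wide content.
- Configurable crawl depth and scoped URL targeting.
Scrape: extracts content from a single web page with high precision.
- "Main content only" extraction that excludes ads, footers, and other non-essential elements.
- Batch scraping of multiple standalone URLs.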

Overview

Integration details

Class	Package	Serializable	JS support	Version
ScrapelessCrawlerScrapeTool	langchain-scrapeless	✅	❌
ScrapelessCrawlerCrawlTool	langchain-scrapeless	✅	❌

Tool features

Native async	Returns artifact	Return data
✅	✅	markdown, rawHtml, screenshot@fullPage, json, links, screenshot, html
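
Because the tools support native async, several crawls can run concurrently. The sketch below shows one way to do that. `crawl_many` is a hypothetical helper name, and it assumes the tool object exposes LangChain's standard `ainvoke` coroutine.

```python
import asyncio


async def crawl_many(tool, urls, limit=2):
    """Start one crawl per URL and wait for all of them concurrently.

    `tool` is assumed to expose LangChain's async `ainvoke` method,
    for example an instance of ScrapelessCrawlerCrawlTool.
    """
    tasks = [tool.ainvoke({"url": url, "limit": limit}) for url in urls]
    # gather() preserves input order, so results[i] belongs to urls[i].
    return await asyncio.gather(*tasks)
```

With the package installed and the API key set, a call such as `asyncio.run(crawl_many(ScrapelessCrawlerCrawlTool(), ["https://example.com"]))` should return one result per URL.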

Setup

The integration lives in the langchain-scrapeless package.

!pip install langchain-scrapeless

Credentials

You need a Scrapeless API key to use this tool. You can set it as an environment variable:

import os

os.environ["SCRAPELESS_API_KEY"] = "your-api-key"
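
A key hardcoded in source is easy to leak. A minimal alternative, assuming an interactive session, is to prompt for the key only when the variable is not already set. `ensure_api_key` is a hypothetical helper and not part of langchain-scrapeless.

```python
import os
from getpass import getpass


def ensure_api_key(env_var="SCRAPELESS_API_KEY", prompt=getpass):
    """Return the key from the environment, prompting for it only if missing."""
    if not os.environ.get(env_var):
        # getpass hides the typed value; `prompt` can be swapped out in tests.
        os.environ[env_var] = prompt(f"Enter {env_var}: ")
    return os.environ[env_var]
```

Call `ensure_api_key()` once before instantiating any of the tools.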

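Instantiation

ScrapelessCrawlerScrapeTool

The ScrapelessCrawlerScrapeTool lets you scrape content from one or more websites through Scrapeless's Crawler Scrape API. You can extract the main content and control formats, headers, wait times, and output types. The tool accepts the following parameters:

urls (required, List[str]): One or more website URLs to scrape.
formats (optional, List[str]): The formats of the scraped output. Defaults to ['markdown']. Options:
- 'markdown'
- 'rawHtml'
- 'screenshot@fullPage'
- 'json'
- 'links'
- 'screenshot'
- 'html'
only_main_content (optional, bool): Whether to return only the main page content, excluding headers, navigation bars, footers, and similar elements. Defaults to True.
include_tags (optional, List[str]): HTML tags to include in the output (for example ['h1', 'p']). If None, no tags are explicitly included.
exclude_tags (optional, List[str]): HTML tags to exclude from the output. If None, no tags are explicitly excluded.
headers (optional, Dict[str, str]): Custom headers to send with the request (for example, cookies or a user-agent). Defaults to None.
wait_for (optional, int): Time to wait before scraping, in milliseconds. Use it to give the page enough time to load. Defaults to 0.
timeout (optional, int): Request timeout in milliseconds. Defaults to 30000.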
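
These parameters are passed to the tool as a single dict. The sketch below assembles and checks such a dict before it is sent. `build_scrape_input` and `ALLOWED_FORMATS` are hypothetical names used for illustration, and the validation is local only; the service may apply its own rules.

```python
ALLOWED_FORMATS = {
    "markdown", "rawHtml", "screenshot@fullPage",
    "json", "links", "screenshot", "html",
}


def build_scrape_input(urls, formats=None, only_main_content=True,
                       wait_for=0, timeout=30000, **extra):
    """Build the input dict for the scrape tool from the documented parameters."""
    if isinstance(urls, str):
        urls = [urls]  # the tool expects a list, even for a single URL
    if not urls:
        raise ValueError("urls must contain at least one URL")
    formats = list(formats) if formats else ["markdown"]  # documented default
    unknown = [f for f in formats if f not in ALLOWED_FORMATS]
    if unknown:
        raise ValueError(f"unsupported formats: {unknown}")
    payload = {
        "urls": list(urls),
        "formats": formats,
        "only_main_content": only_main_content,
        "wait_for": wait_for,  # milliseconds
        "timeout": timeout,    # milliseconds
    }
    payload.update(extra)  # e.g. headers, include_tags, exclude_tags
    return payload
```

The result can be handed to the tool directly, for example `ScrapelessCrawlerScrapeTool().invoke(build_scrape_input("https://example.com", formats=["html"]))`.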

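ScrapelessCrawlerCrawlTool

The ScrapelessCrawlerCrawlTool lets you crawl a website starting from a base URL through Scrapeless's Crawler Crawl API. It supports advanced URL filtering, crawl depth control, content scraping options, header customization, and more. The tool accepts the following parameters:

url (required, str): The base URL to start crawling from.
limit (optional, int): Maximum number of pages to crawl. Defaults to 10000.
include_paths (optional, List[str]): URL pathname regex patterns to include in the crawl. Only URLs matching these patterns are included. For example, ["blog/.*"] includes only URLs under the /blog/ path. Defaults to None.
exclude_paths (optional, List[str]): URL pathname regex patterns to exclude from the crawl. For example, ["blog/.*"] excludes URLs under the /blog/ path. Defaults to None.
max_depth (optional, int): Maximum crawl depth relative to the base URL, measured by the number of slashes in the URL path. Defaults to 10.
max_discovery_depth (optional, int): Maximum crawl depth based on discovery order. The root page and sitemapped pages have a depth of 0. For example, with a value of 1 and the sitemap ignored, only the entered URL and the pages it links to directly are crawled. Defaults to None.
ignore_sitemap (optional, bool): Whether to ignore the website's sitemap during the crawl. Defaults to False.
ignore_query_params (optional, bool): Whether to ignore differences in query parameters, to avoid re-scraping similar URLs. Defaults to False.
deduplicate_similar_urls (optional, bool): Whether to deduplicate similar URLs. Defaults to True.
regex_on_full_url (optional, bool): Whether regex matching applies to the full URL rather than only the path. Defaults to True.
allow_backward_links (optional, bool): Whether to allow crawling backlinks outside the URL hierarchy. Defaults to False.
allow_external_links (optional, bool): Whether to allow crawling links to external websites. Defaults to False.
delay (optional, int): Delay between page scrapes, in seconds, to respect rate limits. Defaults to 1.
formats (optional, List[str]): The formats of the scraped content. Defaults to ['markdown']. Options:
- 'markdown'
- 'rawHtml'
- 'screenshot@fullPage'
- 'json'
- 'links'
- 'screenshot'
- 'html'
only_main_content (optional, bool): Whether to return only the main content, excluding headers, navigation bars, footers, and similar elements. Defaults to True.
include_tags (optional, List[str]): HTML tags to include in the output (for example ['h1', 'p']). Defaults to None (no explicit include filter).
exclude_tags (optional, List[str]): HTML tags to exclude from the output. Defaults to None (no explicit exclude filter).
headers (optional, Dict[str, str]): Custom HTTP headers to send with the request, such as cookies or a user-agent string. Defaults to None.
wait_for (optional, int): Time to wait before scraping the content, in milliseconds, so the page can load fully. Defaults to 0.
timeout (optional, int): Request timeout in milliseconds. Defaults to 30000.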
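
Before starting a large crawl, it can help to preview which URLs a given combination of include_paths, exclude_paths, and max_depth would keep. The sketch below is a local approximation only. It assumes patterns are applied with `re.search`, that `regex_on_full_url=False` matches against the path without its leading slash, and that the base URL is the site root, so depth equals the number of non-empty path segments. The Scrapeless service may differ on each of these points. `path_depth` and `preview_filter` are hypothetical helpers.

```python
import re
from urllib.parse import urlparse


def path_depth(url):
    """Count non-empty path segments (assumes the crawl starts at the site root)."""
    return len([segment for segment in urlparse(url).path.split("/") if segment])


def preview_filter(urls, include_paths=None, exclude_paths=None,
                   max_depth=10, regex_on_full_url=True):
    """Approximate locally which URLs a crawl configuration would keep."""
    kept = []
    for url in urls:
        # Match against the whole URL, or only the path without its leading slash.
        target = url if regex_on_full_url else urlparse(url).path.lstrip("/")
        if include_paths and not any(re.search(p, target) for p in include_paths):
            continue  # matches none of the include patterns
        if exclude_paths and any(re.search(p, target) for p in exclude_paths):
            continue  # matches an exclude pattern
        if path_depth(url) > max_depth:
            continue  # too deep below the root
        kept.append(url)
    return kept
```

For example, `preview_filter(candidate_urls, include_paths=["blog/.*"])` keeps only the candidates whose URL contains a `blog/` segment followed by anything.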

Invocation

ScrapelessCrawlerCrawlTool

Usage with parameters

from langchain_scrapeless import ScrapelessCrawlerCrawlTool

tool = ScrapelessCrawlerCrawlTool()

# Advanced usage
result = tool.invoke({"url": "https://exmaple.com", "limit": 4})
print(result)

{'success': True, 'status': 'completed', 'completed': 1, 'total': 1, 'data': [{'markdown': '# Well hello there.\n\nWelcome to exmaple.com.\n\nChances are you got here by mistake (example.com, anyone?)', 'metadata': {'scrapeId': '547b2478-a41a-4a17-8015-8db378ee455f', 'sourceURL': 'https://exmaple.com', 'url': 'https://exmaple.com', 'statusCode': 200}}]}
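
The result is a plain dict, so the pages can be read straight out of its data list. The sketch below assumes the shape shown in the output above (success, data, and per-page markdown and metadata.sourceURL). `iter_pages` is a hypothetical helper, and `sample` is a trimmed copy of that output.

```python
def iter_pages(result):
    """Yield (source_url, markdown) pairs from a crawl or scrape result dict."""
    if not result.get("success"):
        return  # nothing to yield for a failed request
    for page in result.get("data", []):
        metadata = page.get("metadata", {})
        yield metadata.get("sourceURL"), page.get("markdown", "")


# Trimmed copy of the result printed above.
sample = {
    "success": True,
    "status": "completed",
    "data": [
        {
            "markdown": "# Well hello there.",
            "metadata": {"sourceURL": "https://exmaple.com", "statusCode": 200},
        }
    ],
}

for url, markdown in iter_pages(sample):
    print(url, len(markdown))
```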

Use within an agent

from langchain_openai import ChatOpenAI
from langchain_scrapeless import ScrapelessCrawlerCrawlTool
from langchain.agents import create_agent


model = ChatOpenAI()

tool = ScrapelessCrawlerCrawlTool()

# Use the tool with an agent
tools = [tool]
agent = create_agent(model, tools)

for chunk in agent.stream(
    {
        "messages": [
            (
                "human",
                "Use the scrapeless crawler crawl tool to crawl the website https://example.com and output the markdown content as a string.",
            )
        ]
    },
    stream_mode="values",
):
    chunk["messages"][-1].pretty_print()

================================ Human Message =================================

Use the scrapeless crawler crawl tool to crawl the website https://example.com and output the markdown content as a string.
================================== Ai Message ==================================
Tool Calls:
  scrapeless_crawler_crawl (call_Ne5HbxqsYDOKFaGDSuc4xppB)
 Call ID: call_Ne5HbxqsYDOKFaGDSuc4xppB
  Args:
    url: https://example.com
    formats: ['markdown']
    limit: 1
================================= Tool Message =================================
Name: scrapeless_crawler_crawl

{"success": true, "status": "completed", "completed": 1, "total": 1, "data": [{"markdown": "# Example Domain\n\nThis domain is for use in illustrative examples in documents. You may use this\ndomain in literature without prior coordination or asking for permission.\n\n[More information...](https://www.iana.org/domains/example)", "metadata": {"viewport": "width=device-width, initial-scale=1", "title": "Example Domain", "scrapeId": "00561460-9166-492b-8fed-889667383e55", "sourceURL": "https://example.com", "url": "https://example.com", "statusCode": 200}}]}
================================== Ai Message ==================================

The crawl of the website https://example.com has been completed. Here is the markdown content extracted from the website:

\`\`\`
# Example Domain

This domain is for use in illustrative examples in documents. You may use this
domain in literature without prior coordination or asking for permission.

[More information...](https://www.iana.org/domains/example)
\`\`\`

You can find more information on the website [here](https://www.iana.org/domains/example).

ScrapelessCrawlerScrapeTool

Usage with parameters

from langchain_scrapeless import ScrapelessDeepSerpGoogleTrendsTool

tool = ScrapelessDeepSerpGoogleTrendsTool()

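# Basic usage. Note that this example calls the DeepSerp Google Trends tool from the
# same package; the input is a comma-separated list of keywords to compare.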
result = tool.invoke("Funny 2048,negamon monster trainer")
print(result)

{'parameters': {'engine': 'google.trends.search', 'hl': 'en', 'data_type': 'INTEREST_OVER_TIME', 'tz': '0', 'cat': '0', 'date': 'today 1-m', 'q': 'Funny 2048,negamon monster trainer'}, 'interest_over_time': {'timeline_data': [{'date': 'Jul 11, 2025', 'timestamp': '1752192000', 'value': [0, 0]}, {'date': 'Jul 12, 2025', 'timestamp': '1752278400', 'value': [0, 0]}, {'date': 'Jul 13, 2025', 'timestamp': '1752364800', 'value': [0, 0]}, {'date': 'Jul 14, 2025', 'timestamp': '1752451200', 'value': [0, 0]}, {'date': 'Jul 15, 2025', 'timestamp': '1752537600', 'value': [0, 0]}, {'date': 'Jul 16, 2025', 'timestamp': '1752624000', 'value': [0, 0]}, {'date': 'Jul 17, 2025', 'timestamp': '1752710400', 'value': [0, 0]}, {'date': 'Jul 18, 2025', 'timestamp': '1752796800', 'value': [0, 0]}, {'date': 'Jul 19, 2025', 'timestamp': '1752883200', 'value': [0, 0]}, {'date': 'Jul 20, 2025', 'timestamp': '1752969600', 'value': [0, 0]}, {'date': 'Jul 21, 2025', 'timestamp': '1753056000', 'value': [0, 0]}, {'date': 'Jul 22, 2025', 'timestamp': '1753142400', 'value': [0, 0]}, {'date': 'Jul 23, 2025', 'timestamp': '1753228800', 'value': [0, 0]}, {'date': 'Jul 24, 2025', 'timestamp': '1753315200', 'value': [0, 0]}, {'date': 'Jul 25, 2025', 'timestamp': '1753401600', 'value': [0, 0]}, {'date': 'Jul 26, 2025', 'timestamp': '1753488000', 'value': [0, 0]}, {'date': 'Jul 27, 2025', 'timestamp': '1753574400', 'value': [0, 0]}, {'date': 'Jul 28, 2025', 'timestamp': '1753660800', 'value': [0, 0]}, {'date': 'Jul 29, 2025', 'timestamp': '1753747200', 'value': [0, 0]}, {'date': 'Jul 30, 2025', 'timestamp': '1753833600', 'value': [0, 0]}, {'date': 'Jul 31, 2025', 'timestamp': '1753920000', 'value': [0, 0]}, {'date': 'Aug 1, 2025', 'timestamp': '1754006400', 'value': [0, 0]}, {'date': 'Aug 2, 2025', 'timestamp': '1754092800', 'value': [0, 0]}, {'date': 'Aug 3, 2025', 'timestamp': '1754179200', 'value': [0, 0]}, {'date': 'Aug 4, 2025', 'timestamp': '1754265600', 'value': [0, 0]}, {'date': 'Aug 5, 2025', 
'timestamp': '1754352000', 'value': [0, 0]}, {'date': 'Aug 6, 2025', 'timestamp': '1754438400', 'value': [0, 0]}, {'date': 'Aug 7, 2025', 'timestamp': '1754524800', 'value': [0, 0]}, {'date': 'Aug 8, 2025', 'timestamp': '1754611200', 'value': [0, 0]}, {'date': 'Aug 9, 2025', 'timestamp': '1754697600', 'value': [0, 0]}, {'date': 'Aug 10, 2025', 'timestamp': '1754784000', 'value': [0, 100]}, {'date': 'Aug 11, 2025', 'timestamp': '1754870400', 'value': [0, 0]}], 'averages': [{'value': 0}, {'value': 3}], 'isPartial': True}}
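
A result like the one above is easier to read once reduced to each keyword's peak. The sketch below assumes the shape shown in that output, and it also assumes that each point's value list holds one entry per keyword in the order the keywords were given. `peak_dates` is a hypothetical helper.

```python
def peak_dates(result, keywords):
    """Return {keyword: (date, value)} for each keyword's highest data point."""
    timeline = result["interest_over_time"]["timeline_data"]
    peaks = {}
    for index, keyword in enumerate(keywords):
        # max() returns the earliest point when several share the top value.
        best = max(timeline, key=lambda point: point["value"][index])
        peaks[keyword] = (best["date"], best["value"][index])
    return peaks
```

It would be called with the same keywords that were sent, for example `peak_dates(result, ["Funny 2048", "negamon monster trainer"])`.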

Advanced usage with parameters

from langchain_scrapeless import ScrapelessCrawlerScrapeTool

tool = ScrapelessCrawlerScrapeTool()

result = tool.invoke(
    {
        "urls": ["https://exmaple.com", "https://www.scrapeless.com/en"],
        "formats": ["markdown"],
    }
)
print(result)

{'success': True, 'status': 'completed', 'completed': 1, 'total': 1, 'data': [{'markdown': "[🩵 Don't just take our word for it. See what our users say on Product Hunt.](https://www.producthunt.com/posts/scrapeless-deep-serpapi)\n\n# Effortless Web Scraping Toolkit  for Business and Developers\n\nThe ultimate scraper's companion: an expandable suite of tools, including\n\nScraping Browser, Scraping API, Universal Scraping API\n\nand Anti-Bot Solutions—designed to work together or independently.\n\n[**4.8**](https://www.g2.com/products/scrapeless/reviews) [**4.5**](https://www.trustpilot.com/review/scrapeless.com) [**4.8**](https://slashdot.org/software/p/Scrapeless/) [**8.5**](https://tekpon.com/software/scrapeless/reviews/)\n\nNo credit card required\n\n## A Flexible Toolkit for Accessing Public Web Data\n\nAI-powered seamless data extraction, effortlessly bypassing blocks with a single API call.\n\n[scrapeless](https://www.scrapeless.com/en)\n\n[![Deep SerpApi](https://www.scrapeless.com/_next/image?url=%2Fassets%2Fimages%2Ftoolkit%2Flight%2Fimg-2.png&w=750&q=100)\\\\\n\\\\\nView more\\\\\n\\\\\n20+ custom parameters\\\\\n\\\\\n20+ Google SERP scenarios\\\\\n\\\\\nPrecision Search Fueling LLM & RAG AI\\\\\n\\\\\n1-2s response; $0.1/1k queries](https://www.scrapeless.com/en/product/deep-serp-api) [![Scraping Browser](https://www.scrapeless.com/_next/image?url=%2Fassets%2Fimages%2Ftoolkit%2Flight%2Fimg-4.png&w=750&q=100)\\\\\n\\\\\nView more\\\\\n\\\\\nHuman-like Behavior\\\\\n\\\\\nHigh Performance\\\\\n\\\\\nBypassing Risk Control\\\\\n\\\\\nConnect using the CDP Protocol](https://www.scrapeless.com/en/product/scraping-browser) [![Universal Scraping API](https://www.scrapeless.com/_next/image?url=%2Fassets%2Fimages%2Ftoolkit%2Flight%2Fimg-1.png&w=750&q=100)\\\\\n\\\\\nView more\\\\\n\\\\\nSession Mode\\\\\n\\\\\nCustom TLS\\\\\n\\\\\nJs Render](https://www.scrapeless.com/en/product/universal-scraping-api)\n\n### Customized Services\n\nContact our technical experts 
for custom solutions.\n\nBook a demo\n\n## From Simple Data Scraping to Complex Anti-Bot Challenges,   Scrapeless Has You Covered.\n\nFlexible Toolkit for Adapting to Diverse Data Extraction Needs.\n\n[Try for Free](https://app.scrapeless.com/passport/register)\n\n### Fully Compatible with Key Programming Languages and Tools\n\nSeamlessly integrate across all devices, OS, and languages. Worry-free compatibility ensures smooth data collection.\n\nGet all example codes on the dashboard after login\n\n![scrapeless](https://www.scrapeless.com/_next/image?url=%2Fassets%2Fimages%2Fcode%2Fcode-l.jpg&w=3840&q=75)\n\n## Enterprise-level Data Scraping Solution\n\nHigh-quality, tailored web scraping solutions and expert services designed for critical business projects.\n\n### Customized Data Scraping Solutions\n\nTailored web scraping services designed to address your\xa0 unique business requirements and deliver actionable insights.\n\n### High Concurrency and High-Performance Scraping\n\nEfficiently gather massive volumes of data with unparalleled speed and reliability,\xa0ensuring optimal performance even under heavy load.\n\n### Data Cleaning and Transformation\n\nEnhance data accuracy and usability through comprehensive\xa0 cleaning and transformation processes, turning raw data into\xa0 valuable information.\n\n### Real-Time Data Push and API Integration\n\nSeamlessly integrate and access live data streams with robust APIs,\xa0ensuring your applications are always up-to-date with the latest information.\n\n### Data Security and Privacy Protection\n\nProtect your data with state-of-the-art security measures and strict\xa0compliance standards, ensuring privacy and confidentiality at every step.\n\n### Enterprise-level SLA\n\nThe Service Level Agreement (SLA) serves as a safeguard for your project,\xa0ensuring a contract for anticipated outcomes, automated oversight, prompt issue\xa0resolution, and a personalized maintenance plan.\n\n## Why Scrapeless: Simplify Your Data 
Flow Effortlessly.\n\nAchieve all your data scraping tasks with more power, simplicity, and cost-effectiveness in less time.\n\n### Articles\n\nNews articles/Blog posts/Research papers\n\n### Organized Fresh Data\n\n### Prices\n\nProduct prices/Discount information/Market trend analysis\n\n### No need to hassle with browser maintenance\n\n### Reviews\n\nProduct reviews/User feedback/Social media reviews\n\n### Only pay for successful requests\n\n### Products\n\nProduct Launches/Tech Specs/Product Comparisons\n\n### Fully scalable\n\n## Unleash Your Competitive Edge  in Data within the Industry\n\n## Regulate Compliance for All Users\n\nContact us\n\nWe are committed to using technology for the benefit of humanity and firmly oppose any illegal activities and misuse of our products. We support the collection of publicly available data to improve human life, while strongly opposing the collection of unauthorized or unapproved sensitive information. If you find anyone abusing our services, please provide us with feedback! To further enhance user confidence and control, we have established a dedicated Privacy Center aimed at empowering users with more capabilities and information rights.\n\n![scrapeless](https://www.scrapeless.com/_next/image?url=%2Fassets%2Fimages%2Fregulate-compliance.png&w=640&q=75)\n\n## Web Scraping Blog\n\nMost comprehensive guide, created for all Web Scraping developers.\n\n[View All Blogs](https://www.scrapeless.com/en/blog)\n\n[**Scrapeless MCP Server Is Officially Live! Build Your Ultimate AI-Web Connector** \\\\\n\\\\\nDiscover how the Scrapeless MCP Server gives LLMs real-time web browsing and scraping abilities. 
Learn how to build AI agents that search, extract, and interact with dynamic web content seamlessly.\\\\\n\\\\\n![Michael Lee](https://www.scrapeless.com/_next/image?url=https%3A%2F%2Fassets.scrapeless.com%2Fprod%2Fimages%2Fauthor-avatars%2Fmichael-lee.png&w=48&q=75)Michael Lee\\\\\n\\\\\n17-Jul-2025\\\\\n\\\\\n![Scrapeless MCP Server](https://www.scrapeless.com/_next/image?url=https%3A%2F%2Fassets.scrapeless.com%2Fprod%2Fposts%2Fscrapeless-mcp-server%2Fc85738fc1c504abe930fd4514e4a2190.jpeg&w=3840&q=75)](https://www.scrapeless.com/en/blog/scrapeless-mcp-server) [**Product Updates \\| New Profile Feature** \\\\\n\\\\\nProduct Updates \\| Introducing the new Profile feature to enable persistent browser data storage, streamline cross-session workflows, and boost automation efficiency.\\\\\n\\\\\n![Emily Chen](https://www.scrapeless.com/_next/image?url=https%3A%2F%2Fassets.scrapeless.com%2Fprod%2Fimages%2Fauthor-avatars%2Femily-chen.png&w=48&q=75)Emily Chen\\\\\n\\\\\n17-Jul-2025\\\\\n\\\\\n![Product Updates | New Profile Feature: Make Browser Data Persistent, Efficient, and Controllable](https://www.scrapeless.com/_next/image?url=https%3A%2F%2Fassets.scrapeless.com%2Fprod%2Fposts%2Fscrapeelss-profile%2F3194244c16c9b56e1592640ea95c389e.jpeg&w=3840&q=75)](https://www.scrapeless.com/en/blog/scrapeelss-profile) [**How to Track Your Ranking on ChatGPT?** \\\\\n\\\\\nLearn why traditional SEO tools fall short and how Scrapeless helps you monitor and optimize your AI rankings effortlessly.\\\\\n\\\\\n![Michael Lee](https://www.scrapeless.com/_next/image?url=https%3A%2F%2Fassets.scrapeless.com%2Fprod%2Fimages%2Fauthor-avatars%2Fmichael-lee.png&w=48&q=75)Michael Lee\\\\\n\\\\\n01-Jul-2025\\\\\n\\\\\n![ChatGPT Scraper](https://www.scrapeless.com/_next/image?url=https%3A%2F%2Fassets.scrapeless.com%2Fprod%2Fposts%2Fchatgpt-scraper%2F7c5b1ac494b6838a7eca2964df15ef59.png&w=3840&q=75)](https://www.scrapeless.com/en/blog/chatgpt-scraper)\n\nContact our sales team\n\nMonday to Friday, 
9:00 AM - 18:00 PMSingapore Standard Time (UTC+08:00)\n\nScrapeless offers AI-powered, robust, and scalable web scraping and automation services trusted by leading enterprises. Our enterprise-grade solutions are tailored to meet your project needs, with dedicated technical support throughout. With a strong technical team and flexible delivery times, we charge only for successful data, enabling efficient data extraction while bypassing limitations.\n\nContact us now to fuel your business growth.\n\n[**4.8**](https://www.g2.com/products/scrapeless/reviews) [**4.5**](https://www.trustpilot.com/review/scrapeless.com) [**4.8**](https://slashdot.org/software/p/Scrapeless/) [**8.5**](https://tekpon.com/software/scrapeless/reviews/)\n\nBook a demo\n\nProvide your contact details, and we\'ll promptly reach out to offer a product demo and introduction. We ensure your information remains confidential, complying with GDPR standards.\n\nGet a demo\n\nRegister and Claim Free Trial\n\nYour free trial is ready! Sign up for a Scrapeless account for free, and your trial will be instantly activated in your account.\n\n[Sign up](https://app.scrapeless.com/passport/register)\n\nWe value your privacy\n\nWe use cookies to analyze website usage and do not record any of your personal information. 
View [Privacy Policy](https://www.scrapeless.com/en/legal/privacy-policy)\n\nReject\n\nAccept", 'metadata': {'language': 'en', 'description': 'Scrapeless is the best full-stack web scraping toolkit offering Scraping API, Scraping Browser, Universal Scraping API, Captcha Solver, and Proxies, designed to handle all your data collection needs with ease and reliability, empowering businesses and developers with efficient data extraction solutions.', 'google-site-verification': 'xj1xDpU8LpGG_h-2lIBVW_6GNW5Vtx0h5M3lz43HUXc', 'viewport': 'width=device-width, initial-scale=1', 'keywords': 'Scraping API, Scraping Browser, Universal Scraping API, Captcha Solver, and Proxies, web scraping,  web scraper, web scraping api, Web scraper,data scraping, web crawler', 'next-size-adjust': '', 'favicon': 'https://www.scrapeless.com/favicon.ico', 'title': 'Effortless Web Scraping Toolkit - Scrapeless', 'scrapeId': 'c7189211-7034-4e86-9afd-89fa5268b013', 'sourceURL': 'https://www.scrapeless.com/en', 'url': 'https://www.scrapeless.com/en', 'statusCode': 200}}]}
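
In the output above, two URLs were requested but data contains a single page. A small check like the following sketch makes such gaps explicit. `missing_urls` is a hypothetical helper, and it compares URLs by exact string match against metadata.sourceURL, so a redirect or a trailing-slash difference would also be reported as missing.

```python
def missing_urls(requested, result):
    """Return the requested URLs that have no page in the result's data list."""
    returned = {
        page.get("metadata", {}).get("sourceURL")
        for page in result.get("data", [])
    }
    return [url for url in requested if url not in returned]
```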

Use within an agent

from langchain_openai import ChatOpenAI
from langchain_scrapeless import ScrapelessCrawlerScrapeTool
from langchain.agents import create_agent


model = ChatOpenAI()

tool = ScrapelessCrawlerScrapeTool()

# Use the tool with an agent
tools = [tool]
agent = create_agent(model, tools)

for chunk in agent.stream(
    {
        "messages": [
            (
                "human",
                "Use the scrapeless crawler scrape tool to get the website content of https://example.com and output the html content as a string.",
            )
        ]
    },
    stream_mode="values",
):
    chunk["messages"][-1].pretty_print()

================================ Human Message =================================

Use the scrapeless crawler scrape tool to get the website content of https://example.com and output the html content as a string.
================================== Ai Message ==================================
Tool Calls:
  scrapeless_crawler_scrape (call_qrPMGLjXmzb5QlVoIZgMuyPN)
 Call ID: call_qrPMGLjXmzb5QlVoIZgMuyPN
  Args:
    urls: ['https://example.com']
    formats: ['html']
================================= Tool Message =================================
Name: scrapeless_crawler_scrape

{"success": true, "status": "completed", "completed": 1, "total": 1, "data": [{"metadata": {"viewport": "width=device-width, initial-scale=1", "title": "Example Domain", "scrapeId": "63070ee5-ebef-4727-afe7-2b06466c6777", "sourceURL": "https://example.com", "url": "https://example.com", "statusCode": 200}, "html": "<!DOCTYPE html><html>\n\n<body>\n<div>\n    <h1>Example Domain</h1>\n    <p>This domain is for use in illustrative examples in documents. You may use this\n    domain in literature without prior coordination or asking for permission.</p>\n    <p><a href=\"https://www.iana.org/domains/example\">More information...</a></p>\n</div>\n\n\n<div id=\"div-f3t6fv31hyl\" style=\"display: none;\"></div></body></html>"}]}
================================== Ai Message ==================================

The HTML content of the website "https://example.com" is as follows:

\`\`\`html
<!DOCTYPE html><html>
<body>
<div>
    <h1>Example Domain</h1>
    <p>This domain is for use in illustrative examples in documents. You may use this
    domain in literature without prior coordination or asking for permission.</p>
    <p><a href="https://www.iana.org/domains/example">More information...</a></p>
</div>

<div id="div-f3t6fv31hyl" style="display: none;"></div></body></html>
\`\`\`
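
Inside an agent run, the tool result arrives as JSON text (note the lowercase true in the Tool Message above) rather than as a Python dict. The sketch below turns it back into data, assuming the message content is exactly that JSON string. `html_from_tool_message` is a hypothetical helper.

```python
import json


def html_from_tool_message(content):
    """Parse a tool message's JSON text and return {source_url: html} per page."""
    payload = json.loads(content)
    return {
        page["metadata"]["sourceURL"]: page.get("html", "")
        for page in payload.get("data", [])
    }
```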

API reference
