Nimble Search

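Nimble's Search API provides real-time web search by browsing the live web with headless browsers rather than querying a pre-built index. The retriever handles JavaScript rendering, dynamic content, and complex navigation flows, which makes it suitable for RAG applications that need current web data, including content behind pagination, filters, and client-side rendering.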

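We can use it as a retriever. This page covers the features specific to this integration. After reading it, consider exploring the relevant use-case pages to learn how to use this retriever as part of a larger chain.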

Installation

pip install -U langchain-nimble

We also need to set our Nimble API key. You can obtain one by signing up with Nimble.

import getpass
import os

if not os.environ.get("NIMBLE_API_KEY"):
    os.environ["NIMBLE_API_KEY"] = getpass.getpass("Nimble API key:\n")

Usage

Now we can instantiate our retriever:

from langchain_nimble import NimbleSearchRetriever

# Basic retriever
retriever = NimbleSearchRetriever(k=5)

Use within a chain

We can easily integrate this retriever into a RAG chain for question answering:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

# Create the RAG prompt
prompt = ChatPromptTemplate.from_template(
    """Answer the question based only on the provided context.
If you cannot answer based on the context, say so.

Context: {context}

Question: {question}

Answer:"""
)

llm = ChatOpenAI(model="gpt-4o-mini")

# Configure the retriever for comprehensive results
retriever = NimbleSearchRetriever(
    k=5,
    deep_search=True,
    parsing_type="markdown",
    include_domains=["wikipedia.org", "britannica.com", ".edu", ".gov"]
)


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


# Build the RAG chain
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Ask a question
response = chain.invoke("What are the key differences between renewable and non-renewable energy sources?")
print(response)

Based on the provided context, here are the key differences between renewable and non-renewable energy sources:

**Renewable Energy Sources:**
- Naturally replenished on a human timescale (solar, wind, hydro, geothermal, biomass)
- Sustainable and virtually inexhaustible
- Generally produce little to no greenhouse gas emissions
- Lower environmental impact
- Costs have decreased significantly in recent years

**Non-Renewable Energy Sources:**
- Finite resources that cannot be replenished quickly (coal, oil, natural gas, nuclear)
- Will eventually be depleted
- Combustion releases significant greenhouse gases and pollutants
- Major contributor to climate change
- Currently still provide the majority of global energy but declining in competitiveness

The context indicates that renewable energy is increasingly becoming cost-competitive with fossil fuels while offering environmental benefits.

Advanced configuration

The retriever supports extensive configuration for different use cases:

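Parameter	Type	Default	Description
`k`	int	10	Maximum number of results to return (1-20)
`deep_search`	bool	True	Deep mode (default) extracts full page content; fast mode (False) returns SERP results only
`topic`	str	"general"	Optimizes search for a content type: "general", "news", or "location"
`include_answer`	bool	False	Generates an AI-summarized answer alongside the search results
`include_domains`	list[str]	None	Allowlist of domains (e.g. ["wikipedia.org", ".edu"])
`exclude_domains`	list[str]	None	Blocklist of domains to filter out
`start_date`	str	None	Filters to results after this date (YYYY-MM-DD or YYYY)
`end_date`	str	None	Filters to results before this date (YYYY-MM-DD or YYYY)
`parsing_type`	str	"markdown"	Output format: "plain_text", "markdown", or "simplified_html"
`locale`	str	"en"	Search locale (e.g. "en-US")
`country`	str	"US"	Country code for localized results (e.g. "US")
`api_key`	str	env var	Nimble API key (read from the NIMBLE_API_KEY environment variable by default)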
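
The constraints listed in the table can be checked before a retriever is constructed. The sketch below is a hypothetical helper (`validate_search_params` is not part of `langchain-nimble`) that assumes the range for `k` and the allowed values for `topic` and `parsing_type` shown above; the library may well perform its own validation.

```python
# Allowed values taken from the parameter table above.
ALLOWED_TOPICS = {"general", "news", "location"}
ALLOWED_PARSING_TYPES = {"plain_text", "markdown", "simplified_html"}


def validate_search_params(k=10, topic="general", parsing_type="markdown", **other):
    """Hypothetical pre-flight check; returns the kwargs if they look valid."""
    if not 1 <= k <= 20:
        raise ValueError(f"k must be between 1 and 20, got {k}")
    if topic not in ALLOWED_TOPICS:
        raise ValueError(f"topic must be one of {sorted(ALLOWED_TOPICS)}, got {topic!r}")
    if parsing_type not in ALLOWED_PARSING_TYPES:
        raise ValueError(
            f"parsing_type must be one of {sorted(ALLOWED_PARSING_TYPES)}, got {parsing_type!r}"
        )
    # Other parameters (deep_search, include_domains, ...) pass through unchecked.
    return {"k": k, "topic": topic, "parsing_type": parsing_type, **other}


params = validate_search_params(k=5, topic="news", deep_search=False)
print(params)
```

The validated dictionary could then be unpacked into the constructor, for example `NimbleSearchRetriever(**params)`.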

Advanced configuration example:

from langchain_nimble import NimbleSearchRetriever

# Retriever optimized for academic research
retriever = NimbleSearchRetriever(
    k=10,
    deep_search=True,
    topic="general",
    include_domains=["arxiv.org", "nature.com", "science.org"],
    start_date="2025-01-01",
    parsing_type="markdown"
)

docs = retriever.invoke("recent advances in quantum computing")
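
To see what a query returned, it can help to print a one-line summary per document. The sketch below assumes the retriever returns standard LangChain `Document` objects with `page_content` and `metadata` attributes. The metadata keys are not specified on this page, so the helper (`summarize_docs`, a hypothetical name) only lists whichever keys are present. Stand-in objects are used so the sketch runs without an API key.

```python
from types import SimpleNamespace


def summarize_docs(docs, preview_chars=80):
    """Return one summary line per document: index, length, metadata keys, preview."""
    lines = []
    for i, doc in enumerate(docs, start=1):
        preview = doc.page_content[:preview_chars].replace("\n", " ")
        keys = sorted(doc.metadata)
        lines.append(f"[{i}] {len(doc.page_content)} chars | keys={keys} | {preview}")
    return lines


# Stand-in documents; the "source" key is illustrative only.
# With a real retriever, pass the list returned by retriever.invoke(...) instead.
sample_docs = [
    SimpleNamespace(
        page_content="Qubits can be entangled.\nMore text.",
        metadata={"source": "example"},
    ),
]

for line in summarize_docs(sample_docs):
    print(line)
```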

Best practices

Fast mode vs. deep mode

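Deep mode (deep_search=True, default):
- Extracts full content from web pages
- Suited to RAG applications that need complete content
- Best for detailed research and building knowledge bases
- Handles JavaScript rendering and dynamic content

Fast mode (deep_search=False):
- Returns quick SERP results with titles and snippets only
- Optimized for performance-sensitive applications
- Best for high-volume queries where speed matters most
- Lower cost per query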
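
One way to keep the mode choice explicit in application code is a small preset table. This is a minimal sketch: `MODE_PRESETS` and `mode_kwargs` are hypothetical names, and the only retriever parameters it relies on are `k` and `deep_search` as described above.

```python
# Hypothetical presets mapping a mode name to the deep_search flag.
MODE_PRESETS = {
    "deep": {"deep_search": True},   # full page content, e.g. for RAG
    "fast": {"deep_search": False},  # SERP titles and snippets only
}


def mode_kwargs(mode, k=5):
    """Build constructor kwargs for the requested mode."""
    if mode not in MODE_PRESETS:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(MODE_PRESETS)}")
    return {"k": k, **MODE_PRESETS[mode]}


rag_kwargs = mode_kwargs("deep")
bulk_kwargs = mode_kwargs("fast", k=10)
print(rag_kwargs)
print(bulk_kwargs)
```

Either dictionary could then be unpacked into the constructor, for example `NimbleSearchRetriever(**bulk_kwargs)`.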

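Filtering strategies

Domain filtering:

- Use include_domains for focused research (academic, government, trusted sources)
- Use exclude_domains to filter out forums, social media, or unreliable sources
- Combine both for precise control over source quality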

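Date filtering:

- Set start_date and end_date to handle time-sensitive queries
- Essential for breaking news, current events, or other time-sensitive information
- Format: "YYYY-MM-DD" (specific date) or "YYYY" (year only)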
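
Because the date filters are plain strings, a malformed value is easy to pass by accident. The sketch below checks the two documented formats before the values reach the retriever. `check_date_filter` and `date_filters` are hypothetical helpers, not part of `langchain-nimble`, and how the API itself reacts to malformed dates is not covered here.

```python
import re
from datetime import date

# Matches "YYYY" or "YYYY-MM-DD" with zero-padded month and day.
_DATE_RE = re.compile(r"[0-9]{4}(-[0-9]{2}-[0-9]{2})?")


def check_date_filter(value):
    """Return value unchanged if it is 'YYYY' or a real 'YYYY-MM-DD' date."""
    if not _DATE_RE.fullmatch(value):
        raise ValueError(f"expected YYYY-MM-DD or YYYY, got {value!r}")
    if len(value) == 10:
        # Raises ValueError for impossible dates such as 2025-02-30.
        date.fromisoformat(value)
    return value


def date_filters(start_date=None, end_date=None):
    """Build the start_date/end_date kwargs, skipping any that are unset."""
    kwargs = {}
    if start_date is not None:
        kwargs["start_date"] = check_date_filter(start_date)
    if end_date is not None:
        kwargs["end_date"] = check_date_filter(end_date)
    return kwargs


print(date_filters(start_date="2025-01-01", end_date="2025"))
```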

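Topic routing:

- Use topic="news" to optimize for current events and news articles
- Use topic="location" to optimize for local business and geographic queries
- Use topic="general", or omit it, for standard web search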
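
Topic routing can be automated with a simple heuristic that inspects the query before choosing a `topic` value. The keyword lists and the `pick_topic` function below are purely illustrative assumptions; a production router would likely need something more robust, such as a classifier.

```python
# Illustrative keyword hints; tune these for your own traffic.
LOCATION_HINTS = ("near me", "nearby", "restaurants in", "directions to")
NEWS_HINTS = ("latest", "breaking", "today", "news")


def pick_topic(query):
    """Return "location", "news", or "general" based on keyword hints."""
    q = query.lower()
    if any(hint in q for hint in LOCATION_HINTS):
        return "location"
    if any(hint in q for hint in NEWS_HINTS):
        return "news"
    return "general"


for query in (
    "coffee shops near me",
    "Latest AI regulation news",
    "how do transformers work",
):
    print(query, "->", pick_topic(query))
```

The chosen value could then be passed as `topic=pick_topic(query)` when constructing the retriever.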

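Performance optimization

- Choose the right mode: use fast mode (deep_search=False) for high-volume queries where speed comes first; use deep mode (default) when comprehensive content extraction is needed
- Tune the result count: start with a small k and increase it as needed
- Use async: call ainvoke() to run queries concurrently
- Cache sensibly: consider caching frequently repeated queries
- Filter deliberately: domain and date filters reduce noise and improve relevance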
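
The async and caching tips can be combined. The sketch below runs uncached queries concurrently with `asyncio.gather` and reuses earlier results from an in-memory dictionary. `StubRetriever` is a stand-in so the example runs without an API key; it only mimics the `ainvoke()` method that LangChain retrievers expose. `search_all` is a hypothetical helper, and a real application may prefer a cache with expiry, since live search results change over time.

```python
import asyncio


class StubRetriever:
    """Stand-in for a retriever; counts calls and simulates latency."""

    def __init__(self):
        self.calls = 0

    async def ainvoke(self, query):
        self.calls += 1
        await asyncio.sleep(0.01)  # simulate a network round trip
        return [f"result for {query!r}"]


async def search_all(retriever, queries, cache):
    """Run uncached queries concurrently and reuse cached results."""
    # dict.fromkeys de-duplicates while preserving order.
    pending = [q for q in dict.fromkeys(queries) if q not in cache]
    results = await asyncio.gather(*(retriever.ainvoke(q) for q in pending))
    cache.update(zip(pending, results))
    return [cache[q] for q in queries]


stub = StubRetriever()
cache = {}
queries = ["solar power", "wind power", "solar power"]

first = asyncio.run(search_all(stub, queries, cache))
second = asyncio.run(search_all(stub, queries, cache))  # served from cache
print(len(first), "results;", stub.calls, "retriever calls")
```

With a real retriever, `stub` would be replaced by a `NimbleSearchRetriever` instance.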

API reference

For detailed documentation of all NimbleSearchRetriever features and configuration options, see the Nimble API documentation.
