ParallelSearchRetriever is a LangChain BaseRetriever backed by the Parallel Search API. It returns a list[Document] with rich metadata (url, title, publish_date, search_id, excerpts, query) and plugs into any RAG pipeline.
Looking for an LLM-callable tool that returns the raw search response instead of Documents? See ParallelSearchTool.

Overview

Integration details

Retriever: ParallelSearchRetriever
Package: langchain-parallel
Latest version: PyPI (langchain-parallel)

Setup

The integration lives in the langchain-parallel package.
pip install -U langchain-parallel

Credentials

Sign up at Parallel and generate an API key, then set PARALLEL_API_KEY in your environment:
import getpass
import os

if not os.environ.get("PARALLEL_API_KEY"):
    os.environ["PARALLEL_API_KEY"] = getpass.getpass("Parallel API key:\n")

Instantiation

from langchain_parallel import ParallelSearchRetriever

retriever = ParallelSearchRetriever(
    max_results=3,
    excerpts={"max_chars_per_result": 800},
)

Usage

Each returned Document joins its excerpts into page_content and exposes the source URL, title, and publish date in metadata:
docs = retriever.invoke("breakthroughs in fusion energy 2025")
for d in docs:
    print(d.metadata.get("title"), "—", d.metadata.get("url"))
    print(d.page_content[:200], "...\n")
Net energy gain in fusion: NIF results — https://www.nature.com/articles/...
The National Ignition Facility achieved net energy gain on December 5, 2022 ...

Commonwealth Fusion's SPARC milestone — https://news.mit.edu/...
SPARC is on track for first plasma in 2026 ...
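
The metadata fields can also be turned into a compact source list. The following is a minimal sketch, not part of langchain-parallel: source_lines is a hypothetical helper, the example.com URLs are placeholders, and the SimpleNamespace objects only mimic the page_content and metadata attributes of a Document so the example runs without an API key.

```python
from types import SimpleNamespace


def source_lines(docs):
    """Return one 'title (publish_date) <url>' line per document, skipping repeated URLs."""
    seen = set()
    lines = []
    for d in docs:
        url = d.metadata.get("url")
        if url in seen:
            continue  # same source already listed
        seen.add(url)
        title = d.metadata.get("title") or "untitled"
        date = d.metadata.get("publish_date") or "undated"
        lines.append(f"{title} ({date}) <{url}>")
    return lines


# Stand-in documents (assumption); real ones would come from retriever.invoke(...).
sample_docs = [
    SimpleNamespace(
        page_content="...",
        metadata={"url": "https://example.com/a", "title": "A", "publish_date": "2025-01-02"},
    ),
    SimpleNamespace(page_content="...", metadata={"url": "https://example.com/a", "title": "A"}),
    SimpleNamespace(page_content="...", metadata={"url": "https://example.com/b"}),
]

for line in source_lines(sample_docs):
    print(line)
```

Using .get() with a fallback keeps the helper tolerant of results where a title or publish date is absent.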

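Configure the search

Pass an objective to give the retriever a richer retrieval target than keyword search_queries. The retriever forwards the source and fetch policies to the underlying Search API.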
configured = ParallelSearchRetriever(
    max_results=5,
    excerpts={"max_chars_per_result": 1500},
    mode="basic",  # 'basic' (lower latency) or 'advanced' (higher quality)
    source_policy={
        "include_domains": ["nature.com", "science.org", "iter.org"],
    },
)

docs = configured.invoke(
    "What's the latest peer-reviewed result on net-energy-gain fusion?"
)
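
When a source_policy restricts domains, a client-side check on the returned URLs can serve as a quick sanity test. This sketch is an illustration only: in_allowed_domains is a hypothetical helper, and the rule it applies (exact host or any subdomain) is an assumption, since how the Search API matches include_domains is not described here. The URLs are placeholders.

```python
from urllib.parse import urlparse


def in_allowed_domains(url, allowed):
    """True if the URL's host equals an allowed domain or is a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed)


allowed = ["nature.com", "science.org", "iter.org"]

# Placeholder URLs; in practice these would be d.metadata.get("url") for each result.
urls = [
    "https://www.nature.com/articles/example",
    "https://example.org/post",
]

off_policy = [u for u in urls if not in_allowed_domains(u, allowed)]
print(off_policy)
```

Matching on the parsed hostname rather than a substring avoids treating a look-alike host such as notnature.com as allowed.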

Async

docs = await retriever.ainvoke("Latest GLP-1 trial results 2025")
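
A bare await like the one above works in a notebook; in a plain script the call has to run inside an event loop. The sketch below shows one way to do that and to issue several queries concurrently. retrieve_many and FakeRetriever are hypothetical names: the fake class only stands in for a retriever's ainvoke coroutine so the example runs without network access.

```python
import asyncio


async def retrieve_many(retriever, queries):
    """Run one ainvoke call per query concurrently; results keep the query order."""
    results = await asyncio.gather(*(retriever.ainvoke(q) for q in queries))
    return dict(zip(queries, results))


class FakeRetriever:
    """Hypothetical stand-in exposing an ainvoke coroutine like a real retriever."""

    async def ainvoke(self, query):
        await asyncio.sleep(0)  # yield control, as a network call would
        return [f"doc for {query}"]


# asyncio.run creates the event loop that a script does not have by default.
results = asyncio.run(retrieve_many(FakeRetriever(), ["fusion", "GLP-1"]))
print(results)
```

With a real retriever, FakeRetriever() would be replaced by the ParallelSearchRetriever instance created earlier.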

Use within a chain

ParallelSearchRetriever plugs into any LangChain chain:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain.chat_models import init_chat_model

llm = init_chat_model(model="claude-opus-4-7")

prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer using only the context below. Cite URLs."),
    ("human", "Context:\n{context}\n\nQuestion: {question}"),
])

def format_docs(docs):
    return "\n\n".join(
        f"[{d.metadata.get('url')}] {d.page_content[:500]}" for d in docs
    )

chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

chain.invoke("What was the most recent fusion energy breakthrough?")
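
If the combined context risks growing too large for the prompt, format_docs can be swapped for a variant that caps the total size. This is a sketch under assumptions: format_docs_with_budget and its per_doc and total parameters are hypothetical, the budget is counted in characters rather than tokens, and the SimpleNamespace objects stand in for Documents.

```python
from types import SimpleNamespace


def format_docs_with_budget(docs, per_doc=500, total=2000):
    """Number each source and stop adding documents once the character budget is used."""
    parts = []
    used = 0
    for i, d in enumerate(docs, start=1):
        entry = f"[{i}] {d.metadata.get('url')}\n{d.page_content[:per_doc]}"
        if used + len(entry) > total:
            break  # the next entry would exceed the budget
        parts.append(entry)
        used += len(entry)  # separators between entries are not counted
    return "\n\n".join(parts)


# Stand-in documents (assumption); placeholder URLs.
sample = [
    SimpleNamespace(page_content="a" * 30, metadata={"url": "https://example.com/1"}),
    SimpleNamespace(page_content="b" * 30, metadata={"url": "https://example.com/2"}),
]

context = format_docs_with_budget(sample, per_doc=10, total=40)
print(context)
```

Because it takes a list of documents and returns a string, it can occupy the same position as format_docs in the chain, for example retriever | format_docs_with_budget.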

API reference

For detailed documentation, see the ParallelSearchRetriever API reference or the Parallel Search API guide.