LangChain implements a streaming system to surface real-time updates. Streaming is crucial for making applications built on LLMs feel responsive. By displaying output progressively, even before a complete response is ready, streaming significantly improves the user experience (UX), especially given the latency of LLMs.

Overview

LangChain's streaming system lets you surface real-time feedback from an agent run to your application. For more end-to-end examples of what LangChain streaming can do, see the Common patterns section below.

Supported stream modes

Pass one or more stream modes as a list to the stream method:
  • updates — streams state updates after each agent step. If multiple updates occur within the same step (e.g., multiple nodes run), those updates are streamed separately.
  • messages — streams (token, metadata) tuples from any graph node that calls an LLM.
  • custom — streams custom data from inside graph nodes using the stream writer.

Agent progress

To stream agent progress, use the stream method with streamMode: "updates". This emits an event after each agent step. For example, if you have an agent that calls a tool once, you should see the following updates:
  • LLM node: AIMessage with tool call requests
  • Tool node: ToolMessage with the execution result
  • LLM node: final AI response
import z from "zod";
import { createAgent, tool } from "langchain";

const getWeather = tool(
    async ({ city }) => {
        return `The weather in ${city} is always sunny!`;
    },
    {
        name: "get_weather",
        description: "Get weather for a given city.",
        schema: z.object({
            city: z.string(),
        }),
    }
);

const agent = createAgent({
    model: "gpt-5-nano",
    tools: [getWeather],
});

for await (const chunk of await agent.stream(
    { messages: [{ role: "user", content: "what is the weather in sf" }] },
    { streamMode: "updates" }
)) {
    const [step, content] = Object.entries(chunk)[0];
    console.log(`step: ${step}`);
    console.log(`content: ${JSON.stringify(content, null, 2)}`);
}
/**
 * step: model
 * content: {
 *   "messages": [
 *     {
 *       "kwargs": {
 *         // ...
 *         "tool_calls": [
 *           {
 *             "name": "get_weather",
 *             "args": {
 *               "city": "San Francisco"
 *             },
 *             "type": "tool_call",
 *             "id": "call_0qLS2Jp3MCmaKJ5MAYtr4jJd"
 *           }
 *         ],
 *         // ...
 *       }
 *     }
 *   ]
 * }
 * step: tools
 * content: {
 *   "messages": [
 *     {
 *       "kwargs": {
 *         "content": "The weather in San Francisco is always sunny!",
 *         "name": "get_weather",
 *         // ...
 *       }
 *     }
 *   ]
 * }
 * step: model
 * content: {
 *   "messages": [
 *     {
 *       "kwargs": {
 *         "content": "The latest update says: The weather in San Francisco is always sunny!\n\nIf you'd like real-time details (current temperature, humidity, wind, and today's forecast), I can pull the latest data for you. Want me to fetch that?",
 *         // ...
 *       }
 *     }
 *   ]
 * }
 * */

LLM tokens

To stream tokens as the LLM produces them, use streamMode: "messages".
import z from "zod";
import { createAgent, tool } from "langchain";

const getWeather = tool(
    async ({ city }) => {
        return `The weather in ${city} is always sunny!`;
    },
    {
        name: "get_weather",
        description: "Get weather for a given city.",
        schema: z.object({
            city: z.string(),
        }),
    }
);

const agent = createAgent({
    model: "gpt-4.1-mini",
    tools: [getWeather],
});

for await (const [token, metadata] of await agent.stream(
    { messages: [{ role: "user", content: "what is the weather in sf" }] },
    { streamMode: "messages" }
)) {
    console.log(`node: ${metadata.langgraph_node}`);
    console.log(`content: ${JSON.stringify(token.contentBlocks, null, 2)}`);
}
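Each event in "messages" mode carries a partial message chunk; to render a running response you typically concatenate the text content blocks as they arrive. A minimal sketch of that accumulation, assuming content blocks shaped like { type: "text", text: ... } (the accumulateText helper is hypothetical, not a LangChain API):

```typescript
type ContentBlock = { type: string; text?: string };

// Accumulate streamed content blocks into a single running string.
// Each element of `chunks` is one `token.contentBlocks` array from the stream.
function accumulateText(chunks: ContentBlock[][]): string {
  let output = "";
  for (const blocks of chunks) {
    for (const block of blocks) {
      // Only text blocks contribute to the visible response;
      // tool-call and other block types are skipped here.
      if (block.type === "text" && block.text) {
        output += block.text;
      }
    }
  }
  return output;
}
```

For example, feeding it two chunks containing "It's always " and "sunny!" yields the combined string "It's always sunny!".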

Custom updates

To stream updates while a tool executes, you can use the writer parameter on the config.
import z from "zod";
import { tool, createAgent } from "langchain";
import { LangGraphRunnableConfig } from "@langchain/langgraph";

const getWeather = tool(
    async (input, config: LangGraphRunnableConfig) => {
        // Stream any arbitrary data
        config.writer?.(`Looking up data for city: ${input.city}`);
        // ... fetch city data
        config.writer?.(`Acquired data for city: ${input.city}`);
        return `It's always sunny in ${input.city}!`;
    },
    {
        name: "get_weather",
        description: "Get weather for a given city.",
        schema: z.object({
            city: z.string().describe("The city to get weather for."),
        }),
    }
);

const agent = createAgent({
    model: "gpt-4.1-mini",
    tools: [getWeather],
});

for await (const chunk of await agent.stream(
    { messages: [{ role: "user", content: "what is the weather in sf" }] },
    { streamMode: "custom" }
)) {
    console.log(chunk);
}
Output
Looking up data for city: San Francisco
Acquired data for city: San Francisco
If you add the writer parameter to your tool, you won't be able to invoke the tool outside of a LangGraph execution context without providing a writer function.
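Note that the tool above calls the writer with optional chaining, config.writer?.(...), so when no writer is supplied the calls are no-ops and the tool still works. A self-contained sketch of that pattern (the Writer type and lookupWeather function are illustrative, not part of the LangChain API):

```typescript
type Writer = (data: string) => void;

// A tool body that emits progress when a writer is available,
// and silently skips the emits when it is not.
function lookupWeather(city: string, writer?: Writer): string {
  writer?.(`Looking up data for city: ${city}`);
  // ... fetch city data
  writer?.(`Acquired data for city: ${city}`);
  return `It's always sunny in ${city}!`;
}
```

Calling lookupWeather("sf") with no writer still returns the result; passing a collecting writer captures both progress messages.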

Stream multiple modes

You can specify multiple streaming modes by passing an array, e.g. streamMode: ["updates", "messages", "custom"]. The streamed output will then be [mode, chunk] tuples, where mode is the name of the stream mode and chunk is the data streamed by that mode.
import z from "zod";
import { tool, createAgent } from "langchain";
import { LangGraphRunnableConfig } from "@langchain/langgraph";

const getWeather = tool(
    async (input, config: LangGraphRunnableConfig) => {
        // Stream any arbitrary data
        config.writer?.(`Looking up data for city: ${input.city}`);
        // ... fetch city data
        config.writer?.(`Acquired data for city: ${input.city}`);
        return `It's always sunny in ${input.city}!`;
    },
    {
        name: "get_weather",
        description: "Get weather for a given city.",
        schema: z.object({
            city: z.string().describe("The city to get weather for."),
        }),
    }
);

const agent = createAgent({
    model: "gpt-4.1-mini",
    tools: [getWeather],
});

for await (const [streamMode, chunk] of await agent.stream(
    { messages: [{ role: "user", content: "what is the weather in sf" }] },
    { streamMode: ["updates", "messages", "custom"] }
)) {
    console.log(`${streamMode}: ${JSON.stringify(chunk, null, 2)}`);
}
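When consuming several modes from a single stream, it is common to demultiplex the [mode, chunk] tuples into per-mode handlers rather than branching inline. A minimal sketch of such a router (the routeStreamChunks helper and handler names are hypothetical, not LangChain APIs):

```typescript
type StreamTuple = [string, unknown];
type Handlers = Record<string, (chunk: unknown) => void>;

// Dispatch each [mode, chunk] tuple to the handler registered for that mode.
// Tuples for modes without a registered handler are ignored.
function routeStreamChunks(tuples: Iterable<StreamTuple>, handlers: Handlers): void {
  for (const [mode, chunk] of tuples) {
    handlers[mode]?.(chunk);
  }
}
```

In practice you would register one handler per mode you requested, e.g. appending "messages" tokens to the UI while logging "custom" progress lines.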

Disable streaming

In some applications you may need to disable streaming of individual tokens for a given model. This is useful when:
  • working with multi-agent systems, to control which agents stream their output
  • mixing models that support streaming with models that don't
  • deploying to LangSmith, to prevent certain model outputs from streaming to the client
Set streaming: false when initializing the model:
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({
  model: "gpt-4.1",
  streaming: false,
});
When deploying to LangSmith, set streaming: false on any model whose output you don't want streamed to the client. This is configured in your graph code before deployment.
Not all chat model integrations support the streaming parameter. If yours doesn't, use disableStreaming: true instead; that parameter is available on all chat models via the base class.
For more details, see the LangGraph streaming guide.

Related