Skip to main content

概述

记忆是一个记录先前交互信息的系统。对于AI代理而言,记忆至关重要,因为它使代理能够记住先前的交互、从反馈中学习并适应用户偏好。随着代理处理更复杂的任务和大量的用户交互,这一能力对于效率和用户满意度都变得至关重要。 短期记忆使您的应用程序能够在单个线程或会话中记住先前的交互。
一个线程组织会话中的多次交互,类似于电子邮件将消息分组到单个对话中的方式。
对话历史是短期记忆最常见的形式。长对话对当今的LLM构成了挑战;完整的历史记录可能无法放入LLM的上下文窗口中,导致上下文丢失或错误。 即使您的模型支持完整的上下文长度,大多数LLM在长上下文上的表现仍然不佳。它们会因过时或离题的内容而“分心”,同时响应时间更慢,成本更高。 聊天模型使用消息来接受上下文,其中包括指令(系统消息)和输入(人类消息)。在聊天应用程序中,消息在人类输入和模型响应之间交替,形成一个随时间增长的消息列表。由于上下文窗口有限,许多应用程序可以从使用移除或“遗忘”过时信息的技术中受益。
需要对话记忆信息?使用长期记忆来存储和检索跨不同线程和会话的用户特定或应用程序级数据。

用法

要为代理添加短期记忆(线程级持久化),您需要在创建代理时指定一个checkpointer
LangChain的代理将短期记忆作为代理状态的一部分进行管理。通过将这些存储在图的状态中,代理可以访问给定对话的完整上下文,同时保持不同线程之间的分离。状态使用检查点持久化到数据库(或内存),以便线程可以随时恢复。短期记忆在代理被调用或步骤(如工具调用)完成时更新,并在每个步骤开始时读取状态。
from langchain.agents import create_agent
from langgraph.checkpoint.memory import InMemorySaver  


agent = create_agent(
    "gpt-5.4",
    tools=[get_user_info],
    checkpointer=InMemorySaver(),
)

agent.invoke(
    {"messages": [{"role": "user", "content": "Hi! My name is Bob."}]},
    {"configurable": {"thread_id": "1"}},  # [code highlight]
)

在生产环境中

在生产环境中,使用由数据库支持的检查点:
pip install langgraph-checkpoint-postgres
from langchain.agents import create_agent

from langgraph.checkpoint.postgres import PostgresSaver  


DB_URI = "postgresql://postgres:postgres@localhost:5442/postgres?sslmode=disable"
with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup() # 在PostgreSQL中自动创建表
    agent = create_agent(
        "gpt-5.4",
        tools=[get_user_info],
        checkpointer=checkpointer,
    )
有关更多检查点选项,包括SQLite、Postgres和Azure Cosmos DB,请参阅持久化文档中的检查点库列表

自定义代理记忆

默认情况下,代理使用AgentState来管理短期记忆,特别是通过messages键管理对话历史。 您可以扩展AgentState以添加额外字段。自定义状态模式通过state_schema参数传递给create_agent
from langchain.agents import create_agent, AgentState
from langgraph.checkpoint.memory import InMemorySaver


class CustomAgentState(AgentState):
    user_id: str
    preferences: dict

agent = create_agent(
    "gpt-5.4",
    tools=[get_user_info],
    state_schema=CustomAgentState,
    checkpointer=InMemorySaver(),
)

# 自定义状态可以在调用时传入
result = agent.invoke(
    {
        "messages": [{"role": "user", "content": "Hello"}],
        "user_id": "user_123",
        "preferences": {"theme": "dark"}
    },
    {"configurable": {"thread_id": "1"}})

常见模式

启用短期记忆后,长对话可能会超出LLM的上下文窗口。常见的解决方案有:

裁剪消息

移除前N条或后N条消息(在调用LLM之前)

删除消息

从LangGraph状态中永久删除消息

总结消息

总结历史中的早期消息,并用摘要替换它们

自定义策略

自定义策略(例如,消息过滤等)
这允许代理跟踪对话而不会超出LLM的上下文窗口。

裁剪消息

大多数LLM都有一个最大支持的上下文窗口(以令牌为单位)。 决定何时截断消息的一种方法是计算消息历史中的令牌数,并在接近该限制时进行截断。如果您使用LangChain,可以使用裁剪消息实用程序,并指定要从列表中保留的令牌数量,以及用于处理边界的strategy(例如,保留最后max_tokens条)。 要在代理中裁剪消息历史,请使用@before_model中间件装饰器:
from langchain.messages import RemoveMessage
from langgraph.graph.message import REMOVE_ALL_MESSAGES
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import before_model
from langgraph.runtime import Runtime
from langchain_core.runnables import RunnableConfig
from typing import Any


@before_model
def trim_messages(state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
    """仅保留最后几条消息以适应上下文窗口。"""
    messages = state["messages"]

    if len(messages) <= 3:
        return None  # 无需更改

    first_msg = messages[0]
    recent_messages = messages[-3:] if len(messages) % 2 == 0 else messages[-4:]
    new_messages = [first_msg] + recent_messages

    return {
        "messages": [
            RemoveMessage(id=REMOVE_ALL_MESSAGES),
            *new_messages
        ]
    }

agent = create_agent(
    your_model_here,
    tools=your_tools_here,
    middleware=[trim_messages],
    checkpointer=InMemorySaver(),
)

config: RunnableConfig = {"configurable": {"thread_id": "1"}}

agent.invoke({"messages": "hi, my name is bob"}, config)
agent.invoke({"messages": "write a short poem about cats"}, config)
agent.invoke({"messages": "now do the same but for dogs"}, config)
final_response = agent.invoke({"messages": "what's my name?"}, config)

final_response["messages"][-1].pretty_print()
"""
================================== Ai 消息 ==================================

您的名字是Bob。您之前告诉过我。
如果您希望我称呼您的昵称或使用其他名称,请告诉我。
"""

删除消息

您可以从图状态中删除消息以管理消息历史。 当您想要移除特定消息或清除整个消息历史时,这很有用。 要从图状态中删除消息,您可以使用RemoveMessage 要使RemoveMessage工作,您需要使用带有add_messages 归约器的状态键。 默认的AgentState提供了此功能。 要移除特定消息:
from langchain.messages import RemoveMessage  

def delete_messages(state):
    messages = state["messages"]
    if len(messages) > 2:
        # 移除最早的两条消息
        return {"messages": [RemoveMessage(id=m.id) for m in messages[:2]]}
要移除所有消息:
from langgraph.graph.message import REMOVE_ALL_MESSAGES  

def delete_messages(state):
    return {"messages": [RemoveMessage(id=REMOVE_ALL_MESSAGES)]}
删除消息时,请确保生成的消息历史是有效的。检查您使用的LLM提供商的限制。例如:
  • 一些提供商期望消息历史以user消息开始
  • 大多数提供商要求带有工具调用的assistant消息后跟相应的tool结果消息。
from langchain.messages import RemoveMessage
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import after_model
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.runtime import Runtime
from langchain_core.runnables import RunnableConfig


@after_model
def delete_old_messages(state: AgentState, runtime: Runtime) -> dict | None:
    """移除旧消息以保持对话可管理。"""
    messages = state["messages"]
    if len(messages) > 2:
        # 移除最早的两条消息
        return {"messages": [RemoveMessage(id=m.id) for m in messages[:2]]}
    return None


agent = create_agent(
    "gpt-5-nano",
    tools=[],
    system_prompt="请简洁明了。",
    middleware=[delete_old_messages],
    checkpointer=InMemorySaver(),
)

config: RunnableConfig = {"configurable": {"thread_id": "1"}}

for event in agent.stream(
    {"messages": [{"role": "user", "content": "hi! I'm bob"}]},
    config,
    stream_mode="values",
):
    print([(message.type, message.content) for message in event["messages"]])

for event in agent.stream(
    {"messages": [{"role": "user", "content": "what's my name?"}]},
    config,
    stream_mode="values",
):
    print([(message.type, message.content) for message in event["messages"]])
[('human', "hi! I'm bob")]
[('human', "hi! I'm bob"), ('ai', 'Hi Bob! Nice to meet you. How can I help you today? I can answer questions, brainstorm ideas, draft text, explain things, or help with code.')]
[('human', "hi! I'm bob"), ('ai', 'Hi Bob! Nice to meet you. How can I help you today? I can answer questions, brainstorm ideas, draft text, explain things, or help with code.'), ('human', "what's my name?")]
[('human', "hi! I'm bob"), ('ai', 'Hi Bob! Nice to meet you. How can I help you today? I can answer questions, brainstorm ideas, draft text, explain things, or help with code.'), ('human', "what's my name?"), ('ai', 'Your name is Bob. How can I help you today, Bob?')]
[('human', "what's my name?"), ('ai', 'Your name is Bob. How can I help you today, Bob?')]

总结消息

如上所示,裁剪或移除消息的问题在于,您可能会因消息队列的筛选而丢失信息。 因此,一些应用程序受益于使用聊天模型总结消息历史的更复杂方法。 摘要 要在代理中总结消息历史,请使用内置的SummarizationMiddleware
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware
from langgraph.checkpoint.memory import InMemorySaver
from langchain_core.runnables import RunnableConfig


checkpointer = InMemorySaver()

agent = create_agent(
    model="gpt-5.4",
    tools=[],
    middleware=[
        SummarizationMiddleware(
            model="gpt-5.4-mini",
            trigger=("tokens", 4000),
            keep=("messages", 20)
        )
    ],
    checkpointer=checkpointer,
)

config: RunnableConfig = {"configurable": {"thread_id": "1"}}
agent.invoke({"messages": "hi, my name is bob"}, config)
agent.invoke({"messages": "write a short poem about cats"}, config)
agent.invoke({"messages": "now do the same but for dogs"}, config)
final_response = agent.invoke({"messages": "what's my name?"}, config)

final_response["messages"][-1].pretty_print()
"""
================================== Ai 消息 ==================================

您的名字是Bob!
"""
有关更多配置选项,请参阅SummarizationMiddleware

访问记忆

您可以通过多种方式访问和修改代理的短期记忆(状态):

工具

在工具中读取短期记忆

使用runtime参数(类型为ToolRuntime)在工具中访问短期记忆(状态)。 runtime参数在工具签名中是隐藏的(因此模型看不到它),但工具可以通过它访问状态。
from langchain.agents import create_agent, AgentState
from langchain.tools import tool, ToolRuntime


class CustomState(AgentState):
    user_id: str

@tool
def get_user_info(
    runtime: ToolRuntime
) -> str:
    """查找用户信息。"""
    user_id = runtime.state["user_id"]
    return "User is John Smith" if user_id == "user_123" else "Unknown user"

agent = create_agent(
    model="gpt-5-nano",
    tools=[get_user_info],
    state_schema=CustomState,
)

result = agent.invoke({
    "messages": "look up user information",
    "user_id": "user_123"
})
print(result["messages"][-1].content)
# > User is John Smith.

从工具写入短期记忆

要在执行期间修改代理的短期记忆(状态),您可以直接从工具返回状态更新。 这对于持久化中间结果或使信息可供后续工具或提示访问很有用。
from langchain.tools import tool, ToolRuntime
from langchain_core.runnables import RunnableConfig
from langchain.messages import ToolMessage
from langchain.agents import create_agent, AgentState
from langgraph.types import Command
from pydantic import BaseModel


class CustomState(AgentState):
    user_name: str

class CustomContext(BaseModel):
    user_id: str

@tool
def update_user_info(
    runtime: ToolRuntime[CustomContext, CustomState],
) -> Command:
    """查找并更新用户信息。"""
    user_id = runtime.context.user_id
    name = "John Smith" if user_id == "user_123" else "Unknown user"
    return Command(update={
        "user_name": name,
        # 更新消息历史
        "messages": [
            ToolMessage(
                "Successfully looked up user information",
                tool_call_id=runtime.tool_call_id
            )
        ]
    })

@tool
def greet(
    runtime: ToolRuntime[CustomContext, CustomState]
) -> str | Command:
    """找到用户信息后使用此工具向用户问好。"""
    user_name = runtime.state.get("user_name", None)
    if user_name is None:
       return Command(update={
            "messages": [
                ToolMessage(
                    "Please call the 'update_user_info' tool it will get and update the user's name.",
                    tool_call_id=runtime.tool_call_id
                )
            ]
        })
    return f"Hello {user_name}!"

agent = create_agent(
    model="gpt-5-nano",
    tools=[update_user_info, greet],
    state_schema=CustomState,
    context_schema=CustomContext,
)

agent.invoke(
    {"messages": [{"role": "user", "content": "greet the user"}]},
    context=CustomContext(user_id="user_123"),
)

提示

在中间件中访问短期记忆(状态),以根据对话历史或自定义状态字段创建动态提示。
from langchain.agents import create_agent
from typing import TypedDict
from langchain.agents.middleware import dynamic_prompt, ModelRequest


class CustomContext(TypedDict):
    user_name: str


def get_weather(city: str) -> str:
    """获取城市的天气。"""
    return f"The weather in {city} is always sunny!"


@dynamic_prompt
def dynamic_system_prompt(request: ModelRequest) -> str:
    user_name = request.runtime.context["user_name"]
    system_prompt = f"You are a helpful assistant. Address the user as {user_name}."
    return system_prompt


agent = create_agent(
    model="gpt-5-nano",
    tools=[get_weather],
    middleware=[dynamic_system_prompt],
    context_schema=CustomContext,
)

result = agent.invoke(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    context=CustomContext(user_name="John Smith"),
)
for msg in result["messages"]:
    msg.pretty_print()

输出
================================ Human 消息 =================================

What is the weather in SF?
================================== Ai 消息 ==================================
Tool Calls:
  get_weather (call_WFQlOGn4b2yoJrv7cih342FG)
 Call ID: call_WFQlOGn4b2yoJrv7cih342FG
  Args:
    city: San Francisco
================================= Tool 消息 =================================
Name: get_weather

The weather in San Francisco is always sunny!
================================== Ai 消息 ==================================

Hi John Smith, the weather in San Francisco is always sunny!

模型之前

@before_model中间件中访问短期记忆(状态),以在模型调用之前处理消息。
from langchain.messages import RemoveMessage
from langgraph.graph.message import REMOVE_ALL_MESSAGES
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import before_model
from langchain_core.runnables import RunnableConfig
from langgraph.runtime import Runtime
from typing import Any


@before_model
def trim_messages(state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
    """仅保留最后几条消息以适应上下文窗口。"""
    messages = state["messages"]

    if len(messages) <= 3:
        return None  # 无需更改

    first_msg = messages[0]
    recent_messages = messages[-3:] if len(messages) % 2 == 0 else messages[-4:]
    new_messages = [first_msg] + recent_messages

    return {
        "messages": [
            RemoveMessage(id=REMOVE_ALL_MESSAGES),
            *new_messages
        ]
    }


agent = create_agent(
    "gpt-5-nano",
    tools=[],
    middleware=[trim_messages],
    checkpointer=InMemorySaver()
)

config: RunnableConfig = {"configurable": {"thread_id": "1"}}

agent.invoke({"messages": "hi, my name is bob"}, config)
agent.invoke({"messages": "write a short poem about cats"}, config)
agent.invoke({"messages": "now do the same but for dogs"}, config)
final_response = agent.invoke({"messages": "what's my name?"}, config)

final_response["messages"][-1].pretty_print()
"""
================================== Ai 消息 ==================================

您的名字是Bob。您之前告诉过我。
如果您希望我称呼您的昵称或使用其他名称,请告诉我。
"""

模型之后

@after_model中间件中访问短期记忆(状态),以在模型调用之后处理消息。
from langchain.messages import RemoveMessage
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import after_model
from langgraph.runtime import Runtime


@after_model
def validate_response(state: AgentState, runtime: Runtime) -> dict | None:
    """移除包含敏感词的消息。"""
    STOP_WORDS = ["password", "secret"]
    last_message = state["messages"][-1]
    if any(word in last_message.content for word in STOP_WORDS):
        return {"messages": [RemoveMessage(id=last_message.id)]}
    return None

agent = create_agent(
    model="gpt-5-nano",
    tools=[],
    middleware=[validate_response],
    checkpointer=InMemorySaver(),
)