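TruLens Integration

TruLens is an open-source package that provides instrumentation and evaluation tools for large language model (LLM) applications.

This page covers how to use TruLens to evaluate and track LLM applications built on LangChain.

Installation and Setup

Install the trulens-eval Python package.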

pip install trulens-eval

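Quickstart

See the integration details in the TruLens documentation.

Tracking

Once you have created your LLM chain, you can use TruLens for evaluation and tracking. TruLens provides a number of out-of-the-box feedback functions and is also an extensible framework for LLM evaluation. Create the feedback functions: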

from trulens_eval.feedback import Feedback, Huggingface, OpenAI

# Initialize the HuggingFace-based and OpenAI-based feedback function collection classes:
hugs = Huggingface()
openai = OpenAI()

# Define a language match feedback function using HuggingFace.
lang_match = Feedback(hugs.language_match).on_input_output()
# By default this will check language match on the main app input and main app
# output.

# Question/answer relevance between overall question and answer.
qa_relevance = Feedback(openai.relevance).on_input_output()
# By default this will evaluate feedback on main app input and main app output.

# Toxicity of input
toxicity = Feedback(openai.toxicity).on_input()
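
Because TruLens is described above as an extensible framework, it may help to see what a custom feedback function could look like. The following is a minimal sketch, not part of TruLens: `conciseness` and its `max_words` parameter are hypothetical names, and the only assumption about the scoring convention is that a feedback function takes text and returns a float between 0.0 and 1.0.

```python
def conciseness(response: str, max_words: int = 50) -> float:
    """Hypothetical custom feedback: reward short responses.

    Returns 1.0 for responses of at most max_words words, then decays
    linearly to 0.0 once the response reaches 2 * max_words words.
    """
    n_words = len(response.split())
    if n_words <= max_words:
        return 1.0
    overflow = n_words - max_words
    return max(0.0, 1.0 - overflow / max_words)
```

Assuming Feedback accepts a plain callable in the same way it accepts hugs.language_match, a function like this could be registered alongside the built-in ones; check the TruLens documentation for the selector to use on the output before relying on that.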

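Chains

After you have set up the feedback functions for evaluating your LLM, you can wrap your application with TruChain to get detailed tracing, logging, and evaluation of your LLM application. Note: see the TruLens documentation for the code that creates `chain`.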

from trulens_eval import TruChain

# wrap your chain with TruChain
truchain = TruChain(
    chain,
    app_id='Chain1_ChatApplication',
    feedbacks=[lang_match, qa_relevance, toxicity]
)
# Note: any `feedbacks` specified here will be evaluated and logged whenever the chain is used.
truchain("que hora es?")
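
To make the wrapping idea concrete, here is a toy sketch of a wrapper that runs an application, evaluates every registered feedback function on the call, and stores the result as a record. It is not how TruChain is implemented: `RecordingWrapper`, `echo_app`, and `same_length` are hypothetical names invented for illustration, and the real library records far more detail.

```python
from typing import Callable, Dict, List


class RecordingWrapper:
    """Toy stand-in for the wrap-and-record idea; not TruLens code."""

    def __init__(
        self,
        app: Callable[[str], str],
        app_id: str,
        feedbacks: Dict[str, Callable[[str, str], float]],
    ) -> None:
        self.app = app
        self.app_id = app_id
        self.feedbacks = feedbacks
        self.records: List[dict] = []

    def __call__(self, prompt: str) -> str:
        # Run the wrapped application, then score the call with each feedback.
        output = self.app(prompt)
        scores = {name: fn(prompt, output) for name, fn in self.feedbacks.items()}
        self.records.append(
            {"app_id": self.app_id, "input": prompt, "output": output, "scores": scores}
        )
        return output


def echo_app(prompt: str) -> str:
    """Hypothetical application: upper-cases its input."""
    return prompt.upper()


def same_length(prompt: str, output: str) -> float:
    """Hypothetical feedback: 1.0 when input and output have equal length."""
    return 1.0 if len(prompt) == len(output) else 0.0


wrapped = RecordingWrapper(echo_app, "toy_app", {"same_length": same_length})
result = wrapped("que hora es?")
```

Each call appends one record, which mirrors the note above that feedbacks are evaluated and logged whenever the chain is used.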

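Evaluation

Now you can explore your LLM application! Doing so helps you see at a glance how the application is performing. As you iterate on new versions of the application, you can compare their performance across all of the quality metrics you have set up. You can also view evaluations at the record level and explore the chain metadata for each record.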

from trulens_eval import Tru

tru = Tru()
tru.run_dashboard() # open a Streamlit app to explore
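
The cross-version comparison that the dashboard offers can be pictured as averaging each feedback score per application version. The sketch below assumes a simplified, hypothetical record shape (a dict with `app_id` and `scores`) and made-up illustrative values; it does not read from the TruLens database.

```python
from collections import defaultdict
from statistics import mean


def leaderboard(records):
    """Average every feedback score per app_id (hypothetical record shape)."""
    grouped = defaultdict(lambda: defaultdict(list))
    for rec in records:
        for metric, score in rec["scores"].items():
            grouped[rec["app_id"]][metric].append(score)
    return {
        app_id: {metric: mean(values) for metric, values in metrics.items()}
        for app_id, metrics in grouped.items()
    }


# Illustrative records only; real scores would come from logged evaluations.
records = [
    {"app_id": "Chain1", "scores": {"qa_relevance": 0.5, "toxicity": 0.0}},
    {"app_id": "Chain1", "scores": {"qa_relevance": 1.0, "toxicity": 0.0}},
    {"app_id": "Chain2", "scores": {"qa_relevance": 0.75, "toxicity": 0.25}},
]
board = leaderboard(records)
```

Keeping the per-record scores and aggregating only at read time is what makes both views possible: an overall comparison across versions and a drill-down into individual records.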

For more information on TruLens, visit trulens.org
