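This notebook demonstrates how to retrieve Cube's data model metadata in a format suitable for passing to LLMs as embeddings, thereby enhancing contextual information.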

About Cube

Cube is the semantic layer for building data apps. It helps data engineers and application developers access data from modern data stores, organize it into consistent definitions, and deliver it to every application.

Cube's data model provides structure and definitions that serve as context for an LLM to understand the data and generate correct queries. The LLM doesn't need to navigate complex joins and metrics calculations, because Cube abstracts those away and provides a simple interface that operates on business-level terminology instead of SQL table and column names. This simplification helps the LLM be less error-prone and avoid hallucinations.

Example

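Input arguments (mandatory)
Cube Semantic Loader requires 2 arguments:
  • cube_api_url: The URL of your Cube deployment's REST API. See the Cube documentation for more information on configuring the base path.
  • cube_api_token: The authentication token generated from your Cube API secret. See the Cube documentation for instructions on generating JSON Web Tokens (JWT).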
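Input arguments (optional)
  • load_dimension_values: Whether to load dimension values for every string dimension.
  • dimension_values_limit: Maximum number of dimension values to load.
  • dimension_values_max_retries: Maximum number of retries when loading dimension values.
  • dimension_values_retry_delay: Delay between retries when loading dimension values.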
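The optional arguments listed above might be passed as keyword arguments, as in the minimal sketch below. The values are illustrative assumptions rather than recommended settings, the helper name make_loader is hypothetical, and the retry delay is assumed to be expressed in seconds.

```python
# Optional settings, keyed by the parameter names listed above.
# The values are illustrative assumptions, not recommended defaults.
loader_options = {
    "load_dimension_values": True,      # fetch values for string dimensions
    "dimension_values_limit": 100,      # cap on values loaded per dimension
    "dimension_values_max_retries": 5,  # retries when loading values
    "dimension_values_retry_delay": 2,  # delay between retries (assumed seconds)
}


def make_loader(api_url: str, api_token: str):
    """Build a CubeSemanticLoader with the optional settings applied.

    Hypothetical helper: the import is done lazily so the sketch can be
    read and imported without langchain_community installed.
    """
    from langchain_community.document_loaders import CubeSemanticLoader

    return CubeSemanticLoader(api_url, api_token, **loader_options)
```

Keeping the options in a dictionary makes it easy to lower dimension_values_limit, or to turn load_dimension_values off, when the data model has many high-cardinality string dimensions.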
import jwt  # provided by the PyJWT package
from langchain_community.document_loaders import CubeSemanticLoader

# The /meta endpoint of the Cube REST API
api_url = "https://api-example.gcp-us-central1.cubecloudapp.dev/cubejs-api/v1/meta"
cubejs_api_secret = "api-secret-here"

# Read more about security context here: https://cube.dev/docs/security
security_context = {}
# Sign the security context with the API secret to obtain the API token
api_token = jwt.encode(security_context, cubejs_api_secret, algorithm="HS256")

loader = CubeSemanticLoader(api_url, api_token)

# Fetch the data model metadata as a list of documents
documents = loader.load()
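To illustrate what the jwt.encode call produces, the following standard-library sketch builds an HS256-signed JWT by hand: a base64url-encoded header and payload, joined by a dot and signed with HMAC-SHA256. The helper names b64url and sign_hs256 and the user_id claim are hypothetical, and the output is not guaranteed to be byte-identical to PyJWT's. In practice PyJWT, as used above, is the simpler choice.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(payload: dict, secret: str) -> str:
    """Return a compact HS256-signed JWT carrying the given payload."""
    header = {"alg": "HS256", "typ": "JWT"}
    # header.payload, each serialized as compact JSON and base64url-encoded
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    # HMAC-SHA256 over the signing input, keyed with the shared secret
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url(signature)}"


# Hypothetical security context; the placeholder secret matches the example above.
api_token = sign_hs256({"user_id": 42}, "api-secret-here")
```

The resulting string has three dot-separated parts (header, payload, signature). Anyone holding the same secret can recompute the signature to verify the token.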
Returns a list of documents with the following attributes:
  • page_content
  • metadata
    • table_name
    • column_name
    • column_data_type
    • column_title
    • column_description
    • column_member_type
    • column_values
    • cube_data_obj_type
# Example page_content of a returned document
page_content = "Users View City, None"

# Example metadata of the same document
metadata = {
    "table_name": "users_view",
    "column_name": "users_view.city",
    "column_data_type": "string",
    "column_title": "Users View City",
    "column_description": "None",
    "column_member_type": "dimension",
    "column_values": [
        "Austin",
        "Chicago",
        "Los Angeles",
        "Mountain View",
        "New York",
        "Palo Alto",
        "San Francisco",
        "Seattle",
    ],
    "cube_data_obj_type": "view",
}
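One possible way to turn such a document into a line of LLM context is sketched below. The Doc class is a hypothetical stand-in for the objects returned by the loader (it only mirrors the page_content and metadata attributes), and the describe helper and its output format are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a loaded document."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def describe(doc: Doc, max_values: int = 5) -> str:
    """Render one column's metadata as a single line of prompt context."""
    m = doc.metadata
    line = (
        f"{m['column_name']} ({m['column_member_type']}, "
        f"{m['column_data_type']}): {m['column_title']}"
    )
    values = m.get("column_values") or []
    if values:
        # Show only the first few values to keep the prompt short
        shown = ", ".join(values[:max_values])
        suffix = ", ..." if len(values) > max_values else ""
        line += f" [values: {shown}{suffix}]"
    return line


doc = Doc(
    page_content="Users View City, None",
    metadata={
        "table_name": "users_view",
        "column_name": "users_view.city",
        "column_data_type": "string",
        "column_title": "Users View City",
        "column_member_type": "dimension",
        "column_values": ["Austin", "Chicago", "Los Angeles", "Mountain View"],
    },
)

context_line = describe(doc, max_values=3)
# users_view.city (dimension, string): Users View City [values: Austin, Chicago, Los Angeles, ...]
```

Truncating column_values keeps the context compact; the full list remains available in the metadata for filtering or validation.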