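Gel integration

An implementation of the LangChain vector store abstraction that uses gel as the backend.

Gel is an open-source PostgreSQL data layer optimized for a fast development-to-production cycle. It comes with a high-level, strictly typed, graph-like data model, a composable hierarchical query language, full SQL support, migrations, and Auth and AI modules. The code lives in an integration package called langchain-gel.

Setup

First, install the required packages: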

! pip install -qU gel langchain-gel

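Initialization

To use Gel as the backend for your VectorStore, you need a working Gel instance. Fortunately, this does not have to involve Docker containers or anything complicated, unless you want it to. To set up a local instance, run: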

! gel project init --non-interactive

If you are using Gel Cloud (recommended), add one more argument to the command:

gel project init --server-instance <org-name>/<instance-name>

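For a comprehensive list of ways to run Gel, see the Running Gel section of the reference documentation.

Set up the schema

A Gel schema is an explicit, high-level description of your application's data model. Besides letting you define exactly how your data is laid out, it drives many of Gel's features, such as links, access policies, functions, triggers, constraints, and indexes. The LangChain VectorStore expects the schema to have the following layout: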

schema_content = """
using extension pgvector;

module default {
    scalar type EmbeddingVector extending ext::pgvector::vector<1536>;

    type Record {
        required collection: str;
        text: str;
        embedding: EmbeddingVector;
        external_id: str {
            constraint exclusive;
        };
        metadata: json;

        index ext::pgvector::hnsw_cosine(m := 16, ef_construction := 128)
            on (.embedding)
    }
}
""".strip()

with open("dbschema/default.gel", "w") as f:
    f.write(schema_content)
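
The `vector<1536>` dimension declared in the schema has to agree with the length of the vectors your embedding model produces; 1536 matches the default output size of OpenAI's text-embedding-3-small. As a minimal sketch, the helper below extracts the declared dimension from a schema string so it can be compared against an embedding length before migrating. `declared_vector_dim` is a hypothetical name used for illustration and is not part of langchain-gel.

```python
import re


def declared_vector_dim(schema: str) -> int:
    """Return the N from the first `vector<N>` declaration in a schema string."""
    match = re.search(r"vector<(\d+)>", schema)
    if match is None:
        raise ValueError("no vector<N> declaration found in schema")
    return int(match.group(1))


# One line taken from the schema above, used here as sample input.
sample_schema = "scalar type EmbeddingVector extending ext::pgvector::vector<1536>;"
dim = declared_vector_dim(sample_schema)
print(dim)
```

With a configured embeddings object, one way to use this is to compare `dim` against `len(embeddings.embed_query("probe"))` and stop before creating the migration if the two differ.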

To apply schema changes to the database, run a migration using Gel's migration mechanism:

! gel migration create --non-interactive
! gel migrate

From this point on, GelVectorStore can be used as a drop-in replacement for any other vector store available in LangChain.

Instantiation

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

from langchain_gel import GelVectorStore

vector_store = GelVectorStore(
    embeddings=embeddings,
)

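Manage vector store

Add items to vector store

Note that adding a document with an ID that already exists overwrites the stored document with that ID.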

from langchain_core.documents import Document

docs = [
    Document(
        page_content="there are cats in the pond",
        metadata={"id": "1", "location": "pond", "topic": "animals"},
    ),
    Document(
        page_content="ducks are also found in the pond",
        metadata={"id": "2", "location": "pond", "topic": "animals"},
    ),
    Document(
        page_content="fresh apples are available at the market",
        metadata={"id": "3", "location": "market", "topic": "food"},
    ),
    Document(
        page_content="the market also sells fresh oranges",
        metadata={"id": "4", "location": "market", "topic": "food"},
    ),
    Document(
        page_content="the new art exhibit is fascinating",
        metadata={"id": "5", "location": "museum", "topic": "art"},
    ),
    Document(
        page_content="a sculpture exhibit is also at the museum",
        metadata={"id": "6", "location": "museum", "topic": "art"},
    ),
    Document(
        page_content="a new coffee shop opened on Main Street",
        metadata={"id": "7", "location": "Main Street", "topic": "food"},
    ),
    Document(
        page_content="the book club meets at the library",
        metadata={"id": "8", "location": "library", "topic": "reading"},
    ),
    Document(
        page_content="the library hosts a weekly story time for kids",
        metadata={"id": "9", "location": "library", "topic": "reading"},
    ),
    Document(
        page_content="a cooking class for beginners is offered at the community center",
        metadata={"id": "10", "location": "community center", "topic": "classes"},
    ),
]

vector_store.add_documents(docs, ids=[doc.metadata["id"] for doc in docs])
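
Because a document added under an existing ID replaces the stored one, deriving IDs deterministically from the content makes repeated ingestion idempotent instead of creating duplicates. The sketch below assumes IDs are handled as plain strings; `stable_id` and the namespace URL are hypothetical and exist only for this example.

```python
import uuid

# Hypothetical namespace for this example; any fixed UUID works as long as
# it stays the same across ingestion runs.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/my-docs")


def stable_id(text: str) -> str:
    """Derive a repeatable string ID from the document text."""
    return str(uuid.uuid5(NAMESPACE, text))


first = stable_id("there are cats in the pond")
second = stable_id("there are cats in the pond")
print(first == second)
```

Passing `ids=[stable_id(doc.page_content) for doc in docs]` to `add_documents` would then update unchanged documents in place rather than store them twice.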

Delete items from vector store

vector_store.delete(ids=["3"])

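Query vector store

Once your vector store has been created and the relevant documents have been added, you will most likely want to query it while running your chain or agent.

Filtering support

The vector store supports a set of filters that can be applied to the metadata fields of the documents.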

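Operator	Meaning/Category
$eq	Equality (==)
$ne	Inequality (!=)
$lt	Less than (<)
$lte	Less than or equal (<=)
$gt	Greater than (>)
$gte	Greater than or equal (>=)
$in	Special case (in)
$nin	Special case (not in)
$between	Special case (between)
$like	Text (like)
$ilike	Text (case-insensitive like)
$and	Logical (and)
$or	Logical (or)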
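
To make the operator semantics concrete, here is a small in-memory evaluator that checks a metadata dict against a filter. It illustrates the intended meaning only and is not how GelVectorStore evaluates filters (the store presumably translates them into database queries). The names `matches` and `_check`, inclusive `$between` bounds, SQL-style `%` and `_` wildcards for `$like`, and bare values meaning equality are all assumptions of this sketch.

```python
import re
from typing import Any


def _like_to_regex(pattern: str) -> str:
    """Translate SQL LIKE wildcards: % matches any run, _ matches one character."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def _check(value: Any, op: str, operand: Any) -> bool:
    """Apply one comparison operator to a single metadata value."""
    if op == "$eq":
        return value == operand
    if op == "$ne":
        return value != operand
    if op == "$in":
        return value in operand
    if op == "$nin":
        return value not in operand
    if value is None:
        return False  # the remaining operators need an actual value
    if op == "$lt":
        return value < operand
    if op == "$lte":
        return value <= operand
    if op == "$gt":
        return value > operand
    if op == "$gte":
        return value >= operand
    if op == "$between":
        low, high = operand
        return low <= value <= high  # inclusive bounds (assumption)
    if op == "$like":
        return re.fullmatch(_like_to_regex(operand), value, re.DOTALL) is not None
    if op == "$ilike":
        flags = re.DOTALL | re.IGNORECASE
        return re.fullmatch(_like_to_regex(operand), value, flags) is not None
    raise ValueError(f"unsupported operator: {op}")


def matches(metadata: dict, flt: dict) -> bool:
    """Return True if the metadata satisfies every top-level entry of the filter."""
    for key, cond in flt.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in cond):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in cond):
                return False
        elif isinstance(cond, dict):
            value = metadata.get(key)
            if not all(_check(value, op, arg) for op, arg in cond.items()):
                return False
        elif metadata.get(key) != cond:
            return False  # bare value treated as equality (assumption)
    return True


meta = {"id": "2", "location": "pond", "topic": "animals"}
print(matches(meta, {"id": {"$in": ["1", "2"]}}))
print(matches(meta, {"$or": [{"topic": {"$eq": "food"}}, {"location": {"$like": "po%"}}]}))
```

A reference evaluator like this can be handy in unit tests for checking which documents a filter should select before running it against a live store.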

Query directly

A simple similarity search can be performed as follows:

results = vector_store.similarity_search(
    "kitty", k=10, filter={"id": {"$in": ["1", "5", "2", "9"]}}
)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

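If you provide a dict with multiple fields but no operators, the top level is interpreted as a logical AND filter. The two calls below are therefore equivalent: the first relies on the implicit AND, and the second spells it out with $and.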

vector_store.similarity_search(
    "ducks",
    k=10,
    filter={
        "id": {"$in": ["1", "5", "2", "9"]},
        "location": {"$in": ["pond", "market"]},
    },
)

vector_store.similarity_search(
    "ducks",
    k=10,
    filter={
        "$and": [
            {"id": {"$in": ["1", "5", "2", "9"]}},
            {"location": {"$in": ["pond", "market"]}},
        ]
    },
)
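
The implicit-AND rule can be written down as a small rewrite from the shorthand form to the explicit `$and` form. The function `to_explicit_and` is a hypothetical helper for illustration, not something langchain-gel exposes.

```python
def to_explicit_and(flt: dict) -> dict:
    """Rewrite a multi-field filter into the equivalent explicit $and form."""
    if len(flt) <= 1 or any(key.startswith("$") for key in flt):
        return flt  # a single condition or an operator expression is left alone
    return {"$and": [{field: cond} for field, cond in flt.items()]}


shorthand = {
    "id": {"$in": ["1", "5", "2", "9"]},
    "location": {"$in": ["pond", "market"]},
}
explicit = to_explicit_and(shorthand)
print(explicit)
```

Normalizing filters this way before logging or comparing them keeps the two spellings from looking like different queries.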

If you want to run a similarity search and also receive the corresponding scores, you can run:

results = vector_store.similarity_search_with_score(query="cats", k=1)
for doc, score in results:
    print(f"* [SIM={score:.3f}] {doc.page_content} [{doc.metadata}]")
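
The index declared in the schema is `hnsw_cosine`, so ranking is based on the cosine measure. As a rough reference for reading the numbers, cosine similarity and cosine distance can be computed as in the sketch below. Whether `similarity_search_with_score` reports a similarity or a distance is not stated on this page, so check the API reference before interpreting the score; the function names here are illustrative only.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: 1 = same direction, -1 = opposite."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance, defined as 1 minus the cosine similarity."""
    return 1.0 - cosine_similarity(a, b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))
```

Under this definition a smaller distance means a closer match, while a larger similarity means a closer match, which is why the direction of the reported score matters when setting thresholds.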

Query by turning into retriever

You can also turn the vector store into a retriever, which is easier to use in your chains.

retriever = vector_store.as_retriever(search_kwargs={"k": 1})
retriever.invoke("kitty")

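Usage for retrieval-augmented generation

For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:

API reference

For detailed documentation of all GelVectorStore features and configurations, head to the API reference: python.langchain.com/api_reference/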
