Valkey is an open-source, high-performance key/value datastore that supports caching, message-queue, and other workloads, and can also serve as a primary database. Valkey can run as a standalone daemon or as a cluster, with options for replication and high availability.

This page shows how to use the Valkey vector store with Amazon ElastiCache for Valkey and Amazon MemoryDB for Valkey.

## Setup

Install the required dependencies:

```bash
pip install "langchain-aws[valkey]"
```

The Valkey integration requires langchain-aws>=1.5.0. If you are using an earlier version, install the dependencies directly:

```bash
pip install langchain-aws valkey-glide-sync
```

## Basic usage

### Using Bedrock embeddings

```python
from langchain_aws import BedrockEmbeddings
from langchain_aws.vectorstores import ValkeyVectorStore

# Initialize the embeddings
embeddings = BedrockEmbeddings(
    model_id="amazon.titan-embed-text-v1",
    region_name="us-east-1"
)

# Create a vector store from texts
vectorstore = ValkeyVectorStore.from_texts(
    texts=["Valkey is fast", "Valkey supports vector search"],
    embedding=embeddings,
    valkey_url="valkey://localhost:6379",
    index_name="my_index"
)

# Perform a similarity search
results = vectorstore.similarity_search("fast database", k=2)
for doc in results:
    print(doc.page_content)
```

### Using Ollama embeddings

```python
from langchain_ollama import OllamaEmbeddings
from langchain_aws.vectorstores import ValkeyVectorStore

# Initialize Ollama embeddings
embeddings = OllamaEmbeddings(
    model="nomic-embed-text",
    base_url="http://localhost:11434"
)

# Create the vector store
vectorstore = ValkeyVectorStore(
    embedding=embeddings,
    valkey_url="valkey://localhost:6379",
    index_name="my_index",
    vector_schema={
        "name": "content_vector",
        "algorithm": "FLAT",
        "dims": 768,  # nomic-embed-text dimensionality
        "distance_metric": "COSINE",
        "datatype": "FLOAT32",
    }
)

# Add texts
vectorstore.add_texts(
    texts=["Document 1", "Document 2"],
    metadatas=[{"source": "doc1"}, {"source": "doc2"}]
)

# Search
results = vectorstore.similarity_search("query", k=2)
```

## Connection URLs

ValkeyVectorStore supports several connection URL formats:
```python
# Standalone
valkey_url = "valkey://localhost:6379"

# With authentication
valkey_url = "valkey://username:password@host:6379"

# SSL/TLS
valkey_url = "valkeyss://host:6379"

# SSL/TLS with authentication
valkey_url = "valkeyss://username:password@host:6379"
```
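These URLs follow the standard `scheme://[username:password@]host:port` shape, so you can sanity-check one with Python's `urllib.parse` before handing it to the store. This is just a quick illustration of the URL structure, not part of the Valkey client API:

```python
from urllib.parse import urlparse

# Parse an authenticated TLS URL of the shape shown above.
url = urlparse("valkeyss://username:password@host:6379")

print(url.scheme)    # "valkeyss" — the scheme selects plain vs TLS transport
print(url.hostname)  # "host" — the server the client will connect to
print(url.port)      # 6379
print(url.username)  # "username"
```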

## AWS ElastiCache for Valkey

```python
from langchain_aws import BedrockEmbeddings
from langchain_aws.vectorstores import ValkeyVectorStore

embeddings = BedrockEmbeddings()

# Connect to an ElastiCache cluster
vectorstore = ValkeyVectorStore(
    embedding=embeddings,
    valkey_url="valkeyss://my-cluster.cache.amazonaws.com:6379",
    index_name="my_index"
)

# Add documents
vectorstore.add_texts(
    texts=["Document 1", "Document 2"],
    metadatas=[{"source": "doc1"}, {"source": "doc2"}]
)
```

## Metadata filtering

```python
from langchain_aws.vectorstores.valkey.filters import ValkeyTag, ValkeyNum

# Add documents with metadata
vectorstore.add_texts(
    texts=["AI article from 2024", "ML paper from 2023"],
    metadatas=[
        {"category": "ai", "year": 2024},
        {"category": "ml", "year": 2023}
    ]
)

# Search with a filter
filter_expr = (ValkeyTag("category") == "ai") & (ValkeyNum("year") >= 2024)
results = vectorstore.similarity_search(
    "artificial intelligence",
    k=5,
    filter=str(filter_expr)
)
```
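The combined expression above selects documents whose `category` tag equals `"ai"` and whose `year` is at least 2024. As a plain-Python sketch of what that predicate matches over the example metadata (for intuition only; the real filtering happens server-side in the Valkey index):

```python
# Illustration of the filter semantics: tag equality AND numeric range.
metadatas = [
    {"category": "ai", "year": 2024},
    {"category": "ml", "year": 2023},
]

def matches(meta):
    # Equivalent to (ValkeyTag("category") == "ai") & (ValkeyNum("year") >= 2024)
    return meta["category"] == "ai" and meta["year"] >= 2024

print([m for m in metadatas if matches(m)])  # only the 2024 "ai" document
```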

## Custom vector schema

```python
from langchain_aws.vectorstores import ValkeyVectorStore

vectorstore = ValkeyVectorStore(
    embedding=embeddings,
    valkey_url="valkey://localhost:6379",
    index_name="my_index",
    vector_schema={
        "name": "content_vector",
        "algorithm": "HNSW",  # or "FLAT"
        "dims": 1536,
        "distance_metric": "COSINE",  # or "L2", "IP"
        "datatype": "FLOAT32",
    }
)
```
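The `distance_metric` options compare vectors differently: COSINE measures the angle between them (ignoring magnitude), L2 is Euclidean distance (sensitive to magnitude), and IP ranks by inner product. A minimal pure-Python sketch for intuition, not the index's actual implementation:

```python
import math

def cosine_distance(a, b):
    # 1 - cos(theta): 0.0 for identical direction, regardless of scale.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def l2_distance(a, b):
    # Euclidean distance: nonzero whenever magnitudes differ.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    # IP ranks by largest dot product; often paired with normalized vectors.
    return sum(x * y for x, y in zip(a, b))

a, b = [1.0, 0.0], [2.0, 0.0]
print(cosine_distance(a, b))  # 0.0 — same direction, scale ignored
print(l2_distance(a, b))      # 1.0 — magnitudes differ
print(inner_product(a, b))    # 2.0
```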

## API reference

For detailed API documentation, see ValkeyVectorStore.