This notebook covers how to load documents from Psychic. See here for more details.

Prerequisites

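  1. Follow the Quick Start section in this document
  2. Log into the Psychic dashboard and get your secret key
  3. Install the frontend React library into your web app and have a user authenticate a connection. The connection will be created using the connection ID that you specify.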

Loading documents

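Use the PsychicLoader class to load documents from a connection. Each connection has a connector ID (corresponding to the SaaS app that was connected) and a connection ID (the value you passed to the frontend library).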
# Install psychicapi and langchain-chroma if they are not already installed
!poetry run pip -q install psychicapi langchain-chroma
from langchain_community.document_loaders import PsychicLoader
from psychicapi import ConnectorId

# Create a document loader for google drive. We can also load from other connectors by setting the connector_id to the appropriate value e.g. ConnectorId.notion.value
# This loader uses our test credentials
google_drive_loader = PsychicLoader(
    api_key="7ddb61c1-8b6a-4d31-a58e-30d1c9ea480e",
    connector_id=ConnectorId.gdrive.value,
    connection_id="google-test",
)

documents = google_drive_loader.load()
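
The loader above uses hard-coded test credentials. For your own secret key, one option is to read it from the environment instead of pasting it into the notebook. The following is a minimal sketch under that assumption; the helper name `read_psychic_secret_key` and the variable name `PSYCHIC_SECRET_KEY` are hypothetical and not part of Psychic or LangChain.

```python
import os


def read_psychic_secret_key(env_var: str = "PSYCHIC_SECRET_KEY") -> str:
    """Return the secret key stored in env_var, or raise if it is unset or empty."""
    # PSYCHIC_SECRET_KEY is a hypothetical variable name chosen for this sketch.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Psychic secret key"
        )
    return key
```

The returned string could then be passed as the `api_key` argument of `PsychicLoader` in place of the test key.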

Converting the docs to embeddings

We can now convert these documents into embeddings and store them in a vector database such as Chroma.
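
As a rough picture of what a vector store does with embeddings, the toy sketch below ranks stored chunks by cosine similarity to a query vector. The three-dimensional vectors are hand-made stand-ins rather than real embeddings, and the names `cosine_similarity`, `top_k`, and `toy_store` are illustrative only. Chroma's actual indexing and distance metric may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], store: list[tuple[str, list[float]]], k: int = 1) -> list[str]:
    """Return the texts of the k stored chunks most similar to the query vector."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Hand-made (text, vector) pairs standing in for embedded document chunks.
toy_store = [
    ("chunk about connectors", [1.0, 0.0, 0.0]),
    ("chunk about pricing", [0.0, 1.0, 0.0]),
    ("chunk about embeddings", [0.0, 0.0, 1.0]),
]

# The query vector points mostly along the first axis.
best = top_k([0.9, 0.1, 0.0], toy_store, k=2)
```
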
from langchain_classic.chains import RetrievalQAWithSourcesChain
from langchain_chroma import Chroma
from langchain_openai import OpenAI, OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

embeddings = OpenAIEmbeddings()
docsearch = Chroma.from_documents(texts, embeddings)
chain = RetrievalQAWithSourcesChain.from_chain_type(
    OpenAI(temperature=0), chain_type="stuff", retriever=docsearch.as_retriever()
)
# Calling the chain object directly is deprecated; invoke() is the current entry point
chain.invoke({"question": "what is psychic?"}, return_only_outputs=True)
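
To make the `chunk_size` and `chunk_overlap` arguments used above more concrete, here is a simplified sketch of fixed-size chunking with a sliding window. It is a stand-in for illustration, not `CharacterTextSplitter`'s actual algorithm, which splits on a separator and then merges the pieces; the function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 0) -> list[str]:
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")
    # Each window starts this many characters after the previous one.
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


# With no overlap, a 2500-character string yields chunks of 1000, 1000 and 500 characters.
chunks = chunk_text("a" * 2500, chunk_size=1000, chunk_overlap=0)
```

With `chunk_overlap=0`, as in the notebook, no text is repeated between chunks; a nonzero overlap trades some duplication for context that carries across chunk boundaries.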