Training and Querying a Local Knowledge Base with the AnythingLLM API

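Introduction

With AI technology advancing rapidly, enterprises have growing requirements for data security and privacy protection. Local knowledge base systems built on Retrieval-Augmented Generation (RAG) have become a key approach to addressing the knowledge limitations and hallucinations of large models. Using Ollama and AnythingLLM as the core tools, this article explains how to train and query an enterprise local knowledge base through the API, with complete Python code examples and a look at the results, to help enterprises build a secure and efficient private knowledge management system.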

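1 Tools and Technical Background

1.1 How RAG Works

RAG combines the generative ability of a large language model (LLM) with retrieval from an external knowledge base, which significantly improves the accuracy and domain relevance of answers. Its core flow is:

Knowledge base construction: convert enterprise documents into vectors and store them in a vector database;
Retrieval augmentation: retrieve the document chunks relevant to the user's question;
Answer generation: the LLM combines the retrieved chunks to produce the final response.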
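The three steps above can be sketched in a few lines of Python. This is only a toy illustration, not how AnythingLLM implements them: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector database, and all function names and sample texts are made up for this example. The final prompt is what would be handed to the LLM.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: bag-of-words term counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(chunks):
    """Step 1, knowledge base construction: store each chunk with its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(index, question, top_n=1):
    """Step 2, retrieval: rank stored chunks by similarity to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_n]]

def build_prompt(question, context_chunks):
    """Step 3, generation input: the retrieved chunks are placed in front of the question."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical sample documents, invented for this sketch
chunks = [
    "Refunds are processed within 14 days of the return request.",
    "The office is closed on public holidays.",
    "Support tickets are answered within one business day.",
]
index = build_index(chunks)
hits = retrieve(index, "How long do refunds take?", top_n=1)
prompt = build_prompt("How long do refunds take?", hits)
print(prompt)
```

In a real deployment the `top_n` value plays the same role as the `topN` workspace setting used later in this article.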

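1.2 AnythingLLM and Ollama Overview

Ollama: an open-source tool for managing local large models. It supports one-command deployment of models such as Llama and Gemma and exposes a REST API.
AnythingLLM: an enterprise-grade knowledge base management platform. It supports document uploads in multiple formats, vector database integration, and flexible API access, and is suited to building private Q&A systems.

Combined advantages: local deployment keeps data secure, models can be switched flexibly, and the barrier to enterprise AI adoption is lowered.

Open http://localhost:3001, choose Ollama as the LLM Provider, and set the Base URL to http://host.docker.internal:11434.
Select a model that has already been pulled (for example deepseek-r1:8b).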
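Before selecting a model in the UI, it can help to confirm that Ollama actually has it installed. The sketch below assumes Ollama is reachable from the host at `http://localhost:11434` (the same port as the Base URL above) and uses Ollama's `/api/tags` endpoint, which lists local models. The helper names are made up for this example, and the payload at the bottom is hand-written so the check can be tried without a running server.

```python
import json
import urllib.request

# Assumption: Ollama's default port, reached from the host rather than from inside Docker
OLLAMA_URL = "http://localhost:11434"

def installed_models(tags_payload):
    """Return the model names found in a /api/tags style payload."""
    return [m["name"] for m in tags_payload.get("models", [])]

def fetch_tags(base_url=OLLAMA_URL, timeout=5):
    """Query a running Ollama server for its local models (not called in this offline demo)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)

def is_model_ready(name, tags_payload):
    """True if the named model appears in the payload."""
    return name in installed_models(tags_payload)

# Offline demo with a hand-written payload; replace with fetch_tags() against a live server
sample = {"models": [{"name": "deepseek-r1:8b"}, {"name": "llama3:latest"}]}
print(is_model_ready("deepseek-r1:8b", sample))
```

If the model is missing, pull it with Ollama before returning to the AnythingLLM setup page.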

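2 Knowledge Base Training in Practice

2.1 Document Upload and Embedding

1. Upload documents through the UI:

PDF, TXT, DOCX and other formats are supported; a single file can be up to 500 MB.
Example: the Tao Te Ching is loaded here as a test document.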

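If you have used Swagger before, open the API docs and they will look familiar; in that case you can skip the API call walkthrough below and explore on your own.

The docs are also available locally: http://localhost:3001/api/docs/

In the API docs, click Authorize and enter the API key you generated:

Test whether the call works:

Run the first endpoint, /v1/auth. If it returns a success response, the API key is working.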
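The same check can be scripted instead of clicked through in Swagger. This is a minimal sketch that assumes the `/v1/auth` endpoint sits under `http://localhost:3001/api` as in the docs URL above; the helper names are made up for this example, and the exact response body is not assumed here, only that a valid key produces an HTTP 200.

```python
import json
import urllib.request

# Assumption: same host and port as the local API docs
BASE_URL = "http://localhost:3001/api"

def build_headers(api_key):
    """Build request headers; the 'Bearer ' prefix and the space after it are required."""
    return {
        "accept": "application/json",
        "Authorization": f"Bearer {api_key.strip()}",
    }

def check_auth(api_key, base_url=BASE_URL, timeout=10):
    """Call /v1/auth on a running instance (not called in this offline demo).

    urlopen raises urllib.error.HTTPError for non-2xx responses such as 403.
    """
    req = urllib.request.Request(f"{base_url}/v1/auth", headers=build_headers(api_key))
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, json.load(resp)

# Offline demo: stray whitespace around a pasted key is removed
headers = build_headers("  YOUR_API_KEY ")
print(headers["Authorization"])
```

Stripping whitespace guards against a common copy-and-paste mistake when the key is taken from the settings page.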

2. Create a workspace:

import requests

# Replace with your actual API endpoint
api_url = "http://localhost:3001/api/v1/workspace/new"

# Authentication: set the request headers
headers = {
    "accept": "application/json",
    "Authorization": "Bearer YOUR_API_KEY",  # when substituting your key, keep "Bearer" and the space after it
    "Content-Type": "application/json"
}

# Data for the new workspace
workspace_data = {
    "name": "Test2",  # replace with your workspace name (the sample output below used "ddj")
    "similarityThreshold": 0.7,
    "openAiTemp": 0.7,
    "openAiHistory": 20,
    "openAiPrompt": "Custom prompt for responses",
    "queryRefusalResponse": "Custom refusal message",
    "chatMode": "chat",
    "topN": 4
}

try:
    # Send the POST request
    response = requests.post(api_url, headers=headers, json=workspace_data)

    # Check the response status code
    if response.status_code == 200:  # 200 means the workspace was created
        result = response.json()
        print("Workspace created:", result)
    else:
        print(f"Workspace creation failed, status code: {response.status_code}, error: {response.text}")
except requests.RequestException as e:
    print(f"Request error: {e}")

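If the output looks like the following, the call succeeded. If it did not, re-check the commented lines in the code above; those are the places that need particular care.

# A result like this means the call succeeded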
{
  "workspace": {
    "id": 4,
    "name": "ddj",
    "slug": "ddj",
    "vectorTag": null,
    "createdAt": "2025-02-10T16:14:58.744Z",
    "openAiTemp": 0.7,
    "openAiHistory": 20,
    "lastUpdatedAt": "2025-02-10T16:14:58.744Z",
    "openAiPrompt": "Custom prompt for responses",
    "similarityThreshold": 0.7,
    "chatProvider": null,
    "chatModel": null,
    "topN": 4,
    "chatMode": "chat",
    "pfpFilename": null,
    "agentProvider": null,
    "agentModel": null,
    "queryRefusalResponse": "Custom refusal message",
    "vectorSearchMode": "default"
  },
  "message": null
}
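The field worth keeping from this response is `slug`, because the chat endpoint used later in this article takes it in the URL path. A minimal sketch of extracting it follows; the helper name is made up for this example, and the sample dictionary is a trimmed copy of the response above.

```python
def workspace_slug(create_response):
    """Return the slug from a workspace-creation response, or raise if creation failed."""
    workspace = create_response.get("workspace")
    if not workspace:
        raise ValueError(f"workspace not created: {create_response.get('message')}")
    return workspace["slug"]

# Trimmed copy of the sample response shown above
sample = {"workspace": {"id": 4, "name": "ddj", "slug": "ddj"}, "message": None}
slug = workspace_slug(sample)
chat_url = f"http://localhost:3001/api/v1/workspace/{slug}/chat"
print(chat_url)
```

Storing the slug at creation time avoids a second lookup call when the workspace is queried.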

3. Run the embedding step (document upload):

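# Call the AnythingLLM document upload API from Python
import requests

workspace_id = "YOUR_WORKSPACE_ID"  # ID of the target workspace
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "accept": "application/json",
}
# Endpoint path as given in this article; confirm it in http://localhost:3001/api/docs/ for your version
url = f"http://localhost:3001/api/v1/workspace/{workspace_id}/document"

# Open the file in a with-block so the handle is closed after the upload
with open("product_guide.pdf", "rb") as fh:
    response = requests.post(url, headers=headers, files={"file": fh})
print(response.json())  # returns the document ID and processing status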

Key parameters:

workspace_id: ID of the target workspace (available via GET /api/v1/workspaces). The same endpoint can also be called from Swagger.

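2.2 Vector Database Management

Default database: LanceDB (no extra configuration needed).
Advanced options: Chroma, Pinecone and other databases are supported for tuning retrieval performance. A later article will cover document vectorization, which involves many details; for now the goal is to get the whole flow working end to end.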

3 API Calls and Q&A System Development

3.1 List All Workspaces and Get the Slug

# Response body

{
  "workspaces": [
    {
      "id": 1,
      "name": "Test",
      "slug": "test",
      "vectorTag": null,
      "createdAt": "2025-02-11T08:21:44.403Z",
      "openAiTemp": null,
      "openAiHistory": 20,
      "lastUpdatedAt": "2025-02-11T08:21:44.403Z",
      "openAiPrompt": null,
      "similarityThreshold": 0.25,
      "chatProvider": null,
      "chatModel": null,
      "topN": 4,
      "chatMode": "chat",
      "pfpFilename": null,
      "agentProvider": null,
      "agentModel": null,
      "queryRefusalResponse": null,
      "vectorSearchMode": "default",
      "threads": [
        {
          "user_id": null,
          "slug": "c27ac120-9239-4852-abdf-faa8d03e4b9f",
          "name": "喬·多伊將myDNAge預測的年齡是…"
        }
      ]
    },
    {
      "id": 2,
      "name": "Test2",
      "slug": "test2",
      "vectorTag": null,
      "createdAt": "2025-02-20T09:23:07.809Z",
      "openAiTemp": 0.7,
      "openAiHistory": 20,
      "lastUpdatedAt": "2025-02-20T09:23:07.809Z",
      "openAiPrompt": "Custom prompt for responses",
      "similarityThreshold": 0.7,
      "chatProvider": null,
      "chatModel": null,
      "topN": 4,
      "chatMode": "chat",
      "pfpFilename": null,
      "agentProvider": null,
      "agentModel": null,
      "queryRefusalResponse": "Custom refusal message",
      "vectorSearchMode": "default",
      "threads": []
    },
    {
      "id": 4,
      "name": "Test2",
      "slug": "test2-64726870",
      "vectorTag": null,
      "createdAt": "2025-03-18T03:06:36.982Z",
      "openAiTemp": 0.7,
      "openAiHistory": 20,
      "lastUpdatedAt": "2025-03-18T03:06:36.982Z",
      "openAiPrompt": "Custom prompt for responses",
      "similarityThreshold": 0.7,
      "chatProvider": null,
      "chatModel": null,
      "topN": 4,
      "chatMode": "chat",
      "pfpFilename": null,
      "agentProvider": null,
      "agentModel": null,
      "queryRefusalResponse": "Custom refusal message",
      "vectorSearchMode": "default",
      "threads": []
    }
  ]
}
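Note that two workspaces in this response share the name "Test2" but have different slugs ("test2" and "test2-64726870"), so looking a workspace up by name can return more than one match. A minimal sketch of that lookup follows; the helper name is made up for this example, and the sample data is trimmed from the response above.

```python
def slugs_by_name(payload, name):
    """Return every slug whose workspace name matches; names are not unique."""
    return [w["slug"] for w in payload.get("workspaces", []) if w["name"] == name]

# Trimmed copy of the response body shown above
sample = {"workspaces": [
    {"id": 1, "name": "Test", "slug": "test"},
    {"id": 2, "name": "Test2", "slug": "test2"},
    {"id": 4, "name": "Test2", "slug": "test2-64726870"},
]}
print(slugs_by_name(sample, "Test2"))
```

When several slugs come back, choose by `id` or `createdAt` rather than taking the first entry.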

3.2 Generate an API Key

Create an API key in the AnythingLLM settings page, with permissions set to Full Access.
Key format: Bearer {API_KEY}, sent in the Authorization request header.

3.3 Python Example

Chat within a workspace

import requests

def ask_anythingllm(question, workspace_name, api_key):
    # workspace_name is the workspace slug used in the URL path
    url = f"http://localhost:3001/api/v1/workspace/{workspace_name}/chat"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "accept": "application/json"
    }
    data = {
        "message": question,
        "mode": "chat"  # either "chat" or "query"
    }
    response = requests.post(url, headers=headers, json=data)
    if response.status_code == 200:
        result = response.json()
        # Keep only the final answer (drop the model's reasoning before </think>)
        answer = result['textResponse'].split('</think>')[-1].strip()
        sources = result.get('sources', [])
        return answer, sources
    else:
        return f"Error: {response.text}", []

# Example call
api_key = "your_api_key"  # replace with your own API key
workspace = "test"
question = "What is the weather like in Beijing today?"
answer, sources = ask_anythingllm(question, workspace, api_key)
print("Answer:", answer)
print("Sources:", [src['title'] for src in sources])

Output:

Answer: Sorry, I cannot provide real-time weather information. Please check your local forecast for today's weather in Beijing.
Sources: ['it.sohu.com_a_856247220_121124360.html']
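Reasoning models such as deepseek-r1 wrap their intermediate reasoning in `<think>` tags, which is why the example splits on `</think>`. A slightly more defensive variant is sketched below, along with a tolerant way to read source titles. Both helper names are made up for this example, and the sample strings are hand-written rather than real API output.

```python
import re

# Matches a complete <think>...</think> block, including newlines inside it
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)

def strip_reasoning(text):
    """Remove <think> blocks; also handles a dangling </think> with no opening tag."""
    cleaned = THINK_BLOCK.sub("", text)
    if "</think>" in cleaned:
        cleaned = cleaned.split("</think>")[-1]
    return cleaned.strip()

def source_titles(sources):
    """Read titles without raising KeyError if a source entry has no 'title'."""
    return [s.get("title", "<untitled>") for s in sources]

# Hand-written sample input
raw = "<think>The user asks about weather.</think>\n\nSorry, I cannot provide real-time weather."
print(strip_reasoning(raw))
print(source_titles([{"title": "a.html"}, {}]))
```

Text without any tags passes through unchanged, so the helper is safe to apply to models that do not emit reasoning blocks.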

3.4 Advanced Extensions

Multi-workspace isolation: create a separate knowledge base for each department.
Conversation history management: use the chatId parameter to keep context across multi-turn conversations.

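4 Optimization and Troubleshooting

4.1 Performance Tuning Suggestions

Model selection: choose a model size that fits your hardware (for example deepseek-r1:8b vs 70b).
Chunking strategy: adjust the document chunk size (default 512 tokens) to balance accuracy and speed.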
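To make the chunk-size trade-off concrete, here is a minimal sketch of fixed-size chunking with overlap. It is not AnythingLLM's splitter: whitespace-separated words are used as a rough stand-in for model tokens, and the function name and the `overlap` parameter are assumptions made for this example.

```python
def chunk_tokens(text, chunk_size=512, overlap=64):
    """Split text into chunks of at most chunk_size words, each sharing `overlap` words with the previous one."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    tokens = text.split()  # approximation: words, not real model tokens
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the text
    return chunks

# Small demo: 10 words, chunks of 4 with 1 word of overlap
text = " ".join(f"w{i}" for i in range(10))
result = chunk_tokens(text, chunk_size=4, overlap=1)
print(result)
```

Smaller chunks give more precise matches but produce more vectors to store and search; larger chunks carry more context per hit at the cost of noisier retrieval.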

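4.2 Common Issues

Symptom	Solution
API returns a 403 error	Check the API key's permissions and expiry
Document embedding fails	Confirm the file format is supported, then try uploading again
Slow responses	Tune Ollama's `num_ctx` parameter (a larger value raises context capacity but uses more memory), or switch to a smaller model
API call raises an error	Test the call in Swagger first and write the code once it works there; if it still fails, try a different AnythingLLM version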
