2025-08-27

Learning MCP from Scratch (6) | Deep Integration of MCP with Large Language Models (LLMs)

In the previous installments of this MCP series, we covered MCP's basic concepts, how it works, and its core components. In this installment, we take a deep dive into integrating the Model Context Protocol (MCP) with large language models (LLMs) to build smarter, more capable AI applications.

This article covers three core topics: local model integration (Ollama/vLLM), online model adapters (OpenAI/DeepSeek), and prompt template design, giving you a complete picture of MCP-to-LLM integration techniques.

1. MCP and LLM Integration Architecture

1.1 Architecture Overview

MCP-LLM integration typically follows a client-server architecture:

```
+----------------+      +----------------+      +----------------+
|                |      |                |      |                |
|   MCP Client   +------+   MCP Server   +------+  LLM Backend   |
| (App Layer)    |      | (Adapter Layer)|      | (Model Layer)  |
|                |      |                |      |                |
+----------------+      +----------------+      +----------------+
```

1.2 Core Component Responsibilities

  • MCP client: the main application, responsible for user interaction and request dispatch
  • MCP server: the protocol translation layer, converting MCP protocol calls into LLM API calls
  • LLM backend: the component that actually runs model inference
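The division of responsibilities above can be sketched as a minimal request flow. This is a purely illustrative toy, not the MCP SDK: the class names and the stubbed `infer` method are assumptions chosen to mirror the three layers.

```python
# Illustrative three-layer flow (hypothetical classes, not the MCP SDK).

class LLMBackend:
    """Model layer: performs the actual inference (stubbed here)."""
    def infer(self, prompt: str) -> str:
        return f"[model output for: {prompt}]"

class MCPServer:
    """Adapter layer: translates an MCP-style tool call into a backend call."""
    def __init__(self, backend: LLMBackend):
        self.backend = backend

    def call_tool(self, name: str, arguments: dict) -> str:
        if name != "generate_text":
            raise ValueError(f"unknown tool: {name}")
        return self.backend.infer(arguments["prompt"])

class MCPClient:
    """Application layer: handles user interaction and dispatches requests."""
    def __init__(self, server: MCPServer):
        self.server = server

    def ask(self, question: str) -> str:
        return self.server.call_tool("generate_text", {"prompt": question})

client = MCPClient(MCPServer(LLMBackend()))
print(client.ask("hello"))  # → [model output for: hello]
```

In the real servers below, the adapter layer is played by an MCP `Server` instance and the backend by Ollama, vLLM, or a hosted API.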

2. Local Model Integration: Ollama/vLLM + MCP

2.1 Ollama Integration

Environment Setup

First, install the required dependencies:

```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Install the Python MCP SDK and the Ollama client
pip install "mcp[sse]" ollama
```

Creating the Ollama MCP Server

```python
# ollama_mcp_server.py
import mcp.server as mcp
from mcp.server import Server
import ollama
from pydantic import BaseModel

# Create the server instance
server = Server("ollama-mcp-server")

class GenerateRequest(BaseModel):
    model: str = "llama2"
    prompt: str
    max_tokens: int = 512

@server.tool()
async def generate_text(request: GenerateRequest) -> str:
    """Generate text with Ollama."""
    try:
        response = ollama.generate(
            model=request.model,
            prompt=request.prompt,
            options={'num_predict': request.max_tokens}
        )
        return response['response']
    except Exception as e:
        return f"Error generating text: {str(e)}"

@server.list_resources()
async def list_models() -> list:
    """List the available Ollama models."""
    try:
        models = ollama.list()
        return [
            mcp.Resource(
                uri=f"ollama://{model['name']}",
                name=model['name'],
                description=f"Ollama model: {model['name']}"
            )
            for model in models['models']
        ]
    except Exception:
        return []

if __name__ == "__main__":
    # Start the server
    mcp.run(server, transport='stdio')
```

Client Configuration

```json
// mcp.client.json
{
  "mcpServers": {
    "ollama": {
      "command": "python",
      "args": ["/path/to/ollama_mcp_server.py"]
    }
  }
}
```

2.2 vLLM Integration

vLLM MCP Server Implementation

```python
# vllm_mcp_server.py
import mcp.server as mcp
from mcp.server import Server
from vllm import LLM, SamplingParams
from pydantic import BaseModel

# Global vLLM instance
vllm_engine = None

class VLLMRequest(BaseModel):
    prompt: str
    max_tokens: int = 256
    temperature: float = 0.7
    top_p: float = 0.9

def initialize_vllm(model_name: str = "facebook/opt-125m"):
    """Initialize the vLLM engine."""
    global vllm_engine
    if vllm_engine is None:
        vllm_engine = LLM(
            model=model_name,
            tensor_parallel_size=1,
            gpu_memory_utilization=0.9
        )

server = Server("vllm-mcp-server")

@server.tool()
async def vllm_generate(request: VLLMRequest) -> str:
    """Generate text with vLLM."""
    try:
        sampling_params = SamplingParams(
            temperature=request.temperature,
            top_p=request.top_p,
            max_tokens=request.max_tokens
        )
        outputs = vllm_engine.generate([request.prompt], sampling_params)
        return outputs[0].outputs[0].text
    except Exception as e:
        return f"vLLM generation failed: {str(e)}"

@server.list_resources()
async def list_vllm_models() -> list:
    """List the supported vLLM models."""
    return [
        mcp.Resource(
            uri="vllm://facebook/opt-125m",
            name="OPT-125M",
            description="Facebook OPT 125M-parameter model"
        ),
        mcp.Resource(
            uri="vllm://gpt2",
            name="GPT-2",
            description="OpenAI GPT-2 model"
        )
    ]

if __name__ == "__main__":
    # Initialize vLLM before serving
    initialize_vllm()
    mcp.run(server, transport='stdio')
```

3. Online Model Adapters: OpenAI/DeepSeek

3.1 OpenAI MCP Adapter

```python
# openai_mcp_server.py
import mcp.server as mcp
from mcp.server import Server
from openai import OpenAI
from pydantic import BaseModel
import os

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
server = Server("openai-mcp-server")

class OpenAIChatRequest(BaseModel):
    message: str
    model: str = "gpt-3.5-turbo"
    temperature: float = 0.7

@server.tool()
async def chat_completion(request: OpenAIChatRequest) -> str:
    """Run a chat completion through the OpenAI API."""
    try:
        response = client.chat.completions.create(
            model=request.model,
            messages=[{"role": "user", "content": request.message}],
            temperature=request.temperature
        )
        return response.choices[0].message.content
    except Exception as e:
        return f"OpenAI API call failed: {str(e)}"

@server.list_resources()
async def list_openai_models() -> list:
    """List the available OpenAI models."""
    return [
        mcp.Resource(
            uri="openai://gpt-3.5-turbo",
            name="GPT-3.5-Turbo",
            description="OpenAI GPT-3.5 Turbo model"
        ),
        mcp.Resource(
            uri="openai://gpt-4",
            name="GPT-4",
            description="OpenAI GPT-4 model"
        )
    ]

if __name__ == "__main__":
    mcp.run(server, transport='stdio')
```

3.2 DeepSeek MCP Adapter

```python
# deepseek_mcp_server.py
import mcp.server as mcp
from mcp.server import Server
from openai import OpenAI
from pydantic import BaseModel
import os

# The DeepSeek API is OpenAI-compatible, but uses a different base_url
client = OpenAI(
    api_key=os.getenv("DEEPSEEK_API_KEY"),
    base_url="https://api.deepseek.com/v1"
)
server = Server("deepseek-mcp-server")

class DeepSeekRequest(BaseModel):
    message: str
    model: str = "deepseek-chat"
    temperature: float = 0.7

@server.tool()
async def deepseek_chat(request: DeepSeekRequest) -> str:
    """Chat through the DeepSeek API."""
    try:
        response = client.chat.completions.create(
            model=request.model,
            messages=[{"role": "user", "content": request.message}],
            temperature=request.temperature
        )
        return response.choices[0].message.content
    except Exception as e:
        return f"DeepSeek API call failed: {str(e)}"

if __name__ == "__main__":
    mcp.run(server, transport='stdio')
```

4. Prompt Template Design: Dynamic Context Injection

4.1 Basic Template Design

```python
# prompt_templates.py
from string import Template
from datetime import datetime

class PromptTemplate:
    def __init__(self, template_str: str):
        self.template = Template(template_str)

    def render(self, **kwargs) -> str:
        """Render the template."""
        # Supply default context values
        defaults = {
            'current_time': datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            'system_role': "You are a helpful AI assistant."
        }
        defaults.update(kwargs)
        return self.template.safe_substitute(defaults)

# Templates for different scenarios
TEMPLATES = {
    "code_assistant": PromptTemplate("""
$system_role
Current time: $current_time

Please help me solve the following programming problem:
$user_query

Provide detailed code examples and explanations.
"""),
    "content_writer": PromptTemplate("""
$system_role
Current time: $current_time

Please write content based on the following requirements:
Topic: $topic
Word count: $word_count
Style: $style

Begin writing:
"""),
    "data_analyzer": PromptTemplate("""
$system_role
Current time: $current_time

Please analyze the following data:
Dataset description: $dataset_description
Analysis goal: $analysis_goal

Provide a detailed analysis:
""")
}
```

4.2 Dynamic Context Injection

```python
# context_manager.py
from typing import Any
from prompt_templates import TEMPLATES

class ContextManager:
    def __init__(self):
        self.context_stores = {}

    def add_context(self, key: str, context: Any):
        """Store a piece of context information."""
        self.context_stores[key] = context

    def get_context(self, key: str, default=None):
        """Retrieve a piece of context information."""
        return self.context_stores.get(key, default)

    def generate_prompt(self, template_name: str, user_input: str, **extra_context) -> str:
        """Build the final prompt."""
        if template_name not in TEMPLATES:
            raise ValueError(f"Unknown template: {template_name}")

        # Merge all available context
        context = {
            'user_query': user_input,
            **self.context_stores,
            **extra_context
        }
        return TEMPLATES[template_name].render(**context)

# Usage example
context_manager = ContextManager()
context_manager.add_context("user_level", "advanced")
context_manager.add_context("preferred_language", "Python")

prompt = context_manager.generate_prompt(
    "code_assistant",
    "How do I implement quicksort?",
    complexity="high"
)
```

4.3 Multi-Turn Conversation Context Management

```python
# conversation_manager.py
from typing import List
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Message:
    role: str  # "user", "assistant", or "system"
    content: str
    timestamp: str

class ConversationManager:
    def __init__(self, max_history: int = 10):
        self.history: List[Message] = []
        self.max_history = max_history

    def add_message(self, role: str, content: str):
        """Append a message to the history."""
        message = Message(
            role=role,
            content=content,
            timestamp=datetime.now().isoformat()
        )
        self.history.append(message)

        # Trim the history to the configured length
        if len(self.history) > self.max_history:
            self.history = self.history[-self.max_history:]

    def get_conversation_context(self) -> str:
        """Serialize the conversation history."""
        context_lines = []
        for msg in self.history:
            context_lines.append(f"{msg.role}: {msg.content}")
        return "\n".join(context_lines)

    def generate_contextual_prompt(self, user_input: str, template_name: str) -> str:
        """Build a prompt that includes the conversation context."""
        from prompt_templates import TEMPLATES

        conversation_context = self.get_conversation_context()
        prompt = TEMPLATES[template_name].render(
            user_query=user_input,
            conversation_history=conversation_context,
            current_time=datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        )
        return prompt
```

5. Complete Integration Example

5.1 A Comprehensive MCP Server

```python
# comprehensive_mcp_server.py
import mcp.server as mcp
from mcp.server import Server
from pydantic import BaseModel
from typing import Optional

# Import the integration modules (defined elsewhere in this project)
from ollama_integration import OllamaIntegration
from openai_integration import OpenAIIntegration
from prompt_system import PromptSystem

server = Server("comprehensive-llm-server")

class LLMRequest(BaseModel):
    prompt: str
    model_type: str = "ollama"  # ollama, openai, deepseek
    model_name: Optional[str] = None
    max_tokens: int = 512
    temperature: float = 0.7

# Initialize the integration modules
ollama_integration = OllamaIntegration()
openai_integration = OpenAIIntegration()
prompt_system = PromptSystem()

@server.tool()
async def generate_text(request: LLMRequest) -> str:
    """Unified text-generation entry point."""
    # Enhance the user input with the prompt system
    enhanced_prompt = prompt_system.enhance_prompt(
        request.prompt,
        context=prompt_system.get_current_context()
    )

    # Dispatch to the backend selected by model_type
    if request.model_type == "ollama":
        result = await ollama_integration.generate(
            enhanced_prompt,
            request.model_name,
            request.max_tokens
        )
    elif request.model_type == "openai":
        result = await openai_integration.chat_completion(
            enhanced_prompt,
            request.model_name,
            request.temperature
        )
    else:
        return "Unsupported model type"

    # Record the exchange in the conversation history
    prompt_system.add_to_history("user", request.prompt)
    prompt_system.add_to_history("assistant", result)

    return result

@server.list_resources()
async def list_all_models() -> list:
    """List every available model."""
    ollama_models = await ollama_integration.list_models()
    openai_models = openai_integration.list_models()
    return ollama_models + openai_models

if __name__ == "__main__":
    mcp.run(server, transport='stdio')
```

5.2 Client Usage Example

```python
# client_example.py
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Connect to the MCP server over stdio
    server_params = StdioServerParameters(
        command="python",
        args=["comprehensive_mcp_server.py"]
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize the session
            await session.initialize()

            # List the available resources
            resources = await session.list_resources()
            print("Available models:", resources)

            # Generate text with Ollama
            response = await session.call_tool(
                "generate_text",
                {
                    "prompt": "Explain the basic concepts of machine learning",
                    "model_type": "ollama",
                    "model_name": "llama2",
                    "max_tokens": 300
                }
            )
            print("Response:", response)

if __name__ == "__main__":
    asyncio.run(main())
```

6. Best Practices and Optimization Tips

6.1 Performance Optimization

  1. Connection pooling: maintain connection pools for frequently used model backends
  2. Caching: cache the results of common requests
  3. Batching: support batched prompt processing to improve throughput
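The caching point can be sketched with a small LRU cache keyed on the model and prompt. This is an illustrative design, not part of MCP; the class and size limit are assumptions.

```python
# Sketch of a simple LRU cache for generation results (illustrative).
import hashlib
from collections import OrderedDict

class ResponseCache:
    """Cache generation results keyed on (model, prompt), evicting LRU entries."""
    def __init__(self, max_size: int = 128):
        self.max_size = max_size
        self._store: OrderedDict = OrderedDict()

    def _key(self, model: str, prompt: str) -> str:
        # Hash the pair so keys stay small even for long prompts
        return hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()

    def get(self, model: str, prompt: str):
        key = self._key(model, prompt)
        if key in self._store:
            self._store.move_to_end(key)  # mark as recently used
            return self._store[key]
        return None

    def put(self, model: str, prompt: str, response: str):
        key = self._key(model, prompt)
        self._store[key] = response
        self._store.move_to_end(key)
        if len(self._store) > self.max_size:
            self._store.popitem(last=False)  # evict the least recently used entry

cache = ResponseCache(max_size=2)
cache.put("llama2", "hi", "hello!")
print(cache.get("llama2", "hi"))   # hello!
print(cache.get("llama2", "bye"))  # None
```

In a real server you would check the cache at the top of `generate_text` and store the result before returning, remembering that caching only makes sense for deterministic settings (e.g. temperature 0).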

6.2 Security Considerations

  1. API key management: use environment variables or a secrets management system
  2. Input validation: strictly validate and sanitize all inputs
  3. Access control: implement role-based access control
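The input-validation point can be sketched as a checkpoint run before any request reaches a backend. The limits and the control-character filter here are assumptions for illustration; tune them to your deployment.

```python
# Sketch of request validation/sanitization (illustrative limits).
MAX_PROMPT_LEN = 4096
ALLOWED_MODEL_TYPES = {"ollama", "openai", "deepseek"}

def validate_request(prompt: str, model_type: str, max_tokens: int) -> dict:
    """Validate and normalize request fields before they reach any backend."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if len(prompt) > MAX_PROMPT_LEN:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_LEN} characters")
    if model_type not in ALLOWED_MODEL_TYPES:
        raise ValueError(f"unsupported model_type: {model_type}")
    if not (1 <= max_tokens <= 4096):
        raise ValueError("max_tokens must be between 1 and 4096")

    # Strip control characters that could corrupt logs or downstream parsers,
    # keeping newlines
    cleaned = "".join(ch for ch in prompt if ch == "\n" or ch >= " ")
    return {
        "prompt": cleaned.strip(),
        "model_type": model_type,
        "max_tokens": max_tokens,
    }

print(validate_request("  hello  ", "ollama", 100)["prompt"])  # hello
```

With pydantic already in use, the same checks could instead be expressed as field validators on `LLMRequest`.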

6.3 Monitoring and Logging

  1. Performance monitoring: track response times and resource usage
  2. Usage logging: record detailed request and response logs
  3. Error handling: implement robust error handling and retry mechanisms
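The retry-with-logging point can be sketched as a decorator wrapped around any flaky backend call. The decorator name and backoff values are illustrative assumptions.

```python
# Sketch of a retry decorator with exponential backoff and logging (illustrative).
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("mcp-llm")

def with_retries(max_attempts: int = 3, backoff: float = 0.1):
    """Retry a callable with exponential backoff, logging each failure."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            delay = backoff
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    logger.warning("attempt %d/%d failed: %s", attempt, max_attempts, e)
                    if attempt == max_attempts:
                        raise  # out of attempts: propagate the error
                    time.sleep(delay)
                    delay *= 2  # exponential backoff
        return wrapper
    return decorator

# Demo: a backend call that fails twice, then succeeds
calls = {"n": 0}

@with_retries(max_attempts=3, backoff=0.01)
def flaky_generate(prompt: str) -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("backend temporarily unavailable")
    return f"ok: {prompt}"

print(flaky_generate("test"))  # succeeds on the third attempt
```

An async variant using `asyncio.sleep` would fit the `async def` tool handlers shown earlier.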


Summary

This article walked through deep integration of MCP with large language models, covering integration paths for local models (Ollama/vLLM) and online models (OpenAI/DeepSeek), along with advanced techniques for prompt template design and dynamic context injection.

With the MCP protocol, we can build more modular, extensible AI application systems and switch between or combine different models seamlessly. This architecture improves system flexibility and lays a solid foundation for future extensions.

We hope this tutorial helps you successfully integrate MCP with LLMs in real projects and build more powerful, intelligent AI applications.

