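Complete Project Walkthrough: Building a Web-Interaction AI Assistant with Playwright MCP

Project Overview: Building an Intelligent Web Automation Assistant

In this tutorial we build a complete AI assistant that can actually interact with web pages. It understands natural-language instructions and carries out multi-step web operations through a Playwright MCP server. We start from scratch and cover the whole path from environment setup to deployment.

Project Goals

Build an AI assistant that can perform the following tasks:

  • Log in to websites automatically and handle authentication
  • Fill in complex forms and interactive elements
  • Extract, analyze, and structure web page data
  • Handle multi-step workflows
  • Cope with page errors and dynamic content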

I. Project Architecture

Technology Stack

  • Back-end framework: Node.js + Express
  • Browser automation: Playwright
  • AI model integration: Anthropic Claude API
  • Protocol layer: custom MCP (Model Context Protocol) server
  • Front-end interface: React + Tailwind CSS
  • Database: SQLite (for session storage)
  • Task queue: Bull (for asynchronous task processing)

System Architecture
User interface (React)
↓ (HTTP/REST API)
Back-end server (Express + AI routing)
↓ (MCP protocol)
Playwright MCP Server
↓ (browser control)
Chromium/Firefox instance

II. Environment Setup and Project Initialization

Step 1: Create the project structure

mkdir ai-web-assistant
cd ai-web-assistant
mkdir -p src/{mcp,ai,routes,models,utils} public/{css,js} tests
touch package.json server.js .env.example README.md

Step 2: Define the project dependencies
Create package.json:

{
  "name": "ai-web-assistant",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "start": "node server.js",
    "dev": "nodemon server.js",
    "test": "node --experimental-vm-modules node_modules/jest/bin/jest.js",
    "mcp:dev": "node src/mcp/server.js"
  },
  "dependencies": {
    "express": "^4.18.2",
    "cors": "^2.8.5",
    "dotenv": "^16.3.0",
    "playwright": "^1.40.0",
    "@anthropic-ai/sdk": "^0.24.0",
    "sqlite3": "^5.1.6",
    "bull": "^4.11.0",
    "express-rate-limit": "^7.1.0",
    "helmet": "^7.0.0"
  },
  "devDependencies": {
    "nodemon": "^3.0.0",
    "jest": "^29.6.0"
  }
}

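Run npm install to install the dependencies, then npx playwright install chromium to download the browser binary. The tool-use calls later in this tutorial need an @anthropic-ai/sdk release that supports the tools parameter of messages.create; if npm resolves an older release, raise the version in package.json.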

Step 3: Environment configuration
Create a .env file:

# API configuration
ANTHROPIC_API_KEY=your_anthropic_api_key_here
PORT=3000
NODE_ENV=development

# Browser configuration
BROWSER_TYPE=chromium
HEADLESS_MODE=false
BROWSER_TIMEOUT=30000

# Database configuration
DB_PATH=./data/sessions.db

# Security configuration
SESSION_SECRET=your_session_secret_here
RATE_LIMIT_WINDOW=900000
RATE_LIMIT_MAX=100
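
The browser settings in .env only take effect if something reads them and hands them to the PlaywrightMCPServer constructor defined in the next section. A minimal sketch of such a loader follows; the function name loadBrowserConfig and its fallback values are assumptions made for illustration, not part of the original project.

```javascript
// Hypothetical helper: turn the string values from .env into the
// config object accepted by the PlaywrightMCPServer constructor.
function loadBrowserConfig(env) {
  const timeout = Number.parseInt(env.BROWSER_TIMEOUT, 10);
  return {
    browserType: env.BROWSER_TYPE || 'chromium',
    // Environment variables are strings, so compare against 'false' explicitly;
    // anything else (including an unset variable) keeps the browser headless.
    headless: env.HEADLESS_MODE !== 'false',
    // Fall back to 30 seconds when the variable is missing or not a number.
    timeout: Number.isNaN(timeout) ? 30000 : timeout
  };
}

const browserConfig = loadBrowserConfig({
  BROWSER_TYPE: 'chromium',
  HEADLESS_MODE: 'false',
  BROWSER_TIMEOUT: '30000'
});
console.log(browserConfig);
```

In the application this would be called as loadBrowserConfig(process.env) after dotenv.config() has run, and the result passed to new PlaywrightMCPServer(...).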

III. Core Module Implementation

1. Playwright MCP Server implementation
Create src/mcp/server.js:

import { chromium, firefox, webkit } from 'playwright';
import { EventEmitter } from 'events';
class PlaywrightMCPServer extends EventEmitter {
constructor(config = {}) {
super();
this.config = {
browserType: config.browserType || 'chromium',
headless: config.headless !== false,
timeout: config.timeout || 30000,
...config
};
this.browser = null;
this.context = null;
this.page = null;
this.isInitialized = false;
this.sessionId = null;
}
// Initialize the browser instance
async initialize(sessionId = null) {
try {
this.sessionId = sessionId || `session_${Date.now()}`;
const browserMap = { chromium, firefox, webkit };
const BrowserClass = browserMap[this.config.browserType] || chromium;
this.browser = await BrowserClass.launch({
headless: this.config.headless,
timeout: this.config.timeout,
args: ['--no-sandbox', '--disable-dev-shm-usage']
});
this.context = await this.browser.newContext({
viewport: { width: 1280, height: 720 },
userAgent: 'AI-Web-Assistant/1.0',
acceptDownloads: true,
ignoreHTTPSErrors: true
});
// Forward uncaught page errors as events
this.context.on('page', page => {
page.on('pageerror', error => {
this.emit('pageError', { sessionId: this.sessionId, error });
});
});
this.page = await this.context.newPage();
// Set default timeouts
this.page.setDefaultTimeout(this.config.timeout);
this.page.setDefaultNavigationTimeout(this.config.timeout * 2);
this.isInitialized = true;
this.emit('initialized', { sessionId: this.sessionId });
return {
success: true,
message: 'Playwright MCP Server initialized successfully',
sessionId: this.sessionId
};
} catch (error) {
console.error('Failed to initialize Playwright:', error);
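// An 'error' event with no listener makes EventEmitter throw, so emit only when someone listens
if (this.listenerCount('error') > 0) this.emit('error', error);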
return { success: false, error: error.message };
}
}
// Tool definitions: the core of the MCP layer
getTools() {
return {
navigate: {
name: 'navigate',
description: 'Navigate to a specific URL',
parameters: {
url: {
type: 'string',
description: 'The URL to navigate to'
},
waitUntil: {
type: 'string',
description: 'When to consider navigation successful',
enum: ['load', 'domcontentloaded', 'networkidle'],
default: 'networkidle'
}
}
},
click: {
name: 'click',
description: 'Click on an element using CSS selector, XPath, or text',
parameters: {
selector: {
type: 'string',
description: 'CSS selector, XPath, or text to identify the element'
},
selectorType: {
type: 'string',
description: 'Type of selector: css, xpath, or text',
enum: ['css', 'xpath', 'text'],
default: 'css'
},
waitForNavigation: {
type: 'boolean',
description: 'Whether to wait for navigation after click',
default: false
}
}
},
fill_form: {
name: 'fill_form',
description: 'Fill a form with multiple fields',
parameters: {
fields: {
type: 'object',
description: 'Object mapping selectors to values'
}
}
},
extract_data: {
name: 'extract_data',
description: 'Extract structured data from the page',
parameters: {
schema: {
type: 'object',
description: 'Schema defining what data to extract'
}
}
},
wait_for_element: {
name: 'wait_for_element',
description: 'Wait for an element to appear',
parameters: {
selector: {
type: 'string',
description: 'CSS selector for the element'
},
state: {
type: 'string',
description: 'Element state to wait for',
enum: ['attached', 'detached', 'visible', 'hidden'],
default: 'visible'
},
timeout: {
type: 'number',
description: 'Timeout in milliseconds',
default: 10000
}
}
},
screenshot: {
name: 'screenshot',
description: 'Take a screenshot for debugging',
parameters: {
fullPage: {
type: 'boolean',
description: 'Whether to capture full page',
default: false
}
}
},
get_page_info: {
name: 'get_page_info',
description: 'Get comprehensive information about the current page'
}
};
}
// Tool execution engine
async executeTool(toolName, parameters = {}) {
if (!this.isInitialized) {
throw new Error('Playwright not initialized. Call initialize() first.');
}
try {
let result;
switch (toolName) {
case 'navigate':
result = await this.navigateToUrl(parameters.url, parameters.waitUntil);
break;
case 'click':
result = await this.clickElement(parameters.selector, parameters.selectorType, parameters.waitForNavigation);
break;
case 'fill_form':
result = await this.fillForm(parameters.fields);
break;
case 'extract_data':
result = await this.extractData(parameters.schema);
break;
case 'wait_for_element':
result = await this.waitForElement(parameters.selector, parameters.state, parameters.timeout);
break;
case 'screenshot':
result = await this.takeScreenshot(parameters.fullPage);
break;
case 'get_page_info':
result = await this.getPageInfo();
break;
default:
throw new Error(`Unknown tool: ${toolName}`);
}
this.emit('toolExecuted', {
sessionId: this.sessionId,
toolName,
parameters,
result
});
return { success: true, data: result };
} catch (error) {
console.error(`Tool execution failed: ${toolName}`, error);
this.emit('toolError', {
sessionId: this.sessionId,
toolName,
parameters,
error: error.message
});
return {
success: false,
error: error.message,
suggestion: this.getErrorSuggestion(error.message)
};
}
}
// Tool implementations
async navigateToUrl(url, waitUntil = 'networkidle') {
// Assume HTTPS when the scheme is omitted
if (!url.startsWith('http')) {
url = 'https://' + url;
}
const response = await this.page.goto(url, {
waitUntil,
timeout: this.config.timeout
});
return {
url: this.page.url(),
status: response?.status(),
title: await this.page.title(),
finalUrl: this.page.url()
};
}
async clickElement(selector, selectorType = 'css', waitForNavigation = false) {
let element;
switch (selectorType) {
case 'css':
element = this.page.locator(selector);
break;
case 'xpath':
element = this.page.locator(`xpath=${selector}`);
break;
case 'text':
element = this.page.getByText(selector, { exact: false });
break;
default:
throw new Error(`Unsupported selector type: ${selectorType}`);
}
await element.waitFor({ state: 'visible' });
if (waitForNavigation) {
// Start waiting before the click so the navigation is not missed
await Promise.all([
this.page.waitForNavigation({ waitUntil: 'networkidle' }),
element.click()
]);
} else {
await element.click();
}
return {
success: true,
element: await this.getElementInfo(element)
};
}
async fillForm(fields) {
const results = {};
for (const [selector, value] of Object.entries(fields)) {
try {
const element = this.page.locator(selector);
await element.waitFor({ state: 'visible' });
await element.fill(value);
results[selector] = { success: true, value };
} catch (error) {
results[selector] = { success: false, error: error.message };
}
}
return results;
}
async extractData(schema) {
const data = {};
for (const [key, config] of Object.entries(schema)) {
try {
const { selector, type = 'text', attribute } = config;
const element = this.page.locator(selector);
switch (type) {
case 'text':
data[key] = await element.textContent();
break;
case 'attribute':
data[key] = await element.getAttribute(attribute);
break;
case 'multiple':
data[key] = await element.allTextContents();
break;
default:
data[key] = await element.textContent();
}
} catch (error) {
data[key] = null;
}
}
return data;
}
async getElementInfo(element) {
try {
const boundingBox = await element.boundingBox();
const isVisible = await element.isVisible();
return {
visible: isVisible,
boundingBox,
tagName: await element.evaluate(el => el.tagName.toLowerCase())
};
} catch (error) {
return { error: error.message };
}
}
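// Wait for an element to reach the requested state (backs the wait_for_element tool)
async waitForElement(selector, state = 'visible', timeout = 10000) {
await this.page.locator(selector).waitFor({ state, timeout });
return { selector, state };
}
async takeScreenshot(fullPage = false) {
const screenshot = await this.page.screenshot({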
fullPage,
type: 'png'
});
return {
screenshot: screenshot.toString('base64'),
type: 'png',
fullPage
};
}
async getPageInfo() {
return {
url: this.page.url(),
title: await this.page.title(),
content: await this.page.content(),
viewport: this.page.viewportSize()
};
}
// Map common error messages to a suggested remedy
getErrorSuggestion(errorMessage) {
const suggestions = {
'timeout': 'Try a longer timeout or check the network connection',
'element not found': 'Check the selector, or wait for the element to load',
'navigation failed': 'Check that the URL is correct and the site is reachable',
'target closed': 'The browser page was closed; reinitialize the session'
};
for (const [key, suggestion] of Object.entries(suggestions)) {
if (errorMessage.toLowerCase().includes(key)) {
return suggestion;
}
}
return 'Check the network connection and page state, then retry';
}
// Release resources
async cleanup() {
try {
if (this.page) {
await this.page.close();
}
if (this.context) {
await this.context.close();
}
if (this.browser) {
await this.browser.close();
}
this.isInitialized = false;
this.emit('cleanedUp', { sessionId: this.sessionId });
return { success: true, message: 'Resources cleaned up successfully' };
} catch (error) {
console.error('Cleanup failed:', error);
return { success: false, error: error.message };
}
}
}
export default PlaywrightMCPServer;
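
getTools() above describes each tool with a flat parameters object, whereas the Anthropic Messages API expects each tool to carry a JSON Schema under input_schema. A conversion along the following lines could bridge the two. toAnthropicTools is a hypothetical helper name, and treating every parameter without a default as required is an assumption of this sketch rather than something the original code states.

```javascript
// Hypothetical helper: map the getTools() output to the tool shape
// used by the Anthropic Messages API (name, description, input_schema).
function toAnthropicTools(toolDefinitions) {
  return Object.values(toolDefinitions).map(tool => {
    const properties = tool.parameters || {};
    // Assumption: a parameter with no default value must be supplied by the model.
    const required = Object.keys(properties).filter(
      key => properties[key].default === undefined
    );
    const inputSchema = { type: 'object', properties };
    if (required.length > 0) {
      inputSchema.required = required;
    }
    return {
      name: tool.name,
      description: tool.description,
      input_schema: inputSchema
    };
  });
}

// Two definitions copied from getTools() above, trimmed for brevity.
const sampleTools = {
  navigate: {
    name: 'navigate',
    description: 'Navigate to a specific URL',
    parameters: {
      url: { type: 'string', description: 'The URL to navigate to' },
      waitUntil: {
        type: 'string',
        enum: ['load', 'domcontentloaded', 'networkidle'],
        default: 'networkidle'
      }
    }
  },
  get_page_info: {
    name: 'get_page_info',
    description: 'Get comprehensive information about the current page'
  }
};

console.log(JSON.stringify(toAnthropicTools(sampleTools), null, 2));
```

Tools that take no parameters, such as get_page_info, end up with an empty properties object, which keeps the schema valid without special-casing them.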

2. AI handler module
Create src/ai/handler.js:

import Anthropic from '@anthropic-ai/sdk';
import PlaywrightMCPServer from '../mcp/server.js';
class AIHandler {
constructor(apiKey) {
this.anthropic = new Anthropic({ apiKey });
this.mcpServer = new PlaywrightMCPServer();
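this.conversationHistory = new Map();
}
// Initialize a session
async initializeSession(sessionId) {
const result = await this.mcpServer.initialize(sessionId);
// Key the history by the ID the server settled on, which is generated when none is passed in
if (result.success && !this.conversationHistory.has(result.sessionId)) {
this.conversationHistory.set(result.sessionId, []);
}
return result;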
}
// Handle a user instruction
async processInstruction(sessionId, instruction, context = {}) {
try {
// Create the history array on first use so later pushes are kept
if (!this.conversationHistory.has(sessionId)) {
this.conversationHistory.set(sessionId, []);
}
const history = this.conversationHistory.get(sessionId);
// Build the system prompt
const systemPrompt = this.buildSystemPrompt(context);
// Convert the tool definitions to the shape the Messages API expects:
// name, description, and a JSON Schema under input_schema
const tools = Object.values(this.mcpServer.getTools()).map(tool => ({
name: tool.name,
description: tool.description,
input_schema: { type: 'object', properties: tool.parameters || {} }
}));
// Messages for this turn; tool exchanges are appended as the loop runs
const messages = [
...history,
{ role: "user", content: instruction }
];
// Call the Claude model (replace the model ID with one available to your account)
let currentMessage = await this.anthropic.messages.create({
model: "claude-3-sonnet-20240229",
max_tokens: 4096,
system: systemPrompt,
messages,
tools
});
let finalResponse = '';
// Handle tool calls until the model stops requesting them
while (currentMessage.content.some(item => item.type === 'tool_use')) {
const toolResults = [];
for (const contentItem of currentMessage.content) {
if (contentItem.type === 'tool_use') {
// Execute the tool
const toolResult = await this.mcpServer.executeTool(contentItem.name, contentItem.input);
toolResults.push({
type: 'tool_result',
tool_use_id: contentItem.id,
content: JSON.stringify(toolResult)
});
}
}
// Continue the conversation, keeping every earlier tool exchange
messages.push(
{ role: "assistant", content: currentMessage.content },
{ role: "user", content: toolResults }
);
currentMessage = await this.anthropic.messages.create({
model: "claude-3-sonnet-20240229",
max_tokens: 4096,
system: systemPrompt,
messages,
tools
});
}
// Extract the final response
const textContent = currentMessage.content.find(item => item.type === 'text');
finalResponse = textContent ? textContent.text : 'Operation completed';
// Update the conversation history
history.push(
{ role: "user", content: instruction },
{ role: "assistant", content: currentMessage.content }
);
// Once more than 10 rounds are stored, drop the oldest two
if (history.length > 20) {
history.splice(0, 4);
}
return {
success: true,
response: finalResponse,
sessionId
};
} catch (error) {
console.error('AI processing failed:', error);
return {
success: false,
error: error.message,
sessionId
};
}
}
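// Build the system prompt
buildSystemPrompt(context) {
return `You are a professional web automation assistant that carries out web tasks through browser automation tools.
Your capabilities include:
- Navigating to a given URL
- Clicking buttons and links
- Filling in forms and input fields
- Extracting data from web pages
- Waiting for pages to load
- Handling complex interactions

Important guidelines:
1. Analyze the page structure before acting
2. Use appropriate selectors to locate elements
3. Handle errors and exceptions that may occur
4. Give clear feedback about each operation
5. Break complex tasks into multiple steps

Current context: ${JSON.stringify(context)}
Proceed carefully and make sure each step is carried out correctly. If an error occurs, analyze the cause and propose a fix.`;
}
// Get the session history
getSessionHistory(sessionId) {
return this.conversationHistory.get(sessionId) || [];
}
// Clean up a session
async cleanupSession(sessionId) {
this.conversationHistory.delete(sessionId);
return await this.mcpServer.cleanup();
}
}
export default AIHandler;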
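
The tool-calling loop in processInstruction can be exercised without network access by injecting stand-ins for the model client and the tool executor. The sketch below assumes a hypothetical runToolLoop function and adds a round limit (maxRounds, also an assumption) so that a model which keeps requesting tools cannot loop forever. The message and tool_result shapes follow the handler code above.

```javascript
// Hypothetical sketch: createMessage stands in for anthropic.messages.create and
// executeTool for mcpServer.executeTool. Both are injected so the loop runs offline.
async function runToolLoop({ createMessage, executeTool, messages, maxRounds = 10 }) {
  let reply = await createMessage(messages);
  let rounds = 0;
  while (reply.content.some(item => item.type === 'tool_use')) {
    if (rounds >= maxRounds) {
      throw new Error(`Tool loop exceeded ${maxRounds} rounds`);
    }
    rounds += 1;
    const toolResults = [];
    for (const item of reply.content) {
      if (item.type !== 'tool_use') continue;
      const result = await executeTool(item.name, item.input);
      toolResults.push({
        type: 'tool_result',
        tool_use_id: item.id,
        content: JSON.stringify(result)
      });
    }
    // Append both sides of the exchange so later rounds see earlier results.
    messages.push(
      { role: 'assistant', content: reply.content },
      { role: 'user', content: toolResults }
    );
    reply = await createMessage(messages);
  }
  const text = reply.content.find(item => item.type === 'text');
  return { text: text ? text.text : '', rounds, messages };
}

// Stub model: requests one navigation, then answers in plain text.
function makeStubModel() {
  let calls = 0;
  return async () => {
    calls += 1;
    if (calls === 1) {
      return {
        content: [
          { type: 'tool_use', id: 'call_1', name: 'navigate', input: { url: 'https://example.com' } }
        ]
      };
    }
    return { content: [{ type: 'text', text: 'Done' }] };
  };
}

runToolLoop({
  createMessage: makeStubModel(),
  executeTool: async (name, input) => ({ success: true, data: { name, url: input.url } }),
  messages: [{ role: 'user', content: 'Open example.com' }]
}).then(outcome => console.log(outcome.text, outcome.rounds));
```

Keeping one growing messages array is the design choice that matters here: each follow-up request then contains every earlier tool call and result, which a multi-step task needs.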

3. Express server and routes
Create server.js:

import express from 'express';
import cors from 'cors';
import helmet from 'helmet';
import rateLimit from 'express-rate-limit';
import dotenv from 'dotenv';
import AIHandler from './src/ai/handler.js';
// Load environment variables
dotenv.config();
const app = express();
const PORT = process.env.PORT || 3000;
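// Initialize the AI handler
const aiHandler = new AIHandler(process.env.ANTHROPIC_API_KEY);
// Middleware configuration
// helmet's default Content-Security-Policy blocks the Tailwind CDN script and the
// inline script in public/index.html, so it is switched off for this demo page
app.use(helmet({ contentSecurityPolicy: false }));
app.use(cors());
app.use(express.json({ limit: '10mb' }));
// Serve the front end from public/
app.use(express.static('public'));
// Rate limiting
const limiter = rateLimit({
windowMs: parseInt(process.env.RATE_LIMIT_WINDOW, 10) || 15 * 60 * 1000,
max: parseInt(process.env.RATE_LIMIT_MAX, 10) || 100,
message: 'Too many requests, please try again later'
});
app.use(limiter);
// Session store
const sessions = new Map();
// API routes
// Health check
app.get('/health', (req, res) => {
res.json({ status: 'ok', timestamp: new Date().toISOString() });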
});
// Initialize a session
app.post('/api/session/init', async (req, res) => {
try {
const sessionId = req.body.sessionId || `session_${Date.now()}_${Math.random().toString(36).slice(2, 11)}`;
const result = await aiHandler.initializeSession(sessionId);
if (result.success) {
sessions.set(sessionId, {
createdAt: new Date(),
lastActivity: new Date()
});
res.json({
success: true,
sessionId,
message: 'Session initialized'
});
} else {
res.status(500).json({
success: false,
error: result.error
});
}
} catch (error) {
console.error('Session init error:', error);
res.status(500).json({
success: false,
error: error.message
});
}
});
// Handle a user instruction
app.post('/api/instruction', async (req, res) => {
try {
const { sessionId, instruction, context = {} } = req.body;
if (!sessionId || !instruction) {
return res.status(400).json({
success: false,
error: 'Missing required parameters: sessionId and instruction'
});
}
// Update the session's last-activity time
const session = sessions.get(sessionId);
if (session) {
session.lastActivity = new Date();
}
const result = await aiHandler.processInstruction(sessionId, instruction, context);
res.json(result);
} catch (error) {
console.error('Instruction processing error:', error);
res.status(500).json({
success: false,
error: error.message
});
}
});
// Get the session history
app.get('/api/session/:sessionId/history', (req, res) => {
const { sessionId } = req.params;
const history = aiHandler.getSessionHistory(sessionId);
res.json({
success: true,
sessionId,
history
});
});
// Clean up a session
app.delete('/api/session/:sessionId', async (req, res) => {
try {
const { sessionId } = req.params;
const result = await aiHandler.cleanupSession(sessionId);
sessions.delete(sessionId);
res.json({
success: true,
sessionId,
message: 'Session cleaned up'
});
} catch (error) {
console.error('Session cleanup error:', error);
res.status(500).json({
success: false,
error: error.message
});
}
});
// Periodic cleanup of expired sessions
setInterval(() => {
const now = new Date();
const SESSION_TIMEOUT = 30 * 60 * 1000; // 30 minutes
for (const [sessionId, session] of sessions.entries()) {
if (now - session.lastActivity > SESSION_TIMEOUT) {
console.log(`Cleaning up expired session: ${sessionId}`);
aiHandler.cleanupSession(sessionId);
sessions.delete(sessionId);
}
}
}, 5 * 60 * 1000); // check every 5 minutes
// 404 handler
app.use('*', (req, res) => {
res.status(404).json({
success: false,
error: 'Endpoint not found'
});
});
// Error-handling middleware, registered last so it sees errors from every route
app.use((error, req, res, next) => {
console.error('Unhandled error:', error);
res.status(500).json({
success: false,
error: 'Internal server error'
});
});
// Start the server
app.listen(PORT, () => {
console.log(`AI Web Assistant server listening on port ${PORT}`);
console.log(`Environment: ${process.env.NODE_ENV}`);
});
export default app;
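
The expiry check inside the setInterval callback is easier to test when it is pulled out into a pure function. The following is a sketch under that assumption; findExpiredSessions is a hypothetical name, and the Map has the same sessionId to { createdAt, lastActivity } layout as the sessions store in server.js.

```javascript
// Hypothetical helper: return the IDs of sessions idle for longer than timeoutMs.
function findExpiredSessions(sessions, now, timeoutMs) {
  const expired = [];
  for (const [sessionId, session] of sessions.entries()) {
    if (now.getTime() - session.lastActivity.getTime() > timeoutMs) {
      expired.push(sessionId);
    }
  }
  return expired;
}

// One session idle for 2 minutes, one idle for 31 minutes.
const demoSessions = new Map([
  ['fresh', { lastActivity: new Date('2024-01-01T00:29:00Z') }],
  ['stale', { lastActivity: new Date('2024-01-01T00:00:00Z') }]
]);
const demoNow = new Date('2024-01-01T00:31:00Z');

console.log(findExpiredSessions(demoSessions, demoNow, 30 * 60 * 1000)); // [ 'stale' ]
```

The interval callback would then only loop over the returned IDs, calling cleanupSession and deleting each entry.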

IV. Front-End Interface

Create public/index.html:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>AI Web Automation Assistant</title>
<script src="https://cdn.tailwindcss.com"></script>
<style>
.message-user { background-color: #3b82f6; color: white; }
.message-assistant { background-color: #e5e7eb; color: #374151; }
.typing-indicator { display: inline-block; }
.typing-dot {
display: inline-block;
width: 8px; height: 8px;
background-color: #9ca3af;
border-radius: 50%;
margin: 0 2px;
animation: typing 1.4s infinite ease-in-out;
}
.typing-dot:nth-child(1) { animation-delay: -0.32s; }
.typing-dot:nth-child(2) { animation-delay: -0.16s; }
@keyframes typing {
0%, 80%, 100% { transform: scale(0); }
40% { transform: scale(1); }
}
</style>
</head>
<body class="bg-gray-100 min-h-screen">
<div class="container mx-auto px-4 py-8 max-w-4xl">

<header class="text-center mb-8">
<h1 class="text-3xl font-bold text-gray-800 mb-2">AI Web Automation Assistant</h1>
<p class="text-gray-600">Automate web operations with natural-language instructions</p>
</header>

<div class="bg-white rounded-lg shadow-lg overflow-hidden">

<div class="bg-gray-800 text-white p-4 flex justify-between items-center">
<div>

                <span id="sessionStatus" class="text-sm">Not connected</span>
            </div>
            <div class="space-x-2">
                <button id="initSession" class="bg-green-600 hover:bg-green-700 px-4 py-2 rounded text-sm">
                    Start new session
                </button>
                <button id="clearSession" class="bg-red-600 hover:bg-red-700 px-4 py-2 rounded text-sm" disabled>
                    End session
                </button>
            </div>
        </div>
        <!-- Chat area -->
        <div class="h-96 overflow-y-auto p-4 space-y-4" id="chatMessages">
            <div class="text-center text-gray-500 py-8">
                Send an instruction to start talking to the AI assistant
            </div>
        </div>
        <!-- Input area -->
        <div class="border-t p-4">
            <div class="flex space-x-2">
                <input 
                    type="text" 
                    id="instructionInput" 
                    placeholder="Type an instruction, for example: open Baidu and search for the latest AI news..." 
                    class="flex-1 border rounded-lg px-4 py-2 focus:outline-none focus:ring-2 focus:ring-blue-500"
                    disabled>
                <button 
                    id="sendButton" 
                    class="bg-blue-600 hover:bg-blue-700 text-white px-6 py-2 rounded-lg disabled:bg-gray-400 disabled:cursor-not-allowed"
                    disabled>
                    Send
                </button>
            </div>
            <div class="mt-2 text-sm text-gray-500">
                <p>Example instructions:</p>
                <div class="flex flex-wrap gap-2 mt-1">
                    <button class="example-instruction text-xs bg-gray-200 hover:bg-gray-300 px-2 py-1 rounded" data-instruction="Open the Baidu home page">Open Baidu</button>
                    <button class="example-instruction text-xs bg-gray-200 hover:bg-gray-300 px-2 py-1 rounded" data-instruction="Search for today's top news">Search news</button>
                    <button class="example-instruction text-xs bg-gray-200 hover:bg-gray-300 px-2 py-1 rounded" data-instruction="Extract all headings on the current page">Extract headings</button>
                </div>
            </div>
        </div>
    </div>
    <!-- Session information -->
    <div class="mt-4 bg-white rounded-lg shadow p-4">
        <h3 class="font-semibold mb-2">Session information</h3>
        <div class="text-sm space-y-1">
            <div>Session ID: <span id="sessionIdDisplay" class="font-mono">-</span></div>
            <div>Status: <span id="connectionStatus">Not connected</span></div>
            <div>Messages: <span id="messageCount">0</span></div>
        </div>
    </div>
</div>
<script>
    class AIAssistant {
        constructor() {
            this.sessionId = null;
            this.isConnected = false;
            this.messageCount = 0;
            this.initializeElements();
            this.attachEventListeners();
        }
        initializeElements() {
            this.sessionStatus = document.getElementById('sessionStatus');
            this.sessionIdDisplay = document.getElementById('sessionIdDisplay');
            this.connectionStatus = document.getElementById('connectionStatus');
            this.messageCountDisplay = document.getElementById('messageCount');
            this.chatMessages = document.getElementById('chatMessages');
            this.instructionInput = document.getElementById('instructionInput');
            this.sendButton = document.getElementById('sendButton');
            this.initSessionBtn = document.getElementById('initSession');
            this.clearSessionBtn = document.getElementById('clearSession');
        }
        attachEventListeners() {
            this.initSessionBtn.addEventListener('click', () => this.initializeSession());
            this.clearSessionBtn.addEventListener('click', () => this.clearSession());
            this.sendButton.addEventListener('click', () => this.sendInstruction());
            this.instructionInput.addEventListener('keypress', (e) => {
                if (e.key === 'Enter') this.sendInstruction();
            });
            // Clicking an example instruction sends it immediately
            document.querySelectorAll('.example-instruction').forEach(btn => {
                btn.addEventListener('click', (e) => {
                    this.instructionInput.value = e.target.dataset.instruction;
                    this.sendInstruction();
                });
            });
        }
        async initializeSession() {
            try {
                this.showLoading('Initializing session...');
                const response = await fetch('/api/session/init', {
                    method: 'POST',
                    headers: { 'Content-Type': 'application/json' },
                    body: JSON.stringify({})
                });
                const data = await response.json();
                if (data.success) {
                    this.sessionId = data.sessionId;
                    this.isConnected = true;
                    this.messageCount = 0;
                    this.updateUI();
                    this.addMessage('system', 'Session initialized. You can start sending instructions.');
                } else {
                    throw new Error(data.error);
                }
            } catch (error) {
                this.addMessage('error', `Initialization failed: ${error.message}`);
            } finally {
                this.hideLoading();
            }
        }
        async sendInstruction() {
            const instruction = this.instructionInput.value.trim();
            if (!instruction || !this.isConnected) return;
            // Add the user's message
            this.addMessage('user', instruction);
            this.instructionInput.value = '';
            // Show the typing indicator while waiting for the reply
            const thinkingMessage = this.addMessage('assistant', '');
            this.showTypingIndicator(thinkingMessage);
            try {
                const response = await fetch('/api/instruction', {
                    method: 'POST',
                    headers: { 'Content-Type': 'application/json' },
                    body: JSON.stringify({
                        sessionId: this.sessionId,
                        instruction: instruction
                    })
                });
                const data = await response.json();
                // Remove the typing indicator
                this.removeTypingIndicator(thinkingMessage);
                if (data.success) {
                    this.addMessage('assistant', data.response);
                } else {
                    this.addMessage('error', `Operation failed: ${data.error}`);
                }
            } catch (error) {
                this.removeTypingIndicator(thinkingMessage);
                this.addMessage('error', `Network error: ${error.message}`);
            }
        }
        async clearSession() {
            if (!this.sessionId) return;
            try {
                await fetch(`/api/session/${this.sessionId}`, {
                    method: 'DELETE'
                });
            } catch (error) {
                console.error('Failed to clean up session:', error);
            }
            this.sessionId = null;
            this.isConnected = false;
            this.messageCount = 0;
            this.updateUI();
            this.clearMessages();
            this.addMessage('system', 'Session ended. Click "Start new session" to begin again.');
        }
        addMessage(role, content) {
            this.messageCount++;
            this.messageCountDisplay.textContent = this.messageCount;
            const messageDiv = document.createElement('div');
            messageDiv.className = `p-3 rounded-lg max-w-[75%] ${
                role === 'user' ? 'message-user ml-auto' : 
                role === 'error' ? 'bg-red-100 text-red-800 border border-red-200' :
                'message-assistant'
            }`;
            if (role === 'thinking') {
                messageDiv.innerHTML = '<div class="typing-indicator"><span class="typing-dot"></span><span class="typing-dot"></span><span class="typing-dot"></span></div>';
            } else {
                messageDiv.textContent = content;
            }
            this.chatMessages.appendChild(messageDiv);
            this.chatMessages.scrollTop = this.chatMessages.scrollHeight;
            return messageDiv;
        }
        showTypingIndicator(messageElement) {
            messageElement.innerHTML = '<div class="typing-indicator"><span class="typing-dot"></span><span class="typing-dot"></span><span class="typing-dot"></span></div>';
        }
        removeTypingIndicator(messageElement) {
            // Remove the placeholder bubble entirely so no empty message is left behind
            messageElement.remove();
        }
        clearMessages() {
            this.chatMessages.innerHTML = '<div class="text-center text-gray-500 py-8">Send an instruction to start talking to the AI assistant</div>';
        }
        showLoading(message) {
            this.initSessionBtn.disabled = true;
            this.initSessionBtn.textContent = message;
        }
        hideLoading() {
            this.initSessionBtn.disabled = false;
            this.initSessionBtn.textContent = 'Start new session';
        }
        updateUI() {
            this.sessionStatus.textContent = this.isConnected ? 'Connected' : 'Not connected';
            this.sessionIdDisplay.textContent = this.sessionId || '-';
            this.connectionStatus.textContent = this.isConnected ? 'Active' : 'Not connected';
            this.connectionStatus.className = this.isConnected ? 'text-green-600' : 'text-red-600';
            this.instructionInput.disabled = !this.isConnected;
            this.sendButton.disabled = !this.isConnected;
            this.clearSessionBtn.disabled = !this.isConnected;
        }
    }
    // Start the application
    document.addEventListener('DOMContentLoaded', () => {
        new AIAssistant();
    });
</script>

</body>
</html>

V. Testing and Verification

1. Create the test script
Create tests/integration.test.js:

import AIHandler from '../src/ai/handler.js';
import dotenv from 'dotenv';
dotenv.config();

// Written for Jest, which npm test runs. Because the project uses ES modules,
// Jest has to be started in its ES-module mode (node --experimental-vm-modules).
// Every hook and test launches a browser or calls the real Anthropic API,
// so each one gets a generous timeout in milliseconds.
const TIMEOUT = 120000;

describe('AI Web Assistant Integration Tests', () => {
  let aiHandler;
  let sessionId;

  beforeEach(async () => {
    aiHandler = new AIHandler(process.env.ANTHROPIC_API_KEY);
    const initResult = await aiHandler.initializeSession();
    sessionId = initResult.sessionId;
  }, TIMEOUT);

  afterEach(async () => {
    await aiHandler.cleanupSession(sessionId);
  }, TIMEOUT);

  test('should initialize session successfully', () => {
    expect(sessionId).toBeDefined();
    expect(typeof sessionId).toBe('string');
  });

  test('should process simple navigation instruction', async () => {
    const result = await aiHandler.processInstruction(
      sessionId,
      'Please open the Baidu home page at https://www.baidu.com'
    );
    expect(result.success).toBe(true);
    expect(result.response).toBeDefined();
  }, TIMEOUT);

  test('should handle invalid instruction gracefully', async () => {
    const result = await aiHandler.processInstruction(
      sessionId,
      'Perform an operation that does not exist'
    );
    // Even a problematic instruction should produce a reasonable response
    expect(result.response).toBeDefined();
  }, TIMEOUT);
});

2. Run the tests
npm test

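VI. Deployment and Running

1. Production configuration
Create ecosystem.config.cjs for PM2 (the .cjs extension is needed because package.json sets "type": "module", while this file uses CommonJS module.exports):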

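module.exports = {
  apps: [{
    name: 'ai-web-assistant',
    script: 'server.js',
    // Sessions and the browser live in process memory, so a single instance is used;
    // cluster mode would spread requests for one session across different workers
    instances: 1,
    exec_mode: 'fork',
    env: {
      NODE_ENV: 'production',
      PORT: 3000
    },
    env_production: {
      NODE_ENV: 'production'
    }
  }]
};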

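2. Docker configuration
Create a Dockerfile:

FROM node:18-alpine
WORKDIR /app

# Install Chromium and the libraries it needs
RUN apk add --no-cache \
    chromium \
    nss \
    freetype \
    freetype-dev \
    harfbuzz \
    ca-certificates \
    ttf-freefont

# Skip Playwright's browser download and use the system Chromium instead.
# CHROMIUM_EXECUTABLE_PATH is an application-level variable (an assumption of this setup):
# pass its value to launch({ executablePath }) in src/mcp/server.js.
ENV PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1
ENV CHROMIUM_EXECUTABLE_PATH=/usr/bin/chromium-browser

# Copy package.json and install production dependencies
COPY package*.json ./
RUN npm ci --omit=dev

# Copy the source code
COPY . .

# Create and switch to a non-root user
RUN addgroup -g 1001 -S nodejs
RUN adduser -S appuser -u 1001 -G nodejs
USER appuser
EXPOSE 3000
CMD ["npm", "start"]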

3. Start the application

# Development mode
npm run dev

# Production mode
npm start

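VII. Practical Application Scenarios

Scenario 1: Automated data collection

// Instruction: collect GitHub trending projects
const instruction = `Visit the GitHub Trending page (https://github.com/trending), collect today's 5 most popular JavaScript projects, including each project's name, star count, and description, and return the result as JSON.`;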
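
For a task like this, the model would eventually call extract_data with a schema in the shape that extractData expects: each key maps to an object with a selector, an optional type (text, attribute, or multiple), and an attribute name when the type is attribute. The selectors below are placeholders invented for illustration and would have to be checked against the live page. validateSchema is likewise a hypothetical helper, shown only as one way to catch malformed entries before they reach the browser.

```javascript
// Placeholder selectors: assumptions for illustration, not verified against GitHub's markup.
const trendingSchema = {
  names: { selector: 'article h2 a', type: 'multiple' },
  descriptions: { selector: 'article p', type: 'multiple' },
  // "nth=0" narrows the match to a single element, which getAttribute needs
  firstLink: { selector: 'article h2 a >> nth=0', type: 'attribute', attribute: 'href' }
};

// Hypothetical helper: list schema entries that are likely to be mistakes.
function validateSchema(schema) {
  const allowedTypes = ['text', 'attribute', 'multiple'];
  const problems = [];
  for (const [key, config] of Object.entries(schema)) {
    const type = config.type || 'text';
    if (typeof config.selector !== 'string' || config.selector === '') {
      problems.push(`${key}: missing selector`);
    }
    if (!allowedTypes.includes(type)) {
      problems.push(`${key}: unknown type "${type}"`);
    }
    if (type === 'attribute' && !config.attribute) {
      problems.push(`${key}: type "attribute" needs an attribute name`);
    }
  }
  return problems;
}

console.log(validateSchema(trendingSchema)); // []
```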

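Scenario 2: Automated form filling

// Instruction: register a test user
const instruction = `
Open our test registration page at http://localhost:3000/register
and fill in the following information:
- Username: testuser_${Date.now()}
- Email: test${Date.now()}@example.com
- Password: TestPassword123
Then click the register button and confirm that registration succeeded.
`;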

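Scenario 3: Complex workflow

// Instruction: full e-commerce flow test
const instruction = `
Carry out the following e-commerce shopping flow:
1. Log in to the test e-commerce site
2. Search for "laptop"
3. Select the first product
4. Add it to the cart
5. Proceed to checkout
6. Fill in the test shipping information
7. Confirm the order
Report the status after each step.
`;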
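
The instruction above asks for a status report after each step. On the calling side, the same idea can be sketched as a small runner that sends steps one at a time and stops at the first failure. runWorkflow and the step strings are hypothetical; executeStep stands in for whatever sends a single instruction to the assistant and resolves to an object with a success flag, as processInstruction does.

```javascript
// Hypothetical runner: execute steps in order, record each outcome,
// and stop as soon as one step fails or throws.
async function runWorkflow(steps, executeStep) {
  const report = [];
  for (const [index, step] of steps.entries()) {
    let outcome;
    try {
      outcome = await executeStep(step);
    } catch (error) {
      outcome = { success: false, error: error.message };
    }
    const succeeded = outcome.success === true;
    report.push({ step: index + 1, description: step, success: succeeded });
    if (!succeeded) break;
  }
  return report;
}

const checkoutSteps = [
  'Log in to the test site',
  'Search for "laptop"',
  'Select the first product',
  'Add it to the cart'
];

// Stub executor for the demo: pretends the third step fails.
const stubExecute = async step =>
  step === 'Select the first product'
    ? { success: false, error: 'element not found' }
    : { success: true };

runWorkflow(checkoutSteps, stubExecute).then(report => console.log(report));
```

Splitting the flow this way trades one long model turn for several short ones, which makes it clear which step failed at the cost of more API calls.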

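Summary

In this tutorial we built a working AI web automation assistant with the following characteristics:

  • Complete architecture: from the front-end interface through the back-end service to the browser automation layer
  • Flexible MCP layer: supports a range of web operation tools
  • AI integration: uses a Claude model to interpret natural-language instructions
  • Error handling: tool failures are caught and returned together with a suggested remedy
  • Extensible design: new tools and features are easy to add

The project shows how current AI models can be combined with browser automation to produce an assistant that understands and carries out multi-step web operations. From here you can extend it with features such as visual recognition, multi-browser support, or distributed task processing.
Start building your own AI web assistant and see how far automation can take you.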
