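Building Multimodal Applications with LangChain in JavaScript: A Hands-On Chatbot Example

Introduction

As of 2025, multimodal AI applications have become a mainstream trend. This article walks you through building a multimodal chatbot with JavaScript and the LangChain framework, one that can handle text, images, and potentially voice. The examples below cover text and image input. This hands-on example is well suited to front-end developers who are getting started with AI application development.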

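Prerequisites

Requirements

Node.js 18+ (20+ recommended)
npm 9+
LangChain.js (latest version)
An OpenAI API key (or a key for another supported LLM provider)

Installing dependencies

The command below also installs @langchain/openai, @langchain/langgraph, express, and multer, since later steps import them. The examples use ES module `import` syntax, so set "type": "module" in package.json.

Code snippet

npm install langchain @langchain/core @langchain/community @langchain/openai @langchain/langgraph dotenv express multer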

Project structure

Code snippet

/multimodal-chatbot
  |- /src
    |- index.js        # main entry point
    |- config.js       # configuration
    |- utils.js        # utility functions
  |- .env              # environment variables
  |- package.json

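Building the basic chatbot

1. Initializing the LangChain environment

First, create a .env file to store the API key:

Code snippet

OPENAI_API_KEY=your_api_key_here

Then define the basic settings in config.js:

Code snippet

import 'dotenv/config';

export const config = {
  openAIApiKey: process.env.OPENAI_API_KEY,
  modelName: 'gpt-4-turbo', // a GPT-4 family model; swap in a newer model if one is available to you
};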
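
If the key is missing, the first model call fails with an authentication error that can be hard to trace back to configuration. A minimal sketch of a fail-fast check is shown below. The helper name `loadConfig`, its `env` parameter, and the `MODEL_NAME` variable are assumptions made for illustration; they are not part of dotenv or LangChain.

```javascript
// Hypothetical helper: build the config object and fail fast if the key is missing.
// `env` defaults to process.env so the function can be exercised with a plain object.
function loadConfig(env = process.env) {
  const openAIApiKey = (env.OPENAI_API_KEY || "").trim();

  // Treat the placeholder from the sample .env file as "not set".
  if (!openAIApiKey || openAIApiKey === "your_api_key_here") {
    throw new Error("OPENAI_API_KEY is not set; add it to your .env file");
  }

  return {
    openAIApiKey,
    // MODEL_NAME is an assumed, optional override; the default matches config.js above.
    modelName: env.MODEL_NAME || "gpt-4-turbo",
  };
}
```

Calling `loadConfig()` once at startup surfaces a configuration problem before any request reaches the model.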

2. Creating the basic chat chain

In index.js:

Code snippet

import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage, SystemMessage } from "@langchain/core/messages";
import { config } from "./config.js";

// Initialize the chat model
const chatModel = new ChatOpenAI({
  openAIApiKey: config.openAIApiKey,
  modelName: config.modelName,
});

// Define the system role
const systemMessage = new SystemMessage(
  "You are a friendly multimodal assistant that can handle text, images, and other kinds of input"
);

// A simple chat function: send one user message and return the reply
export async function chatWithBot(userInput) {
  const humanMessage = new HumanMessage(userInput);

  const response = await chatModel.invoke([
    systemMessage,
    humanMessage,
  ]);

  return response.content;
}

// Test the conversation
(async () => {
  const response = await chatWithBot("Hello!");
  console.log(response);
})();
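
The function above returns `response.content` directly. In LangChain.js, a message's `content` can be either a plain string or an array of content parts (for example `{ type: "text", text: "..." }`), so code that displays the reply may want to normalize it first. The sketch below shows one way to do that; `contentToText` is a hypothetical helper name, not a LangChain API.

```javascript
// Flatten a chat message's `content` into plain text.
// Strings pass through; arrays keep only their text parts; anything else becomes "".
function contentToText(content) {
  if (typeof content === "string") return content;
  if (!Array.isArray(content)) return "";

  return content
    .filter((part) => part && part.type === "text" && typeof part.text === "string")
    .map((part) => part.text)
    .join("");
}
```

With this in place, `chatWithBot` could return `contentToText(response.content)` so callers always receive a string.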

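Adding multimodal support

3. Handling image input

Next, we extend the bot so that it can accept images as well as text:

Code snippet

import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage, SystemMessage } from "@langchain/core/messages";

// ...configuration code from the previous steps...

// multimodalChatModel accepts multimodal (text + image) input
const multimodalChatModel = new ChatOpenAI({
  openAIApiKey: config.openAIApiKey,
  // The original "gpt-4-vision-preview" has since been deprecated by OpenAI;
  // "gpt-4o" is a vision-capable model. Use whichever vision model your account offers.
  modelName: "gpt-4o",
  maxTokens: 1000,
});

// Encode an image buffer as a base64 data URL (this function lives in utils.js).
// The MIME type defaults to JPEG; pass another type for PNG, GIF, and so on.
export function encodeImageToBase64(fileBuffer, mimeType = "image/jpeg") {
  return `data:${mimeType};base64,${fileBuffer.toString("base64")}`;
}

// enhancedChatWithBot handles multimodal input: { type: 'text' | 'image', content }
export async function enhancedChatWithBot(input) {
  let humanMessage;

  if (input.type === 'text') {
    humanMessage = new HumanMessage({
      content: [
        { type: "text", text: input.content },
      ],
    });
  } else if (input.type === 'image') {
    const base64Image = encodeImageToBase64(input.content);

    humanMessage = new HumanMessage({
      content: [
        {
          type: "image_url",
          image_url: base64Image,
        },
      ],
    });
  } else {
    // Without this branch, humanMessage would be undefined and invoke() would fail obscurely.
    throw new Error(`Unsupported input type: ${input.type}`);
  }

  const response = await multimodalChatModel.invoke([
    systemMessage,
    humanMessage,
  ]);

  return response.content;
}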
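
The encoding helper labels every upload as JPEG unless told otherwise, which mislabels PNG, GIF, or WebP uploads. One possible refinement is to inspect the file's leading bytes (its "magic number") before building the data URL. The sketch below is illustrative only: `detectImageMimeType` and `toImageDataUrl` are hypothetical names, and only four common formats are recognized.

```javascript
// Guess an image MIME type from the first bytes of a Buffer (magic numbers).
// Returns null when the format is not one of the four recognized here.
function detectImageMimeType(buffer) {
  const startsWith = (bytes, offset = 0) =>
    buffer.length >= offset + bytes.length &&
    bytes.every((b, i) => buffer[offset + i] === b);

  if (startsWith([0xff, 0xd8, 0xff])) return "image/jpeg";
  if (startsWith([0x89, 0x50, 0x4e, 0x47])) return "image/png";
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return "image/gif";
  // WebP: "RIFF" at offset 0 and "WEBP" at offset 8.
  if (startsWith([0x52, 0x49, 0x46, 0x46]) && startsWith([0x57, 0x45, 0x42, 0x50], 8)) {
    return "image/webp";
  }
  return null;
}

// Build a data URL whose MIME type matches the actual file contents.
function toImageDataUrl(buffer) {
  const mimeType = detectImageMimeType(buffer);
  if (!mimeType) throw new Error("Unsupported or unrecognized image format");
  return `data:${mimeType};base64,${buffer.toString("base64")}`;
}
```

When uploads arrive through multer, `req.file.mimetype` is also available, but it is supplied by the client, so checking the bytes is the more defensive choice.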

4. Web integration example

Below is a simple Express server integration. Note that importing index.js also runs any top-level test call left in that file, so remove the test call from step 2 before starting the server.

Code snippet

import express from 'express';
import multer from 'multer';
import { enhancedChatWithBot } from './index.js';

const app = express();
const upload = multer(); // no storage option: uploaded files are kept in memory as buffers
app.use(express.json());

// Text endpoint: expects a JSON body of the form { "message": "..." }
app.post('/api/chat', async (req, res) => {
  try {
    const { message } = req.body;

    if (!message) {
      return res.status(400).json({ error: 'Message must not be empty' });
    }

    const response = await enhancedChatWithBot({
      type: 'text',
      content: message,
    });

    res.json({ response });
  } catch (error) {
    console.error(error);
    res.status(500).json({ error: 'Server error' });
  }
});

// Image endpoint: expects multipart/form-data with a file field named "image"
app.post('/api/chat/image', upload.single('image'), async (req, res) => {
  try {
    if (!req.file) {
      return res.status(400).json({ error: 'Please upload an image' });
    }

    const response = await enhancedChatWithBot({
      type: 'image',
      content: req.file.buffer,
    });

    res.json({ response });
  } catch (error) {
    console.error(error);
    res.status(500).json({ error: 'Server error' });
  }
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server running at http://localhost:${PORT}`);
});
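
The endpoints above only check that a message or file is present. Because every accepted request turns into a paid model call, it may be worth rejecting oversized or unexpected input earlier. The following is a rough sketch of such checks as plain functions; the limits and the names `validateMessage` and `validateImageFile` are assumptions chosen for illustration.

```javascript
// Hypothetical limits; tune them to your own deployment.
const MAX_MESSAGE_CHARS = 4000;
const MAX_IMAGE_BYTES = 5 * 1024 * 1024;
const ALLOWED_IMAGE_TYPES = ["image/jpeg", "image/png", "image/gif", "image/webp"];

// Returns an error string, or null when the text message is acceptable.
function validateMessage(message) {
  if (typeof message !== "string" || message.trim() === "") {
    return "Message must be a non-empty string";
  }
  if (message.length > MAX_MESSAGE_CHARS) {
    return `Message exceeds ${MAX_MESSAGE_CHARS} characters`;
  }
  return null;
}

// `file` mirrors the shape multer puts on req.file: { buffer, mimetype, ... }.
// Returns an error string, or null when the upload is acceptable.
function validateImageFile(file) {
  if (!file || !file.buffer) return "Please upload an image";
  if (!ALLOWED_IMAGE_TYPES.includes(file.mimetype)) {
    return `Unsupported image type: ${file.mimetype}`;
  }
  if (file.buffer.length > MAX_IMAGE_BYTES) return "Image is too large";
  return null;
}
```

In a route handler, a non-null result would be returned to the client with status 400 before the bot is called. multer can also enforce an upload size cap itself through its `limits.fileSize` option.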

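Integrating advanced LangChain features

5. Adding memory

To let the bot remember the conversation context:

Code snippet

import { ConversationChain } from "langchain/chains";
import { BufferMemory } from "langchain/memory";

// ...code from the previous steps...

// The memory instance keeps the conversation context
const memory = new BufferMemory();

// conversationChain wires the memory into every model call
const conversationChain = new ConversationChain({
  llm: multimodalChatModel,
  memory,
});

export async function chatWithMemory(input) {
  let formattedInput;

  if (input.type === 'text') {
    formattedInput = input.content;
  } else if (input.type === 'image') {
    // The chain takes a plain string, so the image itself is not sent here;
    // only a textual note about the upload is recorded.
    formattedInput = "The user uploaded an image";
  }

  const response = await conversationChain.invoke({
    input: formattedInput,
  });

  return response.response; // ConversationChain returns its reply under the "response" key
}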
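
BufferMemory keeps every turn, so the prompt grows with each exchange and long conversations become slower and more expensive. A common remedy is to keep only the most recent turns. The sketch below shows the idea on a plain array of messages; `trimHistory` and the `{ role, content }` message shape are assumptions for illustration, not the BufferMemory internals.

```javascript
// Keep only the most recent `maxTurns` user/assistant exchanges.
// `history` is a flat array of { role: "user" | "assistant", content } objects, oldest first.
function trimHistory(history, maxTurns) {
  if (maxTurns <= 0) return [];

  // One turn is a user message plus the assistant reply.
  const trimmed = history.slice(-(maxTurns * 2));

  // Drop a leading assistant message so the window always starts with a user turn.
  return trimmed.length > 0 && trimmed[0].role === "assistant"
    ? trimmed.slice(1)
    : trimmed;
}
```

LangChain also ships a `BufferWindowMemory` class in `langchain/memory` that applies a similar sliding window through its `k` option, which may be the simpler choice when staying with ConversationChain.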

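6. Adding tool calling

To let the bot perform external actions:

Code snippet

// Note: newer releases of @langchain/langgraph favor ToolNode over ToolExecutor;
// check which one the version you have installed provides.
import { ToolExecutor } from "@langchain/langgraph/prebuilt";
import { TavilySearchResults } from "@langchain/community/tools/tavily_search";

// ...code from the previous steps...

// The tools array adds external capabilities (such as web search).
// TavilySearchResults reads its API key from the TAVILY_API_KEY environment variable.
const tools = [new TavilySearchResults()];
const toolExecutor = new ToolExecutor({ tools });

export async function chatWithTools(input) {
  // ...message formatting logic from the previous steps...

  // The control flow for tool calling is more involved; it is omitted from this simplified example...
}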
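
The body of `chatWithTools` is left open above. To give a feel for the dispatch step that such a flow contains, here is a library-free sketch: the model asks for tools by name, the application runs them, and the results go back to the model. Everything in it is hypothetical (the registry, the tool names, and `runToolCalls`); it only assumes that each requested call has the `{ id, name, args }` shape that chat models commonly emit, and it is not the article's omitted implementation.

```javascript
// Hypothetical tool registry: tool name -> async function taking an args object.
const toolRegistry = {
  get_time: async () => new Date().toISOString(),
  add: async ({ a, b }) => String(a + b),
};

// Execute each requested call in order and collect one result per call id.
// Unknown tools and thrown errors are reported as results instead of aborting the batch.
async function runToolCalls(toolCalls, registry = toolRegistry) {
  const results = [];
  for (const call of toolCalls) {
    const known = Object.prototype.hasOwnProperty.call(registry, call.name);
    if (!known) {
      results.push({ id: call.id, name: call.name, error: `Unknown tool: ${call.name}` });
      continue;
    }
    try {
      const output = await registry[call.name](call.args ?? {});
      results.push({ id: call.id, name: call.name, output });
    } catch (err) {
      results.push({ id: call.id, name: call.name, error: String(err?.message ?? err) });
    }
  }
  return results;
}
```

In a full loop, each result would be sent back to the model as a tool message tied to the same call id, and the model would be invoked again until it stops requesting tools.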

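Practical tips and caveats

Performance:
- API calls have noticeable latency; consider showing a loading state and streaming the response.
Error handling: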
Code snippet

try {
  // LangChain call goes here...
} catch (error) {
  if (error.name === 'AbortError') {
    console.log('Request was cancelled by the user');
  } else if (error.response?.status === 429) {
    console.log('API rate limit reached');
  }
  throw error; // or handle gracefully in the UI
}
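
Logging a 429 tells you what happened but does not recover from it. A common follow-up is to retry with exponential backoff. The sketch below is one way to write that; `withRetry`, its options, and the injectable `sleep` function are hypothetical names, and the status check assumes the error exposes either `error.response.status` (as in the snippet above) or `error.status`.

```javascript
// Resolve after `ms` milliseconds.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `operation`, retrying only on HTTP 429 with delays of base, 2x base, 4x base, ...
// Any other error, or a 429 after `retries` retries, is rethrown to the caller.
async function withRetry(operation, { retries = 3, baseDelayMs = 500, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const isRateLimit = error?.response?.status === 429 || error?.status === 429;
      if (!isRateLimit || attempt >= retries) throw error;
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

A call might then look like `await withRetry(() => chatModel.invoke(messages))`. ChatOpenAI also accepts a `maxRetries` option, so a manual wrapper like this is mostly useful when you want control over which errors are retried and how long to wait.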
Cost control:
Security:
Local development tip: use a mock service to reduce API calls.
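
As a rough illustration of the mock-service tip, the sketch below defines a stand-in object with the same `invoke()` and `stream()` shape that the chat model exposes, so the surrounding code can run without network calls. It also shows how a streamed reply is consumed chunk by chunk, which relates to the performance note above. `MockChatModel`, `createChatModel`, and the `USE_MOCK_LLM` flag are all hypothetical names introduced here.

```javascript
// Hypothetical stand-in for a chat model: same invoke()/stream() shape, no network calls.
class MockChatModel {
  constructor(reply = "This is a mock reply.") {
    this.reply = reply;
    this.calls = []; // records every message list received, handy for assertions
  }

  async invoke(messages) {
    this.calls.push(messages);
    return { content: this.reply };
  }

  // Yield the reply word by word to imitate token streaming.
  async *stream(messages) {
    this.calls.push(messages);
    const words = this.reply.split(" ");
    for (let i = 0; i < words.length; i++) {
      yield { content: i === 0 ? words[i] : " " + words[i] };
    }
  }
}

// Use the mock when the assumed USE_MOCK_LLM flag is "1"; otherwise build the real model.
function createChatModel(env, createRealModel) {
  return env.USE_MOCK_LLM === "1" ? new MockChatModel() : createRealModel();
}
```

Consuming a stream looks the same for the mock and for a real model: `for await (const chunk of model.stream(messages)) { process.stdout.write(chunk.content); }`.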

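Complete example

Below is the consolidated version of index.js. Text turns go through ConversationChain so that memory is applied. Image turns call the model directly, because the chain's prompt template takes a plain string and cannot carry image parts.

Code snippet

import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage, SystemMessage } from "@langchain/core/messages";
import { ConversationChain } from "langchain/chains";
import { BufferMemory } from "langchain/memory";
import { config } from "./config.js";

// Initialize models and memory.
const chatModel = new ChatOpenAI({
  openAIApiKey: config.openAIApiKey,
  modelName: config.modelName,
});

const multimodalChatModel = new ChatOpenAI({
  openAIApiKey: config.openAIApiKey,
  // The original "gpt-4-vision-preview" has since been deprecated by OpenAI;
  // "gpt-4o" is a vision-capable model.
  modelName: "gpt-4o",
  maxTokens: 1000,
});

const memory = new BufferMemory();
const conversationChain = new ConversationChain({
  llm: multimodalChatModel,
  memory,
});

const systemMsg = new SystemMessage(
  `You are a friendly multimodal assistant that can handle text, images, and other kinds of input.
Current date: ${new Date().toISOString()}`
);

export async function enhancedChatWithBot(input) {
  try {
    if (input.type === 'text') {
      // ConversationChain expects a plain string and manages memory itself.
      const resp = await conversationChain.invoke({ input: input.content });
      return resp.response;
    }

    if (input.type === 'image') {
      // The data URL must not contain spaces, or the image will not be recognized.
      const base64Img = `data:image/jpeg;base64,${input.content.toString('base64')}`;
      const humanMsg = new HumanMessage({
        content: [{ type: "image_url", image_url: base64Img }],
      });
      const resp = await multimodalChatModel.invoke([systemMsg, humanMsg]);

      // Record the exchange as text so that later turns can refer to it.
      await memory.saveContext(
        { input: "The user uploaded an image" },
        { output: String(resp.content) }
      );
      return resp.content;
    }

    throw new Error(`Unsupported input type: ${input.type}`);
  } catch (err) {
    console.error("Error in LLM invocation:", err);
    throw err;
  }
}

/* Example usage (requires: import fs from "node:fs";):
(async () => {
  const textResp = await enhancedChatWithBot({ type: 'text', content: 'Hello' });
  console.log(textResp);

  const imgBuffer = fs.readFileSync('./test.jpg');
  const imgResp = await enhancedChatWithBot({ type: 'image', content: imgBuffer });
  console.log(imgResp);
})();*/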

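Summary and outlook

In this tutorial, we implemented:
1. Basic integration of LangChain in a JavaScript environment.
2. Multimodal interaction using a GPT-4 vision model.
3. Simple API endpoints on an Express server.
4. Conversation memory, plus a starting skeleton for tool calling.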


Building Multimodal Applications with LangChain in JavaScript: A Hands-On Chatbot Example (May 2025)
