A Must-Read for Windows Developers: Building a Local AI Application from Scratch with LangChain and Ollama

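Introduction

With AI technology advancing quickly, many developers want to run models locally rather than depend on cloud services. This article walks you through building a local AI application from scratch on Windows with the LangChain framework and the Ollama tool. No expensive GPU is required: an ordinary PC that meets the requirements below is enough.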

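Preparation

Environment requirements

- Windows 10/11 (64-bit)
- Python 3.8 or later (recent LangChain releases may require a newer Python version)
- At least 8 GB of RAM (16 GB recommended)
- 20 GB of free disk space

Software to install

- Python environment
- Git (optional, for cloning the sample repository)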

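Step 1: Install Python and the required tools

Install Python
- Go to the official Python website
- Download the latest stable release (for example, 3.11.x)
- During installation, check "Add Python to PATH"
Verify the installation
Open Command Prompt (CMD) or PowerShell and enter: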
```
python --version
pip --version
```
Both commands should print a version number.
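The same check can be done from inside Python, which is handy when several interpreters are installed. This is a minimal sketch; the `python_is_supported` helper and the `MIN_VERSION` constant are illustrative names, with the minimum taken from the requirements list in this article.

```python
import sys

# Minimum version taken from the requirements list in this article
MIN_VERSION = (3, 8)

def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum

print("Python", sys.version.split()[0], "supported:", python_is_supported())
```

Run it with the same `python` command you plan to use for the rest of the tutorial, so you test the interpreter that is actually on your PATH.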

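Step 2: Install Ollama

Ollama is a tool that simplifies running large language models locally.

Download Ollama
Download the Windows version from the Ollama website.
Install and run
Double-click the downloaded installer and follow the wizard to complete the installation.

After installation, open PowerShell and run: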
```
ollama --version
```
It should print the version number.
Download a model
Ollama supports many models. Start by downloading a smaller one for testing:
```
ollama pull llama2
```
This downloads about 4 GB of model files; how long it takes depends on your network speed.
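To confirm from Python that the download succeeded, you can ask the local Ollama service which models it has. The sketch below uses only the standard library and assumes the service listens on its default address, http://localhost:11434, and that its `/api/tags` endpoint returns a JSON object with a `models` list. The function names are illustrative.

```python
import json
import urllib.request

# Default local Ollama address (assumed; change it if you configured another port)
OLLAMA_URL = "http://localhost:11434"

def model_names(tags_payload):
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [m["name"] for m in tags_payload.get("models", [])]

def list_local_models(base_url=OLLAMA_URL, timeout=5):
    """Return the names of locally available models, or None if the service is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return model_names(json.load(resp))
    except OSError:
        # URLError, refused connections, and socket timeouts are all OSError subclasses
        return None
```

Calling `list_local_models()` should return a list that includes a `llama2` entry once the pull has finished, and `None` when the Ollama service is not running.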

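Step 3: Set up a Python virtual environment

To avoid dependency conflicts, create a dedicated virtual environment:

```
python -m venv langchain_env
.\langchain_env\Scripts\activate
```

Once activated, the prompt is prefixed with (langchain_env). If PowerShell refuses to run the activation script because of its execution policy, run the same command from CMD instead.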
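If you are unsure whether the environment is really active, for example in an IDE terminal, Python can tell you. This short sketch relies on the fact that inside an environment created by `venv`, `sys.prefix` differs from `sys.base_prefix`; the `in_virtualenv` name is illustrative.

```python
import sys

def in_virtualenv():
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix

print("Virtual environment active:", in_virtualenv())
print("Interpreter:", sys.executable)
```

When the environment is active, the interpreter path printed on the second line should point inside the langchain_env folder.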

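Step 4: Install LangChain and related libraries

In the activated virtual environment, run:

```
pip install langchain langchain-community openai tiktoken python-dotenv gradio
```

- langchain: framework for building AI applications
- langchain-community: community integrations, including the Ollama wrapper imported in the code below
- openai/tiktoken: OpenAI-compatible API client and token-counting tool
- python-dotenv: environment variable management
- gradio: quick web interface creation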
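After the install finishes, it can help to confirm which versions actually landed in the environment, since LangChain's APIs change between releases. The sketch below uses the standard-library `importlib.metadata` module; the package list and the `installed_versions` helper are illustrative, and `langchain-community` is included because the code in this tutorial imports `langchain_community`.

```python
from importlib import metadata

# Distribution names used in this tutorial (an illustrative list; adjust as needed)
PACKAGES = ["langchain", "langchain-community", "openai", "tiktoken", "python-dotenv", "gradio"]

def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions().items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any line that reports NOT INSTALLED points to a package you still need to install in the active environment.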

Step 5: Write your first LangChain application

Create a file named app.py with the following content:

```
from langchain_community.llms import Ollama
from langchain_core.prompts import PromptTemplate

# Initialize the Ollama LLM instance (connects to the locally running Ollama service)
llm = Ollama(model="llama2")

# Create the prompt template
prompt = PromptTemplate(
    input_variables=["topic"],
    template="Explain in simple terms what {topic} is."
)

# Chain the prompt and the model together with the | operator
chain = prompt | llm

# Run the chain and print the result
result = chain.invoke({"topic": "quantum computing"})
print(result)
```

Run the script:

```
python app.py
```

The first run may take a while (1-5 minutes) because the model has to be loaded; later calls are much faster.
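It may help to see roughly what happens underneath: the LangChain wrapper talks to the local Ollama service over HTTP. The sketch below is not LangChain's implementation, only a standard-library illustration that assumes the default address http://localhost:11434 and Ollama's `/api/generate` endpoint with a non-streaming request. The function names are illustrative.

```python
import json
import urllib.request

def build_generate_request(prompt, model="llama2", base_url="http://localhost:11434"):
    """Build the POST request for Ollama's /api/generate endpoint."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(prompt, **kwargs):
    """Send the prompt and return the generated text from the "response" field."""
    request = build_generate_request(prompt, **kwargs)
    # Generous timeout: the first call also has to load the model into memory
    with urllib.request.urlopen(request, timeout=300) as resp:
        return json.load(resp)["response"]
```

With the Ollama service running, `print(generate("Explain in simple terms what quantum computing is."))` should produce an answer similar to the LangChain version, which is a quick way to tell whether a problem lies in Ollama or in the Python libraries.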

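Step 6: Create an interactive web interface

Use Gradio to build a simple web interface for the model.

Change app.py to:

```
import gradio as gr
from langchain_community.llms import Ollama

# Initialize the LLM (make sure the Ollama service is running)
llm = Ollama(model="llama2")

def generate_response(prompt):
    # invoke() sends the prompt to the model and returns the generated text
    response = llm.invoke(prompt)
    return response

# Create the Gradio interface
demo = gr.Interface(
    fn=generate_response,
    inputs=gr.Textbox(lines=2, placeholder="Enter your question..."),
    outputs="text",
    title="Local AI Assistant",
    description="A local AI application built with LangChain and Ollama"
)

if __name__ == "__main__":
    # 0.0.0.0 listens on all network interfaces; use 127.0.0.1 to keep the app private to this machine
    demo.launch(server_name="0.0.0.0", server_port=7860)
```

After it starts, open http://localhost:7860 to see a simple question-and-answer interface.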
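If the Ollama service is stopped, the handler raises an exception and the web page only shows a generic error. One possible way to make this friendlier is to wrap the handler, as in the sketch below. The `make_safe_handler` helper and its messages are illustrative and not part of Gradio or LangChain.

```python
def make_safe_handler(generate_fn):
    """Wrap a text-generation callable so failures become readable messages."""
    def handler(prompt):
        # Reject empty input before calling the model
        if not prompt or not prompt.strip():
            return "Please enter a question."
        try:
            return generate_fn(prompt)
        except Exception as exc:  # broad on purpose: the UI should show a message, not crash
            return f"Could not get a response ({type(exc).__name__}). Is the Ollama service running?"
    return handler
```

To use it, pass `fn=make_safe_handler(generate_response)` to `gr.Interface` instead of `fn=generate_response`.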

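Advanced feature: adding memory

Let the AI remember the conversation context. This snippet reuses the gr import and the llm object from step 6:

```
# ConversationBufferMemory is a legacy LangChain API; the import path may differ in newer releases
from langchain.memory import ConversationBufferMemory

def chat_with_memory(message, history):
    # Gradio maintains `history`; this assumes it arrives as (user, assistant) pairs

    # Rebuild the memory on every call so earlier turns are not saved twice
    memory = ConversationBufferMemory()
    for human, ai in history:
        memory.save_context({"input": human}, {"output": ai})

    # Prepend the stored conversation so the model can see earlier turns
    context = memory.load_memory_variables({})["history"]
    prompt = f"{context}\nHuman: {message}\nAI:"

    # Get the response to the current input
    return llm.invoke(prompt)

demo = gr.ChatInterface(
    fn=chat_with_memory,
    title="AI Assistant with Memory",
)

demo.launch()
```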
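Because the whole conversation is sent to the model on every turn, long chats eventually exceed the context window. A simple mitigation is to keep only the most recent turns. The sketch below is plain Python with no LangChain dependency; the `build_prompt` helper, the `max_turns` parameter, and the "Human:"/"AI:" labels are illustrative choices, and it assumes the history arrives as (user, assistant) pairs.

```python
def build_prompt(history, message, max_turns=5):
    """Format the most recent turns plus the new message as one prompt string."""
    # Keep only the last `max_turns` exchanges to bound the prompt size
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = []
    for human, ai in recent:
        lines.append(f"Human: {human}")
        lines.append(f"AI: {ai}")
    lines.append(f"Human: {message}")
    lines.append("AI:")
    return "\n".join(lines)

history = [("Hi", "Hello!"), ("What is Ollama?", "A local model runner.")]
print(build_prompt(history, "Does it need a GPU?", max_turns=1))
```

Inside the chat handler, the result of `build_prompt(history, message)` can be passed straight to the model. A smaller `max_turns` saves memory and time at the cost of the model forgetting older turns.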

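Ollama command reference

List downloaded models:
```
ollama list
```
Remove a model:
```
ollama rm <model-name>
```
Start the service:
```
ollama serve
```
Upgrade Ollama:
Ollama has no upgrade subcommand. On Windows, download and run the latest installer from the Ollama website.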

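Troubleshooting

Wrong OLLAMA_MODELS path:
- The default path on Windows is C:\Users\<username>\.ollama\models
Port conflict:
- Ollama uses port 11434 by default; make sure no other program is using it
Out of memory:
- ollama pull llama2:7b-chat (the 7B-parameter version, the smallest Llama 2 size)
Poor Chinese support:
- ollama pull qwen:7b (Alibaba's Tongyi Qianwen model, which handles Chinese well)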
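The first two items can be checked from Python. The sketch below tests whether anything is listening on the default port and shows which model directory is in effect, assuming the `OLLAMA_MODELS` environment variable overrides the default location under the user's home directory. Both helper names are illustrative.

```python
import os
import socket
from pathlib import Path

def port_in_use(port=11434, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0

def models_dir():
    """Return OLLAMA_MODELS if set, otherwise the default under the home directory."""
    return Path(os.environ.get("OLLAMA_MODELS", Path.home() / ".ollama" / "models"))

print("Port 11434 in use:", port_in_use())
print("Model directory:", models_dir(), "exists:", models_dir().exists())
```

Note that a True result for the port is expected while Ollama itself is running. It only indicates a conflict if Ollama is stopped and the port is still taken.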

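Performance tips

Use a quantized model:

```
ollama pull llama2:7b-chat-q4_0
```

q4_0 denotes the 4-bit quantized version, which uses less memory at slightly lower precision.

Close unnecessary background programs
Reduce the context length
Add this to your code:

```
llm = Ollama(model="llama2", num_ctx=2048)  # a shorter context window saves memory
```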
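To get a feel for why quantization matters, you can estimate the size of the weights alone as parameters times bits per weight divided by 8. This is only a back-of-the-envelope sketch: real model files are somewhat larger because quantized formats store extra scaling data, and running a model also needs memory for the context. The `approx_weight_gb` helper is illustrative.

```python
def approx_weight_gb(params_billion, bits_per_weight):
    """Rough size of the model weights alone, in GB (1 GB = 10**9 bytes)."""
    # (params_billion * 1e9 weights) * (bits / 8 bytes each) / 1e9 bytes per GB
    return params_billion * bits_per_weight / 8

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"7B model, {label}: ~{approx_weight_gb(7, bits):.1f} GB")
```

For a 7B model this gives about 14 GB at 16 bits versus about 3.5 GB at 4 bits, which is why a 4-bit version is a much better fit for a PC with 8 GB of RAM.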

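Summary

In this article you learned how to:
1. Set up Python and Ollama on Windows ✅
2. Use the basics of the LangChain framework ✅
3. Create a simple web interface with Gradio ✅
4. Add memory to AI conversations ✅

Next, you can try:
- [ ] Fine-tuning your own model
- [ ] Using RAG (retrieval-augmented generation) for document Q&A
- [ ] Building agents for multi-step tasks

The complete code has been uploaded to GitHub: sample repository link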