Building a Local Deployment with Swift and LocalAI: A Complete Hands-On Guide

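Introduction

As AI applications spread, running AI capabilities inside local applications matters more and more. This article walks step by step through building a locally deployed AI application with Swift and LocalAI. The approach is a good fit when you need data privacy, offline operation, or low-latency responses.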

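Prerequisites

Environment requirements

macOS (12.0 or later recommended)
Xcode 14 or later
Swift 5.7+
Python 3.8+ (for the LocalAI service)
Homebrew (the macOS package manager)
Docker with Docker Compose (the steps below start LocalAI with docker-compose)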

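Installing the required tools

First make sure your development environment is ready:

Code snippet

# Install Homebrew (if it is not installed yet)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install Python and the required dependencies
brew install python@3.10
pip3 install --upgrade pip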

Step 1: Set up the LocalAI service

LocalAI is an open-source project that lets you run an OpenAI-compatible API service on your own machine.

1.1 Install LocalAI

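Code snippet

# Clone the LocalAI repository
git clone https://github.com/go-skynet/LocalAI.git
cd LocalAI

# Download a pretrained model (ggml-gpt4all-j.bin is used as the example here)
# wget is not installed on macOS by default; install it with: brew install wget
wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j.bin

# Start the LocalAI service (default port 8080)
docker-compose up -d --pull always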

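Notes:
- The model file is large, so the download may take a while.
- Make sure Docker is installed and running.
- The first start pulls the required Docker images, so be patient.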

1.2 Verify that the service is running

Code snippet

curl http://localhost:8080/v1/models

If you see a response similar to the one below, the service is running correctly:

Code snippet

{
    "data": [
        {
            "id": "ggml-gpt4all-j",
            "object": "model"
        }
    ],
    "object": "list"
}
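
The same check can be done from Swift once the response body is in hand. The sketch below only decodes a body shaped like the sample above; `ModelList` and `modelIDs(from:)` are hypothetical names introduced for illustration, and the fields are assumed from the sample response rather than from a full schema.

```swift
import Foundation

// Hypothetical types that mirror the sample /v1/models response shown above.
struct ModelList: Decodable {
    struct Model: Decodable {
        let id: String
        let object: String
    }
    let data: [Model]
    let object: String
}

/// Returns the model ids found in a /v1/models response body.
func modelIDs(from json: Data) throws -> [String] {
    try JSONDecoder().decode(ModelList.self, from: json).data.map(\.id)
}

// The sample body from the tutorial, embedded so the sketch runs offline.
let sampleModelsBody = Data("""
{"data": [{"id": "ggml-gpt4all-j", "object": "model"}], "object": "list"}
""".utf8)
```

Checking that the id you plan to send in the request appears in this list catches a mistyped model name before the first completion call.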

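Step 2: Create the Swift project

2.1 Create a new Xcode project

Open Xcode → File → New → Project…
Choose the "App" template → Next
Enter the product name: "LocalAIDemo"
Set Interface to "SwiftUI" and Language to "Swift" (on older Xcode versions that still show a Lifecycle option, choose "SwiftUI App")
Choose where to save the project and click Create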

2.2 Add the request UI

To let the app send requests, replace the contents of ContentView.swift with the following:

Code snippet

import SwiftUI

struct ContentView: View {
    @State private var prompt: String = ""
    @State private var response: String = ""
    @State private var isLoading: Bool = false

    var body: some View {
        VStack(alignment: .leading, spacing: 20) {
            Text("LocalAI Swift Demo")
                .font(.largeTitle)
                .padding()

            TextField("Enter your prompt", text: $prompt)
                .textFieldStyle(RoundedBorderTextFieldStyle())
                .padding()

            Button(action: {
                Task {
                    await sendRequest()
                }
            }) {
                if isLoading {
                    ProgressView()
                        .progressViewStyle(CircularProgressViewStyle())
                } else {
                    Text("Send")
                        .frame(maxWidth: .infinity)
                }
            }
            .buttonStyle(.borderedProminent)
            .disabled(isLoading || prompt.isEmpty)
            .padding()

            ScrollView {
                Text(response)
                    .frame(maxWidth: .infinity, alignment: .leading)
                    .padding()
            }

            Spacer()
        }
        .padding()
    }

    func sendRequest() async {
        isLoading = true

        // The API request logic is implemented here in Step 4

        isLoading = false
    }
}

Step 3: Implement the API call

3.1 Create the API request types

First create a new file named LocalAIClient.swift:

Code snippet

import Foundation

struct LocalAIClient {
    let baseURL = URL(string: "http://localhost:8080/v1")!

    func sendPrompt(_ prompt: String) async throws -> String {
        // API endpoint path
        let endpoint = baseURL.appendingPathComponent("completions")

        // The JSON request body struct is defined below

        // URLRequest configuration

        // URLSession data request

        // JSON parsing and result handling

        return ""
    }
}

// MARK: - Request Body Structure
fileprivate struct CompletionRequest: Codable {
    let model: String
    let prompt: String
    let max_tokens: Int

    enum CodingKeys: String, CodingKey {
        case model, prompt, max_tokens = "max_tokens"
    }
}

// MARK: - Response Structure
fileprivate struct CompletionResponse: Codable {
    let choices: [Choice]

    struct Choice: Codable {
        let text: String

        enum CodingKeys: String, CodingKey { case text }
    }
}
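
To see what these Codable types put on the wire, here is a small offline round trip. It is a sketch under stated assumptions: `DemoCompletionRequest`, `DemoCompletionResponse`, `encodedJSON(for:)`, and `firstChoiceText(in:)` are hypothetical stand-ins for the fileprivate types above, and the sample response contains only the `choices[].text` field that the tutorial reads.

```swift
import Foundation

// Hypothetical stand-ins for the request and response types above.
struct DemoCompletionRequest: Codable {
    let model: String
    let prompt: String
    let maxTokens: Int

    // Map the Swift-style property name to the snake_case JSON key.
    enum CodingKeys: String, CodingKey {
        case model, prompt
        case maxTokens = "max_tokens"
    }
}

struct DemoCompletionResponse: Codable {
    struct Choice: Codable { let text: String }
    let choices: [Choice]
}

/// Encodes a request and returns the JSON text that would be sent.
func encodedJSON(for request: DemoCompletionRequest) throws -> String {
    String(decoding: try JSONEncoder().encode(request), as: UTF8.self)
}

/// Decodes a response body and returns the first completion text, if any.
func firstChoiceText(in body: Data) throws -> String? {
    try JSONDecoder().decode(DemoCompletionResponse.self, from: body).choices.first?.text
}

let demoRequest = DemoCompletionRequest(model: "ggml-gpt4all-j", prompt: "Hello", maxTokens: 64)
let demoResponseBody = Data(#"{"choices": [{"text": "Hi there"}]}"#.utf8)
```

Naming the property `maxTokens` and mapping it in `CodingKeys` keeps Swift naming conventions in the app while still producing the `max_tokens` key in the JSON body.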

3.2 Complete the API call logic

Update the sendPrompt method inside LocalAIClient:

Code snippet

func sendPrompt(_ prompt: String) async throws -> String {
    let endpoint = baseURL.appendingPathComponent("completions")

    // Prepare the request body.
    // 256 is a reasonable default for a demo. Lower it on constrained hardware;
    // raise it (for example to 300) if you have RAM to spare and want longer answers.
    let requestBody = CompletionRequest(
        model: "ggml-gpt4all-j",
        prompt: prompt,
        max_tokens: 256
    )

    var request = URLRequest(url: endpoint)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONEncoder().encode(requestBody)

    // Encoding, network, and decoding errors all propagate to the caller.
    let (data, _) = try await URLSession.shared.data(for: request)

    let decoder = JSONDecoder()
    decoder.keyDecodingStrategy = .convertFromSnakeCase
    let response = try decoder.decode(CompletionResponse.self, from: data)

    // Return the first completion, or an empty string if none came back.
    return response.choices.first?.text ?? ""
}

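Notes:
- The max_tokens parameter caps the length of the generated text; tune it to your hardware.
- LocalAI does not require an API key by default, but if authentication is configured you need to add an Authorization header.
- The async URLSession methods require iOS 15+/macOS 12+; to support earlier versions, use the completion-handler APIs instead.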
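
The Authorization note can be sketched as a small request builder. This assumes the server expects an OpenAI-style `Bearer` token, which the tutorial does not confirm; `makeJSONRequest(url:body:apiKey:)` is a hypothetical helper, not part of LocalAI or Foundation.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on non-Apple platforms
#endif

/// Builds a JSON POST request and adds a bearer token only when one is supplied.
/// Assumption: the server accepts "Authorization: Bearer <key>".
func makeJSONRequest(url: URL, body: Data, apiKey: String? = nil) -> URLRequest {
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    if let apiKey = apiKey, !apiKey.isEmpty {
        request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    }
    request.httpBody = body
    return request
}

let completionsURL = URL(string: "http://localhost:8080/v1/completions")!
let authedRequest = makeJSONRequest(url: completionsURL, body: Data(), apiKey: "secret")
let plainRequest = makeJSONRequest(url: completionsURL, body: Data())
```

Keeping the key optional means the default, unauthenticated setup keeps working without any code changes.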

Step 4: Integrate into the UI

Update the sendRequest method in ContentView.swift:

Code snippet

func sendRequest() async {
    isLoading = true

    do {
        let client = LocalAIClient()
        self.response = try await client.sendPrompt(prompt)
    } catch {
        self.response = "Error: \(error.localizedDescription)"
    }

    isLoading = false
}
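
The catch block only sees errors that are actually thrown, and sendPrompt discards the URLResponse, so a non-2xx reply from the server would surface as a decoding error at best. A minimal sketch of a status check follows; `HTTPStatusError` and `validate(_:)` are hypothetical names, and treating every non-2xx status as a failure is an assumption about how you want the app to behave.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // HTTPURLResponse lives here on non-Apple platforms
#endif

/// Hypothetical error type carrying the unexpected HTTP status code.
struct HTTPStatusError: Error, Equatable {
    let statusCode: Int
}

/// Throws when the response is an HTTP response with a non-2xx status.
/// Non-HTTP responses are passed through unchanged.
func validate(_ response: URLResponse) throws {
    guard let http = response as? HTTPURLResponse else { return }
    guard (200..<300).contains(http.statusCode) else {
        throw HTTPStatusError(statusCode: http.statusCode)
    }
}
```

Inside sendPrompt this would be used by binding the second tuple element, for example `let (data, response) = try await URLSession.shared.data(for: request)` followed by `try validate(response)`.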

Step 5: Handle network permissions

For a macOS app, add the following exception to Info.plist:

1. Right-click Info.plist → Open As → Source Code
2. Add the following:

Code snippet

<key>NSAppTransportSecurity</key>  
<dict>  
     <key>NSAllowsArbitraryLoads</key>  
     <true/>  
</dict>

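Why is this needed?
App Transport Security requires HTTPS connections by default, and our local service uses plain HTTP, so an exception is required. NSAllowsArbitraryLoads lifts the restriction for every host; for a service that only runs on localhost, the narrower NSAllowsLocalNetworking key is worth considering. If the target has App Sandbox enabled, also turn on Outgoing Connections (Client) under Signing & Capabilities, otherwise outgoing requests are blocked.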

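Step 6: Run and test

1. Build and run the app (⌘R)
2. Type a question into the text field, such as "What is Swift programming language?"
3. Click Send and wait for the response

You should see a response from the local model. The first call may take noticeably longer because the model has to be loaded into memory.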
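
Because that first call can be slow, the default 60-second request timeout of URLSession may be too tight on modest hardware. The sketch below builds a session with a longer timeout; `makeLocalAISession(timeout:)` is a hypothetical helper and 120 seconds is an arbitrary assumption, not a value recommended by LocalAI.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLSession lives here on non-Apple platforms
#endif

/// Returns a session whose per-request timeout tolerates slow model loading.
/// Assumption: 120 seconds is long enough for the first completion.
func makeLocalAISession(timeout: TimeInterval = 120) -> URLSession {
    let configuration = URLSessionConfiguration.default
    // Time allowed to pass without new data arriving before the request fails.
    configuration.timeoutIntervalForRequest = timeout
    return URLSession(configuration: configuration)
}

let localAISession = makeLocalAISession()
```

To use it, call `localAISession.data(for: request)` in place of `URLSession.shared.data(for: request)`.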

Advanced configuration

Custom models

If you want to use a different model:

1. Download the model file into the models directory:

Code snippet

# In the LocalAI directory
wget https://huggingface.co/TheBloke/Wizard-Vicuna-13B-Uncensored-GGML/resolve/main/Wizard-Vicuna-13B-Uncensored.GGML.q5_0.bin -O models/wizard-vicuna-13b.bin

2. Update the model name in the CompletionRequest:

Code snippet

let requestBody = CompletionRequest(
    model: "wizard-vicuna-13b",
    prompt: prompt,
    max_tokens: 150
)

Performance tuning

On more powerful hardware:

1. Edit docker-compose.yml in the LocalAI directory:

Code snippet

services:
  localai:
    environment:
      - THREADS=8 # Match your CPU core count
      - CONTEXT_SIZE=2048 # Larger context window for better coherence

2. Restart the service:

Code snippet

docker-compose down && docker-compose up -d

Troubleshooting

Q: Getting a "connection refused" error?

A: Make sure the LocalAI service is running:

Code snippet

docker ps # Should show localai container running
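
On the Swift side, a refused connection normally arrives as a URLError with the code cannotConnectToHost, so the app can point the user at the container instead of showing a generic message. A minimal sketch follows; `userMessage(for:)` and the message wording are assumptions made for illustration.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLError lives here on non-Apple platforms
#endif

/// Maps common networking failures to messages a user can act on.
func userMessage(for error: Error) -> String {
    guard let urlError = error as? URLError else {
        return "Error: \(error.localizedDescription)"
    }
    switch urlError.code {
    case .cannotConnectToHost:
        return "Cannot reach LocalAI on localhost:8080. Is the container running?"
    case .timedOut:
        return "The request timed out. The model may still be loading."
    default:
        return "Network error: \(urlError.localizedDescription)"
    }
}
```

In sendRequest, the catch block could then assign `userMessage(for: error)` to `response` rather than the raw localized description.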

Q: Responses are slow?

A: Try the following:

1. Use a smaller model (for example ggml-gpt4all-j rather than a larger one).
2. Reduce the max_tokens parameter in the request.

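Q: Out-of-memory errors?

A: Larger models may need 16 GB of RAM or more. Options:

1. Use smaller quantized models (look for q4 or q5 in the filename).
2. Check swap usage on macOS. macOS manages swap automatically and offers no sysctl for setting its size, so if swap usage is high the practical fix is to close other applications or switch to a smaller model:

Code snippet

sysctl vm.swapusage # Show current swap usage (read-only)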

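Summary

In this tutorial you learned:

✅ How to deploy the LocalAI service locally
✅ How to build a SwiftUI app that talks to the local AI
✅ How to implement the full API call flow with error handling
✅ Performance tuning tips and fixes for common problems

The complete sample code is available on GitHub. You can now build more sophisticated local AI applications on top of it!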