Beginner's Guide: Step-by-Step Setup of Together AI on Google Cloud Run

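Introduction

Together AI is a platform that gives developers API access to open-source AI models, which makes it quicker to build and deploy AI applications. This article walks through installing and deploying a Together AI service on Google Cloud Run, in enough detail that you can follow along without any prior cloud experience.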

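Prerequisites

Before you start, you need:

A Google Cloud account (new users receive $300 in free credits)
The Google Cloud SDK installed (installation guide below)
Basic familiarity with the command line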

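Installing the Google Cloud SDK

If you have not installed the gcloud SDK yet, run the following:

Code snippet

# For macOS users
brew install --cask google-cloud-sdk

# For Linux users
curl https://sdk.cloud.google.com | bash
exec -l $SHELL

# For Windows users, download the installer from the official site:
# https://cloud.google.com/sdk/docs/install

After installation, initialize the SDK:

Code snippet

gcloud init

Follow the prompts to sign in and select a project (if you have no project yet, gcloud init offers to create one).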

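Detailed Steps

Step 1: Create the Docker image

First, we need to create a Docker image for the Together AI service.

Code snippet

# Create a new directory and enter it
mkdir together-ai && cd together-ai

# Create the Dockerfile
touch Dockerfile

Edit the Dockerfile so that it contains: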

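Code snippet

# Use the official Python base image
FROM python:3.9-slim

# Set the working directory
WORKDIR /app

# Copy the current directory into /app in the container
COPY . /app

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Port the Flask app listens on (must match --port at deploy time)
EXPOSE 5000

# Define an environment variable
ENV NAME=TogetherAI

# Run app.py when the container starts
CMD ["python", "app.py"]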

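Create requirements.txt. The package names and versions below are the ones given in this guide; confirm them on PyPI before building, since Together AI's official Python SDK is published there as `together`:

Code snippet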

flask==2.0.1
torch==1.9.0
transformers==4.11.3
together-ai-sdk==0.1.2

Create app.py as the entry point:

Code snippet

import os  # needed for os.getenv below

from flask import Flask, request, jsonify
# Module name as given in this guide; verify it against the SDK you actually install
import together_ai_sdk as tai

app = Flask(__name__)

@app.route('/predict', methods=['POST'])
def predict():
    data = request.json

    # Initialize the Together AI client with the API key from the environment
    client = tai.Client(api_key=os.getenv('TOGETHER_API_KEY'))

    # Generate text with a Together AI model
    response = client.generate(
        prompt=data['prompt'],
        max_length=data.get('max_length', 50)
    )

    return jsonify({"result": response})

if __name__ == '__main__':
    # Debug mode is off: the Flask debugger must not be exposed on a public service
    app.run(debug=False, host='0.0.0.0', port=5000)
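The /predict handler above reads `prompt` and `max_length` straight from the request body, so a malformed request would surface as an unhandled error. As a minimal sketch of how the input could be checked first, the helper below validates the payload before it reaches the client. The function name `parse_predict_payload` and the upper bound `MAX_ALLOWED_LENGTH` are hypothetical and not part of the original app.

```python
from typing import Any, Tuple

DEFAULT_MAX_LENGTH = 50    # same default that app.py uses
MAX_ALLOWED_LENGTH = 1024  # hypothetical upper bound, adjust to your needs


def parse_predict_payload(data: Any) -> Tuple[str, int]:
    """Validate the JSON body of /predict and return (prompt, max_length)."""
    if not isinstance(data, dict):
        raise ValueError("request body must be a JSON object")

    prompt = data.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("'prompt' must be a non-empty string")

    max_length = data.get("max_length", DEFAULT_MAX_LENGTH)
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(max_length, bool) or not isinstance(max_length, int):
        raise ValueError("'max_length' must be an integer")
    if not 1 <= max_length <= MAX_ALLOWED_LENGTH:
        raise ValueError(f"'max_length' must be between 1 and {MAX_ALLOWED_LENGTH}")

    return prompt, max_length
```

In the Flask handler, one way to use it would be to call the helper inside a try/except block and return a 400 response with the error message when it raises ValueError.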

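Step 2: Build the Docker image and test it locally

Build the Docker image:

Code snippet

docker build -t together-ai .

Run it locally:

Code snippet

docker run -p 5000:5000 -e TOGETHER_API_KEY=your_api_key_here together-ai

Note: you first need to register on the Together AI website to obtain an API key.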

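Step 3: Push the image to Google Container Registry (GCR)

First, enable the Container Registry API. Note that Google has deprecated Container Registry in favor of Artifact Registry, so check the current Google Cloud documentation if the gcr.io commands below fail on your project:

Code snippet

gcloud services enable containerregistry.googleapis.com

Then build and push the image:

Code snippet

# Build the Docker image with GCR tag format [HOSTNAME]/[PROJECT-ID]/[IMAGE]:[TAG]
docker build -t gcr.io/$(gcloud config get-value project)/together-ai:v1 .

# Push the image to GCR (authentication is required the first time)
gcloud auth configure-docker && docker push gcr.io/$(gcloud config get-value project)/together-ai:v1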
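The tag format [HOSTNAME]/[PROJECT-ID]/[IMAGE]:[TAG] used above is easy to get wrong when it is assembled by hand in scripts. The following is a small sketch of how the reference could be composed and sanity-checked in Python. The function name `gcr_image_ref` is hypothetical, and the tag check follows Docker's documented tag rules (letters, digits, underscores, periods and hyphens, at most 128 characters, not starting with a period or hyphen).

```python
import re

# Docker tag rule: up to 128 chars, must not start with "." or "-"
_TAG_PATTERN = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}")


def gcr_image_ref(project_id: str, image: str, tag: str = "v1",
                  hostname: str = "gcr.io") -> str:
    """Compose an image reference in the form HOSTNAME/PROJECT-ID/IMAGE:TAG."""
    for name, value in (("hostname", hostname), ("project_id", project_id), ("image", image)):
        if not value:
            raise ValueError(f"{name} must not be empty")
    if not _TAG_PATTERN.fullmatch(tag):
        raise ValueError(f"invalid image tag: {tag!r}")
    return f"{hostname}/{project_id}/{image}:{tag}"


# "my-project" is a placeholder project ID
print(gcr_image_ref("my-project", "together-ai"))
```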

Step 4: Deploy the service on Cloud Run

Now you can deploy to Cloud Run:

Code snippet

gcloud run deploy together-ai \
--image gcr.io/$(gcloud config get-value project)/together-ai:v1 \
--platform managed \
--region us-central1 \
--allow-unauthenticated \
--set-env-vars TOGETHER_API_KEY=your_api_key_here \
--memory=4Gi \
--cpu=2 \
--port=5000 \
--timeout=300s \
--concurrency=80

Parameter notes:
– --memory and --cpu: the service needs a fair amount of resources; at least 4 GB of memory and 2 CPUs are recommended
– --timeout: AI inference requests can take a long time to complete
– --concurrency: Cloud Run's default concurrency is 80; adjust it to your needs
– --allow-unauthenticated: allows public access (set up authentication for production)
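If the deployment is driven from a script, it can help to keep the parameters above in one place and generate the command from them. The sketch below builds the argument list for `gcloud run deploy` using only flags that appear in the command above. The function name `build_deploy_command` and the project ID `my-project` are assumptions for illustration, and the API key is deliberately left out so that it does not end up in shell history.

```python
import shlex
from typing import Dict, List


def build_deploy_command(service: str, image: str, options: Dict[str, str],
                         allow_unauthenticated: bool = False) -> List[str]:
    """Return the argv list for a `gcloud run deploy` invocation."""
    cmd = ["gcloud", "run", "deploy", service, "--image", image]
    for flag, value in options.items():
        cmd.append(f"--{flag}={value}")
    if allow_unauthenticated:
        cmd.append("--allow-unauthenticated")
    return cmd


cmd = build_deploy_command(
    "together-ai",
    "gcr.io/my-project/together-ai:v1",  # placeholder project ID
    {
        "platform": "managed",
        "region": "us-central1",
        "memory": "4Gi",
        "cpu": "2",
        "port": "5000",
        "timeout": "300s",
        "concurrency": "80",
    },
    allow_unauthenticated=True,
)

# Print a copy-pasteable command; pass `cmd` to subprocess.run to execute it
print(shlex.join(cmd))
```

Building a list rather than a single string avoids shell quoting problems, including the stray whitespace after a line-continuation backslash that silently breaks multi-line commands.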

Step 5: Verify the deployment

When the deployment finishes, the service URL is displayed. You can test it like this:

Code snippet

curl -X POST \
-H "Content-Type: application/json" \
-d '{"prompt":"Explain quantum computing in simple terms", "max_length":100}' \
https://your-service-url.a.run.app/predict

It should return a response similar to this:

Code snippet

{
    "result": "Quantum computing is a type of computation that uses quantum bits or qubits..."
}
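The same check can be scripted. The following sketch uses only the Python standard library to send the request shown in the curl example. The helper names `build_predict_request` and `predict` are hypothetical, and the base URL is whatever Cloud Run printed for your service (or http://localhost:5000 for the local container).

```python
import json
import urllib.request


def build_predict_request(base_url: str, prompt: str,
                          max_length: int = 100) -> urllib.request.Request:
    """Build the POST request for the /predict endpoint."""
    body = json.dumps({"prompt": prompt, "max_length": max_length}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url: str, prompt: str, max_length: int = 100,
            timeout: float = 60.0) -> str:
    """Send the request and return the "result" field of the JSON response."""
    request = build_predict_request(base_url, prompt, max_length)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["result"]
```

For example, `predict("https://your-service-url.a.run.app", "Explain quantum computing in simple terms")` would return the generated text once the service is up.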

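Troubleshooting

Problem 1: pip install fails during the Docker build
Solution:
Make sure the package names and versions in requirements.txt are correct. You can also upgrade the packaging tools inside the image, for example by running the following in a RUN step before the dependency install:

Code snippet

pip install --upgrade pip setuptools wheel

Then rebuild the image.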

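Problem 2: The Cloud Run service times out
Solution:
If the container fails to start, first make sure it listens on the port passed to --port (5000 in this guide). Note that --timeout sets the request timeout, not the container start time. If long inference requests are being cut off, raise it:

Code snippet

gcloud run services update together-ai --timeout=600s

Also check the logs for the specific error:

Code snippet

gcloud logging read "resource.type=cloud_run_revision AND resource.labels.service_name=together-ai" --limit=50 --format json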

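Problem 3: API requests return 503 errors
Solution:
This is usually caused by instance cold starts or insufficient resources. You can:
1. Increase the memory and CPU allocation
2. Set a minimum number of instances to avoid cold starts:

Code snippet

gcloud run services update together-ai --min-instances=1

3. Check that the environment variables are set correctly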
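On the client side, occasional 503 responses during a cold start can also be absorbed by retrying with exponential backoff. This is a minimal sketch of that technique and is not something the original setup includes. The function name `call_with_retry` and its defaults are assumptions, and `send` stands for any zero-argument callable that performs the HTTP request with urllib.

```python
import time
import urllib.error
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(send: Callable[[], T], attempts: int = 4,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call send(), retrying on HTTP 503 with exponentially growing delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return send()
        except urllib.error.HTTPError as err:
            is_last_attempt = attempt == attempts - 1
            # Only 503 is treated as transient; other errors propagate at once
            if err.code != 503 or is_last_attempt:
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the backoff schedule can be tested without actually waiting.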

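Best Practices

Secret management: do not hard-code the API key in your code. Storing it in Google Secret Manager is safer:

Code snippet

echo -n "your-api-key" | gcloud secrets create together-api-key --data-file=-

# Then reference the secret in the deployment
# (the service's runtime service account needs the Secret Manager Secret Accessor role):
gcloud run services update together-ai \
--update-secrets TOGETHER_API_KEY=together-api-key:latest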
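Whether the key arrives through --set-env-vars or through Secret Manager, the application sees it as the TOGETHER_API_KEY environment variable. A small sketch of how the app could read it and fail fast with a clear message when it is missing is shown below. The function name `load_api_key` is hypothetical.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ,
                 name: str = "TOGETHER_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is not set."""
    # Strip whitespace: a secret created without `echo -n` ends in a newline
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; configure it as an environment variable "
            "or as a Secret Manager reference on the Cloud Run service"
        )
    return key
```

Calling this once at startup, rather than inside each request, turns a misconfigured deployment into an immediate and visible failure instead of an error on the first prediction.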

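Autoscaling: tune the scaling configuration to match your load:

Code snippet

# Set max instances to prevent cost overruns
gcloud run services update together-ai --max-instances=10

# Allocate CPU only while requests are being processed (the default).
# This flag controls CPU allocation and billing, not autoscaling.
gcloud run services update together-ai --cpu-throttling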

Monitoring: enable Cloud Monitoring to view performance metrics:
Code snippet
```
gcloud services enable monitoring.googleapis.com   
```

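Summary

By following the steps in this article, you have deployed a Together AI service on Google Cloud Run. Key points:

Containerizing the app with Docker is the basis of a Cloud Run deployment
GCR stores the container image in this walkthrough (Artifact Registry is Google's current recommendation)
Cloud Run provides serverless hosting and handles scaling automatically
The service needs enough compute resources to run reliably

You can now build a range of AI applications on top of this basic service. To extend it, consider adding components such as an API gateway and authentication.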