A Ruby Developer's Guide to RAG: From Beginner to Expert (May 2025)


Introduction

In today's AI-driven development landscape, Retrieval-Augmented Generation (RAG) has become an essential skill for Ruby developers. This guide walks you through building a RAG system in Ruby from scratch, so your applications can retrieve relevant information and generate high-quality, well-grounded answers.

What Is RAG?

RAG (Retrieval-Augmented Generation) combines information retrieval with text generation: it first retrieves relevant information from a knowledge base, then uses that information as context so the model can produce more accurate, better-grounded answers.
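
Before any real code, it helps to see the shape of the pipeline. Here is a deliberately naive, self-contained illustration; the keyword-overlap "retrieval" is just a stand-in, and the rest of this guide replaces each piece with a real implementation:

Code snippet
# A toy illustration of the RAG flow: retrieve first, then generate.
KNOWLEDGE = [
  "Ruby was created by Yukihiro Matsumoto.",
  "pgvector adds vector similarity search to PostgreSQL."
]

def naive_retrieve(question)
  # Stand-in for semantic search: keep documents sharing words with the question
  KNOWLEDGE.select { |doc| (doc.downcase.split & question.downcase.split).any? }
end

question = "Who created Ruby?"
context = naive_retrieve(question)
puts "Context handed to the LLM:\n#{context.join("\n")}"
# A real system would now ask an LLM to answer using only this context.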

Prerequisites

Environment Requirements

  • Ruby 3.2+
  • Bundler
  • PostgreSQL with the pgvector extension (our vector database; see the install note below)
  • An OpenAI API key (or another LLM provider)
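
Note that pgvector must be installed on the PostgreSQL server before it can be enabled with CREATE EXTENSION. The exact command depends on your platform; two common options (package names taken from the pgvector documentation, adjust versions to your setup):

Code snippet
# macOS (Homebrew)
brew install pgvector

# Debian/Ubuntu (match the package to your PostgreSQL major version)
sudo apt install postgresql-16-pgvector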

Installing the Required Gems

Code snippet
gem install pg pgvector ruby-openai pry

Or add them to your Gemfile:

Code snippet
gem 'pg'
gem 'pgvector'
gem 'ruby-openai' # loaded with require 'openai'
gem 'pry', group: :development

Step 1: Set Up the Vector Database

We will use PostgreSQL's pgvector extension to store and query vector data.

Code snippet
# database.rb
require 'pg'

def setup_database
  conn = PG.connect(dbname: 'postgres')

  begin
    conn.exec("CREATE DATABASE rag_demo")
  rescue PG::Error => e
    puts "Database already exists or error: #{e.message}"
  end

  conn.close

  # Connect to new database and setup pgvector
  conn = PG.connect(dbname: 'rag_demo')

  # Enable pgvector extension
  conn.exec("CREATE EXTENSION IF NOT EXISTS vector")

  # Create table for documents
  conn.exec(<<~SQL)
    CREATE TABLE IF NOT EXISTS documents (
      id SERIAL PRIMARY KEY,
      content TEXT,
      embedding VECTOR(1536), -- OpenAI embeddings are 1536-dimensional
      metadata JSONB,
      created_at TIMESTAMP DEFAULT NOW()
    )
  SQL

  conn.close
end

setup_database

How it works:
1. pgvector is a PostgreSQL extension that lets us store and query vector data
2. OpenAI's text-embedding-3-small model (used below) produces 1536-dimensional vectors
3. The metadata field stores extra document information such as source or author (see the query example below)
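
Because metadata is JSONB, you can filter documents with PostgreSQL's JSON operators. A quick illustrative example, separate from the main pipeline, restricting results to one source:

Code snippet
require 'pg'

conn = PG.connect(dbname: 'rag_demo')
# ->> extracts a JSONB field as text, so we can filter by source
rows = conn.exec_params(
  "SELECT id, content FROM documents WHERE metadata->>'source' = $1",
  ["wikipedia"]
)
rows.each { |r| puts "#{r['id']}: #{r['content']}" }
conn.close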

Step 2: Embed and Store Documents

Next we convert documents into vectors and store them in the database.

Code snippet
# embedding_service.rb
require 'openai' # from the ruby-openai gem
require 'pg'
require 'json'

class EmbeddingService
  def initialize(api_key)
    @client = OpenAI::Client.new(access_token: api_key)
    @conn = PG.connect(dbname: 'rag_demo')
  end

  def embed_text(text)
    response = @client.embeddings(
      parameters: {
        model: "text-embedding-3-small",
        input: text
      }
    )

    response.dig('data', 0, 'embedding')
  end

  def store_document(content, metadata = {})
    embedding = embed_text(content)

    @conn.exec_params(
      "INSERT INTO documents (content, embedding, metadata) VALUES ($1, $2, $3)",
      [content, embedding.to_json, metadata.to_json]
    )

    puts "Document stored successfully!"
  end

  def close_connection
    @conn.close if @conn && !@conn.finished?
  end
end

# Usage example:
api_key = ENV['OPENAI_API_KEY'] || 'your-api-key'
service = EmbeddingService.new(api_key)

sample_text = "Ruby is a dynamic, open source programming language with a focus on simplicity and productivity."
service.store_document(sample_text, { source: "ruby-lang.org", type: "description" })

service.close_connection

Notes:
1. API calls cost money, so test with a small dataset in development
2. The OpenAI API is rate-limited; process large document sets in batches (see the sketch below)
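
For point 2, one simple approach is to send documents in batches (the embeddings endpoint accepts an array of inputs) and back off when a request fails. A minimal sketch that could be added to EmbeddingService; the batch size and retry policy are illustrative assumptions, not tuned values:

Code snippet
# Returns one embedding per input text, requesting them in batches.
def embed_in_batches(texts, batch_size: 100)
  texts.each_slice(batch_size).flat_map do |batch|
    attempts = 0
    begin
      response = @client.embeddings(
        parameters: { model: "text-embedding-3-small", input: batch }
      )
      response["data"].map { |d| d["embedding"] }
    rescue StandardError
      attempts += 1
      raise if attempts > 3
      sleep(2**attempts) # naive exponential backoff for rate-limit errors
      retry
    end
  end
end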

Step 3: Implement Retrieval

Now we can retrieve relevant documents by semantic similarity.

Code snippet
# retriever.rb
require 'pg'

class Retriever
  def initialize(top_k = 3)
    @conn = PG.connect(dbname: 'rag_demo')
    @top_k = top_k # Number of documents to retrieve

    # Create (or replace) the similarity search function
    @conn.exec(<<~SQL)
      CREATE OR REPLACE FUNCTION similar_documents(query_embedding VECTOR(1536), match_count INT)
      RETURNS TABLE(id INT, content TEXT, similarity FLOAT)
      AS $$
        SELECT
          id,
          content,
          1 - (embedding <=> query_embedding) AS similarity -- convert cosine distance to similarity
        FROM documents
        ORDER BY embedding <=> query_embedding -- <=> is pgvector's cosine distance operator
        LIMIT match_count;
      $$ LANGUAGE SQL;
    SQL

    puts "Similarity search function prepared"
  rescue PG::Error => e
    puts "Error setting up search function: #{e.message}"
  end

  def retrieve(query_embedding)
    results = @conn.exec_params(
      "SELECT * FROM similar_documents($1::vector, $2)",
      [query_embedding.to_json, @top_k]
    )

    results.map do |row|
      {
        id: row['id'].to_i,
        content: row['content'],
        similarity: row['similarity'].to_f.round(4)
      }
    end
  end

  def close_connection
    @conn.close if @conn && !@conn.finished?
  end
end

# Usage example with EmbeddingService:
api_key = ENV['OPENAI_API_KEY'] || 'your-api-key'
embedder = EmbeddingService.new(api_key)
retriever = Retriever.new

query = "What is Ruby programming language?"
query_embedding = embedder.embed_text(query)

results = retriever.retrieve(query_embedding)
puts "Retrieved documents:\n#{results.to_json}"

embedder.close_connection 
retriever.close_connection 

How it works:
1. <=> is pgvector's cosine distance operator; a smaller distance means higher similarity
2. similarity is computed as 1 - distance, turning distance into a similarity score (cosine similarity ranges from -1 to 1, and typically falls between 0 and 1 for OpenAI embeddings)
3. The top_k parameter controls how many relevant documents are returned
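
As the documents table grows, the ORDER BY embedding <=> ... scan becomes the bottleneck. pgvector supports approximate indexes for exactly this case; a hedged sketch (HNSW requires pgvector 0.5+, and index tuning is beyond this guide):

Code snippet
require 'pg'

conn = PG.connect(dbname: 'rag_demo')
# Approximate nearest-neighbor index for the cosine distance operator
conn.exec("CREATE INDEX IF NOT EXISTS documents_embedding_idx " \
          "ON documents USING hnsw (embedding vector_cosine_ops)")
conn.close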

Step 4: Build the Complete RAG System

Now we combine retrieval and generation.

Code snippet
# rag_system.rb
require_relative 'embedding_service'
require_relative 'retriever'

class RAGSystem
  def initialize(api_key)
    @llm_client = OpenAI::Client.new(access_token: api_key)
    @embedder = EmbeddingService.new(api_key)
    @retriever = Retriever.new(3) # Default to top-3 documents
  end

  def generate_response(prompt)
    # Step 1: Retrieve relevant context
    query_embedding = @embedder.embed_text(prompt)
    contexts = @retriever.retrieve(query_embedding)

    context_str = contexts.map { |c| c[:content] }.join("\n---\n")

    # Step 2: Generate a response grounded in the retrieved context
    response = @llm_client.chat(
      parameters: {
        model: "gpt-4-turbo-preview", # Use the latest available model in 2025
        messages: [
          { role: "system", content: "You are a helpful assistant that answers questions based on the provided context." },
          { role: "user", content: "Context:\n#{context_str}\n\nQuestion: #{prompt}\n\nAnswer:" }
        ],
        temperature: 0.7
      }
    )

    {
      answer: response.dig("choices", 0, "message", "content"),
      sources: contexts.map { |c| { id: c[:id], similarity: c[:similarity] } }
    }
  rescue => e
    puts "Error generating response: #{e.message}"
    nil
  end

  # Close connections explicitly when you are done with the system;
  # closing them inside generate_response would break repeated calls.
  def close_connections
    @embedder&.close_connection
    @retriever&.close_connection
  end
end

# Usage example:

api_key = ENV['OPENAI_API_KEY'] || 'your-api-key'

# First, let's seed some data for demonstration

documents = [
  "Ruby is an interpreted, high-level, general-purpose programming language.",
  "Ruby was designed and developed in the mid-1990s by Yukihiro Matsumoto in Japan.",
  "Ruby supports multiple programming paradigms, including procedural, object-oriented, and functional programming.",
  "The latest stable version of Ruby as of May 2025 is Ruby 3.4, which includes JIT compilation improvements."
]

service = EmbeddingService.new(api_key)

documents.each_with_index do |doc, i|
  service.store_document(doc, { source: "wikipedia", doc_id: i + 1 })
end

service.close_connection

# Now use the RAG system

rag = RAGSystem.new(api_key)

question = "Who created the Ruby language and what paradigms does it support?"

response = rag.generate_response(question)

puts "\nQuestion: #{question}"
puts "\nAnswer:\n#{response[:answer]}"
puts "\nSources used: #{response[:sources].inspect}"

rag.close_connections

Lessons learned:
1. Prompt engineering: a clear role definition in the system message noticeably improves answer quality.
2. Temperature: the example uses 0.7 as a balance; for strictly factual answers, a lower value (around 0.0-0.3) is usually safer.
3. Context formatting: separating sources with a delimiter (---) helps the model tell them apart.
4. Error handling: make sure every database connection is closed properly (see the helper sketch below).
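
For tip 4, a small block-based helper makes cleanup automatic even when errors occur. A sketch, assuming the close_connections method defined on RAGSystem above:

Code snippet
# Yields a ready RAG system and always closes its connections afterwards.
def with_rag(api_key)
  rag = RAGSystem.new(api_key)
  yield rag
ensure
  rag&.close_connections
end

with_rag(ENV['OPENAI_API_KEY']) do |rag|
  response = rag.generate_response("What is Ruby?")
  puts response[:answer] if response
end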

RAG System Optimization Tips

1. Optimize Your Chunking Strategy

Overly long documents hurt embedding quality and retrieval results, so split large documents into chunks:

Code snippet
def chunk_text(text, chunk_size = 500, overlap = 50)
  sentences = text.split(/[.!?]/).map(&:strip).reject(&:empty?)
  chunks = []
  current_chunk = ""

  sentences.each do |sentence|
    if current_chunk.size + sentence.size > chunk_size && !current_chunk.empty?
      chunks << current_chunk.strip
      # Carry the last `overlap` characters into the next chunk so
      # context is preserved across chunk boundaries.
      current_chunk = current_chunk[-overlap..].to_s
    end
    current_chunk += sentence + ". "
  end

  chunks << current_chunk.strip unless current_chunk.strip.empty?
  chunks
end

# Example usage (assuming an EmbeddingService instance named `service`):
long_text = "..." # Your long document
chunks = chunk_text(long_text)
chunks.each { |chunk| service.store_document(chunk) }

2. Hybrid Search

Combining keyword search with semantic search improves recall:

Code snippet
# Add this method to the Retriever class (it uses @conn, @top_k and retrieve)
def hybrid_search(query, query_embedding, keyword_weight = 0.3, semantic_weight = 0.7)
  # Keyword matches via a simple ILIKE filter
  keyword_results = @conn.exec_params(
    "SELECT id FROM documents WHERE content ILIKE $1 LIMIT 10",
    ["%#{query}%"]
  )
  # Semantic matches via the vector search from Step 3
  semantic_results = retrieve(query_embedding)

  # Merge both result sets, summing the weighted scores per document id
  combined_results = (
    keyword_results.map { |r| [r["id"].to_i, keyword_weight] } +
    semantic_results.map { |r| [r[:id], semantic_weight * r[:similarity]] }
  ).group_by(&:first).map do |id, scores|
    { id: id, score: scores.sum(&:last), type: "hybrid" }
  end.sort_by { |r| -r[:score] }.first(@top_k)

  combined_results.map do |result|
    doc = @conn.exec_params("SELECT * FROM documents WHERE id = $1", [result[:id]]).first
    { id: result[:id], content: doc["content"], score: result[:score] }
  end
end

3. Caching

Answering the same question repeatedly wastes resources:

Code snippet
require 'singleton'

class ResponseCache
  include Singleton # provides ResponseCache.instance

  def initialize
    @cache = {}
    @mutex = Mutex.new
  end

  def get_or_set(key)
    return yield unless ENV["ENABLE_CACHE"] == "true"

    cached_value = @mutex.synchronize { @cache[key] }
    return cached_value if cached_value

    result = yield
    @mutex.synchronize { @cache[key] = result } if result
    result
  end
end

# Usage in RAGSystem: prepend a module so `super` calls the original method
module CachedResponses
  def generate_response(prompt)
    cache_key = "#{prompt.hash}_#{Time.now.hour}" # cache entries expire hourly

    ResponseCache.instance.get_or_set(cache_key) do
      super(prompt) # original implementation
    end
  end
end

RAGSystem.prepend(CachedResponses)

Summary

In this guide you learned how to:

1. [✅] Set up a vector-enabled PostgreSQL database as a knowledge base.
2. [✅] Convert text into embedding vectors with the OpenAI API and store them.
3. [✅] Implement document retrieval based on semantic similarity.
4. [✅] Build a complete RAG system that answers questions.
5. [✅] Apply optimization techniques to improve RAG system performance.

As a Ruby developer, RAG can significantly extend what your applications can do. As AI technology continues to advance through 2025, RAG is becoming a standard component of intelligent systems. Start practicing now!

Next steps:
- Explore local LLM alternatives such as Llama 3 to reduce API dependence (see the Ollama sketch below).
- Study more advanced re-ranking techniques to improve result quality.
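
As a starting point for the first item, here is a hedged sketch of calling a local model through Ollama's HTTP API. It assumes an Ollama server running locally with a llama3 model already pulled, and is not a drop-in replacement for the OpenAI client above:

Code snippet
require 'net/http'
require 'json'

# Sends a prompt to a local Ollama server and returns the generated text.
def local_llm_answer(prompt)
  uri = URI('http://localhost:11434/api/generate')
  res = Net::HTTP.post(
    uri,
    { model: 'llama3', prompt: prompt, stream: false }.to_json,
    'Content-Type' => 'application/json'
  )
  JSON.parse(res.body)['response']
end

puts local_llm_answer("Who created Ruby?")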
