Why Rig? 5 Compelling Reasons to Use Rig for Your Next LLM Project
TL;DR
- Rig: Rust library simplifying LLM app development
- Developer-Friendly: Intuitive API design, comprehensive documentation, and scalability from simple chatbots to complex AI systems.
- Key Features: Unified API across LLM providers (OpenAI, Cohere); Streamlined embeddings and vector store support; High-level abstractions for complex AI workflows (e.g., RAG); Leverages Rust’s performance and safety; Extensible for custom implementations
- Contribute: Build with Rig, provide feedback, win $100
- Resources: GitHub, Examples, Docs.
Large Language Models (LLMs) have become a cornerstone of modern AI applications. However, building with LLMs often involves wrestling with complex APIs and repetitive boilerplate code. That’s where Rig comes in - a Rust library designed to simplify LLM application development.
In my previous post, I introduced Rig. Today, we’re diving deeper into five key features that make Rig a compelling choice for building your next LLM project.
1. Unified API Across LLM Providers
One of Rig’s standout features is its consistent API that works seamlessly with different LLM providers. This abstraction layer allows you to switch between providers or use multiple providers in the same project with minimal code changes.
Let’s look at an example using both OpenAI and Cohere models with Rig:
use rig::providers::{openai, cohere};
use rig::completion::Prompt;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Initialize OpenAI client using environment variables
let openai_client = openai::Client::from_env();
let gpt4 = openai_client.model("gpt-4").build();
// Initialize Cohere client with API key from environment variable
let cohere_client = cohere::Client::new(&std::env::var("COHERE_API_KEY")?);
let command = cohere_client.model("command").build();
// Use OpenAI's GPT-4 to explain quantum computing
let gpt4_response = gpt4.prompt("Explain quantum computing in one sentence.").await?;
println!("GPT-4: {}", gpt4_response);
// Use Cohere's Command model to explain quantum computing
let command_response = command.prompt("Explain quantum computing in one sentence.").await?;
println!("Cohere Command: {}", command_response);
Ok(())
}
In this example, we’re using the same `prompt` method for both OpenAI and Cohere models. This consistency allows you to focus on your application logic rather than provider-specific APIs.
Here’s what the output might look like:
GPT-4: Quantum computing utilizes quantum mechanics principles to perform complex computations exponentially faster than classical computers.
Cohere Command: Quantum computing harnesses quantum superposition and entanglement to process information in ways impossible for classical computers.
As you can see, we’ve obtained responses from two different LLM providers using nearly identical code, showcasing Rig’s unified API.
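Because both models implement the same `Prompt` trait, you can even write provider-agnostic helpers. Here’s a minimal sketch (treat it as such: it assumes, as in the example above, that `prompt` accepts a string slice, and it simply boxes whatever error type the provider returns):
use rig::completion::Prompt;
// A provider-agnostic helper: any type implementing Prompt can be passed in.
// Sketch only - the provider's error type is boxed rather than handled explicitly.
async fn explain(model: &impl Prompt, label: &str) -> Result<(), Box<dyn std::error::Error>> {
    let answer = model.prompt("Explain quantum computing in one sentence.").await?;
    println!("{}: {}", label, answer);
    Ok(())
}
// Usage inside main:
// explain(&gpt4, "GPT-4").await?;
// explain(&command, "Cohere Command").await?;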
2. Streamlined Embedding and Vector Store Support
Rig provides robust support for embeddings and vector stores, crucial components for semantic search and retrieval-augmented generation (RAG). Let’s explore a simple example using Rig’s in-memory vector store:
use rig::providers::openai;
use rig::embeddings::EmbeddingsBuilder;
use rig::vector_store::{in_memory_store::InMemoryVectorStore, VectorStore};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Initialize OpenAI client and create an embedding model
let openai_client = openai::Client::from_env();
let embedding_model = openai_client.embedding_model("text-embedding-ada-002");
// Create an in-memory vector store
let mut vector_store = InMemoryVectorStore::default();
// Generate embeddings for two documents
let embeddings = EmbeddingsBuilder::new(embedding_model.clone())
.simple_document("doc1", "Rust is a systems programming language.")
.simple_document("doc2", "Python is known for its simplicity.")
.build()
.await?;
// Add the embeddings to the vector store
vector_store.add_documents(embeddings).await?;
// Create an index from the vector store
let index = vector_store.index(embedding_model);
// Query the index for the most relevant document to "What is Rust?"
let results = index.top_n_from_query("What is Rust?", 1).await?;
// Print the most relevant document
println!("Most relevant document: {:?}", results[0].1.document);
Ok(())
}
This code demonstrates how to create embeddings, store them in a vector database, and perform a semantic search. Rig handles the complex processes behind the scenes, making your life a lot easier.
When you run this code, you’ll see output similar to:
Most relevant document: "Rust is a systems programming language."
This result shows that Rig’s vector store successfully retrieved the most semantically relevant document to our query about Rust.
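You can also ask for more than one match. Reusing the index built above, here is a quick sketch that prints the top two results (assuming, as in the snippet above, that each result is a tuple whose second element holds the stored document):
// Sketch: retrieve the top 2 matches instead of 1, reusing `index` from above.
let results = index.top_n_from_query("Which language is known for simplicity?", 2).await?;
for (rank, result) in results.iter().enumerate() {
    // result.1.document is the stored document, accessed the same way as in the example above
    println!("#{}: {:?}", rank + 1, result.1.document);
}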
3. Powerful Abstractions for Agents and RAG Systems
Rig provides high-level abstractions for building complex LLM applications, including agents and Retrieval-Augmented Generation (RAG) systems. Here’s an example of setting up a RAG system with Rig:
use rig::providers::openai;
use rig::embeddings::EmbeddingsBuilder;
use rig::vector_store::{in_memory_store::InMemoryVectorStore, VectorStore};
use rig::completion::Prompt;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Initialize OpenAI client and create an embedding model
let openai_client = openai::Client::from_env();
let embedding_model = openai_client.embedding_model("text-embedding-ada-002");
// Create an in-memory vector store
let mut vector_store = InMemoryVectorStore::default();
// Generate embeddings for two documents about Rust
let embeddings = EmbeddingsBuilder::new(embedding_model.clone())
.simple_document("doc1", "Rust was initially designed by Graydon Hoare at Mozilla Research.")
.simple_document("doc2", "Rust's first stable release (1.0) was on May 15, 2015.")
.build()
.await?;
// Add the embeddings to the vector store
vector_store.add_documents(embeddings).await?;
// Create an index from the vector store
let index = vector_store.index(embedding_model);
// Create a RAG agent using GPT-4 with the vector store as context
let rag_agent = openai_client.context_rag_agent("gpt-4")
.preamble("You are an assistant that answers questions about Rust programming language.")
.dynamic_context(1, index)
.build();
// Use the RAG agent to answer a question about Rust
let response = rag_agent.prompt("When was Rust's first stable release?").await?;
println!("RAG Agent Response: {}", response);
Ok(())
}
This code sets up a complete RAG system with just a few lines of clean, readable code. Rig’s abstractions let you focus on the high-level architecture of your application rather than getting bogged down in the implementation details normally associated with building RAG agents.
Running this code will produce output similar to:
RAG Agent Response: Rust's first stable release (1.0) was on May 15, 2015.
This response demonstrates how the RAG agent uses information from the vector store to provide an accurate answer, showcasing Rig’s ability to seamlessly combine retrieval and generation.
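Because context is retrieved dynamically on every prompt, asking about the other document requires no extra wiring. Continuing from the example above:
// Reusing the same rag_agent; the retriever should now surface doc1 instead of doc2.
let response = rag_agent.prompt("Who initially designed Rust?").await?;
println!("RAG Agent Response: {}", response);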
4. Efficient Memory Management and Thread Safety
Rig also handles multiple LLM requests efficiently and safely, thanks to Rust’s ownership model and zero-cost abstractions. This makes it particularly well-suited for building high-performance, concurrent LLM applications.
Let’s look at an example that demonstrates how Rig handles multiple LLM requests simultaneously:
use rig::providers::openai;
use rig::completion::Prompt;
use tokio::task;
use std::time::Instant;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let openai_client = openai::Client::from_env();
let model = Arc::new(openai_client.model("gpt-3.5-turbo").build());
let start = Instant::now();
let mut handles = vec![];
// Spawn 10 concurrent tasks
for i in 0..10 {
let model_clone = Arc::clone(&model);
let handle = task::spawn(async move {
let prompt = format!("Generate a random fact about the number {}", i);
model_clone.prompt(&prompt).await
});
handles.push(handle);
}
// Collect results
for handle in handles {
let result = handle.await??;
println!("Result: {}", result);
}
println!("Time elapsed: {:?}", start.elapsed());
Ok(())
}
This example showcases several key strengths of Rig as a Rust-native library:
- Concurrent Processing: Multiple LLM requests run concurrently by leveraging Rust’s async capabilities and the Tokio runtime, significantly speeding up batch operations.
- Efficient Resource Sharing: The LLM model is shared across all requests without costly copying, saving memory and improving performance.
- Memory Safety: Despite the concurrent operations, Rust’s ownership model and the use of `Arc` ensure that we don’t run into data races or memory leaks.
- Resource Management: Rig takes care of properly managing the lifecycle of resources like the OpenAI client and model, preventing resource leaks.
When you run this code, you’ll see multiple generated facts printed concurrently, followed by the total time taken. The exact output will vary, but it might look something like this:
Result: The number 0 is the only number that is neither positive nor negative.
Result: The number 1 is the only number that is neither prime nor composite.
Result: The number 2 is the only even prime number.
Result: The number 3 is the only prime number that is one less than a perfect square.
...
Time elapsed: 1.502160549s
This example shows how Rig enables you to easily create applications that can handle multiple LLM requests at once, maintaining high performance while ensuring safety. Whether you’re building a chatbot that needs to handle multiple users simultaneously or a data processing pipeline that needs to analyze large volumes of text quickly, Rig provides the tools to do so efficiently and reliably.
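In real deployments you’ll often want to cap how many requests hit the provider at once, for example to respect rate limits. Here’s a minimal sketch that layers a Tokio semaphore onto the same pattern (the limit of 3 is arbitrary, and `model` is the Arc-wrapped model from the example above):
use std::sync::Arc;
use tokio::sync::Semaphore;
use tokio::task;

// Allow at most 3 in-flight LLM calls at any time.
let semaphore = Arc::new(Semaphore::new(3));
let mut handles = vec![];
for i in 0..10 {
    let model_clone = Arc::clone(&model);
    let semaphore_clone = Arc::clone(&semaphore);
    handles.push(task::spawn(async move {
        // Each task waits for a free permit before calling the model.
        let _permit = semaphore_clone.acquire_owned().await.expect("semaphore closed");
        let prompt = format!("Generate a random fact about the number {}", i);
        model_clone.prompt(&prompt).await
    }));
}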
5. Extensibility for Custom Implementations
Rig is designed to be extensible, allowing you to implement custom functionality when needed. For example, you can create custom tools for your agents:
use rig::tool::Tool;
use rig::completion::ToolDefinition;
use serde::{Deserialize, Serialize};
use serde_json::json;
// Define the arguments for the addition operation
#[derive(Deserialize)]
struct AddArgs {
x: i32,
y: i32,
}
// Define a custom error type for math operations
#[derive(Debug, thiserror::Error)]
#[error("Math error")]
struct MathError;
// Define the Adder struct
#[derive(Deserialize, Serialize)]
struct Adder;
// Implement the Tool trait for Adder
impl Tool for Adder {
const NAME: &'static str = "add";
type Error = MathError;
type Args = AddArgs;
type Output = i32;
// Define the tool's interface
async fn definition(&self, _prompt: String) -> ToolDefinition {
ToolDefinition {
name: "add".to_string(),
description: "Add two numbers".to_string(),
parameters: json!({
"type": "object",
"properties": {
"x": {
"type": "integer",
"description": "First number to add"
},
"y": {
"type": "integer",
"description": "Second number to add"
}
},
"required": ["x", "y"]
}),
}
}
// Implement the addition operation
async fn call(&self, args: Self::Args) -> Result<Self::Output, Self::Error> {
Ok(args.x + args.y)
}
}
This code defines a custom tool (Adder) that can be used within a Rig agent. While it doesn’t produce output on its own, when used in an agent it would add two numbers. For instance, if an agent calls this tool with arguments `x: 5` and `y: 3`, it returns `8`.
This extensibility allows Rig to grow with your project’s needs, accommodating custom functionality without sacrificing the convenience of its high-level abstractions.
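To actually use the tool, you register it when building an agent. The exact builder method names vary between Rig versions, so treat the following as a sketch rather than a drop-in snippet; recent versions expose a `tool` method on the agent builder (check the docs for the version you’re on):
use rig::providers::openai;
use rig::completion::Prompt;

// Sketch: build an agent that can call the Adder tool.
// Builder method names may differ slightly depending on your Rig version.
let openai_client = openai::Client::from_env();
let calculator_agent = openai_client
    .agent("gpt-4")
    .preamble("You are a calculator. Use the provided tools to answer arithmetic questions.")
    .tool(Adder)
    .build();

let answer = calculator_agent.prompt("What is 5 + 3?").await?;
println!("Agent: {}", answer); // Expected to report 8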