TheoBuilder AI Agent Platform: RAG Training Best Practices Guide
What Is RAG and Why It Matters for Your Business
RAG (Retrieval-Augmented Generation) is what makes your TheoBuilder AI agents smart about your specific business information. Instead of giving generic responses, RAG-trained agents can answer questions using your actual company documents, policies, FAQs, and knowledge base.
Business Impact: Companies using properly configured RAG see 67% more accurate responses and 49% fewer "I don't know" answers from their AI agents.
The Complete RAG Training and Testing Process
Step 1: Start with Basic Training Settings
When you first set up your OpenAI GPT node for RAG training, use these recommended starting configurations:
Training Style Selection
- Open your OpenAI GPT node configuration panel
- Find the "Training Style" dropdown in the RAG Training Settings section
- Select your option based on your content type:
- Questions & Answers: Choose this if you have FAQ documents, help desk tickets, or customer service scripts
- Text Documents: Choose this if you have policy manuals, product guides, or research papers
Embedding Model Selection
- In the "Embedding Model" dropdown, start with a small, fast model such as "text-embedding-3-small" or the older "text-embedding-ada-002"
- Small models process faster and cost less while you're testing
- You can upgrade to larger, more accurate models once your system is working well
Initial Parameter Settings
- Set "Minimum Confidence Threshold" to 0 (this captures all possible results for testing)
- Set "Top N Contexts" to 0 (this shows you everything the system finds)
- Set "Target Testing Keywords" weight to 0.81 (this balances accuracy with coverage)
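These settings are entered in the TheoBuilder UI, but the Step 1 starting values can be collected in one place for reference. The dict below is purely illustrative (the keys mirror the UI labels; it is not a real TheoBuilder API payload):

```python
# Step 1 starting values, keyed by their TheoBuilder UI labels.
# Illustrative only -- not an actual TheoBuilder configuration format.
initial_rag_settings = {
    "training_style": "Questions & Answers",      # or "Text Documents"
    "embedding_model": "text-embedding-3-small",  # small, fast starter model
    "min_confidence_threshold": 0.0,  # capture every result while testing
    "top_n_contexts": 0,              # 0 = unlimited; show everything found
    "target_testing_keywords_weight": 0.81,
}
```

Keeping a record like this makes it easy to note which values you changed during the optimization passes in Step 4.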
Step 2: Run Your First Tests
Testing Your Setup
- Click the "Train Model" button in your OpenAI GPT node
- Wait for training to complete (this can take several hours for large document sets)
- Use the "Test Configuration" feature to ask sample questions
- Check the debugger results to see what information your system retrieved
What to Look For
- Does the system find the right documents when you ask questions?
- Are the retrieved chunks of text actually relevant to your question?
- Is the final answer based on your business information or generic knowledge?
Step 3: Analyze Token Usage and Content Quality
Using the OpenAI Tokenizer
- Copy the retrieved text from your debugger results
- Visit platform.openai.com/tokenizer in your web browser
- Paste your retrieved content to see how many tokens it uses
- Aim to stay under 75% of your model's token limit for best performance
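The tokenizer page gives exact counts; for a quick in-code estimate you can use the common rule of thumb of roughly 4 characters per token for English text. This is an approximation, not an exact count -- use platform.openai.com/tokenizer (or OpenAI's `tiktoken` library) when precision matters:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate via the ~4 characters/token rule of thumb for
    English text. For exact counts, use platform.openai.com/tokenizer."""
    return max(1, len(text) // 4)

def fits_budget(text: str, model_limit: int, headroom: float = 0.75) -> bool:
    """Check the guide's rule: keep retrieved context under 75% of the
    model's token limit, leaving room for the question and the answer."""
    return estimate_tokens(text) <= model_limit * headroom

retrieved = "Refunds are issued within 14 days of purchase." * 50
print(fits_budget(retrieved, model_limit=8192))  # True
```

If the check fails, either raise your confidence threshold or lower Top N Contexts (Step 4) so less text is retrieved per question.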
Cross-Platform Quality Check
Test the same questions across different AI platforms to compare quality:
- Ask your question in ChatGPT, Claude, Grok, and Gemini
- Compare which platform gives the most accurate answer using the same source material
- If multiple platforms give good answers with your retrieved content, your RAG system is working correctly
- If all platforms struggle with your content, you need to improve your document quality or chunking
Step 4: Optimize Performance Through Testing
Confidence Threshold Adjustment
- Start increasing your "Minimum Confidence Threshold" from 0 to 0.25
- Test your key questions again
- Gradually increase to 0.4, then 0.6, then 0.8 until you find the sweet spot
- Higher thresholds give more precise answers but may miss relevant information
Context Window Optimization
- Reduce your "Top N Contexts" from unlimited to 25 results
- Test performance and accuracy
- Continue reducing (20, 15, 12, 10, 8, 5) until you find optimal performance
- Most businesses achieve best results with 7-12 contexts
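The two knobs tuned in this step can be sketched together. This assumes retrieval returns (chunk, score) pairs where a higher score means higher confidence; the function names and sample data are hypothetical:

```python
def select_contexts(results, min_confidence=0.5, top_n=10):
    """Apply the two Step 4 tuning knobs: drop chunks below the confidence
    threshold, then keep only the top-N highest-scoring survivors.
    `results` is a list of (chunk_text, confidence_score) pairs."""
    kept = [(text, score) for text, score in results if score >= min_confidence]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_n]

results = [("refund policy", 0.91), ("shipping times", 0.62),
           ("office address", 0.33), ("warranty terms", 0.78)]

# Loose settings keep everything; tighter settings trade recall for precision.
print(len(select_contexts(results, min_confidence=0.0, top_n=25)))  # 4
print(len(select_contexts(results, min_confidence=0.6, top_n=2)))   # 2
```

Raising `min_confidence` mirrors the threshold sweep (0 → 0.25 → 0.4 → ...), and lowering `top_n` mirrors the context-window reduction (25 → 20 → ... → 5).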
When to Stop Optimizing
Stop adjusting settings when:
- Your AI agent consistently gives accurate, complete answers
- Response time is acceptable for your business needs (under 10 seconds typically)
- Token usage stays within your budget constraints
- Customer satisfaction with answers exceeds 85%
Understanding Your RAG Configuration Options
Training Styles: Choosing the Right Approach
Questions & Answers Training
- Best for: Customer support chatbots, FAQ systems, help desk automation
- How it works: The system learns to match customer questions with your prepared answers
- Configuration tip: Use shorter, focused chunks of text (200-400 tokens each)
- Business impact: 23% faster response times and 31% higher customer satisfaction scores
Text Documents Training
- Best for: Policy manuals, product documentation, research libraries, legal documents
- How it works: The system learns to find relevant sections from longer documents
- Configuration tip: Use longer chunks (500-800 tokens) to preserve context
- Business impact: More comprehensive answers but slightly slower response times
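Chunking happens inside TheoBuilder during training, but the idea behind the two chunk-size recommendations can be sketched. This toy splitter works on word boundaries and uses the rough ~4 characters/token approximation (so ~1.3 tokens per English word); TheoBuilder's internal chunker may well differ:

```python
def chunk_text(text: str, chunk_tokens: int = 300, overlap_tokens: int = 30):
    """Split text into word-boundary chunks of roughly `chunk_tokens` tokens,
    with a small overlap so sentences that straddle a boundary appear in both
    neighboring chunks. Token sizes are approximated at ~1.3 tokens/word.
    Use ~200-400 tokens for Q&A training, ~500-800 for long documents."""
    words = text.split()
    words_per_chunk = max(1, int(chunk_tokens / 1.3))
    overlap_words = int(overlap_tokens / 1.3)
    step = max(1, words_per_chunk - overlap_words)
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + words_per_chunk]))
        if start + words_per_chunk >= len(words):
            break
    return chunks

doc = "word " * 1000
print(len(chunk_text(doc, chunk_tokens=300, overlap_tokens=30)))  # 5 chunks
```

The overlap is the design point worth noting: without it, an answer whose sentence spans a chunk boundary can be invisible to retrieval.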
Embedding Model Selection Guide
Small Models (Recommended for Starting)
- Examples: "text-embedding-3-small", "text-embedding-ada-002", "bge-small-en-v1.5"
- Best for: Getting started, high-volume applications, budget-conscious projects
- Performance: 2-5x faster processing, 70-75% accuracy rate
- Cost: Significantly lower - about $0.10 per 1,000 document pages processed
Large Models (For Maximum Accuracy)
- Examples: "text-embedding-3-large", "bge-large-en-v1.5"
- Best for: High-accuracy requirements, complex technical content, low query volume
- Performance: 80-90% accuracy rate, deeper understanding of context
- Cost: Higher - about $1.30 per 1,000 document pages processed
Selection Guide:
- Start with small models for initial testing
- Upgrade to large models if accuracy isn't meeting your business needs
- Consider your query volume - high-volume applications benefit more from small, fast models
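The per-page figures above translate directly into a quick budget estimate. The rates come from this guide; the 50,000-page knowledge-base size in the example is a hypothetical volume:

```python
def embedding_cost(pages: int, model_size: str) -> float:
    """Estimate one-time document-processing cost using this guide's figures:
    ~$0.10 per 1,000 pages for small models, ~$1.30 for large models."""
    rate_per_1000_pages = {"small": 0.10, "large": 1.30}[model_size]
    return pages / 1000 * rate_per_1000_pages

# Example: a 50,000-page knowledge base (hypothetical volume).
print(f"small: ${embedding_cost(50_000, 'small'):.2f}")  # small: $5.00
print(f"large: ${embedding_cost(50_000, 'large'):.2f}")  # large: $65.00
```

Even at 50,000 pages the absolute difference is modest, which is why the real deciding factor is usually query volume and latency rather than the one-time processing cost.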
Training Mode Options
Full Training
- When to use: Setting up a new RAG system, major content updates, switching document types
- What happens: Complete reprocessing of all your documents and rebuilding of search indexes
- Time required: 2-24 hours depending on document volume
- Business impact: Maximum accuracy improvement but highest time investment
Rebuild Embeddings
- When to use: Adding new documents, updating existing content, changing embedding models
- What happens: Reprocesses document content but keeps existing search structure
- Time required: 30 minutes to 6 hours
- Business impact: Good balance of improvement and time efficiency
Rebuild Index Only
- When to use: Optimizing search performance, changing distance functions, database maintenance
- What happens: Reconstructs search indexes without reprocessing documents
- Time required: 15 minutes to 2 hours
- Business impact: Performance improvements with minimal downtime
Vector Space Settings Explained
Distance Function Selection
- Cosine Similarity (Recommended default): Best for most text-based applications, focuses on meaning rather than word frequency
- Chebyshev Distance: Alternative option that may work better for highly technical or structured content
- When to change: Only if you're not getting good results with the default option
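Both distance options can be computed directly, which makes the difference concrete. This is an illustrative implementation, not TheoBuilder's internals:

```python
import math

def cosine_similarity(a, b):
    """Angle-based similarity: ~1.0 = same direction, ~0 = unrelated.
    Ignores vector magnitude, which is why it suits text embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def chebyshev_distance(a, b):
    """Largest single-dimension difference; 0 = identical vectors."""
    return max(abs(x - y) for x, y in zip(a, b))

a, b = [1.0, 0.0, 1.0], [2.0, 0.0, 2.0]
print(cosine_similarity(a, b))   # ~1.0: same direction despite different length
print(chebyshev_distance(a, b))  # 1.0: differs by 1.0 in its widest dimension
```

Note the contrast: cosine similarity treats `a` and `b` as a perfect match because they point the same way, while Chebyshev distance still reports a gap because it compares raw coordinate values.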
Confidence Threshold Configuration
- Purpose: Controls how confident the system must be before including information in answers
- Low values (0.1-0.4): More comprehensive answers but may include less relevant information
- High values (0.7-0.9): More precise answers but may miss some relevant information
- Recommended starting point: 0.5 for most business applications (Step 1's value of 0 is for initial testing only, so you can see everything the system retrieves)
Top N Contexts Setting
- Purpose: Maximum number of document chunks to consider for each question
- Low values (3-5): Faster responses, more focused answers
- High values (15-25): More comprehensive answers, slower responses
- Recommended range: 7-12 for most business applications
Advanced Settings for Large Datasets
Approximate Similarity Index
- When to enable: If you have more than 100,000 documents or pages of content
- What it does: Speeds up searches by using advanced indexing techniques
- Performance impact: roughly 10x faster searches while retaining about 99% of exact-search accuracy
- Trade-off: Longer initial setup time but much faster ongoing performance
Index Configuration
- Index Trees: Set to 10-50 (higher numbers = better accuracy, longer setup time)
- Index Search Nodes: Leave at -1 for automatic optimization
- When to adjust: Only if you're experiencing slow search performance with large document sets
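TheoBuilder doesn't document its index internals here, though the tree settings above resemble those of common approximate-nearest-neighbor libraries. The core idea -- hash vectors into buckets up front, then rank only a bucket's candidates instead of scanning everything -- can be shown with a toy random-hyperplane sketch. Everything in it is an assumption for illustration; real indexes use more robust structures such as trees or graphs:

```python
import math
import random

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

class ToyApproxIndex:
    """Toy approximate index: each vector is hashed to a bucket by which side
    of several random hyperplanes it falls on, so similar vectors tend to
    share a bucket. A query then ranks only its own bucket's candidates,
    trading a little recall for much less work -- the same trade-off the
    Approximate Similarity Index setting makes at scale."""
    def __init__(self, dim, n_planes=8, seed=42):
        rnd = random.Random(seed)
        self.planes = [[rnd.gauss(0, 1) for _ in range(dim)]
                       for _ in range(n_planes)]
        self.vectors = []
        self.buckets = {}

    def _key(self, v):
        # One bit per hyperplane: which side of the plane v falls on.
        return tuple(sum(p * x for p, x in zip(plane, v)) > 0
                     for plane in self.planes)

    def add(self, v):
        idx = len(self.vectors)
        self.vectors.append(v)
        self.buckets.setdefault(self._key(v), []).append(idx)

    def query(self, v, top_n=3):
        candidates = self.buckets.get(self._key(v), [])
        return sorted(candidates,
                      key=lambda i: cosine(v, self.vectors[i]),
                      reverse=True)[:top_n]

index = ToyApproxIndex(dim=3)
index.add([1.0, 0.0, 0.0])   # id 0
index.add([0.9, 0.1, 0.0])   # id 1: close to id 0, likely the same bucket
index.add([-1.0, 0.0, 0.0])  # id 2: opposite direction, different bucket
print(index.query([1.0, 0.05, 0.0]))  # ids of nearby vectors, best first
```

In this analogy, more hyperplanes play the role of more Index Trees: finer buckets give more accurate candidates but cost more to build, which is the same trade-off described above.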