
AI Model Comparison in ThreoAI

Use ThreoAI's Model Comparison feature to send the same prompt to two AI models side by side and compare their responses in real time, evaluating accuracy, tone, reasoning, and output quality across models.

Model Comparison is a built-in ThreoAI feature that lets you send the same message to two different AI models simultaneously and view their responses side by side. This is useful for evaluating response quality, comparing reasoning styles, testing how different models interpret your Custom GPT instruction prompts, or simply determining which model is best suited for a specific task before committing to it for ongoing work.

Rather than switching between models in separate conversations and trying to remember or compare outputs manually, Model Comparison displays both responses on a single screen so you can assess them together in real time.

ThreoAI Model Comparison page showing MODEL 1 (GPT 5.4) vs MODEL 2 (Claude Sonnet 4.6)


Opening Model Comparison

There are two ways to open Model Comparison:

  1. From any chat: Click the + button in the chat input bar (this button is available in all chat contexts, including the main home chat, Project workspaces, and AI Agent or Custom GPT chats) and select Compare models from the menu that appears.
  2. Direct URL: Navigate directly to https://threo.synthreo.ai/#/comparison in your browser.

Both methods open the same dedicated Model Comparison page with two side-by-side model panels and a shared chat input at the bottom.


Selecting Models to Compare

The Model Comparison page displays two model selectors labeled MODEL 1 and MODEL 2, separated by a VS indicator in the center. Each model has its own response area where outputs are displayed independently.

To choose which models you want to compare:

  1. Click the MODEL 1 dropdown on the left side of the page. A list of all AI models available in your organization appears.
  2. Select the model you want for the left panel.
  3. Click the MODEL 2 dropdown on the right side and select a different model for the right panel.

Each model indicator displays a colored dot that visually distinguishes the two models, making it easy to tell at a glance which response belongs to which model.

The models available in the dropdowns are the same ones available throughout ThreoAI. They are configured by your organization’s administrator in Tenant Management. If you do not see a model you need, contact your administrator to request that it be enabled for your organization.

Running a Comparison

Follow these steps to run a side-by-side comparison:

  1. Select the two models you want to compare using the MODEL 1 and MODEL 2 dropdowns.
  2. Type your message in the chat input bar at the bottom of the screen. You can type any prompt, question, or instruction; the same text is sent to both models at the same time.
  3. Press Enter to send. Both models receive the identical prompt simultaneously and begin generating their responses.
  4. The responses appear side by side as they are generated: MODEL 1’s response streams on the left, MODEL 2’s response streams on the right.

You can continue the conversation with follow-up messages, and both models will respond to each subsequent message in parallel. This allows you to test how each model handles multi-turn conversations and whether they maintain context differently.
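ThreoAI handles the parallel dispatch behind the scenes. Conceptually, sending one prompt to two models at once while tracking each model's context separately looks like the sketch below. This is a hypothetical illustration, not ThreoAI's actual API: `call_model` is a stand-in stub, and the model names are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a real model API call; ThreoAI's internal
# API is not public, so this stub just echoes a canned reply.
def call_model(model: str, history: list) -> str:
    last_user = history[-1]["content"]
    return f"[{model}] reply to: {last_user}"

# Each model keeps its own conversation history, so multi-turn
# context is maintained independently per panel.
histories = {"model-a": [], "model-b": []}

def send_to_both(prompt: str) -> dict:
    # The identical prompt is appended to both histories...
    for history in histories.values():
        history.append({"role": "user", "content": prompt})
    # ...and both models are called in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {m: pool.submit(call_model, m, h)
                   for m, h in histories.items()}
        replies = {m: f.result() for m, f in futures.items()}
    # Each reply extends only its own model's history.
    for model, reply in replies.items():
        histories[model].append({"role": "assistant", "content": reply})
    return replies

replies = send_to_both("Summarize our Q3 results.")
```

The key design point the sketch illustrates is that the two conversations share input but never share context, which is why the two panels can diverge over a multi-turn exchange.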

Using Shift + Enter for Multi-Line Prompts


If you need to send a longer, multi-line prompt to both models, press Shift + Enter to insert a line break in your message without sending it. Press Enter when you are ready to submit the full prompt to both models.


Comparison Strategies

Getting the most value from Model Comparison depends on what you are trying to evaluate. Here are practical strategies for different comparison goals:

  • Test the same task prompt on different models to see which gives more accurate or structured output for your specific use case. For example, send a data extraction request to both models and compare which one returns cleaner, more organized results.
  • Compare reasoning quality on complex analytical questions by checking how each model breaks down its answer. Send a multi-step problem and observe whether one model provides clearer step-by-step reasoning or arrives at a more accurate conclusion.
  • Evaluate tone and style differences between models. Some models tend to be more concise and direct, while others provide more detailed explanations with additional context. This is especially important when choosing a model for customer-facing Custom GPTs where tone matters.
  • Test with your Custom GPT instruction prompts before committing to a model. Paste your GPT instruction prompt as the message and observe how each model interprets the persona, follows the constraints, and handles edge cases. This can save significant time during GPT development.
  • Compare how models handle ambiguous or incomplete inputs. Send a vague request and see which model asks for clarification versus which one attempts to answer with assumptions. This helps you understand each model’s default behavior when instructions are unclear.
  • Evaluate factual accuracy by asking both models about topics you already know well. Compare their answers to identify which model is more reliable for your domain.

When reviewing side-by-side responses, consider these aspects:

| Aspect | What to Look For |
| --- | --- |
| Accuracy | Does the response contain correct information? Are there any factual errors or hallucinations? |
| Completeness | Does the model address all parts of your prompt, or does it skip details? |
| Structure | Is the response well-organized with clear headings, lists, or paragraphs? |
| Conciseness | Does the model give you what you need without excessive filler or repetition? |
| Tone | Does the response match the professional, casual, or technical tone you need? |
| Instruction-following | If you gave specific formatting or behavioral instructions, did each model follow them? |
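One lightweight way to make these judgments repeatable is to rate each response against the same checklist and tally the results. The sketch below is a workflow suggestion, not a ThreoAI feature; the aspect names mirror the table above, and the 1-5 ratings are ones you assign manually after reading both responses.

```python
# Aspects drawn from the evaluation checklist above.
ASPECTS = ["accuracy", "completeness", "structure",
           "conciseness", "tone", "instruction_following"]

def score_responses(scores_1: dict, scores_2: dict) -> str:
    """Compare two manually assigned rubrics (aspect -> 1-5 rating)
    and report which model scored higher overall."""
    total_1 = sum(scores_1[a] for a in ASPECTS)
    total_2 = sum(scores_2[a] for a in ASPECTS)
    if total_1 == total_2:
        return "tie"
    return "MODEL 1" if total_1 > total_2 else "MODEL 2"

# Example: both models rate 4 everywhere, but MODEL 2 is more concise.
scores_1 = {a: 4 for a in ASPECTS}
scores_2 = {**{a: 4 for a in ASPECTS}, "conciseness": 5}
winner = score_responses(scores_1, scores_2)
```

Scoring by hand keeps the human judgment where it belongs; the tally simply makes your comparison consistent from prompt to prompt.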

Common Use Cases

  • Choosing a default model for your account: Test your most common types of prompts across different models to decide which one to set as your default in Profile Settings.
  • Selecting a model for a Custom GPT: Before building or editing a Custom GPT, use Model Comparison to test how different models handle your GPT’s instruction prompt, knowledge base questions, and conversation starters.
  • Evaluating model updates: When your administrator adds a new model to your organization, use Model Comparison to test it against your current preferred model before switching.
  • Training and onboarding: Show team members how different models perform on the same task so they can make informed choices about which model to use for their workflows.

Common models you may see in the dropdowns include options like GPT-5, Claude Sonnet, Grok, and others, depending on what your administrator has enabled in Tenant Management. If you need access to a specific model that is not listed, contact your organization’s ThreoAI administrator to request that it be added to your tenant configuration.

You can also set a personal default model for all new chats (outside of Model Comparison) in your Profile Settings under the General tab.