/trainCustomRouter

Train a custom router on your evaluation data to optimize routing for your specific use case.

This endpoint allows you to train a domain-specific router that learns which models perform best for different types of queries in your application. The router analyzes your evaluation dataset, clusters similar queries, and learns model performance patterns.

Training Process:

  1. Upload a CSV file with your evaluation data
  2. Specify which models to route between
  3. Define the evaluation metric (score column)
  4. The system trains asynchronously and returns a preference_id
  5. Use the preference_id in model_select() calls once training completes

Dataset Requirements:

  • Format: CSV file
  • Minimum samples: 25 (more is better for accuracy)
  • Required columns:
    • Prompt column (specified in prompt_column parameter)
    • For each model: {provider}/{model}/score and {provider}/{model}/response

Example CSV structure:

prompt,openai/gpt-4o/score,openai/gpt-4o/response,anthropic/claude-sonnet-4-5-20250929/score,anthropic/claude-sonnet-4-5-20250929/response
"Explain quantum computing",0.95,"Quantum computing uses...",0.87,"Quantum computers leverage..."
"Write a Python function",0.82,"def my_function()...",0.91,"Here's a Python function..."

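Before uploading, it can help to check a dataset locally against the requirements above. The following is a minimal sketch, not part of the API: `check_dataset` and `MIN_SAMPLES` are hypothetical names, and the checks only mirror the documented column naming and the 25-sample minimum. The server may validate more than this.

```python
import csv
import io

MIN_SAMPLES = 25  # minimum sample count stated in the dataset requirements


def check_dataset(csv_text, prompt_column, models):
    """Return a list of problems found; an empty list means these checks passed.

    `models` is a list of (provider, model) tuples, e.g. [("openai", "gpt-4o")].
    This helper is illustrative only and is not part of the service.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = []

    # Each model needs a score column and a response column.
    required = [prompt_column]
    for provider, model in models:
        required.append(f"{provider}/{model}/score")
        required.append(f"{provider}/{model}/response")

    for name in required:
        if name not in header:
            problems.append(f"missing column: {name}")

    # Count the data rows (the header row is consumed by DictReader).
    row_count = sum(1 for _ in reader)
    if row_count < MIN_SAMPLES:
        problems.append(f"only {row_count} rows; at least {MIN_SAMPLES} required")

    return problems
```

Calling `check_dataset(text, "prompt", [("openai", "gpt-4o")])` on the contents of a CSV file returns an empty list when the expected columns are present and the file has at least 25 rows.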
Model Selection:

  • Specify standard models: {"provider": "openai", "model": "gpt-4o"}
  • Or custom models with pricing: {"provider": "custom", "model": "my-model", "is_custom": true, "input_price": 10.0, "output_price": 30.0, "context_length": 8192, "latency": 1.5}
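Because the model list is sent as a JSON string rather than a nested object, one way to build it is with `json.dumps`. This is a sketch under assumptions: the helper names `standard_model`, `custom_model`, and `encode_providers` are hypothetical, and the field names are copied from the examples above.

```python
import json


def standard_model(provider, model):
    """Build the entry for a standard model."""
    return {"provider": provider, "model": model}


def custom_model(model, input_price, output_price, context_length, latency):
    """Build the entry for a custom model, using the fields shown above."""
    return {
        "provider": "custom",
        "model": model,
        "is_custom": True,  # serialized as JSON true
        "input_price": input_price,
        "output_price": output_price,
        "context_length": context_length,
        "latency": latency,
    }


def encode_providers(models):
    """Serialize the list of model entries into a JSON string."""
    return json.dumps(models)


providers = encode_providers([
    standard_model("openai", "gpt-4o"),
    custom_model("my-model", 10.0, 30.0, 8192, 1.5),
])
```

Serializing with `json.dumps` avoids hand-written quoting mistakes, such as Python's `True` appearing where JSON expects `true`.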

Training Time:

  • Training is asynchronous and typically takes 5-15 minutes
  • Larger datasets or more models take longer
  • You'll receive a preference_id immediately
  • Check training status by attempting to use the preference_id in model_select()

Best Practices:

  1. Use diverse, representative examples from your production workload
  2. Include 50-100 samples or more for best results (25 is the minimum)
  3. Ensure consistent evaluation metrics across all models
  4. Use the same models you plan to route between in production

Related Documentation: See https://docs.notdiamond.ai/docs/adapting-prompts-to-new-models for a detailed guide.

Body Params
string
required

Language of the evaluation data. Use 'english' for English-only data or 'multilingual' for multi-language support

string
required

JSON string array of LLM providers to train the router on. Format: '[{"provider": "openai", "model": "gpt-4o"}, {"provider": "anthropic", "model": "claude-sonnet-4-5-20250929"}]'

prompt_column
string
required

Name of the column in the CSV file that contains the prompts

file
required

CSV file containing evaluation data with prompt column and score/response columns for each model

boolean
required

Whether higher scores are better. Set to true if higher scores indicate better performance, or false if lower scores are better.

Optional preference ID to update an existing router. If not provided, a new preference will be created

Whether to override an existing custom router for this preference_id

Responses

400

Invalid request (e.g., insufficient samples, invalid dataset format, missing columns)

401

Authentication failed

404

Preference ID not found

408

Request timeout - dataset file took too long to parse
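Client code can map these status codes to readable messages. The sketch below covers only the four codes listed here; `describe_error` and `ERROR_MEANINGS` are hypothetical names, and the format of the response body is not assumed.

```python
# Status codes and meanings as listed in the Responses section.
ERROR_MEANINGS = {
    400: "Invalid request (insufficient samples, invalid dataset format, or missing columns)",
    401: "Authentication failed",
    404: "Preference ID not found",
    408: "Request timeout: the dataset file took too long to parse",
}


def describe_error(status_code):
    """Return a readable message for a documented error status code."""
    return ERROR_MEANINGS.get(status_code, f"Unexpected status code {status_code}")
```

A caller would pass the HTTP status of a failed request, for example `describe_error(404)`, and log or display the result.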
