/optimize

Adapt a prompt that works well on one LLM so it performs well across different target LLMs.

This endpoint automatically optimizes your prompt (system prompt + user message template) to improve accuracy on your use case across various models. Each model has unique characteristics, and what works well for GPT-5 might not work as well for Claude or Gemini.

How Prompt Optimization Works:

  1. You provide your current prompt and optionally your current origin model
  2. You specify the target models you want to optimize your prompt for
  3. You provide evaluation examples (golden records) with expected answers
  4. The system runs optimization to find the best prompt for each target model
  5. You receive optimized prompts that perform well on your target models

Evaluation Metrics: Choose either a standard metric or provide custom evaluation:

  • Standard metrics: LLMaaJ:Sem_Sim_1 (semantic similarity), JSON_Match
  • Custom evaluation: Provide evaluation_config with your own LLM judge, prompt, and cutoff
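For custom evaluation, the exact shape of `evaluation_config` is not spelled out above, so the key names in this sketch (`judge_model`, `judge_prompt`, `cutoff`) are assumptions inferred from "your own LLM judge, prompt, and cutoff":

```python
# Hypothetical evaluation_config payload. Key names are assumptions
# based on "your own LLM judge, prompt, and cutoff" above.
evaluation_config = {
    "judge_model": "gpt-4o",          # LLM that scores each output (assumed key)
    "judge_prompt": (                  # rubric handed to the judge (assumed key)
        "Score the candidate answer from 0 to 1 for factual agreement "
        "with the expected answer."
    ),
    "cutoff": 0.7,                     # minimum score counted as a pass (assumed key)
}
```

Check the request schema for the authoritative field names before relying on this shape.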

Dataset Requirements:

  • Minimum 25 examples in train_goldens (more examples = better optimization)
  • Prototype mode: Set prototype_mode: true to use as few as 3 examples for prototyping
    • Recommended when you don't have enough data yet to build a proof-of-concept
    • Note: Performance may be degraded compared to standard mode (25+ examples)
    • Trade-off: Faster iteration with less data vs. potentially less generalizability
  • Each example must have fields matching your template placeholders
  • Supervised evaluation requires an 'answer' field in each golden record
  • Unsupervised evaluation can work without answers
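To make the field-matching rule concrete: every name listed in `fields` must appear both as a `{placeholder}` in the template and as a key in each golden record, plus an `answer` key for supervised evaluation. The template and record contents below are invented for illustration:

```python
template = "Classify the sentiment of this review:\n{review_text}\nProduct: {product}"
fields = ["review_text", "product"]

# One golden record: keys cover every template placeholder,
# and 'answer' is present for supervised evaluation.
golden = {
    "review_text": "Battery died after two days.",
    "product": "wireless earbuds",
    "answer": "negative",
}

# Substitution works because golden's keys match the placeholders.
rendered = template.format(**{f: golden[f] for f in fields})
```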

Training Time:

  • Processing is asynchronous and typically takes 10-30 minutes
  • Time depends on: number of target models, dataset size, model availability
  • Use the returned optimization_run_id to check status and retrieve results

Example Workflow:

1. POST /v2/prompt/optimize - Submit optimization request
2. GET /v2/prompt/optimizeStatus/{id} - Poll status until completed
3. GET /v2/prompt/optimizeResults/{id} - Retrieve optimized prompts
4. Use optimized prompts in production with target models
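The submit/poll/retrieve loop above can be sketched as follows. The base URL, bearer-token header, and exact response JSON (`optimization_run_id`, a `status` string) are assumptions; only the paths come from the workflow above. The polling function takes a status-fetching callable so the loop itself works without a live API:

```python
import json
import time
import urllib.request

API = "https://api.example.com"  # hypothetical base URL


def submit_optimization(payload, token):
    """POST /v2/prompt/optimize; assumes the response carries optimization_run_id."""
    req = urllib.request.Request(
        f"{API}/v2/prompt/optimize",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["optimization_run_id"]


def poll_until_complete(run_id, fetch_status, interval_s=60, timeout_s=3600):
    """Poll until the run reports a terminal status.

    fetch_status(run_id) should GET /v2/prompt/optimizeStatus/{id} and
    return the status string; it is injected so the loop is testable.
    Runs typically take 10-30 minutes, so poll at a gentle interval.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status(run_id)
        if status in ("completed", "failed"):
            return status
        time.sleep(interval_s)
    raise TimeoutError(f"run {run_id} did not finish within {timeout_s}s")
```

Once the status is `completed`, GET /v2/prompt/optimizeResults/{id} returns the optimized prompts.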
Body Params

Request model for the POST /v2/prompt/optimize endpoint.

Submits a prompt adaptation job to optimize your prompt for different target LLMs. The system evaluates your original prompt on the origin model, then automatically generates and tests optimized prompts for each target model to maximize performance.

Key concepts:

  • system_prompt + template: Your current prompt configuration
  • origin_model: The model your prompt currently works well with (baseline)
  • target_models: The models you want to optimize for
  • train_goldens: Evaluation examples used to optimize the prompts
  • test_goldens: Held-out examples used to measure final performance
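Tying the key concepts together, a minimal request body might look like the sketch below. It uses prototype mode so three training examples suffice; the model identifiers and the shape of the provider objects are assumptions (the RequestProvider schema is not shown here), and the answers are elided:

```python
request_body = {
    "system_prompt": "You are a concise support assistant.",
    "template": "Answer the customer question:\n{question}",
    "fields": ["question"],
    # prototype_mode lowers the train_goldens minimum from 25 to 3
    "prototype_mode": True,
    "train_goldens": [
        {"question": "How do I reset my password?", "answer": "..."},
        {"question": "Where is my invoice?", "answer": "..."},
        {"question": "Can I change my plan?", "answer": "..."},
    ],
    "test_goldens": [  # required whenever train_goldens is used
        {"question": "How do I cancel?", "answer": "..."},
    ],
    # Provider-object shape is an assumption; consult the RequestProvider schema.
    "origin_model": {"provider": "openai", "model": "gpt-5"},
    "target_models": [{"provider": "anthropic", "model": "claude-sonnet"}],
}
```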

Workflow:

  1. Submit this request to start adaptation
  2. System evaluates baseline performance on origin model
  3. Optimizes prompts for each target model
  4. Returns adaptation_run_id for tracking progress
  5. Poll /adaptStatus until complete
  6. Retrieve optimized prompts from /adaptResults

Requirements:

  • Minimum 25 examples in train_goldens (or 3 examples with prototype_mode=True)
  • test_goldens required when using train_goldens
  • Use either goldens alone or train_goldens + test_goldens, not both
  • For supervised metrics, all examples must include 'answer' field
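The mutual-exclusion and minimum-count rules above can be sketched as a client-side pre-flight check. This only mirrors the documented rules; the server's actual validation may differ:

```python
def validate_goldens(body):
    """Sanity-check a request body against the documented golden-record rules."""
    has_legacy = body.get("goldens") is not None
    has_split = body.get("train_goldens") is not None
    if has_legacy and has_split:
        raise ValueError("use either 'goldens' or 'train_goldens'+'test_goldens', not both")
    if has_split and body.get("test_goldens") is None:
        raise ValueError("test_goldens is required when train_goldens is provided")
    examples = body.get("goldens") or body.get("train_goldens") or []
    minimum = 3 if body.get("prototype_mode") else 25  # prototype mode relaxes the floor
    if len(examples) < minimum:
        raise ValueError(f"need at least {minimum} examples, got {len(examples)}")
```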

Prototype Mode:

  • Set prototype_mode=True to allow as few as 3 training examples
  • Useful for prototyping AI applications when you don't have enough data yet
  • Note: Performance may be degraded compared to standard mode (25+ examples)

system_prompt
string
required

System prompt to use with the origin model. This sets the context and role for the LLM

template
string
required

User message template with placeholders for fields. Use curly braces for field substitution

fields
array of strings
required

List of field names that will be substituted into the template. Must match keys in golden records

goldens
array | null

Training examples (legacy parameter). Use train_goldens and test_goldens for better control. Minimum 25 examples (or 3 with prototype_mode=true)

train_goldens
array | null

Training examples for prompt optimization. Minimum 25 examples required (or 3 with prototype_mode=true). Cannot be used with 'goldens' parameter

test_goldens
array | null

Test examples for evaluation. Required if train_goldens is provided. Used to measure final performance on held-out data

origin_model
RequestProvider | null

The model your current prompt is optimized for (baseline).

target_models
array of objects
required

List of models to optimize the prompt for. Maximum count depends on your subscription tier (Free: 1, Starter: 3, Startup: 5, Enterprise: 10)


Optional baseline score for the origin model. If provided, can skip origin model evaluation

prototype_mode
boolean
Defaults to false

Enable prototype mode to use as few as 3 training examples (instead of 25). Note: Performance may be degraded with fewer examples. Recommended for prototyping AI applications when you don't have enough data yet

Responses

404

Not found
