Quick start
Prompt adaptation is in beta. Please reach out to us if you want to test it out.
Getting started with prompt adaptation only takes a few minutes. In this quickstart example we'll adapt a prompt for digit counting problems to Llama 3.1 8b and 70b.
What you'll need
- A Not Diamond API key
- Your current prompt (system prompt and user message template)
- An evaluation dataset
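An evaluation dataset pairs each input with a known-correct answer. For the digit counting problems in this quickstart, reference answers can be computed exactly with Python's arbitrary-precision integers instead of by hand. The sketch below shows one way to do that. The helper names are hypothetical and not part of the notdiamond SDK, and the sample dict simply mirrors the fields/answer shape used in step 2.

```python
# Hypothetical helpers (not part of the notdiamond SDK) for computing
# reference answers to digit counting questions with exact integer math.

def count_digits(n: int) -> int:
    """Number of decimal digits in n."""
    return len(str(abs(n)))


def count_matching(n: int, wanted: str) -> int:
    """How many digits of n belong to `wanted`, e.g. "13579" for odd digits."""
    return sum(ch in wanted for ch in str(abs(n)))


def count_substring(n: int, pattern: str) -> int:
    """Non-overlapping occurrences of `pattern` in the decimal digits of n."""
    return str(abs(n)).count(pattern)


# Build one sample in the same shape as the goldens defined in step 2.
golden = {
    "fields": {"question": "How many digits are in (9876543210*123456)?"},
    "answer": str(count_digits(9876543210 * 123456)),
}
print(golden["answer"])  # 16
```

Generating answers this way keeps the labels consistent with the questions, which matters because the adaptation job is scored against them.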
1. Setup
Install dependencies:
pip install notdiamond
npm install notdiamond
Set your Not Diamond API key:
export NOTDIAMOND_API_KEY=YOUR_NOTDIAMOND_API_KEY
2. Prepare your data
We'll start by defining 5 training samples and 5 test samples:
# Define your prompts
system_prompt = """You are a mathematical reasoning assistant. Your task is to solve digit counting problems
by first computing the mathematical expression, then carefully analyzing the digits in the result.
Show your work step by step and provide only the final numerical answer."""
prompt_template = """Solve this problem: {question}"""
# Define training samples (5 examples = 20% increments)
train_goldens = [
{
"fields": {"question": "How many digits are in (23874045494*2789392485)?"},
"answer": "20"
},
{
"fields": {"question": "How many odd digits are in (999*777*555*333*111)?"},
"answer": "11"
},
{
"fields": {"question": "How often does the number '17' appear in the digits of (101010101010101010101010101010101*17)?"},
"answer": "17"
},
{
"fields": {"question": "How many even digits are in (222*444*666*888)?"},
"answer": "5"
},
{
"fields": {"question": "How many 0s are in (1234567890*1357908642)?"},
"answer": "4"
}
]
# Define test samples (5 examples = 20% increments)
test_goldens = [
{
"fields": {"question": "How many digits are in (9876543210*123456)?"},
"answer": "16"
},
{
"fields": {"question": "How many odd digits are in (135*579*246)?"},
"answer": "4"
},
{
"fields": {"question": "How often does the number '42' appear in the digits of (1234561010789*42)?"},
"answer": "0"
},
{
"fields": {"question": "How many even digits are in (1111*2222*3333)?"},
"answer": "6"
},
{
"fields": {"question": "How many 9s are in (999999*888888)?"},
"answer": "0"
}
]
print(f"Prepared {len(train_goldens)} training samples and {len(test_goldens)} test samples")
// Define your prompts
const systemPrompt = `You are a mathematical reasoning assistant. Your task is to solve digit counting problems
by first computing the mathematical expression, then carefully analyzing the digits in the result.
Show your work step by step and provide only the final numerical answer.`;
const promptTemplate = `Solve this problem: {question}`;
// Define training samples (5 examples = 20% increments)
const trainGoldens = [
{
fields: { question: "How many digits are in (23874045494*2789392485)?" },
answer: "20"
},
{
fields: { question: "How many odd digits are in (999*777*555*333*111)?" },
answer: "11"
},
{
fields: { question: "How often does the number '17' appear in the digits of (101010101010101010101010101010101*17)?" },
answer: "17"
},
{
fields: { question: "How many even digits are in (222*444*666*888)?" },
answer: "5"
},
{
fields: { question: "How many 0s are in (1234567890*1357908642)?" },
answer: "4"
}
];
// Define test samples (5 examples = 20% increments)
const testGoldens = [
{
fields: { question: "How many digits are in (9876543210*123456)?" },
answer: "16"
},
{
fields: { question: "How many odd digits are in (135*579*246)?" },
answer: "4"
},
{
fields: { question: "How often does the number '42' appear in the digits of (1234561010789*42)?" },
answer: "0"
},
{
fields: { question: "How many even digits are in (1111*2222*3333)?" },
answer: "6"
},
{
fields: { question: "How many 9s are in (999999*888888)?" },
answer: "0"
}
];
console.log(`Prepared ${trainGoldens.length} training samples and ${testGoldens.length} test samples`);
3. Request prompt adaptation
import os
from notdiamond import NotDiamond
client = NotDiamond(api_key=os.environ.get("NOTDIAMOND_API_KEY"))
# Define target models
target_models = [
{"provider": "meta-llama", "model": "llama-3.1-8b-instruct"},
{"provider": "meta-llama", "model": "llama-3.1-70b-instruct"}
]
# Request adaptation with train/test split
response = client.prompt_adaptation.adapt(
system_prompt=system_prompt,
template=prompt_template,
fields=["question"],
train_goldens=train_goldens,
test_goldens=test_goldens,
target_models=target_models,
evaluation_metric="LLMaaJ:Sem_Sim_1",
prototype_mode=True
)
adaptation_run_id = response.adaptation_run_id
print(f"Adaptation job started: {adaptation_run_id}")
import { NotDiamond } from 'notdiamond';
const client = new NotDiamond({api_key: process.env.NOTDIAMOND_API_KEY});
// Define target models
const targetModels = [
  { provider: 'meta-llama', model: 'llama-3.1-8b-instruct' },
  { provider: 'meta-llama', model: 'llama-3.1-70b-instruct' }
];
// Request adaptation with train/test split
const response = await client.promptAdaptation.adapt({
systemPrompt: systemPrompt,
template: promptTemplate,
fields: ['question'],
trainGoldens: trainGoldens,
testGoldens: testGoldens,
targetModels: targetModels,
evaluationMetric: 'LLMaaJ:Sem_Sim_1',
prototypeMode: true
});
const adaptationRunId = response.adaptationRunId;
console.log(`Adaptation job started: ${adaptationRunId}`);
4. Check results
You can monitor your adaptation job status either programmatically via the API or in your Not Diamond dashboard.
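A single status call reports the job state at one moment, so scripted monitoring usually means repeating the call until the job reaches a final state. The sketch below is a generic polling helper, not an SDK feature: `wait_until_done` and its parameters are hypothetical, and the status strings in the commented usage are assumptions, since this page does not list the values the API returns.

```python
import time


def wait_until_done(fetch_status, done_statuses, interval_s=10.0,
                    timeout_s=1800.0, sleep=time.sleep):
    """Call fetch_status() until it returns a value in done_statuses.

    fetch_status is any zero-argument callable returning the current status.
    Raises TimeoutError if no final status is seen within timeout_s seconds.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in done_statuses:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_s} seconds")
        sleep(interval_s)


# Usage with the client from step 3 (status values below are assumed):
# final = wait_until_done(
#     lambda: client.prompt_adaptation.get_adapt_status(adaptation_run_id).status,
#     done_statuses={"completed", "failed"},
# )
```

Passing the status lookup as a callable keeps the helper independent of the SDK, so it can be exercised with a stub before being pointed at a real job.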
Retrieve results programmatically:
# Check status
status = client.prompt_adaptation.get_adapt_status(adaptation_run_id)
print(f"Status: {status.status}")
# Get results when complete
results = client.prompt_adaptation.get_adapt_results(adaptation_run_id)
# View the optimized prompt for each target model
for target in results.target_models:
    print(f"\nModel: {target.model_name}")
    print(f"Pre-optimization score: {target.pre_optimization_score}")
    print(f"Post-optimization score: {target.post_optimization_score}")
    print(f"Improvement: {target.post_optimization_score - target.pre_optimization_score:.2f}")
    print(f"\nOptimized system prompt:\n{target.system_prompt}")
    print(f"\nOptimized template:\n{target.user_message_template}")
# Test with a new example
test_question = "How many 8s are in (1234567890*1357908642*5791108642)?"
print(f"\nTest question: {test_question}")
# Use the optimized prompt with your LLM SDK...
// Check status
const status = await client.promptAdaptation.getAdaptStatus(adaptationRunId);
console.log(`Status: ${status.status}`);
// Get results when complete
const results = await client.promptAdaptation.getAdaptResults(adaptationRunId);
// View the optimized prompt for each target model
for (const target of results.targetModels) {
  console.log(`\nModel: ${target.modelName}`);
  console.log(`Pre-optimization score: ${target.preOptimizationScore}`);
  console.log(`Post-optimization score: ${target.postOptimizationScore}`);
  const improvement = target.postOptimizationScore - target.preOptimizationScore;
  console.log(`Improvement: ${improvement.toFixed(2)}`);
  console.log(`\nOptimized system prompt:\n${target.systemPrompt}`);
  console.log(`\nOptimized template:\n${target.userMessageTemplate}`);
}
// Test with a new example
const testQuestion = "How many 8s are in (1234567890*1357908642*5791108642)?";
console.log(`\nTest question: ${testQuestion}`);
// Use the optimized prompt with your LLM SDK...
Retrieve results in your dashboard:
Best practices
- Use multiple target models for the best results: You can define up to 4 target models per prompt adaptation job. If you're unsure about which model you should optimize for, defining multiple target models lets you see at a glance which model is best suited for your data. You can see the full list of supported models for prompt adaptation in Prompt Adaptation Models.
- More evaluation data reduces the risk of overfitting: To ensure robust results, prompt adaptation requires at least 25 training samples and can support up to 200 samples. However, if you're still in the prototyping stage, you can enable prototype_mode to provide as few as 3 samples. When defining your training and test set sizes, keep in mind that the more samples you provide, the longer your jobs will take.
- Concurrent job limits: Each user can run 1 job at a time, though each job may include multiple target models.
For higher target model limits, higher concurrency limits, or any other needs, please reach out to our team.
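The limits above can be checked locally before a job is submitted. The sketch below encodes the numbers stated in this section. `check_job_limits` is a hypothetical helper, not part of the notdiamond SDK, and it assumes the 200-sample ceiling applies to the training set.

```python
# Limits taken from the best practices above. The helper itself is
# hypothetical and is not part of the notdiamond SDK.
MAX_TARGET_MODELS = 4
MIN_TRAIN_SAMPLES = 25
MAX_TRAIN_SAMPLES = 200  # assumption: the 200-sample ceiling is per training set
MIN_PROTOTYPE_SAMPLES = 3


def check_job_limits(train_goldens, target_models, prototype_mode=False):
    """Return a list of problems found; an empty list means the job fits."""
    problems = []
    if not 1 <= len(target_models) <= MAX_TARGET_MODELS:
        problems.append(
            f"expected 1 to {MAX_TARGET_MODELS} target models, got {len(target_models)}"
        )
    minimum = MIN_PROTOTYPE_SAMPLES if prototype_mode else MIN_TRAIN_SAMPLES
    if len(train_goldens) < minimum:
        problems.append(
            f"need at least {minimum} training samples, got {len(train_goldens)}"
        )
    if len(train_goldens) > MAX_TRAIN_SAMPLES:
        problems.append(
            f"at most {MAX_TRAIN_SAMPLES} training samples, got {len(train_goldens)}"
        )
    return problems


# The quickstart's 5 training samples and 2 target models pass only
# because prototype_mode is enabled.
print(check_job_limits([{}] * 5, [{}] * 2, prototype_mode=True))  # []
print(len(check_job_limits([{}] * 5, [{}] * 2)))  # 1
```

Returning a list of problems instead of raising on the first one lets a script report everything that needs fixing in a single pass.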
Summary and next steps
In this example, we have taken a simple prompt for math problems and optimized it for Llama 3.1 8b and Llama 3.1 70b. You can monitor job progress and review all your adaptation jobs, performance metrics, and optimized prompts in your Not Diamond dashboard.
Below are some additional resources for building a stronger foundation with prompt adaptation:
- Question answering example - Tutorial with real datasets
- Classification example - Tutorial with real datasets
- Evaluation Metrics - Learn about available metrics and custom options
- Supported Models - View all models available for adaptation
- How Prompt Adaptation Works - Understand the optimization process
Looking for model routing instead? Check out the Model Routing Quickstart to intelligently route queries across LLMs.
