Classification
Prompt adaptation is in private beta. Please reach out to us if you want to test it out.
In this example, we'll show you how to use the prompt adaptation API on classification tasks. Specifically, we'll use the PolyAI/banking77 dataset for this tutorial—a dataset of banking customer support queries that need to be classified into 77 different intent categories.
Setup
First, we will install dependencies and download the banking77 dataset from Hugging Face.
pip install notdiamond datasets==3.6.0
npm install notdiamond
# Note: You'll need to download and prepare the dataset separately
Set your Not Diamond API key
export NOTDIAMOND_API_KEY=YOUR_NOTDIAMOND_API_KEY
Download the dataset
from datasets import load_dataset
n_samples = 25
ds = load_dataset("PolyAI/banking77")["train"].select(list(range(n_samples)))
Next, we define the system prompt and prompt template for our current workflow.
system_prompt = """
You are a helpful assistant that categorizes banking-related questions provided by the user.
"""
prompt_template = """
Sharing a text from banking domain which needs to be classified into one of the 77 classes mentioned below
The text is customer support query.
Categories to classify the data :
{categories}
Text To classify : {question}
"""Request prompt adaptation
First, we will format the dataset for prompt adaptation. Not Diamond expects a list of samples whose fields supply the prompt template's arguments, so make sure that prompt_template.format(**sample['fields']) returns a valid user prompt for each sample. For classification tasks, it is important that the "answer" is the ground-truth class name of the sample.
import json
categories = [ # pre-extracted for convenience
"activate_my_card",
"age_limit",
"apple_pay_or_google_pay",
"atm_support",
"automatic_top_up",
"balance_not_updated_after_bank_transfer",
"balance_not_updated_after_cheque_or_cash_deposit",
"beneficiary_not_allowed",
"cancel_transfer",
"card_about_to_expire",
"card_acceptance",
"card_arrival",
"card_delivery_estimate",
"card_linking",
"card_not_working",
"card_payment_fee_charged",
"card_payment_not_recognised",
"card_payment_wrong_exchange_rate",
"card_swallowed",
"cash_withdrawal_charge",
"cash_withdrawal_not_recognised",
"change_pin",
"compromised_card",
"contactless_not_working",
"country_support",
"declined_card_payment",
"declined_cash_withdrawal",
"declined_transfer",
"direct_debit_payment_not_recognised",
"disposable_card_limits",
"edit_personal_details",
"exchange_charge",
"exchange_rate",
"exchange_via_app",
"extra_charge_on_statement",
"failed_transfer",
"fiat_currency_support",
"get_disposable_virtual_card",
"get_physical_card",
"getting_spare_card",
"getting_virtual_card",
"lost_or_stolen_card",
"lost_or_stolen_phone",
"order_physical_card",
"passcode_forgotten",
"pending_card_payment",
"pending_cash_withdrawal",
"pending_top_up",
"pending_transfer",
"pin_blocked",
"receiving_money",
"refund_not_showing_up",
"request_refund",
"reverted_card_payment",
"supported_cards_and_currencies",
"terminate_account",
"top_up_by_bank_transfer_charge",
"top_up_by_card_charge",
"top_up_by_cash_or_cheque",
"top_up_failed",
"top_up_limits",
"top_up_reverted",
"topping_up_by_card",
"transaction_charged_twice",
"transfer_fee_charged",
"transfer_into_account",
"transfer_not_received_by_recipient",
"transfer_timing",
"unable_to_verify_identity",
"verify_my_identity",
"verify_source_of_funds",
"verify_top_up",
"virtual_card_not_working",
"visa_or_mastercard",
"why_verify_identity",
"wrong_amount_of_cash_received",
"wrong_exchange_rate_for_cash_withdrawal"
]
fields = ["question", "categories"]
pa_ds = [
    {
        "fields": {
            "question": sample["text"],
            "categories": json.dumps(categories)
        },
        "answer": json.dumps({
            "intent": categories[sample["label"]],  # The intent here should be the class name
        })
    }
    for sample in ds
]
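# Optional sanity check (a sketch, not part of the API): make sure every sample
# formats into a valid user prompt and that each ground-truth intent is one of
# the category names listed above.
for sample in pa_ds:
    prompt_template.format(**sample["fields"])  # raises KeyError if a field is missing
    assert json.loads(sample["answer"])["intent"] in categories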
print(prompt_template.format(**pa_ds[0]['fields']))
Next, specify the origin_model, which you currently query with your existing system prompt and prompt template, and your target_models, which you would like to query with adapted prompts. You can list multiple target models.
origin_model = {"provider": "openai", "model": "gpt-4o-2024-08-06"}
target_models = [
    {"provider": "anthropic", "model": "claude-sonnet-4-20250514"},
]
Finally, call the API to submit the adaptation request to Not Diamond's servers. You will get back an adaptation_run_id.
import os
from notdiamond import NotDiamond
client = NotDiamond(api_key=os.environ.get("NOTDIAMOND_API_KEY"))
response = client.prompt_adaptation.adapt(
    system_prompt=system_prompt,
    template=prompt_template,
    fields=fields,
    goldens=pa_ds,
    origin_model=origin_model,
    target_models=target_models,
    evaluation_metric="LLMaaJ:Sem_Sim_1"
)
adaptation_run_id = response.adaptation_run_id
print(f"Adaptation job started: {adaptation_run_id}")import { NotDiamond } from 'notdiamond';
const client = new NotDiamond({api_key: process.env.NOTDIAMOND_API_KEY});
// Note: Prepare your dataset similar to the Python example above
// Define systemPrompt, promptTemplate, fields, goldenData, originModel, and targetModels
const response = await client.promptAdaptation.adapt({
  systemPrompt: systemPrompt,
  template: promptTemplate,
  fields: fields,
  goldens: goldenData, // Your prepared dataset
  originModel: originModel,
  targetModels: targetModels,
  evaluationMetric: 'LLMaaJ:Sem_Sim_1'
});
const adaptationRunId = response.adaptationRunId;
console.log(`Adaptation job started: ${adaptationRunId}`);
Retrieve results
Once the prompt adaptation job completes, you can retrieve the optimized prompts and evaluation metrics.
# Check job status
status = client.prompt_adaptation.get_adapt_status(adaptation_run_id)
print(f"Status: {status.status}")
# When status is 'completed', fetch results
results = client.prompt_adaptation.get_adapt_results(adaptation_run_id)
# View optimized prompts
for target in results.target_models:
    print(f"\nModel: {target.model_name}")
    print(f"Pre-optimization score: {target.pre_optimization_score}")
    print(f"Post-optimization score: {target.post_optimization_score}")
    print(f"\nOptimized system prompt:\n{target.system_prompt}")
    print(f"\nOptimized template:\n{target.user_message_template}")
// Check job status
const status = await client.promptAdaptation.getAdaptStatus(adaptationRunId);
console.log(`Status: ${status.status}`);
// When status is 'completed', fetch results
const results = await client.promptAdaptation.getAdaptResults(adaptationRunId);
// View optimized prompts
for (const target of results.targetModels) {
  console.log(`\nModel: ${target.modelName}`);
  console.log(`Pre-optimization score: ${target.preOptimizationScore}`);
  console.log(`Post-optimization score: ${target.postOptimizationScore}`);
  console.log(`\nOptimized system prompt:\n${target.systemPrompt}`);
  console.log(`\nOptimized template:\n${target.userMessageTemplate}`);
}
Response format
The API response contains detailed information about the optimization results:
{
  "id": "uuid", // The prompt adaptation id
  "created_at": "datetime", // Timestamp
  "origin_model": {
    "model_name": "openai/gpt-4o-2024-08-06", // The original model the prompt was designed for
    "score": 0.8, // The original model's score on the dataset before optimization
    "evals": {"LLMaaJ:Sem_Sim_1": 0.8}, // The original model's evaluation results on the dataset
    "system_prompt": "...", // The baseline system prompt submitted
    "user_message_template": "...", // The baseline prompt template submitted
    "result_status": "completed"
  },
  "target_models": [
    {
      "model_name": "anthropic/claude-sonnet-4-20250514", // The target model
      "pre_optimization_score": 0.64, // The target model's score on the dataset before optimization
      "pre_optimization_evals": {"LLMaaJ:Sem_Sim_1": 0.64}, // The target model's evaluation results on the dataset before optimization
      "post_optimization_score": 0.8, // The target model's score on the dataset after optimization
      "post_optimization_evals": {"LLMaaJ:Sem_Sim_1": 0.8}, // The target model's evaluation results on the dataset after optimization
      "system_prompt": "...", // The optimized system prompt
      "user_message_template": "...", // The optimized prompt template
      "user_message_template_fields": ["..."], // Field arguments in the user_message_template
      "result_status": "completed"
    }
  ]
}
result_status can have one of the following statuses:
created: the optimization job has been received.
queued: the optimization job is currently in queue to be processed.
processing: the optimization job is currently running. Evaluation scores will be null until the job is completed.
completed: the optimization job is finished and you will see the evaluation scores populated.
failed: the optimization job failed, please try again or contact support.
Each model in target_models will have its own results dictionary. If an adaptation failed for a specific target model, please try again or contact support.
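When consuming the results programmatically, you can skip any target whose adaptation did not complete and drop the adapted prompts straight into your workflow. Below is a minimal sketch, assuming the Python client exposes result_status on each target entry (as in the response format above) and that the adapted template keeps the same field names as your original template (check user_message_template_fields to confirm):
for target in results.target_models:
    if target.result_status != "completed":
        print(f"{target.model_name}: adaptation {target.result_status}; retry or contact support")
        continue
    print(f"{target.model_name}: {target.pre_optimization_score} -> {target.post_optimization_score}")
    # Reuse the adapted prompts in your existing workflow (field names assumed unchanged)
    messages = [
        {"role": "system", "content": target.system_prompt},
        {"role": "user", "content": target.user_message_template.format(**pa_ds[0]["fields"])},
    ]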
