Quick start: pre-trained router

Installation

pip install notdiamond  # Python
npm install notdiamond  # TypeScript

Setting up

Set your Not Diamond API key.

export NOTDIAMOND_API_KEY=YOUR_NOTDIAMOND_API_KEY

Sending your first Not Diamond API request

import os

from notdiamond import NotDiamond

# Define the Not Diamond routing client
client = NotDiamond(api_key=os.environ.get("NOTDIAMOND_API_KEY"))

# The best LLM is determined by Not Diamond based on the messages and specified models
result = client.model_router.select_model(
    messages=[ 
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Concisely explain merge sort."}  # Adjust as desired
    ],
    llm_providers=[
        {"provider": "openai", "model": "gpt-5-2025-08-07"},
        {"provider": "openai", "model": "gpt-5-nano-2025-08-07"},
        {"provider": "anthropic", "model": "claude-sonnet-4-20250514"}
    ],
    tradeoff="cost"  # Optional: "cost", "latency", or None (default, optimizes for quality)
)

print("Not Diamond session ID:", result.session_id)  # A unique ID for Not Diamond's recommendation
print("LLM called:", result.provider.model)  # The LLM routed to

# Now call the selected model using your preferred SDK (OpenAI, Anthropic, etc.)
import { NotDiamond } from 'notdiamond';

// Initialize the Not Diamond client
const client = new NotDiamond({api_key: process.env.NOTDIAMOND_API_KEY});

// The best LLM is determined by Not Diamond based on the messages and specified models
const result = await client.modelRouter.selectModel({
  messages: [
    { role: 'system', content: 'You are a world class programmer.' },
    { role: 'user', content: 'Concisely explain merge sort.' }  // Adjust as desired
  ],
  llmProviders: [
    { provider: 'openai', model: 'gpt-5-2025-08-07' },
    { provider: 'openai', model: 'gpt-5-nano-2025-08-07' },
    { provider: 'anthropic', model: 'claude-sonnet-4-20250514' }
  ],
  tradeoff: 'cost'  // Optional: 'cost', 'latency', or omit (default, optimizes for quality)
});

console.log('Not Diamond session ID:', result.sessionId);  // A unique ID of Not Diamond's recommendation
console.log('LLM called:', result.provider.model);  // The LLM routed to

// Now call the selected model using your preferred SDK (OpenAI, Anthropic, etc.)

Breaking down this example

We first define the routing client, which you can think of as a meta-LLM that combines multiple LLMs. You can define multiple clients, each with a different configuration for a different purpose, throughout your application.

client = NotDiamond(api_key=os.environ.get("NOTDIAMOND_API_KEY"))
const client = new NotDiamond({api_key: process.env.NOTDIAMOND_API_KEY});

After initializing the client, we pass in an array of messages and the models we want to route between:

result = client.model_router.select_model(
    messages=[ 
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Concisely explain merge sort."}  # Adjust as desired
    ],
    llm_providers=[
        {"provider": "openai", "model": "gpt-5-2025-08-07"},
        {"provider": "openai", "model": "gpt-5-nano-2025-08-07"},
        {"provider": "anthropic", "model": "claude-sonnet-4-20250514"}
    ],
    tradeoff="cost"  # Optional: "cost", "latency", or None (default, optimizes for quality)
)
const result = await client.modelRouter.selectModel({
  messages: [
    { role: 'system', content: 'You are a world class programmer.' },
    { role: 'user', content: 'Concisely explain merge sort.' }  // Adjust as desired
  ],
  llmProviders: [
    { provider: 'openai', model: 'gpt-5-2025-08-07' },
    { provider: 'openai', model: 'gpt-5-nano-2025-08-07' },
    { provider: 'anthropic', model: 'claude-sonnet-4-20250514' }
  ],
  tradeoff: 'cost'  // Optional: 'cost', 'latency', or omit (default, optimizes for quality)
});

This returns a session ID and a recommended model:

  • Session ID: a unique ID for this specific recommendation.
  • Provider: the LLM selected by the Not Diamond API as the most appropriate for responding to the query. You then use this provider to make the actual LLM call with your preferred SDK.

The optional tradeoff argument controls how this selection is made: by default, Not Diamond maximizes quality, while tradeoff="cost" prefers cheaper models when the quality loss is negligible and tradeoff="latency" prefers faster models. You can also define custom cost and latency attributes for specific models.
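Both snippets above end with a comment noting that you call the selected model with your preferred SDK. A minimal sketch of that dispatch step, assuming the OpenAI and Anthropic Python SDKs are installed and their API keys are set in the environment; completion_for is a hypothetical helper, not part of the Not Diamond SDK:

```python
def completion_for(provider: str, model: str, messages: list) -> str:
    """Call the routed-to model with the matching vendor SDK and return its reply text."""
    if provider == "openai":
        from openai import OpenAI
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        resp = client.chat.completions.create(model=model, messages=messages)
        return resp.choices[0].message.content
    if provider == "anthropic":
        from anthropic import Anthropic
        client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
        # Anthropic takes the system prompt as a separate argument, not a message
        system = " ".join(m["content"] for m in messages if m["role"] == "system")
        chat = [m for m in messages if m["role"] != "system"]
        resp = client.messages.create(model=model, system=system, messages=chat, max_tokens=1024)
        return resp.content[0].text
    raise ValueError(f"No SDK wired up for provider: {provider!r}")

# Usage with the routing result from select_model (assumes result.provider also
# exposes a .provider name alongside the .model printed above):
# answer = completion_for(result.provider.provider, result.provider.model, messages)
```

You can extend the dispatch with a branch per provider you pass in llm_providers, so the router's choice always maps to a concrete SDK call.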

Next steps

In this example, we've learned how to dynamically route an incoming query to the best-suited LLM among a set of candidates. To explore all the features Not Diamond offers, check out the following guides.