Demo quickstart
Not Diamond is a predictive model recommendation framework that enables you to train a custom router to determine which LLM will provide the highest quality response to any given input based on your evaluation data. To help you understand how Not Diamond works, we've trained a cross-domain demo router that you can use in this quickstart example. You can follow along with the Python and TypeScript code below to make your first API request, or try it in Colab.
Demo router
While the demo router we'll use in this quickstart will give you an understanding of how Not Diamond works and can be leveraged for lightweight general-purpose use, for production-grade applications we encourage teams to train a custom router optimized to their data and evaluation criteria.
Installation
Python: Requires Python 3.9+. We recommend creating and activating a virtualenv before installing the package. For this example, we'll install the optional create dependencies, which you can learn more about here.
pip install "notdiamond[create]"
TypeScript:
npm install notdiamond dotenv
Setting up
Create a .env file with your Not Diamond API key and the API keys of the models you want to route between:
NOTDIAMOND_API_KEY = "YOUR_NOTDIAMOND_API_KEY"
OPENAI_API_KEY = "YOUR_OPENAI_API_KEY"
ANTHROPIC_API_KEY = "YOUR_ANTHROPIC_API_KEY"
You can also define API keys programmatically.
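Whichever way the keys are supplied, a quick pre-flight check can catch a missing key before the first request. The sketch below uses only the standard library and assumes the SDK reads the same environment variable names as the .env example above; the helper name missing_api_keys is ours, not part of the Not Diamond SDK.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_KEYS = ("NOTDIAMOND_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def missing_api_keys(env=None, required=REQUIRED_KEYS):
    """Return the names of required keys that are unset or blank.

    `env` defaults to the process environment; pass a dict to check
    other sources (hypothetical helper, not part of the SDK).
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All API keys are set.")
```

Keys set with os.environ["OPENAI_API_KEY"] = "..." before the client is created would pass this check in the same way as keys loaded from a .env file.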
Sending your first Not Diamond API request
Create a new file in the same directory as your .env file, then copy and run the code below (Python first, then TypeScript):
from notdiamond import NotDiamond
# Define the Not Diamond routing client
client = NotDiamond()
# The best LLM is determined by Not Diamond based on the messages and specified models
result, session_id, provider = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Concisely explain merge sort."}  # Adjust as desired
    ],
    model=['openai/gpt-4o', 'openai/gpt-4o-mini', 'anthropic/claude-3-5-sonnet-20240620']
)
print("Not Diamond session ID: ", session_id) # A unique ID of Not Diamond's recommendation
print("LLM called: ", provider.model) # The LLM routed to
print("LLM output: ", result.content) # The LLM response
import { NotDiamond } from 'notdiamond';
import dotenv from 'dotenv';
dotenv.config();
// Initialize the Not Diamond client
const notDiamond = new NotDiamond();
// The best LLM is determined by Not Diamond based on the messages and specified models
const result = await notDiamond.create({
  messages: [
    { role: 'system', content: 'You are a world class programmer.' },
    { role: 'user', content: 'Concisely explain merge sort.' } // Adjust as desired
  ],
  llmProviders: [
    { provider: 'openai', model: 'gpt-4o' },
    { provider: 'openai', model: 'gpt-4o-mini' },
    { provider: 'anthropic', model: 'claude-3-5-sonnet-20240620' }
  ]
});
if ('detail' in result) {
  console.error('Error:', result.detail);
} else {
  console.log('Not Diamond session ID:', result.session_id); // A unique ID of Not Diamond's recommendation
  console.log('LLM called:', result.providers); // The LLM routed to
  console.log('LLM output:', result.content); // The LLM response
}
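The two examples name the candidate models differently: the Python call passes provider/model strings, while the TypeScript call passes objects with separate provider and model fields. As a minimal sketch of how the two forms correspond, the helpers below (our own names, not SDK functions) split each string at the first slash.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """One candidate LLM, split into the two fields the TypeScript example uses."""
    provider: str
    model: str


def parse_candidate(model_id: str) -> Candidate:
    """Split 'provider/model' at the first slash (hypothetical helper)."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return Candidate(provider, model)


def to_llm_providers(model_ids):
    """Convert Python-style model strings into TypeScript-style dicts."""
    return [parse_candidate(m)._asdict() for m in model_ids]


if __name__ == "__main__":
    models = ["openai/gpt-4o", "openai/gpt-4o-mini",
              "anthropic/claude-3-5-sonnet-20240620"]
    for entry in to_llm_providers(models):
        print(entry)
```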
Breaking down this example
We first define the routing client, which you can think of as a meta-LLM in which we'll combine multiple LLMs. We can define various clients, each with different configurations for different purposes, throughout our application.
client = NotDiamond()
// Initialize the Not Diamond client
const notDiamond = new NotDiamond();
After initializing the client and defining our LLMs, we next pass in an array of messages and the models we want to route between:
result, session_id, provider = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Concisely explain merge sort."}  # Adjust as desired
    ],
    model=['openai/gpt-4o', 'openai/gpt-4o-mini', 'anthropic/claude-3-5-sonnet-20240620']
)
const result = await notDiamond.create({
  messages: [
    { role: 'system', content: 'You are a world class programmer.' },
    { role: 'user', content: 'Concisely explain merge sort.' } // Adjust as desired
  ],
  llmProviders: [
    { provider: 'openai', model: 'gpt-4o' },
    { provider: 'openai', model: 'gpt-4o-mini' },
    { provider: 'anthropic', model: 'claude-3-5-sonnet-20240620' }
  ]
});
This returns a session ID and a recommended model (and, with the Python SDK, the LLM response):
- Session ID: a unique ID for this specific recommendation. This is useful for submitting feedback on routing decisions.
- Provider: the LLM selected by the ND API as the most appropriate for responding to the query.
- LLM response (Python only): in addition to returning a recommended LLM, the Not Diamond Python SDK can also facilitate client-side requests to the recommended LLM with the create method. Alternatively, we can use model_select to simply return a session ID and a provider. You can learn more about these two methods here.
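To illustrate the difference between the two methods, here is a minimal sketch of a recommendation-only call. It assumes that client.chat.completions.model_select accepts the same messages and model arguments as create and returns a (session_id, provider) pair, as the description above suggests; the wrapper name recommend_only is ours. The client is passed in, so the function can be exercised with a stub before wiring in a real NotDiamond client.

```python
def recommend_only(client, messages, models):
    """Ask for a routing recommendation without calling the chosen LLM.

    Assumption: model_select takes `messages` and `model` like `create`
    and returns (session_id, provider); check the SDK docs to confirm.
    """
    session_id, provider = client.chat.completions.model_select(
        messages=messages,
        model=models,
    )
    # provider.model is the attribute printed in the quickstart example.
    return {"session_id": session_id, "model": provider.model}
```

With a real client, this would be called as recommend_only(NotDiamond(), messages, models), and the returned model name could then be used to call the chosen LLM through our own provider client.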
Good for use cases with diverse inputs
Not Diamond's out-of-the-box router (which we leverage in this example) is most useful for applications in which we are handling diverse inputs, such as a chatbot or a code generation assistant. For narrower tasks, we can train a custom router optimized to our data.
Next steps
In this example, we've learned how to dynamically route an array of messages to the best-suited LLM among a set of candidates. To explore all the features that Not Diamond offers, check out the following guides.
Model gateway
If you are already using an OpenAI-compatible LLM endpoint, Not Diamond provides a convenient gateway endpoint that is also OpenAI-compatible. Simply swap your base URL for our gateway URL and you can begin using Not Diamond to optimize your LLM use. Check out the docs.
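To show what swapping the base URL means in practice, the sketch below builds (but does not send) two OpenAI-style chat completions requests that differ only in their base URL. The gateway URL here is a placeholder, not the real endpoint, and the API key and model name the gateway expects are assumptions; take the actual values from the gateway docs.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, model, messages):
    """Build an OpenAI-style POST to <base_url>/chat/completions (not sent)."""
    url = base_url.rstrip("/") + "/chat/completions"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


messages = [{"role": "user", "content": "Concisely explain merge sort."}]

# Placeholder only: substitute the gateway URL given in the Not Diamond docs.
GATEWAY_BASE_URL = "https://gateway.example.invalid/v1"

direct = build_chat_request("https://api.openai.com/v1", "YOUR_API_KEY", "gpt-4o", messages)
routed = build_chat_request(GATEWAY_BASE_URL, "YOUR_API_KEY", "gpt-4o", messages)

print(direct.full_url)
print(routed.full_url)
```

The request body and headers are identical in both cases; only the URL changes.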