Fallbacks and timeouts

Not Diamond was specifically designed not to be a proxy layer, eliminating the risk of disruptions if Not Diamond ever fails to return a response.

We can define a timeout specifying how many seconds to wait for an API response from Not Diamond, and we can configure a fallback model to use by default in case of an error or timeout. The default parameter is a string naming the specific model from the llm_providers list we want to use as a fallback.

result, session_id, provider = client.chat.completions.create(
    messages=[ 
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Consiely explain merge sort."}  # Adjust as desired
    ],
    model=['openai/gpt-3.5-turbo', 'openai/gpt-4o', 'anthropic/claude-3-5-sonnet-20240620'],
    timeout=5,
    default="openai/gpt-4o-2024-05-13"
)
const result = await notDiamond.create({
  messages: [
    { role: 'system', content: 'You are a world class programmer.' },
    { role: 'user', content: 'Concisely explain merge sort.' }  // Adjust as desired
  ],
  llmProviders: [
    { provider: 'openai', model: 'gpt-3.5-turbo' },
    { provider: 'openai', model: 'gpt-4o' },
    { provider: 'anthropic', model: 'claude-3-5-sonnet-20240620' }
  ],
  timeout: 5,
  default: 'openai/gpt-4o'
});

The default value for timeout is 5 seconds. If no default LLM is defined, Not Diamond will automatically consider the first LLM specified in your list as the default model.
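For example, in the following call (a minimal sketch reusing the model list above), neither timeout nor default is set, so Not Diamond waits up to 5 seconds for a response and then falls back to openai/gpt-3.5-turbo, the first model in the list:

result, session_id, provider = client.chat.completions.create(
    messages=[{"role": "user", "content": "Concisely explain merge sort."}],
    model=['openai/gpt-3.5-turbo', 'openai/gpt-4o', 'anthropic/claude-3-5-sonnet-20240620']
    # No timeout or default set: after 5 seconds without a response,
    # Not Diamond falls back to the first model in the list.
)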

Custom fallback logic

If we want custom logic for defining fallbacks for our requests to specific LLMs, we can use Not Diamond's model_select method to determine the best LLM to call, then implement our own API call logic and fallback behavior.

session_id, provider = client.chat.completions.model_select(
    messages=[
        {"role": "system", "content": "You are a world class programmer."},
        {"role": "user", "content": "Write a merge sort in Python. Be as concise as possible."},
    ],
    model=['openai/gpt-3.5-turbo', 'openai/gpt-4o', 'anthropic/claude-3-5-sonnet-20240620']
)

from openai import OpenAI

openai_client = OpenAI(api_key="OPENAI_API_KEY")

max_retries = 3

if provider.model == "gpt-3.5-turbo":
    for _ in range(max_retries):
        try:
            chat_completion = openai_client.chat.completions.create(
                messages=[
                    {
                        "role": "user",
                        "content": "Write a merge sort in Python. Be as concise as possible.",
                    }
                ],
                model="gpt-3.5-turbo",
            )
            result = chat_completion.choices[0].message.content  # Store the response
            break  # Stop retrying once a response is received
        except Exception:
            continue  # Retry on any API error
import { NotDiamond } from 'notdiamond';
import { OpenAI } from 'openai';
import dotenv from 'dotenv';
dotenv.config();

// Initialize the Not Diamond client
const notDiamond = new NotDiamond({apiKey: process.env.NOTDIAMOND_API_KEY});

// The best LLM is determined by Not Diamond based on the messages and specified models
const result = await notDiamond.modelSelect({
  messages: [
    { role: 'system', content: 'You are a world class programmer.' },
    { role: 'user', content: 'Concisely explain merge sort.' }  // Adjust as desired
  ],
  llmProviders: [
    { provider: 'openai', model: 'gpt-3.5-turbo' },
    { provider: 'openai', model: 'gpt-4o' },
    { provider: 'anthropic', model: 'claude-3-5-sonnet-20240620' }
  ],
  tradeoff: "cost"
});

if ('detail' in result) {
  console.error('Error:', result.detail);
} else {
  console.log('Not Diamond session ID:', result.session_id);  // A unique ID of Not Diamond's recommendation
  console.log('LLM called:', result.providers);  // The LLM routed to

  const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
  const maxRetries = 3;

  const provider = result.providers[0];

  let finalResult = null;

  if (provider.model === 'gpt-3.5-turbo') {
    for (let i = 0; i < maxRetries; i++) {
      try {
        const completion = await openai.chat.completions.create({
          messages: [
            {
              role: 'user',
              content: 'Write a merge sort in Python. Be as concise as possible.',
            }
          ],
          model: 'gpt-3.5-turbo',
        });
        finalResult = completion.choices[0];
        console.log('Response:', finalResult);
        break;
      } catch {
        continue;  // Retry on any API error
      }
    }
  }
}
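If the routed model keeps failing after all retries, we can also fall back to a second model ourselves. The sketch below illustrates one way to do this in Python, reusing the openai_client defined above; the call_with_fallback helper is our own illustration (and assumes both models are served by OpenAI), not part of the Not Diamond SDK:

def call_with_fallback(prompt, primary_model, fallback_model, max_retries=3):
    # Try the routed model up to max_retries times, then the fallback once.
    for model in [primary_model] * max_retries + [fallback_model]:
        try:
            chat_completion = openai_client.chat.completions.create(
                messages=[{"role": "user", "content": prompt}],
                model=model,
            )
            return chat_completion.choices[0].message.content
        except Exception:
            continue  # Move on to the next attempt or the fallback
    raise RuntimeError("All models failed, including the fallback")

response = call_with_fallback(
    "Write a merge sort in Python. Be as concise as possible.",
    primary_model=provider.model,  # the model Not Diamond recommended
    fallback_model="gpt-4o",       # our hardcoded fallback
)

Keeping the retry loop and the fallback choice in our own code means routing failures, API errors, and rate limits can each be handled however our application requires.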