Strong vs weak LLM routing
A common use case is routing between a strong model and a weak (cheaper, faster) model. With Not Diamond, we can achieve this as follows:
from notdiamond import NotDiamond

client = NotDiamond()

strong_model = "openai/gpt-4o"      # Choose your strong model
weak_model = "openai/gpt-4o-mini"   # Choose your weak model

result, session_id, provider = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Concisely explain merge sort."},
    ],
    model=[strong_model, weak_model],
    tradeoff="cost",  # Send queries to the weak model when doing so doesn't degrade quality
)

print("LLM called: ", provider.model)  # The LLM routed to
print("LLM output: ", result.content)  # The LLM response