Joint prompt optimization

To further optimize model routing, we can optimize system prompts for each model so that we're calling the best model with the best prompt.

We've written tutorials on using two popular prompt optimization libraries, SAMMO and DSPy, to automatically optimize prompts for each model in a data-driven way. We can also hand-engineer a system prompt for each model. However we arrive at our prompts, we can assign a specific prompt to each model in Not Diamond using the LLMConfig class:

from notdiamond.llms.config import LLMConfig

gpt_3_5_turbo_prompt = "Summarize this essay:"
claude_3_opus_prompt = "Distill the essence of this document:"

llms = [
    LLMConfig(
        provider="openai",
        model="gpt-3.5-turbo",
        system_prompt=gpt_3_5_turbo_prompt
    ),
    LLMConfig(
        provider="anthropic",
        model="claude-3-opus-20240229",
        system_prompt=claude_3_opus_prompt
    ),
]

The same configuration in TypeScript:

import { NotDiamond } from 'notdiamond';

const gpt_3_5_turbo_prompt: string = "Summarize this essay:";
const claude_3_opus_prompt: string = "Distill the essence of this document:";

const llms = [
  {
    provider: 'openai',
    model: 'gpt-3.5-turbo',
    systemPrompt: gpt_3_5_turbo_prompt
  },
  {
    provider: 'anthropic',
    model: 'claude-3-opus-20240229',
    systemPrompt: claude_3_opus_prompt
  },
];

We can also run our optimized (prompt, LLM) pairs through an evaluation pipeline to jointly optimize model routing with our prompting in a data-driven way. Once we generate evaluation scores, we can use them to train our own custom router.
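As a rough illustration of what such an evaluation pipeline might look like, the sketch below scores every (prompt, LLM) pair on every sample and collects one row of scores per input. Everything in it is an assumption made for illustration: `evaluate_pairs`, `mean_scores`, `fake_generate`, and `exact_match` are hypothetical helpers, not part of the Not Diamond SDK, and `fake_generate` is a stand-in for real model calls. The dataset format that custom router training actually expects is not shown here.

```python
from typing import Callable, Dict, List, Tuple

# A candidate is a (model name, system prompt) pair.
Candidate = Tuple[str, str]


def evaluate_pairs(
    candidates: List[Candidate],
    samples: List[Dict[str, str]],
    generate: Callable[[str, str, str], str],
    score: Callable[[str, str], float],
) -> List[Dict[str, object]]:
    """Score every (prompt, LLM) pair on every sample; one row per sample."""
    rows = []
    for sample in samples:
        row: Dict[str, object] = {"input": sample["input"]}
        for model, system_prompt in candidates:
            output = generate(model, system_prompt, sample["input"])
            row[model] = score(output, sample["reference"])
        rows.append(row)
    return rows


def mean_scores(
    rows: List[Dict[str, object]], candidates: List[Candidate]
) -> Dict[str, float]:
    """Average each candidate's score across all samples."""
    return {
        model: sum(float(row[model]) for row in rows) / len(rows)
        for model, _ in candidates
    }


def fake_generate(model: str, system_prompt: str, text: str) -> str:
    # Hypothetical stand-in for a real LLM call, so the sketch runs offline.
    return text.upper() if model == "openai/gpt-3.5-turbo" else text


def exact_match(output: str, reference: str) -> float:
    # Simplest possible metric; a real pipeline would use a task-specific one.
    return 1.0 if output == reference else 0.0


candidates = [
    ("openai/gpt-3.5-turbo", "Summarize this essay:"),
    ("anthropic/claude-3-opus-20240229", "Distill the essence of this document:"),
]
samples = [
    {"input": "abc", "reference": "ABC"},
    {"input": "Def", "reference": "Def"},
]

rows = evaluate_pairs(candidates, samples, fake_generate, exact_match)
averages = mean_scores(rows, candidates)
```

Keeping per-sample rows rather than only the averages matters here: two pairs can have the same mean score while each winning on different inputs, and that per-input signal is what a router can learn from.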

For any given system prompt and any given message, Not Diamond will discover and route to the best LLM available, making our prompt engineering more robust by default. However, by jointly optimizing our (prompt, LLM) pairs, we can drive even further improvements in the accuracy and robustness of our applications.