Introduction

Developing retrieval-augmented generation (RAG) applications often involves a lot of trial and error, and a configuration that works for one document store rarely transfers to another. When a RAG application fails, it is also hard to tell whether the cause is poorly optimized retrieval parameters or a hallucinated response from the LLM. To address these problems, Not Diamond offers a suite of tools that helps developers manage the end-to-end development lifecycle of RAG applications.

In the next few sections, we will walk you through a typical RAG workflow. All you need to bring is your document store, and we will show you how to:

  1. Auto-generate test data from your documents.
  2. Auto-optimize retrieval parameters and settings for RAG workflows using the generated test data.
  3. Auto-evaluate the performance of your RAG workflow using the optimized retrieval parameters.

With the evaluation results of your RAG workflow, you can then train a custom router that automatically selects the best LLM for each user input, maximizing overall accuracy while reducing the cost and latency of your application.