Skip to content

RAG

Art of Looking at RAG Data

In the past year, I've done a lot of consulting on helping companies improve their RAG applications. One of the biggest things I want to call out is the idea of topics and capabilities.

I use this distinction to train teams to identify and look at the data we have to figure out what we need to build next.

Predictions for the Future of RAG

In the next 6 to 8 months, RAG will be used primarily for report generation. We'll see a shift from using RAG agents as question-answering systems to using them more as report-generation systems. This is because the value you can get from a report is much greater than the current RAG systems in use. I'll explain this by discussing what I've learned as a consultant about understanding value and then how I think companies should describe the value they deliver through RAG.

Rag is the feature, not the benefit.

Systematically Improving Your RAG

RAG Course

Check out this course if you're interested in systematically improving RAG.

These are notes generated after a call I had with Hamel on a 'system' to improve a RAG system. I've also written some other work like Rag is not Embeddings and how to build a Terrible RAG System and how complexity can be broken down into smaller pieces.

By the end of this post, you'll have a clear understanding of my systematic approach to improving RAG applications for the companies I work with. We'll cover key areas such as:

  • Create synthetic questions and answers to quickly evaluate your system's precision and recall
  • Make sure to combine full-text search and vector search for optimal retrieval
  • Implementing the right user feedback mechanisms to capture specifically what you're interested in studying
  • Use clustering to find segments of queries that have issues, broken down into topics and capabilities
  • Build specific systems to improve capabilities
  • Continuously monitoring, evaluating as real-world data grows

Through this step-by-step runbook, you'll gain practical knowledge on how to incrementally enhance the performance and utility of your RAG applications, unlocking their full potential to deliver exceptional user experiences and drive business value. Let's dive in and explore how to systematically improve your RAG systems together!

RAG Course

I'm building a RAG Course right now, if you're interested in the course please fill out this form

RAG (Retrieval-Augmented Generation), is a powerful technique that combines information retrieval with LLMs to provide relevant and accurate responses to user queries. By searching through a large corpus of text and retrieving the most relevant chunks, RAG systems can generate answers that are grounded in factual information.

In this post, we'll explore six key areas where you can focus your efforts to improve your RAG search system. These include using synthetic data for baseline metrics, adding date filters, improving user feedback copy, tracking average cosine distance and Cohere reranking score, incorporating full-text search, and efficiently generating synthetic data for testing.

Levels of Complexity: RAG Applications

RAG Course

Check out this course if you're interested in systematically improving RAG.

This post comprehensive guide to understanding and implementing RAG applications across different levels of complexity. Whether you're a beginner eager to learn the basics or an experienced developer looking to deepen your expertise, you'll find valuable insights and practical knowledge to help you on your journey. Let's embark on this exciting exploration together and unlock the full potential of RAG applications.

If you want to learn about my consulting practice check out my services page. If you're interested in working together please reach out to me via email

This is a work in progress and mostly an outline of what I want to write. I'm mostly looking for feedback

Stop using LGTM@Few as a metric (Better RAG)

I work with a few seed series a startups that are ramping out their retrieval augmented generation systems. I've noticed a lot of unclear thinking around what metrics to use and when to use them. I've seen a lot of people use "LGTM@Few" as a metric, and I think it's a terrible idea. I'm going to explain why and what you should use instead.

If you want to learn about my consulting practice check out my services page. If you're interested in working together please reach out to me via email


When giving advice to developers on improving their retrieval augmented generation, I usually say two things:

  1. Look at the Data
  2. Don't just look at the Data

Wise men speak in paradoxes because we are afraid of half-truths. This blog post will try to capture when to look at data and when to stop looking at data in the context of retrieval augmented generation.

I'll cover the different relevancy and ranking metrics, some stories to help you understand them, their trade-offs, and some general advice on how to think.

How to build a terrible RAG system

RAG Course

I'm building a RAG Course right now, if you're interested in the course please fill out this form

If you've seen any of my work, you know that the main message I have for anyone building a RAG system is to think of it primarily as a recommendation system. Today, I want to introduce the concept of inverted thinking to address how we should approach the challenge of creating an exceptional system.

What is inverted thinking?

Inversion is the practice of thinking through problems in reverse. It's the practice of “inverting” a problem - turning it upside down - to see it from a different perspective. In its most powerful form, inversion is asking how an endeavor could fail, and then being careful to avoid those pitfalls. [1]

RAG Course

Check out this course if you're interested in systematically improving RAG.

With the advent of large language models (LLM), retrieval augmented generation (RAG) has become a hot topic. However throught the past year of helping startups integrate LLMs into their stack I've noticed that the pattern of taking user queries, embedding them, and directly searching a vector store is effectively demoware.

What is RAG?

Retrieval augmented generation (RAG) is a technique that uses an LLM to generate responses, but uses a search backend to augment the generation. In the past year using text embeddings with a vector databases has been the most popular approach I've seen being socialized.

RAG

Simple RAG that embedded the user query and makes a search.

So let's kick things off by examining what I like to call the 'Dumb' RAG Model—a basic setup that's more common than you'd think.