Summary

We built an open-source library that traverses an existing vector database in a graph-like manner to answer complex questions. On the hotpot_qa dataset, we saw a 1.7x increase in “perfect” retrievals over vector search, as well as a 5x decrease in catastrophic failures.

The architecture is built on DataStax’s Astra DB for easy vector management and Pongo’s semantic filter for pinpoint retrieval performance.

The Problem

RAG is a powerful tool, but if you ask a complex question like “What is the CEO of Pongo's favorite color?”, many systems fall flat because the query on its own is incomplete. If the target document says “Caleb's favorite color is orange”, naive RAG will fail to retrieve it: nothing in the query mentions Caleb, even though Caleb is the CEO of Pongo. Things get even worse if you have multiple documents stating people’s favorite colors, which may lead to a confidently incorrect answer about someone else’s favorite color.

The Solution

There are multiple solutions to this problem, such as setting up and maintaining a graph database, as in Microsoft’s recent GraphRAG. We propose a simpler approach that reuses your existing vector database and applies the following recursive procedure:

  1. Assess whether the question can be answered from the current documents
    1. Yes? Return the answer
    2. No? Continue
  2. Expand the query into its individual components based on the provided documents (if present)
  3. Fetch relevant documents for each sub-query
  4. Take the top 1-3 documents from each sub-query and add them to the current documents, ignoring duplicates
  5. Return to step 1
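
The loop above can be sketched in a few lines of Python. This is a minimal illustration, not the library's actual API: `recursive_retrieve`, `try_answer`, `expand_query`, and `search` are hypothetical names, and the `max_rounds` cap is an added assumption to guarantee termination. In a real system, `try_answer` and `expand_query` would presumably be LLM calls and `search` a vector database query; here they are rule-based toy stand-ins over a three-document corpus.

```python
import re
from typing import Callable, List, Optional


def recursive_retrieve(
    question: str,
    try_answer: Callable[[str, List[str]], Optional[str]],
    expand_query: Callable[[str, List[str]], List[str]],
    search: Callable[[str], List[str]],
    top_k: int = 2,       # step 4 suggests the top 1-3 documents per sub-query
    max_rounds: int = 5,  # assumption: cap the recursion so it always ends
) -> Optional[str]:
    docs: List[str] = []
    for _ in range(max_rounds):
        # Step 1: can we answer from the documents gathered so far?
        answer = try_answer(question, docs)
        if answer is not None:
            return answer
        # Steps 2-4: expand into sub-queries, fetch, merge without duplicates.
        added = False
        for sub_query in expand_query(question, docs):
            for doc in search(sub_query)[:top_k]:
                if doc not in docs:
                    docs.append(doc)
                    added = True
        if not added:
            return None  # nothing new was retrieved, so another round cannot help
        # Step 5: loop back to step 1.
    return try_answer(question, docs)


# --- Toy stand-ins (hypothetical, for illustration only) ---
CORPUS = [
    "Caleb is the CEO of Pongo.",
    "Caleb's favorite color is orange.",
    "Jamari's favorite color is blue.",
]


def tokens(text: str) -> set:
    return set(re.findall(r"[a-z']+", text.lower()))


def toy_search(query: str) -> List[str]:
    # Rank documents by word overlap with the query (stand-in for vector search).
    return sorted(CORPUS, key=lambda d: len(tokens(query) & tokens(d)), reverse=True)


def toy_try_answer(question: str, docs: List[str]) -> Optional[str]:
    has_ceo = any("Caleb is the CEO of Pongo" in d for d in docs)
    color_doc = next((d for d in docs if d.startswith("Caleb's favorite color is ")), None)
    if has_ceo and color_doc:
        return color_doc.rstrip(".").split()[-1]
    return None


def toy_expand_query(question: str, docs: List[str]) -> List[str]:
    # Once the CEO is known, the sub-query can name him directly.
    if any("is the CEO of Pongo" in d for d in docs):
        return ["Caleb's favorite color"]
    return ["Who is the CEO of Pongo?"]


answer = recursive_retrieve(
    "What is the CEO of Pongo's favorite color?",
    toy_try_answer, toy_expand_query, toy_search, top_k=1,
)
print(answer)  # orange
```

In this toy run, the first round retrieves the document identifying Caleb as CEO, and the second round uses that fact to build a sub-query that retrieves Caleb's favorite color rather than Jamari's.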
