# 06-technical-discussion
c
I want to improve my retrieval with my vector database (in my case, PGVector). Sometimes it comes back with some unrelated articles. My plan is to have an LLM look at the results and decide if it's actually a good result, but I'm afraid it will hurt speed. Has anyone tried to do something like this - AI-filtered retrieval results?
v
What you're describing is re-ranking the results based on relevance. LLMs cannot do this because they need to be explicitly trained on relevance. On a side note: if you're getting very irrelevant results, you might want to look into the quality and granularity of the embeddings that have been generated. Also, what's the number of documents you're retrieving for each query?
c
I think it could judge relevance. Just ask it: does this answer the question? For instance, I'm asking "What team is Cristina Escalante on?", which is only answered in one article - "About Cristina Escalante". I'm getting that back, but also things she's just mentioned in. I don't want articles that say "if you get stuck, ask Cristina Escalante". I'm also getting "About Priscilla de Roode Torres"... I'm guessing because she has a Spanish name. Cristina isn't mentioned at all. Ask GPT, "does this article answer the following question: { Question }. Return a boolean as JSON in the following format: { isAnswered: BOOLEAN }", then check. If false, eliminate it from the final prompt sent to GPT.
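A minimal sketch of that filter, assuming the OpenAI Python client; the model name, prompt wording, and the `is_answered` / `filter_results` helpers are placeholders, not what anyone here actually ships:

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def is_answered(question: str, article: str, model: str = "gpt-4o-mini") -> bool:
    """Ask the model whether `article` actually answers `question`."""
    resp = client.chat.completions.create(
        model=model,
        response_format={"type": "json_object"},
        messages=[{
            "role": "user",
            "content": (
                f"Does this article answer the following question: {question}\n\n"
                f"Article:\n{article}\n\n"
                'Return JSON in the following format: {"isAnswered": true or false}'
            ),
        }],
    )
    return json.loads(resp.choices[0].message.content).get("isAnswered", False)

def filter_results(question: str, articles: list[str]) -> list[str]:
    """Keep only the retrieved articles the model says actually answer the question."""
    return [a for a in articles if is_answered(question, a)]
```

On the speed worry: each per-article check is independent, so they can run concurrently rather than one after another, which keeps the added latency close to a single extra LLM call.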
There are lots of documents... all being pulled from our internal Notion. I think we're capping at 7, but if we could return more and then sort, I think it would improve.
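If the cap really is 7, one option is to over-fetch candidates from PGVector and let a filter like the one above prune them. A rough sketch with psycopg, where the `articles` table and `embedding` column names are assumptions about the schema (`<=>` is pgvector's cosine-distance operator):

```python
import psycopg

def fetch_candidates(conn: psycopg.Connection, query_embedding: list[float], k: int = 20):
    """Over-fetch the k nearest articles so a later relevance filter has room to prune."""
    # pgvector expects a literal like '[0.1,0.2,...]' for the vector parameter.
    vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT id, title, body
            FROM articles
            ORDER BY embedding <=> %s::vector
            LIMIT %s
            """,
            (vector_literal, k),
        )
        return cur.fetchall()
```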
The quality of the Notion articles could be better. There are a lot of images and links that don't add very much.
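On the Notion quality point, a cheap pre-processing pass before chunking and embedding can cut a lot of that noise. A rough sketch for a markdown export; the regexes are illustrative, not tuned to any particular export format:

```python
import re

def clean_notion_markdown(text: str) -> str:
    """Strip low-value noise from a Notion markdown export before embedding."""
    text = re.sub(r"!\[[^\]]*\]\([^)]*\)", "", text)      # drop image embeds
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)  # keep link text, drop URL
    text = re.sub(r"https?://\S+", "", text)               # drop bare URLs
    text = re.sub(r"\n{3,}", "\n\n", text)                 # collapse leftover blank lines
    return text.strip()
```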
That's the bulk of the problem, but sometimes it doesn't find things I think it absolutely should.
j
yeah - dm'ed you
🔥 1