@Pasquale Antonante and I just published a deep dive into our findings on RAG Eval.

What’s inside:
• In-Depth Metric Analysis: pros & cons of various deterministic and LLM-based retrieval metrics
• Comparative Benchmarking: GPT-4, GPT-3.5, and Claude 2.1 in retrieval assessment without ground truth labels
• Step-by-Step Guide: using metrics for systematic quality enhancement
• Open-Source Tool: continuous-eval to run plug-&-play evaluation on your dataset

Whether you are already super experienced with RAG Eval or new to setting it up for your pipeline, we'd love to hear your feedback!
