Introducing document structure-based retrieval for RAG: developers can navigate to the right sections of the right documents and extract chunks from them. This is most relevant for well-structured documents where sections and sub-sections are clearly defined, such as legal documents, financial reports, and healthcare papers.
This builds on our open-source rule-based retrieval Python package, which provides a rule-based abstraction layer for performing filtered vector similarity retrieval based on page numbers. While page numbers are a good start, developers need retrieval solutions that map to the structure of their documents.
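To make the idea concrete, here is a minimal sketch of section-filtered similarity retrieval in plain Python. It is not the API of the package mentioned above: the `Chunk` class, the `document` and `section` metadata fields, the `retrieve` function, and the toy embeddings are all hypothetical and chosen only for illustration. The sketch first narrows the candidates to one section of one document, then ranks only those candidates by cosine similarity.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """A text chunk with hypothetical structural metadata."""
    text: str
    document: str          # assumed metadata: source document name
    section: str           # assumed metadata: section heading the chunk sits under
    embedding: List[float]  # toy vector; a real pipeline would use an embedding model


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(chunks: List[Chunk], query_embedding: List[float],
             document: str, section: str, top_k: int = 2) -> List[Chunk]:
    """Apply the structural rule first, then rank the survivors by similarity."""
    candidates = [c for c in chunks
                  if c.document == document and c.section == section]
    ranked = sorted(candidates,
                    key=lambda c: cosine(c.embedding, query_embedding),
                    reverse=True)
    return ranked[:top_k]


chunks = [
    Chunk("Either party may terminate with 30 days notice.",
          "contract.pdf", "Termination", [0.9, 0.1, 0.0]),
    Chunk("Termination for cause takes effect immediately.",
          "contract.pdf", "Termination", [0.7, 0.3, 0.0]),
    Chunk("Fees are due within 45 days of invoice.",
          "contract.pdf", "Payment", [0.8, 0.2, 0.1]),
    Chunk("Either party may terminate with 60 days notice.",
          "other.pdf", "Termination", [0.9, 0.1, 0.0]),
]

# Query restricted to the "Termination" section of contract.pdf.
results = retrieve(chunks, [1.0, 0.0, 0.0], "contract.pdf", "Termination", top_k=1)
for chunk in results:
    print(chunk.section, "|", chunk.text)
```

Because the filter runs before ranking, the identically embedded chunk from `other.pdf` can never be returned, which is the kind of determinism a structural rule adds on top of similarity search.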
WhyHow.AI is building tools to help developers bring more determinism and control to their RAG pipelines using graph structures. If you would like early access to this tool, email us at team@whyhow.ai or join the conversation on Discord (https://lnkd.in/ezzTkqzj).
https://medium.com/enterprise-rag/deterministic-document-structure-based-retrieval-472682f9629a