# 07-self-promotion
v
We built a semantic cache from scratch at Portkey and are already seeing 20% cache hit rates for Q&A and RAG use cases (at 10M GPT-4 requests a day, that's $2,700 saved a month). Wrote down the technical details, latency benchmarks, etc. here: https://blog.portkey.ai/blog/reducing-llm-costs-and-latency-semantic-cache/ Would love to discuss with anyone exploring this space!
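For anyone new to the idea, here's a rough sketch of what a semantic cache lookup can look like. This is not Portkey's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, and the `SemanticCache` class, its `get`/`put` methods, and the 0.8 similarity threshold are hypothetical names and values picked purely for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    # Toy stand-in for a real embedding model (assumption): word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical cache: returns a stored response when a new prompt
    is similar enough to a previously seen one."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # illustrative value, not a benchmark
        self.entries = []  # list of (embedding, response) pairs

    def put(self, prompt, response):
        self.entries.append((embed(prompt), response))

    def get(self, prompt):
        # Linear scan for the most similar stored prompt; a real system
        # would likely use a vector index instead.
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vec, response in self.entries:
            score = cosine(query, vec)
            if score > best_score:
                best_score, best_response = score, response
        if best_score >= self.threshold:
            return best_response  # cache hit: skip the LLM call
        return None  # cache miss: caller falls through to the LLM


cache = SemanticCache()
cache.put("What is the capital of France?", "Paris")
hit = cache.get("what is the capital of france")
miss = cache.get("How do I reset my password?")
```

Here `hit` comes back from the cache because the two prompts differ only in casing and punctuation, while `miss` is `None` and would go to the model. The threshold is the main knob: set it too low and you serve wrong answers, too high and the hit rate drops.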
👍 2
c
Awesome idea, I love this. Reading up on it and will sign up for the beta.
🙌 1