Awesome. My use case is to retrieve a bunch of documents from a vector store on the first query and then make LLM calls over that text, together with the chat history. If you were designing the stack, how would you do it? My current approach: the first backend call goes to a Flask application, which fetches the top-k documents from the vector store; that text is then used as context for the ChatOpenAI calls made through an edge service.
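
To make the flow concrete, here is a minimal sketch of the two pieces the Flask backend would own: top-k retrieval and prompt assembly. All function names here are hypothetical, the "vector store" is just an in-memory list of embeddings, and the actual ChatOpenAI call (which would happen in the edge service) is left out.

```python
import math

def cosine(a, b):
    # plain cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb + 1e-9)

def top_k(query_vec, docs, doc_vecs, k=3):
    # rank stored documents by similarity to the query embedding
    # (stands in for the real vector-store query in the Flask backend)
    scored = sorted(
        zip(docs, doc_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [doc for doc, _ in scored[:k]]

def build_prompt(question, context_docs, history):
    # fold retrieved context and chat history into one prompt string,
    # which the edge service would then send to ChatOpenAI
    context = "\n".join(context_docs)
    past = "\n".join(f"{role}: {msg}" for role, msg in history)
    return (
        "Use the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"History:\n{past}\n\n"
        f"Question: {question}"
    )
```

One design note: keeping retrieval in the backend and the LLM call in the edge service means the retrieved context has to cross a service boundary on every turn; an alternative is to have the backend return only document IDs and let the edge service hydrate and prompt.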