# 06-technical-discussion
Folks, a quick question on ConversationalRetrievalChain. I am using this chain to read additional data from the vector database instead of just passing the information in the context. Specifically:

```python
qa = ConversationalRetrievalChain.from_llm(
    llm=OpenAI(temperature=0),
    retriever=vectorstore.as_retriever(),
    memory=memory,
)
```

where the vectorstore is a Pinecone database. The text field for each document in my vector database is fairly large (around 8k words), so at query time I am exceeding the token limit of the LLM. Two questions:

1. At query time, how many documents' text does the LLM read? Is it easy to limit it to, say, 4 or 5 documents and no more?
2. One potential idea would be to chunk each document's text at insert time, but then each chunk would obviously carry incomplete information.

I am sure someone has run into this problem. Any suggestions on how to resolve it?
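On (1): if I remember right, LangChain's vector store retriever returns 4 documents by default, and you can cap it explicitly with `vectorstore.as_retriever(search_kwargs={"k": 4})`. On (2), the usual mitigation is chunking with *overlap*, so text cut at a chunk boundary still appears whole in the neighboring chunk. Here's a minimal pure-Python sketch of that idea (not LangChain's own splitter; `chunk_size` and `overlap` values are arbitrary assumptions, tune them to your token budget):

```python
def chunk_text(words, chunk_size=500, overlap=50):
    """Split a list of words into overlapping chunks.

    Consecutive chunks share `overlap` words, so a sentence cut
    at one chunk's boundary is repeated intact at the start of
    the next chunk, softening the "incomplete information" issue.
    """
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # last chunk reached the end of the document
    return chunks

# Stand-in for an 8k-word document.
doc = ["w%d" % i for i in range(8000)]
chunks = chunk_text(doc)

print(len(chunks))                            # number of chunks produced
print(chunks[0][-50:] == chunks[1][:50])      # consecutive chunks overlap
```

In LangChain itself you'd typically do this at insert time with one of the text splitters (e.g. `RecursiveCharacterTextSplitter` with a `chunk_overlap` argument) before upserting into Pinecone, then retrieve the top-k chunks instead of whole documents.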