The importance of memory.

Well, in a RAG chain there are always several important pieces:

  • the Embeddings Model, which translates texts into dense vectors, enabling Semantic Search
  • the Vector Store, where you safely store your texts and vectors, giving you a fast way to find all the relevant docs
  • a Reranker, which helps refine your search results
  • the Large Language Model (LLM)
At first, you might think that the LLM is only useful at the end of the chain, when all the retrieved docs are put into the context of the prompt, together with your request. There, it synthesizes the answer.
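To make the roles of these pieces concrete, here is a toy sketch of the four components wired together. Everything in it is an illustrative stand-in: the character-sum "embedding", the in-memory store, and the word-overlap "reranker" are not a real embeddings model, vector store, or reranker, and all the function names are invented for this example.

```python
def embed(text):
    """Embeddings Model stand-in: map text to a small dense vector.
    (A deterministic toy, not a real model.)"""
    vec = [0.0] * 8
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % 8] += 1.0
    return vec

def cosine(a, b):
    """Similarity measure used for Semantic Search."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

# Vector Store stand-in: texts saved alongside their vectors.
corpus = [
    "Long COVID can cause fatigue and brain fog.",
    "Vaccines are updated every season.",
    "Semantic search uses dense vectors.",
]
store = [(doc, embed(doc)) for doc in corpus]

def retrieve(query, k=2):
    """Semantic Search: rank stored docs by vector similarity."""
    q = embed(query)
    return sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)[:k]

def rerank(query, docs):
    """Reranker stand-in: refine the candidates by exact word overlap."""
    words = set(query.lower().split())
    return sorted(docs,
                  key=lambda item: len(words & set(item[0].lower().split())),
                  reverse=True)

def answer(query):
    """LLM stand-in: 'synthesize' an answer from the retrieved context."""
    top = rerank(query, retrieve(query))
    return f"Context used: {top[0][0]}"
```

In a real chain each stand-in is replaced by a model or service call, but the data flow (embed, retrieve, rerank, synthesize) stays exactly this shape.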

But much of the evolution we see today is based on ideas about how to use more and more of the incredible power of current LLMs.



Let us consider one important feature we want in a Knowledge Assistant: we want the assistant to keep a memory of all the previous questions and answers (the message history) and use it to enable a more natural kind of conversation.

For example, imagine that one of your questions is: "What is Long COVID?" (I'm working on a demo based on nice documentation from NIH).
Your next question could be: "What are the symptoms?" (and you don't repeat the subject).

Now, when you do the search inside the Vector Store, you cannot simply use "What are the symptoms?" The symptoms of what? Of every disease in your knowledge base?

You have to rephrase your question based on the message history.

One approach, well supported for example in LlamaIndex, is called "condense_plus_context". Here you use the LLM twice:
  • first, you take your last question plus the message history and ask the LLM to create a condensed, standalone question (for example: "What are the symptoms of Long COVID?")
  • then, you search the Vector Store using the condensed question
  • and, only at the end, you send the docs retrieved with the condensed question, plus the question itself, to the LLM to synthesize the answer
If you have built a chain using LlamaIndex or LangChain, adding memory and such an approach takes only a few lines of code.
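The three steps above can be sketched in plain Python. To keep the example self-contained and runnable, stub functions stand in for the two LLM calls and for the Vector Store search; the names (condense_question, vector_search, synthesize_answer) are hypothetical, not LlamaIndex or LangChain API.

```python
def condense_question(history, question):
    """First LLM call: rewrite the follow-up into a standalone question.
    A toy rule stands in for the LLM here."""
    if history and "symptoms" in question.lower():
        # Recover the subject from the last question in the history.
        subject = history[-1][0].replace("What is ", "").rstrip("?")
        return f"What are the symptoms of {subject}?"
    return question

def vector_search(query):
    """Vector Store stand-in: return docs with word overlap with the query."""
    docs = [
        "Long COVID symptoms include fatigue, brain fog, and shortness of breath.",
        "Influenza is a seasonal respiratory infection.",
    ]
    return [d for d in docs if any(w in d for w in query.split())]

def synthesize_answer(question, docs):
    """Second LLM call: answer using only the retrieved context."""
    return f"Based on {len(docs)} doc(s): {docs[0]}" if docs else "No relevant docs."

# Simulated conversation with memory.
history = [("What is Long COVID?", "A condition with symptoms lasting weeks...")]
followup = "What are the symptoms?"

standalone = condense_question(history, followup)   # LLM call 1: condense
retrieved = vector_search(standalone)               # search with the condensed question
answer = synthesize_answer(standalone, retrieved)   # LLM call 2: synthesize
```

Note how the Vector Store only ever sees the condensed question: searching with the raw follow-up "What are the symptoms?" would match symptoms of every disease in the knowledge base.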

If you want to see more details, have a look here: chat with memory

