Improving the quality of your Generative AI Solution using RAG and Oracle OCI Data Science.


Since the launch of ChatGPT in November 2022, we have seen widespread interest in this technology, particularly with its GPT-4 iteration.

Since then, we have come to understand the capabilities of Large Language Models (LLMs), and we have entered a new era in which the generative potential of these models can significantly improve both productivity and quality across numerous professions.

We have learned that:

  • LLMs can answer our questions
  • LLMs can write emails to our customers
  • LLMs can copy-edit our papers (like this one)
  • LLMs can quickly generate code snippets for prototypes
  • ...

But although LLMs have shown great promise, they also have limitations, especially in the enterprise sector.

At the beginning of this year, a very good review paper was published on arXiv. Among other things, it highlights these limitations:

  • LLMs have knowledge limited in time (the so-called knowledge cutoff)
  • Even when they don't have an answer, they tend to craft a convincing one anyway (hallucinations)
  • It is difficult to get from them information about the sources used to produce an answer; they'll tell you it is "their internal, parametrized knowledge"

As stated in the above-mentioned paper: "These shortcomings underscore the impracticality of deploying LLMs as black-box solutions in real-world production environments, without additional safeguards".

A Design Pattern, first proposed in 2020, has emerged as the best solution: Retrieval-Augmented Generation (RAG).

But a "naive" implementation of RAG, one that simply combines retrieval of information (based on embeddings) with generative capabilities, is not the real solution. You need to carefully implement every piece of the design pattern, and you need the right tools. You need what is called Advanced, or better, Modular RAG.
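To make the pattern concrete, here is a minimal, self-contained sketch of the "naive" RAG flow: embed the documents, retrieve the most similar chunks, and inject them into the prompt. The `embed()` function and the final LLM call are stand-ins (toy bag-of-words vectors and a printed prompt); a real pipeline would call an embedding model and an LLM endpoint, for example through LangChain or LlamaIndex.

```python
# Minimal sketch of naive RAG: embed -> retrieve -> augment the prompt.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": word counts. Stand-in for a real embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # The retrieved chunks ground the LLM's answer in your documents,
    # mitigating both the knowledge cutoff and hallucinations.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Oracle DB Vector Store supports similarity search on embeddings.",
    "LLMs have a knowledge cutoff and can hallucinate answers.",
    "RAG combines retrieval over documents with generation by an LLM.",
]
print(build_prompt("What is RAG?", docs))
```

The prompt built here would then be sent to the generative model; everything beyond this skeleton (chunking, metadata filtering, reranking) is what distinguishes an advanced, modular RAG from the naive one.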

I'm working more and more with our customers to translate these concepts into reality, combining many of the services available in OCI with modern open-source frameworks like LlamaIndex and LangChain.

Just to give you a quick idea of what you can do, today, in OCI:

  • You can use one of the embedding models provided by our OCI Generative AI service
  • You can store texts plus embeddings in the Oracle DB Vector Store, using the advanced new functionality for similarity search, together with attribute-based search (in Limited Availability today)
  • You can add a "reranker" to the chain, deployed as a managed Model Deployment in OCI Data Science
  • If you need to, you can fine-tune an open-source embedding model in OCI Data Science, or add, on top of a foundation embedding model, an adapter fine-tuned on your documents
  • You can use as the LLM the Cohere Command model, or Llama 2 70B, both part of the OCI Generative AI service
  • You can combine everything using LangChain or LlamaIndex, and we provide integrations in the OCI Data Science ADS library
  • You can deploy the whole chain as a Model Deployment, again using OCI Data Science
  • You can create the needed datasets and run all the fine-tuning training, validation, and tests using OCI Data Science, in an environment with costs among the lowest in the market.

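As a concrete illustration of the reranker step in the list above, here is a schematic retrieve-then-rerank pass. Both scoring functions are stand-ins based on token overlap: in a real chain, the first stage would query the vector store with embeddings, and the reranker would be a cross-encoder model served as an OCI Data Science Model Deployment.

```python
# Sketch of two-stage retrieval: broad first-stage recall, then
# reranking of the candidates before they reach the LLM prompt.
import re

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def first_stage_score(query: str, doc: str) -> int:
    # Cheap, recall-oriented score: stand-in for embedding similarity
    # against the vector store.
    return len(tokens(query) & tokens(doc))

def rerank_score(query: str, doc: str) -> float:
    # More precise score computed on the (query, doc) pair: this is the
    # role a cross-encoder reranker plays in the chain.
    q = tokens(query)
    return len(q & tokens(doc)) / max(len(q), 1)

def search(query: str, docs: list[str],
           k_retrieve: int = 3, k_final: int = 1) -> list[str]:
    # Stage 1: fetch a broad candidate set.
    candidates = sorted(
        docs, key=lambda d: first_stage_score(query, d), reverse=True
    )[:k_retrieve]
    # Stage 2: rerank the candidates and keep only the best ones.
    return sorted(
        candidates, key=lambda d: rerank_score(query, d), reverse=True
    )[:k_final]

docs = [
    "Fine-tuning embedding models on your own documents.",
    "Similarity search over embeddings in the vector store.",
    "Reranking retrieved passages with a cross-encoder model.",
]
print(search("cross-encoder reranking of passages", docs))
```

The design choice is the usual one: the first stage is fast and approximate so it can scan many documents, while the reranker is slower but more accurate and only sees the short candidate list.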
What else?

We provide you with an environment where only you have access to your data: no privacy issues, no IP-protection issues.


Yes, we saw wonderful things in 2023. AI is today a reality, not hype. And more is to come this year.








  
