When Rag'n'Bone Man dropped his breakthrough single Human in 2016, he likely had no idea his stage name would later share the spotlight with something far less human – artificial intelligence (AI). Fast forward to today, and in the world of large language models (LLMs), RAG stands for something entirely different: Retrieval-Augmented Generation.
LLMs have now caught the attention of pretty much everyone: trained on vast amounts of data, they can understand and generate human language. While these models are powerful on their own, they can be tailored to specific industries through fine-tuning or by enhancing them with custom data. RAG takes this a step further by incorporating an external retrieval mechanism that fetches relevant information from document repositories or large knowledge bases, which is then included as context in the prompt.
A RAG-based LLM operates in two key steps:
- Retriever: This component identifies the most relevant information from the knowledge base.
- Generator: The LLM uses the retrieved information to generate a contextual response.
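The two steps above can be sketched in a few lines of Python. Everything here is illustrative: the bag-of-words “embedding” and the tiny in-memory knowledge base stand in for a real embedding model and vector database, and the assembled prompt would be handed to the LLM for the generation step:

```python
import math

# Toy knowledge base: in practice these would be document chunks
# stored in a vector database with real embeddings.
KNOWLEDGE_BASE = {
    "release-notes-2.1": "Version 2.1 adds multi-currency support to invoicing.",
    "user-guide-export": "Reports can be exported as CSV from the Export menu.",
}

def embed(text: str) -> dict[str, int]:
    # Stand-in embedding: bag-of-words counts. A real system would
    # call an embedding model here instead.
    counts: dict[str, int] = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts

def cosine(a: dict[str, int], b: dict[str, int]) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b.get(w, 0) for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) \
        * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str) -> str:
    # Retriever: pick the most relevant chunk from the knowledge base.
    q = embed(query)
    return max(KNOWLEDGE_BASE.values(), key=lambda doc: cosine(q, embed(doc)))

def build_prompt(query: str) -> str:
    # Generator input: the retrieved chunk is included as context
    # in the prompt that would be sent to the LLM.
    context = retrieve(query)
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

print(build_prompt("How do I export reports?"))
```

The point of the sketch is the shape of the pipeline, not the retrieval quality: swap in real embeddings and an actual vector store and the two-step structure stays the same.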
It’s a powerful approach: it makes language models smarter by combining their innate ability to generate text with external, specific sources of knowledge. And just like the song, RAG makes the AI a little more “human” in its ability to provide context-rich responses.
At least, that’s what our colleague Gavris told us when we discussed this topic. During our 2023 Hackathon, his team fine-tuned an open-source language model (Llama 2, to be precise) with specific information from the financial domain. They then fed product documentation and release notes from a financial application into the RAG knowledge base, a vector database chosen for its powerful similarity-search capabilities. After experimenting with various training scenarios, it all came together as an in-app training robot.
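Feeding documentation into a vector database typically means splitting it into overlapping chunks before embedding, so that each retrieved piece fits into the prompt. We don’t know the exact pipeline the team used; a minimal chunking helper, with illustrative sizes, might look like this:

```python
def chunk(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping character windows for indexing.

    Each chunk is at most `size` characters and shares `overlap`
    characters with its predecessor, so sentences cut at a boundary
    still appear whole in at least one chunk.
    """
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# A 500-character document yields three overlapping chunks;
# a short document comes back as a single chunk.
print(len(chunk("x" * 500)))  # 3
print(chunk("short"))         # ['short']
```

Production systems usually split on sentence or paragraph boundaries rather than raw character counts, but the overlap idea is the same.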
Gavris outlined some benefits of using LLMs with RAG, including “accuracy and specificity, up-to-date information and the avoidance of so-called model hallucinations.” However, he pointed out some challenges as well: “Most crucial is the availability of high-quality, accurate data. Also, data privacy is important and hard to manage in an unstructured data retrieval mechanism.”
We were also wondering what advancements in RAG-based LLMs might further enhance their capabilities, and Theo, our managing partner, stepped in to share his take on the matter: “In general, RAG is used to give more specific (domain) context on top of a language model, which brings better and more concise prompting capabilities. We are also experimenting with graph builders on top of the LLM, where the LLM is asked via the RAG to identify relationships in the information universe and to populate the graph. And imagine you can fly through the information universe along the identified relationships, finding new and possibly unexpected insights.”
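The graph-building idea can be made concrete with a small sketch. The triple format below is purely an assumption for illustration: imagine the LLM is prompted (with retrieved context) to emit `subject -> relation -> object` lines, which are then parsed into an adjacency structure you can traverse:

```python
# Hypothetical LLM output: relationships extracted from retrieved
# financial-domain documents, one triple per line (format assumed).
LLM_TRIPLES = """\
invoice -> belongs_to -> customer
payment -> settles -> invoice
customer -> resides_in -> country
"""

def build_graph(raw: str) -> dict[str, list[tuple[str, str]]]:
    # Parse each triple into an adjacency list: node -> [(relation, node)].
    graph: dict[str, list[tuple[str, str]]] = {}
    for line in raw.strip().splitlines():
        subject, relation, obj = (part.strip() for part in line.split("->"))
        graph.setdefault(subject, []).append((relation, obj))
    return graph

def neighbours(graph: dict[str, list[tuple[str, str]]], node: str) -> list[str]:
    # Follow outgoing relationships from a node: one hop through
    # the "information universe".
    return [obj for _, obj in graph.get(node, [])]

graph = build_graph(LLM_TRIPLES)
print(neighbours(graph, "invoice"))  # ['customer']
print(neighbours(graph, "payment"))  # ['invoice']
```

A real implementation would validate the LLM’s output and store the result in a proper graph database, but the parse-then-traverse loop is the core of the experiment Theo describes.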
RAG-based LLMs represent a bold step toward making AI more context-aware and relatable, especially when tailored to a specific domain. Like Rag'n'Bone Man’s Human, which captures the essence of our shared humanity, RAG enables language models to bridge the gap between raw data and human understanding. As we push the boundaries of what’s possible, it’s clear that the journey of refining RAG will lead us to deeper insights and a more intuitive relationship with technology.