
Imagination Augmented Generation: Learning to Imagine Richer Context for Question Answering over Large Language Models

Original research paper here

Original code repo here

How is IAG superior to RAG & GAG?

Look, an ideal solution will optimize for -

1. Increased accuracy of answers

2. Decreased computation costs

3. Reduced inference times

Let's see how current approaches fare with respect to the above criteria.

1. RAG (Retrieval Augmented Generation)

Retrieves related documents from relevant sources and sends them, along with the question, to an LLM for answer generation.

Problem - depends on external sources, needs far more computational resources, and leads to longer inference times.
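To make the RAG pipeline concrete, here is a minimal sketch. The retriever is a hypothetical toy (word-overlap ranking) standing in for a real dense retriever, and the prompt format is an assumption, not any specific system's API:

```python
# Minimal RAG sketch: retrieve top-k documents, then prepend them to the
# question as context. The retriever below is a toy word-overlap ranker;
# a real system would use a dense retriever and call an actual LLM.
def retrieve(question, corpus, k=2):
    """Rank documents by word overlap with the question (toy stand-in)."""
    q_words = set(question.lower().split())
    return sorted(corpus, key=lambda d: -len(q_words & set(d.lower().split())))[:k]

def build_rag_prompt(question, corpus):
    docs = retrieve(question, corpus)
    context = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

corpus = [
    "Paris is the capital of France.",
    "The Nile is a river in Africa.",
    "France is a country in Europe.",
]
prompt = build_rag_prompt("What is the capital of France?", corpus)
```

The retrieved documents inflate the prompt, which is exactly where RAG's extra compute and latency come from.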

2. GAG (Generation Augmented Generation)

Documents are generated by LLMs or external APIs and then fed to the LLM, along with the question, for answer generation.

Problem - incurs financial costs from API calls, still requires substantial computational resources, and increases inference times.
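GAG differs from RAG only in where the context document comes from. A minimal sketch, with a stub `llm` function standing in for the paid API call that makes GAG expensive:

```python
# GAG sketch: the context document is generated, not retrieved. `llm` is a
# hypothetical stub; in practice this is an API call to a large model,
# which is the source of GAG's financial and latency costs.
def llm(prompt):
    # Stub: a real implementation would call a hosted model here.
    return "A generated background document about the question topic."

def build_gag_prompt(question):
    doc = llm(f"Generate a background document for: {question}")
    return f"Context:\n{doc}\n\nQuestion: {question}\nAnswer:"

prompt = build_gag_prompt("Why is the sky blue?")
```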

A better solution

What if we could use the latent context inside LLMs to imagine a smaller, more efficient, and richer context!

Enter IAG (Imagination Augmented Generation)

IAG proposes the IMcQA framework (IMagine richer Context method for QA), which consists of two parts -

1. Explicit imagination module - uses symbol distillation to obtain a compressed context, then guides the LLM in generating a short, useful dummy document that is used like a retrieved document in RAG

2. Implicit imagination module - uses a hypernetwork to generate LoRA weights that activate the task-processing capability of LLMs. Unlike standard LoRA, the hypernetwork learns to imagine hidden knowledge for each question.
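The implicit imagination idea can be sketched as a hypernetwork that maps a question embedding to per-question LoRA factors. The shapes and the linear hypernetwork below are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

# Sketch: a hypernetwork generates question-conditioned LoRA weights,
# so the low-rank adapter differs for every question (unlike standard
# LoRA, whose A and B are fixed after training).
rng = np.random.default_rng(0)
d, r, q_dim = 8, 2, 4           # hidden size, LoRA rank, question-embedding size

# Hypernetwork parameters: one linear map per LoRA factor (assumed form).
H_A = rng.normal(size=(q_dim, r * d)) * 0.1
H_B = rng.normal(size=(q_dim, d * r)) * 0.1

def hyper_lora(q_emb):
    """Generate question-specific LoRA factors A (r x d) and B (d x r)."""
    A = (q_emb @ H_A).reshape(r, d)
    B = (q_emb @ H_B).reshape(d, r)
    return A, B

W = rng.normal(size=(d, d))         # a frozen base weight matrix
q_emb = rng.normal(size=(q_dim,))   # stand-in question embedding
A, B = hyper_lora(q_emb)
W_adapted = W + B @ A               # low-rank, question-specific update
```

The update `B @ A` has rank at most `r`, so the per-question adaptation stays cheap relative to the full weight matrix.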

Training of the IMcQA model is done in two phases.

Phase 1 - an Imagine model (based on T5) is trained to imagine a small dummy document from a question

Phase 2 - the hypernetwork is fine-tuned using long-context distillation to learn a mapping from a question to LoRA weights
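The long-context distillation in phase 2 can be illustrated with a toy loss: the student (question plus imagined short document) is pushed to match the teacher's output distribution (question plus full long context). The logits below are illustrative stand-ins for model outputs, and KL divergence is an assumed choice of distillation loss:

```python
import numpy as np

# Toy long-context-distillation loss: match the student's answer
# distribution (short imagined context) to the teacher's (long context).
def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def kl_div(p, q):
    """KL(p || q) between two discrete distributions."""
    return float(np.sum(p * np.log(p / q)))

teacher_logits = np.array([2.0, 0.5, 0.1])   # model fed the long context
student_logits = np.array([1.8, 0.7, 0.2])   # model fed the imagined context

loss = kl_div(softmax(teacher_logits), softmax(student_logits))
```

Minimizing this loss teaches the short, imagined context to stand in for the expensive long one at inference time.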


Performance gains were observed while reducing computational expense and inference time.

Notably, it even outperformed baseline methods that use knowledge-augmentation techniques such as RAG and GAG.


The proposed method successfully activates the relevant internal knowledge of LLMs to answer a given question, thereby reducing computational resource use and decreasing inference times.


