A company is using the OCI Generative AI service to power a customer support chatbot. They observe that the chatbot sometimes provides outdated information because the model was trained on data only up to 2022. They want to incorporate real-time knowledge without retraining the model. Which approach should they use?
RAG retrieves relevant up-to-date documents and feeds them to the model, enabling current responses without retraining.
Why this answer
Option C is correct because Retrieval-Augmented Generation (RAG) lets the model draw on current information from an external knowledge base, such as one indexed in OCI Search with OpenSearch, without retraining. At inference time, the pattern retrieves documents or data relevant to the user's question and injects them into the prompt, so the model answers from up-to-date context. It directly addresses the need for real-time knowledge while keeping the base model static.
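The retrieve-then-inject flow described above can be sketched in a few lines of Python. This is a minimal illustration, not OCI's implementation: the `Document` type, the `retrieve` and `build_prompt` helpers, and the sample knowledge base are all hypothetical, and simple keyword overlap stands in for the vector or full-text search a real deployment would use. The resulting prompt is what would be sent to the unchanged base model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical knowledge-base entry (not an OCI type)."""
    doc_id: str
    text: str


def tokenize(text: str) -> set[str]:
    # Lowercase, split on whitespace, and strip surrounding punctuation.
    return {w.strip(".,?!:;()'\"").lower() for w in text.split()} - {""}


def retrieve(query: str, docs: list[Document], top_k: int = 2) -> list[Document]:
    # Toy retriever: rank documents by how many query terms they share.
    # A real system would query a search index or vector store instead.
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(d.text)), d) for d in docs]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [d for score, d in ranked[:top_k] if score > 0]


def build_prompt(query: str, context_docs: list[Document]) -> str:
    # Inject the retrieved text into the prompt; the model itself is untouched.
    context = "\n".join(f"[{d.doc_id}] {d.text}" for d in context_docs)
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Made-up sample data standing in for a continuously updated knowledge base.
knowledge_base = [
    Document("kb-1", "The premium support plan costs 40 dollars per month as of this quarter."),
    Document("kb-2", "Password resets are handled from the account settings page."),
    Document("kb-3", "Refunds are processed within five business days."),
]

question = "How much does the premium support plan cost?"
prompt = build_prompt(question, retrieve(question, knowledge_base, top_k=1))
print(prompt)
```

Updating an entry in `knowledge_base` changes the next answer immediately, which is the property the scenario asks for: fresh knowledge with no change to model weights.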
Exam trap
The trap is that candidates often mistake prompt engineering (Option B) for a way to 'override' training data. Prompt instructions cannot change the model's learned parameters or supply facts the model never saw, so among these options RAG is the only one that delivers real-time knowledge without retraining.
How to eliminate wrong answers
Option A is wrong because increasing max-tokens only raises the limit on response length; it does nothing for the recency or accuracy of the information and provides no mechanism to incorporate new data.
Option B is wrong because prompt engineering cannot force the model to 'ignore' outdated training data. The model's parametric knowledge is fixed and cannot be selectively suppressed by instructions alone, and instructions add no new facts, so the model may still hallucinate or contradict itself.
Option D is wrong because fine-tuning is itself a form of retraining on new data, which contradicts the requirement to avoid retraining, and it is also resource-intensive and time-consuming.