Courseiva — IT Certification Practice Questions

Building LLM Applications with RAG and Vector Search

medium

You are a data scientist at a law firm. The firm uses OCR to digitize court documents and then indexes them in OCI OpenSearch for a RAG application. The application uses the OCI Generative AI Service (Cohere Command) to answer questions about case law. Recently, the team noticed that the answers are often factually incorrect or include information not present in the retrieved documents. After reviewing the pipeline, you find that the chunking strategy splits documents into 512-token chunks with a 128-token overlap. The embedding model is Cohere Embed v3 (English), and retrieval returns the top 5 chunks. The LLM has a context window of 4096 tokens. The team suspects that the chunking strategy is causing a loss of context. What is the best course of action to improve answer accuracy?
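To make the numbers in the scenario concrete, the following is a minimal sketch of fixed-size chunking with overlap and of the token budget that the top 5 chunks consume. The function names `chunk_tokens` and `retrieval_budget` are hypothetical, and the "tokens" are plain list items standing in for real tokenizer output; an actual pipeline would count tokens with the embedding model's own tokenizer.

```python
def chunk_tokens(tokens, chunk_size=512, overlap=128):
    """Split a token list into fixed-size chunks that share `overlap` tokens.

    Hypothetical helper for illustration; `tokens` is any list and is an
    assumed stand-in for real tokenizer output.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    stride = chunk_size - overlap  # 512 - 128 = 384 new tokens per chunk
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the document
    return chunks


def retrieval_budget(context_window=4096, chunk_size=512, top_k=5):
    """Return (tokens used by retrieved chunks, tokens left for prompt and answer)."""
    used = chunk_size * top_k
    return used, context_window - used


if __name__ == "__main__":
    document = list(range(1280))  # assumed stand-in for a 1280-token document
    chunks = chunk_tokens(document)
    print(len(chunks), [len(c) for c in chunks])
    print(retrieval_budget())
```

With the scenario's settings, each chunk advances by 384 tokens, and five full chunks occupy 2560 of the 4096 context tokens, leaving 1536 for the instructions, the question, and the generated answer.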