A developer is building a RAG application using Oracle Cloud Infrastructure (OCI) Document Understanding and OCI Generative AI. After chunking documents and generating embeddings, the developer observes that the retrieval step often returns chunks that are semantically unrelated to the query. Which action is MOST likely to improve retrieval relevance?
Trap 1: Switch from a dense embedding model to a sparse embedding model.
Changing the embedding model type does not address how the documents are split; poorly formed chunks are the more likely cause of the irrelevant results.
Trap 2: Increase the chunk size to capture more context.
Larger chunks may include irrelevant content, reducing precision.
Trap 3: Reduce the number of retrieved chunks (k) in the vector search.
Reducing k may cause relevant passages to be missed, and it does not change how the remaining chunks are ranked.
- A
Switch from a dense embedding model to a sparse embedding model.
Why wrong: Changing the embedding model type does not address how the documents are split; poorly formed chunks are the more likely cause of the irrelevant results.
- B
Adjust the chunk size and chunk overlap to better capture coherent passages.
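Chunks sized to hold one coherent passage, with enough overlap that ideas spanning a boundary are not cut apart, preserve meaning and yield embeddings that match queries more accurately.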
- C
Increase the chunk size to capture more context.
Why wrong: Larger chunks may include irrelevant content, reducing precision.
- D
Reduce the number of retrieved chunks (k) in the vector search.
Why wrong: Reducing k may cause relevant passages to be missed, and it does not change how the remaining chunks are ranked.
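
To make the chunk size and overlap trade-off in option B concrete, here is a minimal sketch of a sliding-window chunker. It is an illustration under stated assumptions, not an OCI API: the function name `chunk_text` and its parameters are hypothetical, and it splits on whitespace-separated words, whereas a real pipeline would more likely count tokens or split on sentence or section boundaries.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of at most `chunk_size` words.

    Consecutive chunks share `overlap` words, so a sentence that straddles
    a boundary appears whole in at least one chunk (hypothetical helper).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so no redundant tail chunk is emitted.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(10))
    for chunk in chunk_text(sample, chunk_size=4, overlap=1):
        print(chunk)
```

Tuning then amounts to trying a few `chunk_size` and `overlap` pairs against a set of representative queries and keeping the pair whose retrieved chunks are most often on topic.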