Practice 1Z0-1127 Building LLM Applications with RAG and Vector Search questions with full explanations on every answer.
Click any question to see the full explanation and answer options, or start a focused practice session above.
1. A developer is building a RAG application using Oracle Cloud Infrastructure (OCI) Document Understanding and OCI Generative AI. After chunking documents and generating embeddings, the developer observes that the retrieval step often returns chunks that are semantically unrelated to the query. Which action is MOST likely to improve retrieval relevance?
2. An organization stores its knowledge base in Oracle Autonomous Database and wants to build a RAG chatbot using OCI Generative AI. The chatbot must retrieve the most relevant documents based on user queries. Which indexing approach is BEST suited for efficient similarity search on text embeddings?
3. A company is deploying a RAG pipeline using OCI Data Science and OCI Generative AI. The pipeline uses a Cohere command model for generation and a Cohere embed model for retrieval. The team notices that the model occasionally produces hallucinated answers that are not supported by the retrieved context. Which strategy is MOST effective at reducing hallucinations?
4. A data scientist is building a RAG application that processes PDF invoices. The extraction step uses OCI Document Understanding to convert PDFs to text. The scientist then splits the text into chunks and generates embeddings using OCI Generative AI. However, the retrieval often misses critical fields like invoice numbers and dates. Which preprocessing step would MOST likely improve retrieval of these specific fields?
5. A developer is using OCI Generative AI to build a question-answering system over a large corpus of technical manuals. The developer uses the Cohere Embed model to generate embeddings and stores them in an OCI OpenSearch cluster. Queries are slow and the team needs to reduce latency. Which approach is BEST for improving search speed while maintaining acceptable accuracy?
6. A team is deploying a RAG system that uses OCI Generative AI to answer questions about internal HR policies. The system must comply with data residency requirements: all data processing must stay within a specific OCI region. The team uses OCI Data Science for orchestration. Which architecture BEST meets the data residency requirement?
7. A developer notices that the RAG system returns irrelevant chunks when the user query contains typos or abbreviations. Which technique would BEST improve retrieval robustness for such queries?
8. Which TWO are best practices for building a RAG application on OCI? (Choose two.)
9. Which THREE are valid considerations when designing a RAG pipeline that uses OCI Generative AI and OCI OpenSearch? (Choose three.)
10. Which TWO are common causes of poor answer quality in a RAG system built on OCI Generative AI? (Choose two.)
11. A manufacturing company uses OCI OpenSearch to build a RAG application that retrieves procedural documents. After deployment, queries often return outdated procedures even though the vector index was refreshed. What is the most likely cause?
12. A healthcare startup is building a chatbot that retrieves patient treatment guidelines using OCI Generative AI Service and OCI OpenSearch. They require that all retrieved documents are from approved sources only and that the system can explain which source was used for each response. Which combination of features should they implement?
13. A company uses a RAG pipeline with OCI Data Science and Cohere embeddings. They notice that retrieval recall is low for domain-specific acronyms. What is the best practice to improve this?
14. A financial firm deploys a RAG application using OCI OpenSearch. They observe that the LLM sometimes generates incorrect answers that are not supported by the retrieved documents. Which technique directly addresses this issue?
15. A research institution uses OCI Data Flow to process large-scale document corpora for a RAG system. They want to minimize latency for end-user queries. Which architecture decision would most effectively reduce query latency?
16. A retail company uses OCI Generative AI Service to build a RAG chatbot for product recommendations. The chatbot should consider both the user's query and the retrieved product descriptions. Which component of the RAG pipeline is responsible for combining these inputs before sending to the LLM?
17. Which TWO actions are best practices when deploying a RAG application using OCI OpenSearch and OCI Generative AI?
18. Which THREE factors should be considered when designing a vector search index for a RAG application that supports multiple languages?
19. A developer receives the above error when querying a RAG application. What is the most likely cause and recommended action?
20. An engineer configured the above index mapping for vector search. When performing a k-NN search, the results are unexpected. What is the most likely issue?
21. You are a cloud architect at a global e-commerce company. The company is building a RAG-based product support chatbot using OCI Generative AI Service and OCI OpenSearch. The chatbot must answer customer questions in real-time by retrieving from a product knowledge base containing over 10 million documents. The current architecture uses a single vector index with all documents, and the LLM (Cohere Command R+) returns answers in English only. The team observes that queries from non-English customers often return irrelevant results, and the chatbot sometimes fails to generate answers within the 5-second SLA. The leadership wants to support 10 languages and reduce the average response time to under 3 seconds. You need to propose a solution that improves both relevance and latency. Which course of action should you take?
22. You are a data scientist at a legal firm. The firm uses OCR to digitize court documents and then indexes them in OCI OpenSearch for a RAG application. The application uses OCI Generative AI Service (Cohere Command) to answer questions about case law. Recently, the team noticed that the answers are often factually incorrect or include information not present in the retrieved documents. After reviewing the pipeline, you find that the chunking strategy splits documents into 512-token chunks with 128-token overlap. The embedding model is Cohere Embed v3 (English), and the retrieval returns the top 5 chunks. The LLM has a context window of 4096 tokens. The team suspects that the chunking strategy is causing loss of context. What is the best course of action to improve answer accuracy?
23. A healthcare company is building a RAG-based chatbot to answer patient queries using medical documents stored in OCI Object Storage. They use OCI Generative AI service with Cohere Command R+ model and OCI OpenSearch as the vector database. The chatbot is deployed on OCI Compute with a Flask application. After deployment, the latency for each query is 15-20 seconds, which is unacceptable. Logs show that the embedding generation step (using OCI Generative AI embedding API) takes 8-10 seconds, and the vector search in OpenSearch takes 5-7 seconds. The team has already enabled connection pooling and increased the compute instance shape to the maximum allowed. Which action would MOST effectively reduce the overall latency?
24. A developer is building a RAG chatbot for an internal knowledge base. To ensure the system retrieves the most relevant chunks, what is the best practice for chunking?
25. A company uses OCI Generative AI to create embeddings for a vector search. They notice high latency in search queries. What is one possible optimization?
26. An application uses RAG to answer customer queries, but answers are often incomplete because the retrieved chunks do not contain full context. Which adjustment should the developer make?
27. A team uses OCI OpenSearch as a vector database for RAG. Some queries return no results despite relevant documents being indexed. What is a likely cause?
28. A developer wants to deploy a RAG application using OCI Generative AI for both embedding and text generation while minimizing costs. Which strategy is most effective?
29. An enterprise RAG system must ensure that retrieved data comes only from authorized sources. Which OCI feature should be used to enforce this?
30. A team fine-tunes an embedding model for a legal document RAG system but observes low retrieval recall. Which technique is most likely to improve recall?
31. An application mixes RAG with other data sources. The vector search returns too many irrelevant chunks. What is the best approach to filter them?
32. A developer uses OCI Generative AI with a custom OCI OpenSearch vector store. The text generation model sometimes hallucinates facts not in the retrieved documents. What is the most effective mitigation?
33. Which TWO of the following are best practices for building a RAG pipeline in OCI?
34. Which THREE of the following are likely causes if retrieval returns no results despite documents being indexed in an OCI OpenSearch vector store?
35. Which THREE techniques effectively reduce query latency in a RAG system?
36. Refer to the exhibit. Why did the embedding creation fail?
37. Refer to the exhibit. What is a potential issue with this OCI OpenSearch index template configuration?
38. Refer to the exhibit. What is the best action to resolve this error?
39. A company is building a RAG application for customer support. The knowledge base includes documents in English, Spanish, and French. Which embedding model should they use from OCI Generative AI to ensure accurate retrieval across all languages?
40. An organization is experiencing low recall in their RAG system. They are using OCI OpenSearch as the vector store with cosine similarity. After reviewing the retrieved chunks, they notice that relevant documents are not being returned. Which configuration change is most likely to improve recall?
41. A healthcare company is deploying a RAG application using OCI Generative AI and wants to ensure patient data privacy. They cannot send sensitive data to a public embedding endpoint. Which approach should they take to embed documents while maintaining data residency and security?
42. A developer is building a RAG pipeline using OCI Data Science and wants to store vector embeddings. Which OCI service is optimized for vector search and can be used as a vector store?
43. During a RAG implementation, the response quality degrades because the LLM receives too many irrelevant document chunks. Which technique can best filter out irrelevant chunks before sending them to the LLM?
44. A company is using Oracle Database 23ai AI Vector Search for their RAG pipeline. They notice that similarity search often returns chunks that are semantically unrelated but syntactically similar due to token overlap. Which vector index type should they consider to improve semantic relevance?
45. A developer is testing a RAG application using OCI Generative AI. They receive an error: 'The model cohere.command-r-plus-v1:0 is not supported in this region.' What is the most likely cause?
46. A team is designing a RAG system for legal document review. They want to ensure that the retrieved chunks are contextually coherent and not truncated mid-sentence. Which chunking strategy should they use?
47. An enterprise is using OCI Generative AI with a RAG architecture. They observe that the LLM sometimes produces hallucinated answers that are not supported by the retrieved documents. Which strategy is most effective in reducing these hallucinations?
48. Which TWO of the following are best practices when indexing documents for a RAG application using OCI OpenSearch?
49. Which THREE factors should be considered when choosing a vector store for a RAG application in OCI?
50. Which TWO of the following are valid approaches to serve a RAG application in OCI with low latency?
51. A developer sends the above request to the OCI Generative AI API. The response returns an error: 'InvalidParameter: The parameter 'topP' is not supported for this model.' What is the most likely reason?
52. The OCI CLI command above returns embeddings for the phrase 'Hello world'. The developer notices that the embedding vector length is 384 dimensions. However, they expected 768 dimensions. What is the most likely cause?
53. A DBA has created the above vector index. After running queries, they observe that recall is lower than expected for approximate searches. Which change would most likely improve recall while maintaining query performance?
54. A company is building a RAG application using OCI Generative AI and OCI Search with OpenSearch. Users report that the responses from the LLM are not relevant to the queries, even though the document chunks seem appropriate. What is the most likely cause?
55. An organization needs to extract text from PDF documents and convert them into embeddings for a RAG pipeline using OCI. Which OCI service is best suited for extracting text from PDFs?
56. A developer implements a RAG chatbot using OCI Generative AI with streaming enabled. The chatbot fails to remember earlier conversation turns during a session. What is the most likely cause?
57. A data scientist is designing a RAG system with a large vector database (hundreds of millions of documents) and requires high recall accuracy. Which vector search index type should be used in OCI Search with OpenSearch?
58. What is a recommended practice to prevent the LLM from generating information not present in the retrieved context when building a RAG application?
59. An application using OCI Generative AI returns a 403 Forbidden error when attempting to invoke a model. The user's API key is valid and the endpoint is correct. What is the most likely cause?
60. Which OCI service provides a managed vector database capability that can be used as a knowledge base in a RAG architecture?
61. What is the primary purpose of an embedding model in a RAG pipeline?
62. A RAG system returns irrelevant chunks even though the embedding model and vector index are correctly configured. After reviewing, the chunks are too large and contain extraneous information. Which combination of adjustments should be made to improve relevance?
63. Which TWO are required components to implement a basic RAG system using OCI services? (Choose two.)
64. Which TWO are best practices for chunking documents in a RAG pipeline? (Choose two.)
65. Which THREE factors directly influence the quality of responses in a RAG system? (Choose three.)
66. A company has deployed a RAG application using OCI Generative AI service with a vector store in OCI OpenSearch. Users report that answers are often incomplete or irrelevant. The application uses a single prompt template with a fixed chunk size of 1000 tokens. Which action is most likely to improve answer quality?
67. A developer wants to build a RAG application that processes highly sensitive medical records. The documents are already stored in OCI Object Storage. Which vector storage strategy best balances security and performance?
68. A team has set up a RAG pipeline using OCI Data Science with OCI OpenSearch as the vector store. The embedding model is from the OCI Generative AI service. Users note that the vector search returns irrelevant documents for many queries. Which of the following is the most likely cause?
69. When building a RAG application for document retrieval, which chunking strategy is recommended to maximize retrieval accuracy?
70. A company's RAG application ingests news articles that are updated frequently. The vector store in OCI OpenSearch contains embeddings of the articles. The team notices that outdated information is still retrieved even after updating the source documents. What is the most effective way to ensure the vector store reflects the latest content?
71. A legal firm needs an AI assistant that can answer questions based on a large corpus of internal regulations that change quarterly. The firm also requires high accuracy and the ability to cite sources. Which approach should the firm choose?
72. A real-time customer support chatbot uses RAG with OCI Generative AI. The average response time is 5 seconds, which is too slow. The team identifies the vector search as the bottleneck. Which optimization would most reduce latency?
73. When invoking the OCI Generative AI service from a RAG application, the developer receives a 401 Unauthorized error. The application uses resource principal authentication from an OCI Data Science notebook session. What is the most likely fix?
74. A document processing pipeline uses OCI Document Understanding to extract text from PDFs, then creates embeddings with OCI Generative AI. Some documents exceed the embedding model's token limit. What is the best approach?
75. A team is designing a RAG system for a multilingual knowledge base. Which TWO strategies are appropriate? (Choose two.)
76. A developer is troubleshooting low recall in a vector search. Which THREE factors should be checked? (Choose three.)
77. A company wants to ensure their RAG application complies with data residency requirements. Data must not leave a specific OCI region. Which TWO actions are necessary? (Choose two.)
78. Refer to the exhibit. A developer runs the command and immediately tries to use the endpoint. The application fails with an error indicating the endpoint is not active. What is the most likely reason?
79. Refer to the exhibit. A developer has set this policy to allow an OCI Data Science session to generate embeddings. However, the API call returns a 403 Forbidden. Which of the following is likely missing?
80. Refer to the exhibit. A RAG application logs this error when trying to search. What is the most likely cause?
81. A company is building a RAG application on OCI and needs a managed vector database with native support for AI Vector Search, which offers high performance and integration with OCI GenAI. Which OCI service should they use?
82. A development team notices that their RAG application returns responses slowly when processing large PDF documents (100+ pages). They need to improve response time without significantly reducing retrieval quality. Which action is most effective?
83. An AI engineer observes that the RAG application fails to retrieve relevant documents for certain user queries, despite having a comprehensive knowledge base. The issue appears to be a semantic gap between query phrasing and document content. Which technique should the engineer implement first to address this?
84. A developer needs to generate embeddings for text data using the OCI Generative AI service. Which API should they call to get vector representations of text?
85. A healthcare organization plans to deploy a RAG application on OCI that handles sensitive patient data. They require that all LLM inference and embedding processing happen within a controlled environment to avoid data leakage to public endpoints. Which OCI feature should they use?
86. A company wants to build a multi-modal RAG system that can retrieve both text and images based on a user query. Which approach is most aligned with OCI GenAI capabilities?
87. When chunking a large Python code repository for a RAG application, which chunking strategy is best suited to preserve code semantics and functionality?
88. An organization wants to combine keyword search and vector search to improve retrieval accuracy in their RAG pipeline. Which OCI service provides built-in hybrid search capabilities?
89. A RAG application is hallucinating because the LLM receives irrelevant context from the retrieval step, even when topK is set to 3. Which strategy would best reduce hallucination by improving the relevance of retrieved documents?
90. Which TWO actions can improve the retrieval accuracy of a RAG system? (Select two.)
91. Which TWO best practices should be followed when designing a RAG application using OCI GenAI? (Select two.)
92. Which THREE components are essential in a typical RAG architecture built on OCI? (Select three.)
93. A developer calls the OCI GenAI embedding API as shown in the exhibit. What is the most likely cause of the error?
94. An IAM policy is shown in the exhibit. A user reports that they cannot call the OCI GenAI embedding API, but they can use OCI AI Language. Which policy statement is missing to allow embedding API access?
95. The architecture shown in the exhibit is missing a critical component for a RAG pipeline. What step is missing between receiving the user query and searching the vector store?
96. A company is building a RAG application using OCI Generative AI and wants to store embeddings for document retrieval. Which OCI service is most appropriate for storing and querying vector embeddings?
97. A developer notices that the RAG application returns irrelevant chunks for user queries. The embedding model used is `cohere.embed-english-light-v3.0`. Which action is MOST likely to improve relevance?
98. A company uses OCI Data Science to fine-tune an embedding model for a specialized domain. After fine-tuning, the model produces embeddings that are not aligned with the vector index used in OCI OpenSearch. What is the most likely cause?
99. A developer wants to implement a simple RAG pipeline using OCI Language's text generation and embedding models. Which OCI SDK method is used to generate embeddings for a text chunk?
100. During load testing, the RAG application's response time increases significantly. The vector search is performed on millions of vectors. Which optimization would MOST reduce latency?
101. A security audit reveals that the RAG application exposes internal documents through the chatbot. The vector search index contains sensitive data. Which action should be taken FIRST to mitigate?
102. What is the primary purpose of chunking documents in a RAG pipeline?
103. A team uses Cohere's `rerank` endpoint after initial retrieval to improve result quality. What is the main benefit of reranking?
104. In OCI OpenSearch, a k-NN search query returns results with low precision. The index uses the HNSW algorithm. The search parameters are: `k=10`, `ef_search=100`. To improve recall without significantly increasing latency, which parameter should be adjusted?
105. Which TWO of the following are valid similarity metrics used in vector search?
106. Which THREE factors should be considered when designing a chunking strategy for a RAG application?
107. A RAG system returns no results for certain queries. The index exists and contains documents. Which TWO are likely causes?
108. A startup is building a customer support chatbot using RAG with OCI Generative AI. They have a large corpus of FAQ documents stored as PDFs in OCI Object Storage. The developer uses OCI Language to embed the text and stores vectors in OCI OpenSearch. During testing, the chatbot often fails to answer questions because relevant FAQ entries are not retrieved. The team suspects the chunking size is too large, causing loss of specific details. After reducing chunk size, retrieval improves slightly but still misses many answers. What should the team do NEXT?
109. A financial services company is deploying a RAG system for regulatory compliance queries. The system uses OCI Data Science to run a custom embedding model fine-tuned on regulatory documents. The index in OpenSearch uses cosine similarity and the HNSW algorithm. Users report that queries containing synonyms of regulatory terms (e.g., "AML" vs "Anti-Money Laundering") often fail to retrieve relevant documents. Which combination of improvements would be MOST effective? (Assume budget and latency constraints)
110. An enterprise RAG application experiences high latency during peak hours. The architecture uses OCI OpenSearch with a single-node cluster storing 5 million vectors (768 dimensions). The search uses exact k-NN (EF_SEARCH=500). The average query takes 1.5 seconds, but the SLA requires <500ms. The team considers several options: A) Switch to ANN with lower recall (HNSW with ef_search=50), B) Scale OpenSearch cluster to 3 nodes, C) Reduce embedding dimension to 256 using PCA, D) Increase the number of shards from 1 to 10. Which option provides the best balance of latency reduction and minimal impact on retrieval quality? (Assume all options are feasible)
111. A developer is building a RAG application using OCI Generative AI. They notice that the generated responses often contain outdated information even though the knowledge base is updated daily. What is the most likely cause?
112. A company is deploying a RAG system for internal document search using OCI OpenSearch as the vector store. Users report that queries about recent policy changes return no results, even though the new policies were ingested. Which configuration is most likely missing?
113. A team is optimizing a RAG pipeline for OCI Generative AI. They observe that the model's responses are verbose and often include irrelevant details from the retrieved chunks, reducing user satisfaction. They have already tuned the prompt template. What is the most effective next step?
114. A data scientist is designing a RAG system using OCI Data Science and OCI Generative AI. Which two considerations are critical for optimal retrieval quality? (Choose 2.)
115. A company is deploying a RAG application for legal document analysis using OCI. Which three best practices should be followed to mitigate hallucinations? (Choose 3.)
116. A small business is building an internal Q&A bot using OCI Generative AI with RAG. They have indexed their product manuals into OCI OpenSearch using a precomputed embedding model. When they test queries, the bot often returns answers that are only partially relevant, and sometimes it cannot find answers for questions that are clearly present in the manuals. The developers suspect the chunking strategy is suboptimal. Currently, they use a fixed chunk size of 512 tokens with no overlap. What should they do to improve retrieval relevance?
117. A developer is using OCI Data Science to create a RAG pipeline. They have ingested documents into a vector store using OCI Generative AI's text-embedding model. During testing, they notice that queries return very few results (often 0 or 1) even when the knowledge base contains relevant documents. They have set the top-k parameter to 10. What is the most likely cause?
118. A company uses OCI Generative AI's chat endpoint with RAG for customer support. They have observed that the model sometimes generates answers that contradict the retrieved context. The retrieved chunks are correct and relevant, but the model ignores them. What configuration change should they implement first?
119. An enterprise is deploying a RAG application for compliance document analysis using OCI. They use OCI OpenSearch as the vector store and have millions of documents. Retrieval latency is critical. Currently, a single query takes over 2 seconds. The index uses a flat (brute-force) distance computation. They have considered using approximate nearest neighbor (ANN) algorithms but are unsure about the impact on recall. They need to reduce latency to under 500ms while maintaining high recall. What should they do?
120. A data scientist is using OCI Data Science to build a RAG system for medical literature. They have a large corpus of PDFs. They used the default OCI Generative AI embedding model and chunked each PDF into 512-character segments with 10% overlap. However, queries about specific drug doses often return incorrect information, even though the correct dose is present in the corpus. Upon inspection, they find that the retrieved chunks often contain partial dose information or miss the context units (e.g., mg vs. mcg). What improvement should they prioritize?
121. A large organization is deploying a multi-tenant RAG application on OCI, where each tenant has its own set of documents. They use a shared OCI OpenSearch cluster with tenant_id metadata to filter documents. They observe that occasionally, queries from one tenant return results from another tenant's documents. The security team requires strict isolation. They have verified that the metadata filter is correctly applied in the search request. What is the most likely root cause?
122. A company is using OCI Generative AI for a RAG-based code assistant. They index source code repositories into a vector store. Developers report that the assistant often suggests deprecated APIs or outdated code snippets, even though the latest code is in the repository. The index was built a week ago and has not been updated. They plan to set up incremental updates. However, they notice that even after re-indexing the latest commits, the issue persists. What is the most likely oversight?
123. Which TWO of the following are best practices when implementing a RAG application using OCI OpenSearch as a vector store?
124. Refer to the exhibit. A developer creates an index mapping for a vector search application. When performing a k-NN search query, the query fails with a parsing error. What is the most likely cause?
125. A company has implemented a RAG-based chatbot using OCI Generative AI and OCI OpenSearch as the vector store. The chatbot answers questions about internal policies. The team uses a dense vector embedding model with 768 dimensions and the HNSW algorithm. The corpus contains 5 million documents. Users report that the chatbot takes 8-12 seconds to respond, and the answers are often not relevant, missing key policy details. Upon investigation, the team finds that the k-NN search returns results based solely on vector similarity, ignoring exact keyword matches that are critical for policy documents. Which course of action will most effectively improve both response time and relevance?
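Several questions in this list describe fixed-size chunking with overlap, for example 512-token chunks with a 128-token overlap. The sketch below illustrates that idea only. It assumes the input is already a list of tokens (any tokenizer could produce it), and `chunk_tokens` is a hypothetical helper name, not part of any OCI or Cohere SDK.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def chunk_tokens(tokens: Sequence[T], chunk_size: int = 512, overlap: int = 128) -> List[List[T]]:
    """Split a token sequence into fixed-size chunks that share `overlap` tokens.

    Hypothetical helper for illustration; the defaults mirror the figures
    quoted in the questions, not a recommended setting.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[List[T]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the input,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(tokens):
            break
    return chunks


if __name__ == "__main__":
    # Integers stand in for tokens so the window boundaries are easy to see.
    demo = chunk_tokens(list(range(1000)))
    for chunk in demo:
        print(chunk[0], chunk[-1], len(chunk))
```

With an overlap of 0 the same function reproduces the "fixed chunk size with no overlap" setup that some questions describe, which makes it easy to compare the two configurations on the same text.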
The Building LLM Applications with RAG and Vector Search domain covers the key concepts tested in this area of the 1Z0-1127 exam blueprint published by Oracle. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all 1Z0-1127 domains — no account required.
The Courseiva 1Z0-1127 question bank contains 125 questions in the Building LLM Applications with RAG and Vector Search domain. Click any question to see the full explanation and answer breakdown.
Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.
The session launcher on this page draws questions exclusively from the Building LLM Applications with RAG and Vector Search domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.