1Z0-1127 Building LLM Applications with RAG and Vector Search — All Questions With Answers

Question 1 (easy, multiple choice)

A developer is building a RAG application using Oracle Cloud Infrastructure (OCI) Document Understanding and OCI Generative AI. After chunking documents and generating embeddings, the developer observes that the retrieval step often returns chunks that are semantically unrelated to the query. Which action is MOST likely to improve retrieval relevance?

Question 2 (medium, multiple choice)

An organization stores its knowledge base in Oracle Autonomous Database and wants to build a RAG chatbot using OCI Generative AI. The chatbot must retrieve the most relevant documents based on user queries. Which indexing approach is BEST suited for efficient similarity search on text embeddings?

Question 3 (hard, multiple choice)

A company is deploying a RAG pipeline using OCI Data Science and OCI Generative AI. The pipeline uses a Cohere command model for generation and a Cohere embed model for retrieval. The team notices that the model occasionally produces hallucinated answers that are not supported by the retrieved context. Which strategy is MOST effective at reducing hallucinations?

Question 4 (medium, multiple choice)

A data scientist is building a RAG application that processes PDF invoices. The extraction step uses OCI Document Understanding to convert PDFs to text. The scientist then splits the text into chunks and generates embeddings using OCI Generative AI. However, the retrieval often misses critical fields like invoice numbers and dates. Which preprocessing step would MOST likely improve retrieval of these specific fields?

Question 5 (easy, multiple choice)

A developer is using OCI Generative AI to build a question-answering system over a large corpus of technical manuals. The developer uses the Cohere Embed model to generate embeddings and stores them in an OCI OpenSearch cluster. Queries are slow and the team needs to reduce latency. Which approach is BEST for improving search speed while maintaining acceptable accuracy?

Question 6 (hard, multiple choice)

A team is deploying a RAG system that uses OCI Generative AI to answer questions about internal HR policies. The system must comply with data residency requirements: all data processing must stay within a specific OCI region. The team uses OCI Data Science for orchestration. Which architecture BEST meets the data residency requirement?

Question 7 (medium, multiple choice)

A developer notices that the RAG system returns irrelevant chunks when the user query contains typos or abbreviations. Which technique would BEST improve retrieval robustness for such queries?

Question 8 (easy, multi-select)

Which TWO are best practices for building a RAG application on OCI? (Choose two.)

Question 9 (medium, multi-select)

Which THREE are valid considerations when designing a RAG pipeline that uses OCI Generative AI and OCI OpenSearch? (Choose three.)

Question 10 (hard, multi-select)

Which TWO are common causes of poor answer quality in a RAG system built on OCI Generative AI? (Choose two.)

Question 11 (medium, multiple choice)

A manufacturing company uses OCI OpenSearch to build a RAG application that retrieves procedural documents. After deployment, queries often return outdated procedures even though the vector index was refreshed. What is the most likely cause?

Question 12 (hard, multiple choice)

A healthcare startup is building a chatbot that retrieves patient treatment guidelines using OCI Generative AI Service and OCI OpenSearch. They require that all retrieved documents are from approved sources only and that the system can explain which source was used for each response. Which combination of features should they implement?

Question 13 (easy, multiple choice)

A company uses a RAG pipeline with OCI Data Science and Cohere embeddings. They notice that retrieval recall is low for domain-specific acronyms. What is the best practice to improve this?

Question 14 (medium, multiple choice)

A financial firm deploys a RAG application using OCI OpenSearch. They observe that the LLM sometimes generates incorrect answers that are not supported by the retrieved documents. Which technique directly addresses this issue?

Question 15 (hard, multiple choice)

A research institution uses OCI Data Flow to process large-scale document corpora for a RAG system. They want to minimize latency for end-user queries. Which architecture decision would most effectively reduce query latency?

Question 16 (easy, multiple choice)

A retail company uses OCI Generative AI Service to build a RAG chatbot for product recommendations. The chatbot should consider both the user's query and the retrieved product descriptions. Which component of the RAG pipeline is responsible for combining these inputs before sending to the LLM?

Question 17 (medium, multi-select)

Which TWO actions are best practices when deploying a RAG application using OCI OpenSearch and OCI Generative AI?

Question 18 (hard, multi-select)

Which THREE factors should be considered when designing a vector search index for a RAG application that supports multiple languages?

Question 19 (medium, multiple choice)

A developer receives the error shown in the exhibit when querying a RAG application. What is the most likely cause and recommended action?

Exhibit

Refer to the exhibit.

error log:
{
  "timestamp": "2025-03-15T10:30:00Z",
  "source": "oci-generative-ai-inference",
  "message": "CohereClientException: 429 Too Many Requests",
  "details": {
    "retryAfter": 60,
    "modelId": "cohere.command-r-plus-08-2024"
  }
}
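As an illustration of how a client might read this log entry, the following minimal Python sketch parses the exhibit and extracts a wait time. The helper name `retry_delay_seconds`, the 30-second fallback, and the reading of `retryAfter` as a value in seconds are assumptions made for this example; none of them is confirmed by the exhibit or by OCI documentation.

```python
import json

# The log entry from the exhibit, copied verbatim.
LOG_ENTRY = """
{
  "timestamp": "2025-03-15T10:30:00Z",
  "source": "oci-generative-ai-inference",
  "message": "CohereClientException: 429 Too Many Requests",
  "details": {
    "retryAfter": 60,
    "modelId": "cohere.command-r-plus-08-2024"
  }
}
"""


def retry_delay_seconds(raw_log: str, default: int = 30) -> int:
    """Hypothetical helper: return how long to wait before retrying.

    Assumes `retryAfter` is expressed in seconds (not stated in the exhibit).
    """
    entry = json.loads(raw_log)
    if "429" not in entry.get("message", ""):
        return 0  # not a rate-limit error, so no wait is implied
    return int(entry.get("details", {}).get("retryAfter", default))


delay = retry_delay_seconds(LOG_ENTRY)
print(delay)
```

A caller could sleep for `delay` before resending the request, typically combined with a cap on the number of retries.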

Question 20 (hard, multiple choice)

An engineer configured the index mapping shown in the exhibit for vector search. When performing a k-NN search, the results are unexpected. What is the most likely issue?

Exhibit

Refer to the exhibit.

document index mapping:
{
  "settings": {
    "index": {
      "knn": true,
      "knn.space_type": "cosinesimil"
    }
  },
  "mappings": {
    "properties": {
      "content_embedding": {
        "type": "knn_vector",
        "dimension": 768,
        "method": {
          "name": "hnsw",
          "engine": "faiss",
          "space_type": "l2"
        }
      },
      "metadata": {
        "type": "object"
      }
    }
  }
}
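One thing worth checking in a mapping like this is whether the index-level and field-level distance settings agree. The sketch below is a hypothetical lint, not an OpenSearch API: the function name `space_type_conflicts` is invented for this example, and it only compares the two `space_type` values that appear in the exhibit.

```python
import json

# Index body from the exhibit, copied verbatim.
INDEX_BODY = json.loads("""
{
  "settings": {
    "index": {
      "knn": true,
      "knn.space_type": "cosinesimil"
    }
  },
  "mappings": {
    "properties": {
      "content_embedding": {
        "type": "knn_vector",
        "dimension": 768,
        "method": {
          "name": "hnsw",
          "engine": "faiss",
          "space_type": "l2"
        }
      },
      "metadata": {
        "type": "object"
      }
    }
  }
}
""")


def space_type_conflicts(index_body: dict) -> list:
    """Hypothetical check: list knn_vector fields whose method-level
    space_type differs from the index-level knn.space_type setting."""
    index_space = index_body.get("settings", {}).get("index", {}).get("knn.space_type")
    conflicts = []
    properties = index_body.get("mappings", {}).get("properties", {})
    for name, field in properties.items():
        if field.get("type") != "knn_vector":
            continue  # only vector fields carry a distance metric
        field_space = field.get("method", {}).get("space_type")
        if index_space and field_space and field_space != index_space:
            conflicts.append((name, index_space, field_space))
    return conflicts


conflicts = space_type_conflicts(INDEX_BODY)
print(conflicts)
```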

Question 21 (hard, multiple choice)

You are a cloud architect at a global e-commerce company. The company is building a RAG-based product support chatbot using OCI Generative AI Service and OCI OpenSearch. The chatbot must answer customer questions in real-time by retrieving from a product knowledge base containing over 10 million documents. The current architecture uses a single vector index with all documents, and the LLM (Cohere Command R+) returns answers in English only. The team observes that queries from non-English customers often return irrelevant results, and the chatbot sometimes fails to generate answers within the 5-second SLA. The leadership wants to support 10 languages and reduce the average response time to under 3 seconds. You need to propose a solution that improves both relevance and latency. Which course of action should you take?

Question 22 (medium, multiple choice)

You are a data scientist at a legal firm. The firm uses OCR to digitize court documents and then indexes them in OCI OpenSearch for a RAG application. The application uses OCI Generative AI Service (Cohere Command) to answer questions about case law. Recently, the team noticed that the answers are often factually incorrect or include information not present in the retrieved documents. After reviewing the pipeline, you find that the chunking strategy splits documents into 512-token chunks with 128-token overlap. The embedding model is Cohere Embed v3 (English), and the retrieval returns the top 5 chunks. The LLM has a context window of 4096 tokens. The team suspects that the chunking strategy is causing loss of context. What is the best course of action to improve answer accuracy?
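To make the chunking parameters in this scenario concrete, here is a minimal sliding-window sketch using the stated values (512-token chunks, 128-token overlap, top 5 chunks, 4096-token context window). It is an assumption-laden illustration: `chunk_tokens` is a hypothetical helper, and a plain list of placeholder strings stands in for real model tokens, which would come from the embedding model's tokenizer.

```python
def chunk_tokens(tokens, chunk_size=512, overlap=128):
    """Hypothetical helper: split a token list into fixed-size chunks
    where each chunk repeats the last `overlap` tokens of the previous one."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # 384 new tokens per chunk with the values above
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the document
    return chunks


# Placeholder "tokens"; a real pipeline would use tokenizer output.
tokens = [f"tok{i}" for i in range(1200)]
chunks = chunk_tokens(tokens)

# Context budget implied by the scenario: top 5 chunks of up to 512 tokens.
TOP_K = 5
CONTEXT_WINDOW = 4096
retrieved_budget = TOP_K * 512
print(len(chunks), retrieved_budget, CONTEXT_WINDOW - retrieved_budget)
```

With these numbers the five retrieved chunks occupy at most 2560 tokens, leaving 1536 tokens of the window for the prompt, the question, and the generated answer.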

Question 23 (hard, multiple choice)

A healthcare company is building a RAG-based chatbot to answer patient queries using medical documents stored in OCI Object Storage. They use OCI Generative AI service with Cohere Command R+ model and OCI OpenSearch as the vector database. The chatbot is deployed on OCI Compute with a Flask application. After deployment, the latency for each query is 15-20 seconds, which is unacceptable. Logs show that the embedding generation step (using OCI Generative AI embedding API) takes 8-10 seconds, and the vector search in OpenSearch takes 5-7 seconds. The team has already enabled connection pooling and increased the compute instance shape to the maximum allowed. Which action would MOST effectively reduce the overall latency?

Question 24 (easy, multiple choice)

A developer is building a RAG chatbot for an internal knowledge base. To ensure the system retrieves the most relevant chunks, what is the best practice for chunking?

Question 25 (easy, multiple choice)

A company uses OCI Generative AI to create embeddings for a vector search. They notice high latency in search queries. What is one possible optimization?

Question 26 (easy, multiple choice)

An application uses RAG to answer customer queries, but answers are often incomplete because the retrieved chunks do not contain full context. Which adjustment should the developer make?

Question 27 (medium, multiple choice)

A team uses OCI OpenSearch as a vector database for RAG. Some queries return no results despite relevant documents being indexed. What is a likely cause?

Question 28 (medium, multiple choice)

A developer wants to deploy a RAG application using OCI Generative AI for both embedding and text generation while minimizing costs. Which strategy is most effective?

Question 29 (medium, multiple choice)

An enterprise RAG system must ensure that retrieved data comes only from authorized sources. Which OCI feature should be used to enforce this?

Question 30 (hard, multiple choice)

A team fine-tunes an embedding model for a legal document RAG system but observes low retrieval recall. Which technique is most likely to improve recall?

Question 31 (hard, multiple choice)

An application mixes RAG with other data sources. The vector search returns too many irrelevant chunks. What is the best approach to filter them?

Question 32 (hard, multiple choice)

A developer uses OCI Generative AI with a custom OCI OpenSearch vector store. The text generation model sometimes hallucinates facts not in the retrieved documents. What is the most effective mitigation?

Question 33 (easy, multi-select)

Which TWO of the following are best practices for building a RAG pipeline in OCI?

Question 34 (medium, multi-select)

Which THREE of the following are likely causes if retrieval returns no results despite documents being indexed in an OCI OpenSearch vector store?

Question 35 (hard, multi-select)

Which THREE techniques effectively reduce query latency in a RAG system?

Question 36 (easy, multiple choice)

Refer to the exhibit. Why did the embedding creation fail?

Exhibit: [not reproduced]

Question 37 (medium, multiple choice)

Refer to the exhibit. What is a potential issue with this OCI OpenSearch index template configuration?

Exhibit

{
  "version": "1.0",
  "index_patterns": ["*"],
  "priority": 20,
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 0,
      "index.knn": true,
      "index.knn.space_type": "l2"
    },
    "mappings": {
      "properties": {
        "content_embedding": {
          "type": "knn_vector",
          "dimension": 1024,
          "method": {
            "name": "hnsw",
            "space_type": "cosinesimil",
            "engine": "lucene",
            "parameters": {
              "ef_construction": 512,
              "m": 32
            }
          }
        }
      }
    }
  }
}

Question 38 (hard, multiple choice)

Refer to the exhibit. What is the best action to resolve this error?

Exhibit

Error: The total token count (4082) exceeds the model's maximum context length (4096). The input includes 512 tokens for system prompt, 3072 tokens for retrieved documents, and 498 tokens for the user query.
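The figures in this error message can be checked with simple arithmetic. The sketch below sums the three stated components and then shows one possible way to trim retrieved chunks to a budget. The 512-token output reservation, the assumption that the 3072 document tokens are six 512-token chunks, and the helper `trim_to_budget` are all hypothetical and are not taken from the exhibit.

```python
CONTEXT_WINDOW = 4096

# Token counts exactly as stated in the error message.
parts = {"system_prompt": 512, "retrieved_documents": 3072, "user_query": 498}
total_input = sum(parts.values())
room_for_output = CONTEXT_WINDOW - total_input
print(total_input, room_for_output)


def trim_to_budget(chunk_token_counts, budget):
    """Hypothetical helper: keep chunks in ranked order until adding
    the next one would exceed the token budget."""
    kept, used = [], 0
    for count in chunk_token_counts:
        if used + count > budget:
            break
        kept.append(count)
        used += count
    return kept


# Assumption: reserve 512 tokens for the generated answer.
RESERVED_OUTPUT = 512
doc_budget = (CONTEXT_WINDOW - RESERVED_OUTPUT
              - parts["system_prompt"] - parts["user_query"])

# Assumption: the 3072 document tokens are six chunks of 512 tokens each.
kept = trim_to_budget([512] * 6, doc_budget)
print(doc_budget, len(kept), sum(kept))
```

The stated components add up to 4082 tokens, which leaves 14 tokens of the 4096-token window for anything else, including the model's response.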

Question 39 (easy, multiple choice)

A company is building a RAG application for customer support. The knowledge base includes documents in English, Spanish, and French. Which embedding model should they use from OCI Generative AI to ensure accurate retrieval across all languages?

Question 40 (medium, multiple choice)

An organization is experiencing low recall in their RAG system. They are using OCI OpenSearch as the vector store with cosine similarity. After reviewing the retrieved chunks, they notice that relevant documents are not being returned. Which configuration change is most likely to improve recall?

Question 41 (hard, multiple choice)

A healthcare company is deploying a RAG application using OCI Generative AI and wants to ensure patient data privacy. They cannot send sensitive data to a public embedding endpoint. Which approach should they take to embed documents while maintaining data residency and security?

Question 42 (easy, multiple choice)

A developer is building a RAG pipeline using OCI Data Science and wants to store vector embeddings. Which OCI service is optimized for vector search and can be used as a vector store?

Question 43 (medium, multiple choice)

During a RAG implementation, the response quality degrades because the LLM receives too many irrelevant document chunks. Which technique can best filter out irrelevant chunks before sending them to the LLM?

Question 44 (hard, multiple choice)

A company is using Oracle Database 23ai AI Vector Search for their RAG pipeline. They notice that similarity search often returns chunks that are semantically unrelated but syntactically similar due to token overlap. Which vector index type should they consider to improve semantic relevance?

Question 45 (easy, multiple choice)

A developer is testing a RAG application using OCI Generative AI. They receive an error: 'The model cohere.command-r-plus-v1:0 is not supported in this region.' What is the most likely cause?

Question 46 (medium, multiple choice)

A team is designing a RAG system for legal document review. They want to ensure that the retrieved chunks are contextually coherent and not truncated mid-sentence. Which chunking strategy should they use?

Question 47 (hard, multiple choice)

An enterprise is using OCI Generative AI with a RAG architecture. They observe that the LLM sometimes produces hallucinated answers that are not supported by the retrieved documents. Which strategy is most effective in reducing these hallucinations?

Question 48 (medium, multi-select)

Which TWO of the following are best practices when indexing documents for a RAG application using OCI OpenSearch?

Question 49 (hard, multi-select)

Which THREE factors should be considered when choosing a vector store for a RAG application in OCI?

Question 50 (easy, multi-select)

Which TWO of the following are valid approaches to serve a RAG application in OCI with low latency?

Question 51 (easy, multiple choice)

A developer sends the request shown in the exhibit to the OCI Generative AI API. The response returns an error: "InvalidParameter: The parameter 'topP' is not supported for this model." What is the most likely reason?

Exhibit

Refer to the exhibit.

```json
{
  "modelId": "cohere.command-r-plus-v1:0",
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ],
  "parameters": {
    "temperature": 0.5,
    "topP": 0.9
  }
}
```

Question 52 (medium, multiple choice)

The OCI CLI command in the exhibit returns embeddings for the phrase 'Hello world'. The developer notices that the embedding vector has 384 dimensions, but they expected 768. What is the most likely cause?

Exhibit: [not reproduced]

Question 53 (hard, multiple choice)

A DBA has created the vector index shown in the exhibit. After running queries, they observe that recall is lower than expected for approximate searches. Which change would most likely improve recall while maintaining query performance?

Exhibit

Refer to the exhibit.

```sql
-- Oracle Database 23ai AI Vector Search index creation
CREATE VECTOR INDEX doc_vec_idx ON documents(chunk_embedding) 
  ORGANIZATION NEIGHBOR PARTITIONS
  DISTANCE COSINE
  WITH TARGET ACCURACY 95
  PARAMETERS (TYPE IVF, NEIGHBOR PARTITIONS 4);
```

Question 54 (medium, multiple choice)

A company is building a RAG application using OCI Generative AI and OCI Search with OpenSearch. Users report that the responses from the LLM are not relevant to the queries, even though the document chunks seem appropriate. What is the most likely cause?

Question 55 (easy, multiple choice)

An organization needs to extract text from PDF documents and convert them into embeddings for a RAG pipeline using OCI. Which OCI service is best suited for extracting text from PDFs?

Question 56 (hard, multiple choice)

A developer implements a RAG chatbot using OCI Generative AI with streaming enabled. The chatbot fails to remember earlier conversation turns during a session. What is the most likely cause?

Question 57 (medium, multiple choice)

A data scientist is designing a RAG system with a large vector database (hundreds of millions of documents) and requires high recall accuracy. Which vector search index type should be used in OCI Search with OpenSearch?

Question 58 (easy, multiple choice)

What is a recommended practice to prevent the LLM from generating information not present in the retrieved context when building a RAG application?

Question 59 (hard, multiple choice)

An application using OCI Generative AI returns a 403 Forbidden error when attempting to invoke a model. The user's API key is valid and the endpoint is correct. What is the most likely cause?

Question 60 (medium, multiple choice)

Which OCI service provides a managed vector database capability that can be used as a knowledge base in a RAG architecture?

Question 61 (easy, multiple choice)

What is the primary purpose of an embedding model in a RAG pipeline?

Question 62 (hard, multiple choice)

A RAG system returns irrelevant chunks even though the embedding model and vector index are correctly configured. After reviewing, the chunks are too large and contain extraneous information. Which combination of adjustments should be made to improve relevance?

Question 63 (medium, multi-select)

Which TWO are required components to implement a basic RAG system using OCI services? (Choose two.)

Question 64 (medium, multi-select)

Which TWO are best practices for chunking documents in a RAG pipeline? (Choose two.)

Question 65 (hard, multi-select)

Which THREE factors directly influence the quality of responses in a RAG system? (Choose three.)

Question 66 (medium, multiple choice)

A company has deployed a RAG application using OCI Generative AI service with a vector store in OCI OpenSearch. Users report that answers are often incomplete or irrelevant. The application uses a single prompt template with a fixed chunk size of 1000 tokens. Which action is most likely to improve answer quality?

Question 67 (easy, multiple choice)

A developer wants to build a RAG application that processes highly sensitive medical records. The documents are already stored in OCI Object Storage. Which vector storage strategy best balances security and performance?

Question 68 (hard, multiple choice)

A team has set up a RAG pipeline using OCI Data Science with OCI OpenSearch as the vector store. The embedding model is from the OCI Generative AI service. Users note that the vector search returns irrelevant documents for many queries. Which of the following is the most likely cause?

Question 69 (easy, multiple choice)

When building a RAG application for document retrieval, which chunking strategy is recommended to maximize retrieval accuracy?

Question 70 (medium, multiple choice)

A company's RAG application ingests news articles that are updated frequently. The vector store in OCI OpenSearch contains embeddings of the articles. The team notices that outdated information is still retrieved even after updating the source documents. What is the most effective way to ensure the vector store reflects the latest content?

Question 71 (medium, multiple choice)

A legal firm needs an AI assistant that can answer questions based on a large corpus of internal regulations that change quarterly. The firm also requires high accuracy and the ability to cite sources. Which approach should the firm choose?

Question 72 (hard, multiple choice)

A real-time customer support chatbot uses RAG with OCI Generative AI. The average response time is 5 seconds, which is too slow. The team identifies the vector search as the bottleneck. Which optimization would most reduce latency?

Question 73 (easy, multiple choice)

When invoking the OCI Generative AI service from a RAG application, the developer receives a 401 Unauthorized error. The application uses resource principal authentication from an OCI Data Science notebook session. What is the most likely fix?

Question 74 (medium, multiple choice)

A document processing pipeline uses OCI Document Understanding to extract text from PDFs, then creates embeddings with OCI Generative AI. Some documents exceed the embedding model's token limit. What is the best approach?

Question 75 (hard, multi-select)

A team is designing a RAG system for a multilingual knowledge base. Which TWO strategies are appropriate? (Choose two.)

Question 76 (hard, multi-select)

A developer is troubleshooting low recall in a vector search. Which THREE factors should be checked? (Choose three.)

Question 77 (easy, multi-select)

A company wants to ensure their RAG application complies with data residency requirements. Data must not leave a specific OCI region. Which TWO actions are necessary? (Choose two.)

Question 78 (medium, multiple choice)

Refer to the exhibit. A developer runs the command and immediately tries to use the endpoint. The application fails with an error indicating the endpoint is not active. What is the most likely reason?

Exhibit: [not reproduced]

Question 79 (hard, multiple choice)

Refer to the exhibit. A developer has set this policy to allow an OCI Data Science session to generate embeddings. However, the API call returns a 403 Forbidden. Which of the following is likely missing?

Exhibit

{
  "policy": "Allow dynamic-group RAGGroup to use generative-ai-embeddings in compartment Production"
}

Question 80 (easy, multiple choice)

Refer to the exhibit. A RAG application logs this error when trying to search. What is the most likely cause?

Exhibit

ERROR: OciOpenSearch: IndexNotFoundException[no such index [rag-index]]

Question 81 (easy, multiple choice)

A company is building a RAG application on OCI and needs a managed vector database with native support for AI Vector Search, which offers high performance and integration with OCI GenAI. Which OCI service should they use?

Question 82 (medium, multiple choice)

A development team notices that their RAG application returns responses slowly when processing large PDF documents (100+ pages). They need to improve response time without significantly reducing retrieval quality. Which action is most effective?

Question 83 (hard, multiple choice)

An AI engineer observes that the RAG application fails to retrieve relevant documents for certain user queries, despite having a comprehensive knowledge base. The issue appears to be a semantic gap between query phrasing and document content. Which technique should the engineer implement first to address this?

Question 84 (easy, multiple choice)

A developer needs to generate embeddings for text data using the OCI Generative AI service. Which API should they call to get vector representations of text?

Question 85 (medium, multiple choice)

A healthcare organization plans to deploy a RAG application on OCI that handles sensitive patient data. They require that all LLM inference and embedding processing happen within a controlled environment to avoid data leakage to public endpoints. Which OCI feature should they use?

Question 86 (hard, multiple choice)

A company wants to build a multi-modal RAG system that can retrieve both text and images based on a user query. Which approach is most aligned with OCI GenAI capabilities?

Question 87 (easy, multiple choice)

When chunking a large Python code repository for a RAG application, which chunking strategy is best suited to preserve code semantics and functionality?

Question 88 (medium, multiple choice)

An organization wants to combine keyword search and vector search to improve retrieval accuracy in their RAG pipeline. Which OCI service provides built-in hybrid search capabilities?

Question 89 (hard, multiple choice)

A RAG application is hallucinating because the LLM receives irrelevant context from the retrieval step, even when topK is set to 3. Which strategy would best reduce hallucination by improving the relevance of retrieved documents?

Question 90 (medium, multi-select)

Which TWO actions can improve the retrieval accuracy of a RAG system? (Select two.)

Question 91 (hard, multi-select)

Which TWO best practices should be followed when designing a RAG application using OCI GenAI? (Select two.)

Question 92 (easy, multi-select)

Which THREE components are essential in a typical RAG architecture built on OCI? (Select three.)

Question 93mediummultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A developer calls the OCI GenAI embedding API as shown in the exhibit. What is the most likely cause of the error?


Question 94hardmultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

An IAM policy is shown in the exhibit. A user reports that they cannot call the OCI GenAI embedding API, but they can use OCI AI Language. Which policy statement is missing to allow embedding API access?

Exhibit

Refer to the exhibit.
```json
{
  "statements": [
    {
      "action": ["inspect"],
      "resource": "oci-generative-ai-family"
    },
    {
      "action": ["use"],
      "resource": "oci-ai-language-family"
    }
  ]
}
```

Question 95easymultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

The architecture shown in the exhibit is missing a critical component for a RAG pipeline. What step is missing between receiving the user query and searching the vector store?

Exhibit

Refer to the exhibit.
Architecture diagram description:
User Query -> OCI API Gateway -> OCI Functions -> OCI OpenSearch -> OCI GenAI Cohere Command -> Response

Question 96easymultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A company is building a RAG application using OCI Generative AI and wants to store embeddings for document retrieval. Which OCI service is most appropriate for storing and querying vector embeddings?

Question 97mediummultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A developer notices that the RAG application returns irrelevant chunks for user queries. The embedding model used is `cohere.embed-english-light-v3.0`. Which action is MOST likely to improve relevance?

Question 98hardmultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A company uses OCI Data Science to fine-tune an embedding model for a specialized domain. After fine-tuning, the model produces embeddings that are not aligned with the vector index used in OCI OpenSearch. What is the most likely cause?

Question 99easymultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A developer wants to implement a simple RAG pipeline using OCI Generative AI's text generation and embedding models. Which OCI SDK method is used to generate embeddings for a text chunk?

Question 100mediummultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

During load testing, the RAG application's response time increases significantly. The vector search is performed on millions of vectors. Which optimization would MOST reduce latency?

Question 101hardmultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A security audit reveals that the RAG application exposes internal documents through the chatbot. The vector search index contains sensitive data. Which action should be taken FIRST to mitigate?

Question 102easymultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

What is the primary purpose of chunking documents in a RAG pipeline?

Question 103mediummultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A team uses Cohere's `rerank` endpoint after initial retrieval to improve result quality. What is the main benefit of reranking?
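
As background, reranking is a second pass over the candidates returned by the initial retrieval. The sketch below shows only the two-stage shape; `token_overlap_score` is a hypothetical toy scorer standing in for a real reranking model, and it does not call Cohere's API.

```python
def token_overlap_score(query, text):
    """Toy relevance score: the fraction of query tokens found in the text."""
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    return len(query_tokens & text_tokens) / len(query_tokens) if query_tokens else 0.0


def rerank(query, candidates, score_fn, top_n=3):
    """Re-score first-stage candidates and keep the top_n highest scoring."""
    scored = [(score_fn(query, text), text) for text in candidates]
    # The sort is stable, so ties keep their first-stage order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_n]]


# Hypothetical first-stage results for the query "password reset".
candidates = [
    "reset your password in settings",
    "billing cycles explained",
    "password reset requires email",
]

top = rerank("password reset", candidates, token_overlap_score, top_n=2)
print(top)
```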

Question 104hardmultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

In OCI OpenSearch, a k-NN search query returns results with low recall. The index uses the HNSW algorithm. The search parameters are: `k=10`, `ef_search=100`. To improve recall without significantly increasing latency, which parameter should be adjusted?

Question 105easymulti select

Read the full Building LLM Applications with RAG and Vector Search explanation →

Which TWO of the following are valid similarity metrics used in vector search?
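
As background, the sketch below computes three measures that are commonly used to compare embedding vectors: dot product, cosine similarity, and Euclidean distance. The function names and sample vectors are illustrative only, and the code assumes both vectors are non-zero and of equal length.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; assumes neither is a zero vector."""
    return dot_product(a, b) / (math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b)))


def euclidean_distance(a, b):
    """Straight-line (L2) distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # orthogonal vectors
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 3-4-5 triangle
```

Cosine similarity ignores vector length, whereas dot product and Euclidean distance do not, which is why the metric has to match how the embeddings were trained and normalized.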

Question 106mediummulti select

Read the full Building LLM Applications with RAG and Vector Search explanation →

Which THREE factors should be considered when designing a chunking strategy for a RAG application?

Question 107hardmulti select

Read the full Building LLM Applications with RAG and Vector Search explanation →

A RAG system returns no results for certain queries, even though the index exists and contains documents. Which TWO are likely causes?

Question 108easymultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A startup is building a customer support chatbot using RAG with OCI Generative AI. They have a large corpus of FAQ documents stored as PDFs in OCI Object Storage. The developer uses OCI Language to embed the text and stores vectors in OCI OpenSearch. During testing, the chatbot often fails to answer questions because relevant FAQ entries are not retrieved. The team suspects the chunking size is too large, causing loss of specific details. After reducing chunk size, retrieval improves slightly but still misses many answers. What should the team do NEXT?

Question 109mediummultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A financial services company is deploying a RAG system for regulatory compliance queries. The system uses OCI Data Science to run a custom embedding model fine-tuned on regulatory documents. The index in OpenSearch uses cosine similarity and the HNSW algorithm. Users report that queries containing synonyms of regulatory terms (e.g., "AML" vs. "Anti-Money Laundering") often fail to retrieve relevant documents. Which combination of improvements would be MOST effective? (Assume budget and latency constraints apply.)

Question 110hardmultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

An enterprise RAG application experiences high latency during peak hours. The architecture uses OCI OpenSearch with a single node cluster storing 5 million vectors (768 dimensions). The search uses exact k-NN (EF_SEARCH=500). The average query takes 1.5 seconds, but the SLA requires <500ms. The team considers several options:

A) Switch to ANN with lower recall (HNSW with ef_search=50)
B) Scale OpenSearch cluster to 3 nodes
C) Reduce embedding dimension to 256 using PCA
D) Increase the number of shards from 1 to 10

Which option provides the best balance of latency reduction and minimal impact on retrieval quality? (Assume all options are feasible)
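
For a sense of scale in the scenario above, the raw size of the stored vectors can be estimated from the vector count and dimension. The sketch below assumes 4-byte float32 values, which the question does not state, and ignores index overhead such as graph links.

```python
def raw_vector_bytes(num_vectors, dimension, bytes_per_value=4):
    """Raw storage for the vectors alone, assuming float32 values by default."""
    return num_vectors * dimension * bytes_per_value


full_size = raw_vector_bytes(5_000_000, 768)     # dimensions from the scenario
reduced_size = raw_vector_bytes(5_000_000, 256)  # after reduction to 256 dimensions

print(full_size / 1e9, "GB at 768 dimensions")
print(reduced_size / 1e9, "GB at 256 dimensions")
```

Under these assumptions the vectors occupy about 15.36 GB at 768 dimensions and 5.12 GB at 256, which gives a rough idea of how much data a brute-force scan would have to touch per query.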

Question 111easymultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A developer is building a RAG application using OCI Generative AI. They notice that the generated responses often contain outdated information even though the knowledge base is updated daily. What is the most likely cause?

Question 112mediummultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A company is deploying a RAG system for internal document search using OCI OpenSearch as the vector store. Users report that queries about recent policy changes return no results, even though the new policies were ingested. Which configuration is most likely missing?

Question 113hardmultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A team is optimizing a RAG pipeline for OCI Generative AI. They observe that the model's responses are verbose and often include irrelevant details from the retrieved chunks, reducing user satisfaction. They have already tuned the prompt template. What is the most effective next step?

Question 114mediummulti select

Read the full Building LLM Applications with RAG and Vector Search explanation →

A data scientist is designing a RAG system using OCI Data Science and OCI Generative AI. Which two considerations are critical for optimal retrieval quality? (Choose 2.)

Question 115hardmulti select

Read the full Building LLM Applications with RAG and Vector Search explanation →

A company is deploying a RAG application for legal document analysis using OCI. Which three best practices should be followed to mitigate hallucinations? (Choose 3.)

Question 116easymultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A small business is building an internal Q&A bot using OCI Generative AI with RAG. They have indexed their product manuals into OCI OpenSearch using a precomputed embedding model. When they test queries, the bot often returns answers that are only partially relevant, and sometimes it cannot find answers for questions that are clearly present in the manuals. The developers suspect the chunking strategy is suboptimal. Currently, they use a fixed chunk size of 512 tokens with no overlap. What should they do to improve retrieval relevance?
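
As background for the chunking setup described above, the sketch below shows fixed-size chunking with an optional overlap between consecutive chunks. The helper name `chunk_with_overlap` is hypothetical, the input is assumed to be an already tokenized list, and the small sizes in the example are chosen only to make the behavior visible.

```python
def chunk_with_overlap(tokens, chunk_size=512, overlap=0):
    """Split tokens into fixed-size chunks that share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a chunk has reached the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


tokens = list(range(10))  # stand-in for real tokens
chunks = chunk_with_overlap(tokens, chunk_size=4, overlap=2)
for chunk in chunks:
    print(chunk)
```

With an overlap of 2, each chunk repeats the last two tokens of the previous one, so a sentence that straddles a boundary appears whole in at least one chunk.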

Question 117easymultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A developer is using OCI Data Science to create a RAG pipeline. They have ingested documents into a vector store using OCI Generative AI's text-embedding model. During testing, they notice that queries return very few results (often 0 or 1) even when the knowledge base contains relevant documents. They have set the top-k parameter to 10. What is the most likely cause?

Question 118easymultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A company uses OCI Generative AI's chat endpoint with RAG for customer support. They have observed that the model sometimes generates answers that contradict the retrieved context. The retrieved chunks are correct and relevant, but the model ignores them. What configuration change should they implement first?

Question 119mediummultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

An enterprise is deploying a RAG application for compliance document analysis using OCI. They use OCI OpenSearch as the vector store and have millions of documents. Retrieval latency is critical. Currently, a single query takes over 2 seconds. The index uses a flat (brute-force) distance computation. They have considered using approximate nearest neighbor (ANN) algorithms but are unsure about the impact on recall. They need to reduce latency to under 500ms while maintaining high recall. What should they do?

Question 120mediummultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A data scientist is using OCI Data Science to build a RAG system for medical literature. They have a large corpus of PDFs. They used the default OCI Generative AI embedding model and chunked each PDF into 512-character segments with 10% overlap. However, queries about specific drug doses often return incorrect information, even though the correct dose is present in the corpus. Upon inspection, they find that the retrieved chunks often contain partial dose information or miss the context units (e.g., mg vs. mcg). What improvement should they prioritize?

Question 121hardmultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A large organization is deploying a multi-tenant RAG application on OCI, where each tenant has its own set of documents. They use a shared OCI OpenSearch cluster with tenant_id metadata to filter documents. They observe that occasionally, queries from one tenant return results from another tenant's documents. The security team requires strict isolation. They have verified that the metadata filter is correctly applied in the search request. What is the most likely root cause?

Question 122hardmultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A company is using OCI Generative AI for a RAG-based code assistant. They index source code repositories into a vector store. Developers report that the assistant often suggests deprecated APIs or outdated code snippets, even though the latest code is in the repository. The index was built a week ago and has not been updated. They plan to set up incremental updates. However, they notice that even after re-indexing the latest commits, the issue persists. What is the most likely oversight?

Question 123mediummulti select

Read the full Building LLM Applications with RAG and Vector Search explanation →

Which TWO of the following are best practices when implementing a RAG application using OCI OpenSearch as a vector store?

Question 124hardmultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

Refer to the exhibit. A developer creates an index mapping for a vector search application. When performing a k-NN search query, the query fails with a parsing error. What is the most likely cause?

Exhibit

```json
{
  "settings": {
    "index": {
      "knn": true,
      "knn.algo_param.ef_search": 100
    }
  },
  "mappings": {
    "properties": {
      "embedding": {
        "type": "knn_vector",
        "dimension": 768,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "faiss"
        }
      },
      "content": {
        "type": "text"
      }
    }
  }
}
```
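
For reference, a k-NN search against a mapping like this one is typically sent as a `knn` query clause that names the vector field. The sketch below only builds the request body as a Python dictionary; the helper name `build_knn_query` is hypothetical, the field name and dimension are taken from the exhibit, and the code does not diagnose the parsing error the question asks about.

```python
def build_knn_query(query_vector, k=10, field="embedding", dimension=768):
    """Build an OpenSearch k-NN query body for a knn_vector field."""
    if len(query_vector) != dimension:
        # The query vector must match the dimension declared in the mapping.
        raise ValueError(f"expected {dimension} dimensions, got {len(query_vector)}")
    return {
        "size": k,
        "query": {
            "knn": {
                field: {
                    "vector": list(query_vector),
                    "k": k,
                }
            }
        },
    }


# Placeholder vector; a real query would use the embedding of the user's text.
body = build_knn_query([0.0] * 768, k=10)
print(body["query"]["knn"]["embedding"]["k"])
```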

Question 125easymultiple choice

Read the full Building LLM Applications with RAG and Vector Search explanation →

A company has implemented a RAG-based chatbot using OCI Generative AI and OCI OpenSearch as the vector store. The chatbot answers questions about internal policies. The team uses a dense vector embedding model with 768 dimensions and the HNSW algorithm. The corpus contains 5 million documents. Users report that the chatbot takes 8-12 seconds to respond, and the answers are often not relevant, missing key policy details. Upon investigation, the team finds that the k-NN search returns results based solely on vector similarity, ignoring exact keyword matches that are critical for policy documents. Which course of action will most effectively improve both response time and relevance?