Oracle · Free Practice Questions · Last reviewed May 2026
24 real exam-style questions organised by domain, each with the correct answer highlighted and a plain-English explanation of why it's right and why the others are wrong.
A company is deploying a large language model for a customer service chatbot. The model needs to understand industry-specific jargon and maintain low latency. Which approach best balances these requirements?
Employ retrieval-augmented generation (RAG) with a general model
Rely solely on prompt engineering with a general model
Use a large general-purpose LLM with zero-shot prompting
Fine-tune a small open-source LLM on domain-specific data
Fine-tuning adapts the model to jargon and a smaller model keeps latency low.
A data scientist observes that their fine-tuned LLM performs well on training data but generates repetitive and dull responses in production. What is the most likely cause and best solution?
The model is overfitted; apply stronger regularization
The temperature is set too low; increase temperature during inference
Low temperature makes outputs deterministic and repetitive; increasing it adds variability.
The training data lacks diversity; add more varied examples
The model has too many layers; reduce model size
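To make the temperature explanation concrete, here is a minimal sketch of temperature-scaled softmax. The logits are hypothetical scores for three candidate tokens, and the function name is illustrative rather than part of any OCI API.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate tokens (assumption for illustration).
logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.2)   # sharply peaked distribution
high = softmax_with_temperature(logits, 1.5)  # flatter distribution

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

At the low temperature almost all probability mass sits on the top token, which is why sampled outputs look repetitive; the higher temperature spreads mass across the alternatives.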
An organization wants to use an LLM to summarize legal documents. Which consideration is most important for ensuring accurate summaries?
Fine-tune the model on a curated legal corpus
Domain-specific fine-tuning teaches the model legal terminology and reasoning.
Use the largest available general-purpose model
Rely on zero-shot summarization with careful prompting
Pre-train a new model from scratch on legal texts
A developer is building a code generation assistant. The model occasionally produces syntactically correct but semantically wrong code. Which technique directly addresses semantic correctness?
Expand the token vocabulary
Lower the temperature to 0
Apply RLHF using human-validated code examples
RLHF directly optimizes for desired outcomes like semantic correctness.
Increase beam search width
A company fine-tunes an LLM on internal support tickets. After deployment, the model hallucinates company-specific product names. What is the most effective mitigation?
Switch to a smaller model to reduce hallucination risk
Use prompt engineering to remind the model to be accurate
Implement RAG with a verified product database
RAG provides factual grounding, reducing hallucinations.
Fine-tune further with more ticket data
A team wants to evaluate an LLM's performance on a text classification task. Which metric is most appropriate for a balanced dataset?
BLEU score
Perplexity
Accuracy
Accuracy directly measures correct predictions, appropriate for balanced data.
ROUGE score
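As a small illustration of the metric, accuracy is the fraction of predictions that match the true labels. The labels below are made up for the example.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical balanced dataset: two examples per class.
y_true = ["billing", "technical", "billing", "technical"]
y_pred = ["billing", "technical", "technical", "technical"]

score = accuracy(y_true, y_pred)
print(score)  # 0.75 (3 of 4 predictions are correct)
```

On an imbalanced dataset the same number can be misleading, which is why the question specifies balanced data.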
Want more Fundamentals of Large Language Models practice?
A company uses OCI Generative AI Service to build a chatbot for customer support. They notice that the model sometimes generates inappropriate responses. What is the MOST effective way to mitigate this without retraining the model?
Fine-tune the model with curated safe examples
Configure system instructions to define acceptable behavior
System instructions constrain the model's output at inference time without retraining.
Reduce the temperature parameter to 0
Use the moderation API to filter responses
A developer wants to use OCI Generative AI Service to summarize long documents. Which endpoint should they use to send the document content?
/generate
/classify
/embed
/chat
The /chat endpoint accepts a conversation history, suitable for summarization tasks.
An organization deploys a fine-tuned model for legal document analysis using OCI Generative AI Service. They need to ensure that only authorized users in the 'LegalTeam' group can access the model endpoint. Which policy statement should be used?
Allow group LegalTeam to use generative-ai-model in compartment ABC
The 'use' verb allows the group to invoke the model for inference.
Allow group LegalTeam to manage generative-ai-family in compartment ABC
Allow group LegalTeam to read generative-ai-model in compartment ABC
Allow group LegalTeam to inspect generative-ai-model in compartment ABC
A data scientist is using OCI Generative AI Service to generate product descriptions. They notice that the output often repeats phrases. Which parameter adjustment would MOST directly address this issue?
Increase the temperature
Increase the max tokens
Increase the frequency penalty
Frequency penalty penalizes tokens that have already appeared, reducing repetition.
Decrease the top-p value
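The idea behind a frequency penalty can be sketched as follows. This uses one common formulation (subtract the penalty times the number of prior occurrences from each token's logit); the exact formula the OCI service applies is not stated here, so treat it as an assumption. The tokens and scores are hypothetical.

```python
from collections import Counter

def apply_frequency_penalty(logits, generated_tokens, penalty):
    """Lower each token's logit in proportion to how often it already appeared."""
    counts = Counter(generated_tokens)
    return {tok: score - penalty * counts[tok] for tok, score in logits.items()}

# Hypothetical next-token logits and generation history.
logits = {"great": 3.0, "durable": 2.5, "compact": 2.0}
history = ["great", "great", "durable"]

adjusted = apply_frequency_penalty(logits, history, penalty=0.8)
best = max(adjusted, key=adjusted.get)
print(adjusted)
print(best)
```

Before the penalty the already-repeated token has the highest score; after it, a token that has not yet been used ranks first.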
A company needs to integrate OCI Generative AI Service with an existing application that uses OCI IAM for authentication. They want to use resource principal to allow the application to call the service without storing API keys. Which step is REQUIRED?
Create an OCI API key for the application
Enable the Generative AI Service for resource principal in the tenancy
Assign the application to a group with admin privileges
Create a dynamic group and a policy granting access to the Generative AI Service
Resource principal authentication requires a dynamic group with matching rules and a policy that grants that group access to the service.
A developer is using OCI Generative AI Service to generate code snippets. They want to ensure the output is as deterministic as possible for testing. Which combination of parameters should they use?
Temperature = 0, Top-p = 1
Temperature=0 makes output deterministic; top-p=1 disables nucleus sampling.
Temperature = 0.5, Top-p = 0.5
Temperature = 0, Top-p = 0
Temperature = 1, Top-p = 1
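The interaction between the two parameters can be sketched with a toy sampler. This is an illustrative implementation, not the service's own: it assumes temperature 0 is treated as greedy argmax, and the function name and logits are hypothetical.

```python
import math
import random

def sample_token(logits, temperature, top_p, rng):
    """Pick a token index; temperature 0 is treated as greedy argmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Nucleus step: keep the most likely tokens until their mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three tokens
greedy = {sample_token(logits, 0, 1.0, random.Random(seed)) for seed in range(20)}
varied = {sample_token(logits, 1.0, 1.0, random.Random(seed)) for seed in range(20)}
print(greedy)
print(varied)
```

With temperature 0 every seed yields the same token, and with top-p at 1 the nucleus step keeps the whole vocabulary, so it has no filtering effect.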
Want more Using OCI Generative AI Service practice?
A developer is building a RAG application using Oracle Cloud Infrastructure (OCI) Document Understanding and OCI Generative AI. After chunking documents and generating embeddings, the developer observes that the retrieval step often returns chunks that are semantically unrelated to the query. Which action is MOST likely to improve retrieval relevance?
Switch from a dense embedding model to a sparse embedding model.
Adjust the chunk size and chunk overlap to better capture coherent passages.
Proper chunking helps preserve meaning and improves retrieval accuracy.
Increase the chunk size to capture more context.
Reduce the number of retrieved chunks (k) in the vector search.
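Chunk size and overlap can be illustrated with a simple word-based splitter. This is a minimal sketch; real pipelines often split on tokens or sentences instead, and the function name is hypothetical.

```python
def chunk_words(text, chunk_size, overlap):
    """Split text into word windows of chunk_size that share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

text = "one two three four five six seven eight nine ten"
chunks = chunk_words(text, chunk_size=4, overlap=2)
for c in chunks:
    print(c)
```

Each chunk repeats the last two words of the previous one, so a passage that straddles a boundary still appears intact in at least one chunk.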
An organization stores its knowledge base in Oracle Autonomous Database and wants to build a RAG chatbot using OCI Generative AI. The chatbot must retrieve the most relevant documents based on user queries. Which indexing approach is BEST suited for efficient similarity search on text embeddings?
Create an ANN index on the embedding vector column.
ANN indexes enable fast approximate nearest neighbor search in vector databases.
Create a bitmap index on the embedding vector column.
Create an inverted index on the document text column.
Create a B-tree index on the document text column.
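For context, the sketch below shows the exact (brute-force) cosine similarity search that an ANN index approximates. It is not an ANN index itself: it compares the query against every stored vector, which is what becomes too slow at scale. The document IDs and vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k):
    """Exact search: score every stored vector, then return the k best IDs."""
    scored = [(cosine_similarity(query, v), doc_id) for doc_id, v in vectors.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical 2-dimensional embeddings (real embeddings have many more dimensions).
vectors = {"doc_a": [1.0, 0.0], "doc_b": [0.7, 0.7], "doc_c": [0.0, 1.0]}
results = top_k([0.9, 0.1], vectors, k=2)
print(results)
```

An ANN index trades a small amount of recall for speed by avoiding the full scan over all vectors.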
A company is deploying a RAG pipeline using OCI Data Science and OCI Generative AI. The pipeline uses a Cohere command model for generation and a Cohere embed model for retrieval. The team notices that the model occasionally produces hallucinated answers that are not supported by the retrieved context. Which strategy is MOST effective at reducing hallucinations?
Implement a faithfulness verification step that re-ranks retrieved passages based on alignment with the generated answer.
A verification step can detect and mitigate unsupported claims.
Increase the temperature parameter of the generation model.
Increase the number of retrieved chunks (k) to provide more context.
Use a larger generative model with more parameters.
A data scientist is building a RAG application that processes PDF invoices. The extraction step uses OCI Document Understanding to convert PDFs to text. The scientist then splits the text into chunks and generates embeddings using OCI Generative AI. However, the retrieval often misses critical fields like invoice numbers and dates. Which preprocessing step would MOST likely improve retrieval of these specific fields?
Increase the chunk size to include entire invoices.
Apply stemming and lemmatization to the text before chunking.
Tag each chunk with metadata such as invoice number, date, and vendor, and use metadata filtering during retrieval.
Metadata filtering enables precise retrieval based on structured fields.
Switch from dense embeddings to sparse embeddings for better exact match.
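Metadata filtering can be sketched as an exact-match filter applied to chunks before (or alongside) vector search. The chunk structure, field names, and invoice values below are assumptions made for illustration.

```python
def filter_chunks(chunks, **required):
    """Keep only chunks whose metadata matches every required field exactly."""
    return [
        c for c in chunks
        if all(c["metadata"].get(key) == value for key, value in required.items())
    ]

# Hypothetical chunks tagged at ingestion time with structured invoice fields.
chunks = [
    {"text": "Total due: 1,200.00",
     "metadata": {"invoice_number": "INV-001", "vendor": "Acme", "date": "2024-01-15"}},
    {"text": "Total due: 860.50",
     "metadata": {"invoice_number": "INV-002", "vendor": "Acme", "date": "2024-02-03"}},
    {"text": "Total due: 99.00",
     "metadata": {"invoice_number": "INV-003", "vendor": "Globex", "date": "2024-02-03"}},
]

by_invoice = filter_chunks(chunks, invoice_number="INV-002")
by_vendor_and_date = filter_chunks(chunks, vendor="Acme", date="2024-02-03")
print(by_invoice[0]["text"])
print(len(by_vendor_and_date))
```

Exact fields such as invoice numbers are matched directly here, so retrieval no longer depends on the embedding capturing them.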
A developer is using OCI Generative AI to build a question-answering system over a large corpus of technical manuals. The developer uses the Cohere Embed model to generate embeddings and stores them in an OCI OpenSearch cluster. Queries are slow and the team needs to reduce latency. Which approach is BEST for improving search speed while maintaining acceptable accuracy?
Increase the embedding dimension for better representation.
Reduce the k value in the nearest neighbor search.
Retrieving fewer neighbors means fewer distance computations and faster retrieval.
Use exact nearest neighbor search instead of approximate.
Increase the index refresh interval to reduce write overhead.
A team is deploying a RAG system that uses OCI Generative AI to answer questions about internal HR policies. The system must comply with data residency requirements: all data processing must stay within a specific OCI region. The team uses OCI Data Science for orchestration. Which architecture BEST meets the data residency requirement?
Deploy the generative AI model endpoints within the same OCI region as the data and compute.
All components remain in the specified region, ensuring compliance.
Use OCI Generative AI endpoints in a different region but store data in the required region.
Use an external third-party LLM endpoint that guarantees data residency.
Store embeddings in a different region but run inference in the required region.
Want more Building LLM Applications with RAG and Vector Search practice?
A company is deploying a generative AI service on OCI using the OCI Data Science service with a large language model (LLM) in a VCN. The model inference endpoint must be accessible only from a private subnet within the same VCN. Which networking component should be configured to enable this?
NAT Gateway
Dynamic Routing Gateway (DRG)
Internet Gateway
Service Gateway
Service gateway enables private subnet access to OCI services like Data Science.
A data scientist is fine-tuning a generative AI model on OCI Data Science using a custom container with GPU resources. The training job fails with an out-of-memory error despite the GPU instance having sufficient memory. The job works fine on a smaller dataset. What is the most likely cause?
The training script has a memory leak
The GPU instance is not supported by OCI Data Science
The model is not compatible with the PyTorch version
The batch size is too large for the GPU memory
Large batch size can cause OOM errors; reducing batch size resolves it.
An organization wants to deploy a generative AI chatbot using OCI Generative AI service. The chatbot must comply with data residency requirements by ensuring that all data processing occurs within a specific geographic region. What is the best practice to achieve this?
Use a dedicated AI cluster in the required region
Dedicated AI clusters are region-specific and ensure data stays in that region.
Enable cross-region replication for disaster recovery
Configure a tenancy-wide policy to restrict region usage
Use IAM policies to block access from other regions
A team has deployed a generative AI model using OCI Data Science model deployment. The endpoint is behind a load balancer. Users report that after 5 minutes of inactivity, the first request takes over 30 seconds to respond, while subsequent requests are fast. What is the most likely cause and solution?
The model deployment has an idle timeout that scales down to zero; configure a minimum number of instances or use a warm-up request
An idle timeout causes a cold start; setting a minimum number of instances or sending warm-up requests avoids it.
The load balancer is scaling based on CPU utilization; increase the CPU threshold
The VCN has a network latency issue; use a different availability domain
The inference code has a lazy initialization; pre-load the model in the deployment script
A company is using OCI Generative AI service with a dedicated AI cluster for text generation. They notice that the latency is higher than expected. The cluster is in the Ashburn region, and users are distributed globally. What is the most effective way to reduce latency?
Enable the OCI Generative AI inference optimizer
Deploy dedicated AI clusters in regions closer to the users
Geographic proximity reduces network round-trip time.
Increase the number of nodes in the dedicated AI cluster
Use a content delivery network (CDN) to cache responses
A machine learning engineer is deploying a fine-tuned Llama 2 model on OCI Data Science model deployment. The deployment fails with an error: 'Model artifact exceeds the maximum allowed size of 10 GB.' The model files total 12 GB. What is the best approach to resolve this?
Store the model in Object Storage and reference it in the deployment configuration
Object Storage allows large models and is supported by model deployment.
Use a different model that is smaller than 10 GB
Increase the model deployment artifact size limit via a service request
Compress the model artifact to under 10 GB using gzip
Want more Deploying and Managing Generative AI on OCI practice?
The 1Z0-1127 exam has 40 questions and must be completed in 90 minutes. The passing score is 65%.
Scenario-based questions covering exam objectives with detailed answer explanations.
The exam covers 4 domains: Fundamentals of Large Language Models, Using OCI Generative AI Service, Building LLM Applications with RAG and Vector Search, and Deploying and Managing Generative AI on OCI. Questions are weighted by domain, so higher-weight domains appear more often on your actual exam.
No. These are original exam-style practice questions written against the official Oracle 1Z0-1127 exam objectives. They are not copied from the real exam. Courseiva focuses on genuine understanding, not memorisation of braindumps.
Courseiva tracks your accuracy per domain and routes you toward weak areas automatically. Free, no account required.