Practice 1Z0-1127 Deploying and Managing Generative AI on OCI questions with full explanations on every answer.
Click any question to see the full explanation and answer options, or start a focused practice session above.
1. A company is deploying a generative AI service on OCI using the OCI Data Science service with a large language model (LLM) in a VCN. The model inference endpoint must be accessible only from a private subnet within the same VCN. Which networking component should be configured to enable this?
2. A data scientist is fine-tuning a generative AI model on OCI Data Science using a custom container with GPU resources. The training job fails with an out-of-memory error despite the GPU instance having sufficient memory. The job works fine on a smaller dataset. What is the most likely cause?
3. An organization wants to deploy a generative AI chatbot using OCI Generative AI service. The chatbot must comply with data residency requirements by ensuring that all data processing occurs within a specific geographic region. What is the best practice to achieve this?
4. A team has deployed a generative AI model using OCI Data Science model deployment. The endpoint is behind a load balancer. Users report that after 5 minutes of inactivity, the first request takes over 30 seconds to respond, while subsequent requests are fast. What is the most likely cause and solution?
5. A company is using OCI Generative AI service with a dedicated AI cluster for text generation. They notice that the latency is higher than expected. The cluster is in the Ashburn region, and users are distributed globally. What is the most effective way to reduce latency?
6. A machine learning engineer is deploying a fine-tuned Llama 2 model on OCI Data Science model deployment. The deployment fails with an error: 'Model artifact exceeds the maximum allowed size of 10 GB.' The model files total 12 GB. What is the best approach to resolve this?
7. A developer wants to call the OCI Generative AI service from a Python application running on an OCI Compute instance. Which method is the most secure for authenticating the API calls?
8. Which TWO actions are recommended best practices for managing costs when using OCI Generative AI dedicated AI clusters?
9. Which THREE components are required to deploy a custom generative AI model on OCI Data Science model deployment?
10. Which TWO are valid methods to monitor the performance of a generative AI model deployed on OCI Data Science?
11. An administrator runs the above CLI command to check the status of a dedicated AI cluster. The cluster is ACTIVE with capacity 10. However, a user reports that inference requests to this cluster are failing with a '429 Too Many Requests' error. What is the most likely cause?
12. A security administrator wrote the above IAM policy for a compartment named MyCompartment. Users in the GenerativeAIUsers group can successfully list dedicated AI clusters and models in MyCompartment, but when they try to create an inference endpoint using a model from a different compartment (SharedModels), they get an authorization error. What is the most likely missing policy statement?
13. A company has deployed a generative AI model on OCI to generate product descriptions. After a recent update, the model started producing outputs with repetitive phrases and poor coherence. The inference endpoint is configured with default parameters. Which single parameter adjustment is most likely to improve output quality?
14. An organization wants to fine-tune a large language model on OCI using their proprietary data. They are concerned about data privacy and want to ensure that fine-tuning data does not leave the OCI region. Which OCI service should they use to securely store and manage their training data?
15. A data scientist is deploying a custom generative AI model using OCI Data Science. After deploying the model to an endpoint, they notice that inference requests are failing with a timeout error when the payload size exceeds 1 MB. What is the most likely cause and solution?
16. A company is using OCI Generative AI service to power a customer support chatbot. They observe that the chatbot sometimes provides outdated information because the model was trained on data up to 2022. They want to incorporate real-time knowledge without retraining the model. Which approach should they use?
17. A team is deploying a generative AI model using OCI Functions for serverless inference. They are experiencing cold start latency of over 10 seconds for the first invocation after idle periods. What is the best strategy to reduce cold start latency?
18. An administrator needs to ensure that only specific users in the finance department can invoke a generative AI model deployed on OCI. Which IAM policy should be used?
19. A company is deploying a large generative AI model on OCI using GPU compute instances. They want to optimize inference cost while maintaining acceptable latency. Which TWO strategies should they implement?
20. A team is fine-tuning a generative AI model on OCI using a custom dataset. The training job fails with an out-of-memory error. Which THREE actions should they take to resolve this issue?
21. A large enterprise is deploying a generative AI model for internal document summarization. The model is deployed on OCI Data Science using a custom container. The inference endpoint is behind a public load balancer. The security team requires that all traffic between the client and the endpoint be encrypted in transit and that the endpoint not be accessible from the public internet. The current setup uses a public load balancer with an SSL certificate. The VCN has a public subnet for the load balancer and a private subnet for the model deployment. The security team is concerned that the load balancer is publicly accessible. The enterprise wants to maintain high availability and low latency. What should the architect do to meet the security requirements?
22. A healthcare company is using OCI Generative AI to analyze patient records and generate clinical summaries. The company must comply with HIPAA regulations, which require that all protected health information (PHI) be encrypted at rest and in transit, and that access be logged and audited. The current architecture uses an OCI Data Science model deployment with a public endpoint. The model is stored in an OCI Object Storage bucket that is publicly accessible for testing. The company is now moving to production. The compliance officer has flagged the following issues: (1) The model endpoint is publicly accessible. (2) The bucket containing the model is public. (3) No audit logs are enabled. The company wants to remediate these issues while maintaining the ability to invoke the model from on-premises applications via a secure connection. Which set of actions should the architect take?
23. You are deploying a generative AI solution on OCI for a healthcare client that requires strict data residency (data must remain in the EU) and low-latency inference. The solution uses a fine-tuned LLM model (7B parameters) stored in Object Storage in the Frankfurt region. You have set up an OCI Data Science model deployment endpoint with GPU shape VM.GPU.A10.1, using a single replica. During load testing with 50 concurrent users, you observe high latency (average 8 seconds per request) and occasional 504 gateway timeouts. The model deployment logs show no errors, and the model loads successfully. You have confirmed that the Object Storage bucket is in the same region and that the network latency between the client and the endpoint is minimal (under 5 ms). Which action should you take to reduce latency and eliminate timeouts?
24. A company has fine-tuned a large language model using OCI Generative AI service. When attempting to deploy the model to a dedicated endpoint, the deployment fails with an error indicating insufficient capacity. Which action should be taken to resolve this issue?
25. A startup wants to minimize costs when using OCI Generative AI service for a chatbot application that experiences sporadic usage. Which deployment strategy is most cost-effective?
26. A global enterprise is deploying a generative AI application that requires high availability across multiple OCI regions. The application must automatically fail over to a secondary region if the primary region becomes unavailable. What is the recommended architecture to achieve this?
27. A data scientist is using the OCI Generative AI service to generate text completions. The API calls are returning HTTP 400 errors with the message 'Invalid model parameters'. What is the most likely cause?
28. A developer wants to deploy a custom generative AI model that was trained using OCI Data Science. Which service should they use to expose the model as an API endpoint?
29. A financial services company is concerned about data privacy when using OCI Generative AI service for processing sensitive customer data. They want to ensure that their data is not used to improve the model and is encrypted at rest and in transit. Which combination of OCI features should they implement?
30. A company is using OCI Generative AI service to generate product descriptions. They notice that the model sometimes generates biased content. Which approach should they take to mitigate bias while maintaining performance?
31. A user wants to access the OCI Generative AI service programmatically. Which credential method is recommended for use in a production application running on OCI Compute?
32. An organization is deploying a generative AI model that requires GPU acceleration for inference. They are using OCI Data Science Model Deployment. The model is expected to handle variable traffic, with occasional spikes. Which scaling option should they configure to ensure cost-efficiency and responsiveness?
33. Which TWO factors should be considered when selecting a base model for fine-tuning on OCI Generative AI service?
34. Which TWO methods can be used to invoke a generative AI model deployed on OCI?
35. Which THREE steps are required to deploy a custom generative AI model using OCI Data Science Model Deployment?
36. Given the CLI output from `oci generative-ai model list`, what can be determined about the model 'my-fine-tuned-model'?
37. An administrator created the above IAM policies. A member of the GenerativeAIAdmins group reports they cannot invoke the model endpoint. Which permission is missing?
38. A developer receives the above error when trying to send a request to a model endpoint. What is the most likely reason?
39. A company is deploying a fine-tuned Cohere model on OCI Generative AI service for real-time inference. They need to ensure low latency even during demand spikes. Which configuration should they prioritize?
40. A data scientist needs to generate vector embeddings for a large corpus of text documents to use in a semantic search application. Which OCI service is best suited for this task?
41. An organization is fine-tuning a large language model on OCI Data Science. They must ensure that the training data remains within a specific geographic region and is encrypted at rest. Which combination of resources should they use?
42. A company has deployed a generative AI model endpoint on OCI. They want to monitor token usage and latency for cost optimization. Which OCI service should they use to collect these metrics?
43. A developer wants to integrate OCI Generative AI into a web application. Which API authentication method is recommended for programmatic access?
44. A data science team is using OCI Data Science to fine-tune a model. They notice that training jobs are failing due to out-of-memory errors on the notebook session. What should they do to resolve this?
45. A company wants to use OCI Generative AI to summarize customer support tickets. They need to ensure that the model does not output any sensitive information. Which technique should they implement?
46. An administrator needs to grant a data science team access to create and manage generative AI model endpoints in a specific compartment. Which policy should they create?
47. A company has deployed a generative AI endpoint using a custom fine-tuned model. They observe that the endpoint is returning 429 (Too Many Requests) errors during business hours. They need to handle this without losing requests. What should they implement?
48. Which TWO of the following are valid ways to consume OCI Generative AI models?
49. Which THREE of the following are best practices when deploying a generative AI model on OCI?
50. Which TWO of the following are sources of training data for fine-tuning a model in OCI Generative AI?
51. A company has fine-tuned a custom Llama 3 model using OCI Data Science for a chatbot. They now need a production-grade inference endpoint with auto-scaling. Which OCI service should they use?
52. An organization is deploying a large language model on OCI using a dedicated AI cluster. They need to minimize inference latency. Which configuration step is most critical?
53. A developer is deploying a fine-tuned model using OCI Generative AI service. They want to use a custom container image for inference. Which statement is true?
54. A company needs to ensure that only authorized users can invoke an endpoint for a generative AI model. Which OCI feature should be used to control access?
55. A team is fine-tuning a foundation model on a large dataset stored in OCI Object Storage. They want to minimize data transfer costs. What is the best practice for locating the storage?
56. During deployment of a generative AI model, the inference endpoint returns high latency and timeouts. The model is deployed on a dedicated AI cluster with multiple nodes. What is the most likely cause?
57. A company wants to use OCI Generative AI service to generate marketing copy that adheres to brand guidelines. Which technique should they use?
58. A team has deployed a generative AI model and needs to monitor inference performance and set up alerts for increased error rates. Which OCI service should they integrate with?
59. An organization is deploying multiple generative AI models on a shared dedicated AI cluster. They need to isolate resource usage for each model to avoid interference. Which strategy is recommended?
60. A data scientist is preparing to fine-tune a foundation model on OCI. Which two actions should they take to optimize costs? (Select TWO.)
61. A DevOps engineer is setting up monitoring and logging for a generative AI inference endpoint. Which three resources should they enable? (Select THREE.)
62. An enterprise is deploying a generative AI model that must comply with data residency regulations. Which two configurations should they implement? (Select TWO.)
63. Refer to the exhibit. A user receives this error when using the OCI CLI to chat with a model. What is the most likely cause?
64. Refer to the exhibit. A team created this dedicated AI cluster. However, when they try to create a model deployment, the deployment fails with an error indicating insufficient public IPs. What change to the cluster configuration should they make?
65. Refer to the exhibit. A data scientist received this output after submitting a fine-tuning job. What is the most effective change to resolve the out-of-memory error?
66. A data scientist wants to deploy a fine-tuned LLM on OCI for inference with low latency. Which OCI service should they use?
67. A company notices that some inference requests to their deployed model on OCI Generative AI take longer than acceptable. They want to reduce per-request latency. What should they do?
68. A financial institution needs to deploy a fine-tuned model on OCI with strict data residency requirements. They must ensure that data used for inference never leaves a specific OCI region. The model is stored in Object Storage in the same region. What additional configuration is needed?
69. A developer wants to integrate generative AI capabilities into an application using REST API calls. Which OCI Generative AI service endpoint should they use for text generation?
70. A company has deployed a model on a Dedicated AI Cluster and needs to monitor inference performance metrics such as request latency, throughput, and error rates. Which OCI service provides built-in monitoring dashboards for these metrics?
71. An AI team is fine-tuning a large language model using OCI Data Science and plans to deploy the fine-tuned model using the Generative AI service's custom model deployment. What is the required format for the model artifacts?
72. A company has multiple teams sharing an OCI Generative AI Dedicated AI Cluster. They need to ensure that each team can only access their own fine-tuned models and cannot see or invoke models from other teams. What is the best approach?
73. What is the primary benefit of using a Dedicated AI Cluster over On-Demand serving for deploying generative AI models on OCI?
74. A developer is getting a 401 Unauthorized error when calling the OCI Generative AI inference API. What is the most likely cause?
75. Which two actions are required when deploying a custom fine-tuned model using the OCI Generative AI service? (Choose two.)
76. A company is designing a generative AI solution on OCI that must comply with data privacy regulations. Which three best practices should they follow? (Choose three.)
77. Which two metrics would you monitor to ensure a generative AI deployment on OCI is operating efficiently? (Choose two.)
78. Refer to the exhibit. An administrator receives the error shown when attempting to deploy a custom model. What is the most likely cause?
79. Refer to the exhibit. The dashboard shows latency grouped by modelId, but some points are missing for certain modelIds. Which of the following is the most likely reason?
80. Refer to the exhibit. Users in the group cannot create a new custom model deployment on a Dedicated AI Cluster. What is the most likely missing permission?
81. A retail company uses OCI Generative AI to generate product descriptions. They observe the model occasionally produces biased content. Which technique should be applied to reduce bias in model outputs?
82. A developer wants to invoke an OCI Generative AI model from an application running on a compute instance in OCI. The instance is in a private subnet. What is the most secure method to access the model endpoint?
83. A data scientist needs to fine-tune a model on OCI Generative AI. Which of the following is a required parameter in the fine-tuning request?
84. A company is deploying a chatbot powered by OCI Generative AI. They want to inject the conversation history into the model prompt to maintain context. However, they notice that after a long conversation, the model starts to ignore earlier messages. What is the most likely cause?
85. An organization needs to ensure that all inference requests to OCI Generative AI are logged for compliance. Which OCI feature should be enabled?
86. A team wants to use OCI Generative AI to generate synthetic data for training a model. They are concerned about the cost of API calls. Which pricing model would be most cost-effective for high-volume batch processing?
87. A company deploys a fine-tuned model on an OCI Generative AI dedicated AI cluster. After deployment, they observe high latency during peak hours. The cluster has only one replica. Which action would most effectively reduce latency without increasing cost unnecessarily?
88. An enterprise with strict data residency requirements wants to use OCI Generative AI. They must ensure that no training data or inference data leaves a specific OCI region. Which configuration option should they choose?
89. A machine learning engineer evaluates OCI Generative AI for a real-time content generation application. They need to meet an SLA of 99.9% availability. Which deployment architecture satisfies the requirement with the lowest cost?
90. A company is designing a generative AI application using OCI Generative AI. Which two factors should be considered when selecting the appropriate model? (Choose two.)
91. An OCI administrator is configuring access control for OCI Generative AI. Which three IAM components are required to allow a group of data scientists to call the GenerateText API? (Choose three.)
92. A developer is troubleshooting an OCI Generative AI inference request that returns a 400 Bad Request error. Which three common causes could result in this error? (Choose three.)
93. A user sends an inference request with the JSON parameters shown. They notice the model is returning very short responses. What is the most likely cause?
94. A user runs the CLI command shown but receives only one model in the list, even though they know there are more models available in the compartment. What is the most likely reason?
95. A data scientist in group DataScientists uses the OCI Generative AI SDK to start a fine-tuning job in compartment AIResources. They receive the error shown. What is the most likely cause?
96. A company deploys a fine-tuned Llama 2 model using OCI Generative AI service. They want to ensure low-latency inference for a real-time chat application. Which deployment option should they use?
97. A data scientist fine-tunes a model using OCI Data Science and wants to deploy it as a managed endpoint in OCI Generative AI. What must they do first?
98. A team deploys a generative AI model endpoint and notices intermittent 429 Too Many Requests errors. The endpoint is configured with auto-scaling using a dedicated AI cluster. What is the most likely cause?
99. An organization wants to use OCI Generative AI to build a summarization tool but must ensure that all inference requests are logged for audit purposes. Which approach should they take?
100. A company deploys a large language model on a dedicated AI cluster with 4 nodes. The model requires 128 GB of memory per instance, but the nodes have only 64 GB each. During inference, the nodes experience out-of-memory errors. What is the best solution?
101. A team uses OCI Generative AI’s fine-tuning capability to adapt a base model. After fine-tuning, they evaluate the model but see degraded performance on certain edge cases. What is the most likely cause?
102. A financial company deploys a generative AI model for document analysis. They need to ensure that the model does not expose sensitive information in its responses. Which OCI service should they use to implement content filtering?
103. A user wants to invoke an OCI Generative AI endpoint from a cloud function. What is the required authentication method?
104. An administrator notices that a dedicated AI cluster is not scaling down after a period of low traffic. What could be the cause?
105. Which TWO are best practices for securing a generative AI endpoint on OCI? (Select TWO)
106. Which TWO actions should be taken to monitor model drift in a deployed generative AI model? (Select TWO)
107. Which THREE components are essential for a production-grade generative AI deployment on OCI? (Select THREE)
108. A company has deployed a fine-tuned GPT model on OCI Generative AI using a dedicated AI cluster with 2 nodes. The endpoint is used by an internal application that generates product descriptions. Recently, the application started receiving timeouts and slow responses. The monitoring dashboard shows that the cluster's CPU utilization is consistently above 90%, and the request queue is growing. The team has verified that the model and code have not changed. The application traffic has increased by 20% over the past month. What should the team do to resolve the issue?
109. A data science team at a healthcare company has fine-tuned a Llama 2 model using OCI Data Science and registered it in the Model Catalog. They want to deploy it as a managed endpoint using OCI Generative AI. The model requires 64 GB of GPU memory. The team has created a dedicated AI cluster with a single node shape that has 48 GB GPU memory. When they attempt to deploy the model, the deployment fails with an error indicating insufficient resources. The team has verified that the model artifact is correct and that the compartment policies allow deployment. What should the team do to successfully deploy the model?
110. A multinational corporation uses OCI Generative AI to power a customer support chatbot. The chatbot uses a fine-tuned model deployed on a dedicated AI cluster in the us-ashburn-1 region. The application is used globally, and users in Europe are experiencing high latency (over 2 seconds) compared to users in North America (under 500 ms). The company has a requirement to keep all data within the US due to compliance, so they cannot deploy in Europe. The latency is not due to network bandwidth but due to the inference time. The monitoring shows that the cluster is at 80% utilization during peak hours. The team wants to reduce the latency for European users without violating data residency. What is the best course of action?
111. Your team has deployed a fine-tuned GPT-2 model on OCI Model Deployment for a simple text generation API. The model performs text completion for short prompts (e.g., 50 tokens). The endpoint is working but response times are over 10 seconds for these short prompts. The model size is approximately 500 MB and you used a VM.Standard.E3.Flex shape (2 OCPU, 16 GB RAM). The deployment is in a single replica with no autoscaling. You have verified that the network latency is minimal (<5 ms). The model was trained in OCI Data Science using a GPU shape, but during deployment you selected a CPU shape to reduce cost. The model is a transformer-based neural network. You've also confirmed that the deployment is healthy and there are no errors in the logs. The memory usage is within limits. What is the most likely cause of the high latency?
112. A data scientist deployed a fine-tuned Llama 2 7B model on OCI Model Deployment with a single VM.GPU.A10.1 shape. Users report average latency of 3 seconds per request, which is too high for the intended real-time application. The model is used for short text generation (max 128 tokens). The data scientist wants to reduce per-request latency without significant accuracy loss. Which action would be most effective?
113. A company wants to deploy a custom generative AI model for generating synthetic data for training other models. The model requires approximately 20 GB of memory and must be accessible via a REST API with authentication. Additionally, the team needs to monitor for data drift over time. Which combination of OCI services best meets these requirements with minimal operational overhead?
114. A company is deploying a generative AI model for a real-time inference API. To ensure high availability and cost efficiency under variable load, which two configurations should they implement? (Choose two.)
115. A company is deploying a generative AI model on OCI for an internal application that must comply with strict security policies. The model will be accessed by a limited group of users. Which three actions should the administrator take to ensure security? (Choose three.)
116. Your organization uses OCI Data Science to train a generative AI model for code generation. After training, you want to deploy it as a REST API. You create a model deployment using the OCI console, but after 30 minutes the deployment status is still 'Creating'. You check the logs and see the message: 'Insufficient capacity for shape VM.GPU.A10.1 in availability domain AD-1'. The deployment is configured with a single replica. You have verified your tenancy has sufficient service limits for GPU instances. What should you do to resolve this issue quickly?
117. You manage a generative AI model deployed on OCI Model Deployment that serves a chatbot application. The model is a 13B parameter LLM on a VM.GPU.A100.1 shape. Recently, you rolled out a new version of the model that is supposed to improve response quality. However, after the update, the application starts returning HTTP 500 errors and memory usage spikes. You need to update to the new version without causing downtime. The current deployment has 2 replicas with autoscaling enabled. Which strategy should you use to safely deploy the new model version?
118. Your team is deploying a generative AI model for a clinical decision support system. The model must meet HIPAA compliance requirements. You have trained a model using OCI Data Science and now need to deploy it so that patient data is protected. The application requires real-time inference. Which set of actions should you take to ensure compliance while maintaining low latency?
119. You deployed a generative AI model on OCI Model Deployment with autoscaling configured based on average CPU utilization. The model is a large language model that heavily utilizes the GPU. During peak hours, the scaling is too slow to keep up with demand, resulting in high latency for users. You want to improve the responsiveness of autoscaling. Which change should you make?
120. Your company uses OCI Data Science for model development and deployment. You have a generative AI model that requires dynamic batching for efficient inference. You deployed the model using the OCI Model Deployment service with a custom inference script in a Docker container. However, you notice that the batch size is fixed at 1, leading to low throughput. The model can process multiple requests together efficiently. You want to implement dynamic batching to increase throughput without significantly increasing latency for individual requests. What is the best approach?
121. A generative AI model deployed on OCI Model Deployment is experiencing high tail latency. The model is a large language model that processes variable-length input sequences. Profiling shows that inference time varies significantly: short inputs (100 tokens) take 100 ms, while long inputs (2000 tokens) take 2 seconds. The application requires consistent low latency (<500 ms) for most requests. You want to reduce the variance in inference time without major changes to the model architecture. Which technique should you apply?
122. Your organization has deployed a generative AI model for a multilingual translation service on OCI Model Deployment. The model is a 13B parameter transformer hosted on a single VM.GPU.A100.1 shape with 2 replicas. Recently, the service experiences intermittent timeouts when a burst of requests arrives. You have enabled autoscaling based on CPU utilization, but the scaling is too slow. After investigation, you find that the model inference time is highly variable due to different sequence lengths. You need to ensure the service can handle sudden spikes without timeouts. Which solution should you implement?
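Several questions in the list above (11, 47, and 98, for example) involve '429 Too Many Requests' errors from an inference endpoint. One common client-side way to absorb such errors without dropping requests is to retry with exponential backoff and jitter. The following is a minimal sketch of that general pattern, not an OCI SDK API and not necessarily the expected exam answer: `TooManyRequests`, `call_with_backoff`, and the `send` callable are hypothetical names introduced only for illustration.

```python
import random
import time


class TooManyRequests(Exception):
    """Hypothetical stand-in for an HTTP 429 response from an endpoint."""


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call send(); on TooManyRequests, wait and retry with exponential backoff.

    The wait before each retry is a random value between 0 and a cap that
    doubles per attempt (full jitter), bounded by max_delay.
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
```

In a real application, `send` would wrap whatever client call issues the inference request and would raise `TooManyRequests` (or the client library's own equivalent) when the service responds with status 429. The `sleep` parameter is injectable so the waiting behavior can be exercised without real delays.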
The Deploying and Managing Generative AI on OCI domain covers the key concepts tested in this area of the 1Z0-1127 exam blueprint published by Oracle. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all 1Z0-1127 domains — no account required.
The Courseiva 1Z0-1127 question bank contains 122 questions in the Deploying and Managing Generative AI on OCI domain.
Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.
Yes — the session launcher on this page draws questions exclusively from the Deploying and Managing Generative AI on OCI domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.