1Z0-1127 Deploying and Managing Generative AI on OCI — All Questions With Answers

Question 1 (medium, multiple choice)

A company is deploying a generative AI service on OCI using the OCI Data Science service with a large language model (LLM) in a VCN. The model inference endpoint must be accessible only from a private subnet within the same VCN. Which networking component should be configured to enable this?

Question 2 (hard, multiple choice)

A data scientist is fine-tuning a generative AI model on OCI Data Science using a custom container with GPU resources. The training job fails with an out-of-memory error despite the GPU instance having sufficient memory. The job works fine on a smaller dataset. What is the most likely cause?

Question 3 (easy, multiple choice)

An organization wants to deploy a generative AI chatbot using OCI Generative AI service. The chatbot must comply with data residency requirements by ensuring that all data processing occurs within a specific geographic region. What is the best practice to achieve this?

Question 4 (hard, multiple choice)

A team has deployed a generative AI model using OCI Data Science model deployment. The endpoint is behind a load balancer. Users report that after 5 minutes of inactivity, the first request takes over 30 seconds to respond, while subsequent requests are fast. What is the most likely cause and solution?

Question 5 (medium, multiple choice)

A company is using OCI Generative AI service with a dedicated AI cluster for text generation. They notice that the latency is higher than expected. The cluster is in the Ashburn region, and users are distributed globally. What is the most effective way to reduce latency?

Question 6 (hard, multiple choice)

A machine learning engineer is deploying a fine-tuned Llama 2 model on OCI Data Science model deployment. The deployment fails with an error: 'Model artifact exceeds the maximum allowed size of 10 GB.' The model files total 12 GB. What is the best approach to resolve this?

Question 7 (easy, multiple choice)

A developer wants to call the OCI Generative AI service from a Python application running on an OCI Compute instance. Which method is the most secure for authenticating the API calls?

Question 8 (medium, multi-select)

Which TWO actions are recommended best practices for managing costs when using OCI Generative AI dedicated AI clusters?

Question 9 (hard, multi-select)

Which THREE components are required to deploy a custom generative AI model on OCI Data Science model deployment?

Question 10 (easy, multi-select)

Which TWO are valid methods to monitor the performance of a generative AI model deployed on OCI Data Science?

Question 11 (medium, multiple choice)

An administrator runs the CLI command shown in the exhibit to check the status of a dedicated AI cluster. The cluster is ACTIVE with capacity 10. However, a user reports that inference requests to this cluster are failing with a '429 Too Many Requests' error. What is the most likely cause?

Exhibit

Refer to the exhibit.

```
$ oci generative-ai dedicated-ai-cluster get --dedicated-ai-cluster-id ocid1.dedicatedaicluster.oc1.iad.xxxxx
{
  "data": {
    "capacity": 10,
    "id": "ocid1.dedicatedaicluster.oc1.iad.xxxxx",
    "lifecycle-state": "ACTIVE",
    "time-created": "2024-01-15T10:00:00Z",
    "time-updated": "2024-01-15T10:00:00Z"
  }
}
```

Question 12 (hard, multiple choice)

A security administrator wrote the IAM policy shown in the exhibit for a compartment named MyCompartment. Users in the GenerativeAIUsers group can successfully list dedicated AI clusters and models in MyCompartment, but when they try to create an inference endpoint using a model from a different compartment (SharedModels), they get an authorization error. What is the most likely missing policy statement?

Exhibit

Refer to the exhibit.

```
{
  "statements": [
    "ALLOW GROUP GenerativeAIAdmins TO USE generative-ai-family IN TENANCY",
    "ALLOW GROUP GenerativeAIUsers TO USE generative-ai-dedicated-ai-clusters IN COMPARTMENT MyCompartment",
    "ALLOW GROUP GenerativeAIUsers TO USE generative-ai-models IN COMPARTMENT MyCompartment"
  ]
}
```

Question 13 (hard, multiple choice)

A company has deployed a generative AI model on OCI to generate product descriptions. After a recent update, the model started producing outputs with repetitive phrases and poor coherence. The inference endpoint is configured with default parameters. Which single parameter adjustment is most likely to improve output quality?

Question 14 (easy, multiple choice)

An organization wants to fine-tune a large language model on OCI using their proprietary data. They are concerned about data privacy and want to ensure that fine-tuning data does not leave the OCI region. Which OCI service should they use to securely store and manage their training data?

Question 15 (medium, multiple choice)

A data scientist is deploying a custom generative AI model using OCI Data Science. After deploying the model to an endpoint, they notice that inference requests are failing with a timeout error when the payload size exceeds 1 MB. What is the most likely cause and solution?

Question 16 (hard, multiple choice)

A company is using OCI Generative AI service to power a customer support chatbot. They observe that the chatbot sometimes provides outdated information because the model was trained on data up to 2022. They want to incorporate real-time knowledge without retraining the model. Which approach should they use?
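
One widely used way to give a model current knowledge without retraining is to retrieve relevant text at request time and place it in the prompt. The sketch below is only an illustration of that idea, not an OCI API: `retrieve` and `build_prompt` are hypothetical helpers, and the word-overlap scoring is a toy stand-in for a real search index or vector store.

```python
def retrieve(question, documents, top_k=2):
    """Rank documents by how many question words they share (toy retriever)."""
    question_words = set(question.lower().split())
    scored = []
    for doc in documents:
        overlap = len(question_words & set(doc.lower().split()))
        scored.append((overlap, doc))
    # Highest overlap first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no words with the question.
    return [doc for overlap, doc in scored[:top_k] if overlap > 0]


def build_prompt(question, documents, top_k=2):
    """Assemble a prompt that grounds the answer in the retrieved context."""
    context = retrieve(question, documents, top_k)
    context_block = "\n".join(f"- {doc}" for doc in context)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}"
    )
```

The resulting string would then be sent to whichever model endpoint the application uses; keeping the knowledge base fresh becomes a data problem rather than a training problem.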

Question 17 (medium, multiple choice)

A team is deploying a generative AI model using OCI Functions for serverless inference. They are experiencing cold start latency of over 10 seconds for the first invocation after idle periods. What is the best strategy to reduce cold start latency?

Question 18 (easy, multiple choice)

An administrator needs to ensure that only specific users in the finance department can invoke a generative AI model deployed on OCI. Which IAM policy should be used?

Question 19 (medium, multi-select)

A company is deploying a large generative AI model on OCI using GPU compute instances. They want to optimize inference cost while maintaining acceptable latency. Which TWO strategies should they implement?

Question 20 (hard, multi-select)

A team is fine-tuning a generative AI model on OCI using a custom dataset. The training job fails with an out-of-memory error. Which THREE actions should they take to resolve this issue?

Question 21 (medium, multiple choice)

A large enterprise is deploying a generative AI model for internal document summarization. The model is deployed on OCI Data Science using a custom container. The inference endpoint is behind a public load balancer. The security team requires that all traffic between the client and the endpoint be encrypted in transit and that the endpoint not be accessible from the public internet. The current setup uses a public load balancer with an SSL certificate. The VCN has a public subnet for the load balancer and a private subnet for the model deployment. The security team is concerned that the load balancer is publicly accessible. The enterprise wants to maintain high availability and low latency. What should the architect do to meet the security requirements?

Question 22 (hard, multiple choice)

A healthcare company is using OCI Generative AI to analyze patient records and generate clinical summaries. The company must comply with HIPAA regulations, which require that all protected health information (PHI) be encrypted at rest and in transit, and that access be logged and audited. The current architecture uses an OCI Data Science model deployment with a public endpoint. The model is stored in an OCI Object Storage bucket that is publicly accessible for testing. The company is now moving to production. The compliance officer has flagged the following issues: (1) The model endpoint is publicly accessible. (2) The bucket containing the model is public. (3) No audit logs are enabled. The company wants to remediate these issues while maintaining the ability to invoke the model from on-premises applications via a secure connection. Which set of actions should the architect take?

Question 23 (hard, multiple choice)

You are deploying a generative AI solution on OCI for a healthcare client that requires strict data residency (data must remain in the EU) and low-latency inference. The solution uses a fine-tuned LLM (7B parameters) stored in Object Storage in the Frankfurt region. You have set up an OCI Data Science model deployment endpoint with GPU shape VM.GPU.A10.1, using a single replica. During load testing with 50 concurrent users, you observe high latency (average 8 seconds per request) and occasional 504 gateway timeouts. The model deployment logs show no errors, and the model loads successfully. You have confirmed that the Object Storage bucket is in the same region and that the network latency between the client and the endpoint is minimal (under 5 ms). Which action should you take to reduce latency and eliminate timeouts?

Question 24 (medium, multiple choice)

A company has fine-tuned a large language model using OCI Generative AI service. When attempting to deploy the model to a dedicated endpoint, the deployment fails with an error indicating insufficient capacity. Which action should be taken to resolve this issue?

Question 25 (easy, multiple choice)

A startup wants to minimize costs when using OCI Generative AI service for a chatbot application that experiences sporadic usage. Which deployment strategy is most cost-effective?

Question 26 (hard, multiple choice)

A global enterprise is deploying a generative AI application that requires high availability across multiple OCI regions. The application must automatically fail over to a secondary region if the primary region becomes unavailable. What is the recommended architecture to achieve this?

Question 27 (medium, multiple choice)

A data scientist is using the OCI Generative AI service to generate text completions. The API calls are returning HTTP 400 errors with the message 'Invalid model parameters'. What is the most likely cause?

Question 28 (easy, multiple choice)

A developer wants to deploy a custom generative AI model that was trained using OCI Data Science. Which service should they use to expose the model as an API endpoint?

Question 29 (hard, multiple choice)

A financial services company is concerned about data privacy when using OCI Generative AI service for processing sensitive customer data. They want to ensure that their data is not used to improve the model and is encrypted at rest and in transit. Which combination of OCI features should they implement?

Question 30 (medium, multiple choice)

A company is using OCI Generative AI service to generate product descriptions. They notice that the model sometimes generates biased content. Which approach should they take to mitigate bias while maintaining performance?

Question 31 (easy, multiple choice)

A user wants to access the OCI Generative AI service programmatically. Which credential method is recommended for use in a production application running on OCI Compute?

Question 32 (hard, multiple choice)

An organization is deploying a generative AI model that requires GPU acceleration for inference. They are using OCI Data Science Model Deployment. The model is expected to handle variable traffic, with occasional spikes. Which scaling option should they configure to ensure cost-efficiency and responsiveness?

Question 33 (medium, multi-select)

Which TWO factors should be considered when selecting a base model for fine-tuning on OCI Generative AI service?

Question 34 (easy, multi-select)

Which TWO methods can be used to invoke a generative AI model deployed on OCI?

Question 35 (hard, multi-select)

Which THREE steps are required to deploy a custom generative AI model using OCI Data Science Model Deployment?

Question 36 (hard, multiple choice)

Given the CLI output from `oci generative-ai model list`, what can be determined about the model 'my-fine-tuned-model'?

Exhibit

Refer to the exhibit.

```json
{
  "data": [
    {
      "id": "ocid1.generativeaimodel.oc1..aaaaaaEXAMPLE",
      "display-name": "my-fine-tuned-model",
      "model-type": "FINE_TUNED",
      "base-model-id": "ocid1.generativeaimodel.oc1..aaaaaaBASEMODEL",
      "time-created": "2024-05-01T12:00:00Z",
      "lifecycle-state": "ACTIVE"
    }
  ]
}
```

Question 37 (medium, multiple choice)

An administrator created the IAM policies shown in the exhibit. A member of the GenerativeAIAdmins group reports they cannot invoke the model endpoint. Which permission is missing?

Exhibit

Refer to the exhibit.

```
Allow group GenerativeAIAdmins to manage generative-ai-model in compartment MyCompartment
Allow group GenerativeAIAdmins to manage generative-ai-endpoint in compartment MyCompartment
```

Question 38 (easy, multiple choice)

A developer receives the error shown in the exhibit when trying to send a request to a model endpoint. What is the most likely reason?

Exhibit

Refer to the exhibit.

```json
{
  "code": "IncorrectState",
  "message": "The model endpoint is in DELETED state and cannot be used."
}
```

Question 39 (medium, multiple choice)

A company is deploying a fine-tuned Cohere model on OCI Generative AI service for real-time inference. They need to ensure low latency even during demand spikes. Which configuration should they prioritize?

Question 40 (easy, multiple choice)

A data scientist needs to generate vector embeddings for a large corpus of text documents to use in a semantic search application. Which OCI service is best suited for this task?
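
For context on what happens after embeddings are generated: semantic search typically ranks documents by the cosine similarity between a query vector and each document vector. The sketch below shows that ranking step on toy vectors. The vectors and the `rank_documents` helper are illustrative assumptions; real embeddings would come from an embedding model and have hundreds of dimensions or more.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

At production scale the linear scan in `rank_documents` is usually replaced by a vector index, but the similarity measure is the same.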

Question 41 (hard, multiple choice)

An organization is fine-tuning a large language model on OCI Data Science. They must ensure that the training data remains within a specific geographic region and is encrypted at rest. Which combination of resources should they use?

Question 42 (medium, multiple choice)

A company has deployed a generative AI model endpoint on OCI. They want to monitor token usage and latency for cost optimization. Which OCI service should they use to collect these metrics?

Question 43 (easy, multiple choice)

A developer wants to integrate OCI Generative AI into a web application. Which API authentication method is recommended for programmatic access?

Question 44 (hard, multiple choice)

A data science team is using OCI Data Science to fine-tune a model. They notice that training jobs are failing due to out-of-memory errors on the notebook session. What should they do to resolve this?

Question 45 (medium, multiple choice)

A company wants to use OCI Generative AI to summarize customer support tickets. They need to ensure that the model does not output any sensitive information. Which technique should they implement?

Question 46 (easy, multiple choice)

An administrator needs to grant a data science team access to create and manage generative AI model endpoints in a specific compartment. Which policy should they create?

Question 47 (hard, multiple choice)

A company has deployed a generative AI endpoint using a custom fine-tuned model. They observe that the endpoint is returning 429 (Too Many Requests) errors during business hours. They need to handle this without losing requests. What should they implement?
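
A common client-side pattern for rate-limit responses is to retry with exponential backoff and jitter, so that rejected requests are delayed rather than dropped. The sketch below is a generic illustration under stated assumptions: `TooManyRequests` is a hypothetical stand-in for however the client surfaces an HTTP 429, and `send` is any zero-argument callable that performs the request.

```python
import random
import time


class TooManyRequests(Exception):
    """Hypothetical stand-in for an HTTP 429 response from an endpoint."""


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call send(), retrying on TooManyRequests with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return send()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            # The delay ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: sleep a random time in [0, ceiling] so that many
            # clients do not all retry at the same instant.
            sleep(random.uniform(0, ceiling))
```

Retries alone only smooth short bursts. If the request rate stays above the endpoint's capacity, a queue in front of the endpoint or additional capacity is still needed.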

Question 48 (medium, multi-select)

Which TWO of the following are valid ways to consume OCI Generative AI models?

Question 49 (hard, multi-select)

Which THREE of the following are best practices when deploying a generative AI model on OCI?

Question 50 (easy, multi-select)

Which TWO of the following are sources of training data for fine-tuning a model in OCI Generative AI?

Question 51 (easy, multiple choice)

A company has fine-tuned a custom Llama 3 model using OCI Data Science for a chatbot. They now need a production-grade inference endpoint with auto-scaling. Which OCI service should they use?

Question 52 (medium, multiple choice)

An organization is deploying a large language model on OCI using a dedicated AI cluster. They need to minimize inference latency. Which configuration step is most critical?

Question 53 (hard, multiple choice)

A developer is deploying a fine-tuned model using OCI Generative AI service. They want to use a custom container image for inference. Which statement is true?

Question 54 (easy, multiple choice)

A company needs to ensure that only authorized users can invoke an endpoint for a generative AI model. Which OCI feature should be used to control access?

Question 55 (medium, multiple choice)

A team is fine-tuning a foundation model on a large dataset stored in OCI Object Storage. They want to minimize data transfer costs. What is the best practice for locating the storage?

Question 56 (hard, multiple choice)

During deployment of a generative AI model, the inference endpoint returns high latency and timeouts. The model is deployed on a dedicated AI cluster with multiple nodes. What is the most likely cause?

Question 57 (easy, multiple choice)

A company wants to use OCI Generative AI service to generate marketing copy that adheres to brand guidelines. Which technique should they use?

Question 58 (medium, multiple choice)

A team has deployed a generative AI model and needs to monitor inference performance and set up alerts for increased error rates. Which OCI service should they integrate with?

Question 59 (hard, multiple choice)

An organization is deploying multiple generative AI models on a shared dedicated AI cluster. They need to isolate resource usage for each model to avoid interference. Which strategy is recommended?

Question 60 (easy, multi-select)

A data scientist is preparing to fine-tune a foundation model on OCI. Which two actions should they take to optimize costs? (Select TWO.)

Question 61 (medium, multi-select)

A DevOps engineer is setting up monitoring and logging for a generative AI inference endpoint. Which three resources should they enable? (Select THREE.)

Question 62 (hard, multi-select)

An enterprise is deploying a generative AI model that must comply with data residency regulations. Which two configurations should they implement? (Select TWO.)

Question 63 (easy, multiple choice)

Refer to the exhibit. A user receives this error when using the OCI CLI to chat with a model. What is the most likely cause?

Exhibit (Network Topology figure not included)

Question 64 (medium, multiple choice)

Refer to the exhibit. A team created this dedicated AI cluster. However, when they try to create a model deployment, the deployment fails with an error indicating insufficient public IPs. What change to the cluster configuration should they make?

Exhibit

{
  "compartmentId": "ocid1.compartment.oc1..xxxxx",
  "displayName": "llama-cluster",
  "aiClusterShape": "VM.GPU.A10.1",
  "nodeCount": 4,
  "networkConfiguration": {
    "subnetId": "ocid1.subnet.oc1.iad.xxxxx",
    "assignPublicIp": false
  }
}

Question 65 (hard, multiple choice)

Refer to the exhibit. A data scientist received this output after submitting a fine-tuning job. What is the most effective change to resolve the out-of-memory error?

Exhibit

{
  "data": {
    "id": "ocid1.finetuningjob.oc1.iad.xxxxx",
    "lifecycle-state": "FAILED",
    "lifecycle-details": "Job terminated due to out-of-memory error on worker node. Consider increasing the cluster shape or reducing the model size."
  }
}

Question 66 (easy, multiple choice)

A data scientist wants to deploy a fine-tuned LLM on OCI for inference with low latency. Which OCI service should they use?

Question 67 (medium, multiple choice)

A company notices that some inference requests to their deployed model on OCI Generative AI take longer than acceptable. They want to reduce per-request latency. What should they do?

Question 68 (hard, multiple choice)

A financial institution needs to deploy a fine-tuned model on OCI with strict data residency requirements. They must ensure that data used for inference never leaves a specific OCI region. The model is stored in Object Storage in the same region. What additional configuration is needed?

Question 69 (easy, multiple choice)

A developer wants to integrate generative AI capabilities into an application using REST API calls. Which OCI Generative AI service endpoint should they use for text generation?

Question 70 (medium, multiple choice)

A company has deployed a model on a Dedicated AI Cluster and needs to monitor inference performance metrics such as request latency, throughput, and error rates. Which OCI service provides built-in monitoring dashboards for these metrics?

Question 71 (hard, multiple choice)

An AI team is fine-tuning a large language model using OCI Data Science and plans to deploy the fine-tuned model using the Generative AI service's custom model deployment. What is the required format for the model artifacts?

Question 72 (hard, multiple choice)

A company has multiple teams sharing an OCI Generative AI Dedicated AI Cluster. They need to ensure that each team can only access their own fine-tuned models and cannot see or invoke models from other teams. What is the best approach?

Question 73 (easy, multiple choice)

What is the primary benefit of using a Dedicated AI Cluster over On-Demand serving for deploying generative AI models on OCI?

Question 74 (medium, multiple choice)

A developer is getting a 401 Unauthorized error when calling the OCI Generative AI inference API. What is the most likely cause?

Question 75 (medium, multi-select)

Which two actions are required when deploying a custom fine-tuned model using the OCI Generative AI service? (Choose two.)

Question 76 (hard, multi-select)

A company is designing a generative AI solution on OCI that must comply with data privacy regulations. Which three best practices should they follow? (Choose three.)

Question 77 (easy, multi-select)

Which two metrics would you monitor to ensure a generative AI deployment on OCI is operating efficiently? (Choose two.)

Question 78 (medium, multiple choice)

Refer to the exhibit. An administrator receives the error shown when attempting to deploy a custom model. What is the most likely cause?

Exhibit (Network Topology figure not included)

Question 79 (hard, multiple choice)

Refer to the exhibit. The dashboard shows latency grouped by modelId, but some points are missing for certain modelIds. Which of the following is the most likely reason?

Exhibit

GET /20180401/metrics?compartmentId=ocid1.compartment.oc1..aaaa...&metricName=InferenceLatency&aggregationInterval=1m&groupBy=modelId

Question 80 (easy, multiple choice)

Refer to the exhibit. Users in the group cannot create a new custom model deployment on a Dedicated AI Cluster. What is the most likely missing permission?

Exhibit

Allow group GenAI-Admin to manage dedicated-ai-clusters in compartment FinComp
Allow group GenAI-Admin to manage custom-models in compartment FinComp
Allow group GenAI-Admin to read object-storage in compartment FinComp

Question 81 (easy, multiple choice)

A retail company uses OCI Generative AI to generate product descriptions. They observe the model occasionally produces biased content. Which technique should be applied to reduce bias in model outputs?

Question 82 (easy, multiple choice)

A developer wants to invoke an OCI Generative AI model from an application running on a compute instance in OCI. The instance is in a private subnet. What is the most secure method to access the model endpoint?

Question 83 (easy, multiple choice)

A data scientist needs to fine-tune a model on OCI Generative AI. Which of the following is a required parameter in the fine-tuning request?

Question 84 (medium, multiple choice)

A company is deploying a chatbot powered by OCI Generative AI. They want to inject the conversation history into the model prompt to maintain context. However, they notice that after a long conversation, the model starts to ignore earlier messages. What is the most likely cause?
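
To make the mechanics of injected history concrete: a model accepts a bounded number of tokens per request, so an application that prepends the whole conversation must eventually drop or condense older turns. The sketch below keeps only the most recent messages that fit a budget. It is an illustration under stated assumptions: the message shape (`{"role", "text"}`) is hypothetical, and counting whitespace-separated words is a crude stand-in for the model's real tokenizer.

```python
def trim_history(messages, max_tokens,
                 count_tokens=lambda text: len(text.split())):
    """Keep the newest messages whose combined token count fits max_tokens."""
    kept = []
    used = 0
    # Walk from newest to oldest so recent turns are preferred.
    for message in reversed(messages):
        cost = count_tokens(message["text"])
        if used + cost > max_tokens:
            break  # this turn and everything older is dropped
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Dropping old turns is the simplest policy. Summarizing the dropped turns into a short note that stays in the prompt is a common alternative when early context matters.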

Question 85 (medium, multiple choice)

An organization needs to ensure that all inference requests to OCI Generative AI are logged for compliance. Which OCI feature should be enabled?

Question 86 (medium, multiple choice)

A team wants to use OCI Generative AI to generate synthetic data for training a model. They are concerned about the cost of API calls. Which pricing model would be most cost-effective for high-volume batch processing?

Question 87 (hard, multiple choice)

A company deploys a fine-tuned model on an OCI Generative AI dedicated AI cluster. After deployment, they observe high latency during peak hours. The cluster has only one replica. Which action would most effectively reduce latency without increasing cost unnecessarily?

Question 88 (hard, multiple choice)

An enterprise with strict data residency requirements wants to use OCI Generative AI. They must ensure that no training data or inference data leaves a specific OCI region. Which configuration option should they choose?

Question 89 (hard, multiple choice)

A machine learning engineer evaluates OCI Generative AI for a real-time content generation application. They need to meet an SLA of 99.9% availability. Which deployment architecture satisfies the requirement with the lowest cost?

Question 90 (medium, multi-select)

A company is designing a generative AI application using OCI Generative AI. Which two factors should be considered when selecting the appropriate model? (Choose two.)

Question 91 (hard, multi-select)

An OCI administrator is configuring access control for OCI Generative AI. Which three IAM components are required to allow a group of data scientists to call the GenerateText API? (Choose three.)

Question 92 (easy, multi-select)

A developer is troubleshooting an OCI Generative AI inference request that returns a 400 Bad Request error. Which three common causes could result in this error? (Choose three.)

Question 93 (easy, multiple choice)

A user sends an inference request with the JSON parameters shown. They notice the model is returning very short responses. What is the most likely cause?

Exhibit

Refer to the exhibit.

{
  "compartmentId": "ocid1.compartment.oc1..aaaaaaaaxxx",
  "modelId": "ocid1.generativeaimodel.oc1.iad.xxxx",
  "inferenceParameters": {
    "temperature": 0.5,
    "maxTokens": 2000,
    "topP": 0.9
  }
}

Question 94 (medium, multiple choice)

A user runs the CLI command shown but receives only one model in the list, even though they know there are more models available in the compartment. What is the most likely reason?

Exhibit

Refer to the exhibit.

$ oci generative-ai model list --compartment-id ocid1.compartment.oc1..aaaa
{
  "data": [
    { "id": "ocid1.generativeaimodel.oc1.iad.xxxx", "name": "cohere.command-light", "lifecycle-state": "ACTIVE" }
  ]
}

Question 95 (hard, multiple choice)

A data scientist in group DataScientists uses the OCI Generative AI SDK to start a fine-tuning job in compartment AIResources. They receive the error shown. What is the most likely cause?

Exhibit

Refer to the exhibit.

{
  "status": 403,
  "code": "NotAuthorizedOrNotFound",
  "message": "Authorization failed or requested resource not found"
}

The following IAM policy is defined in the root compartment:
Allow group DataScientists to manage ai-services-generative-ai-family in compartment AIResources

Question 96 (easy, multiple choice)

A company deploys a fine-tuned Llama 2 model using OCI Generative AI service. They want to ensure low-latency inference for a real-time chat application. Which deployment option should they use?

Question 97 (easy, multiple choice)

A data scientist fine-tunes a model using OCI Data Science and wants to deploy it as a managed endpoint in OCI Generative AI. What must they do first?

Question 98 (medium, multiple choice)

A team deploys a generative AI model endpoint and notices intermittent 429 Too Many Requests errors. The endpoint is configured with auto-scaling using a dedicated AI cluster. What is the most likely cause?
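
Whatever the root cause turns out to be, a client that receives 429 responses usually retries with exponential backoff and jitter rather than resending immediately. The following is a minimal sketch of that pattern, not an OCI SDK feature: `TooManyRequests`, `call_with_backoff`, and `flaky_send` are hypothetical names, and the exception stands in for however the real client surfaces an HTTP 429.

```python
import random
import time


class TooManyRequests(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the endpoint."""


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call `send()`, retrying on TooManyRequests with jittered backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return send()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise  # retry budget exhausted, surface the error
            # The delay cap doubles each attempt; "full jitter" picks a
            # random wait below the cap so clients do not retry in lockstep.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))


# Demo with a fake request that is throttled twice before succeeding.
attempts = {"count": 0}


def flaky_send():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TooManyRequests()
    return "ok"


result = call_with_backoff(flaky_send, sleep=lambda seconds: None)
print(result, "after", attempts["count"], "attempts")
```

The `sleep` parameter is injected only so the demo runs instantly; a real caller would leave the default in place.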

Question 99 (medium, multiple choice)

An organization wants to use OCI Generative AI to build a summarization tool but must ensure that all inference requests are logged for audit purposes. Which approach should they take?

Question 100 (hard, multiple choice)

A company deploys a large language model on a dedicated AI cluster with 4 nodes. The model requires 128 GB of memory per instance, but the nodes have only 64 GB each. During inference, the nodes experience out-of-memory errors. What is the best solution?
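
The memory figures in this scenario can be checked with a little arithmetic. The sketch below assumes the model can be split evenly across nodes and ignores runtime overhead such as activations and caches, neither of which the question states; `min_nodes_per_instance` is an illustrative helper, not an OCI API.

```python
import math


def min_nodes_per_instance(model_gb: float, node_gb: float) -> int:
    """Smallest number of nodes whose combined memory holds one model copy.

    Assumes an even split and no per-node overhead (a simplification).
    """
    return math.ceil(model_gb / node_gb)


nodes_needed = min_nodes_per_instance(128, 64)
instances_on_cluster = 4 // nodes_needed  # the scenario's cluster has 4 nodes
print(nodes_needed, "nodes per model copy;",
      instances_on_cluster, "copies fit on 4 nodes")
```

With the numbers from the question, a single 64 GB node cannot hold a 128 GB model copy, while two nodes together can under the stated simplifications.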

Question 101 (hard, multiple choice)

A team uses OCI Generative AI’s fine-tuning capability to adapt a base model. After fine-tuning, they evaluate the model but see degraded performance on certain edge cases. What is the most likely cause?

Question 102 (hard, multiple choice)

A financial company deploys a generative AI model for document analysis. They need to ensure that the model does not expose sensitive information in its responses. Which OCI service should they use to implement content filtering?

Question 103 (easy, multiple choice)

A user wants to invoke an OCI Generative AI endpoint from a cloud function. What is the required authentication method?

Question 104 (medium, multiple choice)

An administrator notices that a dedicated AI cluster is not scaling down after a period of low traffic. What could be the cause?

Question 105 (easy, multi select)

Which TWO are best practices for securing a generative AI endpoint on OCI? (Select TWO)

Question 106 (medium, multi select)

Which TWO actions should be taken to monitor model drift in a deployed generative AI model? (Select TWO)

Question 107 (hard, multi select)

Which THREE components are essential for a production-grade generative AI deployment on OCI? (Select THREE)

Question 108 (easy, multiple choice)

A company has deployed a fine-tuned GPT model on OCI Generative AI using a dedicated AI cluster with 2 nodes. The endpoint is used by an internal application that generates product descriptions. Recently, the application started receiving timeouts and slow responses. The monitoring dashboard shows that the cluster's CPU utilization is consistently above 90%, and the request queue is growing. The team has verified that the model and code have not changed. The application traffic has increased by 20% over the past month. What should the team do to resolve the issue?

Question 109 (medium, multiple choice)

A data science team at a healthcare company has fine-tuned a Llama 2 model using OCI Data Science and registered it in the Model Catalog. They want to deploy it as a managed endpoint using OCI Generative AI. The model requires 64 GB of GPU memory. The team has created a dedicated AI cluster with a single node shape that has 48 GB GPU memory. When they attempt to deploy the model, the deployment fails with an error indicating insufficient resources. The team has verified that the model artifact is correct and that the compartment policies allow deployment. What should the team do to successfully deploy the model?

Question 110 (hard, multiple choice)

A multinational corporation uses OCI Generative AI to power a customer support chatbot. The chatbot uses a fine-tuned model deployed on a dedicated AI cluster in the us-ashburn-1 region. The application is used globally, and users in Europe are experiencing high latency (over 2 seconds) compared to users in North America (under 500 ms). The company has a requirement to keep all data within the US due to compliance, so they cannot deploy in Europe. The latency is not due to network bandwidth but due to the inference time. The monitoring shows that the cluster is at 80% utilization during peak hours. The team wants to reduce the latency for European users without violating data residency. What is the best course of action?

Question 111 (easy, multiple choice)

Your team has deployed a fine-tuned GPT-2 model on OCI Model Deployment for a simple text generation API. The model performs text completion for short prompts (e.g., 50 tokens). The endpoint is working but response times are over 10 seconds for these short prompts. The model size is approximately 500MB and you used a VM.Standard.E3.Flex shape (2 OCPU, 16GB RAM). The deployment is in a single replica with no autoscaling. You have verified that the network latency is minimal (<5ms). The model was trained in OCI Data Science using a GPU shape, but during deployment you selected a CPU shape to reduce cost. The model is a transformer-based neural network. You've also confirmed that the deployment is healthy and there are no errors in the logs. The memory usage is within limits. What is the most likely cause of the high latency?

Question 112 (medium, multiple choice)

A data scientist deployed a fine-tuned Llama 2 7B model on OCI Model Deployment with a single VM.GPU.A10.1 shape. Users report average latency of 3 seconds per request, which is too high for the intended real-time application. The model is used for short text generation (max 128 tokens). The data scientist wants to reduce per-request latency without significant accuracy loss. Which action would be most effective?

Question 113 (hard, multiple choice)

A company wants to deploy a custom generative AI model for generating synthetic data for training other models. The model requires approximately 20GB of memory and must be accessible via a REST API with authentication. Additionally, the team needs to monitor for data drift over time. Which combination of OCI services best meets these requirements with minimal operational overhead?

Question 114 (easy, multi select)

A company is deploying a generative AI model for a real-time inference API. To ensure high availability and cost efficiency under variable load, which two configurations should they implement? (Choose two.)

Question 115 (hard, multi select)

A company is deploying a generative AI model on OCI for an internal application that must comply with strict security policies. The model will be accessed by a limited group of users. Which three actions should the administrator take to ensure security? (Choose three.)

Question 116 (easy, multiple choice)

Your organization uses OCI Data Science to train a generative AI model for code generation. After training, you want to deploy it as a REST API. You create a model deployment using the OCI console, but after 30 minutes the deployment status is still 'Creating'. You check the logs and see the message: 'Insufficient capacity for shape VM.GPU.A10.1 in availability domain AD-1'. The deployment is configured with a single replica. You have verified your tenancy has sufficient service limits for GPU instances. What should you do to resolve this issue quickly?

Question 117 (medium, multiple choice)

You manage a generative AI model deployed on OCI Model Deployment that serves a chatbot application. The model is a 13B parameter LLM on a VM.GPU.A100.1 shape. Recently, you rolled out a new version of the model that is supposed to improve response quality. However, after the update, the application starts returning HTTP 500 errors and memory usage spikes. You need to update to the new version without causing downtime. The current deployment has 2 replicas with autoscaling enabled. Which strategy should you use to safely deploy the new model version?

Question 118 (medium, multiple choice)

Your team is deploying a generative AI model for a clinical decision support system. The model must meet HIPAA compliance requirements. You have trained a model using OCI Data Science and now need to deploy it so that patient data is protected. The application requires real-time inference. Which set of actions should you take to ensure compliance while maintaining low latency?

Question 119 (medium, multiple choice)

You deployed a generative AI model on OCI Model Deployment with autoscaling configured based on average CPU utilization. The model is a large language model that heavily utilizes the GPU. During peak hours, the scaling is too slow to keep up with demand, resulting in high latency for users. You want to improve the responsiveness of autoscaling. Which change should you make?

Question 120 (hard, multiple choice)

Your company uses OCI Data Science for model development and deployment. You have a generative AI model that requires dynamic batching for efficient inference. You deployed the model using the OCI Model Deployment service with a custom inference script in a Docker container. However, you notice that the batch size is fixed at 1, leading to low throughput. The model can process multiple requests together efficiently. You want to implement dynamic batching to increase throughput without significantly increasing latency for individual requests. What is the best approach?
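
To make the term concrete, dynamic batching means holding incoming requests briefly and running them through the model together, bounded by a maximum batch size and a maximum wait. The sketch below shows only that core loop using the Python standard library. It is not how any particular inference server implements it: `collect_batch`, `fake_model`, and the parameter values are hypothetical.

```python
import queue
import time


def collect_batch(requests, max_batch_size=8, max_wait_s=0.05):
    """Block for one request, then gather more until full or out of time."""
    batch = [requests.get()]  # always wait for at least one request
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # wait budget spent: run with what we have
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break  # no more requests arrived before the deadline
    return batch


def fake_model(prompts):
    """Hypothetical stand-in for one batched forward pass."""
    return [prompt.upper() for prompt in prompts]


# Demo: five queued requests with a batch cap of four.
pending = queue.Queue()
for prompt in ["a", "b", "c", "d", "e"]:
    pending.put(prompt)

first = collect_batch(pending, max_batch_size=4)
second = collect_batch(pending, max_batch_size=4)
print(fake_model(first), fake_model(second))
```

The `max_wait_s` bound is what keeps individual request latency from growing much: a lone request waits at most that long before it is processed on its own.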

Question 121 (hard, multiple choice)

A generative AI model deployed on OCI Model Deployment is experiencing high tail latency. The model is a large language model that processes variable-length input sequences. Profiling shows that inference time varies significantly: short inputs (100 tokens) take 100ms, while long inputs (2000 tokens) take 2 seconds. The application requires consistent low latency (<500ms) for most requests. You want to reduce the variance in inference time without major changes to the model architecture. Which technique should you apply?

Question 122 (hard, multiple choice)

Your organization has deployed a generative AI model for a multilingual translation service on OCI Model Deployment. The model is a 13B parameter transformer hosted on a single VM.GPU.A100.1 shape with 2 replicas. Recently, the service experiences intermittent timeouts when a burst of requests arrives. You have enabled autoscaling based on CPU utilization, but the scaling is too slow. After investigation, you find that the model inference time is highly variable due to different sequence lengths. You need to ensure the service can handle sudden spikes without timeouts. Which solution should you implement?