1Z0-1127 Practice Test 28 — 15 Questions

Question 1

You are a machine learning engineer at a large e-commerce company. You have been tasked with deploying a large language model to power a customer service chatbot that handles product returns and refunds. The model will answer customer queries based on a knowledge base of return policies and FAQs. The company has strict requirements: (1) responses must be factually accurate and grounded in the knowledge base, (2) the system must be cost-effective, and (3) latency should be under 2 seconds per response. You decide to use a pre-trained LLM from OCI Data Science and implement retrieval-augmented generation (RAG). You have two options for the retriever: a dense embedding-based retriever (e.g., using OCI AI Language embeddings) or a sparse keyword-based retriever (e.g., BM25). You also need to decide on the generation model size: a 7B parameter model or a 70B parameter model. You run a pilot test: with the dense retriever + 7B model, average latency is 1.8 seconds and accuracy is 85%. With the sparse retriever + 7B model, latency is 1.2 seconds but accuracy drops to 75%. With the 70B model (any retriever), latency exceeds 5 seconds. Which combination should you choose to meet all requirements?

Accepted Answer

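Dense retriever + 7B model. Option D (dense retriever + 7B model) is correct because it is the only combination that meets all three requirements: the highest measured accuracy of the eligible options (85%, from grounding answers in semantically retrieved passages), latency under 2 seconds (1.8 seconds), and cost-effectiveness (a 7B model is cheaper to run than a 70B model). The dense retriever provides better semantic matching for nuanced return policy queries, while the 7B model keeps inference fast and affordable. Any 70B combination fails the latency requirement, and the sparse retriever gives up 10 points of accuracy for speed that is not needed.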

Answer

Sparse retriever + 70B model.

Answer

Dense retriever + 70B model.

Answer

Sparse retriever + 7B model.
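
The reasoning behind the accepted answer to Question 1 can be sketched as a simple filter-then-rank step. This is only an illustration: the `PilotResult` type and `choose` function are hypothetical names, and the 70B configurations are left out because the question gives no accuracy figure for them (only that latency exceeds 5 seconds).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PilotResult:
    retriever: str
    model: str
    latency_s: float
    accuracy: float


# Measured figures quoted in the question (7B configurations only).
PILOT = [
    PilotResult("dense", "7B", 1.8, 0.85),
    PilotResult("sparse", "7B", 1.2, 0.75),
]


def choose(results, max_latency_s=2.0):
    """Keep configurations under the latency limit, then pick the most accurate."""
    eligible = [r for r in results if r.latency_s < max_latency_s]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r.accuracy)


best = choose(PILOT)
print(best.retriever, best.model)
```

Latency is treated as a hard constraint and accuracy as the objective, which matches how the question states its requirements.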

Question 2

A financial services company deployed a fine-tuned model using OCI Generative AI Service to generate investment advice based on quarterly reports. The model was trained on 10,000 labeled examples and achieved high accuracy in testing. However, after three months in production, the model's outputs have become inconsistent and sometimes recommend investments based on outdated market conditions. The team has received multiple complaints from users about inaccurate advice. The model is deployed on a dedicated AI cluster with auto-scaling disabled. The OCI audit logs show no configuration changes. The team suspects data drift and wants to mitigate it without incurring high costs. They have a pipeline that can collect new labeled data monthly, but it takes two weeks to process. What should the team do?

Accepted Answer

Set up a monthly retraining schedule using the new labeled data as soon as it is available, and use a champion/challenger deployment to validate the new model before full rollout. Option A is correct because it addresses data drift directly: retraining on a regular cycle with fresh labeled data is the standard mitigation for model degradation over time. The champion/challenger pattern lets the team compare the updated model against the current production model before full rollout, so a regression in accuracy is caught before users see it. A monthly cadence keeps costs contained and fits the two-week data processing pipeline.

Answer

Decrease the temperature parameter to 0.1 to make outputs more deterministic.

Answer

Revert to the base model (Cohere Command) and use few-shot prompting with recent reports.

Answer

Enable auto-scaling on the dedicated AI cluster to handle increased load.
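
The champion/challenger check described in the accepted answer to Question 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the `min_gain` threshold, and the sample labels are all hypothetical, and a real evaluation would use a much larger holdout set of recent data.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def should_promote(champion_preds, challenger_preds, labels, min_gain=0.0):
    """Promote the challenger only if it beats the champion by more than min_gain."""
    gain = accuracy(challenger_preds, labels) - accuracy(champion_preds, labels)
    return gain > min_gain


# Hypothetical holdout set drawn from the most recent month of labeled data.
labels = ["buy", "hold", "sell", "hold"]
champion = ["buy", "buy", "sell", "sell"]     # current production model
challenger = ["buy", "hold", "sell", "sell"]  # newly retrained model

print(should_promote(champion, challenger, labels))
```

Evaluating both models on the same recent data is what makes the comparison meaningful when the underlying distribution has drifted.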

Question 3

Your team is deploying a generative AI model for a clinical decision support system. The model must meet HIPAA compliance requirements. You have trained a model using OCI Data Science and now need to deploy it so that patient data is protected. The application requires real-time inference. Which set of actions should you take to ensure compliance while maintaining low latency?

Accepted Answer

Deploy in a private VCN subnet, use a service gateway, store keys in OCI Vault, and enable OCI Logging and OCI Audit. Option D is correct because deploying the model in a private VCN subnet ensures the inference endpoint is not exposed to the internet, which supports the network isolation expected for HIPAA workloads. A service gateway provides private connectivity to OCI services without traversing the internet, and storing encryption keys in OCI Vault gives the customer control over the keys that protect data at rest. Enabling OCI Logging and OCI Audit provides the audit trail needed for compliance, and keeping traffic on the private subnet and service gateway avoids internet hops, which keeps latency low.

Answer

Use OCI Functions with API Gateway and allow anonymous access

Answer

Deploy in a public subnet with HTTPS and enable OCI Audit

Answer

Use OCI Data Flow for batch inference and store results in Object Storage with SSE

Question 4

Which TWO factors should be considered when selecting a base model for fine-tuning on OCI Generative AI service?

Accepted Answer

The model's size and number of parameters; the model's license and terms of use. When selecting a base model for fine-tuning on OCI Generative AI service, the model's size and number of parameters (B) directly affect computational cost, training time, and the model's capacity to learn from your dataset. The model's license and terms of use (C) are critical because the rights to commercial use, redistribution, and fine-tuning vary per model (e.g., Llama 2 vs. GPT-based models), and violating them can lead to legal or compliance issues.

Answer

The model's training dataset size

Answer

The model's training framework (PyTorch vs TensorFlow)

Answer

The model's built-in features like content filtering

Question 5

An organization is fine-tuning a large language model on OCI Data Science. They must ensure that the training data remains within a specific geographic region and is encrypted at rest. Which combination of resources should they use?

Accepted Answer

OCI Object Storage bucket with a bucket policy and default encryption, created in the required region. Option A is correct because OCI Object Storage encrypts data at rest by default using AES-256, and a bucket is a regional resource, so creating it in the required region keeps the training data there. Policies that restrict access and cross-region replication prevent the data from being copied elsewhere. This combination directly meets the requirements of regional data residency and encryption at rest for training data used in OCI Data Science.

Answer

OCI Database with Transparent Data Encryption, storing the training data in tables.

Answer

OCI File Storage with export options and encryption, mounted to the Data Science session.

Answer

OCI Block Volume with encryption, attached to the Data Science notebook session.

Question 6

A developer wants to integrate OCI Generative AI into a web application. Which API authentication method is recommended for programmatic access?

Accepted Answer

API key-based signing. Option B is correct because OCI APIs require request signing with an API signing key (an RSA key pair) for programmatic access. The developer generates a key pair, uploads the public key to their OCI user profile, and then uses the private key to sign each HTTP request. OCI request signatures (Signature version 1) follow the draft HTTP Signatures specification and use the RSA-SHA256 algorithm, not an HMAC shared secret. Because only the public key is held by OCI, the request is authenticated without transmitting any secret over the wire.

Answer

Pre-authenticated request

Answer

OAuth 2.0 client credentials

Answer

Username and password in the header
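
To make the request-signing answer to Question 6 more concrete, the sketch below builds the string that gets signed, in the style of the draft HTTP Signatures scheme. It is an illustration only: the host, path, and date values are hypothetical, the final private-key signing step is omitted because it needs a cryptography library, and in practice the official OCI SDKs do all of this for you.

```python
import base64
import hashlib


def body_digest(body: bytes) -> str:
    """Base64-encoded SHA-256 of the request body (sent as x-content-sha256)."""
    return base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


def signing_string(method, path, headers, signed_headers):
    """Join 'name: value' lines, in order, for the headers covered by the signature.

    `headers` is assumed to use lowercase header names as keys.
    """
    lines = []
    for name in signed_headers:
        if name == "(request-target)":
            lines.append(f"(request-target): {method.lower()} {path}")
        else:
            lines.append(f"{name.lower()}: {headers[name.lower()]}")
    return "\n".join(lines)


# Hypothetical request values, for illustration only.
example_headers = {
    "host": "inference.example.invalid",
    "date": "Thu, 05 Jan 2023 21:31:40 GMT",
}
to_sign = signing_string(
    "GET", "/example/path", example_headers, ["(request-target)", "host", "date"]
)
print(to_sign)
```

The resulting string would then be signed with the private key, and the signature, key identifier, and list of signed headers placed in the `Authorization` header.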

Question 7

A developer is using OCI Generative AI Service to generate code snippets. They want to ensure the output is as deterministic as possible for testing. Which combination of parameters should they use?

Accepted Answer

Temperature = 0, Top-p = 1. Setting Temperature=0 makes decoding greedy: the model always selects the highest-probability token, so no sampling randomness remains. Top-p=1 leaves the candidate pool untruncated, so it adds no further filtering on top of the greedy choice. Together these settings make outputs as repeatable as possible for testing.

Answer

Temperature = 0.5, Top-p = 0.5

Answer

Temperature = 0, Top-p = 0

Answer

Temperature = 1, Top-p = 1
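
The effect of the two parameters in Question 7 can be illustrated with a generic decoding sketch. This is not the service's internal implementation: `sample_token` is a hypothetical helper, and treating temperature 0 as plain greedy selection is an assumption made for the example.

```python
import math
import random


def sample_token(logits, temperature, top_p, rng=random):
    """Return the index of the next token under temperature and top-p settings."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Temperature-scaled softmax (shifted by the max for numerical stability).
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Nucleus (top-p) filtering: keep the most likely tokens until their
    # cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


# With temperature 0 the same token is chosen on every call.
print({sample_token([1.0, 3.0, 2.0], temperature=0, top_p=1.0) for _ in range(10)})
```

With a nonzero temperature the last line could print more than one index, which is the variation the accepted answer is designed to remove.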

Question 8

A data science team is using OCI Data Science to fine-tune a model. They notice that training jobs are failing due to out-of-memory errors on the notebook session. What should they do to resolve this?

Accepted Answer

Switch to a larger notebook session shape. Out-of-memory errors during training on a notebook session indicate that the current shape's memory capacity is insufficient for the model or data being processed. Switching to a larger notebook session shape directly increases the available RAM and compute resources, resolving the memory constraint without altering the training logic or infrastructure type.

Answer

Enable autoscaling on the notebook session.

Answer

Use OCI Data Flow instead.

Answer

Reduce the batch size in the training script.

Question 9

A developer is troubleshooting an OCI Generative AI inference request that returns a 400 Bad Request error. Which three common causes could result in this error? (Choose three.)

Accepted Answer

Incorrect endpoint URL; missing required parameters; exceeding the model's maximum token limit. A 400 Bad Request error indicates that the server cannot process the request because of a client-side problem. An incorrect endpoint URL (A) is a common cause: the request is sent to the wrong OCI Generative AI service endpoint (e.g., using a chat endpoint for a text generation model), so the server rejects it as malformed. Missing required parameters (C) in the request body, such as 'compartmentId' or 'modelId', also trigger a 400 error because the API cannot validate or process the inference without them. Exceeding the model's maximum token limit (D) results in a 400 error because the input or output exceeds the model's configured token capacity, which the API validates before processing.

Answer

Invalid API key in the request header

Answer

Network connectivity issues
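
Two of the causes named in the accepted answer to Question 9 can be caught before the request is sent. The sketch below is a hypothetical client-side check, not part of any OCI SDK: the field names come from the explanation above, while the 4096 limit and the whitespace-based token estimate are assumptions chosen for illustration.

```python
# Field names taken from the explanation; the real API may require more.
REQUIRED_FIELDS = ("compartmentId", "modelId")


def validate_request(payload, prompt, max_tokens_limit=4096):
    """Return a list of problems that would likely cause a 400 response."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not payload.get(field):
            problems.append(f"missing required parameter: {field}")

    # Crude estimate: real tokenizers usually produce more tokens than words.
    approx_tokens = len(prompt.split())
    if approx_tokens > max_tokens_limit:
        problems.append(
            f"prompt has about {approx_tokens} tokens, limit is {max_tokens_limit}"
        )
    return problems


print(validate_request({"modelId": "example-model"}, "Summarize this text"))
```

Running such a check locally gives a clearer error message than a bare 400 response and avoids a wasted round trip.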

Question 10

An organization wants to use OCI Generative AI for real-time document translation. They need high availability across regions. Which deployment option meets this requirement?

Accepted Answer

Multiple dedicated AI clusters in different regions with a load balancer. Option B is correct because deploying multiple dedicated AI clusters across different regions with a load balancer ensures high availability by distributing traffic and providing failover if one region becomes unavailable. OCI Generative AI dedicated AI clusters are provisioned per region, and a load balancer can route requests to healthy clusters, meeting the requirement for real-time document translation with cross-region redundancy.

Answer

Single dedicated AI cluster in one region

Answer

Single serverless endpoint

Answer

Multiple serverless endpoints in different regions
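
The failover behavior that the accepted answer to Question 10 relies on can be sketched in a few lines. This is a simplified illustration: `call_with_failover` and the `send` callable are hypothetical, the region names are placeholders, and a real load balancer would also run health checks rather than waiting for a request to fail.

```python
from typing import Callable, Sequence


def call_with_failover(endpoints: Sequence[str], send: Callable[[str], str]) -> str:
    """Try each regional endpoint in order and return the first successful response."""
    last_error = None
    for endpoint in endpoints:
        try:
            return send(endpoint)
        except ConnectionError as exc:
            # Remember the failure and fall through to the next region.
            last_error = exc
    raise RuntimeError("all regional endpoints failed") from last_error


def fake_send(endpoint: str) -> str:
    """Stand-in for a real inference call; the first region is treated as down."""
    if endpoint == "region-a":
        raise ConnectionError("region-a unavailable")
    return f"translated via {endpoint}"


print(call_with_failover(["region-a", "region-b"], fake_send))
```

A single-region deployment has no second entry in the list to fall back on, which is why it cannot meet the cross-region availability requirement.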

Question 11

Which TWO of the following are valid ways to reduce latency when using OCI Generative AI Service?

Accepted Answer

Use a dedicated AI cluster. A dedicated AI cluster provides isolated compute resources (GPU nodes) for inference, eliminating resource contention from other tenants or workloads. This ensures consistent low-latency responses because the model is always warm and available without queueing delays, which is critical for real-time applications.

Answer

Deploy the model in a different region

Answer

Use a larger model

Answer

Batch multiple requests

Question 12

A financial company deploys a generative AI model for document analysis. They need to ensure that the model does not expose sensitive information in its responses. Which OCI service should they use to implement content filtering?

Accepted Answer

OCI AI Content Moderation. OCI AI Content Moderation is the correct service because it provides pre-trained models and APIs specifically designed to detect and filter sensitive content such as personally identifiable information (PII), profanity, and other unsafe text in generative AI outputs. This allows the financial company to enforce content safety policies on document analysis responses, preventing exposure of sensitive information.

Answer

OCI Data Safe

Answer

OCI Vault

Answer

OCI WAF

Question 13

A company has fine-tuned a custom Llama 3 model using OCI Data Science for a chatbot. They now need a production-grade inference endpoint with auto-scaling. Which OCI service should they use?

Accepted Answer

OCI Generative AI Service. Option C is correct because OCI Generative AI Service provides a fully managed, production-grade inference endpoint with built-in auto-scaling for custom models like fine-tuned Llama 3. It abstracts infrastructure management, offers serverless deployment, and integrates with OCI Data Science for model import, making it the ideal choice for a chatbot requiring scalable inference.

Answer

OCI Functions

Answer

OCI Data Science Model Deployment

Answer

OCI Kubernetes Engine (OKE)

Question 14

A company needs to ensure that only authorized users can invoke an endpoint for a generative AI model. Which OCI feature should be used to control access?

Accepted Answer

OCI Identity and Access Management (IAM) policies. IAM policies are the correct choice because they define who (users, groups, or service principals) can act on which OCI resources, including generative AI model endpoints. Policies are written as statements that combine a verb and a resource type (e.g., 'allow group A to use generative-ai-family in compartment X') and are enforced at the API level, ensuring that only authorized principals can call the model's inference endpoint.

Answer

Network security groups (NSGs)

Answer

VCN flow logs

Answer

OCI Web Application Firewall (WAF)

Question 15

A data scientist is preparing to fine-tune a foundation model on OCI. Which two actions should they take to optimize costs? (Select TWO.)

Accepted Answer

Use the smallest model that meets accuracy requirements. Option A is correct because using the smallest model that meets accuracy requirements directly reduces the number of parameters and computational operations required during fine-tuning. On OCI, larger models consume significantly more GPU memory and compute hours, so selecting the minimal viable model minimizes both training time and associated costs. This aligns with cost optimization best practices for generative AI workloads.

Answer

Use a single OCPU shape to minimize per-hour cost

Answer

Use spot preemptible instances to save on compute

Answer

Store training data in Archive Storage to reduce storage costs