Sample questions
Oracle Cloud Infrastructure Generative AI Professional 1Z0-1127 practice questions
A developer wants to deploy a RAG application using OCI Generative AI for both embedding and text generation while minimizing costs. Which strategy is most effective?
Trap 1: Use a larger generation model
Larger generation models increase cost per generation.
Trap 2: Reduce chunk size to decrease embedding calls
Smaller chunks may increase the number of chunks and thus embedding calls.
Trap 3: Use a larger embedding model for better accuracy
Larger models cost more per API call.
- A
Use a larger generation model
Why wrong: Larger generation models increase cost per generation.
- B
Cache frequent queries and their embeddings
Caching reduces redundant embedding API calls, lowering costs.
- C
Reduce chunk size to decrease embedding calls
Why wrong: Smaller chunks may increase the number of chunks and thus embedding calls.
- D
Use a larger embedding model for better accuracy
Why wrong: Larger models cost more per API call.
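The caching idea in option B can be sketched as follows. This is a minimal illustration, not OCI's API: `EmbeddingCache` and `fake_embed` are hypothetical names, and `fake_embed` stands in for whatever function actually calls the embedding model.

```python
import hashlib
from typing import Callable, Dict, List


class EmbeddingCache:
    """Memoize embeddings so repeated queries do not trigger new API calls."""

    def __init__(self, embed_fn: Callable[[str], List[float]]) -> None:
        self._embed_fn = embed_fn  # hypothetical wrapper around the embedding API
        self._store: Dict[str, List[float]] = {}
        self.api_calls = 0  # counts cache misses only

    @staticmethod
    def _key(text: str) -> str:
        # Normalize case and whitespace so trivially different queries share a key.
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def embed(self, text: str) -> List[float]:
        key = self._key(text)
        if key not in self._store:
            self.api_calls += 1
            self._store[key] = self._embed_fn(text)
        return self._store[key]


def fake_embed(text: str) -> List[float]:
    # Stand-in for a real embedding call; returns a toy two-dimensional vector.
    return [float(len(text)), float(text.count(" "))]


cache = EmbeddingCache(fake_embed)
for query in ["What is RAG?", "what is  rag?", "How are chunks stored?"]:
    cache.embed(query)
print(cache.api_calls)  # the first two queries share one cache entry
```

In a real deployment the in-memory dictionary would likely be replaced by a shared store with an eviction policy, which is outside the scope of this sketch.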
A data scientist fine-tuned a model on OCI Gen AI using a dedicated AI cluster. After deployment, the model gives inaccurate results. Which troubleshooting step should they take first?
Trap 1: Switch to a different base model
Base model may not be the root cause if fine-tuning data is flawed.
Trap 2: Increase the cluster size
Cluster size affects performance, not accuracy.
Trap 3: Use a serverless endpoint
Endpoint type does not fix accuracy issues.
- A
Switch to a different base model
Why wrong: Base model may not be the root cause if fine-tuning data is flawed.
- B
Increase the cluster size
Why wrong: Cluster size affects performance, not accuracy.
- C
Use a serverless endpoint
Why wrong: Endpoint type does not fix accuracy issues.
- D
Check the training data for bias or quality issues
Training data quality directly impacts model accuracy.
Users report that inference requests to the OCI Generative AI service are taking longer than expected. The application uses the on-demand endpoint. What is the most likely cause of the increased latency?
Trap 1: The inference model is not fine-tuned for the use case.
Fine-tuning affects accuracy, not latency. The issue is performance, not model suitability.
Trap 2: The selected model is too large for the use case.
Model size affects inference speed, but the on-demand endpoint automatically scales; the more common cause is shared resource contention.
Trap 3: The API request timeout is set too low.
Timeout settings affect client-side waits, not server-side inference latency.
- A
The inference model is not fine-tuned for the use case.
Why wrong: Fine-tuning affects accuracy, not latency. The issue is performance, not model suitability.
- B
The on-demand endpoint experiences shared resource contention.
On-demand endpoints are multi-tenant; high concurrent usage can cause latency spikes.
- C
The selected model is too large for the use case.
Why wrong: Model size affects inference speed, but the on-demand endpoint automatically scales; the more common cause is shared resource contention.
- D
The API request timeout is set too low.
Why wrong: Timeout settings affect client-side waits, not server-side inference latency.
Refer to the exhibit. A developer runs the command and receives the error. What is the issue?
Trap 1: The message is too short.
Message length is not the issue.
Trap 2: The chat-id is invalid.
The error does not mention chat-id.
Trap 3: The endpoint is incorrect.
The endpoint is valid; the error is about max-tokens.
- A
The max-tokens value exceeds the allowed range.
The error explicitly states the valid range.
- B
The message is too short.
Why wrong: Message length is not the issue.
- C
The chat-id is invalid.
Why wrong: The error does not mention chat-id.
- D
The endpoint is incorrect.
Why wrong: The endpoint is valid; the error is about max-tokens.
A developer wants to integrate OCI GenAI into a Java application. Which SDK should they use?
Trap 1: OCI JavaScript SDK.
JavaScript SDK is for Node.js or browser applications.
Trap 2: OCI Python SDK.
Python SDK is for Python applications, not Java.
Trap 3: OCI CLI.
CLI is a command-line tool, not an SDK for integration.
- A
OCI JavaScript SDK.
Why wrong: JavaScript SDK is for Node.js or browser applications.
- B
OCI Python SDK.
Why wrong: Python SDK is for Python applications, not Java.
- C
OCI Java SDK.
The Java SDK is designed for Java applications.
- D
OCI CLI.
Why wrong: CLI is a command-line tool, not an SDK for integration.
Which TWO factors most significantly influence the computational cost of fine-tuning a large language model?
Trap 1: Batch size
Batch size affects memory but not per-token compute cost.
Trap 2: Quantization bits
Quantization reduces cost rather than increasing it.
Trap 3: Dataset size
Dataset size affects total training time but not per-step cost.
- A
Batch size
Why wrong: Batch size affects memory but not per-token compute cost.
- B
Number of model parameters
More parameters increase compute and memory requirements.
- C
Maximum sequence length
Longer sequences increase attention computation and memory usage.
- D
Quantization bits
Why wrong: Quantization reduces cost rather than increasing it.
- E
Dataset size
Why wrong: Dataset size affects total training time but not per-step cost.
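A rough way to see why parameter count and sequence length dominate per-token cost is the widely used rule of thumb that training a dense transformer costs about 6 FLOPs per parameter per token, plus an attention term that grows with sequence length. The formula and the example model dimensions below are assumptions for illustration, not figures from OCI.

```python
def training_flops_per_token(n_params: int, n_layers: int, d_model: int, seq_len: int) -> int:
    """Approximate training FLOPs per token for a dense transformer.

    Rule-of-thumb estimate (an assumption, not an exact cost model):
      dense term     ~ 6 * n_params
      attention term ~ 6 * n_layers * seq_len * d_model
    """
    dense = 6 * n_params
    attention = 6 * n_layers * seq_len * d_model
    return dense + attention


# Hypothetical 7B-parameter model with 32 layers and hidden size 4096.
base = training_flops_per_token(7_000_000_000, 32, 4096, 2048)
longer = training_flops_per_token(7_000_000_000, 32, 4096, 8192)
bigger = training_flops_per_token(14_000_000_000, 32, 4096, 2048)

print(f"base:   {base:.3e}")
print(f"longer: {longer:.3e}")
print(f"bigger: {bigger:.3e}")
```

Batch size does not appear in the estimate, which matches the explanation above: it changes memory use and how tokens are grouped per step, not the compute spent on each token.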
An organization wants to use an LLM to summarize legal documents. Which consideration is most important for ensuring accurate summaries?
Trap 1: Use the largest available general-purpose model
Size alone doesn't ensure domain expertise.
Trap 2: Rely on zero-shot summarization with careful prompting
Zero-shot may miss critical legal details.
Trap 3: Pre-train a new model from scratch on legal texts
Pre-training from scratch is resource-intensive and seldom needed.
- A
Fine-tune the model on a curated legal corpus
Domain-specific fine-tuning teaches the model legal terminology and reasoning.
- B
Use the largest available general-purpose model
Why wrong: Size alone doesn't ensure domain expertise.
- C
Rely on zero-shot summarization with careful prompting
Why wrong: Zero-shot may miss critical legal details.
- D
Pre-train a new model from scratch on legal texts
Why wrong: Pre-training from scratch is resource-intensive and seldom needed.
A healthcare startup is building an AI assistant to help doctors draft clinical notes from patient-physician conversations. They have a large language model that is fine-tuned on medical data. During testing, they notice the model occasionally generates plausible-sounding but incorrect medical recommendations. The startup wants to deploy the assistant to assist doctors, not replace them. They are considering four options: deploy the model as-is and rely on doctors to catch errors; add a disclaimer that the model may make mistakes; implement a fact-checking pipeline that cross-references outputs with a trusted medical knowledge base before presenting them to doctors; or reduce the model's temperature to 0 to ensure deterministic outputs. Which option best balances safety and utility?
Trap 1: Add a disclaimer that the model may make mistakes.
A disclaimer does not reduce the risk of incorrect advice.
Trap 2: Deploy the model as-is and rely on doctors to catch errors.
Doctors may miss errors; this is unsafe.
Trap 3: Reduce the model's temperature to 0 to ensure deterministic outputs.
Deterministic outputs can still be incorrect.
- A
Implement a fact-checking pipeline that cross-references outputs with a trusted medical knowledge base.
Fact-checking reduces hallucinations and ensures accuracy.
- B
Add a disclaimer that the model may make mistakes.
Why wrong: A disclaimer does not reduce the risk of incorrect advice.
- C
Deploy the model as-is and rely on doctors to catch errors.
Why wrong: Doctors may miss errors; this is unsafe.
- D
Reduce the model's temperature to 0 to ensure deterministic outputs.
Why wrong: Deterministic outputs can still be incorrect.
A team is fine-tuning a large language model for a domain-specific Q&A application. After fine-tuning, they observe that the model performs well on the training distribution but struggles with out-of-distribution (OOD) questions. Which approach would best improve OOD robustness?
Trap 1: Use early stopping based on training loss to avoid overfitting.
Early stopping on training loss may not address OOD issues.
Trap 2: Reduce the model size to prevent overfitting to the training data.
A smaller model has less capacity to learn generalizable features.
Trap 3: Increase the learning rate during fine-tuning to adapt faster to…
Higher learning rate can cause instability and catastrophic forgetting.
- A
Include a diverse set of examples from related domains in the fine-tuning dataset.
Diverse data improves generalization and OOD performance.
- B
Use early stopping based on training loss to avoid overfitting.
Why wrong: Early stopping on training loss may not address OOD issues.
- C
Reduce the model size to prevent overfitting to the training data.
Why wrong: A smaller model has less capacity to learn generalizable features.
- D
Increase the learning rate during fine-tuning to adapt faster to new patterns.
Why wrong: Higher learning rate can cause instability and catastrophic forgetting.
Which TWO measures can help reduce the risk of generating toxic or unsafe content when using OCI Generative AI Service?
Trap 1: Disable model monitoring and logging to reduce overhead.
Disabling monitoring reduces the ability to detect and respond to toxic outputs.
Trap 2: Increase the temperature parameter to make output more…
Higher temperature increases randomness, which can increase the chance of toxic outputs.
Trap 3: Fine-tune the model on a large dataset without any safety filtering.
Fine-tuning without safety filtering can embed toxic patterns into the model.
- A
Use few-shot prompting with examples that demonstrate safe and appropriate responses.
Safe examples help steer the model toward desired behavior.
- B
Disable model monitoring and logging to reduce overhead.
Why wrong: Disabling monitoring reduces the ability to detect and respond to toxic outputs.
- C
Increase the temperature parameter to make output more deterministic.
Why wrong: Higher temperature increases randomness, which can increase the chance of toxic outputs.
- D
Fine-tune the model on a large dataset without any safety filtering.
Why wrong: Fine-tuning without safety filtering can embed toxic patterns into the model.
- E
Enable the built-in content filtering features provided by OCI Generative AI Service.
Content filters block harmful outputs based on predefined categories.
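The effect of temperature on randomness can be illustrated with a temperature-scaled softmax over a few token scores. This is a generic sketch of the sampling math, not the service's implementation, and the logits are made-up values.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
low = softmax_with_temperature(logits, 0.2)
high = softmax_with_temperature(logits, 1.5)

print([round(p, 3) for p in low])   # mass concentrates on the top token
print([round(p, 3) for p in high])  # mass spreads across all tokens
```

A higher temperature flattens the distribution, so unlikely tokens are sampled more often, which is why option C moves in the wrong direction.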
Refer to the exhibit. A user runs the command shown and receives the error: 'ServiceError: NotAuthorizedOrNotFound'. What is the MOST likely cause?
Trap 1: The CLI is not configured with OCI credentials
Missing credentials would give an authentication error, not NotAuthorizedOrNotFound.
Trap 2: The model ID is incorrectly formatted
An invalid format would likely cause a validation error, not a 404/403.
Trap 3: The model is in a different region than iad
Region mismatch would give a different error.
- A
The CLI is not configured with OCI credentials
Why wrong: Missing credentials would give an authentication error, not NotAuthorizedOrNotFound.
- B
The user does not have the 'inspect' permission on the model
NotAuthorizedOrNotFound is common when permissions are insufficient.
- C
The model ID is incorrectly formatted
Why wrong: An invalid format would likely cause a validation error, not a 404/403.
- D
The model is in a different region than iad
Why wrong: Region mismatch would give a different error.
You are a cloud architect at a healthcare company that uses OCI Generative AI Service to analyze patient records and generate clinical summaries. The service is deployed in the Frankfurt region with a dedicated AI cluster. Recently, the compliance team flagged that some generated summaries contain hallucinated diagnoses not present in the source records. They demand immediate mitigation. The current setup uses the default model (cohere.command-r-08-2024) with temperature=0.7, top_p=0.9, and max_tokens=2048. The application sends the entire patient record as a single prompt. You have access to OCI Logging, monitoring metrics (latency, request count, token count, safety filter rejections), and the AI service's model fine-tuning capability. You must reduce hallucinations while minimizing latency increase. What is the most effective course of action?
Trap 1: Switch to cohere.command-light model for faster inference and add a…
A lighter model may be faster but likely less accurate; post-processing NER helps but does not prevent hallucinations at generation time.
Trap 2: Increase max_tokens to 4096 and use chunked processing with…
Chunking with overlap may reduce hallucinations by providing more context, but increasing max_tokens increases latency and cost; the improvement might be marginal.
Trap 3: Enable the safety filter with strict content moderation and set up…
Safety filters block harmful content but do not reduce hallucinations about medical facts; auditing only detects issues after the fact.
- A
Switch to cohere.command-light model for faster inference and add a post-processing step using a BERT-based NER model to validate entities.
Why wrong: A lighter model may be faster but likely less accurate; post-processing NER helps but does not prevent hallucinations at generation time.
- B
Increase max_tokens to 4096 and use chunked processing with overlapping context windows to provide more context.
Why wrong: Chunking with overlap may reduce hallucinations by providing more context, but increasing max_tokens increases latency and cost; the improvement might be marginal.
- C
Enable the safety filter with strict content moderation and set up OCI Logging to audit all generations.
Why wrong: Safety filters block harmful content but do not reduce hallucinations about medical facts; auditing only detects issues after the fact.
- D
Reduce temperature to 0.2, top_p to 0.5, and fine-tune the model on a curated dataset of 5,000 clinical summaries with a learning rate of 0.00005 and batch size of 8.
Lower temperature/top_p yields more deterministic outputs; fine-tuning on domain-specific data directly reduces hallucinations.
An enterprise deployed a custom fine-tuned model for generating financial reports. After the first month, the model's outputs began to include outdated information and occasional factual errors. The team suspects data drift. What is the best course of action?
Trap 1: Switch to a newer base model like Llama 3.1 without retraining.
A newer base model may still require fine-tuning on the specific domain data to be accurate.
Trap 2: Decrease the temperature parameter to 0.1 to reduce model…
Temperature controls randomness, not factual accuracy; it won't fix outdated knowledge.
Trap 3: Increase the max tokens value to allow longer responses.
Max tokens only affects response length, not quality or timeliness.
- A
Switch to a newer base model like Llama 3.1 without retraining.
Why wrong: A newer base model may still require fine-tuning on the specific domain data to be accurate.
- B
Decrease the temperature parameter to 0.1 to reduce model creativity.
Why wrong: Temperature controls randomness, not factual accuracy; it won't fix outdated knowledge.
- C
Retrain the model on the latest financial data and monitor for drift.
Retraining with current data mitigates data drift and improves output accuracy.
- D
Increase the max tokens value to allow longer responses.
Why wrong: Max tokens only affects response length, not quality or timeliness.
A developer is building a RAG application using Oracle Cloud Infrastructure (OCI) Document Understanding and OCI Generative AI. After chunking documents and generating embeddings, the developer observes that the retrieval step often returns chunks that are semantically unrelated to the query. Which action is MOST likely to improve retrieval relevance?
Trap 1: Switch from a dense embedding model to a sparse embedding model.
The embedding model choice is secondary; chunking is the primary issue.
Trap 2: Increase the chunk size to capture more context.
Larger chunks may include irrelevant content, reducing precision.
Trap 3: Reduce the number of retrieved chunks (k) in the vector search.
Reducing k may cause relevant passages to be missed.
- A
Switch from a dense embedding model to a sparse embedding model.
Why wrong: The embedding model choice is secondary; chunking is the primary issue.
- B
Adjust the chunk size and chunk overlap to better capture coherent passages.
Proper chunking helps preserve meaning and improves retrieval accuracy.
- C
Increase the chunk size to capture more context.
Why wrong: Larger chunks may include irrelevant content, reducing precision.
- D
Reduce the number of retrieved chunks (k) in the vector search.
Why wrong: Reducing k may cause relevant passages to be missed.
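Chunk size and overlap, as in option B, can be sketched with a simple word-based splitter. Real pipelines often split on tokens or sentences instead; the function name and the toy input here are illustrative only.

```python
from typing import List


def chunk_words(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Split text into word windows of chunk_size that share overlap words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


chunks = chunk_words("a b c d e f g h i j", chunk_size=4, overlap=2)
print(chunks)
```

The overlap keeps a sentence that straddles a boundary intact in at least one chunk, at the price of more chunks to embed and store.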
A company is deploying a RAG pipeline using OCI Data Science and OCI Generative AI. The pipeline uses a Cohere command model for generation and a Cohere embed model for retrieval. The team notices that the model occasionally produces hallucinated answers that are not supported by the retrieved context. Which strategy is MOST effective at reducing hallucinations?
Trap 1: Increase the temperature parameter of the generation model.
Higher temperature increases randomness, potentially worsening hallucinations.
Trap 2: Increase the number of retrieved chunks (k) to provide more context.
More context can include irrelevant or contradictory information.
Trap 3: Use a larger generative model with more parameters.
Larger models may still hallucinate; size alone does not guarantee faithful output.
- A
Implement a faithfulness verification step that re-ranks retrieved passages based on alignment with the generated answer.
A verification step can detect and mitigate unsupported claims.
- B
Increase the temperature parameter of the generation model.
Why wrong: Higher temperature increases randomness, potentially worsening hallucinations.
- C
Increase the number of retrieved chunks (k) to provide more context.
Why wrong: More context can include irrelevant or contradictory information.
- D
Use a larger generative model with more parameters.
Why wrong: Larger models may still hallucinate; size alone does not guarantee faithful output.
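One very simple form of faithfulness check is to flag answer sentences that share few words with any retrieved passage. The sketch below uses lexical overlap as a stand-in; production systems more commonly use an entailment model or a re-ranker, and the function name, threshold, and sample texts are all assumptions.

```python
import re
from typing import List, Set, Tuple


def _tokens(text: str) -> Set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def unsupported_sentences(
    answer: str, passages: List[str], threshold: float = 0.6
) -> List[Tuple[str, float]]:
    """Return answer sentences whose best word overlap with any passage is below threshold."""
    passage_tokens = [_tokens(p) for p in passages]
    flagged: List[Tuple[str, float]] = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = _tokens(sentence)
        if not words:
            continue
        # Fraction of the sentence's words found in the best-matching passage.
        best = max((len(words & p) / len(words) for p in passage_tokens), default=0.0)
        if best < threshold:
            flagged.append((sentence, best))
    return flagged


passages = [
    "The cluster was created in the Frankfurt region.",
    "Embeddings are stored in an OpenSearch index.",
]
answer = "The cluster was created in the Frankfurt region. It costs 40 dollars per hour."
flagged = unsupported_sentences(answer, passages)
print(flagged)
```

Flagged sentences could then be dropped, regenerated, or surfaced to the user with a warning, depending on the application.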
A developer is using OCI Generative AI to build a question-answering system over a large corpus of technical manuals. The developer uses the Cohere Embed model to generate embeddings and stores them in an OCI OpenSearch cluster. Queries are slow and the team needs to reduce latency. Which approach is BEST for improving search speed while maintaining acceptable accuracy?
Trap 1: Increase the embedding dimension for better representation.
Higher dimensionality increases computation and slows search.
Trap 2: Use exact nearest neighbor search instead of approximate.
Exact search is slower than approximate methods.
Trap 3: Increase the index refresh interval to reduce write overhead.
Index refresh affects write performance, not search latency.
- A
Increase the embedding dimension for better representation.
Why wrong: Higher dimensionality increases computation and slows search.
- B
Reduce the k value in the nearest neighbor search.
Fewer neighbors means less distance computation and faster retrieval.
- C
Use exact nearest neighbor search instead of approximate.
Why wrong: Exact search is slower than approximate methods.
- D
Increase the index refresh interval to reduce write overhead.
Why wrong: Index refresh affects write performance, not search latency.
An engineer configured the index mapping shown in the exhibit for vector search. When performing a k-NN search, the results are unexpected. What is the most likely issue?
Exhibit
Refer to the exhibit.
document index mapping:
{
  "settings": {
    "index": {
      "knn": true,
      "knn.space_type": "cosinesimil"
    }
  },
  "mappings": {
    "properties": {
      "content_embedding": {
        "type": "knn_vector",
        "dimension": 768,
        "method": {
          "name": "hnsw",
          "engine": "faiss",
          "space_type": "l2"
        }
      },
      "metadata": {
        "type": "object"
      }
    }
  }
}
Trap 1: The space type 'cosinesimil' is not supported; it should be…
'cosinesimil' is valid.
Trap 2: The dimension 768 does not match the embedding model's output…
768 is a common dimension.
Trap 3: The mapping uses 'knn_vector' type with 'faiss' engine, which is…
faiss is a supported engine.
- A
The space type 'cosinesimil' is not supported; it should be 'cosine'.
Why wrong: 'cosinesimil' is valid.
- B
The dimension 768 does not match the embedding model's output dimension.
Why wrong: 768 is a common dimension.
- C
The mapping uses 'knn_vector' type with 'faiss' engine, which is incompatible.
Why wrong: faiss is a supported engine.
- D
The space type at the index level and mapping level are mismatched.
Mismatch causes incorrect distance calculations.
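To see why the distance metric matters, the sketch below ranks the same two candidate vectors under L2 distance and under cosine similarity. The vectors are toy values chosen so that the two metrics disagree; nothing here calls OpenSearch.

```python
import math
from typing import Dict, List


def l2_distance(a: List[float], b: List[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query = [1.0, 0.0]
docs: Dict[str, List[float]] = {
    "same_direction_far": [10.0, 0.0],
    "different_direction_near": [1.0, 1.0],
}

# Smaller L2 distance is better; larger cosine similarity is better.
nearest_l2 = min(docs, key=lambda name: l2_distance(query, docs[name]))
nearest_cosine = max(docs, key=lambda name: cosine_similarity(query, docs[name]))

print(nearest_l2)
print(nearest_cosine)
```

When embeddings are not normalized to unit length, an index built with one metric and queried with the expectation of the other can return a different top result, as the two printed names show.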
An administrator runs the CLI command shown in the exhibit to check the status of a dedicated AI cluster. The cluster is ACTIVE with capacity 10. However, a user reports that inference requests to this cluster are failing with a '429 Too Many Requests' error. What is the most likely cause?
Exhibit
Refer to the exhibit.
```
$ oci generative-ai dedicated-ai-cluster get --dedicated-ai-cluster-id ocid1.dedicatedaicluster.oc1.iad.xxxxx
{
  "data": {
    "capacity": 10,
    "id": "ocid1.dedicatedaicluster.oc1.iad.xxxxx",
    "lifecycle-state": "ACTIVE",
    "time-created": "2024-01-15T10:00:00Z",
    "time-updated": "2024-01-15T10:00:00Z"
  }
}
```
Trap 1: The cluster does not have enough nodes to handle the load
A capacity of 10 may be sufficient; the error indicates rate limiting.
Trap 2: The user is not in the same compartment as the cluster
Compartment mismatch would cause 404 or 401, not 429.
Trap 3: The cluster is not in ACTIVE state
The output shows ACTIVE.
- A
The cluster is hitting the maximum inference requests per minute limit
429 indicates rate limit; the cluster has a requests-per-minute limit separate from node count.
- B
The cluster does not have enough nodes to handle the load
Why wrong: A capacity of 10 may be sufficient; the error indicates rate limiting.
- C
The user is not in the same compartment as the cluster
Why wrong: Compartment mismatch would cause 404 or 401, not 429.
- D
The cluster is not in ACTIVE state
Why wrong: The output shows ACTIVE.
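On the client side, a 429 response is usually handled by retrying with exponential backoff and jitter. The sketch below is generic: `RateLimitError` and `flaky_request` are hypothetical stand-ins for whatever exception and call the real client library exposes.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 Too Many Requests response."""


def call_with_backoff(
    request_fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request_fn, retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries
    raise ValueError("max_attempts must be at least 1")


attempts = {"count": 0}


def flaky_request() -> str:
    # Simulated endpoint: rate-limited twice, then succeeds.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return "ok"


waits = []  # record the delays instead of actually sleeping
result = call_with_backoff(flaky_request, sleep=waits.append)
print(result, len(waits))
```

Backoff smooths bursts but does not raise the limit itself; sustained traffic above the limit still needs request throttling or more capacity.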
You are deploying a generative AI solution on OCI for a healthcare client that requires strict data residency (data must remain in the EU) and low-latency inference. The solution uses a fine-tuned LLM model (7B parameters) stored in Object Storage in the Frankfurt region. You have set up an OCI Data Science model deployment endpoint with GPU shape VM.GPU.A10.1, using a single replica. During load testing with 50 concurrent users, you observe high latency (average 8 seconds per request) and occasional 504 gateway timeouts. The model deployment logs show no errors, and the model loads successfully. You have confirmed that the Object Storage bucket is in the same region and that the network latency between the client and the endpoint is minimal (under 5 ms). Which action should you take to reduce latency and eliminate timeouts?
Trap 1: Increase the model deployment endpoint timeout setting from 60…
Increasing the timeout only masks the symptom without addressing the root cause (insufficient capacity).
Trap 2: Upgrade the model deployment shape to VM.GPU.A100.4 and keep a…
Upgrading to a larger GPU (A100) increases compute power per request, but with only one replica, concurrency remains a bottleneck; scaling out is more effective for high concurrency.
Trap 3: Move the model deployment to the US East (Ashburn) region to…
Moving to a region outside the EU breaks the data residency requirement and may add latency.
- A
Increase the model deployment endpoint timeout setting from 60 seconds to 300 seconds in the OCI console.
Why wrong: Increasing the timeout only masks the symptom without addressing the root cause (insufficient capacity).
- B
Upgrade the model deployment shape to VM.GPU.A100.4 and keep a single replica.
Why wrong: Upgrading to a larger GPU (A100) increases compute power per request, but with only one replica, concurrency remains a bottleneck; scaling out is more effective for high concurrency.
- C
Increase the number of replicas to 3 and enable autoscaling based on CPU utilization.
Adding replicas lets the deployment serve concurrent requests in parallel, which reduces queuing and improves throughput, and load balancing across replicas helps avoid timeouts.
- D
Move the model deployment to the US East (Ashburn) region to leverage lower-cost GPU capacity and reduce latency.
Why wrong: Moving to a region outside the EU breaks the data residency requirement and may add latency.
A team is deploying a generative AI model using OCI Functions for serverless inference. They are experiencing cold start latency of over 10 seconds for the first invocation after idle periods. What is the best strategy to reduce cold start latency?
Trap 1: Migrate the inference to OCI Data Flow for better performance.
Data Flow is for big data processing, not real-time inference.
Trap 2: Reduce the function timeout to force faster execution.
Reducing timeout does not affect cold start and may cause errors.
Trap 3: Increase the memory allocation for the function.
More memory can speed up cold start but provisioned concurrency is more direct.
- A
Migrate the inference to OCI Data Flow for better performance.
Why wrong: Data Flow is for big data processing, not real-time inference.
- B
Use provisioned concurrency to keep a set number of function instances warm.
Provisioned concurrency eliminates cold start by pre-warming instances.
- C
Reduce the function timeout to force faster execution.
Why wrong: Reducing timeout does not affect cold start and may cause errors.
- D
Increase the memory allocation for the function.
Why wrong: More memory can speed up cold start but provisioned concurrency is more direct.
A company has fine-tuned a large language model using OCI Generative AI service. When attempting to deploy the model to a dedicated endpoint, the deployment fails with an error indicating insufficient capacity. Which action should be taken to resolve this issue?
Trap 1: Delete existing endpoints to free capacity
Unnecessary if a limit increase is possible; also disrupts existing workloads.
Trap 2: Deploy the model to a different OCI region
This may avoid capacity issues but is not a direct resolution and could introduce latency.
Trap 3: Use a pre-built model instead of the fine-tuned model
This disregards the value of the fine-tuned model.
- A
Delete existing endpoints to free capacity
Why wrong: Unnecessary if a limit increase is possible; also disrupts existing workloads.
- B
Deploy the model to a different OCI region
Why wrong: This may avoid capacity issues but is not a direct resolution and could introduce latency.
- C
Use a pre-built model instead of the fine-tuned model
Why wrong: This disregards the value of the fine-tuned model.
- D
Request a service limit increase for dedicated endpoints
OCI allows customers to request higher limits for resources like dedicated endpoints.
A startup wants to minimize costs when using OCI Generative AI service for a chatbot application that experiences sporadic usage. Which deployment strategy is most cost-effective?
Trap 1: Use a pre-built model with a dedicated endpoint
Pre-built models can be served on demand, so a dedicated endpoint is unnecessary.
Trap 2: Provision a dedicated endpoint for low latency
Dedicated endpoints have hourly costs, even when idle.
Trap 3: Deploy the model on OCI Compute with autoscaling
Adds operational overhead and may not be as simple as the managed service.
- A
Use a pre-built model with a dedicated endpoint
Why wrong: Pre-built models can be served on demand, so a dedicated endpoint is unnecessary.
- B
Use the serverless on-demand API without dedicated endpoints
Pay per request, no idle costs.
- C
Provision a dedicated endpoint for low latency
Why wrong: Dedicated endpoints have hourly costs, even when idle.
- D
Deploy the model on OCI Compute with autoscaling
Why wrong: Adds operational overhead and may not be as simple as the managed service.
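The cost trade-off between pay-per-use and an always-on endpoint comes down to simple arithmetic. The prices below are hypothetical placeholders, not OCI list prices, and per-request pricing is a simplification of how usage is actually metered.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def on_demand_cost(requests: int, price_per_request: float) -> float:
    """Monthly cost when paying only for requests actually served."""
    return requests * price_per_request


def dedicated_cost(hourly_rate: float, hours: int = HOURS_PER_MONTH) -> float:
    """Monthly cost of an endpoint billed for every hour, busy or idle."""
    return hourly_rate * hours


# Hypothetical prices chosen only to illustrate the comparison.
PRICE_PER_REQUEST = 0.01
HOURLY_RATE = 10.0

sporadic = on_demand_cost(20_000, PRICE_PER_REQUEST)
always_on = dedicated_cost(HOURLY_RATE)
print(f"on-demand: {sporadic:.2f}  dedicated: {always_on:.2f}")
```

With sporadic traffic the idle hours dominate the dedicated bill, which is the reasoning behind option B; at high, steady volume the comparison can flip.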
A user wants to access the OCI Generative AI service programmatically. Which credential method is recommended for use in a production application running on OCI Compute?
Trap 1: API signing keys
Keys can be compromised if stored in code.
Trap 2: User password and OCID
Passwords are not used for API authentication.
Trap 3: Resource principal
Resource principal is for OCI resources like Functions, not compute instances.
- A
API signing keys
Why wrong: Keys can be compromised if stored in code.
- B
Instance principal
Instance principal dynamically obtains credentials via instance metadata service.
- C
User password and OCID
Why wrong: Passwords are not used for API authentication.
- D
Resource principal
Why wrong: Resource principal is for OCI resources like Functions, not compute instances.
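With the OCI Python SDK, instance principal authentication can be sketched as below. The helper name `build_genai_client` is hypothetical, the service endpoint is passed in by the caller rather than assumed, and the call only succeeds on a Compute instance that belongs to a dynamic group with a policy granting access to the service.

```python
def build_genai_client(service_endpoint: str):
    """Create a Generative AI inference client that signs requests as the instance."""
    # Imported lazily so this module can be loaded where the OCI SDK is not installed.
    from oci.auth.signers import InstancePrincipalsSecurityTokenSigner
    from oci.generative_ai_inference import GenerativeAiInferenceClient

    # The signer fetches short-lived credentials from the instance metadata
    # service, so no API key or config file is stored on the instance.
    signer = InstancePrincipalsSecurityTokenSigner()
    return GenerativeAiInferenceClient(
        config={},
        signer=signer,
        service_endpoint=service_endpoint,
    )
```

Because no long-lived secret is written to disk or code, rotating credentials requires no application change, which is the main argument for option B over API signing keys.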
A company has deployed a generative AI model endpoint on OCI. They want to monitor token usage and latency for cost optimization. Which OCI service should they use to collect these metrics?
Trap 1: OCI Events
Events trigger actions based on changes, not for continuous metric collection.
Trap 2: OCI Notifications
Notifications are for alerting, not metric collection.
Trap 3: OCI Logging
Logging captures logs, not metrics. Token usage and latency are metric data.
- A
OCI Monitoring
OCI Monitoring collects and visualizes metrics such as token count and latency.
- B
OCI Events
Why wrong: Events trigger actions based on changes, not for continuous metric collection.
- C
OCI Notifications
Why wrong: Notifications are for alerting, not metric collection.
- D
OCI Logging
Why wrong: Logging captures logs, not metrics. Token usage and latency are metric data.