CCNA Oci Genai Service Questions — Page 1 of 2

MCQmedium

A user wants to test different prompt variations with a generative model interactively without writing code. Which OCI Generative AI tool should they use?

A.OCI Generative AI API with InferenceClient

B.OCI Data Science Notebook

C.OCI Generative AI Agents

D.OCI Generative AI Playground

AnswerD

The Playground allows interactive testing of models with adjustable parameters.

Why this answer

The OCI Generative AI Playground provides a no-code interface to test models, adjust parameters, and experiment with prompts interactively.

Practice this question →

MCQmedium

A developer needs to generate embeddings for a set of search queries to be used in a semantic search system. Which input type should they specify when calling the Embedding API?

A.clustering

B.search_document

C.classification

D.search_query

AnswerD

This is optimized for short queries in search.

Why this answer

Option D is correct because the Embedding API in OCI Generative AI requires specifying the input type as 'search_query' when generating embeddings for search queries in a semantic search system. This input type optimizes the embedding model to produce vectors that are specifically tuned for query-side representation, ensuring better alignment with document embeddings during similarity search.

Exam trap

The trap here is that candidates may confuse 'search_query' with 'search_document' or assume a generic input type like 'clustering' or 'classification' is valid, not realizing that the Embedding API explicitly distinguishes between query and document embeddings for optimal semantic search performance.

How to eliminate wrong answers

Option A is wrong because 'clustering' is not a valid input type for the Embedding API; it is a downstream task that uses embeddings but not a parameter for embedding generation. Option B is wrong because 'search_document' is used for embedding documents, not queries, and specifying it for queries would mismatch the intended use case, degrading search accuracy. Option C is wrong because 'classification' is not a supported input type for the Embedding API; classification tasks use embeddings but require a different API or model endpoint.

Practice this question →

MCQhard

A developer is using the Chat API for a multi-turn conversation. They want the assistant to adopt a formal tone and always identify itself as 'OracleBot'. How should they configure the API request?

A.Include the persona instructions in the 'preamble_override' field and the conversation history in the 'messages' field of the Chat API

B.Set the 'temperature' parameter low and 'max_tokens' high

C.Use the Embedding API to embed the persona instructions and pass them with each request

D.Set the 'system_prompt' in the Generate API

AnswerA

Preamble override sets the assistant's behavior; messages carry the conversation history.

Why this answer

The Chat API supports a 'preamble_override' parameter to set system-level instructions, and 'messages' array for conversation history. The preamble is the correct place to set persona. Temperature and max tokens do not control persona.

The Generate API does not support system prompts.

Practice this question →

MCQeasy

An OCI Generative AI user wants to host a fine-tuned model with guaranteed low latency for a production application. Which option should they choose?

A.Use the Chat API with a base model

B.Use the on-demand shared inference endpoint

C.Use the OCI Generative AI Playground

D.Provision a Dedicated AI Cluster with model units for the fine-tuned model

AnswerD

A Dedicated AI Cluster provides reserved capacity and low-latency inference for custom models.

Why this answer

A Dedicated AI Cluster provides dedicated inference capacity with low latency, ideal for production fine-tuned models. Shared infrastructure may have variable latency.

Practice this question →

MCQmedium

A company wants to process a large batch of documents to generate summaries using OCI Generative AI. They need the most cost-effective option without compromising on summary quality. Which approach should they use?

A.Use the on-demand (shared) inference endpoint with the Summarisation API

B.Use OCI Generative AI Agents with a knowledge base

C.Provision a dedicated AI cluster for the batch job

D.Fine-tune a model on summarization and deploy it on a dedicated cluster

AnswerA

On-demand tokens are cost-effective for batch summarization.

Why this answer

On-demand pricing is pay-per-token, suitable for batch workloads without dedicated capacity; it's cost-effective for sporadic or large batch jobs.

Practice this question →

MCQmedium

An application needs to generate embeddings for text in multiple languages (English, Spanish, French). Which OCI Generative AI embedding model should be used?

A.embed-english-v3.0

B.Cohere Command R

C.embed-multilingual-v3.0

D.Meta Llama 3

AnswerC

This model supports multiple languages and generates high-quality embeddings for English, Spanish, French, and many others.

Why this answer

The embed-multilingual-v3.0 model is designed for multilingual text. The English-only model would not handle other languages well.

Practice this question →

MCQmedium

An organization plans to deploy a custom fine-tuned model for a real-time chat application requiring consistent low-latency responses. They expect high throughput during business hours. Which OCI Generative AI infrastructure choice best meets these requirements?

A.Use the shared infrastructure on-demand endpoint

B.Use the OCI Generative AI Playground for production

C.Provision a Dedicated AI Cluster with the required model units

D.Deploy the model on OCI Data Science using a custom container

AnswerC

Dedicated clusters offer isolated compute for low-latency inference and are suitable for custom fine-tuned models.

Why this answer

A Dedicated AI Cluster provides reserved compute with low latency and high throughput, ideal for production workloads with custom models. Shared infrastructure may have variable performance.

Practice this question →

Multi-Selectmedium

A company wants to build a RAG application using OCI Generative AI Agents. Which TWO components are required to set up the agent?

Select 2 answers

A.An embedding model

B.A knowledge base

C.A dedicated AI cluster

D.A data source (e.g., OCI Object Storage)

E.A fine-tuned model

AnswersB, D

Required to store indexed documents.

Why this answer

Option B is correct because a knowledge base is the core repository that stores the indexed content used by OCI Generative AI Agents to retrieve relevant information for answering queries. Without a knowledge base, the agent has no source of domain-specific data to ground its responses, making it a mandatory component.

Exam trap

Cisco often tests the misconception that you need to separately provision an embedding model or a dedicated AI cluster, when in fact OCI Generative AI Agents abstract these components away as part of the managed service, requiring only a data source and a knowledge base.

Practice this question →

Multi-Selectmedium

An organization needs to choose a model for a multilingual customer support chatbot that must understand and respond in five different languages. Which TWO models available in OCI Generative AI are suitable?

Select 2 answers

A.Meta Llama 3

B.Cohere Command R

C.Cohere Rerank

D.Cohere Command R+

E.Cohere embed-english-v3.0

AnswersA, D

Supports multiple languages.

Why this answer

Cohere Command R+ and Meta Llama 3 support multiple languages.

Practice this question →

MCQmedium

A developer is using the Embedding API to create embeddings for a clustering task. They want to ensure the embeddings are optimized for clustering similar documents. Which input type should they specify?

A.clustering

B.search_query

C.search_document

D.classification

AnswerA

Optimized for clustering.

Why this answer

The input type 'clustering' optimizes embeddings for clustering tasks.

Practice this question →

MCQmedium

A developer uses the OCI Generative AI Chat API to build a multi-turn conversational agent. They notice the model starts to lose context after several exchanges. What is the MOST likely cause?

A.The temperature parameter is set too high, causing the model to forget

B.The model's context window is too small for the conversation length

C.The developer is not sending the conversation history with each request

D.The fine-tuning dataset did not include multi-turn examples

AnswerC

The Chat API requires the client to include previous messages; otherwise the model has no context of prior turns.

Why this answer

The Chat API does not automatically manage conversation history; the developer must provide previous messages. The model itself retains state only within the provided context.

Practice this question →

MCQhard

An organization requires low-latency inference for a custom fine-tuned model that will be used in a real-time application. They also need guaranteed availability and isolation from other tenants. Which infrastructure option should they choose?

A.Provision a Dedicated AI Cluster with model units for the fine-tuned model

B.Create an endpoint on OCI API Gateway pointing to shared inference

C.Use the OCI Generative AI Playground to host the model

D.On-demand token-based inference on shared infrastructure

AnswerA

A Dedicated AI Cluster provides isolated compute resources, ensuring low latency and dedicated capacity for custom models.

Why this answer

Dedicated AI Clusters provide isolated, low-latency inference for custom models. Shared infrastructure is multi-tenant and may have higher latency. On-demand tokens and serving endpoints without a cluster do not offer dedicated resources.

Practice this question →

MCQhard

An organization needs to fine-tune a Cohere Command R model for a custom domain. They have prepared a dataset in JSONL format. Which component of the fine-tuning job specifies the base model and the training dataset location?

A.The dedicated AI cluster settings

B.The T-Few configuration file

C.The fine-tuning job creation request body

D.The model deployment endpoint configuration

AnswerC

The request body specifies base model, dataset, and other settings for the fine-tuning job.

Why this answer

The fine-tuning job creation request includes parameters such as 'baseModelId' and 'trainingDataset' (pointing to the dataset in Object Storage). The model deployment endpoint and cluster are separate steps.

Practice this question →

MCQhard

A developer is using the OCI Generative AI Chat API with Cohere Command R+ to build a multi-turn conversational agent. They want the agent to always respond in a formal tone, regardless of the user's phrasing. Which parameter should they set in the API request to achieve this consistently?

A.Set the 'stop_sequences' parameter to include periods

B.Set the 'temperature' parameter to 0.1

C.Use the 'system' parameter (or preamble_override) to provide a system message like 'You are a formal assistant'

D.Set the 'max_tokens' parameter to 200

AnswerC

The system message or preamble override instructs the model on its behavior and tone, which persists across the conversation.

Why this answer

A system message or preamble override sets the overall behavior and tone of the assistant for the entire conversation. Temperature controls randomness; max tokens limits length; stop sequences end generation — none are suitable for defining a persistent tone.

Practice this question →

MCQhard

A machine learning engineer is fine-tuning a Cohere Command R model using T-Few. They have prepared a JSONL dataset with 500 prompt-completion pairs. After submitting the fine-tuning job, they notice the model's performance on validation data is poor. Which action is MOST likely to improve performance?

A.Adding more high-quality training examples to the dataset

B.Increasing the number of training epochs

C.Setting the temperature to 0 in the fine-tuning configuration

D.Switching to a larger base model like Llama 3 70B

AnswerA

More data can improve T-Few fine-tuning performance.

Why this answer

T-Few is parameter-efficient and may require more data. Increasing dataset size or using data augmentation is a likely improvement.

Practice this question →

Multi-Selecthard

A data scientist is using the OCI Generative AI Playground to test a model for a text generation task. They want to control the output to be more focused and avoid repeating the same phrases. Which THREE parameter adjustments should they consider?

Select 3 answers

A.Increase the presence penalty

B.Increase the temperature

C.Increase the max tokens

D.Increase the frequency penalty

E.Decrease the temperature

AnswersA, D, E

Encourages the model to talk about new topics, reducing repetition.

Why this answer

Temperature controls randomness; frequency penalty reduces repetition of words/phrases; presence penalty encourages new topics.

Practice this question →

MCQmedium

An application needs to generate embeddings for customer reviews to cluster them by sentiment. Which input type should be specified in the Embedding API call?

A.clustering

B.search_document

C.search_query

D.classification

AnswerA

The clustering input type tailors embeddings for clustering algorithms.

Why this answer

For clustering tasks, the 'clustering' input type optimizes the embeddings for that purpose. The other types are for different tasks.

Practice this question →

MCQmedium

A machine learning engineer is preparing a dataset for fine-tuning a model in OCI Generative AI. The dataset consists of customer support conversations with questions and desired answers. What is the required format for the training data?

A.CSV file with columns 'input' and 'output'

B.Parquet file with 'text' and 'label' columns

C.Plain text file with examples separated by blank lines

D.JSONL file with 'prompt' and 'completion' fields per line

AnswerD

JSONL is the supported format with prompt/completion pairs.

Why this answer

OCI Generative AI expects a JSONL file where each line is a JSON object with 'prompt' and 'completion' fields. CSV, Parquet, and text files are not supported for fine-tuning.

Practice this question →

MCQmedium

A company wants to build a RAG-based assistant that answers queries using documents stored in OCI Object Storage. Which OCI Generative AI service should they use?

A.OCI Generative AI Playground

B.OCI Generative AI Agents

C.OCI Generative AI Embedding API

D.OCI Generative AI Chat API

AnswerB

Agents is a managed RAG service that integrates with Object Storage and provides a question-answering interface.

Why this answer

OCI Generative AI Agents provides a managed RAG service that can ingest documents from Object Storage and answer questions based on the ingested knowledge base.

Practice this question →

Multi-Selecthard

An enterprise needs to deploy a custom fine-tuned model for real-time inference with strict latency requirements. They also need to manage costs by paying only for usage. Which three steps are required to achieve this? (Select THREE)

Select 3 answers

A.Use the on-demand token-based inference on shared infrastructure

B.Create an endpoint using InferenceClient to point to the dedicated cluster

C.Delete the fine-tuning job after deployment to save costs

D.Provision a Dedicated AI Cluster with sufficient model units

E.Host the fine-tuned model on the Dedicated AI Cluster

AnswersB, D, E

An endpoint enables API calls to the model hosted on the cluster.

Why this answer

To deploy a custom model for dedicated low-latency inference, you need a Dedicated AI Cluster, host the model on it, and create an endpoint to access it. Using shared infrastructure does not guarantee low latency. Deletion of cluster is not needed.

On-demand tokens are for pay-as-you-go, not dedicated clusters.

Practice this question →

MCQhard

A security engineer needs to allow the DataScience group to use OCI Generative AI resources (Chat, Embedding, and Summarisation) in the compartment 'genai-compartment', but not allow them to create dedicated AI clusters. Which IAM policy statement achieves this?

A.Allow group DataScience to use generative-ai-family in compartment genai-compartment and allow group DataScience to use dedicated-ai-clusters in compartment genai-compartment

B.Allow group DataScience to use generative-ai-family in compartment genai-compartment where request.permission != 'MANAGE_DEDICATED_AI_CLUSTER'

C.Allow group DataScience to use generative-ai-family in compartment genai-compartment and allow group DataScience to inspect dedicated-ai-clusters in compartment genai-compartment

D.Allow group DataScience to use generative-ai-family in compartment genai-compartment and deny group DataScience to manage dedicated-ai-clusters in compartment genai-compartment

AnswerC

'use' on generative-ai-family allows all GenAI operations except cluster management; 'inspect' on dedicated-ai-clusters allows listing but not creating or managing clusters.

Why this answer

The 'inspect' verb on dedicated-ai-clusters prevents creating or managing clusters while allowing use of other GenAI resources. 'use' on generative-ai-family allows using the resources. Option A denies all GenAI resources, B grants full access including clusters, C uses incorrect resource type.

Practice this question →

MCQmedium

A data scientist is fine-tuning a model using T-Few in OCI Generative AI. They have prepared a dataset with prompt/completion pairs. Which file format is required for the training data upload?

A.Plain text file with one prompt-completion pair per line separated by a tab

B.Parquet file with a 'text' column containing concatenated prompt and completion

C.CSV with columns 'input' and 'output'

D.JSONL with each line containing 'prompt' and 'completion' keys

AnswerD

The training dataset must be a JSONL file where each line is a JSON object with 'prompt' and 'completion' fields.

Why this answer

OCI Generative AI fine-tuning expects training data in JSONL format with specific fields.

Practice this question →

MCQmedium

A financial services firm needs to ensure that only members of the 'DataScientists' group can use OCI Generative AI resources in the 'prod' compartment. Which IAM policy statement should be applied?

A.Allow group DataScientists to manage genai-family in tenancy

B.Allow group DataScientists to use ai-services in compartment prod

C.Allow group DataScientists to manage genai-family in compartment prod

D.Allow group DataScientists to read genai-family in compartment prod

AnswerC

This policy grants the group access to all GenAI resources in the prod compartment.

Why this answer

Option C is correct because the 'manage' verb on the 'genai-family' resource type grants full access (including use) to OCI Generative AI resources, and scoping the policy to 'compartment prod' restricts the permission to only that compartment. The requirement is to allow the 'DataScientists' group to use (which implies manage-level access for creating and running inference jobs) Generative AI resources specifically in the 'prod' compartment, not the entire tenancy.

Exam trap

The trap here is that candidates often confuse the generic 'ai-services' resource type (which exists for other AI services like OCI AI Vision or OCI Language) with the specific 'genai-family' resource type required for OCI Generative AI, leading them to select Option B.

How to eliminate wrong answers

Option A is wrong because it grants access to 'genai-family in tenancy', which would allow the group to manage Generative AI resources across all compartments, violating the requirement to restrict access to only the 'prod' compartment. Option B is wrong because 'ai-services' is not a valid resource type for OCI Generative AI; the correct resource type is 'genai-family', and using 'ai-services' would match no resources, effectively granting no access. Option D is wrong because the 'read' verb only allows viewing metadata and listing resources, but does not permit using (creating, updating, or invoking) Generative AI models, which is required for the DataScientists to actually work with the AI services.

Practice this question →

MCQeasy

A user wants to quickly test different prompts and parameters (temperature, max tokens) with various OCI Generative AI models without writing any code. Which tool should they use?

A.The InferenceClient SDK in Python

B.OCI CLI with generative-ai commands

C.The OCI Generative AI Playground

D.OCI Cloud Shell

AnswerC

The Playground is the no-code interface for experimenting with models.

Why this answer

The OCI Generative AI Playground provides an interactive web UI to test models and adjust parameters.

Practice this question →

Multi-Selectmedium

A team is planning to use OCI Generative AI to summarize large documents. They need to choose between on-demand (pay-as-you-go) and dedicated cluster pricing. Which THREE factors should they consider when deciding? (Choose three.)

Select 3 answers

A.Throughput requirements and consistency

B.Latency sensitivity of the application

C.Model accuracy on summarization tasks

D.Data residency requirements

E.Cost predictability and budget constraints

AnswersA, B, E

Dedicated clusters guarantee throughput; on-demand is shared and may throttle.

Why this answer

Latency sensitivity, throughput consistency, and cost predictability are key factors. On-demand is variable cost with potential contention; dedicated offers fixed cost with reserved capacity. Model accuracy does not differ based on pricing model, and data residency is a compliance factor but not directly tied to pricing model choice.

Practice this question →

MCQmedium

A data scientist is using the OCI Generative AI Playground to test a summarization model. They want to generate shorter summaries and avoid repetitive phrasing. Which parameter adjustments should they make?

A.Decrease temperature and increase frequency penalty

B.Increase temperature and increase presence penalty

C.Set stop sequences and disable frequency penalty

D.Increase temperature and decrease frequency penalty

AnswerA

Lower temperature makes output more deterministic (focused summaries), and higher frequency penalty discourages repetition, achieving the goal.

Why this answer

Lowering temperature reduces randomness, increasing frequency penalty penalizes repeated tokens, and setting appropriate max tokens controls output length. Stop sequences define when generation stops, and presence penalty penalizes token presence overall.

Practice this question →

MCQmedium

A company wants to build a customer service chatbot that answers questions about their internal policy documents. The documents are updated monthly, and the team cannot afford to retrain a model each time. Which approach is MOST appropriate?

A.Use a larger foundation model with a longer context window and paste all documents into each prompt

B.Train a custom model from scratch on the policy documents each month

C.Use Retrieval-Augmented Generation (RAG) with the policy documents indexed in a vector store

D.Fine-tune a base LLM on the policy documents monthly

AnswerC

RAG retrieves relevant document chunks at query time, ensuring the chatbot always answers from the latest uploaded documents without any model retraining.

Why this answer

Retrieval-Augmented Generation (RAG) is the most appropriate approach because it allows the chatbot to answer questions based on the latest policy documents without retraining the model. By indexing the documents in a vector store, the system retrieves relevant chunks at query time and passes them as context to a frozen LLM, ensuring the model always uses the most current information.

Exam trap

Cisco often tests the misconception that fine-tuning is the only way to incorporate new knowledge, but the trap here is that candidates overlook RAG as a zero-training alternative that keeps the model static while the knowledge base evolves.

How to eliminate wrong answers

Option A is wrong because pasting all documents into each prompt would exceed the context window limits of even the largest foundation models, leading to high latency, token costs, and potential loss of relevant information. Option B is wrong because training a custom model from scratch each month is prohibitively expensive, time-consuming, and requires extensive GPU resources and data preparation, which contradicts the requirement to avoid retraining. Option D is wrong because fine-tuning a base LLM monthly would still require retraining the model weights on the updated documents, incurring significant cost and effort, and does not solve the need for dynamic, up-to-date knowledge without retraining.

Practice this question →

Multi-Selectmedium

A company wants to reduce costs for a high-volume, latency-tolerant text generation workload using OCI Generative AI. Which TWO strategies should they consider?

Select 2 answers

A.Use on-demand (pay-per-token) pricing instead of a dedicated AI cluster

B.Provision a dedicated AI cluster for consistent throughput

C.Increase the temperature parameter to reduce token usage

D.Use the largest available model to ensure best quality

E.Use a smaller, faster model like Cohere Command R instead of Command R+

AnswersA, E

On-demand pricing is cost-effective for variable or high-volume workloads without dedicated infrastructure commitment.

Why this answer

On-demand pricing (pay-per-token) and using a smaller/faster model (like Command R instead of R+) are cost-saving measures. Dedicated clusters increase cost; the largest model is more expensive; increasing temperature doesn't affect pricing.

Practice this question →

MCQmedium

A security administrator needs to grant a group of data scientists the ability to use OCI Generative AI service resources in the compartment 'genai-dev'. They want to allow the group to create endpoints and run inference, but not to manage IAM policies. Which policy statement is correct?

A.Allow group DataScientists to use generative-ai-family in compartment genai-dev

B.Allow group DataScientists to manage generative-ai-family in compartment genai-dev

C.Allow group DataScientists to inspect generative-ai-family in compartment genai-dev

D.Allow group DataScientists to read generative-ai-family in compartment genai-dev

AnswerA

'use' allows creating endpoints and running inference without granting policy management.

Why this answer

Option A is correct because the 'use' verb in OCI IAM policies grants the ability to perform operational actions on resources without allowing management or administrative changes. For the OCI Generative AI service, 'use generative-ai-family' permits data scientists to create endpoints and run inference (the core actions needed), while explicitly preventing them from managing IAM policies or altering resource configurations. This aligns with the principle of least privilege required by the security administrator.

Exam trap

The trap here is that candidates often confuse 'use' with 'manage' or 'read', assuming that creating endpoints requires 'manage' permissions, but OCI's 'use' verb is specifically designed for operational tasks like running inference and creating resources within a compartment without granting full administrative rights.

How to eliminate wrong answers

Option B is wrong because 'manage' grants full administrative control, including creating, updating, and deleting resources, as well as managing IAM policies, which exceeds the required permissions. Option C is wrong because 'inspect' only allows listing resources and reading metadata, not creating endpoints or running inference. Option D is wrong because 'read' permits viewing resource details but does not allow any write or operational actions like creating endpoints or invoking inference.

Practice this question →

MCQeasy

Which OCI Generative AI model is best suited for generating embeddings from text that can be used for semantic search across multiple languages?

A.Cohere Command R+

B.Cohere embed-multilingual-v3.0

C.Meta Llama 3 70B

D.Cohere Rerank

AnswerB

This model is specifically built for multilingual embeddings, supporting semantic search across many languages.

Why this answer

Embed-multilingual-v3.0 is designed for multilingual embedding tasks. The other models are primarily for text generation or English-only embeddings.

Practice this question →

MCQmedium

A team wants to use OCI Generative AI Agents to build a RAG system that answers questions from documents stored in OCI Object Storage. What must they create first?

A.A knowledge base that indexes the documents from Object Storage

B.A fine-tuned model on the documents

C.A dedicated AI cluster for the agent

D.An OCI Functions endpoint to process documents

AnswerA

The knowledge base links the data source to the agent and enables retrieval.

Why this answer

The agent requires a knowledge base that is connected to the data source. Directly creating a session without a knowledge base and agent will fail.

Practice this question →

MCQmedium

A developer is using the OCI Generative AI Agents service to build a RAG application. They have uploaded policy PDFs to an OCI Object Storage bucket. What is the next step to make the documents searchable?

A.Create a dedicated AI cluster to host the documents

B.Use the Embeddings API to embed each PDF manually

C.Fine-tune a model on the PDFs

D.Create a knowledge base from the Object Storage bucket

AnswerD

The knowledge base indexes the documents from Object Storage and enables retrieval.

Why this answer

In OCI Generative AI Agents, you create a knowledge base from data sources like Object Storage. The agent then uses this knowledge base to retrieve relevant information for answering questions. Creating a dedicated cluster or fine-tuning is not required for RAG with agents.

Practice this question →

Multi-Selectmedium

A developer is creating a fine-tuning job for a Cohere Command R model using OCI Generative AI. Which TWO of the following are required when submitting the fine-tuning job?

Select 2 answers

A.Base model name (e.g., Cohere Command R)

B.A Dedicated AI Cluster ID

C.A pre-trained embedding model

D.Training dataset in JSONL format with prompt/completion pairs

E.Hyperparameters file in YAML format

AnswersA, D

You must specify which model to fine-tune.

Why this answer

The training dataset and the base model are mandatory inputs for a fine-tuning job. The compartment is part of the resource configuration, and the hyperparameters file is optional but not required.

Practice this question →

MCQeasy

Which input type should be used with the Cohere Embed API when generating embeddings for a query in a semantic search system?

A.search_document

B.clustering

C.search_query

D.classification

AnswerC

'search_query' is the correct input type for query embeddings in semantic search.

Why this answer

The Cohere Embed API uses the `search_query` input type specifically for embedding search queries in a semantic search system. This tells the model to optimize the embedding for matching against documents that were embedded with the `search_document` type, ensuring proper alignment in the vector space for retrieval tasks.

Exam trap

Cisco often tests the distinction between `search_query` and `search_document` to trap candidates who assume a single input type works for both sides of a semantic search system, leading them to pick `search_document` for the query.

How to eliminate wrong answers

Option A is wrong because `search_document` is used for embedding the documents in a corpus, not the query; using it for a query would produce embeddings optimized for document representation, leading to poor retrieval accuracy. Option B is wrong because `clustering` is an input type for grouping similar texts, not for search queries, and it does not align embeddings for query-document matching. Option D is wrong because `classification` is used for tasks like sentiment analysis or topic labeling, not for semantic search, and its embeddings are optimized for decision boundaries rather than retrieval.

Practice this question →

MCQeasy

You are using the OCI Generative AI Playground with a Cohere Command R model. You want the model to generate more varied and creative responses. Which parameter should you increase?

A.Temperature

B.Stop sequences

C.Max tokens

D.Frequency penalty

AnswerA

Increasing temperature (e.g., to 0.8) increases randomness and variety in the generated text, making responses more creative.

Why this answer

Temperature controls the randomness of the output. Higher temperature (e.g., 0.8) makes the model more creative and varied. Max tokens, frequency penalty, and stop sequences do not primarily affect creativity.

Practice this question →

MCQhard

During fine-tuning a model in OCI Generative AI, the training loss does not decrease after several epochs. The dataset has 5,000 prompt-completion pairs. What is the MOST likely cause?

A.The dataset is too large, causing underfitting

B.The dataset contains many duplicate prompt-completion pairs

C.The learning rate is too high, causing the loss to oscillate

D.The dataset format is incorrect (e.g., missing 'prompt' or 'completion' fields)

AnswerD

Incorrect JSONL format can cause the training to not learn effectively, as the model may not receive the expected inputs.

Why this answer

In OCI Generative AI fine-tuning, the dataset must be a JSON file with exactly 'prompt' and 'completion' fields. If these fields are missing or incorrectly named, the service cannot parse the training data, so the model never learns from the actual content, causing the training loss to remain flat across epochs. This is the most direct cause of a non-decreasing loss when the dataset structure is invalid.

Exam trap

Cisco often tests the misconception that a flat loss is always due to hyperparameter issues (like learning rate) or data quantity, when in fact the most common cause in OCI Generative AI is an incorrectly formatted dataset that prevents any learning from occurring.

How to eliminate wrong answers

Option A is wrong because 5,000 prompt-completion pairs is a moderate-sized dataset for fine-tuning; underfitting typically occurs with too small a dataset or insufficient model capacity, not with a dataset that is too large. Option B is wrong because duplicate pairs can cause overfitting or biased learning, but they would still allow the loss to decrease as the model memorizes the duplicates; they would not prevent loss reduction entirely. Option C is wrong because a learning rate that is too high causes the loss to oscillate or diverge (increase), not to remain flat and unchanged; a flat loss indicates no learning is happening, not instability.

Practice this question →

MCQeasy

Which OCI Generative AI model is designed to rerank and improve the relevance of documents retrieved by a search system?

A.Cohere Rerank

B.Cohere Embed

C.Cohere Command R

D.Meta Llama 3

AnswerA

Rerank is purpose-built for reranking search results.

Why this answer

Cohere Rerank is specifically designed for reranking documents to improve relevance ranking.

Practice this question →

MCQmedium

A data scientist needs to fine-tune a model using OCI Generative AI. They have prepared a dataset in JSONL format with prompt/completion pairs. The fine-tuning job is configured with the T-Few technique. What is a key characteristic of T-Few fine-tuning?

A.It requires unlabeled data for unsupervised pre-training before fine-tuning

B.It updates all model parameters, requiring significant compute resources

C.It modifies a small subset of parameters using lightweight adapter layers

D.It only trains a new classification head on top of a frozen base model

AnswerC

T-Few uses adapter-based fine-tuning that updates a small number of parameters while keeping most of the model frozen.

Why this answer

T-Few is a parameter-efficient fine-tuning method that updates only a small fraction of model parameters, making it faster and more resource-efficient than full fine-tuning. It does not train all parameters, add new layers, or require unlabeled data.

Practice this question →

MCQmedium

A company uses OCI Generative AI Agents to build a RAG application that answers questions from documents stored in OCI Object Storage. The knowledge base is updated daily. Which step is necessary to ensure the agent incorporates the latest documents?

A.Re-sync the knowledge base after updating the documents in Object Storage

B.Use the Embedding API to manually index each new document

C.Set the session API to refresh automatically

D.Recreate the agent each time documents change

AnswerA

Re-syncing the knowledge base indexes the new or updated documents, making them available for retrieval by the agent.

Why this answer

The knowledge base indexes the data sources. To reflect changes in the source documents, you must re-sync or re-index the knowledge base. The agent endpoint or session API does not automatically refresh content.

Practice this question →

MCQeasy

Which OCI Generative AI service component is designed to convert text into vector representations for use in semantic search?

A.Chat API

B.Summarisation API

C.Generate API

D.Embedding API

AnswerD

Embedding API produces vector embeddings from text inputs.

Why this answer

The Embedding API generates vector embeddings from text, which can be used for semantic search, clustering, etc.

Practice this question →

MCQeasy

Which OCI Generative AI service component is specifically designed to convert text into vector representations for use in semantic search and clustering?

A.Embedding

B.Chat

C.Summarisation

D.Agents

AnswerA

The Embedding API converts text into dense vector representations for search, classification, and clustering.

Why this answer

Embedding is the correct answer because it is the OCI Generative AI service component that converts text into dense vector representations (embeddings). These numerical vectors capture semantic meaning, enabling similarity comparisons for tasks like semantic search and clustering, where documents with similar vectors are considered contextually related.

Exam trap

Cisco often tests the misconception that 'Chat' or 'Summarisation' can handle semantic search tasks, but the trap here is confusing generative output (text) with representation learning (vectors), which is the unique role of Embedding.

How to eliminate wrong answers

Option B (Chat) is wrong because it is designed for interactive conversational AI, generating human-like responses in a dialogue format, not for converting text into vector representations. Option C (Summarisation) is wrong because it condenses long text into shorter summaries, producing natural language output rather than numerical vectors. Option D (Agents) is wrong because it orchestrates multi-step workflows and tool integrations, not vector generation for semantic search.

Practice this question →

MCQmedium

A developer is using the OCI GenAI Chat API to build a multi-turn customer support chatbot. They want the assistant to always introduce itself as 'SupportBot' and never mention being an AI. How should they configure the API call?

A.Use the Generate API instead and include the instruction in the prompt

B.Set the preamble override to a JSON object with the assistant's identity

C.Set the system message (preamble) to 'You are SupportBot. You must never mention that you are an AI.'

D.Set the first user message to 'Introduce yourself as SupportBot and never say you are an AI.'

AnswerC

The system message (preamble) defines the assistant's persona and instructions.

Why this answer

Option C is correct because the OCI GenAI Chat API supports a 'system message' or 'preamble' parameter that sets the assistant's behavior and identity for the entire conversation. By setting the preamble to 'You are SupportBot. You must never mention that you are an AI.', the developer enforces the desired persona and restriction across all turns, ensuring the assistant introduces itself as SupportBot and avoids any reference to being an AI.

Exam trap

Cisco often tests the distinction between single-turn (Generate API) and multi-turn (Chat API) endpoints, and the trap here is that candidates confuse the first user message with a system instruction, not realizing that only the dedicated system message parameter enforces persistent behavior across all turns.

How to eliminate wrong answers

Option A is wrong because the Generate API is a single-turn text generation endpoint that does not support multi-turn conversation context or persistent system instructions; using it would require manually managing conversation history and re-injecting the instruction in every prompt, which is inefficient and error-prone. Option B is wrong because the OCI GenAI Chat API does not accept a 'preamble override' as a JSON object; the correct parameter is a plain text string for the system message (preamble), and a JSON object would be rejected or misinterpreted. Option D is wrong because setting the first user message to include the instruction does not persist across turns; the assistant might follow it initially but could deviate in subsequent responses, and the instruction is not enforced as a system-level constraint.

Practice this question →

MCQmedium

A company wants to use OCI Generative AI Agents to build a question-answering system over documents stored in OCI Object Storage. Which component acts as the knowledge source for the agent?

A.A Dedicated AI Cluster

B.The OCI Generative AI Playground

C.An endpoint created via the InferenceClient

D.A knowledge base created from the Object Storage bucket

AnswerD

The knowledge base indexes documents from the data source (Object Storage) for retrieval.

Why this answer

In OCI Generative AI Agents, a knowledge base is the indexed representation of data sources (like Object Storage buckets). The agent uses the knowledge base to retrieve relevant information.

Practice this question →

MCQeasy

You need to convert a set of customer support tickets into vector embeddings for a similarity search application. Which OCI Generative AI model should you use?

A.Cohere Rerank

B.Cohere Embed (e.g., embed-english-v3.0)

C.Meta Llama 3

D.Cohere Command R

AnswerB

Cohere Embed models are specifically designed to generate dense vector embeddings from text, ideal for similarity search and retrieval tasks.

Why this answer

The Cohere Embed models are designed for text-to-vector embedding. The other options are for text generation or reranking.

Practice this question →

MCQmedium

A regulatory compliance team needs to restrict access to the OCI Generative AI service so that only users in the 'AI_Engineers' group can create fine-tuning jobs and endpoints. Which IAM policy statement should be used?

A.Allow group AI_Engineers to inspect generative-ai-family in compartment ABC

B.Allow group AI_Engineers to manage generative-ai-family in tenancy

C.Allow group AI_Engineers to read generative-ai-family in compartment ABC

D.Allow group AI_Engineers to use generative-ai-family in compartment ABC

AnswerD

'use' permission includes the ability to create, update, and delete generative AI resources.

Why this answer

To allow a group to manage generative AI resources, you allow them to use the 'generative-ai-family' resource type. Specific verbs like 'inspect' limit visibility but not management. 'Read' only allows viewing, not creating.

Practice this question →

MCQmedium

A data scientist is creating a fine-tuning job in OCI Generative AI. They have prepared a JSONL dataset with prompt/completion pairs. What is the correct format for each line in the JSONL file?

A.{"input": "What is OCI?", "output": "Oracle Cloud Infrastructure is a cloud computing platform."}

B.{"prompt": "What is OCI?", "completion": "Oracle Cloud Infrastructure is a cloud computing platform."}

C.{"text": "What is OCI?", "label": "Oracle Cloud Infrastructure is a cloud computing platform."}

D.{"question": "What is OCI?", "answer": "Oracle Cloud Infrastructure is a cloud computing platform."}

AnswerB

This is the correct JSON format with 'prompt' and 'completion' fields.

Why this answer

Option B is correct because the OCI Generative AI fine-tuning service expects JSONL files where each line contains exactly the keys "prompt" and "completion". This format matches the service's internal training pipeline, which maps the prompt to the input and the completion to the expected output during supervised fine-tuning.

Exam trap

The trap here is that candidates often confuse the key names with those used in other AI services (like OpenAI's fine-tuning which uses "prompt" and "completion" as well, but OCI's documentation explicitly requires these exact keys, and the exam tests attention to the specific OCI schema).

How to eliminate wrong answers

Option A is wrong because it uses "input" and "output" keys, which are not recognized by the OCI Generative AI fine-tuning API; the service requires the exact key names "prompt" and "completion". Option C is wrong because it uses "text" and "label" keys, which are commonly used in classification tasks but not for prompt/completion fine-tuning in OCI Generative AI. Option D is wrong because it uses "question" and "answer" keys, which are not the expected schema; the service strictly enforces the "prompt" and "completion" key names for JSONL lines.

Practice this question →

MCQhard

A company needs to classify customer support tickets into 20 categories. They have a labeled dataset of 50,000 examples. They want to use OCI Generative AI Embedding API to generate embeddings, then train a classifier. Which input type should they use for the embedding API when processing the training examples?

A.search_query

B.clustering

C.search_document

D.classification

AnswerD

classification input type is designed to produce embeddings that improve classifier performance.

Why this answer

For training a classifier, the 'classification' input type is optimized to generate embeddings that work well for classification tasks. Other types are for different use cases.

Practice this question →

MCQhard

A team fine-tuned a Cohere Command R model using the T-Few technique on a dataset of JSONL prompt/completion pairs. After deployment, they observe that the model's responses are too repetitive. Which parameter adjustment in the OCI Generative AI Playground would BEST address this issue?

A.Increase frequency penalty

B.Increase max tokens

C.Decrease temperature

D.Add a stop sequence

AnswerA

Frequency penalty reduces the likelihood of repeating the same tokens, directly mitigating repetition.

Why this answer

Increasing the frequency penalty discourages the model from repeating tokens that have already appeared, reducing repetitive outputs. Temperature and max tokens address randomness and length, not repetition specifically. Stop sequences are for terminating generation early.

Practice this question →

Multi-Selecthard

A machine learning engineer is fine-tuning a Cohere Command R model using OCI Generative AI. They want to evaluate the fine-tuned model's performance before deploying. Which TWO methods can they use?

Select 2 answers

A.Check the fine-tuning job's status in the OCI Console for validation metrics

B.Use the OCI Generative AI Playground to send test prompts to the fine-tuned model endpoint

C.Provision a dedicated AI cluster and monitor the cluster's latency metrics

D.Use the OCI CLI to call the model inference endpoint with test data

E.Use the Python SDK's InferenceClient to programmatically send test prompts and analyze responses

AnswersB, E

If the model is hosted on a dedicated cluster, the Playground can be configured to point to that endpoint for interactive testing.

Why this answer

Option B is correct because the OCI Generative AI Playground provides a user-friendly interface to directly test prompts against a deployed fine-tuned model endpoint, allowing you to evaluate responses interactively without writing code. Option D is correct because the OCI CLI allows you to call the model inference endpoint with test data, enabling automated or scripted evaluation of the fine-tuned model's performance.

Exam trap

Cisco often tests the distinction between training/validation metrics shown during fine-tuning (Option A) and post-deployment inference evaluation (Options B and D), leading candidates to mistakenly think job status metrics are sufficient for model evaluation.

Practice this question →

Multi-Selectmedium

A company is using OCI Generative AI Agents to build a customer support assistant. They have uploaded product manuals to OCI Object Storage. Which two components are required to create the agent? (Select TWO)

Select 2 answers

A.Create a Generative AI Agent using the knowledge base

B.Create an endpoint via InferenceClient

C.Create a knowledge base from the Object Storage bucket

D.Upload data directly to the agent (not via knowledge base)

E.Provision a Dedicated AI Cluster

AnswersA, C

The agent uses the knowledge base to answer questions.

Why this answer

Option A is correct because a Generative AI Agent requires a knowledge base to provide domain-specific context for answering queries. The knowledge base is the source of truth that the agent uses to ground its responses, and it must be explicitly created and associated with the agent. Without a knowledge base, the agent would lack the product manual data needed for customer support.

Exam trap

Cisco often tests the misconception that you can directly upload data to an agent or that an endpoint must be created manually, when in fact the agent relies on a knowledge base for data ingestion and uses a managed endpoint automatically.

Practice this question →

MCQhard

During fine-tuning a model using T-Few in OCI Generative AI, the job fails with a 'dataset format error'. The training dataset is a JSONL file. Which of the following is the MOST likely cause?

A.The file contains more than 1000 lines

B.Some lines have an extra 'metadata' key

C.The completion field contains more than 1024 tokens

D.The file uses Unix line endings

AnswerB

The dataset format expects only 'prompt' and 'completion'. Extra keys like 'metadata' are not allowed and cause validation errors.

Why this answer

Each JSONL line must contain exactly 'prompt' and 'completion' keys. Extra keys or missing keys cause format errors.

Practice this question →

Multi-Selecthard

An administrator is creating IAM policies for OCI Generative AI. They want to allow a group of developers to use (invoke) models and manage endpoints, but NOT create or delete Dedicated AI Clusters. Which TWO policy statements should be combined?

Select 2 answers

A.Allow group developers to use generative-ai-clusters in compartment dev

B.Allow group developers to manage generative-ai-family in compartment dev

C.Allow group developers to use generative-ai-family in compartment dev

D.Allow group developers to manage generative-ai-clusters in compartment dev

E.Allow group developers to manage generative-ai-endpoints in compartment dev

AnswersC, E

Why this answer

Option C is correct because the 'use' verb on 'generative-ai-family' grants permission to invoke models and perform read-only operations on all Generative AI resources, including endpoints, without allowing create or delete actions on Dedicated AI Clusters. Option E is correct because 'manage' on 'generative-ai-endpoints' allows full control over endpoints (create, update, delete, use) while still not granting any permissions on Dedicated AI Clusters. Together, these two statements give developers the ability to use models and manage endpoints but explicitly exclude create/delete on Dedicated AI Clusters.

Exam trap

Cisco often tests the misconception that 'manage generative-ai-family' is required to manage endpoints, but this over-permits by also allowing cluster management; candidates must remember that specific resource-type verbs can be combined with family-level 'use' to achieve precise access control.

Practice this question →

MCQhard

An organization requires low-latency inference for a custom fine-tuned model deployed on OCI Generative AI. The model must be isolated from other tenants. Which infrastructure choice meets these requirements?

A.Use the shared infrastructure endpoint for the fine-tuned model

B.Deploy the model on OCI Data Science model deployment

C.Provision a dedicated AI cluster with the required model units

D.Use an on-premises GPU server and call OCI GenAI via API gateway

AnswerC

A dedicated AI cluster provides isolated compute resources, ensuring low latency and no interference from other workloads.

Why this answer

A dedicated AI cluster provides isolated, low-latency inference for custom models. Shared infrastructure does not guarantee isolation and may have higher latency due to contention.

Practice this question →

MCQmedium

A.Use Retrieval-Augmented Generation (RAG) with the policy documents indexed in a vector store

B.Train a custom model from scratch on the policy documents each month

C.Use a larger foundation model with a longer context window and paste all documents into each prompt

D.Fine-tune a base LLM on the policy documents monthly

AnswerA

RAG retrieves relevant document chunks at query time, ensuring the chatbot always answers from the latest uploaded documents without any model retraining.

Why this answer

RAG (Retrieval-Augmented Generation) allows the LLM to retrieve relevant document sections at inference time, so knowledge stays current without retraining. The other options either require expensive retraining for each update or lack document grounding.

Practice this question →

MCQhard

A developer is using the OCI Generative AI Chat API with a system prompt to guide the assistant's behavior. They notice that after a few turns, the assistant starts ignoring the system instructions. What is the MOST likely cause?

A.The temperature is set too high, causing the model to ignore instructions

B.The system prompt (preamble) is only sent once and may be truncated as the conversation grows

C.The model is not fine-tuned to follow system prompts

D.The conversation history is not being sent with subsequent requests, so the model loses context

AnswerB

In the Chat API, the preamble is sent only at the start; if the conversation history exceeds the context window, the preamble may be dropped, causing the model to lose instructions.

Why this answer

The preamble is sent only on the first turn; it may be truncated or lost as conversation history grows. The developer should inject the preamble in each request or manage history carefully.

Practice this question →

MCQeasy

Which OCI Generative AI API should be used to generate a summary of a long legal document?

A.Embedding API

B.Rerank API

C.Chat API

D.Generate API

AnswerD

The Generate API can perform text generation tasks including summarization.

Why this answer

The OCI Generative AI Generate API (option D) is specifically designed for text generation tasks, including summarization, by taking a prompt and returning a coherent response. For a long legal document, you would provide the document text as part of the prompt and instruct the model to generate a summary, making the Generate API the correct choice for this use case.

Exam trap

Cisco often tests the distinction between generation and embedding/reranking APIs, trapping candidates who confuse summarization (a generation task) with retrieval or ranking tasks.

How to eliminate wrong answers

Option A is wrong because the Embedding API converts text into numerical vectors for semantic similarity or search, not for generating summaries. Option B is wrong because the Rerank API reorders a list of documents based on relevance to a query, but it does not generate new text like a summary. Option C is wrong because the Chat API is optimized for multi-turn conversational interactions, not for single-prompt summarization tasks where the Generate API provides more direct control over the output.

Practice this question →

Multi-Selecthard

A developer is building a multi-turn chatbot using the OCI Generative AI Chat API. Which THREE parameters or features should they configure to maintain coherent conversation history?

Select 3 answers

A.Temperature

B.Stop sequences

C.System prompt

D.Multi-turn conversation history

E.Preamble override

AnswersC, D, E

Defines the assistant's persona and behavior.

Why this answer

System prompt sets the assistant's behavior. Preamble override can customize the initial context. Multi-turn conversation history is managed by sending previous messages in the API call.

Temperature and stop sequences affect generation but not history management.

Practice this question →

MCQhard

A company is building a document summarization pipeline using OCI Generative AI. They need to summarize thousands of legal documents efficiently. Which approach minimizes cost while maintaining quality?

A.Use the OCI Generative AI Summarisation API with a prebuilt model

B.Fine-tune a model on legal documents and deploy on a dedicated cluster

C.Use the Chat API with a system prompt instructing summarization

D.Use the Embedding API to vectorize documents and then use a custom summarization algorithm

AnswerA

The Summarisation API is purpose-built for summarization, uses on-demand pricing, and is cost-efficient for large volumes.

Why this answer

The Summarisation API is designed for efficient, cost-effective summarization. Dedicated clusters are expensive for batch workloads, and embedding-based approaches add complexity and cost.

Practice this question →

MCQmedium

A data scientist needs to create vector embeddings for a multilingual customer feedback dataset to perform clustering analysis. Which OCI Generative AI embedding model should they choose?

A.Cohere Command R+

B.Cohere embed-english-v3.0

C.Meta Llama 3 70B

D.Cohere embed-multilingual-v3.0

AnswerD

This embedding model supports over 100 languages, suitable for multilingual clustering.

Why this answer

The embed-multilingual-v3.0 model is designed for multilingual content, making it the best choice for a multilingual dataset. The other options are either English-only or not embedding models.

Practice this question →

MCQeasy

A developer wants to interactively test different prompts and parameters (temperature, top_p, frequency_penalty) with a Cohere Command R model before integrating it into an application. Which tool should they use?

A.OCI Console Resource Manager

B.OCI SDK for Python

C.OCI Data Science Notebooks

D.OCI Generative AI Playground

AnswerD

The Playground allows interactive testing of models with adjustable parameters.

Why this answer

The OCI Generative AI Playground provides a no-code interface to test models, adjust parameters, and see real-time outputs. It is designed for experimentation before production use.

Practice this question →

MCQeasy

Which API should a developer use to send a multi-turn conversation history to an LLM, including a system message and previous user/assistant exchanges, using OCI Generative AI?

A.Summarisation API

B.Generate API

C.Chat API

D.Embedding API

AnswerC

Chat API supports system prompts and conversation history.

Why this answer

The Chat API is designed for multi-turn conversations with system prompts and history management.

Practice this question →

MCQmedium

A.Fine-tune a base LLM on the policy documents monthly

B.Use Retrieval-Augmented Generation (RAG) with the policy documents indexed in a vector store

C.Use a larger foundation model with a longer context window and paste all documents into each prompt

D.Train a custom model from scratch on the policy documents each month

AnswerB

RAG retrieves relevant document chunks at query time, ensuring the chatbot always answers from the latest uploaded documents without any model retraining.

Why this answer

Practice this question →

Multi-Selecteasy

Which TWO statements accurately describe the OCI Generative AI Playground? (Choose two.)

Select 2 answers

A.It provides a dashboard to compare outputs from multiple models side-by-side

B.It allows you to upload training datasets for fine-tuning

C.It supports both Cohere and Meta Llama models

D.It allows you to test different models interactively with adjustable parameters like temperature and max tokens

E.It can be used to provision and manage dedicated AI clusters

AnswersC, D

The Playground supports available models including Cohere and Meta Llama.

Why this answer

Option C is correct because the OCI Generative AI Playground provides access to both Cohere and Meta Llama models, allowing users to experiment with these supported model families interactively. This is a core feature of the Playground, enabling users to test and compare outputs from these models without needing to write code.

Exam trap

Cisco often tests the distinction between the Playground's interactive testing capabilities and the service's backend management features, so candidates may mistakenly think the Playground includes fine-tuning or cluster management, which are separate functionalities.

Practice this question →

MCQeasy

You want to test different prompts and parameters (temperature, max tokens) for a summarization task using a foundation model without writing any code. Which OCI tool should you use?

A.OCI Console -> Generative AI -> Models

B.OCI Generative AI Playground

C.OCI CLI with the 'oci generative-ai' commands

D.Python SDK with InferenceClient

AnswerB

The Playground is a web-based interface that allows you to select models, adjust parameters, and see outputs instantly.

Why this answer

The OCI Generative AI Playground provides an interactive UI to experiment with models and parameters. CLI, SDK, and console navigation are not for interactive experimentation.

Practice this question →

MCQhard

An organization wants to deploy a chatbot that uses a custom fine-tuned model. They have provisioned a Dedicated AI Cluster with 4 model units. During peak hours, they observe high latency and want to reduce it. What is the most cost-effective change?

A.Increase the number of model units in the dedicated cluster

B.Reduce the max_tokens parameter in the application

C.Switch to shared infrastructure to get more capacity

D.Enable response streaming

AnswerA

More model units provide additional compute capacity, reducing latency.

Why this answer

Adding more model units increases throughput and reduces latency by distributing the load. This is the direct way to improve performance on a dedicated cluster.

Practice this question →

MCQeasy

Which OCI Generative AI model family is specifically designed to convert text into vector embeddings for semantic search and clustering tasks?

A.Cohere Embed

B.Meta Llama 3

C.Cohere Command R

D.Cohere Rerank

AnswerA

Embed models produce vector embeddings for semantic search, clustering, and classification.

Why this answer

Cohere Embed models (e.g., embed-english-v3.0, embed-multilingual-v3.0) are explicitly designed for generating text embeddings. Cohere Command R and R+ are for generation, Meta Llama 3 is a general-purpose LLM, and Cohere Rerank is for re-ranking search results.

Practice this question →

MCQmedium

A.Use a larger foundation model with a longer context window and paste all documents into each prompt

B.Use Retrieval-Augmented Generation (RAG) with the policy documents indexed in a vector store

C.Train a custom model from scratch on the policy documents each month

D.Fine-tune a base LLM on the policy documents monthly

AnswerB

RAG retrieves relevant document chunks at query time, ensuring the chatbot always answers from the latest uploaded documents without any model retraining.

Why this answer

Practice this question →

MCQmedium

A company needs to generate vector embeddings for a multilingual document set to support semantic search across English and French documents. Which embedding model should they use?

A.Cohere embed-english-v3.0

B.Cohere Command R

C.Cohere embed-multilingual-v3.0

D.Meta Llama 3 8B

AnswerC

This model supports multiple languages including French and English.

Why this answer

The embed-multilingual-v3.0 model supports multiple languages, including English and French.

Practice this question →

MCQeasy

In the OCI Generative AI Playground, a developer wants to control how creative the model responses are. Which parameter should they adjust?

A.Presence penalty

B.Temperature

C.Max tokens

D.Stop sequences

AnswerB

Temperature directly controls randomness and creativity in model outputs.

Why this answer

Temperature controls randomness; higher values produce more creative outputs. The other parameters control other aspects of generation.

Practice this question →

Multi-Selectmedium

A company wants to use OCI Generative AI Agents to create a RAG-powered customer support system. Which THREE components are essential for the agent to work?

Select 3 answers

A.A knowledge base

B.A Dedicated AI Cluster

C.A data source (e.g., OCI Object Storage bucket)

D.A base LLM (e.g., Cohere Command R)

E.A fine-tuned embedding model

AnswersA, C, D

Why this answer

OCI Generative AI Agents require a knowledge base, a data source (like Object Storage), and a base LLM to generate responses. The other options are optional.

Practice this question →

MCQmedium

A data scientist needs to fine-tune a Llama 3 model for a legal document classification task. They have a dataset of 10,000 labeled examples. Which fine-tuning technique available in OCI Generative AI is most suitable for efficiently adapting the model with limited computational overhead?

A.Full fine-tuning all model parameters

B.LoRA (Low-Rank Adaptation)

C.Prefix tuning

D.T-Few fine-tuning

AnswerD

T-Few is an efficient parameter-update technique designed for fine-tuning with limited compute, available in OCI GenAI.

Why this answer

T-Few fine-tuning is a parameter-efficient technique that updates only a small number of weights, making it suitable for fine-tuning large models with limited compute. It is the technique offered by OCI GenAI for fine-tuning.

Practice this question →

MCQhard

An organization needs to deploy a fine-tuned model for real-time inference with strict latency requirements. They have provisioned a Dedicated AI Cluster with 2 model units. Which statement about this setup is accurate?

A.The cluster provides low-latency, dedicated inference for the fine-tuned model, and you are billed per model unit.

B.You must use the shared infrastructure for fine-tuned models; dedicated clusters are only for base models.

C.The cluster automatically scales model units based on request load.

D.The cluster can only host OCI’s built-in models, not custom fine-tuned models.

AnswerA

Dedicated clusters offer dedicated compute for low-latency inference, and costs are based on model units provisioned.

Why this answer

Dedicated AI Clusters provide low-latency dedicated inference, and model units are used to allocate capacity for running models, including custom fine-tuned ones. The cluster is specifically designed for hosting fine-tuned models with consistent performance.

Practice this question →

MCQeasy

Which OCI Generative AI model family is optimized for generating text embeddings that capture semantic meaning for tasks like clustering and classification?

A.Cohere Embed

B.Cohere Rerank

C.Meta Llama 3

D.Cohere Command R

AnswerA

Cohere Embed models are optimized for text embeddings, supporting tasks like clustering, classification, and search.

Why this answer

Cohere Embed models are specifically designed for embedding text into vectors. Cohere Command and Meta Llama are generative models, not embedding models.

Practice this question →

Multi-Selectmedium

A company wants to implement a retrieval-augmented generation (RAG) chatbot using OCI Generative AI Agents. Which TWO services or components are required for this solution?

Select 2 answers

A.Cohere Embed model

B.Fine-tuned model

C.Knowledge Base

D.OCI Object Storage

E.Dedicated AI Cluster

AnswersC, D

The knowledge base indexes documents and enables retrieval.

Why this answer

Option C is correct because a Knowledge Base is the core repository that stores the documents or data sources the RAG chatbot retrieves from. OCI Generative AI Agents use a knowledge base to perform retrieval-augmented generation, where the agent first retrieves relevant chunks from the knowledge base and then passes them to the generative model to produce a grounded, context-aware response.

Exam trap

The trap here is that candidates often assume a dedicated AI cluster or a specific embedding model is mandatory for RAG, when in fact OCI Generative AI Agents abstract away these details and only require a knowledge base and a data source like Object Storage.

Practice this question →

Multi-Selectmedium

A developer is using the OCI Generative AI Chat API to create a customer support bot. They want the bot to maintain a consistent personality and follow specific guidelines. Which TWO settings should they use?

Select 2 answers

A.System prompt

B.Preamble override

C.Temperature

D.Frequency penalty

E.Max tokens

AnswersA, B

System prompt defines the assistant's behavior and constraints.

Why this answer

A system prompt sets the bot's behavior and guidelines, and preamble override allows customizing the model's initial instructions. Temperature and max tokens do not define personality or guidelines.

Practice this question →