How should I use these LLM Fundamentals practice questions?

Read each scenario carefully and choose your answer before revealing the explanation. Then check why your choice was right or wrong. Repeat until the reasoning feels automatic.

Can I practise just LLM Fundamentals questions in a focused session?

Yes — use the session launcher on this page to start a 10-, 20-, 30- or 50-question session drawn entirely from the LLM Fundamentals domain.

1Z0-1127 · topic practice

LLM Fundamentals practice questions

Practise LLM Fundamentals questions for the Oracle Cloud Infrastructure Generative AI Professional (1Z0-1127) exam: original exam-style scenarios with answer choices, explanations, and analysis of common mistakes.

Courseiva uses original exam-style practice questions designed for learning and revision. The goal is to understand the concepts, recognise exam patterns, and improve through explanations — not memorise copied exam dumps.

Reviewed by Johnson Ajibi · MSc IT Security

20 questions · Domain: LLM Fundamentals

What the exam tests

What to know about LLM Fundamentals

LLM Fundamentals questions test whether you can apply the concept in context, not just recognise a definition.

How the topic appears in realistic exam-style scenarios.

Which detail in the question changes the correct answer.

How to eliminate plausible but wrong options.

How to connect the question back to the wider exam objective.

Watch out for

Common LLM Fundamentals exam traps

▸ Answering from memory before reading the full scenario.
▸ Missing a constraint such as cost, availability, security, scope or command context.
▸ Choosing a broad answer when the question asks for the most specific fix.
▸ Ignoring why the wrong options are tempting.

Practice set

LLM Fundamentals questions

20 questions · select your answer, then reveal the explanation

Question 1 · easy · multiple choice

What is the primary purpose of the self-attention mechanism in a Transformer model?

Trap 1: To generate token embeddings in parallel

Parallel generation is a property of the Transformer architecture, but not specific to self-attention's purpose.

Trap 2: To reduce the dimensionality of token embeddings

Dimensionality reduction is not the role of self-attention.

Trap 3: To encode positional information of tokens

Positional encoding, not self-attention, provides positional information.

A
To generate token embeddings in parallel
Why wrong: Parallel generation is a property of the Transformer architecture, but not specific to self-attention's purpose.
B
To reduce the dimensionality of token embeddings
Why wrong: Dimensionality reduction is not the role of self-attention.
C
To encode positional information of tokens
Why wrong: Positional encoding, not self-attention, provides positional information.
D
To compute a weighted sum of all token representations based on pairwise relevance
Why correct: Self-attention computes attention scores between every pair of tokens, then uses those scores to weight and aggregate the token representations.
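
The idea behind option D can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: it assumes the queries, keys and values are the token vectors themselves (a real Transformer layer first applies learned linear projections), and the three 2-dimensional token vectors are made up for the example.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections.

    Assumption for illustration: Q, K and V are the token vectors
    themselves. A real layer would compute them with learned weights.
    """
    d = len(tokens[0])
    outputs, all_weights = [], []
    for query in tokens:
        # Pairwise relevance of this token to every token (itself included).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in tokens]
        weights = softmax(scores)
        # Weighted sum of all token representations.
        mixed = [sum(w * value[i] for w, value in zip(weights, tokens))
                 for i in range(d)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Three hypothetical 2-dimensional token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how much one token attends to every token in the sequence. Nothing in the computation depends on token order, which is why positional information has to come from a separate positional encoding (option C).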

LLM Fundamentals questions

What is the primary purpose of the self-attention mechanism in a Transformer model?

Which of the following best describes the difference between an encoder-only model (e.g., BERT) and a decoder-only model (e.g., GPT)?

A practitioner wants to evaluate an LLM-generated summary against a human-written reference using a metric that focuses on recall of key information. Which metric is most appropriate?

A company needs to generate embeddings for a large corpus of legal documents to enable semantic search. Which type of model should they use?

Which of the following sampling strategies selects tokens based on a cumulative probability threshold from the highest probability tokens?

An OCI Generative AI practitioner observes that a Cohere Command model generates responses with outdated information about a recent event. The model was fine-tuned six months ago. Which technique should be applied to incorporate new knowledge without retraining the model?

What is the main advantage of using Byte-Pair Encoding (BPE) over word-level tokenization?

When using an LLM for code generation, a developer notices the model occasionally produces syntactically incorrect code. Which approach is most likely to reduce syntax errors while still allowing diverse output?

In a Transformer model, what is the role of positional encoding?

An LLM is being used to answer customer queries about a product catalog. The answers are fluent but sometimes include plausible-sounding but incorrect product details. What is this phenomenon called, and which technique is most effective to mitigate it?

Which of the following metrics is most suitable for evaluating a translation model's output against multiple reference translations?

An OCI user is comparing two embedding models: one with 768 dimensions and another with 1024 dimensions. Which of the following trade-offs is most relevant?

A data scientist is building a RAG pipeline on OCI. Which TWO components are essential for the retrieval step?

A team wants to reduce hallucinations in their LLM-powered question-answering system. Which TWO techniques are most effective?

An OCI practitioner is comparing BERTScore with traditional n-gram metrics (ROUGE, BLEU) for evaluating summarization. Which THREE statements about BERTScore are true?

A data scientist wants to compare the semantic similarity between two sentences generated by an LLM. Which evaluation metric is most suitable for this purpose?

Which component of the Transformer architecture allows the model to focus on different parts of the input sequence when generating each output token?

An OCI user notices that their Llama 3 model generates the same output sequence regardless of the input prompt when using default generation parameters. Which setting is most likely causing this lack of diversity?

A developer is building a code generation assistant and wants to minimize the number of API calls to the OCI Generative AI service. Which tokenization approach results in the lowest token count for a given code snippet?

An organization wants to deploy a model that can summarize long financial reports (5000+ tokens) without losing context. Which model architecture is best suited for this requirement?
