How many Fundamentals of Large Language Models questions are on the 1Z0-1127 exam?

The Fundamentals of Large Language Models domain is one of the weighted domains on the 1Z0-1127 exam. The Courseiva question bank has 128 practice questions for this domain.

Free 1Z0-1127 Fundamentals of Large Language Models Practice Questions (2026)

How can I practice Fundamentals of Large Language Models questions for 1Z0-1127?

Click any of the 128 questions listed on this page to see the full question and explanation, or use the session launcher to start a focused practice session of 10, 20, 30 or 50 questions drawn only from the Fundamentals of Large Language Models domain.

All 1Z0-1127 Fundamentals of Large Language Models questions (128)

Click any question to see the full explanation and answer options, or start a focused practice session above.

A company is deploying a large language model for a customer service chatbot. The model needs to understand industry-specific jargon and maintain low latency. Which approach best balances these requirements?

A data scientist observes that their fine-tuned LLM performs well on training data but generates repetitive and dull responses in production. What is the most likely cause and best solution?

An organization wants to use an LLM to summarize legal documents. Which consideration is most important for ensuring accurate summaries?

A developer is building a code generation assistant. The model occasionally produces syntactically correct but semantically wrong code. Which technique directly addresses semantic correctness?

A company fine-tunes an LLM on internal support tickets. After deployment, the model hallucinates company-specific product names. What is the most effective mitigation?

A team wants to evaluate an LLM's performance on a text classification task. Which metric is most appropriate for a balanced dataset?

An LLM-based application must comply with data privacy regulations by not memorizing personally identifiable information (PII). Which technique best reduces memorization of PII?

Which TWO factors most significantly influence the computational cost of fine-tuning a large language model?

Which TWO techniques are commonly used to reduce the memory footprint of LLM inference?

Which THREE are essential steps in the prompt engineering process for an LLM?

A data scientist is using a large language model to summarize customer support tickets. The model occasionally generates summaries that include hallucinated details not present in the original ticket. Which technique would best reduce hallucinations while maintaining summary quality?

An enterprise is deploying a chat application using a large language model. Users report that the model sometimes generates toxic or biased responses. Which best practice should be applied to mitigate this issue?

A team is fine-tuning a large language model for a domain-specific Q&A application. After fine-tuning, they observe that the model performs well on the training distribution but struggles with out-of-distribution (OOD) questions. Which approach would best improve OOD robustness?

A developer is using a large language model to generate code snippets. The model often produces code that is syntactically correct but functionally incorrect. What is the most effective way to improve the functional correctness of the generated code?

A company is deploying a large language model in a customer-facing chatbot. The model's responses must be both accurate and safe. Which combination of techniques should be employed?

A research team is experimenting with few-shot prompting to improve a model's performance on a complex reasoning task. They find that the model's performance degrades when the few-shot examples are too similar to each other. What is the likely cause and best remedy?

Which TWO of the following are common applications of large language models in enterprise settings?

Which THREE of the following are known limitations of large language models that practitioners must consider?

Refer to the exhibit. A developer ran the OCI CLI command shown and received the JSON output. What does the output indicate about the model's confidence and why?

Refer to the exhibit. A developer runs the OCI CLI command and receives the output. However, the text "Hello, how are you?" is actually a mix of English and French words. Why does the model assign only 0.03 to French?

You are a machine learning engineer at a large e-commerce company. You have been tasked with deploying a large language model to power a customer service chatbot that handles product returns and refunds. The model will answer customer queries based on a knowledge base of return policies and FAQs. The company has strict requirements: (1) responses must be factually accurate and grounded in the knowledge base, (2) the system must be cost-effective, and (3) latency should be under 2 seconds per response. You decide to use a pre-trained LLM from OCI Data Science and implement retrieval-augmented generation (RAG). You have two options for the retriever: a dense embedding-based retriever (e.g., using OCI AI Language embeddings) or a sparse keyword-based retriever (e.g., BM25). You also need to decide on the generation model size: a 7B parameter model or a 70B parameter model. You run a pilot test: with the dense retriever + 7B model, average latency is 1.8 seconds and accuracy is 85%. With the sparse retriever + 7B model, latency is 1.2 seconds but accuracy drops to 75%. With the 70B model (any retriever), latency exceeds 5 seconds. Which combination should you choose to meet all requirements?

A healthcare startup is building an AI assistant to help doctors draft clinical notes from patient-physician conversations. They have a large language model that is fine-tuned on medical data. During testing, they notice the model occasionally generates plausible-sounding but incorrect medical recommendations. The startup wants to deploy the assistant to assist doctors, not replace them. They have the following options: (A) Deploy the model as-is and rely on doctors to catch errors, (B) Add a disclaimer that the model may make mistakes, (C) Implement a fact-checking pipeline that cross-references outputs with a trusted medical knowledge base before presenting to doctors, (D) Reduce the model's temperature to 0 to ensure deterministic outputs. Which option best balances safety and utility?

A healthcare startup is building a chatbot to answer patient inquiries using a large language model (LLM) deployed on OCI Data Science AI Quick Actions. The chatbot must comply with HIPAA regulations, so all patient data must remain within the OCI tenancy and never be sent to third-party APIs. The team has fine-tuned a Llama 2 7B model on de-identified medical records using OCI Data Science notebooks. The model is deployed as a managed endpoint via AI Quick Actions. Early testing shows that the chatbot sometimes generates responses containing specific patient names or dates of birth that were present in the fine-tuning dataset. Moreover, the model occasionally hallucinates medication dosages that are not medically accurate. Which course of action should the team take to address both issues while maintaining HIPAA compliance?

A company wants to build a customer support chatbot using OCI Generative AI. They have a large number of historical support tickets. Which approach is most effective for leveraging this data to improve the chatbot's responses?

A developer is using the OCI Generative AI API to generate text. The responses are often too short and incomplete. Which parameter adjustment is most likely to produce longer, more complete responses?

During fine-tuning of a large language model on OCI, you notice that the model's performance on the validation set is not improving after several epochs, but the training loss continues to decrease. What is the most likely cause?

A user wants to use OCI Generative AI to generate marketing copy. They want the output to be more creative and varied. Which parameter should they adjust?

An organization is concerned about the safety of generated content. Which OCI feature allows them to define custom policies to block inappropriate outputs?

A data scientist is using the OCI Generative AI SDK to create embeddings for a large corpus of legal documents. They want to perform semantic search. Which endpoint should they use?

Which of the following best describes the role of attention in transformer models?

A company wants to deploy a private instance of a large language model on OCI for sensitive data processing. What is the recommended approach?

During inference with OCI Generative AI, you notice that the model is generating repetitive phrases. Which combination of parameters can help reduce repetition?

Which TWO statements about tokens in large language models are correct?

A data scientist is evaluating different models for a summarization task. Which two metrics are commonly used to evaluate the quality of generated summaries?

An organization is planning to use OCI Generative AI for sensitive customer data. Which three OCI services or features should they consider for data governance and security?

Refer to the exhibit. A developer runs this command and sees that the 'cohere.embed-english-v3.0' model is INACTIVE. What is the most likely cause?

Refer to the exhibit. A developer encounters this error. Which action should they take to resolve the issue?

Refer to the exhibit. A user in group GenAIUsers reports that they cannot call the OCI Generative AI API. What is the most likely issue?

A startup needs to deploy a large language model for a customer support chatbot that requires low latency and cost efficiency. They are evaluating OCI Generative AI models. Which model type is most appropriate?

An AI engineer is testing a large language model on OCI Generative AI and receives this error: 'Token limit exceeded. Maximum context length is 4096 tokens.' The prompt is 4000 tokens long. What is the most effective way to resolve the issue without losing important context?

A data scientist is designing a prompt to extract structured information (e.g., JSON) from text using an instruct model on OCI Generative AI. The model sometimes outputs additional text beyond the JSON, breaking parsing. Which prompt engineering technique is most effective to enforce structured output?

A company wants to build a retrieval-augmented generation (RAG) system using OCI Generative AI and a vector database. Which model type should they use to convert documents into vector embeddings?

A developer is reviewing the model card for an LLM on OCI Generative AI and notices it was trained on a dataset that is predominantly English. The application will serve users in multiple languages. What is the most likely limitation of using this model without additional steps?

An architect is optimizing an LLM application that processes long documents. The model has a 4096 token limit, but the documents are often 8000 tokens. They are using a chunking strategy. However, model responses sometimes miss key information that spans across chunks. Which technique most directly addresses this issue?

A researcher wants to compare the performance of two LLMs on OCI Generative AI: a base model and an instruct model. They notice the instruct model often refuses to generate certain types of content. Which factor most likely explains this behavior?

A team is fine-tuning an LLM on OCI Generative AI for a domain-specific task. They have a dataset of 10,000 labeled examples. What is a best practice to avoid catastrophic forgetting during fine-tuning?

A company uses an LLM to generate product descriptions. The outputs are consistently too verbose and include irrelevant details. The prompt includes a simple instruction: 'Describe the product.' Which adjustment to the prompt is most likely to yield concise, relevant descriptions?

Which TWO factors most directly impact the consistency of text generated by an LLM when the same prompt is used multiple times?

Which THREE are known challenges when deploying large language models in production?

Which TWO are advantages of using retrieval-augmented generation (RAG) over fine-tuning for incorporating new knowledge?

Based on the exhibit, which model is best suited for a conversational chatbot that needs to handle multi-turn dialogues?

A developer in the GenAIDevelopers group tries to call the OCI Generative AI inference API but receives an unauthorized error. Which statement best explains the issue?

Based on the exhibit, what is the primary action the developer must take to successfully make the inference request?

A company uses OCI Generative AI service to power a chatbot. After deployment, the chatbot starts generating inappropriate responses. Which action should be taken first?

A data scientist wants to improve the accuracy of a summarization model on medical texts. Which OCI service feature is most suitable?

An architect needs to ensure that an LLM deployed in OCI does not reveal sensitive information in its outputs. Which technique should be used?

A developer notices that an LLM's responses are too verbose. Which parameter adjustment would most effectively reduce verbosity?

A team wants to deploy an LLM for real-time inference with low latency. Which OCI deployment option is best?

An AI specialist is troubleshooting why a fine-tuned model produces inconsistent results across different inference calls. What is the most likely cause?

A company uses RAG (Retrieval-Augmented Generation) with OCI OpenSearch and OCI Generative AI. The system retrieves irrelevant documents. What is the first step to debug?

An OCI administrator wants to limit which users can invoke a specific LLM endpoint. Which resource type should be used?

A developer is using the OCI Generative AI Python SDK. They receive a 400 error 'InvalidParameter'. What is the most likely reason?

Which TWO of the following are benefits of using OCI Generative AI service compared to self-hosting an LLM?

Which THREE factors should be considered when choosing between a fine-tuning and a prompt engineering approach?

Which TWO techniques can help reduce bias in LLM outputs?

Refer to the exhibit. What is the primary reason the response is incomplete?

Refer to the exhibit. What is the solution?

Refer to the exhibit. A user in GenAI-Users group tries to run a text generation inference but gets permission denied. What is the most likely issue?

A company wants to use OCI Generative AI to summarize customer reviews. Which model parameter should be adjusted to control the creativity of the summary?

A machine learning engineer is fine-tuning a model on OCI Data Science and notices that the training loss decreases but then suddenly increases. What is the most likely cause?

An engineer sets beam search width to 1 during inference on OCI Generative AI. What is the most likely effect on output?

Which technique allows an LLM to be adapted to a new task with only a few examples?

A model generates code with security issues. Which approach is best to mitigate this?

During multi-turn conversation with an OCI GenAI model, the model repeats user messages from earlier turns. What is the most likely cause?

What is the role of the softmax function in the output layer of an LLM?

A developer uses OCI Generative AI's chat endpoint with a system message placed after user messages. The model ignores the system message. What is the most likely reason?

An OCI GenAI model generates English to French translation. Which metric is most appropriate to evaluate its quality?

Which TWO factors are most likely to cause hallucinations in LLMs?

Which THREE techniques are commonly used to improve the quality of text generation?

Which TWO are advantages of using LoRA for fine-tuning?

Refer to the exhibit. A user in group GenAIGroup cannot see models in the Production compartment using OCI Generative AI. What is the most likely issue?

Refer to the exhibit. The output is very short and cuts off mid-sentence. Which parameter is most likely the cause?

Refer to the exhibit. A user deployed a custom model via OCI Data Science and registered it in the Model Catalog. They use the correct OCID but get this error. What is the most likely issue?

A data scientist is using OCI Data Science to fine-tune a Cohere command model on domain-specific documents. They observe that the fine-tuned model generates repetitive text. What is the most likely cause?

A company uses OCI Generative AI to power a chatbot for customer support. They notice that the model's responses sometimes contain factual inaccuracies. Which strategy would best reduce hallucination?

An architect is designing a multi-tenant application using OCI Generative AI. Each tenant has custom instructions and data. To minimize cost while maintaining isolation, which deployment approach is recommended?

Which two factors are essential for calculating the cost of using OCI Generative AI for text generation? (Choose two.)

Which three techniques are commonly used to reduce the risk of prompt injection in LLM applications? (Choose three.)

Which three statements about transformer architecture are correct? (Choose three.)

An OCI AI Language text classification request returns the output shown. Which conclusion is most accurate?

A developer sends this request but receives an error: "modelId not found". Which is the most likely cause?

The job fails with "InvalidParameter: trainingDatasetUri". What should the administrator check first?

An OCI GenAI practitioner wants to deploy a model that can generate code from natural language descriptions. Which type of model is most suitable?

During fine-tuning of a Cohere model on OCI Data Science, the loss curve shows a sharp spike after epoch 3. What is the most appropriate action?

A team uses OCI Generative AI's summarization feature to condense legal documents. The summaries sometimes omit critical clauses. Which parameter adjustment is most likely to improve completeness?

Which OCI service provides pre-trained models for custom text classification without requiring fine-tuning?

A developer runs an OCI GenAI chat request with system prompt "You are a sarcastic assistant." The output is offensive. How can the developer enforce safety policies?

A data engineer wants to migrate a large corpus of PDFs to OCI for use with GenAI. Which storage and preprocessing approach is most efficient for RAG?

A developer is using OCI GenAI to generate structured data. They often get responses that include additional commentary or markdown. Which prompt engineering technique should they use to ensure only JSON output?

A user has a prompt that exceeds the model's token limit. What is the best practice to handle this?

A startup needs to deploy an LLM for a simple FAQ chatbot on OCI with low latency. Which model choice is most appropriate?

A company wants to create a chatbot that answers questions based on a large internal document set that is updated weekly. They have limited ML expertise. Which approach is recommended?

An application using OCI GenAI experiences high response times. Which change will most directly reduce latency?

A multi-turn chatbot needs to maintain context across user queries. The context window is limited. What design should be used?

A financial institution uses an LLM for generating investment advice. They are concerned about hallucinations. Which method is most effective?

A development team wants to generate code snippets from natural language. Which model strategy should they adopt?

An AI assistant needs to solve complex math word problems step by step. Which prompting technique is most suitable?

Which two are essential components of the Transformer architecture? (Select TWO)

Which three factors most significantly affect the quality of an LLM's output? (Select THREE)

Which three characteristics of LLMs can lead to hallucinations? (Select THREE)

A developer integrates OCI GenAI into a mobile app to provide product descriptions. The responses sometimes include explanations or questions instead of the requested format. The developer is using a simple prompt: 'Describe product X.' The app expects a single paragraph. Which corrective action should the developer take?

A customer support company uses Cohere Command on OCI to answer user queries. They have enabled grounding with a knowledge base of product manuals. However, for about 20% of queries, the model provides incorrect product recommendations that are not in the manuals. The team has verified the knowledge base is up to date. What is the most likely cause and solution?

A machine learning team is fine-tuning a 7B parameter Llama 2 model on a custom dataset of 10,000 documents using OCI Data Science and GPU instances. They encounter out-of-memory (OOM) errors during the fine-tuning process. They are using a batch size of 8 and a sequence length of 2048. They cannot increase the GPU memory. Which change should they prioritize to resolve the OOM?

A company is building a chatbot using OCI Generative AI service. They want to ensure that the model responses are grounded in their internal knowledge base. Which approach should they use?

A data scientist is using OCI Data Science with the Generative AI service to fine-tune a Cohere Command model on a custom dataset of customer support tickets. After training, the model produces poor, irrelevant responses. What is the most likely cause?

An enterprise wants to deploy a large language model for processing sensitive internal documents. They must ensure that data does not leave their OCI tenancy. Which OCI GenAI deployment option meets this requirement?

An organization is implementing a RAG system using OCI GenAI. Which two are best practices for optimizing retrieval and generation? (Choose two.)

A developer is evaluating OCI GenAI model families. Which three are correct characteristics of the available models? (Choose three.)

A healthcare company is using OCI GenAI to generate patient summaries from clinical notes. The model output sometimes includes hallucinated medical facts, such as incorrect dosages or diagnoses, which could be dangerous. The team needs to improve factual accuracy while maintaining data privacy. They have a large collection of internal medical knowledge bases (clinical guidelines, drug databases) that are stored in OCI Object Storage. The current implementation uses a zero-shot prompt with the base Cohere Command model. The data science team has limited GPU resources and wants to avoid building a complex pipeline. Which course of action best addresses the hallucination problem?

An e-commerce company fine-tuned a Cohere Command model on their product catalog to generate product descriptions. During inference, they notice the model outputs are too repetitive: it often repeats similar phrases across different products, and the descriptions lack diversity. The team wants to increase the variety of the generated text without sacrificing relevance. They are currently using temperature=0.8, top_p=0.9, frequency_penalty=0, and presence_penalty=0. Which parameter adjustment should they make to most effectively increase diversity?

A financial institution uses OCI GenAI to power a customer support chatbot. The compliance team requires that responses are strictly consistent with regulatory guidelines and approved responses. The company has a curated set of question-answer pairs that cover common scenarios. They want to ensure that the chatbot never deviates from these approved answers. The data science team is considering various approaches to enforce this consistency. Which approach is most effective?

A research team is using OCI Data Science and OCI GenAI to build a multilingual chatbot for customer service. They have training data in English, Spanish, and French. The model currently struggles with code-switching—users often mix languages in a single query (e.g., 'Quiero cancel my order'), and the model responds inconsistently, sometimes in English, sometimes mixing incorrectly. The team wants to improve performance on code-switching while maintaining fluency in each language. They have limited compute resources and cannot deploy separate models per language. Which approach should they take?

A company is using OCI GenAI with a Dedicated AI Cluster to serve a large language model for real-time chat applications. They notice high inference latency (average 2 seconds per response) and want to reduce it to under 500 milliseconds without significantly degrading the quality of responses. The cluster is configured with NVIDIA A100 GPUs. The model is the base Cohere Command model (52B parameters). They have explored increasing batch size, but that increases latency for interactive use cases. Which action should they take?

A developer is testing the OCI Generative AI API by sending a request to generate text using the Cohere Command R model. The request returns the following error: 'The model 'cohere.command-r-08-2024' is not available in this region. Please check the model availability in your region.' The developer is using the us-ashburn-1 region. What is the most likely cause of this error?

A company uses OCI GenAI to build a content moderation system that filters toxic language in user-generated comments. They have a small labeled dataset of 1,000 comments (500 toxic, 500 non-toxic) and need an efficient solution that balances accuracy, cost, and latency. They are considering different model options: fine-tuning a large LLM (e.g., Cohere Command), using a pre-trained LLM with prompting, fine-tuning a smaller BERT-based classifier, or building a rule-based system. The team has moderate ML experience and wants to deploy using OCI Data Science. Which approach is most efficient for this binary classification task?

Which TWO statements about large language model (LLM) capabilities are correct?

Refer to the exhibit. A data scientist runs this inference request and receives a response that is incomplete and seems to stop mid-sentence. Which parameter should be adjusted to allow the model to generate longer outputs?

A multinational corporation is deploying a generative AI chatbot for customer support using Oracle Cloud Infrastructure's Generative AI service. The chatbot is powered by a large language model (LLM) accessed via the on-demand serving mode. During initial testing, the chatbot provides accurate answers for well-known products but frequently hallucinates or gives incorrect specifications for niche products. The company maintains a comprehensive internal database of product specifications, updated daily. The support team prefers not to fine-tune the LLM due to cost and maintenance overhead. Additionally, the chatbot must respond within 2 seconds to maintain a good customer experience. The team considers several approaches: A. Increasing the 'temperature' parameter to make the model more creative, hoping it will generate more accurate responses when unsure. B. Using few-shot prompting with three manually curated examples of correct product specifications included in every prompt. C. Implementing a Retrieval Augmented Generation (RAG) pipeline that retrieves relevant product documents from the internal database and prepends them to the prompt before inference. D. Reducing the 'topP' parameter to 0.1 to force the model to sample only from the highest probability tokens, thereby reducing randomness. Which approach best meets the requirements of improving factual accuracy while maintaining low latency?

Other 1Z0-1127 exam domains

Using OCI Generative AI Service

Building LLM Applications with RAG and Vector Search

Deploying and Managing Generative AI on OCI

Frequently asked questions

What does the Fundamentals of Large Language Models domain cover on the 1Z0-1127 exam?

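The Fundamentals of Large Language Models domain covers the key concepts tested in this area of the 1Z0-1127 exam blueprint published by Oracle. Judging by the practice questions on this page, that includes transformer architecture and attention, tokens and context limits, decoding parameters such as temperature, prompt engineering, fine-tuning, retrieval-augmented generation (RAG), and hallucination mitigation. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all 1Z0-1127 domains, with no account required.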

How many Fundamentals of Large Language Models questions are in the 1Z0-1127 question bank?

The Courseiva 1Z0-1127 question bank contains 128 questions in the Fundamentals of Large Language Models domain. Click any question to see the full explanation and answer breakdown.

What is the best way to practice Fundamentals of Large Language Models for 1Z0-1127?

Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.

Can I practice only Fundamentals of Large Language Models questions for 1Z0-1127?

Yes — the session launcher on this page draws questions exclusively from the Fundamentals of Large Language Models domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.
