Practice 1Z0-1127 Fundamentals of Large Language Models questions with full explanations on every answer.
Fundamentals of Large Language Models
1. A company is deploying a large language model for a customer service chatbot. The model needs to understand industry-specific jargon and maintain low latency. Which approach best balances these requirements?
2. A data scientist observes that their fine-tuned LLM performs well on training data but generates repetitive and dull responses in production. What is the most likely cause and best solution?
3. An organization wants to use an LLM to summarize legal documents. Which consideration is most important for ensuring accurate summaries?
4. A developer is building a code generation assistant. The model occasionally produces syntactically correct but semantically wrong code. Which technique directly addresses semantic correctness?
5. A company fine-tunes an LLM on internal support tickets. After deployment, the model hallucinates company-specific product names. What is the most effective mitigation?
6. A team wants to evaluate an LLM's performance on a text classification task. Which metric is most appropriate for a balanced dataset?
7. An LLM-based application must comply with data privacy regulations by not memorizing personally identifiable information (PII). Which technique best reduces memorization of PII?
8. Which TWO factors most significantly influence the computational cost of fine-tuning a large language model?
9. Which TWO techniques are commonly used to reduce the memory footprint of LLM inference?
10. Which THREE are essential steps in the prompt engineering process for an LLM?
11. A data scientist is using a large language model to summarize customer support tickets. The model occasionally generates summaries that include hallucinated details not present in the original ticket. Which technique would best reduce hallucinations while maintaining summary quality?
12. An enterprise is deploying a chat application using a large language model. Users report that the model sometimes generates toxic or biased responses. Which best practice should be applied to mitigate this issue?
13. A team is fine-tuning a large language model for a domain-specific Q&A application. After fine-tuning, they observe that the model performs well on the training distribution but struggles with out-of-distribution (OOD) questions. Which approach would best improve OOD robustness?
14. A developer is using a large language model to generate code snippets. The model often produces code that is syntactically correct but functionally incorrect. What is the most effective way to improve the functional correctness of the generated code?
15. A company is deploying a large language model in a customer-facing chatbot. The model's responses must be both accurate and safe. Which combination of techniques should be employed?
16. A research team is experimenting with few-shot prompting to improve a model's performance on a complex reasoning task. They find that the model's performance degrades when the few-shot examples are too similar to each other. What is the likely cause and best remedy?
17. Which TWO of the following are common applications of large language models in enterprise settings?
18. Which THREE of the following are known limitations of large language models that practitioners must consider?
19. Refer to the exhibit. A developer ran the OCI CLI command shown and received the JSON output. What does the output indicate about the model's confidence and why?
20. Refer to the exhibit. A developer runs the OCI CLI command and receives the output. However, the text "Hello, how are you?" is actually a mix of English and French words. Why does the model assign only 0.03 to French?
21. You are a machine learning engineer at a large e-commerce company. You have been tasked with deploying a large language model to power a customer service chatbot that handles product returns and refunds. The model will answer customer queries based on a knowledge base of return policies and FAQs. The company has strict requirements: (1) responses must be factually accurate and grounded in the knowledge base, (2) the system must be cost-effective, and (3) latency should be under 2 seconds per response. You decide to use a pre-trained LLM from OCI Data Science and implement retrieval-augmented generation (RAG). You have two options for the retriever: a dense embedding-based retriever (e.g., using OCI AI Language embeddings) or a sparse keyword-based retriever (e.g., BM25). You also need to decide on the generation model size: a 7B parameter model or a 70B parameter model. You run a pilot test: with the dense retriever + 7B model, average latency is 1.8 seconds and accuracy is 85%. With the sparse retriever + 7B model, latency is 1.2 seconds but accuracy drops to 75%. With the 70B model (any retriever), latency exceeds 5 seconds. Which combination should you choose to meet all requirements?
22. A healthcare startup is building an AI assistant to help doctors draft clinical notes from patient-physician conversations. They have a large language model that is fine-tuned on medical data. During testing, they notice the model occasionally generates plausible-sounding but incorrect medical recommendations. The startup wants to deploy the assistant to assist doctors, not replace them. They have the following options: (A) Deploy the model as-is and rely on doctors to catch errors, (B) Add a disclaimer that the model may make mistakes, (C) Implement a fact-checking pipeline that cross-references outputs with a trusted medical knowledge base before presenting to doctors, (D) Reduce the model's temperature to 0 to ensure deterministic outputs. Which option best balances safety and utility?
23. A healthcare startup is building a chatbot to answer patient inquiries using a large language model (LLM) deployed on OCI Data Science AI Quick Actions. The chatbot must comply with HIPAA regulations, so all patient data must remain within the OCI tenancy and never be sent to third-party APIs. The team has fine-tuned a Llama 2 7B model on de-identified medical records using OCI Data Science notebooks. The model is deployed as a managed endpoint via AI Quick Actions. Early testing shows that the chatbot sometimes generates responses containing specific patient names or dates of birth that were present in the fine-tuning dataset. Moreover, the model occasionally hallucinates medication dosages that are not medically accurate. Which course of action should the team take to address both issues while maintaining HIPAA compliance?
24. A company wants to build a customer support chatbot using OCI Generative AI. They have a large number of historical support tickets. Which approach is most effective for leveraging this data to improve the chatbot's responses?
25. A developer is using the OCI Generative AI API to generate text. The responses are often too short and incomplete. Which parameter adjustment is most likely to produce longer, more complete responses?
26. During fine-tuning of a large language model on OCI, you notice that the model's performance on the validation set is not improving after several epochs, but the training loss continues to decrease. What is the most likely cause?
27. A user wants to use OCI Generative AI to generate marketing copy. They want the output to be more creative and varied. Which parameter should they adjust?
28. An organization is concerned about the safety of generated content. Which OCI feature allows them to define custom policies to block inappropriate outputs?
29. A data scientist is using the OCI Generative AI SDK to create embeddings for a large corpus of legal documents. They want to perform semantic search. Which endpoint should they use?
30. Which of the following best describes the role of attention in transformer models?
31. A company wants to deploy a private instance of a large language model on OCI for sensitive data processing. What is the recommended approach?
32. During inference with OCI Generative AI, you notice that the model is generating repetitive phrases. Which combination of parameters can help reduce repetition?
33. Which TWO statements about tokens in large language models are correct?
34. A data scientist is evaluating different models for a summarization task. Which two metrics are commonly used to evaluate the quality of generated summaries?
35. An organization is planning to use OCI Generative AI for sensitive customer data. Which three OCI services or features should they consider for data governance and security?
36. Refer to the exhibit. A developer runs this command and sees that the 'cohere.embed-english-v3.0' model is INACTIVE. What is the most likely cause?
37. Refer to the exhibit. A developer encounters this error. Which action should they take to resolve the issue?
38. Refer to the exhibit. A user in group GenAIUsers reports that they cannot call the OCI Generative AI API. What is the most likely issue?
39. A startup needs to deploy a large language model for a customer support chatbot that requires low latency and cost efficiency. They are evaluating OCI Generative AI models. Which model type is most appropriate?
40. An AI engineer is testing a large language model on OCI Generative AI and receives this error: 'Token limit exceeded. Maximum context length is 4096 tokens.' The prompt is 4000 tokens long. What is the most effective way to resolve the issue without losing important context?
41. A data scientist is designing a prompt to extract structured information (e.g., JSON) from text using an instruct model on OCI Generative AI. The model sometimes outputs additional text beyond the JSON, breaking parsing. Which prompt engineering technique is most effective to enforce structured output?
42. A company wants to build a retrieval-augmented generation (RAG) system using OCI Generative AI and a vector database. Which model type should they use to convert documents into vector embeddings?
43. A developer is reviewing the model card for an LLM on OCI Generative AI and notices it was trained on a dataset that is predominantly English. The application will serve users in multiple languages. What is the most likely limitation of using this model without additional steps?
44. An architect is optimizing an LLM application that processes long documents. The model has a 4096 token limit, but the documents are often 8000 tokens. They are using a chunking strategy. However, model responses sometimes miss key information that spans across chunks. Which technique most directly addresses this issue?
45. A researcher wants to compare the performance of two LLMs on OCI Generative AI: a base model and an instruct model. They notice the instruct model often refuses to generate certain types of content. Which factor most likely explains this behavior?
46. A team is fine-tuning an LLM on OCI Generative AI for a domain-specific task. They have a dataset of 10,000 labeled examples. What is a best practice to avoid catastrophic forgetting during fine-tuning?
47. A company uses an LLM to generate product descriptions. The outputs are consistently too verbose and include irrelevant details. The prompt includes a simple instruction: 'Describe the product.' Which adjustment to the prompt is most likely to yield concise, relevant descriptions?
48. Which TWO factors most directly impact the consistency of text generated by an LLM when the same prompt is used multiple times?
49. Which THREE are known challenges when deploying large language models in production?
50. Which TWO are advantages of using retrieval-augmented generation (RAG) over fine-tuning for incorporating new knowledge?
51. Based on the exhibit, which model is best suited for a conversational chatbot that needs to handle multi-turn dialogues?
52. A developer in the GenAIDevelopers group tries to call the OCI Generative AI inference API but receives an unauthorized error. Which statement best explains the issue?
53. Based on the exhibit, what is the primary action the developer must take to successfully make the inference request?
54. A company uses OCI Generative AI service to power a chatbot. After deployment, the chatbot starts generating inappropriate responses. Which action should be taken first?
55. A data scientist wants to improve the accuracy of a summarization model on medical texts. Which OCI service feature is most suitable?
56. An architect needs to ensure that an LLM deployed in OCI does not reveal sensitive information in its outputs. Which technique should be used?
57. A developer notices that an LLM's responses are too verbose. Which parameter adjustment would most effectively reduce verbosity?
58. A team wants to deploy an LLM for real-time inference with low latency. Which OCI deployment option is best?
59. An AI specialist is troubleshooting why a fine-tuned model produces inconsistent results across different inference calls. What is the most likely cause?
60. A company uses RAG (Retrieval-Augmented Generation) with OCI OpenSearch and OCI Generative AI. The system retrieves irrelevant documents. What is the first step to debug?
61. An OCI administrator wants to limit which users can invoke a specific LLM endpoint. Which resource type should be used?
62. A developer is using the OCI Generative AI Python SDK. They receive a 400 error 'InvalidParameter'. What is the most likely reason?
63. Which TWO of the following are benefits of using OCI Generative AI service compared to self-hosting an LLM?
64. Which THREE factors should be considered when choosing between a fine-tuning and a prompt engineering approach?
65. Which TWO techniques can help reduce bias in LLM outputs?
66. Refer to the exhibit. What is the primary reason the response is incomplete?
67. Refer to the exhibit. What is the solution?
68. Refer to the exhibit. A user in GenAI-Users group tries to run a text generation inference but gets permission denied. What is the most likely issue?
69. A company wants to use OCI Generative AI to summarize customer reviews. Which model parameter should be adjusted to control the creativity of the summary?
70. A machine learning engineer is fine-tuning a model on OCI Data Science and notices that the training loss decreases but then suddenly increases. What is the most likely cause?
71. An engineer sets beam search width to 1 during inference on OCI Generative AI. What is the most likely effect on output?
72. Which technique allows an LLM to be adapted to a new task with only a few examples?
73. A model generates code with security issues. Which approach is best to mitigate this?
74. During multi-turn conversation with an OCI GenAI model, the model repeats user messages from earlier turns. What is the most likely cause?
75. What is the role of the softmax function in the output layer of an LLM?
76. A developer uses OCI Generative AI's chat endpoint with a system message placed after user messages. The model ignores the system message. What is the most likely reason?
77. An OCI GenAI model generates English to French translation. Which metric is most appropriate to evaluate its quality?
78. Which TWO factors are most likely to cause hallucinations in LLMs?
79. Which THREE techniques are commonly used to improve the quality of text generation?
80. Which TWO are advantages of using LoRA for fine-tuning?
81. Refer to the exhibit. A user in group GenAIGroup cannot see models in the Production compartment using OCI Generative AI. What is the most likely issue?
82. Refer to the exhibit. The output is very short and cuts off mid-sentence. Which parameter is most likely the cause?
83. Refer to the exhibit. A user deployed a custom model via OCI Data Science and registered it in the Model Catalog. They use the correct OCID but get this error. What is the most likely issue?
84. A data scientist is using OCI Data Science to fine-tune a Cohere command model on domain-specific documents. They observe that the fine-tuned model generates repetitive text. What is the most likely cause?
85. A company uses OCI Generative AI to power a chatbot for customer support. They notice that the model's responses sometimes contain factual inaccuracies. Which strategy would best reduce hallucination?
86. An architect is designing a multi-tenant application using OCI Generative AI. Each tenant has custom instructions and data. To minimize cost while maintaining isolation, which deployment approach is recommended?
87. Which two factors are essential for calculating the cost of using OCI Generative AI for text generation? (Choose two.)
88. Which three techniques are commonly used to reduce the risk of prompt injection in LLM applications? (Choose three.)
89. Which three statements about transformer architecture are correct? (Choose three.)
90. An OCI AI Language text classification request returns the output shown. Which conclusion is most accurate?
91. A developer sends this request but receives an error: "modelId not found". Which is the most likely cause?
92. The job fails with "InvalidParameter: trainingDatasetUri". What should the administrator check first?
93. An OCI GenAI practitioner wants to deploy a model that can generate code from natural language descriptions. Which type of model is most suitable?
94. During fine-tuning of a Cohere model on OCI Data Science, the loss curve shows a sharp spike after epoch 3. What is the most appropriate action?
95. A team uses OCI Generative AI's summarization feature to condense legal documents. The summaries sometimes omit critical clauses. Which parameter adjustment is most likely to improve completeness?
96. Which OCI service provides pre-trained models for custom text classification without requiring fine-tuning?
97. A developer runs an OCI GenAI chat request with system prompt "You are a sarcastic assistant." The output is offensive. How can the developer enforce safety policies?
98. A data engineer wants to migrate a large corpus of PDFs to OCI for use with GenAI. Which storage and preprocessing approach is most efficient for RAG?
99. A developer is using OCI GenAI to generate structured data. They often get responses that include additional commentary or markdown. Which prompt engineering technique should they use to ensure only JSON output?
100. A user has a prompt that exceeds the model's token limit. What is the best practice to handle this?
101. A startup needs to deploy an LLM for a simple FAQ chatbot on OCI with low latency. Which model choice is most appropriate?
102. A company wants to create a chatbot that answers questions based on a large internal document set that is updated weekly. They have limited ML expertise. Which approach is recommended?
103. An application using OCI GenAI experiences high response times. Which change will most directly reduce latency?
104. A multi-turn chatbot needs to maintain context across user queries. The context window is limited. What design should be used?
105. A financial institution uses an LLM for generating investment advice. They are concerned about hallucinations. Which method is most effective?
106. A development team wants to generate code snippets from natural language. Which model strategy should they adopt?
107. An AI assistant needs to solve complex math word problems step by step. Which prompting technique is most suitable?
108. Which two are essential components of the Transformer architecture? (Select TWO)
109. Which three factors most significantly affect the quality of an LLM's output? (Select THREE)
110. Which three characteristics of LLMs can lead to hallucinations? (Select THREE)
111. A developer integrates OCI GenAI into a mobile app to provide product descriptions. The responses sometimes include explanations or questions instead of the requested format. The developer is using a simple prompt: 'Describe product X.' The app expects a single paragraph. Which corrective action should the developer take?
112. A customer support company uses Cohere Command on OCI to answer user queries. They have enabled grounding with a knowledge base of product manuals. However, for about 20% of queries, the model provides incorrect product recommendations that are not in the manuals. The team has verified the knowledge base is up to date. What is the most likely cause and solution?
113. A machine learning team is fine-tuning a 7B parameter Llama 2 model on a custom dataset of 10,000 documents using OCI Data Science and GPU instances. They encounter out-of-memory (OOM) errors during the fine-tuning process. They are using a batch size of 8 and a sequence length of 2048. They cannot increase the GPU memory. Which change should they prioritize to resolve the OOM?
114. A company is building a chatbot using OCI Generative AI service. They want to ensure that the model responses are grounded in their internal knowledge base. Which approach should they use?
115. A data scientist is using OCI Data Science with the Generative AI service to fine-tune a Cohere Command model on a custom dataset of customer support tickets. After training, the model produces poor, irrelevant responses. What is the most likely cause?
116. An enterprise wants to deploy a large language model for processing sensitive internal documents. They must ensure that data does not leave their OCI tenancy. Which OCI GenAI deployment option meets this requirement?
117. An organization is implementing a RAG system using OCI GenAI. Which two are best practices for optimizing retrieval and generation? (Choose two.)
118. A developer is evaluating OCI GenAI model families. Which three are correct characteristics of the available models? (Choose three.)
119. A healthcare company is using OCI GenAI to generate patient summaries from clinical notes. The model output sometimes includes hallucinated medical facts, such as incorrect dosages or diagnoses, which could be dangerous. The team needs to improve factual accuracy while maintaining data privacy. They have a large collection of internal medical knowledge bases (clinical guidelines, drug databases) that are stored in OCI Object Storage. The current implementation uses a zero-shot prompt with the base Cohere Command model. The data science team has limited GPU resources and wants to avoid building a complex pipeline. Which course of action best addresses the hallucination problem?
120. An e-commerce company fine-tuned a Cohere Command model on their product catalog to generate product descriptions. During inference, they notice the model outputs are too repetitive: it often repeats similar phrases across different products, and the descriptions lack diversity. The team wants to increase the variety of the generated text without sacrificing relevance. They are currently using temperature=0.8, top_p=0.9, frequency_penalty=0, and presence_penalty=0. Which parameter adjustment should they make to most effectively increase diversity?
121. A financial institution uses OCI GenAI to power a customer support chatbot. The compliance team requires that responses are strictly consistent with regulatory guidelines and approved responses. The company has a curated set of question-answer pairs that cover common scenarios. They want to ensure that the chatbot never deviates from these approved answers. The data science team is considering various approaches to enforce this consistency. Which approach is most effective?
122. A research team is using OCI Data Science and OCI GenAI to build a multilingual chatbot for customer service. They have training data in English, Spanish, and French. The model currently struggles with code-switching: users often mix languages in a single query (e.g., 'Quiero cancel my order'), and the model responds inconsistently, sometimes in English, sometimes mixing incorrectly. The team wants to improve performance on code-switching while maintaining fluency in each language. They have limited compute resources and cannot deploy separate models per language. Which approach should they take?
123. A company is using OCI GenAI with a Dedicated AI Cluster to serve a large language model for real-time chat applications. They notice high inference latency (average 2 seconds per response) and want to reduce it to under 500 milliseconds without significantly degrading the quality of responses. The cluster is configured with NVIDIA A100 GPUs. The model is the base Cohere Command model (52B parameters). They have explored increasing batch size, but that increases latency for interactive use cases. Which action should they take?
124. A developer is testing the OCI Generative AI API by sending a request to generate text using the Cohere Command R model. The request returns the following error: 'The model 'cohere.command-r-08-2024' is not available in this region. Please check the model availability in your region.' The developer is using the us-ashburn-1 region. What is the most likely cause of this error?
125. A company uses OCI GenAI to build a content moderation system that filters toxic language in user-generated comments. They have a small labeled dataset of 1,000 comments (500 toxic, 500 non-toxic) and need an efficient solution that balances accuracy, cost, and latency. They are considering different model options: fine-tuning a large LLM (e.g., Cohere Command), using a pre-trained LLM with prompting, fine-tuning a smaller BERT-based classifier, or building a rule-based system. The team has moderate ML experience and wants to deploy using OCI Data Science. Which approach is most efficient for this binary classification task?
126. Which TWO statements about large language model (LLM) capabilities are correct?
127. Refer to the exhibit. A data scientist runs this inference request and receives a response that is incomplete and seems to stop mid-sentence. Which parameter should be adjusted to allow the model to generate longer outputs?
128. A multinational corporation is deploying a generative AI chatbot for customer support using Oracle Cloud Infrastructure's Generative AI service. The chatbot is powered by a large language model (LLM) accessed via the on-demand serving mode. During initial testing, the chatbot provides accurate answers for well-known products but frequently hallucinates or gives incorrect specifications for niche products. The company maintains a comprehensive internal database of product specifications, updated daily. The support team prefers not to fine-tune the LLM due to cost and maintenance overhead. Additionally, the chatbot must respond within 2 seconds to maintain a good customer experience. The team considers several approaches: A. Increasing the 'temperature' parameter to make the model more creative, hoping it will generate more accurate responses when unsure. B. Using few-shot prompting with three manually curated examples of correct product specifications included in every prompt. C. Implementing a Retrieval Augmented Generation (RAG) pipeline that retrieves relevant product documents from the internal database and prepends them to the prompt before inference. D. Reducing the 'topP' parameter to 0.1 to force the model to sample only from the highest probability tokens, thereby reducing randomness. Which approach best meets the requirements of improving factual accuracy while maintaining low latency?
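Several of the questions above quote token figures (a 4096-token context limit with a 4000-token prompt, or 8000-token documents against a 4096-token limit). The arithmetic behind those scenarios can be sketched as follows. This is only an illustration, not an OCI API: the function names `output_budget` and `min_chunks` are hypothetical, and the sketch assumes the prompt and the generated output share a single context window, which may not match every model's accounting.

```python
import math


def output_budget(context_limit: int, prompt_tokens: int) -> int:
    """Tokens left for the response, ASSUMING prompt and output share one window."""
    return max(context_limit - prompt_tokens, 0)


def min_chunks(doc_tokens: int, chunk_tokens: int) -> int:
    """Fewest non-overlapping chunks of at most chunk_tokens covering the document."""
    return math.ceil(doc_tokens / chunk_tokens)


# Figures taken from the question text above; the helpers are illustrative only.
budget = output_budget(4096, 4000)  # 4000-token prompt in a 4096-token window
chunks = min_chunks(8000, 4096)     # 8000-token document, 4096-token limit
print(budget, chunks)  # 96 2
```

Under that assumption, a 4000-token prompt leaves only 96 tokens for the answer, and an 8000-token document needs at least two chunks even before any room is reserved for instructions or output.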
The Fundamentals of Large Language Models domain covers the key concepts tested in this area of the 1Z0-1127 exam blueprint published by Oracle. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all 1Z0-1127 domains — no account required.
The Courseiva 1Z0-1127 question bank contains 128 questions in the Fundamentals of Large Language Models domain. Click any question to see the full explanation and answer breakdown.
Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.
The session launcher on this page draws questions exclusively from the Fundamentals of Large Language Models domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.