A developer is using the OCI Generative AI API to generate text. The responses are often too short and incomplete. Which parameter adjustment is most likely to produce longer, more complete responses?
Increasing max_tokens gives the model more room to generate a complete response, directly addressing the issue of short outputs.
Why this answer
The max_tokens parameter sets the maximum number of tokens (words or subword pieces) the model can generate in a single response. It is an upper bound, not a target: generation stops when the model finishes naturally or when the cap is reached, whichever comes first. If the cap is too low, the response is cut off before the model is done, which shows up as short, incomplete output. Raising max_tokens lets the model produce longer sequences and finish its answer. In the OCI Generative AI API, this is the primary parameter for capping output length.
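The truncation effect can be illustrated with a minimal sketch. This is a toy stand-in for a model, not the OCI SDK: the generate function, the whitespace tokenization, and the finish-reason labels are all hypothetical and chosen only to show how a cap cuts off an otherwise complete answer.

```python
def generate(full_response_tokens, max_tokens):
    """Toy stand-in for a model: emit tokens until the answer ends or the cap is hit."""
    emitted = full_response_tokens[:max_tokens]
    # Hypothetical labels: "complete" if the whole answer fit, "max_tokens" if it was cut off.
    if len(emitted) == len(full_response_tokens):
        finish_reason = "complete"
    else:
        finish_reason = "max_tokens"
    return emitted, finish_reason


# Assumption: one word equals one token, which real tokenizers do not guarantee.
intended = "The capital of France is Paris .".split()

short, reason_short = generate(intended, max_tokens=4)
full, reason_full = generate(intended, max_tokens=50)

print(" ".join(short), "|", reason_short)
print(" ".join(full), "|", reason_full)
```

With the low cap the answer stops before it reaches the key word, while the high cap changes nothing beyond letting the answer finish. A larger max_tokens does not force the model to write more than it otherwise would.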
Exam trap
Oracle often tests the distinction between the parameter that controls output length (max_tokens) and those that control output diversity or repetition (top_p, frequency_penalty). Candidates who confuse 'more complete' with 'more creative' or 'less repetitive' tend to pick a sampling parameter instead.
How to eliminate wrong answers
Option A is wrong because decreasing max_tokens would further restrict the output length, making responses even shorter and more incomplete.
Option C is wrong because increasing top_p adjusts nucleus sampling (the cumulative probability threshold for token selection) to control randomness and diversity, not the length of the output.
Option D is wrong because decreasing frequency_penalty reduces the penalty for repeating tokens, which may increase repetition but does not directly extend the overall length or completeness of the response.
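To see why top_p affects which tokens are chosen rather than how many are produced, here is a minimal sketch of nucleus filtering for a single generation step. The nucleus function and the probability values are illustrative assumptions, not taken from the OCI API or from any real model.

```python
def nucleus(probs, top_p):
    """Return the smallest set of top-ranked tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, p in ranked:
        kept.append(token)
        total += p
        if total >= top_p:
            break
    return kept


# Hypothetical next-token distribution for one step of generation.
next_token_probs = {"Paris": 0.6, "Lyon": 0.25, "Nice": 0.1, "Rome": 0.05}

narrow = nucleus(next_token_probs, top_p=0.5)  # only the most likely token survives
wide = nucleus(next_token_probs, top_p=0.9)    # more candidates stay eligible

print(narrow)
print(wide)
```

Raising top_p widens the pool of candidate tokens at each step, so outputs become more varied. The number of steps the model is allowed to take is still governed by max_tokens.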