A data scientist is using OCI Generative AI to process a large batch of legal documents. The total cost is higher than expected. Which factor is most likely the primary driver of cost?
Pricing is typically per token; longer documents mean more tokens and higher cost.
Why this answer
In OCI Generative AI, pricing is primarily based on the total number of tokens processed, which includes both input (prompt) and output (generated) tokens. Large batches of legal documents produce high token counts because the inputs are long and the outputs are often verbose, and this directly increases cost. The number of API requests alone does not determine cost: a single request with many tokens can cost more than many requests with few tokens.
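To make the token-versus-request comparison concrete, here is a minimal sketch of a purely token-based cost model. The rates, the token counts, and the function names are all hypothetical, chosen only for illustration; they are not OCI's actual prices or API.

```python
# Hypothetical per-token rates (assumptions for illustration only,
# not real OCI Generative AI prices).
INPUT_RATE_PER_1K = 0.5    # assumed cost units per 1,000 input tokens
OUTPUT_RATE_PER_1K = 1.5   # assumed cost units per 1,000 output tokens


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request when billing depends only on token counts."""
    return (input_tokens * INPUT_RATE_PER_1K
            + output_tokens * OUTPUT_RATE_PER_1K) / 1000


def batch_cost(requests: list[tuple[int, int]]) -> float:
    """Total cost of a batch of (input_tokens, output_tokens) requests."""
    return sum(request_cost(i, o) for i, o in requests)


# One request carrying a long legal document (assumed token counts).
one_large = [(50_000, 2_000)]
# Three hundred short requests (assumed token counts).
many_small = [(50, 20)] * 300

large_total = batch_cost(one_large)
small_total = batch_cost(many_small)

print(f"1 large request:    {large_total:.2f} cost units")
print(f"300 small requests: {small_total:.2f} cost units")
print("Large request costs more:", large_total > small_total)
```

Under these assumed numbers the request count differs by a factor of 300, yet the single large request is the more expensive one, because only the token totals enter the calculation.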
Exam trap
Oracle often tests the misconception that API request count is the primary cost driver, when in reality token-based pricing means a single large request can cost more than hundreds of tiny requests.
How to eliminate wrong answers
Option A is wrong because OCI Generative AI charges per token, not per API request; a single request with a large prompt and long response can incur a higher cost than many small requests.
Option C is wrong because sampling strategy (e.g., top-k vs greedy) affects output diversity and quality, not the token count or pricing model.
Option D is wrong because latency of the inference endpoint affects response time and throughput, not the cost per token or the total cost.