A company has a Kinesis Data Firehose delivery stream that receives JSON data from IoT devices. The data is delivered to an S3 bucket. The company notices that the data in S3 is delayed by up to 30 minutes. The Firehose stream is configured with a buffer size of 1 MB and a buffer interval of 60 seconds. The incoming data rate is approximately 100 KB per second. The company needs to reduce the delivery latency to under 5 minutes. Which action should the company take?
Smaller buffer size causes more frequent deliveries, reducing latency.
Why this answer
Option B is the keyed answer: decreasing the buffer size makes Firehose flush to S3 more often, which lowers delivery latency. Firehose delivers a batch as soon as either buffering condition is met, whichever comes first: the buffer size or the buffer interval.
With the stated settings, at approximately 100 KB per second a 1 MB buffer fills in about 10 seconds, so the size condition is reached well before the 60-second interval expires. Buffering alone should therefore add no more than about 60 seconds of latency, and typically closer to 10 seconds.
The observed delay of up to 30 minutes is far larger than these settings can explain, which suggests another issue, such as Firehose waiting for more data or working through a backlog. The question does not give enough detail to say which.
Of the options offered, option B decreases the buffer size (apparently to 256 KB), which would cause more frequent deliveries and reduce latency. Option A is wrong because increasing the buffer interval would increase latency. Option C is wrong because changing the compression format does not affect latency.
Option D is wrong because using Lambda adds processing time, which increases latency rather than reducing it.
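The buffer arithmetic in the explanation can be checked with a small sketch. It assumes a simplified model in which a batch is delivered as soon as either the size or the interval condition is met, and it treats 1 MB as 1024 KB. The function name `seconds_until_flush` is hypothetical and is not part of any AWS SDK; the 256 KB value is included only for comparison with the explanation.

```python
def seconds_until_flush(buffer_size_kb: float,
                        buffer_interval_s: float,
                        rate_kb_per_s: float) -> float:
    """Estimate how long data is buffered before a delivery is triggered.

    Simplified model (an assumption, not the Firehose implementation):
    delivery happens when the buffer fills or the interval elapses,
    whichever comes first.
    """
    if rate_kb_per_s <= 0:
        # With no incoming data the size condition is never reached,
        # so only the interval can trigger a flush.
        return buffer_interval_s
    time_to_fill_s = buffer_size_kb / rate_kb_per_s
    return min(time_to_fill_s, buffer_interval_s)


# Settings from the question: 1 MB buffer, 60 s interval, ~100 KB/s.
current = seconds_until_flush(1024, 60, 100)

# Smaller buffer used for comparison only.
smaller = seconds_until_flush(256, 60, 100)

print(f"1 MB buffer:   {current:.2f} s")
print(f"256 KB buffer: {smaller:.2f} s")
```

Under this model both configurations flush within seconds, which is consistent with the observation that a 30-minute delay cannot be explained by the buffer settings alone.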