This chapter covers Azure Data Factory (ADF) triggers: Scheduled, Tumbling Window, and Event-based. Understanding triggers is essential for automating data pipelines, and the DP-900 exam tests your ability to choose the right trigger type for a given scenario. Expect 2-3 questions on triggers, focusing on their differences, use cases, and configuration. You will need to know when to use each trigger, how they handle dependencies, and how they interact with pipeline parameters.
Imagine a factory assembly line that produces widgets. You have three ways to start production:
- A Scheduled trigger is like a wall clock that tells the assembly line to start every day at 9 AM sharp. It doesn't care if raw materials arrived or if the previous batch finished; it just starts at that exact time.
- A Tumbling window trigger is like a timed conveyor belt that moves a fresh batch of materials onto the line every hour, but it only starts after the previous batch has been fully processed. If the previous batch takes 55 minutes, the next batch starts at minute 60, with no overlap. It also remembers if a batch failed and can reprocess it.
- An Event-based trigger is like a sensor at the loading dock that detects when a new pallet of materials arrives. The moment a pallet is placed, the sensor sends a signal to start the assembly line for that specific batch. No pallet, no production.
Each trigger type solves a different scheduling need: time-based, windowed time-based, or event-driven.
What are ADF Triggers?
Triggers in Azure Data Factory are the mechanism that initiates a pipeline run. They define when and why a pipeline should execute. Without triggers, pipelines must be started manually or via API calls. The DP-900 exam expects you to understand three trigger types: Schedule Trigger, Tumbling Window Trigger, and Event-based Trigger (specifically Storage Event and Custom Event triggers). Each has distinct characteristics, scheduling semantics, and best-fit scenarios.
Schedule Trigger
A Schedule Trigger is the simplest. It runs a pipeline on a recurring schedule, similar to a cron job. You define a start time, end time (optional), and recurrence pattern (e.g., every 5 minutes, hourly, daily, weekly). The trigger does not wait for the previous run to finish before starting the next. If a pipeline takes longer than the interval, multiple runs can overlap. This is the default behavior and is acceptable for idempotent pipelines.
Key properties:
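- Recurrence: Defined with a frequency (minute, hour, day, week, or month) and an interval, optionally refined with specific hours, minutes, or weekdays, either in JSON or in the visual builder. ADF does not take a cron expression here. Minimum interval is 1 minute.
- Start time: The first time the trigger fires.
- End time: Optional. If omitted, the trigger runs indefinitely.
- Time zone: Must be specified. Default is UTC.
- Pipeline parameters: Can pass static values or dynamic expressions such as `@trigger().scheduledTime`.
Example recurrence for every 15 minutes: frequency Minute, interval 15. Six-field cron expressions such as 0 */15 * * * * (seconds, minutes, hours, day of month, month, day of week) belong to other Azure services, for example Azure Functions timer triggers, not to ADF.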
Behavior:
- If the pipeline fails, the trigger does not automatically retry. You must configure retry policies on the pipeline activity.
- If the trigger is paused, all future runs are skipped. When resumed, it does not backfill missed intervals.
- Overlapping runs are allowed. This can cause contention if the pipeline writes to the same resource.
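The overlap behavior can be illustrated with a small simulation. This is a minimal sketch in plain Python, not ADF code: the helper names `schedule_fire_times` and `max_overlap` are made up for illustration, and it assumes every run takes the same fixed duration.

```python
from datetime import datetime, timedelta, timezone


def schedule_fire_times(start, interval, count):
    """Return the first `count` fire times of a fixed-interval schedule."""
    return [start + i * interval for i in range(count)]


def max_overlap(fire_times, run_duration):
    """Largest number of runs in flight at once if every run lasts `run_duration`."""
    peak = 0
    for t in fire_times:
        # A run started at s is still active at t when s <= t < s + run_duration.
        active = sum(1 for s in fire_times if s <= t < s + run_duration)
        peak = max(peak, active)
    return peak


start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
fires = schedule_fire_times(start, timedelta(minutes=15), count=8)

# Runs that finish before the next fire never overlap.
print(max_overlap(fires, timedelta(minutes=10)))
# Runs that outlast the 15-minute interval pile up.
print(max_overlap(fires, timedelta(minutes=40)))
```

With a 15-minute interval, a 40-minute pipeline ends up with three runs in flight at once, which is where contention on a shared target comes from.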
Tumbling Window Trigger
A Tumbling Window Trigger is a stateful trigger that runs pipelines in fixed-size, non-overlapping time intervals (windows). It is designed for scenarios where you need to process data in discrete chunks, such as hourly aggregations. Unlike Schedule Trigger, Tumbling Window ensures that each window runs exactly once and in order. It can also automatically retry failed windows and backfill past windows.
Key properties:
- Window size: Must be between 5 minutes and 14 days. Can be expressed in minutes, hours, or days.
- Start time: The beginning of the first window. The trigger fires at the end of each window (i.e., at start + window size).
- End time: Optional. If omitted, runs indefinitely.
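- Delay: A timespan (hh:mm:ss) that postpones the start of processing after a window closes. For example, a 1-hour window with a 5-minute delay will fire at 1:05 AM for the 12:00 AM-1:00 AM window. This leaves time for late-arriving data. Do not confuse it with the offset used in tumbling window dependencies, which is the setting that can be negative.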
- Max concurrency: Limits how many windows can run simultaneously. Default is 1 (sequential). You can increase to process windows in parallel.
- Retry policy: Can automatically retry a failed window up to a specified number of times, with a backoff interval.
- Dependency: Can depend on the successful completion of another tumbling window trigger (e.g., process 1-hour windows only after the corresponding 5-minute windows are done).
How it works internally:
1. The trigger divides time into fixed windows starting from the start time.
2. At the end of each window (plus any delay), the trigger fires a pipeline run for that window.
3. The pipeline receives the window start and end times as parameters (@trigger().outputs.windowStartTime and @trigger().outputs.windowEndTime).
4. If the pipeline fails, the trigger marks the window as failed and can retry based on its retry policy.
5. If the trigger is paused and later resumed, it will backfill all windows from the last successful run to the current time (unless you specify a different backfill behavior).
Example:
- Window size: 1 hour
- Start time: 2024-01-01T00:00:00Z
- Delay: 5 minutes
- The first window (00:00-01:00) fires at 01:05. The second (01:00-02:00) fires at 02:05, etc.
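The arithmetic in this example can be reproduced in a few lines. This is a minimal sketch of the window math only, assuming the fire time is the window end plus the delay as described above; `tumbling_windows` is a hypothetical helper, not an ADF API.

```python
from datetime import datetime, timedelta, timezone


def tumbling_windows(start, size, delay, count):
    """Yield (window_start, window_end, fire_time) for the first `count` windows."""
    for i in range(count):
        window_start = start + i * size
        window_end = window_start + size
        # The run for a window is released once the window has closed, plus the delay.
        yield window_start, window_end, window_end + delay


start = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
windows = list(
    tumbling_windows(start, timedelta(hours=1), timedelta(minutes=5), count=3)
)

for window_start, window_end, fire_time in windows:
    print(f"{window_start:%H:%M}-{window_end:%H:%M} fires at {fire_time:%H:%M}")
```

Each window's end is the next window's start, so the windows cover the timeline with no gaps and no overlap.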
Event-based Trigger
Event-based triggers respond to events in Azure services. The two main types are Storage Event Trigger and Custom Event Trigger.
Storage Event Trigger:
- Fires when a blob is created or deleted in an Azure Storage account (Blob Storage or Azure Data Lake Storage Gen2). There is no separate "updated" event: overwriting an existing blob is reported as Blob Created.
- Uses Azure Event Grid to listen for events.
- The trigger passes the blob path (@triggerBody().folderPath and @triggerBody().fileName) to the pipeline, allowing it to process that specific file.
- Supports filtering by blob name prefix and suffix (e.g., only process .csv files in the incoming folder).
- Does not support ordering guarantees. If multiple blobs arrive simultaneously, the trigger fires for each event, but the pipeline runs may execute in any order.
- The trigger does not automatically retry on failure. You must configure retry on the pipeline.
- Limitations: The storage account must be in the same region as the data factory. The trigger cannot be used with Azure Data Lake Storage Gen1.
Custom Event Trigger:
- Processes custom events from Azure Event Grid topics.
- You define the event schema and the trigger fires when a matching event is published.
- Useful for integrating with custom applications or third-party systems.
How Event Grid integration works:
1. ADF registers an event subscription on the storage account or custom topic.
2. When an event occurs (e.g., blob created), Event Grid sends an HTTP POST to ADF's trigger endpoint.
3. ADF validates the event and starts a pipeline run, passing the event data as parameters.
4. The pipeline executes and can use the file path to read the blob.
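The prefix and suffix filtering described for Storage Event triggers can be modeled as a simple string check. This is a rough sketch of the idea, not the service's implementation: `match_blob_event` is a hypothetical function, and how the real service shapes `folderPath` (including whether the container name is part of it) is not modeled here.

```python
from typing import Optional


def match_blob_event(blob_path: str, prefix: str = "", suffix: str = "") -> Optional[dict]:
    """Return folder path and file name if the blob passes both filters, else None."""
    if not blob_path.startswith(prefix) or not blob_path.endswith(suffix):
        return None
    # Split on the last slash: everything before it is the folder, the rest the file.
    folder_path, _, file_name = blob_path.rpartition("/")
    return {"folderPath": folder_path, "fileName": file_name}


# A .csv file under incoming/ passes the filter; a .json file does not.
print(match_blob_event("incoming/2024/sales.csv", prefix="incoming/", suffix=".csv"))
print(match_blob_event("incoming/2024/sales.json", prefix="incoming/", suffix=".csv"))
```

Events that fail the filter never start a pipeline run, which is why a tight prefix and suffix keep unrelated uploads from triggering work.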
Comparison and Exam Focus
The DP-900 exam will ask you to identify which trigger type to use in a given scenario. Key differentiators:
- Schedule Trigger: Fixed time schedule, no state, overlapping allowed. Use for periodic batch jobs that don't depend on data arrival.
- Tumbling Window Trigger: Stateful, non-overlapping windows, order guarantee, retry and backfill. Use for time-series data processing where you need exactly-once processing per window.
- Event-based Trigger: Reacts to events. Use for file arrival scenarios, real-time processing, or when you need to process data as soon as it appears.
Common exam traps:
Choosing Schedule Trigger when Tumbling Window is needed for exactly-once processing.
Thinking Event-based triggers provide ordering guarantees (they don't).
Forgetting that Tumbling Window can backfill missed windows, while Schedule Trigger cannot.
Assuming all triggers support automatic retry (only Tumbling Window has built-in retry; others rely on pipeline activity retry).
Define the trigger type
First, decide which trigger best fits your scenario. For time-based recurring jobs without dependency on data, choose Schedule Trigger. For processing time-windowed data with exactly-once semantics, choose Tumbling Window. For reacting to blob creation or events, choose Event-based. This decision is critical because the trigger type cannot be changed after creation without deleting and recreating it.
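The decision above boils down to two yes/no questions, which can be written as a tiny helper. This is only a study aid under that simplification; `choose_trigger` and its parameter names are invented for illustration.

```python
def choose_trigger(reacts_to_event: bool, needs_windows: bool) -> str:
    """Pick a trigger type from two yes/no questions about the scenario."""
    if reacts_to_event:
        return "Event-based"      # a file arrival or custom event starts the run
    if needs_windows:
        return "Tumbling Window"  # non-overlapping windows, retry, backfill
    return "Schedule"             # plain recurring job


print(choose_trigger(reacts_to_event=True, needs_windows=False))
print(choose_trigger(reacts_to_event=False, needs_windows=True))
print(choose_trigger(reacts_to_event=False, needs_windows=False))
```

The order of the checks mirrors how exam questions are usually phrased: look for an event first, then for windowing requirements, and fall back to Schedule.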
Configure trigger properties
For Schedule Trigger: set start time, recurrence (e.g., every hour), time zone, and optionally end time. For Tumbling Window: set window size (must be 5 min to 14 days), start time, delay (a timespan such as 00:05:00), max concurrency, and retry policy. For Event-based: select the storage account or custom topic, define event types (Blob Created, etc.), and optionally set prefix/suffix filters. Ensure the storage account is in the same region as ADF.
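To make the Tumbling Window properties concrete, the sketch below builds a trigger definition as a Python dict and prints it as JSON. Treat it as an approximation: the field names follow the trigger JSON layout as commonly documented and should be checked against the current ADF documentation, the window limits are the ones stated in this chapter, and the pipeline name `AggregateHourlySales` and its parameter names are hypothetical.

```python
import json
from datetime import timedelta

# Window size limits as stated in this chapter.
MIN_WINDOW = timedelta(minutes=5)
MAX_WINDOW = timedelta(days=14)
UNITS = {"Minute": timedelta(minutes=1), "Hour": timedelta(hours=1)}


def tumbling_window_trigger(pipeline_name, frequency, interval, start_time,
                            delay="00:00:00", max_concurrency=1,
                            retry_count=0, retry_interval_seconds=30):
    """Build a tumbling window trigger definition as a plain dict."""
    size = UNITS[frequency] * interval
    if not MIN_WINDOW <= size <= MAX_WINDOW:
        raise ValueError(f"window size {size} is outside 5 minutes to 14 days")
    return {
        "properties": {
            "type": "TumblingWindowTrigger",
            "typeProperties": {
                "frequency": frequency,
                "interval": interval,
                "startTime": start_time,
                "delay": delay,
                "maxConcurrency": max_concurrency,
                "retryPolicy": {
                    "count": retry_count,
                    "intervalInSeconds": retry_interval_seconds,
                },
            },
            "pipeline": {
                "pipelineReference": {
                    "type": "PipelineReference",
                    "referenceName": pipeline_name,
                },
                # Hypothetical pipeline parameter names.
                "parameters": {
                    "windowStart": "@trigger().outputs.windowStartTime",
                    "windowEnd": "@trigger().outputs.windowEndTime",
                },
            },
        }
    }


trigger = tumbling_window_trigger(
    "AggregateHourlySales", "Hour", 1, "2024-01-01T00:00:00Z",
    delay="00:10:00", retry_count=3, retry_interval_seconds=300,
)
print(json.dumps(trigger, indent=2))
```

Validating the window size before building the definition catches the most common configuration mistake (a window shorter than 5 minutes) before anything is published.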
Link trigger to pipeline
In the ADF UI, under Triggers, select your trigger and click 'Link to pipeline'. Map each pipeline parameter that the trigger should populate. For Schedule triggers, use `@trigger().scheduledTime`. For Tumbling Window, use `@trigger().outputs.windowStartTime` and `@trigger().outputs.windowEndTime`. For Event-based, use `@triggerBody().folderPath` and `@triggerBody().fileName`. Without parameter mapping, the pipeline cannot access trigger metadata.
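As a quick reference, the mapping can be laid out as a lookup table. The expressions are the ones named in this chapter; the pipeline parameter names on the left (`runTime`, `windowStart`, and so on) are hypothetical and would be whatever your pipeline declares.

```python
# Pipeline parameter name -> trigger expression, grouped by trigger type.
TRIGGER_EXPRESSIONS = {
    "Schedule": {
        "runTime": "@trigger().scheduledTime",
    },
    "Tumbling Window": {
        "windowStart": "@trigger().outputs.windowStartTime",
        "windowEnd": "@trigger().outputs.windowEndTime",
    },
    "Event-based": {
        "folderPath": "@triggerBody().folderPath",
        "fileName": "@triggerBody().fileName",
    },
}


def parameters_for(trigger_type: str) -> dict:
    """Return a copy of the parameter mapping for one trigger type."""
    return dict(TRIGGER_EXPRESSIONS[trigger_type])


for trigger_type in TRIGGER_EXPRESSIONS:
    print(trigger_type, parameters_for(trigger_type))
```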
Publish and start the trigger
After linking, you must publish the changes (click 'Publish all'). The trigger will not run until it is started. You can start the trigger immediately or schedule a future start. For Tumbling Window, if you start it after the start time, it will backfill all missed windows automatically. For Schedule Trigger, missed runs are skipped. For Event-based, it will listen for new events from the moment it is started.
Monitor and handle failures
Use ADF Monitor to view trigger runs. For Tumbling Window, you can see which windows succeeded or failed and manually rerun failed windows. For Schedule and Event-based, you must rely on pipeline activity retry policies. Event-based triggers do not have a built-in rerun mechanism; you would need to re-upload the file or trigger the event again. Tumbling Window triggers can be configured to automatically retry failed windows up to a specified count.
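The retry behavior of a Tumbling Window can be pictured as a bounded loop around a single window. This is a simplified model, not ADF's implementation: `run_window` is a hypothetical helper, the wait between attempts is left out, and the pipeline run is stood in for by a callable that returns True on success.

```python
def run_window(attempt_run, retry_count):
    """Run once, then retry up to `retry_count` times; return (succeeded, attempts)."""
    attempts = 0
    for _ in range(retry_count + 1):
        attempts += 1
        if attempt_run():
            return True, attempts
    return False, attempts


# A window that fails twice on transient errors and then succeeds.
outcomes = iter([False, False, True])
print(run_window(lambda: next(outcomes), retry_count=3))

# A window that never succeeds is marked failed after 1 run + 3 retries.
print(run_window(lambda: False, retry_count=3))
```

A window that exhausts its retries stays failed until someone reruns it, which is the manual rerun path described above.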
Scenario 1: Hourly Sales Aggregation
A retail company processes sales transactions every hour. They need to aggregate sales data from raw JSON files into a summary table. Using a Tumbling Window trigger with a 1-hour window size ensures each hour's data is processed exactly once. They set a 10-minute delay to accommodate late-arriving transactions. If a window fails, the trigger retries up to 3 times with a 5-minute interval. In production, they run 24 windows per day, each processing about 10 GB of data. Monitoring shows that less than 1% of windows require retries, usually due to transient storage errors.
Misconfiguration: If they used a Schedule Trigger instead, overlapping runs could corrupt the summary table because two runs might try to write to the same partition simultaneously.
Scenario 2: Real-time File Ingestion
A financial services firm receives trade files from multiple exchanges. They need to process each file as soon as it arrives. They deploy an Event-based trigger on a Blob Storage container. The trigger filters for .csv files with prefix trades/. Each file triggers a pipeline that validates, transforms, and loads the data into Azure SQL Database. In production, hundreds of files arrive per minute, so the trigger fires rapidly. However, the pipeline must be idempotent because if the same file is uploaded twice (e.g., because the sender retries), it will be processed again. They also set a high concurrency limit on the pipeline to handle bursts. A common mistake is using a Schedule Trigger to poll the container every minute, which adds up to a minute of latency per file and forces the pipeline to work out on every poll which files are new.
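One way to get the idempotency this scenario calls for is to skip any file whose path and content have already been loaded. The sketch below shows the idea with an in-memory set; `IdempotentLoader` is a hypothetical class, and a real pipeline would need a durable record (for example a control table) rather than process memory.

```python
import hashlib


class IdempotentLoader:
    """Loads each distinct (path, content) pair at most once."""

    def __init__(self):
        self._seen = set()   # keys of files already loaded
        self.loaded = []     # paths loaded, in arrival order

    def handle(self, path: str, content: bytes) -> bool:
        """Load the file unless an identical upload was already processed."""
        key = (path, hashlib.sha256(content).hexdigest())
        if key in self._seen:
            return False     # duplicate event: do nothing
        self._seen.add(key)
        self.loaded.append(path)
        return True


loader = IdempotentLoader()
print(loader.handle("trades/nyse_001.csv", b"id,qty\n1,100\n"))  # first upload
print(loader.handle("trades/nyse_001.csv", b"id,qty\n1,100\n"))  # same file again
print(loader.handle("trades/nyse_001.csv", b"id,qty\n1,250\n"))  # corrected file
```

Keying on content as well as path means a re-upload of the same bytes is ignored, while a corrected file under the same name is still processed.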
Scenario 3: Daily Batch ETL
A media company runs a nightly ETL job that processes the previous day's logs. They use a Schedule Trigger set to run once per day at 2:00 AM. The pipeline takes about 30 minutes. If the pipeline fails, an operator reruns it manually. Since the job is idempotent (it reads all logs from the previous day), overlapping runs are not an issue. In production, they use a Tumbling Window trigger only when they need to process data in discrete windows (e.g., hourly). For daily full refresh, Schedule Trigger is simpler and sufficient. A common error is using Tumbling Window for a daily job without needing window-based processing, which adds unnecessary complexity.
DP-900 Objective: 3.2 - Analytics
The exam tests your ability to identify the correct trigger type for a given scenario. You will NOT be asked to write trigger definitions or configure them in detail. Instead, you must understand the core differences.
Common Wrong Answers:
1. Choosing Schedule Trigger when Tumbling Window is needed: Candidates often pick Schedule because it's familiar, but the question may mention 'process data for each hour without overlap' or 'ensure exactly-once processing'. The key is that Schedule allows overlapping runs; Tumbling Window does not.
2. Selecting Event-based trigger for time-based scheduling: If the question says 'run every 5 minutes', the answer is Schedule or Tumbling Window, not Event-based. Event-based is for reacting to events, not time.
3. Thinking Tumbling Window triggers can run on a custom time (e.g., every 3 minutes): The minimum window size is 5 minutes. Any interval smaller than that requires Schedule Trigger.
4. Assuming all triggers support backfill: Only Tumbling Window automatically backfills missed windows. Schedule Trigger skips missed runs. Event-based triggers only process events that occur after they are started.
Specific Numbers and Terms:
- Minimum Tumbling Window size: 5 minutes
- Maximum Tumbling Window size: 14 days
- Schedule Trigger minimum recurrence: 1 minute
- Tumbling Window delay: expressed as a timespan in hh:mm:ss format (e.g., 00:05:00)
- Event-based trigger uses Azure Event Grid
- Storage Event trigger cannot be used with Azure Data Lake Storage Gen1
Edge Cases:
- If a Tumbling Window trigger is paused and then resumed, it will backfill all windows from the last successful run to the current time. This can cause a large number of pipeline runs if the pause was long.
- Event-based triggers do not guarantee order. If you need ordered processing of files, you must implement ordering logic in the pipeline (e.g., use a queue).
- Schedule Trigger can pass @trigger().scheduledTime to the pipeline, which is the time the trigger was scheduled to run. For Tumbling Window, you get window start and end times.
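The size of a backfill after a long pause is easy to estimate: divide the gap by the window size. This is a minimal sketch of that arithmetic only; `pending_windows` is a hypothetical helper and it counts only windows that have fully closed.

```python
from datetime import datetime, timedelta, timezone


def pending_windows(last_processed_end, now, size):
    """Count complete windows between the last processed window end and now."""
    if now <= last_processed_end:
        return 0
    # Floor division of two timedeltas gives a whole number of windows.
    return (now - last_processed_end) // size


last_end = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)

# A 3-day pause with 1-hour windows queues 72 runs on resume.
print(pending_windows(last_end, last_end + timedelta(days=3), timedelta(hours=1)))

# A partially elapsed window is not counted until it closes.
print(pending_windows(last_end, last_end + timedelta(hours=2, minutes=30), timedelta(hours=1)))
```

Running this estimate before resuming a trigger shows whether the max concurrency setting needs attention to keep the backfill from overwhelming downstream systems.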
How to Eliminate Wrong Answers:
- If the scenario mentions 'process data for each hour' and 'no overlap', eliminate Schedule and Event-based. The answer is Tumbling Window.
- If the scenario says 'run a job every day at midnight', eliminate Event-based. Both Schedule and Tumbling Window can do this, but Schedule is simpler.
- If the scenario says 'process a file as soon as it arrives in blob storage', eliminate Schedule and Tumbling Window. The answer is Event-based.
- If the scenario mentions 'retry automatically on failure', Tumbling Window has built-in retry; others do not, but you can configure retry on the pipeline activity.
Schedule Trigger: fixed schedule, overlapping allowed, no backfill, minimum interval 1 minute.
Tumbling Window Trigger: fixed-size windows, non-overlapping, exactly-once, backfill, minimum window 5 minutes.
Event-based Trigger: reacts to events (blob created/deleted or custom events), no ordering guarantee, uses Event Grid.
Tumbling Window triggers can have a delay (a timespan added after the window closes) to accommodate late-arriving data.
Only Tumbling Window triggers have built-in retry for failed windows.
Event-based triggers require the storage account to be in the same region as the data factory.
Schedule and Tumbling Window triggers can pass scheduled time or window boundaries to pipelines.
These come up on the exam all the time. Here's how to tell them apart.
Schedule Trigger
Runs on a fixed schedule (e.g., every 5 minutes, hourly).
Allows overlapping runs if the pipeline takes longer than the interval.
No built-in retry for failed runs; no backfill of missed runs.
Passes `@trigger().scheduledTime` to pipeline.
Minimum interval is 1 minute.
Tumbling Window Trigger
Runs in fixed, non-overlapping time windows (e.g., every hour).
Ensures exactly-once processing per window; no overlap.
Built-in retry and automatic backfill of missed windows.
Passes `@trigger().outputs.windowStartTime` and `windowEndTime`.
Minimum window size is 5 minutes.
Tumbling Window Trigger
Time-based, not event-driven.
Processes data in discrete windows; suitable for batch processing.
Supports dependencies on other tumbling window triggers.
Can backfill missed windows.
Ideal for scenarios where data arrives with some delay.
Event-based Trigger
Event-driven (e.g., blob created, custom event).
Processes data as soon as the event occurs; real-time.
No built-in ordering or dependencies.
No backfill; only processes events after trigger is started.
Ideal for reactive processing of file arrivals.
Mistake
All triggers can automatically retry failed pipeline runs.
Correct
Only Tumbling Window triggers have built-in retry policy for the window itself. Schedule and Event-based triggers rely on the pipeline activity retry settings, which are separate from the trigger.
Mistake
Event-based triggers guarantee that files are processed in the order they arrive.
Correct
Event-based triggers fire for each event, but there is no ordering guarantee. Multiple events can trigger pipeline runs that execute concurrently or out of order.
Mistake
Tumbling Window triggers can run every 1 minute.
Correct
The minimum window size for a Tumbling Window trigger is 5 minutes. For intervals less than 5 minutes, you must use a Schedule Trigger.
Mistake
Schedule triggers can backfill missed runs if the trigger is paused and resumed.
Correct
Schedule triggers do not backfill. When resumed, they only fire for future scheduled times. Missed intervals are skipped.
Mistake
Event-based triggers can be used with any Azure storage account.
Correct
Event-based triggers (Storage Event) can only be used with Azure Blob Storage and Azure Data Lake Storage Gen2. They are not supported for ADLS Gen1 or other storage types.
A Schedule trigger runs a pipeline on a fixed schedule and allows overlapping runs if the pipeline takes longer than the interval. A Tumbling Window trigger runs a pipeline for fixed, non-overlapping time windows and ensures exactly-once processing per window. Tumbling Window also supports automatic retry and backfill of missed windows, while Schedule does not.
No. The minimum window size for a Tumbling Window trigger is 5 minutes. For intervals less than 5 minutes, you must use a Schedule trigger.
Event-based triggers use Azure Event Grid to listen for events such as blob creation or deletion in Azure Storage. When an event occurs, Event Grid sends a notification to ADF, which then starts a pipeline run. The trigger passes event data (e.g., file path) to the pipeline. It does not guarantee ordering and cannot backfill missed events.
No. Schedule triggers do not backfill. When the trigger is resumed, it only fires for future scheduled times. Any intervals that were missed while paused are skipped. For backfill capability, use a Tumbling Window trigger.
For Schedule triggers, you can pass `@trigger().scheduledTime`. For Tumbling Window triggers, you can pass `@trigger().outputs.windowStartTime` and `@trigger().outputs.windowEndTime`. For Event-based triggers, you can pass `@triggerBody().folderPath` and `@triggerBody().fileName` (for blob events).
Use an Event-based trigger (Storage Event). It fires as soon as a blob is created or deleted. A Schedule trigger would introduce latency because it only runs on a fixed interval.
No. Event-based triggers fire for each event, but if the pipeline fails, the event is not retriggered automatically. You must configure retry on the pipeline activity. Also, duplicate events can occur if the same file is uploaded multiple times.