You are building a Vertex AI pipeline using the KFP SDK v2. One component processes a large dataset and outputs a metrics artifact. You notice that the component is being cached even when the dataset changes, because the component code and image remain the same. How can you force the component to always re-execute when the dataset changes?
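Disable caching for that component so that it always re-executes. In KFP SDK v2 this is done by calling set_caching_options(False) on the task; setting max_cache_staleness to a zero duration ("P0D") is the equivalent mechanism from the KFP v1 SDK.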
Why this answer
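Caching is keyed on the component specification (container image, command, and arguments) together with the component's input values and output definitions. If the dataset is passed as a URI string, changing the URI changes an input value and therefore invalidates the cache. If the dataset contents change behind the same URI, the cache key stays the same and the stale result is reused, so caching has to be disabled for that component. In KFP SDK v2, call set_caching_options(False) on the task returned by the component. In the KFP v1 SDK, the equivalent was setting execution_options.caching_strategy.max_cache_staleness to "P0D" (a zero duration).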
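
The keying behaviour described above can be illustrated with a minimal sketch. This is not the actual Vertex AI Pipelines implementation: the cache_key function, the component spec, and the gs://example-bucket URIs are all hypothetical, and the real cache key covers more fields. It only shows why an overwritten dataset at an unchanged URI still produces a cache hit.

```python
import hashlib
import json


def cache_key(component_spec: dict, inputs: dict) -> str:
    """Hypothetical fingerprint: a hash of the component spec plus input values."""
    payload = json.dumps({"spec": component_spec, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical component spec and dataset URIs, for illustration only.
spec = {"image": "example/processor:1.0", "command": ["python", "process.py"]}

key_before = cache_key(spec, {"dataset_uri": "gs://example-bucket/data.csv"})
# The file behind the URI was overwritten, but the input value is unchanged.
key_after_overwrite = cache_key(spec, {"dataset_uri": "gs://example-bucket/data.csv"})
# Writing the new data to a new URI changes the input value.
key_new_uri = cache_key(spec, {"dataset_uri": "gs://example-bucket/data_v2.csv"})

print(key_before == key_after_overwrite)  # True: cache hit, stale result reused
print(key_before == key_new_uri)          # False: cache miss, component re-runs
```

Because the first comparison is a hit, the only reliable options are to change an input value whenever the data changes (for example, a versioned URI) or to bypass the cache lookup for that task, which in KFP SDK v2 is what set_caching_options(False) does.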