Fix: Continuous Aggregate watermark in the future on full-refresh#17
Open
rogiervandergeer wants to merge 1 commit intosdebruyn:mainfrom
Open
Fix: Continuous Aggregate watermark in the future on full-refresh#17rogiervandergeer wants to merge 1 commit intosdebruyn:mainfrom
rogiervandergeer wants to merge 1 commit intosdebruyn:mainfrom
Conversation
When performing a full-refresh on a continuous aggregate with a refresh policy, the watermark was sometimes incorrectly set to a future timestamp. This occurred because the `do_refresh_continuous_aggregate` macro was not passing the `end_offset` from the model's configuration to the `refresh_continuous_aggregate` function, defaulting it to `null`. This commit addresses the issue by: - Modifying the `do_refresh_continuous_aggregate` macro to correctly pass the `end_offset` from the model's configuration. If `end_offset` is not explicitly defined, it defaults to `null`. - Adding a new functional test `test_continuous_aggregate_watermark.py` to verify that the continuous aggregate watermark is always less than the current time after a full refresh, even when no data has been materialized. This ensures that continuous aggregates behave as expected during full refreshes, preventing data visibility issues caused by a future watermark.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
When performing a full-refresh on a continuous aggregate with a refresh policy, the watermark is sometimes incorrectly set to a future timestamp. This is because the
do_refresh_continuous_aggregatemacro isn't not passing theend_offsetfrom the model's configuration to therefresh_continuous_aggregatefunction, defaulting it tonull. See alsoThis PR addresses the issue by:
do_refresh_continuous_aggregatemacro to correctly pass theend_offsetfrom the model's refresh policy configuration. If no policy is defined, orend_offsetis not set, it defaults tonull.test_continuous_aggregate_watermark.pyto verify that the continuous aggregate watermark is always less than the current time after a full refresh, even when no data has been materialized.This ensures that continuous aggregates behave as expected during full refreshes, preventing data visibility issues caused by a future watermark.