You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I have searched the existing issues, and I could not find an existing issue for this feature
I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion
Describe the feature
Add week as a batch size option
Also allow the user to set what the start day of the week is.
Describe alternatives you've considered
I hack my through the microbatch - executing a daily microbatch - but with a filter to only include Mondays.
This works fine for my daily job
{# Custom microbatch selection for windowing #}
{%- set microbatch_day = model.config.__dbt_internal_microbatch_event_time_start|default(modules.datetime.datetime(1899, 1, 1,0,0,0))-%}
{% set batch_7_days_ago = ( microbatch_day - modules.datetime.timedelta(days=7)).strftime("%Y-%m-%d") %}
{% set batch_1_days_ago = ( microbatch_day - modules.datetime.timedelta(days=1)).strftime("%Y-%m-%d") %}
with
filter_microbatch__sessions as (
select * from {{ ref('our__sessions').render() }}
--override default microbatch behavior with 7 day window
where date BETWEEN '{{ batch_7_days_ago }}' AND '{{ batch_1_days_ago }}'
-- only run 1 day per week -- Mondays
and EXTRACT(dayofweek FROM DATE('{{ batch_7_days_ago }}')) = 2
)
select * from filter_microbatch__sessions
But that wastes a lot of execution time during full-backfills. And my users want to do a lot of backfills.
I probably should use smarter jinja, instead of sql, to figure out when not to execute.
Who will this benefit?
Any user of microbatch incremental strategy
This could also help with large backfills -
For the normal incremental run - Users could set an environment variable for 'day' for their daily incremental with a 3 day lookback.
But if I'm processing 3-7 years of data - shouldn't I query at the week level? Running 1K queries can take a very long time just to process 1K btache. Wouldn't 150 be much more efficient for larger windows of time?
Specifically to bigquery - what about table quota limits? Is some method necessary to reduce the number of table operations?
Are you interested in contributing this feature?
I think so
Anything else?
Microbatch has helped my org a lot - thanks for this feature!
The text was updated successfully, but these errors were encountered:
Is this your first time submitting a feature request?
Describe the feature
Add week as a batch size option
Also allow the user to set what the start day of the week is.
Describe alternatives you've considered
I hack my through the microbatch - executing a daily microbatch - but with a filter to only include Mondays.
This works fine for my daily job
But that wastes a lot of execution time during full-backfills. And my users want to do a lot of backfills.
I probably should use smarter jinja, instead of sql, to figure out when not to execute.
Who will this benefit?
Any user of microbatch incremental strategy
This could also help with large backfills -
For the normal incremental run - Users could set an environment variable for 'day' for their daily incremental with a 3 day lookback.
But if I'm processing 3-7 years of data - shouldn't I query at the week level? Running 1K queries can take a very long time just to process 1K btache. Wouldn't 150 be much more efficient for larger windows of time?
Specifically to bigquery - what about table quota limits? Is some method necessary to reduce the number of table operations?
Are you interested in contributing this feature?
I think so
Anything else?
Microbatch has helped my org a lot - thanks for this feature!
The text was updated successfully, but these errors were encountered: