-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix erroneous additional batch execution #11113
Fix erroneous additional batch execution #11113
Conversation
Previously if there was only one batch, we would try to execute _two_ batches. The first batch, and a "last" non existent batch. This would result in an unhandled exception.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #11113 +/- ##
==========================================
- Coverage 88.96% 88.94% -0.03%
==========================================
Files 183 183
Lines 23933 23934 +1
==========================================
- Hits 21291 21287 -4
- Misses 2642 2647 +5
Flags with carried forward coverage won't be shown. Click here to find out more.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks mostly good!
From organizing code perspective what's your thoughts on hide all logic here in the _submit_batch
function? It seems like we also have some special logic to handle the last batch in that function. I think it's good to have all these in a single logic layer vs multiple. WDYT?
Lines 741 to 785 in 5942aa5
relation_exists = self._submit_batch( | |
node=node, | |
adapter=runner.adapter, | |
relation_exists=relation_exists, | |
batches=batches, | |
batch_idx=batch_idx, | |
batch_results=batch_results, | |
pool=pool, | |
force_sequential_run=True, | |
) | |
batch_idx += 1 | |
skip_batches = batch_results[0].status != RunStatus.Success | |
# Run all batches except first and last batch, in parallel if possible | |
while batch_idx < len(runner.batches) - 1: | |
relation_exists = self._submit_batch( | |
node=node, | |
adapter=runner.adapter, | |
relation_exists=relation_exists, | |
batches=batches, | |
batch_idx=batch_idx, | |
batch_results=batch_results, | |
pool=pool, | |
skip=skip_batches, | |
) | |
batch_idx += 1 | |
# Wait until all submitted batches have completed | |
while len(batch_results) != batch_idx: | |
pass | |
# Only run "last" batch if there is more than one batch | |
if len(batches) != 1: | |
# Final batch runs once all others complete to ensure post_hook runs at the end | |
self._submit_batch( | |
node=node, | |
adapter=runner.adapter, | |
relation_exists=relation_exists, | |
batches=batches, | |
batch_idx=batch_idx, | |
batch_results=batch_results, | |
pool=pool, | |
force_sequential_run=True, | |
skip=skip_batches, | |
) |
@ChenyuLInx I'm down to refactor this code 🙂 I'd definitely like to simplify things here a fair bit. However, this change is going to need to be backported to 1.9.latest for 1.9.1. As such I think a refactor here is out of scope, as such work should probably be forward facing. |
My preference is do the refactor as you change the code if it is not too complex
Why this changes whether we should do the refactor? |
My preference is to keep refactors as separate as possible from changes in logic. I've found it's desirable to not do a refactor as part of work which is being backported. This is because:
Now, I'm generally of the opinion that refactors should be separated from logical changes or bug fixes, unless it is impossible to separate the two. I'm a little looser on that when it is a forward-looking change only (i.e. solely going into an alpha/beta). With backports though, I take it a bit more seriously If the worry is that the refactor won't get done unless it comes alongside the bug fix / logic change, then I'll promise to do the refactor as the next PR I open 🙂 |
I am happy this goes in as is with a fast follow-up PR of the refactor. One thing to note is If you don't do the refactor as part of this change and get it backported, when we have to have some other fixes around this go in, you will have to backport the refactor and then the fix(or do a custom fix). |
* Update single batch test case to check for generic exceptions * Explicitly skip last final batch execution when there is only one batch Previously if there was only one batch, we would try to execute _two_ batches. The first batch, and a "last" non existent batch. This would result in an unhandled exception. * Changie doc (cherry picked from commit c9582c2)
* Update single batch test case to check for generic exceptions * Explicitly skip last final batch execution when there is only one batch Previously if there was only one batch, we would try to execute _two_ batches. The first batch, and a "last" non existent batch. This would result in an unhandled exception. * Changie doc (cherry picked from commit c9582c2) Co-authored-by: Quigley Malcolm <QMalcolm@users.noreply.github.com>
Resolves #11112
Problem
Microbatch models that only ran on batch (either due to
lookback=0
or setting--event-time-start
+--event-time-end
) would raise anUnhandled exception
like the following:Solution
Ensure we skip the last batch execution path when only one batch needs to be run in total for a microbatch model
Checklist