-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create a no-op exposure runner #11082
Changes from all commits
e6325c6
f41ef0b
adc6af2
e1969eb
38358b1
d3eda93
503ddac
878d5da
f6c8715
e0281a6
4b8cb16
ef3d8d4
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
kind: Under the Hood | ||
body: Create a no-op exposure runner | ||
time: 2024-12-02T16:47:15.766574Z | ||
custom: | ||
Author: aranke | ||
Issue: ' ' |
Original file line number | Diff line number | Diff line change | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -1378,7 +1378,7 @@ def group(self): | |||||||||||
|
||||||||||||
|
||||||||||||
@dataclass | ||||||||||||
class Exposure(GraphNode, ExposureResource): | ||||||||||||
class Exposure(NodeInfoMixin, GraphNode, ExposureResource): | ||||||||||||
@property | ||||||||||||
def depends_on_nodes(self): | ||||||||||||
return self.depends_on.nodes | ||||||||||||
|
@@ -1441,6 +1441,12 @@ def same_contents(self, old: Optional["Exposure"]) -> bool: | |||||||||||
def group(self): | ||||||||||||
return None | ||||||||||||
|
||||||||||||
def __post_serialize__(self, dct: Dict, context: Optional[Dict] = None): | ||||||||||||
dct = super().__post_serialize__(dct, context) | ||||||||||||
if "_event_status" in dct: | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why is this necessary? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You're right, this field is never written out. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wasn't implying this was unnecessary, just moreso trying to understand what added _event_status and why the change was made. Digging into this I see its because of the dbt-core/core/dbt/contracts/graph/nodes.py Lines 282 to 286 in 1b7d9b5
I think keeping the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. no sweat, just reverted the commit here |
||||||||||||
del dct["_event_status"] | ||||||||||||
return dct | ||||||||||||
|
||||||||||||
|
||||||||||||
# ==================================== | ||||||||||||
# Metric node | ||||||||||||
|
@@ -1659,6 +1665,12 @@ def same_contents(self, old: Optional["SavedQuery"]) -> bool: | |||||||||||
and True | ||||||||||||
) | ||||||||||||
|
||||||||||||
def __post_serialize__(self, dct: Dict, context: Optional[Dict] = None): | ||||||||||||
dct = super().__post_serialize__(dct, context) | ||||||||||||
if "_event_status" in dct: | ||||||||||||
del dct["_event_status"] | ||||||||||||
return dct | ||||||||||||
|
||||||||||||
|
||||||||||||
# ==================================== | ||||||||||||
# Patches | ||||||||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
from .exposure_runner import ExposureRunner | ||
from .saved_query_runner import SavedQueryRunner |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
from dbt.runners.no_op_runner import NoOpRunner | ||
|
||
|
||
class ExposureRunner(NoOpRunner): | ||
@property | ||
def description(self) -> str: | ||
return f"exposure {self.node.name}" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
import threading | ||
|
||
from dbt.artifacts.schemas.results import RunStatus | ||
from dbt.artifacts.schemas.run import RunResult | ||
from dbt.contracts.graph.manifest import Manifest | ||
from dbt.events.types import LogNodeNoOpResult | ||
from dbt.task.base import BaseRunner | ||
from dbt_common.events.functions import fire_event | ||
|
||
|
||
class NoOpRunner(BaseRunner): | ||
@property | ||
def description(self) -> str: | ||
raise NotImplementedError("description not implemented") | ||
|
||
def before_execute(self) -> None: | ||
pass | ||
|
||
def compile(self, manifest: Manifest): | ||
return self.node | ||
|
||
def after_execute(self, result) -> None: | ||
fire_event( | ||
LogNodeNoOpResult( | ||
description=self.description, | ||
index=self.node_index, | ||
total=self.num_nodes, | ||
node_info=self.node.node_info, | ||
) | ||
) | ||
|
||
def execute(self, compiled_node, manifest): | ||
# no-op | ||
return RunResult( | ||
node=compiled_node, | ||
status=RunStatus.NoOp, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This will have the effect of changing the existing result behavior of saved queries. Currently, their |
||
timing=[], | ||
thread_id=threading.current_thread().name, | ||
execution_time=0, | ||
message="NO-OP", | ||
adapter_response={}, | ||
failures=0, | ||
batch_results=None, | ||
agate_table=None, | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
from dbt.runners.no_op_runner import NoOpRunner | ||
|
||
|
||
class SavedQueryRunner(NoOpRunner): | ||
@property | ||
def description(self) -> str: | ||
return f"saved query {self.node.name}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aranke For the purposes of
dbt retry
, shouldNoOp
be included in RETRYABLE_STATUSES?Put differently: If my selection criteria includes a set of saved queries and
--no-export-saved-queries
(docs), those saved queries will all have no-op statuses inrun_results.json
. If I rerun withdbt retry --export-saved-queries
, should dbt reprocess those saved queries, this time with triggers? (The same thing will go for active exposure.)In order to make that happen, I think we'd also need to include
--export-saved-queries
(and the analogous config for exposures) in ALLOW_CLI_OVERRIDE_FLAGS.My first intuition here was: Because
NoOp
is not a failure (or aSkip
due to an upstream failure), it should not be retryable. But it also is notSuccess
.NoOp
retryable (and adding those CLI flags) is a slightly more ergonomic mechanism to quickly "heal" a run that accidentally missed including that config.When we discussed this the other week, I think we landed on including
RunStatus.NoOp
in the retryable statuses — but after reasoning through this above, I don't think we should include it (= leave the code as it is)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this assessment because:
This is also a change we can implement fairly easily if we see a need for it, so I’ll merge the code in as-is for now.