Add tags to SavedQueries #10987

Merged
merged 4 commits into from
Dec 19, 2024

theyostalservice
Contributor

@theyostalservice theyostalservice commented Nov 11, 2024

Resolves #11155

Problem

Folks would like to be able to execute exports based on tags. We can't easily support individual exports in the dbt dag right now because the exports aren't proper nodes in the dbt graph, but we can provide similar functionality by allowing them to tag the saved queries that declare those exports.

This provides two related pieces of functionality:

  • Tags can be added directly to the saved query, in the way described in our docs for non-config objects. They can be written in any of these forms:
    • tags: ["my_tag_1", "my_tag_2"]
    • tags: "my_tag"
    • tags:
         - "my_tag_1"
         - "my_tag_2"
      
  • Tags may also be set on, and inherited from, config objects.

Solution

To do this, a preceding PR updated the schemas and Pydantic class for saved queries in dbt-semantic-interfaces (DSI). This PR updates how saved queries are parsed in dbt-core (for the manifest, not the semantic manifest). It also adds a little special logic to handle the case where the tags argument is a single string, and to sort and de-duplicate tags, because that's just nicer.
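
As a rough illustration of the single-string handling and the sort/de-duplicate step described here, the following is a minimal sketch. `normalize_tags` and its parameters are hypothetical names chosen for this example; this is not the code added to dbt-core, only one way the described behavior could look.

```python
from typing import Iterable, List, Optional, Union


def normalize_tags(
    raw_tags: Union[str, Iterable[str], None],
    inherited_tags: Optional[Iterable[str]] = None,
) -> List[str]:
    """Hypothetical helper: merge, de-duplicate, and sort tags."""
    if raw_tags is None:
        own: List[str] = []
    elif isinstance(raw_tags, str):
        # A bare string such as `tags: "my_tag"` is one tag,
        # not a sequence of characters.
        own = [raw_tags]
    else:
        own = list(raw_tags)
    # A set removes duplicates; sorted() gives a stable, readable order.
    return sorted(set(own) | set(inherited_tags or []))


print(normalize_tags(["tag_b", "tag_d", "tag_a"], ["tag_config_1"]))
print(normalize_tags("tag_A", ["tag_config_1"]))
```

Sorting here is plain string ordering, so digits sort before uppercase letters and uppercase before lowercase.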

(Linked PR in schemas.getdbt.com)

Manual Testing

To test this locally, I ran make dev and then used the local version of dbt-core to parse a test project we have.

I added this to the dbt_project.yml:

saved-queries:
  +cache:
    enabled: true
  +tags: 
    - tag_config_1
  enabled: false

(only the tags lines were new)

And I added this line to a saved query in a yml file:
tags: ['tag_b', 'tag_d', 'tag_a']

I ran ~/git/dbt-core/venv/bin/dbt parse (I had some weirdness when I didn't point at my venv), and checked the manifest file and the semantic manifest. They both had sections like this for the saved query:
"tags": ["tag_a", "tag_b", "tag_config_1", "tag_d"]}]}

I tried it again with lists of tags in my saved queries structured like

  • tags: 
        - "tag_A"
        - "tag_2"
    

    resulting in "tags": ["tag_2", "tag_A", "tag_config_1"]}]}

  • and

      tags: "tag_A"
    

    resulting in "tags": ["tag_A", "tag_config_1"]}]}
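
The manual check of the manifest can also be scripted. The sketch below assumes the parsed manifest exposes saved queries under a top-level `saved_queries` key with a `tags` list on each entry; the function name, the unique ID, and the project name `my_project` are made up for illustration, so adjust them to whatever your manifest actually contains.

```python
import json
from typing import Dict, List


def saved_query_tags(manifest: dict) -> Dict[str, List[str]]:
    """Hypothetical helper: map each saved query's ID to its tag list."""
    # Assumption: saved queries live under "saved_queries", keyed by unique ID.
    saved_queries = manifest.get("saved_queries", {})
    return {uid: list(node.get("tags", [])) for uid, node in saved_queries.items()}


# Stand-in for reading target/manifest.json after `dbt parse`.
sample_manifest = json.loads(
    """
    {
      "saved_queries": {
        "saved_query.my_project.with_list_tags_v2": {
          "name": "with_list_tags_v2",
          "tags": ["tag_a", "tag_b", "tag_config_1", "tag_d"]
        }
      }
    }
    """
)

for unique_id, tags in saved_query_tags(sample_manifest).items():
    print(unique_id, tags)
```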

Running with these tags selected
The difference in dbt-core is actually minimal.

If I run this with a selector on a tag that doesn't exist, I see:

~/git/dbt-core/venv/bin/dbt build --select tag:tag_45
16:35:49 Running with dbt=1.10.0-a1
16:35:49 Registered adapter: snowflake=1.8.4
16:35:50 Found 10 models, 6 seeds, 18 data tests, 15 sources, 18 metrics, 683 macros, 1 group, 5 semantic models, 1 saved query
16:35:50 The selection criterion 'tag:tag_45' does not match any enabled nodes
16:35:50 The selection criterion 'tag:tag_45' does not match any enabled nodes
16:35:50 The selection criterion 'tag:tag_45' does not match any enabled nodes
16:35:50 Nothing to do. Try checking your model configs and model specification args

Whereas with a real tag, we build as expected:

~/git/dbt-core/venv/bin/dbt build --select tag:tag_a
16:35:55 Running with dbt=1.10.0-a1
16:35:55 Registered adapter: snowflake=1.8.4
16:35:55 Found 10 models, 6 seeds, 18 data tests, 15 sources, 18 metrics, 683 macros, 1 group, 5 semantic models, 1 saved query
16:35:55
16:35:55 Concurrency: 8 threads (target='dev')
16:35:55
16:35:57 1 of 1 NO-OP saved query with_list_tags_v2 ..................................... [NO-OP in 0.00s]
16:35:57
16:35:57 Finished running 1 saved query in 0 hours 0 minutes and 1.42 seconds (1.42s).
16:35:57
16:35:57 Completed successfully
16:35:57
16:35:57 Done. PASS=0 WARN=0 ERROR=0 SKIP=0 NO-OP=1 TOTAL=1
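
Conceptually, the two runs above differ only in whether any node carries the requested tag. The sketch below shows that idea with a plain dictionary; `select_by_tag` and the node ID are hypothetical, and this is not how dbt's selector is implemented.

```python
from typing import Dict, List


def select_by_tag(tags_by_node: Dict[str, List[str]], tag: str) -> List[str]:
    """Hypothetical helper: return the IDs of nodes that carry `tag`."""
    return sorted(uid for uid, tags in tags_by_node.items() if tag in tags)


# Illustrative data only; the unique ID is made up.
nodes = {
    "saved_query.my_project.with_list_tags_v2": [
        "tag_a", "tag_b", "tag_config_1", "tag_d",
    ],
}

print(select_by_tag(nodes, "tag_45"))  # no node has this tag, so nothing to run
print(select_by_tag(nodes, "tag_a"))   # the saved query is selected
```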

Checklist

  • I have read the contributing guide and understand what's expected of me.
  • I have run this code in development, and it appears to resolve the stated issue.
  • This PR includes tests, or tests are not required or relevant for this PR.
  • This PR has no interface changes (e.g., macros, CLI, logs, JSON artifacts, config files, adapter interface, etc.) or this PR has already received feedback and approval from Product or DX.
  • This PR includes type annotations for new and modified functions.

@cla-bot cla-bot bot added the cla:yes label Nov 11, 2024
Contributor

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

Contributor Author

theyostalservice commented Nov 11, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

theyostalservice added a commit to dbt-labs/dbt-semantic-interfaces that referenced this pull request Nov 14, 2024
Resolves [#369](#369)

### Description

Addresses internal Linear issue SL-2896.

Users currently can execute parts of their DAG conditionally based on tags added to individual nodes (see [documentation](https://docs.getdbt.com/reference/resource-configs/tags)).  This brings SavedQueries in line with other similar nodes and allows the use of tags as described in that documentation.  (See the added tests for examples.)

It does not add any hierarchical behaviors for these tags.

A related [PR #10987](dbt-labs/dbt-core#10987) is in progress in dbt-core.

### Checklist

- [x] I have read [the contributing guide](https://github.com/dbt-labs/dbt-semantic-interfaces/blob/main/CONTRIBUTING.md) and understand what's expected of me
- [x] I have signed the [CLA](https://docs.getdbt.com/docs/contributor-license-agreements)
- [x] This PR includes tests, or tests are not required/relevant for this PR
- [x] I have run `changie new` to [create a changelog entry](https://github.com/dbt-labs/dbt-semantic-interfaces/blob/main/CONTRIBUTING.md#adding-a-changelog-entry)
@theyostalservice theyostalservice force-pushed the patricky__add_tags_to_saved_query branch from 2d39824 to 56d3899 Compare November 14, 2024 20:17
@theyostalservice theyostalservice force-pushed the patricky__add_tags_to_saved_query branch from 54b5b3c to f2b8786 Compare December 16, 2024 17:18

codecov bot commented Dec 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.87%. Comparing base (6c61cb7) to head (06ba8be).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10987      +/-   ##
==========================================
- Coverage   88.90%   88.87%   -0.04%     
==========================================
  Files         187      187              
  Lines       24001    24018      +17     
==========================================
+ Hits        21338    21345       +7     
- Misses       2663     2673      +10     
Flag Coverage Δ
integration 86.18% <100.00%> (-0.10%) ⬇️
unit 61.98% <55.55%> (-0.01%) ⬇️


Components Coverage Δ
Unit Tests 61.98% <55.55%> (-0.01%) ⬇️
Integration Tests 86.18% <100.00%> (-0.10%) ⬇️

@theyostalservice theyostalservice marked this pull request as ready for review December 16, 2024 17:55
@theyostalservice theyostalservice requested review from a team as code owners December 16, 2024 17:55
@theyostalservice theyostalservice requested review from wpowers-dbt and removed request for a team December 16, 2024 17:55
@github-actions github-actions bot added the community This PR is from a community member label Dec 16, 2024
@theyostalservice theyostalservice changed the title InProgress: Add tags to SavedQueries Add tags to SavedQueries Dec 16, 2024
@theyostalservice theyostalservice added the artifact_minor_upgrade To bypass the CI check by confirming that the change is not breaking label Dec 16, 2024
Contributor

@ChenyuLInx ChenyuLInx left a comment

LGTM! Might be worth having folks on metadata side confirm the schema change is safe for them.

@@ -55,7 +55,8 @@
 "success",
 "error",
 "skipped",
-"partial success"
+"partial success",
+"no-op"

I thought this had already been merged as I see it here: https://schemas.getdbt.com/dbt/run-results/v6.json

theyostalservice added a commit to dbt-labs/schemas.getdbt.com that referenced this pull request Dec 18, 2024
This PR copies over the changes from [this change ](dbt-labs/dbt-core#10987)
to dbt-core.
theyostalservice added a commit to dbt-labs/schemas.getdbt.com that referenced this pull request Dec 18, 2024
This PR copies over the changes from [this change ](dbt-labs/dbt-core#10987)
to dbt-core.
@theyostalservice theyostalservice added the semantic Issues related to the semantic layer label Dec 19, 2024
Contributor

@courtneyholcomb courtneyholcomb left a comment

LGTM!!

theyostalservice added a commit to dbt-labs/schemas.getdbt.com that referenced this pull request Dec 19, 2024
This PR copies over the changes from [this change ](dbt-labs/dbt-core#10987)
to dbt-core.
@ChenyuLInx ChenyuLInx merged commit 97ffc37 into main Dec 19, 2024
64 of 69 checks passed
@ChenyuLInx ChenyuLInx deleted the patricky__add_tags_to_saved_query branch December 19, 2024 18:18
theyostalservice added a commit to dbt-labs/dbt-jsonschema that referenced this pull request Dec 19, 2024
Title says it all.

Linked PRs:
* the starting point at [dbt-semantic-interfaces](https://github.com/dbt-labs/dbt-semantic-interfaces/pull/366/files)
* [dbt-core](dbt-labs/dbt-core#10987)
* [schemas.getdbt.com](dbt-labs/schemas.getdbt.com#75)

Example of how this would look in an actual saved
query in a yml file:
```
saved_queries:
  - name: with_list_tags_v2
    tags: ['tag_b', 'tag_d', 'tag_a']
    description: New customer orders by name and time
    query_params:
      metrics:
        - orders
      group_by:
        - Dimension('customer__customer_name')
        - TimeDimension('metric_time', 'day')
      where:
        - "{{ Dimension('customer__customer_type') }}  = 'new'"
    exports:
      - name: new_customer_orders_table
        config:
          export_as: table
      - name: new_customer_orders_view
        config:
          export_as: view
          alias: new_customer_orders_export_alias
          testing: "what"
          breaking: 'break it now'
```
Labels
artifact_minor_upgrade To bypass the CI check by confirming that the change is not breaking cla:yes community This PR is from a community member semantic Issues related to the semantic layer
Development

Successfully merging this pull request may close these issues.

[Feature] Support use of "tags" field for SavedQuery
4 participants