[Bug] Same temp table is used for different unit tests #222

Open
2 tasks done
afillatre opened this issue May 22, 2024 · 6 comments
Labels

  • feature:unit-tests: Issues related to built-in dbt unit testing functionality
  • Stale: Mark an issue or PR as stale, to be closed
  • type:bug: Something isn't working as documented

Comments

@afillatre

Is this a new bug in dbt-bigquery?

  • I believe this is a new bug in dbt-bigquery
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When unit tests on different models share the same name (for example, test_compute_one_line), some tests can fail because of race conditions when running with threads > 1.
The temp table name is <test_name>__dbt_tmp, so one test can overwrite the contents of another test's temp table.
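
To make the collision concrete, here is a minimal Python sketch. It only mimics the `<test_name>__dbt_tmp` naming described above; the `temp_table_name` helper and the model names `model_a` and `model_b` are made up for illustration and are not dbt code.

```python
# Hypothetical sketch: mimics the "<test_name>__dbt_tmp" naming described above.
# This is not dbt code; the helper and the model names are assumptions.
def temp_table_name(test_name: str, suffix: str = "__dbt_tmp") -> str:
    """Build a temp table name from the unit test name only."""
    return f"{test_name}{suffix}"


# Two unit tests on different models that happen to share a name.
unit_tests = [
    ("model_a", "test_compute_one_line"),
    ("model_b", "test_compute_one_line"),
]

# The model name never enters the temp table name, so both tests map to one table.
names = [temp_table_name(test) for _model, test in unit_tests]
collision = len(set(names)) < len(names)
print(names, collision)
```

With more than one thread, both tests may create and read that single table at the same time, which matches the intermittent failures described here.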

Expected Behavior

Unit tests should run in parallel without affecting one another, regardless of their names.
At the moment I'm forced to run my tests with --threads 1 to work around the issue.

Steps To Reproduce

  1. Create 2 models
  2. Create a unit test within each model, both with the same name
  3. Run the tests with --threads set to a value greater than 1

Relevant log output

No response

Environment

- OS: OS X 14.4.1
- Python: 3.11
- dbt-core: 1.8.0
- dbt-bigquery: 1.8.1

Additional Context

No response

@afillatre added the type:bug (Something isn't working as documented) and triage:product (In Product's queue) labels on May 22, 2024
@jtcohen6
Contributor

jtcohen6 commented May 23, 2024

Thanks @afillatre, I'm going to transfer this to dbt-adapters because I believe it's relevant to adapters beyond just BigQuery.

Here's where we generate the temp name:

{%- set target_relation = this.incorporate(type='table') -%}
{%- set temp_relation = make_temp_relation(target_relation)-%}

I think the options are:

  1. Create a temporary table name including both the unit test name and the model it's testing, to avoid this collision
  2. Document that this is a known limitation of unit tests, and each unit test should have a globally unique name

@jtcohen6 removed the triage:product (In Product's queue) label on May 23, 2024
@jtcohen6 transferred this issue from dbt-labs/dbt-bigquery on May 23, 2024
@afillatre
Author

Thanks @jtcohen6. I had that first solution in mind as well. However, I don't know whether any of the supported databases limits the length of a table name.

@jtcohen6
Contributor

@afillatre On Postgres the max identifier length is 63 (which is very short!), so the postgres__make_relation_with_suffix macro (called by make_temp_relation) will hash names longer than 63.
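
For illustration only, here is one generic way to keep a combined name within a 63-character limit while keeping distinct inputs distinct. This is a sketch under stated assumptions, not the logic of postgres__make_relation_with_suffix: the `fit_identifier` helper, the 8-character MD5 digest, and the `__dbt_tmp` default suffix are all choices made for this example.

```python
import hashlib

MAX_IDENTIFIER_LENGTH = 63  # the Postgres limit mentioned above


def fit_identifier(name: str, suffix: str = "__dbt_tmp",
                   max_len: int = MAX_IDENTIFIER_LENGTH) -> str:
    """Hypothetical helper: return name + suffix, shortened to max_len if needed.

    Over-long names are truncated and tagged with a short hash of the full name,
    so two long names sharing a prefix still yield different identifiers.
    """
    full = f"{name}{suffix}"
    if len(full) <= max_len:
        return full
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()[:8]
    # Reserve room for "_" + digest + suffix, then keep as much of the name as fits.
    keep = max_len - len(suffix) - len(digest) - 1
    return f"{name[:keep]}_{digest}{suffix}"


print(fit_identifier("model_a__test_compute_one_line"))
print(len(fit_identifier("m" * 60 + "_a")))
```

Plain truncation without the hash would be simpler, but it could reintroduce the collision for long names that differ only near the end.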

If this is just a matter of passing both the model name + unit test name into make_temp_relation, a naïve version of that change might look like:

  {#- model_test_identifier resolves to model_name__unit_test_name -#}
  {%- set model_test_identifier = model['unique_id'].split('.')[2:] | join('__') -%}
  {%- set target_relation = this.incorporate(type='table', path={"identifier": model_test_identifier}) -%}
  {%- set temp_relation = make_temp_relation(target_relation)-%}
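
The string handling in that snippet can be checked outside dbt with a small Python sketch. It assumes a unit test's unique_id has the shape `unit_test.<project>.<model_name>.<test_name>`; the project name `my_project` and the model names are invented for the example.

```python
def model_test_identifier(unique_id: str) -> str:
    """Mirror of the Jinja expression: split('.')[2:] | join('__').

    Assumes unique_id looks like "unit_test.<project>.<model_name>.<test_name>".
    """
    return "__".join(unique_id.split(".")[2:])


# Hypothetical unique_ids for two same-named tests on different models.
id_a = model_test_identifier("unit_test.my_project.model_a.test_compute_one_line")
id_b = model_test_identifier("unit_test.my_project.model_b.test_compute_one_line")

print(id_a)
print(id_b)
```

Because the model name is now part of the identifier, the two tests no longer resolve to the same temp table name.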

@afillatre
Author

Wonderful, thanks for the tips.

I got it working like this (by overriding a macro):

{% macro bigquery__make_temp_relation(base_relation, suffix) %}
    {#- model_test_identifier resolves to model_name__unit_test_name -#}
    {% set model_test_identifier = model['unique_id'].split('.')[2:] | join('__') %}
    {% set target_relation = this.incorporate(type='table', path={"identifier": model_test_identifier}) %}
    {{ return(target_relation) }}
{% endmacro %}

Should I create a PR in the BigQuery connector?


github-actions bot commented Dec 4, 2024

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions bot added the Stale (Mark an issue or PR as stale, to be closed) label on Dec 4, 2024
@tallamohan

dbt fails to drop the temporary table (__dbt_tmp) when a Runtime Error occurs during unit tests.

(venv) PS C:\GitHub\dbt-teradata\jaffle_shop-dev> dbt test --select "dummy_model,test_type:unit"
06:59:54  Running with dbt=1.8.9
06:59:58  Registered adapter: teradata=1.0.0
07:00:01  Found 12 models, 4 seeds, 20 data tests, 575 macros, 1 unit test
07:00:01
07:00:11  Concurrency: 1 threads (target='dev')
07:00:11
07:00:11  1 of 1 START unit_test dummy_model::test_does_location_opened_at_trunc_to_date . [RUN]
07:00:22  1 of 1 ERROR dummy_model::test_does_location_opened_at_trunc_to_date ........... [ERROR in 11.33s]
07:00:22
07:00:22  Finished running 1 unit test in 0 hours 0 minutes and 20.66 seconds (20.66s).
07:00:22  
07:00:22  Completed with 1 error and 0 warnings:
07:00:22
07:00:22    Runtime Error in unit_test test_does_location_opened_at_trunc_to_date (models\staging\schema.yml)
  An error occurred during execution of unit test 'test_does_location_opened_at_trunc_to_date'. There may be an error in the unit test definition: check the data types.
   Compilation Error in unit_test test_does_location_opened_at_trunc_to_date (models\staging\schema.yml)
    Invalid column name: 'opened_dated' in unit test fixture for expected output.
    Accepted columns for expected output are: ['location_id', 'location_name', 'tax_rate', 'opened_date']

    > in macro format_row (macros\unit_test_sql\get_fixture_sql.sql)
    > called by macro teradata__get_expected_sql (macros\materializations\unit\get_fixture_sql.sql)
    > called by macro get_expected_sql (macros\materializations\unit\get_fixture_sql.sql)
    > called by macro materialization_unit_teradata (macros\materializations\unit\unit.sql)
    > called by unit_test test_does_location_opened_at_trunc_to_date (models\staging\schema.yml)
07:00:22
07:00:22  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
(venv) PS C:\GitHub\dbt-teradata\jaffle_shop-dev> 
(venv) PS C:\GitHub\dbt-teradata\jaffle_shop-dev>
(venv) PS C:\GitHub\dbt-teradata\jaffle_shop-dev>
(venv) PS C:\GitHub\dbt-teradata\jaffle_shop-dev> dbt test --select "dummy_model,test_type:unit"
07:09:05  Running with dbt=1.8.9
07:09:05  Registered adapter: teradata=1.0.0
07:09:06  Found 12 models, 4 seeds, 20 data tests, 575 macros, 1 unit test
07:09:06
07:09:16  Concurrency: 1 threads (target='dev')
07:09:16
07:09:16  1 of 1 START unit_test dummy_model::test_does_location_opened_at_trunc_to_date . [RUN]
07:09:25  1 of 1 ERROR dummy_model::test_does_location_opened_at_trunc_to_date ........... [ERROR in 9.00s]
07:09:25
07:09:25  Finished running 1 unit test in 0 hours 0 minutes and 19.13 seconds (19.13s).
07:09:25  
07:09:25  Completed with 1 error and 0 warnings:
07:09:25
07:09:25    Runtime Error in unit_test test_does_location_opened_at_trunc_to_date (models\staging\schema.yml)
  An error occurred during execution of unit test 'test_does_location_opened_at_trunc_to_date'. There may be an error in the unit test definition: check the data types.
   Database Error
    [Version 20.0.0.20] [Session 22498] [Teradata Database] [Error 3803] Table 'test_does_location_opened_at_trunc_to_date__dbt_tmp' already exists.
     at gosqldriver/teradatasql.formatError ErrorUtil.go:91
     at gosqldriver/teradatasql.(*teradataConnection).formatDatabaseError ErrorUtil.go:251
     at gosqldriver/teradatasql.(*teradataConnection).makeChainedDatabaseError ErrorUtil.go:267
     at gosqldriver/teradatasql.(*teradataConnection).processErrorParcel TeradataConnection.go:751
     at gosqldriver/teradatasql.(*TeradataRows).processResponseBundle TeradataRows.go:2308
     at gosqldriver/teradatasql.(*TeradataRows).executeSQLRequest TeradataRows.go:874
     at gosqldriver/teradatasql.newTeradataRows TeradataRows.go:720
     at gosqldriver/teradatasql.(*teradataStatement).QueryContext TeradataStatement.go:122
     at gosqldriver/teradatasql.(*teradataConnection).QueryContext TeradataConnection.go:1261
     at database/sql.ctxDriverQuery ctxutil.go:48
     at database/sql.(*DB).queryDC.func1 sql.go:1776
     at database/sql.withLock sql.go:3530
     at database/sql.(*DB).queryDC sql.go:1771
     at database/sql.(*Conn).QueryContext sql.go:2027
     at main.createRows goside.go:1080
     at main.goCreateRows goside.go:959
     at _cgoexp_e3ee842aae7c_goCreateRows _cgo_gotypes.go:414
     at runtime.cgocallbackg1 cgocall.go:403
     at runtime.cgocallbackg cgocall.go:322
     at runtime.cgocallback asm_amd64.s:1079
     at runtime.goexit asm_amd64.s:1695
07:09:25  
07:09:25  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
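
The two runs above suggest the temp table created in the first run was left behind when the error was raised. As a generic illustration of cleanup that survives errors, here is a Python sketch using a try/finally block. It is not how the dbt unit materialization is implemented: sqlite3 stands in for Teradata, and `run_unit_test` with its `fail` flag is a hypothetical helper.

```python
import sqlite3


def run_unit_test(conn: sqlite3.Connection, temp_table: str, fail: bool = False) -> str:
    """Hypothetical test runner: create a temp table, do work, always drop the table."""
    conn.execute(f'CREATE TABLE "{temp_table}" (id INTEGER)')
    try:
        if fail:
            # Stand-in for the fixture/compilation error seen in the first run.
            raise RuntimeError("simulated fixture error")
        return "pass"
    finally:
        # Runs whether or not the body raised, so no table is left behind.
        conn.execute(f'DROP TABLE IF EXISTS "{temp_table}"')


conn = sqlite3.connect(":memory:")
name = "test_does_location_opened_at_trunc_to_date__dbt_tmp"

# Run the failing test twice; the second run does not hit "table already exists".
for attempt in range(2):
    try:
        run_unit_test(conn, name, fail=True)
    except RuntimeError as exc:
        print(f"attempt {attempt + 1}: {exc}")
```

Without the finally clause, the second call in this sketch would fail at CREATE TABLE because the table from the first attempt would still exist.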


4 participants