You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Apr 8, 2024. It is now read-only.
How to reproduce
Pretty much just run dbt run with your python model for Athena
Expected behavior
A new table in the specified location in Athena
Actual behavior
I get this error
sqlalchemy.exc.OperationalError: (pyathena.error.OperationalError) [ErrorCategory:USER_ERROR, ErrorCode:SYNTAX_ERROR], Detail:FAILED: SemanticException [Error 10072]: Database does not exist: awsdatacatalog
What I noticed
So I went to the athena.py script in the athena/fal_experimental/support path and started tinkering with the code. I was actually able to get it to "work" by changing some of the variables. I put work in quotes because I don't know if there are any other effects that my code has changed.
But pretty much needed to change references to "schema" to "table" in some places and then I needed to add a blackslash to the staging directory.
I don't mind doing this work and creating a PR, but I've never contributed to someone else's github before, so I'm a little apprehensive to do so.
The text was updated successfully, but these errors were encountered:
Hi @jmon-fluent ! Feel free to do a PR and we will help you merge it.
Thanks! I actually just realized some things that maybe I should discuss first
The final table (after the temp table), seems to get placed in a the s3_stagng_directory instead of the full table path, like the sql tables
E.g. SQL table location= s3://staging_directory/tables/table_name vs Python table location = s3://staging_directory
Would it make sense to update the code so that it matches the SQL s3 location?
I realize that with python, the temp table is created by inserting the data frame values in 1 by 1. One alternative would be to have the data frame write directly to the s3 location as a csv or parquet file, and the temp Athena table just needs to point at this location. Are there any reasons the latter method would be undesirable?
One reason I ask is because "Drop Table" in Athena just removes the table reference, but the data still persists, so when we call drop_relation_if_it_exists(adapter, temp_relation) and then dataframe.to_sql(...), the temp table has all the old records and the new records
If we just overwrite the file, we'd make sure the temp table has only the new records
I haven't checked what incremental mode looks like, so I'll probably ask about that when I get to that
Please let me know any thoughts or concerns any of you might have
Just for update, dbt-fal-1.5.9 is still unable to work with dbt-athena-community. Currently problem appears on resolving database/schema/table in ref/source functions. For me it claims table does not exist and in compiled code it shows quoted reference: refs = {"v_enrolled": "\"awsdatacatalog\".\"partner_dashboards\".\"v_enrolled\""}
or sources = {"analytics.user_programs": "\"awsdatacatalog\".\"analytics\".\"user_programs\""}
Here's a link to the post I made on the DBT help site:
https://discourse.getdbt.com/t/python-dbt-fal-with-athena-error-database-does-not-exist/8051/4
Describe the bug
I tried following the directions here(dbt with AWS Athena and Python models. 1), but I’m getting an error.
I have dbt-fal (v 1.4.7) and dbt-athena-community (v1.4.3) installed in my conda environment
Your environment
How to reproduce
Pretty much just run dbt run with your python model for Athena
Expected behavior
A new table in the specified location in Athena
Actual behavior
I get this error
sqlalchemy.exc.OperationalError: (pyathena.error.OperationalError) [ErrorCategory:USER_ERROR, ErrorCode:SYNTAX_ERROR], Detail:FAILED: SemanticException [Error 10072]: Database does not exist: awsdatacatalog
What I noticed
So I went to the athena.py script in the athena/fal_experimental/support path and started tinkering with the code. I was actually able to get it to "work" by changing some of the variables. I put work in quotes because I don't know if there are any other effects that my code has changed.
But pretty much needed to change references to "schema" to "table" in some places and then I needed to add a blackslash to the staging directory.
I don't mind doing this work and creating a PR, but I've never contributed to someone else's github before, so I'm a little apprehensive to do so.
The text was updated successfully, but these errors were encountered: