Skip to content

Commit

Permalink
add download of queries report
Browse files Browse the repository at this point in the history
  • Loading branch information
wchen4 committed Feb 29, 2024
1 parent 9644a3a commit 64d74de
Show file tree
Hide file tree
Showing 5 changed files with 54 additions and 2 deletions.
1 change: 1 addition & 0 deletions scenarios/account_reporting/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ This scenario shows how you can ingest account data from Treasure Data API's for
- The API base urls for your region can be found here: https://api-docs.treasuredata.com/en/overview/aboutendpoints/#treasure-data-api-baseurls
- `target.database` - Sets the database the data will be ingested to. This database must exist prior to running the workflow.
- `target.tables` - Sets the table names for each report.
- `reports_to_run` - [true|false] Enable/disable the download of a report

2. Upload the workflow with TD CLI.
```
Expand Down
15 changes: 14 additions & 1 deletion scenarios/account_reporting/account_reporting.dig
Original file line number Diff line number Diff line change
Expand Up @@ -5,12 +5,25 @@ _export:
if>: ${reports_to_run.activations_list}
_do:
+get_activations:
py>: activations.get_list
py>: scripts.activations.get_list
destination_db: ${target.database}
destination_tbl: ${target.tables.activations_list}
_env:
TD_API_KEY: ${secret:td.apikey}
TD_API_BASEURL: ${td_api_baseurl}
CDP_API_BASEURL: ${cdp_api_baseurl}
docker:
image: "digdag/digdag-python:3.9"

+get_queries:
if>: ${reports_to_run.queries_list}
_do:
+get_activations:
py>: scripts.queries.get_list
destination_db: ${target.database}
destination_tbl: ${target.tables.queries_list}
_env:
TD_API_KEY: ${secret:td.apikey}
TD_API_BASEURL: ${td_api_baseurl}
docker:
image: "digdag/digdag-python:3.9"
4 changes: 3 additions & 1 deletion scenarios/account_reporting/config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,8 @@ target:
database: account_reporting
tables:
activations_list: activations
queries_list: queries

reports_to_run:
activations_list: true
activations_list: true
queries_list: true
File renamed without changes.
36 changes: 36 additions & 0 deletions scenarios/account_reporting/scripts:queries.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
import os
import pytd
import requests
import pandas as pd
import json

def get_list(destination_db, destination_tbl):
# get all queries
queries_list = get_queries_list()
print(f'{len(queries_list)} queries found')

# remove sql from result
queries_df = pd.DataFrame(queries_list)
queries_df.drop(['query'],axis=1,inplace=True)

if queries_df.empty:
print('No queries found on account')
else:
# write queries to db
apikey = os.environ["TD_API_KEY"]
td_api_baseurl = os.environ["TD_API_BASEURL"]
client = pytd.Client(endpoint=td_api_baseurl,apikey=apikey,database=destination_db,default_engine='presto')
client.load_table_from_dataframe(queries_df,destination=destination_tbl,writer='bulk_import',if_exists='overwrite')

def get_queries_list():
return td_get("/v3/schedule/list")

def td_get(endpoint):
apikey = os.environ["TD_API_KEY"]
headers = {'Authorization': 'TD1 ' + apikey}

td_api_baseurl = os.environ["TD_API_BASEURL"]
request_url = td_api_baseurl + endpoint

response = requests.get(url = request_url, headers = headers)
return (response.json())['schedules']

0 comments on commit 64d74de

Please sign in to comment.