- Overview
- Setup
- Write or modify an integration test
- Implement the functionality
- Commit your changes and open a pull request
To get started, you'll need:

- python3
- make (optional, but recommended for a better development experience)[^1]
- Docker (optional, but recommended for easily using Postgres as your target database)[^2]
Packages in dbt are actually dbt projects themselves: you write SQL and Jinja, sometimes in macros, to add new functionality or models to another dbt project. Since SQL and Jinja rely on input data, it's essential to have a functioning project to test that the code works as expected. Constantly running the code, loading data, running bits and pieces, and hoping for the best is not a good development flow, though, nor is it a reliable way to ensure that everything works. This is why our dbt packages have integration tests. These tests run all of the data loading, model building, and tests defined in the package inside testing environments, and check that the results are as expected.
If you add or modify functionality in any codegen macros, there should be corresponding changes to the integration tests. This README will walk you through this process. Let's outline the basic steps first:
- Set up your environment (credentials, virtual environment, dependencies, test database(s))
- Write or modify an integration test (you should expect this to fail as you haven't implemented the functionality yet!)
- Implement the functionality in the new or modified macro, and run the tests to get them to pass.
- Commit your changes and open a pull request.
You'll need to set environment variables with the credentials to access your target database. If you're using the recommended local development path of Postgres in Docker, these values are already filled in, as they are generic. For the cloud warehouses listed, you'll need real credentials. You'll probably also want to ensure you're building into a testing schema, to keep the output of these test runs separate from any production data. We run against all the listed warehouses in CI (implemented via CircleCI) when you open a PR, so feel free to test against Postgres while developing, and we'll ensure the code works against all the other targets.
You can set these env vars in a couple of ways:
- Temporary: Set these environment variables in your shell before running the tests. This is the easiest way to get started, but you'll have to set them every time you open a new terminal.
- Reusable: If you anticipate developing for multiple sessions, set these environment variables in your shell profile (like `~/.bashrc` or `~/.zshrc`). This way, you won't have to set them every time you open a new terminal.
The environment variables you'll need to set for each adapter can be found in `integration_tests/.env/`.
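For example, the temporary approach is just exporting the variables in your current shell before running the tests. The variable names and values below are purely illustrative; use the ones actually listed in the files under `integration_tests/.env/` for your adapter:

```shell
# Illustrative only -- check integration_tests/.env/ for the exact variable
# names your adapter expects, then export them before running the tests:
export POSTGRES_TEST_HOST=localhost
export POSTGRES_TEST_PORT=5432
export POSTGRES_TEST_USER=<your_user>
export POSTGRES_TEST_PASS=<your_password>
export POSTGRES_TEST_DBNAME=<your_test_database>
```

For the reusable approach, the same `export` lines go in your shell profile instead.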
As mentioned, you'll need a target database to run the integration tests and develop against. You can use a cloud warehouse, but the easiest (and free) way to work is to run Postgres locally. We include a `docker-compose.yml` file that will spin up a Postgres container for you to make this easy.
To run the Postgres container, just run:

```shell
make setup-db
```

Or, alternatively:

```shell
docker-compose up --detach postgres
```
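If you want to confirm the container actually came up before moving on, a quick check is to list the running containers (this assumes the Compose service is named `postgres`, as the command above suggests):

```shell
# Show running containers whose name includes "postgres"; you should see the
# container started by docker-compose listed with an "Up" status.
docker ps --filter "name=postgres"
```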
> **Note**
> `make` is a venerable build tool that is included in most Unix-like operating systems. It's not strictly necessary to use `make` to develop on this project, but there are several `make` commands that wrap more complex commands and make development easier. If you don't have `make` installed or don't want to use it, you can just run the commands in the `Makefile` directly. All the examples will show both options.
We strongly recommend using virtual environments when developing code in `dbt-codegen`. We recommend creating this virtual environment in the root of the `dbt-codegen` repository. To create a new virtual environment, run:

```shell
python3 -m venv .venv
source .venv/bin/activate
```
This will create and activate a new Python virtual environment.
First, make sure that you've set up your virtual environment as described above. Also ensure you have the latest versions of pip and setuptools installed:

```shell
python3 -m pip install --upgrade pip setuptools
```
Next, install `dbt-core` (and its dependencies) with:

```shell
make dev target=[postgres|redshift|...]
# or
python3 -m pip install dbt-core dbt-[postgres|redshift|...]
```
Or, more specifically:

```shell
make dev target=postgres
# or
python3 -m pip install dbt-core dbt-postgres
```
Make sure to reload your virtual environment after installing the dependencies:
```shell
source .venv/bin/activate
```
Run all the tests before you start developing to make sure everything is working as expected before you start making changes. Nothing is worse than spending a ton of time troubleshooting a failing test, only to realize it was failing before you touched anything. This will also ensure that you have the correct environment variables set up and that your database is running.
To run all the integration tests on your local machine like they will get run in CI:
```shell
make test target=[postgres|redshift|...]
# or
./run_test.sh [postgres|redshift|...]
```
Or, more specifically:

```shell
make test target=postgres
# or
./run_test.sh postgres
```
To run all the integration tests on your local machine like they will get run in the CI (using GitHub workflows with tox):
```shell
make test_tox target=postgres
```
Adding integration tests for new functionality typically involves making one or more of the following:
- a new seed file of fixture data
- a new model file to test against
- a new test to assert anticipated behaviour
Once you've added and/or edited the necessary files, assuming you are in the sub-project in the `integration_tests` folder, you should be able to run and test your new additions specifically by running:

```shell
dbt deps --target {your_target}
dbt build --target {your_target} --select +{your_selection_criteria}
```

The `dbt build` command will handle seeding, running, and testing the selection in a single command. The `+` operator in the `--select` flag indicates we also want to build everything that this selection depends on.
Or simply run `make dev target={your_target}` and then `make test target={your_target}` if you're okay with running the entire project and all tests.
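To make this more concrete, here's a rough sketch of what adding a test for new macro behaviour might involve. All of the file and model names below are purely illustrative, not actual files in this repo; mirror the naming of the existing seeds, models, and tests in `integration_tests/` instead:

```shell
# Hypothetical new files (illustrative names only):
#   integration_tests/seeds/data_my_new_case.csv    # fixture data the macro will read
#   integration_tests/models/test_my_new_case.sql   # model that calls the new/changed macro
#   integration_tests/models/schema.yml             # test asserting the model's expected output

# Then, from inside integration_tests/, build and test just that slice against local Postgres:
cd integration_tests
dbt deps --target postgres
dbt build --target postgres --select +test_my_new_case
```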
Remember, typically you'll want to create a failing test first, then implement the functionality to make it pass. This is called "test-driven development" (TDD), and it's a great way to ensure that your code really does what you expect it to. For example, imagine you wrote a test expecting it to fail, but it passed before you even implemented your logic! That would mean the test is not actually testing what you want, and you'd need to re-evaluate your assumptions. That's something you want to catch early in the development process, and it's what TDD is all about. So, expect the tests you've just added to fail when you run them at this stage, before you've implemented the new logic.
Okay, finally, this is the fun part! You can now implement the functionality in the macro you're working on. The development flow should be something like:
- You've got a failing test, so you know what you need to implement.
- Implement some logic in the macro you're working on.
- Run the relevant tests to see if they pass.
- Repeat until the tests pass.
- Run the full test suite to ensure you didn't break anything else by accident.
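In practice, one iteration of that loop might look something like this (the selection name is hypothetical, and Postgres is just the cheapest local target to iterate against):

```shell
# From inside integration_tests/, rebuild only the models/tests you're iterating on:
dbt build --target postgres --select +test_my_new_case

# Once that passes, run the full suite from the repo root to make sure nothing else broke:
make test target=postgres
# or
./run_test.sh postgres
```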
Once your tests are passing and you're happy with the code, you'll want to commit it and open a new PR on GitHub. Don't forget to run the full test suite against your target database before you open a PR to make sure you didn't accidentally break any existing functionality. When you open a PR, CircleCI will run the same test suite against all the database targets. If they're passing, we'll triage and review the code as soon as we can! Thank you for contributing to dbt-codegen!
[^1]: If you're on a Mac, `make` is probably best installed with the Xcode Command Line Tools, or you can install `make` via Homebrew with `brew install make`. If you're on Windows, you can either use the Windows Subsystem for Linux (WSL) or use `scoop` or `chocolatey` to install `make`. If you're on Linux, you probably already have `make` installed.

[^2]: Specific instructions on installing and getting started with Docker for your OS can be found here.