title |
---|
Submitting a PR for a new driver |
If you want to submit a PR to add a driver plugin to the Metabase repo (as opposed to keeping it in a separate repo), you'll need to:
- Be able to run your database locally with Docker.
- Make sure your driver passes Metabase's core test suite.
To test your driver, you'll need to:
- Move your plugin into the
modules/drivers
directory in the Metabase repository. - Add test extensions to your driver.
- Edit
.github/workflows/drivers.yml
to tell GitHub Actions how to set up a Docker image for your database and run tests against it.
Test extensions do things like create new databases and load data for given database definitions. Metabase defines a huge suite of tests that automatically run against all drivers, including your new driver.
To run the test suite with your driver, you'll need to write a series of method implementations for special test extension multimethods. Test extensions do things like create new databases and load data for database definitions.
These test extensions will tell Metabase how to create new databases and load them with test data, and provide information about what Metabae can expect from the created database. Test extensions are simply additional multimethods used only by tests. Like the core driver multimethods, they dispatch on the driver name as a keyword, e.g. :mysql
.
Test extensions for a driver usually live in a namespace called metabase.test.data.<driver>
. If your driver is for SQLite, your files should look something like:
metabase/modules/drivers/sqlite/deps.edn ; <- deps go in here
metabase/modules/drivers/sqlite/resources/metabase-plugin.yaml ; <- plugin manifest
metabase/modules/drivers/sqilte/src/metabase/driver/sqlite.clj ; <- main driver namespace
So you'll create a new directory and file to house your text extension method implementations.
metabase/modules/drivers/sqlite/test/metabase/test/data/sqlite.clj ; <- test extensions
Metabase test extensions live in the metabase.test.data.interface
namespace. Like the core driver methods, :sql
and :jdbc-sql
implement some of the test extensions themselves, but define additional methods you must implement to use them; see the metabase.test.data.sql
and metabase.test.data.sql-jdbc
namespaces.
You'll need to require the following namespaces, aliased like so:
(require '[metabase.test.data.interface :as tx]) ; tx = test extensions
(require '[metabase.test.data.sql :as sql.tx]) ; sql test extensions
(require '[metabase.test.data.sql-jdbc :as sql-jdbc.tx])
Like the driver itself, you need to register the fact that your driver has test extensions, so Metabase knows it doesn't need to try to load them a second time. (If they're not loaded yet, Metabase will load them when needed by looking for a namespace named metabase.test.data.<driver>
, which is why you need to follow that naming pattern.) The :sql
and :sql-jdbc
drivers have their own sets of test extensions, so depending on which parent you're using for your driver, register test extensions with:
# Non-SQL drivers
(tx/add-test-extensions! :mongo)
# non-JDBC SQL
(sql/add-test-extensions! :bigquery)
# JDBC SQL
(sql-jdbc.tx/add-test-extensions! :mysql)
You only need one call -- there's no need to do all three for a :sql-jdbc
driver. This call should go at the beginning of your test extension namespace, like this:
(ns metabase.test.data.mysql
(:require [metabase.test.data.sql-jdbc :as sql-jdbc.tx]))
(sql-jdbc.tx/register-test-extensions! :mysql)
Let's look at an real-life Metabase test so we can understand how it works and what exactly we need to do to power it:
;; expect-with-non-timeseries-dbs = run against all drivers listed in `DRIVERS` env var except timeseries ones like Druid
(expect-with-non-timeseries-dbs
;; expected results
[[ 5 "Brite Spot Family Restaurant" 20 34.0778 -118.261 2]
[ 7 "Don Day Korean Restaurant" 44 34.0689 -118.305 2]
[17 "Ruen Pair Thai Restaurant" 71 34.1021 -118.306 2]
[45 "Tu Lan Restaurant" 4 37.7821 -122.41 1]
[55 "Dal Rae Restaurant" 67 33.983 -118.096 4]]
;; actual results
(-> (data/run-mbql-query venues
{:filter [:ends-with $name "Restaurant"]
:order-by [[:asc $id]]})
rows formatted-venues-rows))
Let's say we launch tests with
DRIVERS=mysql clojure -X:dev:drivers:drivers-dev:test`.
- Metabase will check and see if test extensions for
:mysql
are loaded. If not, it will(require 'metabase.test.data.mysql)
. - Metabase will check to see if the default
test-data
database has been created for MySQL, loaded with data, and synced. If not, it will call the test extension methodtx/load-data!
to create thetest-data
database and load data into it. After loading the data, Metabase syncs the test database. (This is discussed in more detail below.) - Metabase runs an MBQL query against the
venues
table of the MySQLtest-data
database. Therun-mbql-query
macro is a helper for writing tests that looks up Field IDs based on names for symbols that have$
in from of them. Don't worry too much about that right now; just know the actual query that is ran will look something like:{:database 100 ; ID of MySQL test-data database :type :query :query {:source-table 20 ; Table 20 = MySQL test-data.venues :filter [:ends-with [:field-id 555] "Restaurant"] ; Field 555 = MySQL test-data.venues.name :order-by [[:asc [:field-id 556]]]}} ; Field 556 = MySQL test-data.venues.id
- The results are ran through helper functions
rows
andformatted-venues-rows
which return only the parts of the query results we care about - Those results are compared against the expected results.
That's about as much as you'd need to know about the internals of how Metabase tests work; now that we've covered that, let's take a look at how we can empower Metabase to do what it needs to do.
In order to ensure consistent behavior across different drivers, the Metabase test suite creates new databases and load datas into them from a set of shared Database Definitions. That means whether we're running a test against MySQL, Postgres, SQL Server, or MongoDB, a single test can check that we get the exact same results for every driver!
Most of these database definitions live in EDN files; the majority of tests run against a test database named "test data", whose definition can be found here. Take a look at that file -- it's just a simple set of tables names, column names and types, and then a few thousand rows of data to load into those tables.
Like test extension method definitions, schemas for DatabaseDefinition
live in metabase.test.data.interface
-- you can take a look and see exactly what a database definition is supposed to look like.
Your biggest job as a writer of test definitions is to write the methods needed to take a database definition, create a new database with the appropriate tables and columns, and load data into it. For non-SQL drivers, you'll need to implement tx/load-data!
; :sql
and :sql-jdbc
have a shared implementation used by child drivers, but define their own set of test extension methods. For example, :sql
(and :sql-jdbc
) will handle the DDL statements for creating tables, but need to know what type it should use for the primary key, so you'll need to implement sql.tx/pk-sql-type
:
(defmethod sql.tx/pk-sql-type :mysql [_] "INTEGER NOT NULL AUTO_INCREMENT")
I'd like to document every single test extension method in detail here, but until I find the time to do that, the methods are all documented in the codebase itself; take a look at the appropriate test extension namespaces and see which methods you'll need to implement. You can also refer to the test extensions written for other similar drivers to get a picture of what exactly it is you need to be doing.
Of course, Metabase also needs to know how it can connect to your newly created database. Specifically, it needs to know what it should save as part of the connection :details
map when it saves the newly created database as a Database
object. All drivers with test extensions need to implement tx/dbdef->connection-details
to return an appropriate set of :details
for a given database definition. For example:
(defmethod tx/dbdef->connection-details :mysql [_ context {:keys [database-name]}]
(merge
{:host (tx/db-test-env-var :mysql :host "localhost")
:port (tx/db-test-env-var :mysql :port 3306)
:user (tx/db-test-env-var :mysql :user "root")
;; :timezone :America/Los_Angeles
:serverTimezone "UTC"}
(when-let [password (tx/db-test-env-var :mysql :password)]
{:password password})
(when (= context :db)
{:db database-name})))
Let's take a look at what's going on here.
tx/dbdef->connection-details
is called in two different contexts:
- When creating a database,
- And when loading data into one and syncing.
Most databases won't let you connect to a database that hasn't been created yet, meaning something like a CREATE DATABASE "test-data";
statement would have to be ran without specifying test-data
as part of the connection. Thus, the context
parameter. context
is either :server
, meaning "give me details for connecting to the DBMS server, but not to a specific database", or :db
, meaning "give me details for connecting to a specific database". In MySQL's case, it adds the :db
connection property whenever context is :db
.
You'll almost certainly be running your database in a local Docker container. Rather than hardcode the connection details (the username, host, port...) for the Docker container, we'd like to be flexible, and let people specify those in environment variables, in case they're running against a different container or are just running the database outside of a container, or on another computer entirely. You can use tx/db-test-env-var
to get details from environment variables. For example,
(tx/db-test-env-var :mysql :user "root")
Tells Metabase to look for the environment variable MB_MYSQL_TEST_USER
; if not found, default to "root"
. The name of the environment variable follows the pattern MB_<driver>_TEST_<property>
, as passed into the function as first and second args, respectively. You don't need to specify a default value for tx/db-test-env-var
; perhaps user
is an optional parameter; and if MB_MYSQL_TEST_USER
isn't specified, you don't need to specify it in the connection details.
But what about properties you want to require, but do not have sane defaults? In those cases, you can use tx/db-test-env-var-or-throw
. It the corresponding enviornment variable isn't set, these will throw an Exception, ultimately causing tests to fail.
;; If MB_SQLSERVER_TEST_USER is unset, the test suite will quit with a message saying something like
;; "MB_SQLSERVER_TEST_USER is required to run tests against :sqlserver"
(tx/db-test-env-var-or-throw :sqlserver :user)
Note that tx/dbdef->connection-details
won't get called in the first place for drivers you aren't running tests against (i.e., drivers not listed in the DRIVERS
env var), so you wouldn't see that SQL Server error message when running tests against Mongo, for example.
Besides tx/db-test-env-var
, metabase.test.data.interface
has several other helpful utility functions. Take a good look at that namespace as well as metabase.test.data.sql
if your database uses SQL and metabase.test.data.sql-jdbc
if your database uses a JDBC driver.
There's a few other things Metabase needs to know when comparing test results. For example, different databases name tables and columns in different ways; methods exist to let Metabase know it should expect something like the venues
table in the test-data
Database Definition to come back as VENUES
for a database that uppercases everything. (We consider such minor variations in naming to still mean the same thing.) Take a look at tx/format-name
and other methods like that and see which ones you need to implement.
This is actually a common problem, and luckily we have figured out how to work around it. The solution is usually something like using different schemas in place of different databases, or prefixing table names with the database name, and creating everything in the same database. For SQL-based databases, you can implement sql.tx/qualified-name-components
to have tests use a different identifier instead of what they would normally use, for example "shared_db"."test-data_venues".id
instead of "test-data".venues.id
. The SQL Server and Oracle test extensions are good examples of such black magic in action.
Once you have all the tests passing, you'll need to set up GitHub Actions to run those tests against your driver. You'll need to add a new job to .github/workflows/drivers.yml
to run tests against your database.
Here is an example configuration for PostgreSQL.
be-tests-postgres-latest-ee:
needs: files-changed
if: github.event.pull_request.draft == false && needs.files-changed.outputs.backend_all == 'true'
runs-on: ubuntu-22.04
timeout-minutes: 60
env:
CI: 'true'
DRIVERS: postgres
MB_DB_TYPE: postgres
MB_DB_PORT: 5432
MB_DB_HOST: localhost
MB_DB_DBNAME: circle_test
MB_DB_USER: circle_test
MB_POSTGRESQL_TEST_USER: circle_test
MB_POSTGRES_SSL_TEST_SSL: true
MB_POSTGRES_SSL_TEST_SSL_MODE: verify-full
MB_POSTGRES_SSL_TEST_SSL_ROOT_CERT_PATH: 'test-resources/certificates/us-east-2-bundle.pem'
services:
postgres:
image: circleci/postgres:latest
ports:
- "5432:5432"
env:
POSTGRES_USER: circle_test
POSTGRES_DB: circle_test
POSTGRES_HOST_AUTH_METHOD: trust
steps:
- uses: actions/checkout@v4
- name: Test Postgres driver (latest)
uses: ./.github/actions/test-driver
with:
junit-name: 'be-tests-postgres-latest-ee'
For more on what it is you're doing here and how all this works, see Workflow syntax for GitHub Actions.