Universal Q&A database tool
AskDB is a Q&A database tool that allows to ask questions to your relational data in natural language and get answers, using the OpenAI LLMs.
In very brief, what it does is:
- Connect to the database of choice
- Extract the metadata (tables schemas and example rows)
- Prompt the LLM with the metadata togheter with the desired question
- The LLM responds with the corresponding SQL query
- The query is executed
- The LLM is called a second time, converting the result set in a natural language answer
For now, supported databases are: PostgreSQL, MySql, Sqlite. AskDB, however, is designed to be easly extended to new databases (including data wharehouse platforms such as BigQuery) and will be in the future.
Keep in mind that this is a project in its very early stages and for sure many features are missing.
AskDB is written in Python. The design philosophy is to keep things simple and to create from scratch, avoiding frameworks such as LangChain (powerful but bloated and in my opinion full of questionable abstractions). Thus, this project is built on top of just a few libraries:
- SqlAlchemy for database handling
- NiceGUI for the web based graphical interface
- Click for the CLI
And a couple of others.
You need Python (>= 3.10), Pip and Git installed on your system.
Installation is very straightforward:
pip install -U pip
pip install git+https://github.com/marcolardera/askdb
Before using Askdb, you need to configure the connections to your databases. In order to do so follow these steps:
- Create a
askdb
folder inside your configuration folder as defined by the XDG Base Directory Specification. On Linux/MacOS, by default, it is the~/.config
folder but can be changed using theXDG_CONFIG_HOME
environment variable. On Windows it should beC:/Users/<USERNAME>/.config
. - Create a
connections.json
file inside that folder. The full path should be something like~/.config/askdb/connections.json
orC:/Users/<USERNAME>/.config/askdb/connections.json
- Fill the file with connections informations as defined below
Example of connections.json
:
[
{
"name": "Postgres",
"connection_string": "postgresql://<USER>:<PASSWORD>@<HOST>:<PORT>/database"
},
{
"name": "MySql",
"connection_string": "mysql://<USER>:<PASSWORD>@<HOST>:<PORT>/database"
},
{
"name": "Sqlite",
"connection_string": "sqlite:////path/to/my/database.db"
}
]
The name
field is just the id of the connection, you can put whatever you want in it. The connection_string
field follows the standard db connection string convention. In general, if the string is valid for SqlAlchemy, then it is valid for AskDB. You can have as many connections as you want inside this file.
Keep in mind that the supported (and tested) databases are PostgreSQL, MySql and Sqlite but the tool should work even with other DBMS supported by SqlAlchemy, with the only difference being that in this case it would use a generic prompt instead of a customised one.
Once the connections have been configured, one final step remains. Since AskDB uses the OpenAI models, an Openai API Key must be provided. To do so, simply set the environment variable OPENAI_API_KEY
.
THIS IS VERY IMPORTANT: Letting a LLM create a query and executing it without human validation may be dangerous for your data. What if the model hallucinates a INSERT/UPDATE/DELETE statement instead of a SELECT? The probability of this happening is low but not zero. In order to avoid this risk remember to always create users with SELECT privileges only to use for this tool.
Also keep in mind the privacy implications of sending database schemas and examples of data to a remote LLM. Always check regulations, policies, etc.
In order to run AskDB just type:
askdb
The web interface will open in the browser.
The use should be self-evident: simply select the database to be queried (defined in the connections.json
file) and the OpenAI model, enter the question and press 'Submit'. After a while you will get the answer. By clicking on "Show SQL" you can view the SQL query generated by the LLM and a table with the results.
Although all the OpenAI chat models are available, it is recommended to use the gpt-4/gpt-4-turbo family, as they are able to follow instructions much more accurately.
There is also a command-line interface, designed primarily for testing purposes, which can be invoked with the command askdb-cli
. The syntax is the following:
Usage: askdb-cli [OPTIONS] QUESTION
Options:
-d, --database TEXT Name of the database (from connections.json)
-m, --model TEXT OpenAI model to use
--help Show this message and exit.
For example:
askdb-cli -d "Postgres local" "How many users we have?"
If the database option (-d
) is omitted it will use, by default, the first entry in the connections.json
file.