-
Notifications
You must be signed in to change notification settings - Fork 684
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Start Maintenance Daemon for Main DB at the server start. #7254
Conversation
@@ -146,6 +147,12 @@ InitializeMaintenanceDaemon(void) | |||
void | |||
InitializeMaintenanceDaemonForAdminDB(void) | |||
{ | |||
if (strcmp(ControlDbName, "") == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are GUCs already read in _PG_init? If not, this would always return true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it appears that they are read since this change is able to start a maintenance deamon in the controldb even if there is no connection.
It is also possible to move this check to CitusMaintenanceDaemonMain, Then a background worker will start but then quit immediately.
Codecov Report
@@ Coverage Diff @@
## main #7254 +/- ##
==========================================
- Coverage 93.22% 93.11% -0.11%
==========================================
Files 275 275
Lines 59491 61397 +1906
Branches 0 7421 +7421
==========================================
+ Hits 55459 57169 +1710
- Misses 4032 4163 +131
- Partials 0 65 +65 |
@@ -481,6 +481,7 @@ _PG_init(void) | |||
#endif | |||
|
|||
InitializeMaintenanceDaemon(); | |||
InitializeMaintenanceDaemonForMainDb(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you treat DB as a word? Should it be ...ForMainDb()
or...ForMainDB()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both styles seem to be used in the codebase. E.g:
I do not have a preference.
def test_set_maindb(coord): | ||
with coord.cur() as cur1: | ||
cur1.execute("DROP DATABASE IF EXISTS mymaindb;") | ||
cur1.execute("CREATE DATABASE mymaindb;") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's track the databases we create so we can clean them up automatically, just like we do for schemas. See create_schema
, cleanup_schemas
and cleanup_test_leftovers
in src/test/regress/citus_tests/common.py
That should also make the DROP DATABASE IF EXISTS
unnecessary.
SafeSnprintf(worker.bgw_name, sizeof(worker.bgw_name), | ||
"Citus Maintenance Daemon: %u/%u", | ||
0, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think instead of 0/0 it's more useful to users to write out what it is for.
SafeSnprintf(worker.bgw_name, sizeof(worker.bgw_name), | |
"Citus Maintenance Daemon: %u/%u", | |
0, 0); | |
strcpy_s(worker.bgw_name, sizeof(worker.bgw_name), | |
"Citus Maintenance Daemon for Main DB"); |
/* | ||
* When the database crashes, background workers are restarted, but | ||
* the state in shared memory is lost. In that case, we exit and wait | ||
* for Postmaster calling __PG_Init which in turn calls | ||
* InitializeMaintenanceDaemonForAdminDB. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is actually not true here. Here' we're checking for insert failure, i.e. MaintenaceDaemonHash being full. We should throw a useful error here about that, because that's quite unexpected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think it might be good to move this insertion logic (including zero-ing) to a function that is called here and by InitializeMaintenanceDaemonBackend
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my main source of finding this code confusing is the two branches (if (databaseOid == 0)
and else
) that seem similar, but are actually quite different. I think it would help readability to split these a bit better. Either by putting their bodies in a function, or creating a new CitusMaintenanceDaemonMain
function for the databaseOid == 0 case. These could then call some functions for the shared code. But I think having these long branches but quite different branches make the code hard to follow, since now there's two paths that you need to look at carefully.
b28a6c8
to
1e805da
Compare
== 0 | ||
) | ||
|
||
cluster.coordinator.cleanup_databases() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cleanup_databases
should be called in the cleanup_test_leftovers
function so that it is called even when the test fails.
* ConnectToDatabase connects to the database for the given databaseOid. | ||
* if databaseOid is 0, connects to MainDb and then creates a hash entry. | ||
* If a hash entry cannot be created for MainDb it exists the process requesting a restart. | ||
* However for regular databases, it exits without requesting a restart since another | ||
* subsequent backend should be starting the Maintenance Deamon. | ||
* If a hash entry found has a valid workerPid, it exits | ||
* without requesting a restart since there is already a deamon running. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rewording + typo fixes
* ConnectToDatabase connects to the database for the given databaseOid. | |
* if databaseOid is 0, connects to MainDb and then creates a hash entry. | |
* If a hash entry cannot be created for MainDb it exists the process requesting a restart. | |
* However for regular databases, it exits without requesting a restart since another | |
* subsequent backend should be starting the Maintenance Deamon. | |
* If a hash entry found has a valid workerPid, it exits | |
* without requesting a restart since there is already a deamon running. | |
* ConnectToDatabase connects to the database for the given databaseOid. | |
* if databaseOid is 0, connects to MainDb and then creates a hash entry. | |
* If a hash entry cannot be created for MainDb it exists the process requesting a restart. | |
* However for regular databases, it exits without requesting a restart since another | |
* subsequent backend should be starting the Maintenance Daemon. | |
* If the found hash entry has a valid workerPid, it exits | |
* without requesting a restart since there is already a daemon running. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a small suggestions about the wording in the codecomment. But other than that this looks good now.
a210359
to
2565296
Compare
DESCRIPTION: This change starts a maintenance deamon at the time of server start if there is a designated main database. This is the code flow: 1. User designates a main database: `ALTER SYSTEM SET citus.main_db = "myadmindb";` 2. When postmaster starts, in _PG_Init, citus calls `InitializeMaintenanceDaemonForMainDb` This function registers a background worker to run `CitusMaintenanceDaemonMain `with `databaseOid = 0 ` 3. `CitusMaintenanceDaemonMain ` takes some special actions when databaseOid is 0: - Gets the citus.main_db value. - Connects to the citus.main_db - Now the `MyDatabaseId `is available, creates a hash entry for it. - Then follows the same control flow as for a regular db,
DESCRIPTION: This change starts a maintenance deamon at the time of server start if there is a designated main database.
This is the code flow:
User designated an admin database:
ALTER SYSTEM SET citus.main_db = "myadmindb";
When postmaster starts, in _PG_Init, citus calls
InitializeMaintenanceDaemonForMainDb
This function registers a background worker to run
CitusMaintenanceDaemonMain
withdatabaseOid = 0
CitusMaintenanceDaemonMain
takes some special actions when databaseOid is 0:MyDatabaseId
is available, creates a hash entry for it.